Python Regex: Find specific phrase in any form in text (including if followed by . or ,)










I'm trying to find when a specific product name is mentioned in customer notes (i.e. unstandardized, messy text). The product name is "Lending QB". Within the text, the product name can appear in any of the following ways:

    str1 = 'Lending QB is a great product.'
    str2 = 'lending qb is great.'
    str3 = "I don't think lendingqb is great."
    str4 = 'I like Lending QB, but not always.'
    str5 = 'The best product is Lending qb.'

Here is the regex that mostly works:

    df['lendingQB'] = df['Text'].str.findall(r'(?i)(?<!\S)lending\s?qb(?!\S)', re.IGNORECASE)

Using regex101.com to test, and confirming within my Python program, I can capture the product name in strings 1-3, but not in 4 and 5, which makes me believe the issue is that the product name is not found when it's followed by a punctuation mark.

My understanding is that \S would include commas and periods.

I tried adding |[,.] to the regex, but then nothing matches:

    '(?i)(?<!\S)lending\s?qb(?!\S|[,.])'

(I realize the IGNORECASE is redundant, but I added the "(?i)" to test with regex101.com.)

Any suggestions?

AC







python regex






asked Nov 15 '18 at 20:12 by Amanda












  • Just to note, if you use any boundary, it is possible to miss a product name. The regex with no boundaries is (?i)lending\s?qb. Using a boundary actually qualifies what you want to match, so in that sense no answer here is even close to your objective. Just saying... Also, a product name with a simple underscore _ directly in front of or behind it will not get matched using (?<!\S) and \b. So beware: when you think something is actually robust, it isn't.

    – sln
    Nov 15 '18 at 22:07
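The trade-off described in this comment can be checked with a small sketch. The sample notes and the helper name `hits` are made up for illustration; the three patterns are the unbounded form from the comment, the whitespace-bounded form from the question, and a variant ending in `\b`.

```python
import re

# No boundaries at all: cannot miss a mention, but may over-match.
unbounded = re.compile(r'lending\s?qb', re.IGNORECASE)
# Whitespace (or string edge) required on both sides, as in the question.
whitespace_bounded = re.compile(r'(?<!\S)lending\s?qb(?!\S)', re.IGNORECASE)
# Whitespace before, word boundary after.
word_bounded = re.compile(r'(?<!\S)lending\s?qb\b', re.IGNORECASE)

def hits(note):
    """Return [unbounded, whitespace_bounded, word_bounded] results as booleans."""
    return [bool(p.search(note))
            for p in (unbounded, whitespace_bounded, word_bounded)]

# Hypothetical notes, invented for this example.
notes = ['see _Lending QB_ docs', 'Lending QB, please', 'lendingqb']
for note in notes:
    print(f'{note!r}: {hits(note)}')
```

Only the unbounded pattern finds the name wrapped in underscores. The price is that it would also fire inside a longer token such as `blendingqbx`, so the choice depends on which kind of error matters more for the data.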


















4 Answers
You have correctly identified one issue in the regex (punctuation immediately after QB), but there is a second edge case to consider given that the input is messy: what if there are multiple spaces in "Lending QB"?

I believe the most robust solution to your problem is:

    (?i)(?<!\S)lending\s*qb\b

  • \b enforces that QB occurs at the end of a word, which automatically handles punctuation.

  • \s? was replaced with \s* to allow any amount of whitespace to match, rather than just zero or one whitespace character.

PS. Another point to consider is that \b terminates on all punctuation other than the underscore (which counts as a word character), whereas (?=\s|[,.]) will only terminate on whitespace or the given punctuation: , or . in this case. Given the wide range of possible punctuation (colon, semicolon, dash, hyphen, em dash...) I would strongly recommend \b over (?=\s|[,.]), unless you want precise control over the allowable terminating punctuation, of course.

PPS. Further test cases to illustrate my points:

    str6 = 'Lending Qb: simply the best'
    str7 = "I'm a fan of lending QB"
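As a quick check of the pattern proposed in this answer, here is a minimal sketch using plain `re` instead of pandas. The first seven strings are the test cases from the thread; the last two are made-up additions, one with several spaces and one that `\b` is expected to reject.

```python
import re

pattern = re.compile(r'(?<!\S)lending\s*qb\b', re.IGNORECASE)

samples = [
    'Lending QB is a great product.',
    'lending qb is great.',
    "I don't think lendingqb is great.",
    'I like Lending QB, but not always.',
    'The best product is Lending qb.',
    'Lending Qb: simply the best',
    "I'm a fan of lending QB",
    'we use lending    qb daily',   # made-up: several spaces inside the name
    'lendingqbs are not a thing',   # made-up: "qb" is not at the end of a word
]

# One list of matches per sample; an empty list means no mention was found.
matches = [pattern.findall(s) for s in samples]
for s, m in zip(samples, matches):
    print(f'{s!r} -> {m}')
```

Note that `\s*` also accepts tabs and newlines between the two words, which may or may not be wanted for a given data set.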






answered Nov 15 '18 at 20:57 by Mark_Anderson























The pattern (?!\S) is a negative lookahead that asserts that what follows is not a non-whitespace character. A comma or a period is a non-whitespace character, so the assertion fails in str4 and str5.

What you could do is replace the (?!\S) with a word boundary \b, so that the match cannot be part of a larger word:

    (?i)(?<!\S)lending\s?qb\b

Another way could be to use a positive lookahead to check for a whitespace character, a comma, a period, or the end of the string using (?=[\s,.]|$)

For example:

    import re

    str5 = "The best product is Lending qb."
    print(re.findall(r'(?<!\S)lending\s?qb(?=[\s,.]|$)', str5, re.IGNORECASE))  # ['Lending qb']
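To make the difference between the three endings concrete, the following sketch runs the question's original pattern next to the two alternatives from this answer. The helper `compare` and its dictionary keys are names chosen for this illustration only.

```python
import re

original = re.compile(r'(?<!\S)lending\s?qb(?!\S)', re.IGNORECASE)
lookahead = re.compile(r'(?<!\S)lending\s?qb(?=[\s,.]|$)', re.IGNORECASE)
boundary = re.compile(r'(?<!\S)lending\s?qb\b', re.IGNORECASE)

def compare(text):
    """Return the matches of each pattern variant for one text."""
    variants = (('original', original),
                ('lookahead', lookahead),
                ('boundary', boundary))
    return {name: p.findall(text) for name, p in variants}

texts = [
    'lending qb is great.',                 # followed by a space
    'I like Lending QB, but not always.',   # followed by a comma
    'Lending Qb: simply the best',          # followed by a colon
]
for text in texts:
    print(text, compare(text))
```

The colon case is where the two fixes diverge: the explicit lookahead only accepts the characters listed in its class, while `\b` accepts any non-word character.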






answered Nov 15 '18 at 20:18 by The fourth bird (edited Nov 15 '18 at 21:04)





















This (?!\S) is a forward whitespace boundary.

It is really this, (?![^\s]), a negative of a negative, with the added benefit that it also matches at the EOS (end of string).

What that means is that you can use the negated class form to add characters that qualify as a boundary. So, just put the period and comma in with the whitespace:

    (?i)(?<![^\s,.])lending\s?qb(?![^\s,.])

https://regex101.com/r/BrOj2J/1

As a tutorial point, this concept encapsulates multiple assertions in one and is basic engine Boolean class logic, which speeds up the engine by a tenfold factor by comparison.
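The equivalence claimed here, and the effect of widening the class, can be sketched as follows. The variable names and the comma-separated sample are invented for illustration, and the sketch makes no attempt to measure speed.

```python
import re

# The question's whitespace boundaries, written two equivalent ways.
strict = re.compile(r'(?<!\S)lending\s?qb(?!\S)', re.IGNORECASE)
double_negative = re.compile(r'(?<![^\s])lending\s?qb(?![^\s])', re.IGNORECASE)
# Comma and period added to the set of characters that count as a boundary.
extended = re.compile(r'(?<![^\s,.])lending\s?qb(?![^\s,.])', re.IGNORECASE)

texts = [
    'lendingqb',                        # bare name, string edges on both sides
    'The best product is Lending qb.',  # period after the name
    'tools,LendingQB,spreadsheets',     # made-up: commas on both sides
    'Lending Qb: simply the best',      # colon is not in the extended class
]

for text in texts:
    print(text, strict.findall(text), double_negative.findall(text),
          extended.findall(text))
```

Because the same class is used in the lookbehind, this form also accepts a comma or period directly before the name, which the `(?<!\S)` lookbehind would reject.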







answered Nov 15 '18 at 20:56 by sln (edited Nov 15 '18 at 21:07)





















Thank you "The fourth bird", "sln", and "Mark_Anderson". Your answers provided solutions and were also very educational. I went with Mark's answer since it seemed to be the most robust, which is what I'm aiming for. Ideally, I want to capture every mention of the product name, no matter how messily it's typed.

I changed my code to this:

    df['lendingQB'] = df['Text'].str.findall(r'(?i)(?<!\S)lending\s*qb\b', re.IGNORECASE)






answered Nov 15 '18 at 21:25 by Amanda












  • You're welcome. One further thought: findall will just return the literal characters Lending QB. From the code snippet I presume a boolean flag might be more useful for you? In that case, .str.contains() is a straight replacement for .str.findall(), e.g. df['Text'].str.contains(r'(?i)(?<!\S)lending\s*qb\b'). (.str.match() would only find the name at the very start of a note.)

    – Mark_Anderson
    Nov 15 '18 at 21:41

  • Thanks, Mark_Anderson. This is very helpful info!

    – Amanda
    Nov 29 '18 at 14:43
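For the boolean-flag idea raised in these comments, it matters whether the pattern is tried only at the start of each note or anywhere in it. The sketch below shows the difference with plain `re`; as far as I know, pandas' `Series.str.match` follows `re.match` and `Series.str.contains` follows `re.search`, so the same distinction should carry over. The third note is made up.

```python
import re

pattern = re.compile(r'(?<!\S)lending\s*qb\b', re.IGNORECASE)

notes = [
    'Lending QB is a great product.',   # mention at the start
    'The best product is Lending qb.',  # mention in the middle
    'No product mentioned here.',       # made-up note without a mention
]

# re.match only tries position 0, so a mention later in the note is missed.
match_flags = [bool(pattern.match(n)) for n in notes]
# re.search scans the whole note, which is what a "mentioned anywhere" flag needs.
search_flags = [bool(pattern.search(n)) for n in notes]

print(match_flags)   # [True, False, False]
print(search_flags)  # [True, True, False]
```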
















