REGEX for Phrase Repeated n Times?










I have users entering blocks of text and I'm trying to prevent them from repeating a phrase more than, say, 5 times.
So this would be fine:




I like fish very much I like fish very much I like fish very much




so would this:




Marshmallows are yummy. Marshmallows are yummy. Marshmallows are
yummy.




But this would not be:




I like fish very much I like fish very much I like fish very much I
like fish very much I like fish very much I like fish very much I like
fish very much I like fish very much




nor this:




Marshmallows are yummy. Marshmallows are yummy. Marshmallows are
yummy. Marshmallows are yummy. Marshmallows are yummy. Marshmallows
are yummy. Marshmallows are yummy. Marshmallows are yummy.
Marshmallows are yummy. Marshmallows are yummy.




Ideally, it would also catch it even if it was entered like this:




I like fish very much

I like fish very much

I like fish very much

I like fish very much

I like fish very much

I like fish very much




I tried:



\b(\S.*\S)[ ,.]*\b(\1){5}


But it doesn't always work: the result depends on the phrase length, and it only seems to work when each sentence ends with a period.



Any ideas?




















asked Nov 13 '18 at 5:11
Lisa K






















1 Answer














Here's one possibility:

(\b\w.{3,49})\1{4}

It captures between 4 and 50 characters (starting with a word character at a word boundary) in a group, then checks whether that group is immediately followed by at least 4 more copies of itself, i.e. whether it appears at least 5 times in a row.

https://regex101.com/r/tS6kHF/2

If the regex matches, the input contains a repeated phrase.
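As a rough illustration of how the pattern behaves, here is a minimal Python sketch. The function name `has_repeated_phrase` and the sample strings are made up for this example, and it assumes the dotall (single-line) flag is available so that `.` can cross line breaks.

```python
import re

# Pattern from the answer above. re.DOTALL lets "." match newlines,
# so phrases separated by blank lines are caught as well.
REPEATED_PHRASE = re.compile(r"(\b\w.{3,49})\1{4}", re.DOTALL)


def has_repeated_phrase(text: str) -> bool:
    """Return True if some 4-50 character phrase occurs 5+ times in a row."""
    return REPEATED_PHRASE.search(text) is not None


# Hypothetical inputs modelled on the question's examples.
ok = "I like fish very much " * 3
spam = "I like fish very much " * 8
stacked = "I like fish very much\n\n" * 6

print(has_repeated_phrase(ok))       # three repeats: below the threshold
print(has_repeated_phrase(spam))     # eight repeats on one line
print(has_repeated_phrase(stacked))  # six repeats separated by blank lines
```

To allow exactly 5 repeats and only reject a sixth, the quantifier on the backreference would be `{5}` instead of `{4}`.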



That said, this may not be a great idea, especially for large input strings. As you can see on the link, it takes a very large number of steps: for each starting position in the input (e.g., starting at "hello"), it has to try every candidate length the quantifier allows ("hell", "hello", and so on, up to 50 characters) and check whether each candidate is followed by copies of itself. Then it moves on to the next starting position and does the same again. (You do need an upper limit, such as 50 characters; otherwise the computation time goes way up, e.g. from 8k steps to 74k steps.)

Depending on the situation, it may be computationally expensive, so it might be better to use another method to programmatically find repeating substrings.
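One possible non-regex approach, sketched here under assumptions of my own (it is not from the answer): split the text into words and count how many times any phrase of a few words repeats back to back. The function name `max_consecutive_repeats` and the `min_words` / `max_words` limits are hypothetical choices for illustration.

```python
import re


def max_consecutive_repeats(text: str, min_words: int = 2, max_words: int = 12) -> int:
    """Return the largest number of back-to-back occurrences of any phrase
    of min_words..max_words words. Case and whitespace are ignored."""
    words = re.findall(r"\S+", text.lower())
    best = 0
    for n in range(min_words, max_words + 1):
        for i in range(len(words) - n + 1):
            phrase = words[i:i + n]
            count = 1
            j = i + n
            # Keep stepping forward while the next n words equal the phrase.
            while words[j:j + n] == phrase:
                count += 1
                j += n
            best = max(best, count)
    return best


# Reject input when some phrase repeats more than 5 times in a row.
message = "I like fish very much\n\n" * 6
print(max_consecutive_repeats(message) > 5)
```

Because the text is split on whitespace first, line breaks between repeats do not matter. The nested loops are fine for chat-sized messages but would be slow on very long, highly repetitive input.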






• Thank you for your reply. That might work, but I'm not sure I can set the single-line flag modifier. I'm using chatbot software and I don't think it's on by default. I literally have a box where I choose "matches regex", then I put the regex code in the next box. It also has the case-insensitive flag on by default (which is usually good). I'll give it a try and see. Because I'm using chatbot software, doing this programmatically would require a webhook, which is what I'm trying to avoid.

  – Lisa K
  Nov 13 '18 at 15:03
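If the single-line flag cannot be set, a common workaround is to replace `.` with a character class that matches anything, such as `[\s\S]`. The sketch below uses Python only to demonstrate the idea; whether the chatbot software's regex engine supports backreferences and this class is an assumption that would need checking. The names `NO_FLAG_PATTERN` and `DOT_ONLY_PATTERN` are made up for the example.

```python
import re

# No flags needed: [\s\S] matches any character, including line breaks.
NO_FLAG_PATTERN = re.compile(r"(\b\w[\s\S]{3,49})\1{4}")

# The original pattern without the single-line flag, for comparison.
DOT_ONLY_PATTERN = re.compile(r"(\b\w.{3,49})\1{4}")

stacked = "I like fish very much\n\n" * 6

print(NO_FLAG_PATTERN.search(stacked) is not None)   # repeats across lines are found
print(DOT_ONLY_PATTERN.search(stacked) is not None)  # "." stops at each line break
```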




















answered Nov 13 '18 at 5:35
CertainPerformance











