Is there a simpler way of writing this sorting code?










I have a list of 196 strings of the form 2009/EPS.WCR.PL6.MAIS.0036, 2016/EPS.WCR.PL6.NORM.0077, etc. What varies is the year at the start and the four digits at the end; the fourth field is also either NORM or MAIS. I would like to go through this list and extract these pieces of information to create some sort of distance matrix. The code I have written so far is as follows:

    c(substr(df[i, 3], 1, 4), substr(df[i, 3], 18, 21), substr(df[i, 3], 23, nchar(df[i, 3])))

where df is the list of these categorical variables and i loops through the list. Is there a nice way of getting a distance between these strings based on the pieces of information that I am extracting?

Thank you in advance.







r list sorting






edited Nov 13 '18 at 11:36 by Hunaidkhan

asked Nov 13 '18 at 11:06 by Expectation mean first moment












  • What kind of distance? Are you using a package such as stringdist to compute the distances? And do you want, say, the distance between 20090036 and 20160077 or what exactly?

    – Rui Barradas
    Nov 13 '18 at 11:52


















2 Answers
If your data has the same structure throughout, then try:

    data <- c("2009/EPS.WCR.PL6.MAIS.0036", "2016/EPS.WCR.PL6.NORM.0077")
    str(data)                            # inspect: a character vector
    substr(data, start = 1, stop = 4)    # year, e.g. "2009"
    substr(data, start = 18, stop = 21)  # type field, "MAIS" or "NORM"
    substr(data, start = 23, stop = 26)  # trailing four digits, e.g. "0036"
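Fixed character positions only work while every string has exactly the same layout. As a hedged alternative, the sketch below splits each string on its delimiters and collects the three varying fields in a data frame. It uses base R only; `parse_ids` and the column names are hypothetical names chosen for illustration, and it assumes the type is always the fifth field and the number is always the last one.

```r
# Hypothetical helper (not part of the original answer): split each ID on
# "/" or "." and keep the year, the type field and the trailing number.
parse_ids <- function(ids) {
  parts <- strsplit(ids, "[/.]")  # regex character class: "/" or a literal "."
  data.frame(
    year   = as.integer(vapply(parts, function(p) p[1], character(1))),
    type   = vapply(parts, function(p) p[5], character(1)),
    number = as.integer(vapply(parts, function(p) p[length(p)], character(1))),
    stringsAsFactors = FALSE
  )
}

ids <- c("2009/EPS.WCR.PL6.MAIS.0036", "2016/EPS.WCR.PL6.NORM.0077")
fields <- parse_ids(ids)
fields
```

Converting the year and number to integers drops the leading zeros ("0036" becomes 36), which is convenient if the fields are later compared numerically.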






answered Nov 13 '18 at 11:46 by user113156























The following function uses the CRAN package stringdist to compute the pairwise distances between the strings in its first argument, after reducing each string to its leading year and trailing digits. You can pass a method of your choice; see the help page help("stringdist").

    special_dist <- function(x, method = "osa") {
      # Keep only the leading digits (year) and the trailing digits, e.g. "20090036"
      y <- sub("(^[[:digit:]]+).*[[:punct:]]([[:digit:]]+$)", "\\1\\2", x)
      # One column of distances per reduced string
      res <- sapply(y, function(z) stringdist::stringdist(z, y, method = method))
      rownames(res) <- colnames(res)
      res
    }

    x <- c("2009/EPS.WCR.PL6.MAIS.0036", "2016/EPS.WCR.PL6.NORM.0077")
    special_dist(x)
    #          20090036 20160077
    # 20090036        0        4
    # 20160077        4        0

    special_dist(x, "jaccard")
    #           20090036  20160077
    # 20090036 0.0000000 0.5714286
    # 20160077 0.5714286 0.0000000
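The string distances above compare the digits as characters and ignore the MAIS/NORM field. If a distance based on the extracted values themselves is closer to what is wanted, one possible sketch in base R is a weighted sum of the year gap, the number gap and a 0/1 type mismatch. The function name `field_dist`, the weights `w` and the choice of a plain sum are all assumptions made for illustration, not something stated in the question.

```r
# Hypothetical field-based distance (assumption: year gap, number gap and a
# type mismatch are simply added up, each multiplied by a user-chosen weight).
field_dist <- function(ids, w = c(year = 1, number = 1, type = 1)) {
  parts  <- strsplit(ids, "[/.]")
  year   <- as.numeric(vapply(parts, function(p) p[1], character(1)))
  type   <- vapply(parts, function(p) p[5], character(1))
  number <- as.numeric(vapply(parts, function(p) p[length(p)], character(1)))

  d_year   <- abs(outer(year, year, "-"))      # absolute year difference
  d_number <- abs(outer(number, number, "-"))  # absolute number difference
  d_type   <- outer(type, type, "!=") * 1      # 1 if the types differ, else 0

  res <- w[["year"]] * d_year + w[["number"]] * d_number + w[["type"]] * d_type
  dimnames(res) <- list(ids, ids)
  res
}

x <- c("2009/EPS.WCR.PL6.MAIS.0036", "2016/EPS.WCR.PL6.NORM.0077")
d <- field_dist(x)
d[1, 2]  # 7 (years) + 41 (numbers) + 1 (type mismatch) = 49
```

The result is a symmetric matrix with a zero diagonal, so `as.dist(d)` should turn it into an object that functions such as `hclust()` accept. With equal weights the number gap tends to dominate, so the weights probably need tuning for real data.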






answered Nov 13 '18 at 12:05 by Rui Barradas


























