Is there a simpler way of writing this sorting code?
I have a list of 196 strings of the form 2009/EPS.WCR.PL6.MAIS.0036, 2016/EPS.WCR.PL6.NORM.0077, etc. What varies is the year and the four digits at the end; the fourth field is also either NORM or MAIS. I would like to go through this list and extract these pieces of information to create some sort of distance matrix. The code I have written so far is as follows:
c(substr(df[i,3], 1, 4), substr(df[i,3], 18, 21), substr(df[i,3], 22, nchar(df[i,3])))
where df holds these categorical variables and i loops through the list. Is there a nice way of getting a distance between these strings based on the pieces of information that I am extracting?
Thank you in advance.
r list sorting
edited Nov 13 '18 at 11:36 by Hunaidkhan
asked Nov 13 '18 at 11:06 by Expectation mean first moment
What kind of distance? Are you using a package such as stringdist to compute the distances? And do you want, say, the distance between 20090036 and 20160077, or what exactly?
– Rui Barradas, Nov 13 '18 at 11:52
2 Answers
If your data has the same structure throughout, then try:
data <- c("2009/EPS.WCR.PL6.MAIS.0036", "2016/EPS.WCR.PL6.NORM.0077")
str(data)
# substr() is vectorised, so no loop over the strings is needed
substr(data, start = 1, stop = 4)    # year
substr(data, start = 18, stop = 21)  # MAIS or NORM
substr(data, start = 23, stop = 26)  # trailing four digits
answered Nov 13 '18 at 11:46 by user113156
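As a rough sketch of how the extracted pieces could feed a distance matrix, the fields can be collected in a data frame and the numeric ones passed to base R's dist(). The names ids and parts and the column names year, type and number are made up for this illustration, and Euclidean distance on the raw values is only an assumption about what "distance" should mean here; the two columns are on different scales, so rescaling may be appropriate.

```r
# Example identifiers taken from the question
ids <- c("2009/EPS.WCR.PL6.MAIS.0036", "2016/EPS.WCR.PL6.NORM.0077")

# Fixed-position extraction, assuming every string has the same layout
parts <- data.frame(
  year   = as.integer(substr(ids, 1, 4)),
  type   = substr(ids, 18, 21),
  number = as.integer(substr(ids, 23, 26)),
  row.names = ids
)

# Euclidean distance on the two numeric fields only (type is ignored here)
d <- dist(parts[, c("year", "number")])
dist_matrix <- as.matrix(d)
dist_matrix
```

The result is a symmetric matrix labelled by the original strings, with zeros on the diagonal.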
The following function uses the CRAN package stringdist to compute the distances between the strings in its first argument. You can pass a method of your choice; see the help page help("stringdist").
special_dist <- function(x, method = "osa") {
    # Keep only the leading year and the trailing digits, e.g. "20090036"
    y <- sub("(^[[:digit:]]+).*[[:punct:]]([[:digit:]]+$)", "\\1\\2", x)
    res <- sapply(y, function(z) stringdist::stringdist(z, y, method = method))
    rownames(res) <- colnames(res)
    res
}
x <- c("2009/EPS.WCR.PL6.MAIS.0036", "2016/EPS.WCR.PL6.NORM.0077")
special_dist(x)
#          20090036 20160077
#20090036        0        4
#20160077        4        0
special_dist(x, "jaccard")
#          20090036  20160077
#20090036 0.0000000 0.5714286
#20160077 0.5714286 0.0000000
answered Nov 13 '18 at 12:05 by Rui Barradas
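If a string-edit distance on the concatenated digits is not quite what is wanted, one alternative is a field-wise distance that treats the year and number as numbers and the MAIS/NORM field as a category. The sketch below uses base R only; field_dist and its weights w_year, w_type and w_num are hypothetical names introduced for illustration, and the equal default weights are an arbitrary assumption rather than a recommendation.

```r
# Hypothetical helper: weighted sum of per-field differences
field_dist <- function(x, w_year = 1, w_type = 1, w_num = 1) {
  # year / anything / .TYPE.digits, assuming the layout shown in the question
  pattern <- "^([0-9]{4})/.*\\.([A-Z]+)\\.([0-9]+)$"
  stopifnot(all(grepl(pattern, x)))
  year <- as.integer(sub(pattern, "\\1", x))
  type <- sub(pattern, "\\2", x)
  num  <- as.integer(sub(pattern, "\\3", x))
  res <- w_year * abs(outer(year, year, "-")) +  # years apart
         w_type * outer(type, type, "!=") +      # 1 if the types differ, else 0
         w_num  * abs(outer(num, num, "-"))      # gap between trailing numbers
  dimnames(res) <- list(x, x)
  res
}

x <- c("2009/EPS.WCR.PL6.MAIS.0036", "2016/EPS.WCR.PL6.NORM.0077")
m <- field_dist(x)
m[1, 2]  # 7 + 1 + 41 = 49
```

Setting a weight to 0 drops that field from the distance, for example field_dist(x, w_num = 0) compares only year and type.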