Smallest sub-vector with all distinct values of the original vector
Let's say we have a vector of length n with k distinct values.
1, 4, 2, 2, 3, 4, 1, 1, 5, 2, 2
How do I find the start and end coordinates (on the original vector) of the smallest subset of consecutive values (a contiguous sub-vector, with the order and repetition of elements preserved) that contains all the distinct values of the original vector?
For our example, the subset would be
2, 3, 4, 1, 1, 5
and the start and end coordinates would be 4 and 9, respectively.
r vector
asked Nov 15 '18 at 10:30 by Sachin
Dupe-oid: Get indexes of a vector of numbers in another vector – Henrik Nov 15 '18 at 10:42
Since this is R and you took the time to make a numeric sequence, why vs c()? Spidey-sense says #homework – hrbrmstr Nov 15 '18 at 10:44
Haha no, I just used the sets notation as a matter of habit. I made up the sequence while writing the question and didn't copy it. – Sachin Nov 15 '18 at 17:30
1 Answer
Here is one way to do this. First I create a vector index where index[k] is the number of additional positions needed (starting at k) until every distinct value has appeared at least once; it is Inf if that never happens.

    v <- c(1, 4, 2, 2, 3, 4, 1, 1, 5, 2, 2)

    # determine the unique elements of v
    uniqueVals <- unique(v)
    index <- numeric(length(v))

    # helper function: offset from k to the first position s at which
    # v[k:s] contains every unique value, or Inf if there is no such s
    myFun <- function(k) {
      helper <- vapply(seq(k, length(v)),
                       function(s) length(setdiff(uniqueVals, v[k:s])),
                       numeric(1))
      return(ifelse(min(helper) == 0, which.min(helper) - 1, Inf))
    }

    # indices in seq1 must be Inf because fewer than length(uniqueVals)
    # elements remain from there on (the "+ 1" counts position k itself);
    # this assumes v has at least two distinct values, so seq1 is non-empty
    seq1 <- which(length(v) - seq_along(v) + 1 < length(uniqueVals))
    index[seq1] <- Inf

    # for the other indices we now use our helper function
    index[seq(1, min(seq1) - 1)] <- vapply(seq(1, min(seq1) - 1), myFun, numeric(1))

    # applying the above
    startIndex <- which.min(index)
    endIndex <- index[startIndex] + startIndex
    v[startIndex:endIndex]
    # yielding
    # [1] 2 3 4 1 1 5

For any given k, myFun returns the smallest offset d such that v[k:(k + d)] contains every distinct element of v, so the end coordinate is startIndex + d. For the example vector this gives startIndex = 4 and endIndex = 9.
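The approach above rescans v[k:s] for every start and end position, so it can become slow on long vectors. As a rough sketch of an alternative (not the method from this answer), a two-pointer sliding window visits each element at most twice. The function name smallest_window and its return shape are assumptions made up for this illustration.

```r
# Hypothetical helper (not part of the original answer): find the smallest
# contiguous window of v that contains every distinct value of v.
smallest_window <- function(v) {
  n <- length(v)
  if (n == 0) return(c(start = NA_integer_, end = NA_integer_))

  ids <- match(v, unique(v))   # recode values as 1..k
  k <- max(ids)                # number of distinct values
  counts <- integer(k)         # occurrences of each value inside the window
  covered <- 0L                # distinct values currently inside the window
  best <- c(start = 1L, end = n)
  left <- 1L

  for (right in seq_len(n)) {
    id <- ids[right]
    counts[id] <- counts[id] + 1L
    if (counts[id] == 1L) covered <- covered + 1L

    # while the window holds all k values, record it and shrink from the left
    while (covered == k) {
      if (right - left < best[["end"]] - best[["start"]]) {
        best <- c(start = left, end = right)
      }
      lid <- ids[left]
      counts[lid] <- counts[lid] - 1L
      if (counts[lid] == 0L) covered <- covered - 1L
      left <- left + 1L
    }
  }
  best
}

v <- c(1, 4, 2, 2, 3, 4, 1, 1, 5, 2, 2)
w <- smallest_window(v)
v[w[["start"]]:w[["end"]]]
# [1] 2 3 4 1 1 5
```

Because the comparison is strict, ties go to the earliest window, which matches what which.min gives in the code above.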
answered Nov 15 '18 at 11:07 by nate (edited Nov 15 '18 at 11:29)
Thank you very much. – Sachin Nov 15 '18 at 17:40