Smallest sub-vector with all distinct values of the original vector

Let's say we have a vector of length n with k distinct values:

{1, 4, 2, 2, 3, 4, 1, 1, 5, 2, 2}

How do I find the start and end coordinates (on the original vector) of the smallest contiguous sub-vector, with the sequence and frequency of individual elements conserved, that contains all the distinct values of the original vector?

For our example, the sub-vector would be

{2, 3, 4, 1, 1, 5}

and the start and end coordinates would be 4 and 9, respectively.
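
In R, the example and the expected result can be written roughly as follows. This is only a sketch to make the example reproducible; the names first_idx, last_idx, window and covers_all are illustrative and not part of any existing code.

```r
# Example vector from the question
v <- c(1, 4, 2, 2, 3, 4, 1, 1, 5, 2, 2)

# Expected start and end coordinates stated above
first_idx <- 4
last_idx <- 9

# The window keeps the original order and repeats of the elements
window <- v[first_idx:last_idx]

# TRUE when the window contains every distinct value of v
covers_all <- setequal(unique(window), unique(v))

print(window)
print(covers_all)
```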










Tags: r, vector

asked Nov 15 '18 at 10:30
Sachin

  • Dupe-oid: Get indexes of a vector of numbers in another vector

    – Henrik
    Nov 15 '18 at 10:42

  • Since this is R and you took the time to make a numeric sequence, why {} vs c()? Spidey-sense says #homework

    – hrbrmstr
    Nov 15 '18 at 10:44

  • Haha no, I just used set notation as a matter of habit. I made up the sequence while writing the question and didn't copy it.

    – Sachin
    Nov 15 '18 at 17:30

1 Answer

Here is one way to do this. First I create a vector index where index[k] is the number of steps needed, starting at position k, until every distinct value of v has appeared at least once; it is Inf if that never happens.

# example vector from the question
v <- c(1, 4, 2, 2, 3, 4, 1, 1, 5, 2, 2)

# determining the unique elements of v
uniqueVals <- unique(v)
index <- numeric(length(v))

# helper function: offset from k to the first position s at which
# v[k:s] contains every unique value, or Inf if no such s exists
myFun <- function(k) {
  helper <- vapply(seq(k, length(v)),
                   function(s) length(setdiff(uniqueVals, v[k:s])),
                   numeric(1))
  if (min(helper) == 0) which.min(helper) - 1 else Inf
}

# indices in seq1 must be infinity as there are not enough values left
# (fewer than length(uniqueVals) elements remain from there to the end)
seq1 <- which(length(v) - seq_along(v) + 1 < length(uniqueVals))
index[seq1] <- Inf
# for the other indices we now use our helper function
rest <- setdiff(seq_along(v), seq1)
index[rest] <- vapply(rest, myFun, numeric(1))

# applying the above
startIndex <- which.min(index)
endIndex <- index[startIndex] + startIndex
c(startIndex, endIndex)
# yielding
# [1] 4 9
v[startIndex:endIndex]
# yielding
# [1] 2 3 4 1 1 5

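where v = c(1, 4, 2, 2, 3, 4, 1, 1, 5, 2, 2), and for any given k, myFun(k) returns the smallest offset d such that v[k:(k + d)] contains every distinct value of v (or Inf if no such offset exists). The start and end coordinates are startIndex and endIndex, here 4 and 9.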
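
The approach above rescans v[k:s] for every pair of k and s, so it can get slow on long vectors. As an alternative sketch (a different technique, not the method described above), a two-pointer sliding window can find the same window in roughly one pass, since each position enters and leaves the window at most once. The function name smallest_window and its return format are assumptions made for illustration.

```r
# Hypothetical helper: smallest contiguous window of v that contains
# every distinct value of v, found with two moving pointers.
smallest_window <- function(v) {
  n <- length(v)
  if (n == 0) return(c(start = NA_integer_, end = NA_integer_))
  ids <- match(v, unique(v))        # map each value to an id in 1..k
  k <- max(ids)                     # number of distinct values
  counts <- integer(k)              # occurrences of each id inside the window
  covered <- 0L                     # distinct values currently in the window
  best <- c(start = 1L, end = n)    # the whole vector always qualifies
  left <- 1L
  for (right in seq_len(n)) {
    id <- ids[right]
    counts[id] <- counts[id] + 1L
    if (counts[id] == 1L) covered <- covered + 1L
    # shrink from the left while the window still covers all values
    while (covered == k) {
      if (right - left < best[["end"]] - best[["start"]]) {
        best <- c(start = left, end = right)
      }
      out <- ids[left]
      counts[out] <- counts[out] - 1L
      if (counts[out] == 0L) covered <- covered - 1L
      left <- left + 1L
    }
  }
  best
}

v <- c(1, 4, 2, 2, 3, 4, 1, 1, 5, 2, 2)
res <- smallest_window(v)
res                                  # start = 4, end = 9
v[res[["start"]]:res[["end"]]]       # 2 3 4 1 1 5
```

When several windows tie for the shortest length (here 4 to 9 and 5 to 10), this sketch keeps the first one it finds, which matches the result in the question.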






edited Nov 15 '18 at 11:29
answered Nov 15 '18 at 11:07
nate

  • Thank you very much.

    – Sachin
    Nov 15 '18 at 17:40
















