Find the maximum in a certain time frame in a non-continuous time series










3















I have a dataframe with a time series that looks like this:



df<-structure(list(date = structure(c(-6905, -6891, -6853, -6588, 
-6588, -6586, -6523, -6515, -5856, -5753), class = "Date"), flow = c(2.22,
2.56, 3.3, 1.38, 4, 1.4, 1.32, 1.26, 6, 35.69)), .Names = c("date",
"flow"), row.names = c(NA, 10L), class = "data.frame")


I want to remove all the lines that are not the maximum within 2 days forward or backward of its date. So in the case above, lines 4 and 6 will be removed. I couldn't find similar answered questions.



I wrote this code that doesn't work and it is ugly, long and doesn't take care of the edges of the dataframe:



 idx <- c()
for (j in 3:(length(df$date)-2))
as.Date(df$date[j])-as.Date(df$date[j-2])<3)
if (df$flow[j]!=max(df$flow[(j-2):(j+2)]))
idx <- c(idx,j)

else if (as.Date(df$date[j+1])-as.Date(df$date[j])<3


Notice that the dates in the dataframe are not consecutive.










share|improve this question



















  • 1





    Are you sure that row 7 would be removed?

    – RLave
    Nov 14 '18 at 8:53











  • @RLave No. I was wrong. row 7 will not be removed. It is not the same year. Thanks for pointing it out. I also edited my answer: my code most definitely doesn't work.

    – asher
    Nov 14 '18 at 9:08












  • are you sure we're not suppose to be left only with date flow <date> <dbl> 1 1951-03-29 3.30 2 1951-12-19 4.00 3 1954-04-02 35.7

    – DJV
    Nov 14 '18 at 9:13











  • Also in your example you have a date twice, is it correct?

    – RLave
    Nov 14 '18 at 9:21






  • 1





    @RLave Yes. That is correct. It is a time-series of separated flood events which might happen on the same day.

    – asher
    Nov 14 '18 at 9:26















3















I have a dataframe with a time series that looks like this:



df<-structure(list(date = structure(c(-6905, -6891, -6853, -6588, 
-6588, -6586, -6523, -6515, -5856, -5753), class = "Date"), flow = c(2.22,
2.56, 3.3, 1.38, 4, 1.4, 1.32, 1.26, 6, 35.69)), .Names = c("date",
"flow"), row.names = c(NA, 10L), class = "data.frame")


I want to remove all the lines that are not the maximum within 2 days forward or backward of its date. So in the case above, lines 4 and 6 will be removed. I couldn't find similar answered questions.



I wrote this code that doesn't work and it is ugly, long and doesn't take care of the edges of the dataframe:



 idx <- c()
for (j in 3:(length(df$date)-2))
as.Date(df$date[j])-as.Date(df$date[j-2])<3)
if (df$flow[j]!=max(df$flow[(j-2):(j+2)]))
idx <- c(idx,j)

else if (as.Date(df$date[j+1])-as.Date(df$date[j])<3


Notice that the dates in the dataframe are not consecutive.










share|improve this question



















  • 1





    Are you sure that row 7 would be removed?

    – RLave
    Nov 14 '18 at 8:53











  • @RLave No. I was wrong. row 7 will not be removed. It is not the same year. Thanks for pointing it out. I also edited my answer: my code most definitely doesn't work.

    – asher
    Nov 14 '18 at 9:08












  • are you sure we're not suppose to be left only with date flow <date> <dbl> 1 1951-03-29 3.30 2 1951-12-19 4.00 3 1954-04-02 35.7

    – DJV
    Nov 14 '18 at 9:13











  • Also in your example you have a date twice, is it correct?

    – RLave
    Nov 14 '18 at 9:21






  • 1





    @RLave Yes. That is correct. It is a time-series of separated flood events which might happen on the same day.

    – asher
    Nov 14 '18 at 9:26













3












3








3








I have a dataframe with a time series that looks like this:



df<-structure(list(date = structure(c(-6905, -6891, -6853, -6588, 
-6588, -6586, -6523, -6515, -5856, -5753), class = "Date"), flow = c(2.22,
2.56, 3.3, 1.38, 4, 1.4, 1.32, 1.26, 6, 35.69)), .Names = c("date",
"flow"), row.names = c(NA, 10L), class = "data.frame")


I want to remove all the lines that are not the maximum within 2 days forward or backward of its date. So in the case above, lines 4 and 6 will be removed. I couldn't find similar answered questions.



I wrote this code that doesn't work and it is ugly, long and doesn't take care of the edges of the dataframe:



 idx <- c()
for (j in 3:(length(df$date)-2))
as.Date(df$date[j])-as.Date(df$date[j-2])<3)
if (df$flow[j]!=max(df$flow[(j-2):(j+2)]))
idx <- c(idx,j)

else if (as.Date(df$date[j+1])-as.Date(df$date[j])<3


Notice that the dates in the dataframe are not consecutive.










share|improve this question
















I have a dataframe with a time series that looks like this:



df<-structure(list(date = structure(c(-6905, -6891, -6853, -6588, 
-6588, -6586, -6523, -6515, -5856, -5753), class = "Date"), flow = c(2.22,
2.56, 3.3, 1.38, 4, 1.4, 1.32, 1.26, 6, 35.69)), .Names = c("date",
"flow"), row.names = c(NA, 10L), class = "data.frame")


I want to remove all the lines that are not the maximum within 2 days forward or backward of its date. So in the case above, lines 4 and 6 will be removed. I couldn't find similar answered questions.



I wrote this code that doesn't work and it is ugly, long and doesn't take care of the edges of the dataframe:



 idx <- c()
for (j in 3:(length(df$date)-2))
as.Date(df$date[j])-as.Date(df$date[j-2])<3)
if (df$flow[j]!=max(df$flow[(j-2):(j+2)]))
idx <- c(idx,j)

else if (as.Date(df$date[j+1])-as.Date(df$date[j])<3


Notice that the dates in the dataframe are not consecutive.







r dataframe time-series max






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited Nov 14 '18 at 9:31







asher

















asked Nov 14 '18 at 8:37









asherasher

506




506







  • 1





    Are you sure that row 7 would be removed?

    – RLave
    Nov 14 '18 at 8:53











  • @RLave No. I was wrong. row 7 will not be removed. It is not the same year. Thanks for pointing it out. I also edited my answer: my code most definitely doesn't work.

    – asher
    Nov 14 '18 at 9:08












  • are you sure we're not suppose to be left only with date flow <date> <dbl> 1 1951-03-29 3.30 2 1951-12-19 4.00 3 1954-04-02 35.7

    – DJV
    Nov 14 '18 at 9:13











  • Also in your example you have a date twice, is it correct?

    – RLave
    Nov 14 '18 at 9:21






  • 1





    @RLave Yes. That is correct. It is a time-series of separated flood events which might happen on the same day.

    – asher
    Nov 14 '18 at 9:26












  • 1





    Are you sure that row 7 would be removed?

    – RLave
    Nov 14 '18 at 8:53











  • @RLave No. I was wrong. row 7 will not be removed. It is not the same year. Thanks for pointing it out. I also edited my answer: my code most definitely doesn't work.

    – asher
    Nov 14 '18 at 9:08












  • are you sure we're not suppose to be left only with date flow <date> <dbl> 1 1951-03-29 3.30 2 1951-12-19 4.00 3 1954-04-02 35.7

    – DJV
    Nov 14 '18 at 9:13











  • Also in your example you have a date twice, is it correct?

    – RLave
    Nov 14 '18 at 9:21






  • 1





    @RLave Yes. That is correct. It is a time-series of separated flood events which might happen on the same day.

    – asher
    Nov 14 '18 at 9:26







1




1





Are you sure that row 7 would be removed?

– RLave
Nov 14 '18 at 8:53





Are you sure that row 7 would be removed?

– RLave
Nov 14 '18 at 8:53













@RLave No. I was wrong. row 7 will not be removed. It is not the same year. Thanks for pointing it out. I also edited my answer: my code most definitely doesn't work.

– asher
Nov 14 '18 at 9:08






@RLave No. I was wrong. row 7 will not be removed. It is not the same year. Thanks for pointing it out. I also edited my answer: my code most definitely doesn't work.

– asher
Nov 14 '18 at 9:08














are you sure we're not suppose to be left only with date flow <date> <dbl> 1 1951-03-29 3.30 2 1951-12-19 4.00 3 1954-04-02 35.7

– DJV
Nov 14 '18 at 9:13





are you sure we're not suppose to be left only with date flow <date> <dbl> 1 1951-03-29 3.30 2 1951-12-19 4.00 3 1954-04-02 35.7

– DJV
Nov 14 '18 at 9:13













Also in your example you have a date twice, is it correct?

– RLave
Nov 14 '18 at 9:21





Also in your example you have a date twice, is it correct?

– RLave
Nov 14 '18 at 9:21




1




1





@RLave Yes. That is correct. It is a time-series of separated flood events which might happen on the same day.

– asher
Nov 14 '18 at 9:26





@RLave Yes. That is correct. It is a time-series of separated flood events which might happen on the same day.

– asher
Nov 14 '18 at 9:26












3 Answers
3






active

oldest

votes


















3














Using the zoo library.



library(zoo)

# convert into a zoo time series
dtf.zoo <- zoo(dt$flow, order.by=dt$date)

# remove duplicate dates by keeping the maximum value
dtf.zoo <- aggregate(dtf.zoo, time(dtf.zoo), max)

# pad with NAs to make the time series regular
dtf.zoo <- merge(
dtf.zoo,
zoo(, seq(min(index(dtf.zoo)), max(index(dtf.zoo)), "day"))
)

# find rows that are less than a value two days prior or hence
rem <- which(dtf.zoo < rollapply(dtf.zoo, 5, max, na.rm=TRUE, partial=TRUE))

# remove those rows
dtf.zoo2 <- dtf.zoo[-rem]

# remove NAs
dt2 <- data.frame(flow=na.omit(dtf.zoo2))
dt2
# flow
# 1951-02-05 2.22
# 1951-02-19 2.56
# 1951-03-29 3.30
# 1951-12-19 4.00
# 1952-02-22 1.32
# 1952-03-01 1.26
# 1953-12-20 6.00
# 1954-04-02 35.69

which(!(dt$flow %in% dt2$flow))
# 4 6





share|improve this answer

























  • hey, if I don't misunderstand, the OP wants to remove line 4 and 6, but you remove line 4 and 5.

    – Darren Tsai
    Nov 14 '18 at 10:56











  • @DarrenTsai: Indeed, thank you. I forgot to set partial=TRUE in rollapply(), which caused an index mismatch.

    – AkselA
    Nov 14 '18 at 11:28












  • @AkselA: I added this condition: if (length(rem>0))df.zoo <- df.zoo[-rem] to escape cases that don't have events that are less than 3 days apart

    – asher
    Nov 14 '18 at 13:26











  • @asher: That will work. Removing which() and replacing -rem with !rem should also work.

    – AkselA
    Nov 14 '18 at 13:41











  • @AkselA: makes sense. Thanks for the ongoing exchange it is very helpful.

    – asher
    Nov 15 '18 at 11:40


















2














You can also use the tidyverse approch:



require(tidyverse)

df %>%
#Arrange by date
arrange(date) %>%
#Picking the max for each da
group_by(date) %>%
top_n(1, flow) %>%
ungroup() %>%
#Adding missing dates with NAs
complete(date = seq.Date(min(date), max(date), by="day")) %>%
#Remove Two up/down
mutate(
remove = case_when(
flow < rowMeans(data.frame(lag(flow, 1),
lag(flow, 2)), na.rm = TRUE) ~ "remove",
flow < rowMeans(data.frame(lead(flow, 1),
lead(flow, 2)), na.rm = TRUE) ~ "remove",
TRUE ~ "keep")) %>%
na.omit() %>%
filter(remove == "keep") %>%
select(-remove)


# A tibble: 8 x 2
date flow
<date> <dbl>
1 1951-02-05 2.22
2 1951-02-19 2.56
3 1951-03-29 3.30
4 1951-12-19 4.00
5 1952-02-22 1.32
6 1952-03-01 1.26
7 1953-12-20 6.00
8 1954-04-02 35.7





share|improve this answer
































    2














    I use lapply() to check the range : [date - 2 days , date + 2 days] of each date.



    rm.list <- lapply(df$date, function(x) 
    ind <- which(abs(df$date - x) <= 2)
    flow <- df$flow[ind]
    if(length(ind) > 1) which(flow < max(flow)) + min(ind) - 1
    else NULL
    )

    rm <- unique(unlist(rm.list)) # [1] 4 6
    df[-rm, ]

    # date flow
    # 1 1951-02-05 2.22
    # 2 1951-02-19 2.56
    # 3 1951-03-29 3.30
    # 5 1951-12-19 4.00
    # 7 1952-02-22 1.32
    # 8 1952-03-01 1.26
    # 9 1953-12-20 6.00
    # 10 1954-04-02 35.69





    share|improve this answer
























      Your Answer






      StackExchange.ifUsing("editor", function ()
      StackExchange.using("externalEditor", function ()
      StackExchange.using("snippets", function ()
      StackExchange.snippets.init();
      );
      );
      , "code-snippets");

      StackExchange.ready(function()
      var channelOptions =
      tags: "".split(" "),
      id: "1"
      ;
      initTagRenderer("".split(" "), "".split(" "), channelOptions);

      StackExchange.using("externalEditor", function()
      // Have to fire editor after snippets, if snippets enabled
      if (StackExchange.settings.snippets.snippetsEnabled)
      StackExchange.using("snippets", function()
      createEditor();
      );

      else
      createEditor();

      );

      function createEditor()
      StackExchange.prepareEditor(
      heartbeatType: 'answer',
      autoActivateHeartbeat: false,
      convertImagesToLinks: true,
      noModals: true,
      showLowRepImageUploadWarning: true,
      reputationToPostImages: 10,
      bindNavPrevention: true,
      postfix: "",
      imageUploader:
      brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
      contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
      allowUrls: true
      ,
      onDemand: true,
      discardSelector: ".discard-answer"
      ,immediatelyShowMarkdownHelp:true
      );



      );













      draft saved

      draft discarded


















      StackExchange.ready(
      function ()
      StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53295976%2ffind-the-maximum-in-a-certain-time-frame-in-a-non-continuous-time-series%23new-answer', 'question_page');

      );

      Post as a guest















      Required, but never shown

























      3 Answers
      3






      active

      oldest

      votes








      3 Answers
      3






      active

      oldest

      votes









      active

      oldest

      votes






      active

      oldest

      votes









      3














      Using the zoo library.



      library(zoo)

      # convert into a zoo time series
      dtf.zoo <- zoo(dt$flow, order.by=dt$date)

      # remove duplicate dates by keeping the maximum value
      dtf.zoo <- aggregate(dtf.zoo, time(dtf.zoo), max)

      # pad with NAs to make the time series regular
      dtf.zoo <- merge(
      dtf.zoo,
      zoo(, seq(min(index(dtf.zoo)), max(index(dtf.zoo)), "day"))
      )

      # find rows that are less than a value two days prior or hence
      rem <- which(dtf.zoo < rollapply(dtf.zoo, 5, max, na.rm=TRUE, partial=TRUE))

      # remove those rows
      dtf.zoo2 <- dtf.zoo[-rem]

      # remove NAs
      dt2 <- data.frame(flow=na.omit(dtf.zoo2))
      dt2
      # flow
      # 1951-02-05 2.22
      # 1951-02-19 2.56
      # 1951-03-29 3.30
      # 1951-12-19 4.00
      # 1952-02-22 1.32
      # 1952-03-01 1.26
      # 1953-12-20 6.00
      # 1954-04-02 35.69

      which(!(dt$flow %in% dt2$flow))
      # 4 6





      share|improve this answer

























      • hey, if I don't misunderstand, the OP wants to remove line 4 and 6, but you remove line 4 and 5.

        – Darren Tsai
        Nov 14 '18 at 10:56











      • @DarrenTsai: Indeed, thank you. I forgot to set partial=TRUE in rollapply(), which caused an index mismatch.

        – AkselA
        Nov 14 '18 at 11:28












      • @AkselA: I added this condition: if (length(rem>0))df.zoo <- df.zoo[-rem] to escape cases that don't have events that are less than 3 days apart

        – asher
        Nov 14 '18 at 13:26











      • @asher: That will work. Removing which() and replacing -rem with !rem should also work.

        – AkselA
        Nov 14 '18 at 13:41











      • @AkselA: makes sense. Thanks for the ongoing exchange it is very helpful.

        – asher
        Nov 15 '18 at 11:40















      3














      Using the zoo library.



      library(zoo)

      # convert into a zoo time series
      dtf.zoo <- zoo(dt$flow, order.by=dt$date)

      # remove duplicate dates by keeping the maximum value
      dtf.zoo <- aggregate(dtf.zoo, time(dtf.zoo), max)

      # pad with NAs to make the time series regular
      dtf.zoo <- merge(
      dtf.zoo,
      zoo(, seq(min(index(dtf.zoo)), max(index(dtf.zoo)), "day"))
      )

      # find rows that are less than a value two days prior or hence
      rem <- which(dtf.zoo < rollapply(dtf.zoo, 5, max, na.rm=TRUE, partial=TRUE))

      # remove those rows
      dtf.zoo2 <- dtf.zoo[-rem]

      # remove NAs
      dt2 <- data.frame(flow=na.omit(dtf.zoo2))
      dt2
      # flow
      # 1951-02-05 2.22
      # 1951-02-19 2.56
      # 1951-03-29 3.30
      # 1951-12-19 4.00
      # 1952-02-22 1.32
      # 1952-03-01 1.26
      # 1953-12-20 6.00
      # 1954-04-02 35.69

      which(!(dt$flow %in% dt2$flow))
      # 4 6





      share|improve this answer

























      • hey, if I don't misunderstand, the OP wants to remove line 4 and 6, but you remove line 4 and 5.

        – Darren Tsai
        Nov 14 '18 at 10:56











      • @DarrenTsai: Indeed, thank you. I forgot to set partial=TRUE in rollapply(), which caused an index mismatch.

        – AkselA
        Nov 14 '18 at 11:28












      • @AkselA: I added this condition: if (length(rem>0))df.zoo <- df.zoo[-rem] to escape cases that don't have events that are less than 3 days apart

        – asher
        Nov 14 '18 at 13:26











      • @asher: That will work. Removing which() and replacing -rem with !rem should also work.

        – AkselA
        Nov 14 '18 at 13:41











      • @AkselA: makes sense. Thanks for the ongoing exchange it is very helpful.

        – asher
        Nov 15 '18 at 11:40













      3












      3








      3







      Using the zoo library.



      library(zoo)

      # convert into a zoo time series
      dtf.zoo <- zoo(dt$flow, order.by=dt$date)

      # remove duplicate dates by keeping the maximum value
      dtf.zoo <- aggregate(dtf.zoo, time(dtf.zoo), max)

      # pad with NAs to make the time series regular
      dtf.zoo <- merge(
      dtf.zoo,
      zoo(, seq(min(index(dtf.zoo)), max(index(dtf.zoo)), "day"))
      )

      # find rows that are less than a value two days prior or hence
      rem <- which(dtf.zoo < rollapply(dtf.zoo, 5, max, na.rm=TRUE, partial=TRUE))

      # remove those rows
      dtf.zoo2 <- dtf.zoo[-rem]

      # remove NAs
      dt2 <- data.frame(flow=na.omit(dtf.zoo2))
      dt2
      # flow
      # 1951-02-05 2.22
      # 1951-02-19 2.56
      # 1951-03-29 3.30
      # 1951-12-19 4.00
      # 1952-02-22 1.32
      # 1952-03-01 1.26
      # 1953-12-20 6.00
      # 1954-04-02 35.69

      which(!(dt$flow %in% dt2$flow))
      # 4 6





      share|improve this answer















      Using the zoo library.



      library(zoo)

      # convert into a zoo time series
      dtf.zoo <- zoo(dt$flow, order.by=dt$date)

      # remove duplicate dates by keeping the maximum value
      dtf.zoo <- aggregate(dtf.zoo, time(dtf.zoo), max)

      # pad with NAs to make the time series regular
      dtf.zoo <- merge(
      dtf.zoo,
      zoo(, seq(min(index(dtf.zoo)), max(index(dtf.zoo)), "day"))
      )

      # find rows that are less than a value two days prior or hence
      rem <- which(dtf.zoo < rollapply(dtf.zoo, 5, max, na.rm=TRUE, partial=TRUE))

      # remove those rows
      dtf.zoo2 <- dtf.zoo[-rem]

      # remove NAs
      dt2 <- data.frame(flow=na.omit(dtf.zoo2))
      dt2
      # flow
      # 1951-02-05 2.22
      # 1951-02-19 2.56
      # 1951-03-29 3.30
      # 1951-12-19 4.00
      # 1952-02-22 1.32
      # 1952-03-01 1.26
      # 1953-12-20 6.00
      # 1954-04-02 35.69

      which(!(dt$flow %in% dt2$flow))
      # 4 6






      share|improve this answer














      share|improve this answer



      share|improve this answer








      edited Nov 14 '18 at 11:24

























      answered Nov 14 '18 at 9:55









      AkselAAkselA

      4,49121325




      4,49121325












      • hey, if I don't misunderstand, the OP wants to remove line 4 and 6, but you remove line 4 and 5.

        – Darren Tsai
        Nov 14 '18 at 10:56











      • @DarrenTsai: Indeed, thank you. I forgot to set partial=TRUE in rollapply(), which caused an index mismatch.

        – AkselA
        Nov 14 '18 at 11:28












      • @AkselA: I added this condition: if (length(rem>0))df.zoo <- df.zoo[-rem] to escape cases that don't have events that are less than 3 days apart

        – asher
        Nov 14 '18 at 13:26











      • @asher: That will work. Removing which() and replacing -rem with !rem should also work.

        – AkselA
        Nov 14 '18 at 13:41











      • @AkselA: makes sense. Thanks for the ongoing exchange it is very helpful.

        – asher
        Nov 15 '18 at 11:40

















      • hey, if I don't misunderstand, the OP wants to remove line 4 and 6, but you remove line 4 and 5.

        – Darren Tsai
        Nov 14 '18 at 10:56











      • @DarrenTsai: Indeed, thank you. I forgot to set partial=TRUE in rollapply(), which caused an index mismatch.

        – AkselA
        Nov 14 '18 at 11:28












      • @AkselA: I added this condition: if (length(rem>0))df.zoo <- df.zoo[-rem] to escape cases that don't have events that are less than 3 days apart

        – asher
        Nov 14 '18 at 13:26











      • @asher: That will work. Removing which() and replacing -rem with !rem should also work.

        – AkselA
        Nov 14 '18 at 13:41











      • @AkselA: makes sense. Thanks for the ongoing exchange it is very helpful.

        – asher
        Nov 15 '18 at 11:40
















      hey, if I don't misunderstand, the OP wants to remove line 4 and 6, but you remove line 4 and 5.

      – Darren Tsai
      Nov 14 '18 at 10:56





      hey, if I don't misunderstand, the OP wants to remove line 4 and 6, but you remove line 4 and 5.

      – Darren Tsai
      Nov 14 '18 at 10:56













      @DarrenTsai: Indeed, thank you. I forgot to set partial=TRUE in rollapply(), which caused an index mismatch.

      – AkselA
      Nov 14 '18 at 11:28






      @DarrenTsai: Indeed, thank you. I forgot to set partial=TRUE in rollapply(), which caused an index mismatch.

      – AkselA
      Nov 14 '18 at 11:28














      @AkselA: I added this condition: if (length(rem>0))df.zoo <- df.zoo[-rem] to escape cases that don't have events that are less than 3 days apart

      – asher
      Nov 14 '18 at 13:26





      @AkselA: I added this condition: if (length(rem>0))df.zoo <- df.zoo[-rem] to escape cases that don't have events that are less than 3 days apart

      – asher
      Nov 14 '18 at 13:26













      @asher: That will work. Removing which() and replacing -rem with !rem should also work.

      – AkselA
      Nov 14 '18 at 13:41





      @asher: That will work. Removing which() and replacing -rem with !rem should also work.

      – AkselA
      Nov 14 '18 at 13:41













      @AkselA: makes sense. Thanks for the ongoing exchange it is very helpful.

      – asher
      Nov 15 '18 at 11:40





      @AkselA: makes sense. Thanks for the ongoing exchange it is very helpful.

      – asher
      Nov 15 '18 at 11:40













      2














      You can also use the tidyverse approch:



      require(tidyverse)

      df %>%
      #Arrange by date
      arrange(date) %>%
      #Picking the max for each da
      group_by(date) %>%
      top_n(1, flow) %>%
      ungroup() %>%
      #Adding missing dates with NAs
      complete(date = seq.Date(min(date), max(date), by="day")) %>%
      #Remove Two up/down
      mutate(
      remove = case_when(
      flow < rowMeans(data.frame(lag(flow, 1),
      lag(flow, 2)), na.rm = TRUE) ~ "remove",
      flow < rowMeans(data.frame(lead(flow, 1),
      lead(flow, 2)), na.rm = TRUE) ~ "remove",
      TRUE ~ "keep")) %>%
      na.omit() %>%
      filter(remove == "keep") %>%
      select(-remove)


      # A tibble: 8 x 2
      date flow
      <date> <dbl>
      1 1951-02-05 2.22
      2 1951-02-19 2.56
      3 1951-03-29 3.30
      4 1951-12-19 4.00
      5 1952-02-22 1.32
      6 1952-03-01 1.26
      7 1953-12-20 6.00
      8 1954-04-02 35.7





      share|improve this answer





























        2














        You can also use the tidyverse approch:



        require(tidyverse)

        df %>%
        #Arrange by date
        arrange(date) %>%
        #Picking the max for each da
        group_by(date) %>%
        top_n(1, flow) %>%
        ungroup() %>%
        #Adding missing dates with NAs
        complete(date = seq.Date(min(date), max(date), by="day")) %>%
        #Remove Two up/down
        mutate(
        remove = case_when(
        flow < rowMeans(data.frame(lag(flow, 1),
        lag(flow, 2)), na.rm = TRUE) ~ "remove",
        flow < rowMeans(data.frame(lead(flow, 1),
        lead(flow, 2)), na.rm = TRUE) ~ "remove",
        TRUE ~ "keep")) %>%
        na.omit() %>%
        filter(remove == "keep") %>%
        select(-remove)


        # A tibble: 8 x 2
        date flow
        <date> <dbl>
        1 1951-02-05 2.22
        2 1951-02-19 2.56
        3 1951-03-29 3.30
        4 1951-12-19 4.00
        5 1952-02-22 1.32
        6 1952-03-01 1.26
        7 1953-12-20 6.00
        8 1954-04-02 35.7





        share|improve this answer



























          2












          2








          2







          You can also use the tidyverse approch:



          require(tidyverse)

          df %>%
          #Arrange by date
          arrange(date) %>%
          #Picking the max for each da
          group_by(date) %>%
          top_n(1, flow) %>%
          ungroup() %>%
          #Adding missing dates with NAs
          complete(date = seq.Date(min(date), max(date), by="day")) %>%
          #Remove Two up/down
          mutate(
          remove = case_when(
          flow < rowMeans(data.frame(lag(flow, 1),
          lag(flow, 2)), na.rm = TRUE) ~ "remove",
          flow < rowMeans(data.frame(lead(flow, 1),
          lead(flow, 2)), na.rm = TRUE) ~ "remove",
          TRUE ~ "keep")) %>%
          na.omit() %>%
          filter(remove == "keep") %>%
          select(-remove)


          # A tibble: 8 x 2
          date flow
          <date> <dbl>
          1 1951-02-05 2.22
          2 1951-02-19 2.56
          3 1951-03-29 3.30
          4 1951-12-19 4.00
          5 1952-02-22 1.32
          6 1952-03-01 1.26
          7 1953-12-20 6.00
          8 1954-04-02 35.7





          share|improve this answer















          You can also use the tidyverse approch:



          require(tidyverse)

          df %>%
          #Arrange by date
          arrange(date) %>%
          #Picking the max for each da
          group_by(date) %>%
          top_n(1, flow) %>%
          ungroup() %>%
          #Adding missing dates with NAs
          complete(date = seq.Date(min(date), max(date), by="day")) %>%
          #Remove Two up/down
          mutate(
          remove = case_when(
          flow < rowMeans(data.frame(lag(flow, 1),
          lag(flow, 2)), na.rm = TRUE) ~ "remove",
          flow < rowMeans(data.frame(lead(flow, 1),
          lead(flow, 2)), na.rm = TRUE) ~ "remove",
          TRUE ~ "keep")) %>%
          na.omit() %>%
          filter(remove == "keep") %>%
          select(-remove)


          # A tibble: 8 x 2
          date flow
          <date> <dbl>
          1 1951-02-05 2.22
          2 1951-02-19 2.56
          3 1951-03-29 3.30
          4 1951-12-19 4.00
          5 1952-02-22 1.32
          6 1952-03-01 1.26
          7 1953-12-20 6.00
          8 1954-04-02 35.7






          share|improve this answer














          share|improve this answer



          share|improve this answer








          edited Nov 14 '18 at 10:41

























          answered Nov 14 '18 at 9:16









          DJVDJV

          1,3761317




          1,3761317





















              2














              I use lapply() to check the range : [date - 2 days , date + 2 days] of each date.



              rm.list <- lapply(df$date, function(x) 
              ind <- which(abs(df$date - x) <= 2)
              flow <- df$flow[ind]
              if(length(ind) > 1) which(flow < max(flow)) + min(ind) - 1
              else NULL
              )

              rm <- unique(unlist(rm.list)) # [1] 4 6
              df[-rm, ]

              # date flow
              # 1 1951-02-05 2.22
              # 2 1951-02-19 2.56
              # 3 1951-03-29 3.30
              # 5 1951-12-19 4.00
              # 7 1952-02-22 1.32
              # 8 1952-03-01 1.26
              # 9 1953-12-20 6.00
              # 10 1954-04-02 35.69





              share|improve this answer





























                2














                I use lapply() to check the range : [date - 2 days , date + 2 days] of each date.



                rm.list <- lapply(df$date, function(x) 
                ind <- which(abs(df$date - x) <= 2)
                flow <- df$flow[ind]
                if(length(ind) > 1) which(flow < max(flow)) + min(ind) - 1
                else NULL
                )

                rm <- unique(unlist(rm.list)) # [1] 4 6
                df[-rm, ]

                # date flow
                # 1 1951-02-05 2.22
                # 2 1951-02-19 2.56
                # 3 1951-03-29 3.30
                # 5 1951-12-19 4.00
                # 7 1952-02-22 1.32
                # 8 1952-03-01 1.26
                # 9 1953-12-20 6.00
                # 10 1954-04-02 35.69





                share|improve this answer



























                  2












                  2








                  2







                  I use lapply() to check the range : [date - 2 days , date + 2 days] of each date.



                  rm.list <- lapply(df$date, function(x) 
                  ind <- which(abs(df$date - x) <= 2)
                  flow <- df$flow[ind]
                  if(length(ind) > 1) which(flow < max(flow)) + min(ind) - 1
                  else NULL
                  )

                  rm <- unique(unlist(rm.list)) # [1] 4 6
                  df[-rm, ]

                  # date flow
                  # 1 1951-02-05 2.22
                  # 2 1951-02-19 2.56
                  # 3 1951-03-29 3.30
                  # 5 1951-12-19 4.00
                  # 7 1952-02-22 1.32
                  # 8 1952-03-01 1.26
                  # 9 1953-12-20 6.00
                  # 10 1954-04-02 35.69





                  share|improve this answer















                  I use lapply() to check the range : [date - 2 days , date + 2 days] of each date.



                  rm.list <- lapply(df$date, function(x) 
                  ind <- which(abs(df$date - x) <= 2)
                  flow <- df$flow[ind]
                  if(length(ind) > 1) which(flow < max(flow)) + min(ind) - 1
                  else NULL
                  )

                  rm <- unique(unlist(rm.list)) # [1] 4 6
                  df[-rm, ]

                  # date flow
                  # 1 1951-02-05 2.22
                  # 2 1951-02-19 2.56
                  # 3 1951-03-29 3.30
                  # 5 1951-12-19 4.00
                  # 7 1952-02-22 1.32
                  # 8 1952-03-01 1.26
                  # 9 1953-12-20 6.00
                  # 10 1954-04-02 35.69






                  share|improve this answer














                  share|improve this answer



                  share|improve this answer








                  edited Nov 14 '18 at 10:49

























                  answered Nov 14 '18 at 10:37









                  Darren TsaiDarren Tsai

                  2,3102428




                  2,3102428



























                      draft saved

                      draft discarded
















































                      Thanks for contributing an answer to Stack Overflow!


                      • Please be sure to answer the question. Provide details and share your research!

                      But avoid


                      • Asking for help, clarification, or responding to other answers.

                      • Making statements based on opinion; back them up with references or personal experience.

                      To learn more, see our tips on writing great answers.




                      draft saved


                      draft discarded














                      StackExchange.ready(
                      function ()
                      StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53295976%2ffind-the-maximum-in-a-certain-time-frame-in-a-non-continuous-time-series%23new-answer', 'question_page');

                      );

                      Post as a guest















                      Required, but never shown





















































                      Required, but never shown














                      Required, but never shown












                      Required, but never shown







                      Required, but never shown

































                      Required, but never shown














                      Required, but never shown












                      Required, but never shown







                      Required, but never shown







                      這個網誌中的熱門文章

                      How to read a connectionString WITH PROVIDER in .NET Core?

                      Node.js Script on GitHub Pages or Amazon S3

                      Museum of Modern and Contemporary Art of Trento and Rovereto