How to take mean from month April of previous year to July in R?

























Month Year Rainfall
4 2010
5 2010
6 2010
7 2010
8 2010
9 2010
10 2010
11 2010
12 2010
1 2011
2 2011
3 2011
4 2011
5 2011
6 2011
7 2011


I want the average rainfall from April (month 4) of 2010 through July (month 7) of 2011, and then the average from April 2011 through July 2012, and so on for each following year.



I tried the code below, but it only works for the first window. Can anyone help me with the second one?



## The code
# Apr from previous year to July
subdataLGSP <- subset(df2.ppt.mon, Year %in% 2010:2016 & month %in% 4:12)
# Apr from previous year to next July
Subdatanext <- subset(df2.ppt.mon, Year %in% 2011:2016 & month %in% 1:7)

subdataprnext <- rbind(subdataLGSP, Subdatanext)

df2prnext <- aggregate(subdataprnext$RAIN,
                       by = list(month = subdataprnext$month, Year = subdataprnext$Year),
                       mean)

library(data.table)
setDT(df2prnext)
n <- 16 # every 16 rows
# This is what we want for seasonal precipitation
datPRApOct <- df2prnext[, mean(x), by = (seq(nrow(df2prnext)) - 1) %/% n]

































  • Welcome to SO! Just to clarify: by 7, do you mean July?

    – Jrakru56
    Nov 15 '18 at 17:26











  • Yes, 7 means the month of July. Thank you.

    – Sonisa Sharma
    Nov 15 '18 at 17:27











  • If one of the answers addresses your question, please accept it; doing so not only provides a little perk to the answerer with some points, but also provides some closure for readers with similar questions. Though you can only accept one answer, you have the option to up-vote as many as you think are helpful. (If there are still issues, you will likely need to edit your question with further details.)

    – r2evans
    Nov 15 '18 at 19:53















r mean














edited Nov 15 '18 at 17:25 by abhiieor

asked Nov 15 '18 at 17:16 by Sonisa Sharma
























2 Answers














Something like this would work:



One line creates the grouping; the rest is standard R:



df$gp<- sapply(1:nrow(df), function(x) x%/%12)



All together we have:





library(dplyr)

df <- structure(list(Month = c(4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L,
1L, 2L, 3L, 4L, 5L, 6L, 7L), Year = c(2010L, 2010L, 2010L, 2010L,
2010L, 2010L, 2010L, 2010L, 2010L, 2011L, 2011L, 2011L, 2011L,
2011L, 2011L, 2011L), Rainfall = c(3L, 4L, 5L, 3L, 4L, 5L, 6L,
7L, 8L, 4L, 3L, 4L, 5L, 6L, 5L, 4L)), row.names = c(NA, -16L), class = c("data.table",
"data.frame"))

df
#> Month Year Rainfall
#> 1 4 2010 3
#> 2 5 2010 4
#> 3 6 2010 5
#> 4 7 2010 3
#> 5 8 2010 4
#> 6 9 2010 5
#> 7 10 2010 6
#> 8 11 2010 7
#> 9 12 2010 8
#> 10 1 2011 4
#> 11 2 2011 3
#> 12 3 2011 4
#> 13 4 2011 5
#> 14 5 2011 6
#> 15 6 2011 5
#> 16 7 2011 4

df$gp<- sapply(1:nrow(df), function(x) x%/%12)

df
#> Month Year Rainfall gp
#> 1 4 2010 3 0
#> 2 5 2010 4 0
#> 3 6 2010 5 0
#> 4 7 2010 3 0
#> 5 8 2010 4 0
#> 6 9 2010 5 0
#> 7 10 2010 6 0
#> 8 11 2010 7 0
#> 9 12 2010 8 0
#> 10 1 2011 4 0
#> 11 2 2011 3 0
#> 12 3 2011 4 1
#> 13 4 2011 5 1
#> 14 5 2011 6 1
#> 15 6 2011 5 1
#> 16 7 2011 4 1

df %>% group_by(gp) %>% summarise(mean(Rainfall))
#> # A tibble: 2 x 2
#> gp `mean(Rainfall)`
#> <dbl> <dbl>
#> 1 0 4.73
#> 2 1 4.8


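There are arguably better ways to deal with this windowing problem, such as using the lubridate package or converting to a ts object. Note also that x %/% 12 on 1-based row numbers puts rows 1 to 11 in group 0 and starts group 1 at row 12, as the output above shows; use (seq_len(nrow(df)) - 1) %/% 12 if each group should hold exactly 12 rows.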
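
As a rough sketch of the ts route, assuming a complete monthly series with no gaps that starts in January 2010 (the `rain` values below are made up purely for illustration), `window()` from base R's stats package can cut out each April-to-July span, including the overlap between consecutive windows:

```r
# Hypothetical complete monthly series, Jan 2010 to Dec 2013 (made-up values)
set.seed(1)
rain <- ts(runif(48), start = c(2010, 1), frequency = 12)

# Starting year of each window: every year except the last in the series
first_years <- 2010:2012

# Mean of April (month 4) of year yr through July (month 7) of year yr + 1
window_means <- sapply(first_years, function(yr) {
  mean(window(rain, start = c(yr, 4), end = c(yr + 1, 7)))
})
names(window_means) <- first_years
window_means
```

Each window holds 16 months, and April to July of the middle years contribute to two windows, which is what the question asks for.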































  • Thank you so much for your code; it helped me a lot. But I am trying to get the mean of the first 16 rows, and the next mean should then run from the 13th row to the 32nd row.

    – Sonisa Sharma
    Nov 15 '18 at 19:50
































Using my own fabricated data (below), here's a solution:



sapply(seq(min(x$Year), max(x$Year)-1), function(yr) {
  mean(subset(x, (Year == yr & Month >= 4) | (Year == yr+1 & Month <= 7))$Rainfall)
})
# [1] 0.5421714 0.4412616 0.4867803


(for 2010, 2011, and 2012, respectively).



This does not strictly check that every month (including 4 and 7) is present in each range; that's a different discussion.



For explanation:




  • seq(min(x$Year), max(x$Year)-1): iterate by year from the first to the second-to-last (assuming contiguous years);


  • (Year == yr & Month >= 4): include all data that is in this year and at or after month 4, or ...


  • | (Year == yr+1 & Month <= 7): next year and month at/before 7.

  • from there, simply take mean( subset(...)$Rainfall ).

The mid-step looks like this (with my data):



sapply(seq(min(x$Year), max(x$Year)-1), function(yr) {
  subset(x, (Year == yr & Month >= 4) | (Year == yr+1 & Month <= 7))
}, simplify=FALSE)
# [[1]]
# Month Year Rainfall
# 4 4 2010 0.1680519
# 5 5 2010 0.9438393
# 6 6 2010 0.9434750
# 7 7 2010 0.1291590
# 8 8 2010 0.8334488
# 9 9 2010 0.4680185
# 10 10 2010 0.5499837
# 11 11 2010 0.5526741
# 12 12 2010 0.2388948
# 13 1 2011 0.7605133
# 14 2 2011 0.1808201
# 15 3 2011 0.4052822
# 16 4 2011 0.8535485
# 17 5 2011 0.9763985
# 18 6 2011 0.2258255
# 19 7 2011 0.4448092
# [[2]]
# Month Year Rainfall
# 16 4 2011 0.85354845
# 17 5 2011 0.97639849
# 18 6 2011 0.22582546
# 19 7 2011 0.44480923
# 20 8 2011 0.07497942
# 21 9 2011 0.66189876
# 22 10 2011 0.38754954
# 23 11 2011 0.83688918
# 24 12 2011 0.15050144
# 25 1 2012 0.34727225
# 26 2 2012 0.48877323
# 27 3 2012 0.14924686
# 28 4 2012 0.35706259
# 29 5 2012 0.96264405
# 30 6 2012 0.13237200
# 31 7 2012 0.01041453
# [[3]]
# Month Year Rainfall
# 28 4 2012 0.35706259
# 29 5 2012 0.96264405
# 30 6 2012 0.13237200
# 31 7 2012 0.01041453
# 32 8 2012 0.16464224
# 33 9 2012 0.81019214
# 34 10 2012 0.86886104
# 35 11 2012 0.51428176
# 36 12 2012 0.62719629
# 37 1 2013 0.84442900
# 38 2 2013 0.28487057
# 39 3 2013 0.66722565
# 40 4 2013 0.15046975
# 41 5 2013 0.98172786
# 42 6 2013 0.29701074
# 43 7 2013 0.11508408



Data:



set.seed(2)
years <- 4
x <- data.frame(
Month = rep(1:12, times=years),
Year = rep(2009 + seq_len(years), each=12),
Rainfall = runif(12*years)
)
head(x)
# Month Year Rainfall
# 1 1 2010 0.1848823
# 2 2 2010 0.7023740
# 3 3 2010 0.5733263
# 4 4 2010 0.1680519
# 5 5 2010 0.9438393
# 6 6 2010 0.9434750






























  • It worked perfectly, thank you.

    – Sonisa Sharma
    Nov 15 '18 at 19:50











  • Month Year Rainfall
    4 2010 484.6
    5 2010 630.32
    6 2010 35.31
    7 2010 637.64
    8 2010 238.57
    9 2010 1129.35
    10 2010 376.78
    11 2010 282.78
    12 2010 324.58
    1 2011 338.6
    2 2011 859.37
    3 2011 66.24
    4 2011 38.36
    We should get 418 as the average value, but I am getting 369.12.

    – Sonisa Sharma
    Nov 15 '18 at 20:20












  • That's an incomplete dataset: it does not span from 4/2010 to 7/2011. However, when I use that data, I get 418.6538. For future discussions, data like that does poorly in comments; please edit your question and put it there in an easily consumed format, such as the output from dput(x), dput(head(x,n=?)) (top ? rows if large), data.frame(...), or read.table(text="...", ...).

    – r2evans
    Nov 15 '18 at 21:13











  • If you mean that the missing months should count as 0, then ... you need to provide a usable example in your question that states that and includes some missingness in the data.

    – r2evans
    Nov 15 '18 at 21:14











  • Is there a way that I can attach the data?

    – Sonisa Sharma
    Nov 15 '18 at 21:33
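
To illustrate the `read.table(text = ...)` and `dput()` suggestions from the comments, here is a small sketch using the 13 rows posted in the comment above (the object name `d` is an arbitrary choice). The plain mean of these values agrees with the 418.6538 figure quoted in the reply.

```r
# Data copied from the comment above, read from an inline string
d <- read.table(header = TRUE, text = "
Month Year Rainfall
4 2010 484.6
5 2010 630.32
6 2010 35.31
7 2010 637.64
8 2010 238.57
9 2010 1129.35
10 2010 376.78
11 2010 282.78
12 2010 324.58
1 2011 338.6
2 2011 859.37
3 2011 66.24
4 2011 38.36
")

# Mean over the 13 available months (April 2010 to April 2011)
mean(d$Rainfall)

# dput() prints a copy-pasteable definition that can be pasted into a question
dput(d)
```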










First answer (the gp grouping approach): answered Nov 15 '18 at 17:49 by Jrakru56, edited Nov 15 '18 at 17:56












  • Thank you so much for your code. Your code helped me a lot. But I am trying to get the mean of all the 16th row and the next set of the mean will be from 13th row to 32th row.

    – Sonisa Sharma
    Nov 15 '18 at 19:50

















  • Thank you so much for your code. Your code helped me a lot. But I am trying to get the mean of all the 16th row and the next set of the mean will be from 13th row to 32th row.

    – Sonisa Sharma
    Nov 15 '18 at 19:50
















Thank you so much for your code. Your code helped me a lot. But I am trying to get the mean of all the 16th row and the next set of the mean will be from 13th row to 32th row.

– Sonisa Sharma
Nov 15 '18 at 19:50





Thank you so much for your code. Your code helped me a lot. But I am trying to get the mean of all the 16th row and the next set of the mean will be from 13th row to 32th row.

– Sonisa Sharma
Nov 15 '18 at 19:50













0














Using my own fabricated data (below), here's a solution:



sapply(years, function(yr) (Year == yr+1 & Month <= 7))$Rainfall)
)
# [1] 0.5421714 0.4412616 0.4867803


(for 2010, 2011, and 2012, respectively).



This does not strictly check to ensure we have all months (including 4 and 7) in each range, that's a different discussion.



For explanation:




  • seq(min(x$Year), max(x$Year)-1): iterate by year from the first to the second-to-last (assuming contiguous years);


  • (Year == yr & Month >= 4): include all data that is in this year and at or after month 4, or ...


  • | (Year == yr+1 & Month <= 7): next year and month at/before 7.

  • from there, simply sum( subset(...)$Rainfall )

The mid-step looks like this (with my data):



sapply(seq(min(x$Year), max(x$Year)-1), function(yr) 
subset(x, (Year == yr & Month >= 4) , simplify=F)
# [[1]]
# Month Year Rainfall
# 4 4 2010 0.1680519
# 5 5 2010 0.9438393
# 6 6 2010 0.9434750
# 7 7 2010 0.1291590
# 8 8 2010 0.8334488
# 9 9 2010 0.4680185
# 10 10 2010 0.5499837
# 11 11 2010 0.5526741
# 12 12 2010 0.2388948
# 13 1 2011 0.7605133
# 14 2 2011 0.1808201
# 15 3 2011 0.4052822
# 16 4 2011 0.8535485
# 17 5 2011 0.9763985
# 18 6 2011 0.2258255
# 19 7 2011 0.4448092
# [[2]]
# Month Year Rainfall
# 16 4 2011 0.85354845
# 17 5 2011 0.97639849
# 18 6 2011 0.22582546
# 19 7 2011 0.44480923
# 20 8 2011 0.07497942
# 21 9 2011 0.66189876
# 22 10 2011 0.38754954
# 23 11 2011 0.83688918
# 24 12 2011 0.15050144
# 25 1 2012 0.34727225
# 26 2 2012 0.48877323
# 27 3 2012 0.14924686
# 28 4 2012 0.35706259
# 29 5 2012 0.96264405
# 30 6 2012 0.13237200
# 31 7 2012 0.01041453
# [[3]]
# Month Year Rainfall
# 28 4 2012 0.35706259
# 29 5 2012 0.96264405
# 30 6 2012 0.13237200
# 31 7 2012 0.01041453
# 32 8 2012 0.16464224
# 33 9 2012 0.81019214
# 34 10 2012 0.86886104
# 35 11 2012 0.51428176
# 36 12 2012 0.62719629
# 37 1 2013 0.84442900
# 38 2 2013 0.28487057
# 39 3 2013 0.66722565
# 40 4 2013 0.15046975
# 41 5 2013 0.98172786
# 42 6 2013 0.29701074
# 43 7 2013 0.11508408



Data:



set.seed(2)
years <- 4
x <- data.frame(
Month = rep(1:12, times=years),
Year = rep(2009 + seq_len(years), each=12),
Rainfall = runif(12*years)
)
head(x)
# Month Year Rainfall
# 1 1 2010 0.1848823
# 2 2 2010 0.7023740
# 3 3 2010 0.5733263
# 4 4 2010 0.1680519
# 5 5 2010 0.9438393
# 6 6 2010 0.9434750





share|improve this answer

























  • It perfectly worked thank you.

    – Sonisa Sharma
    Nov 15 '18 at 19:50











  • Month Year Rainfall 4 2010 484.6 5 2010 630.32 6 2010 35.31 7 2010 637.64 8 2010 238.57 9 2010 1129.35 10 2010 376.78 11 2010 282.78 12 2010 324.58 1 2011 338.6 2 2011 859.37 3 2011 66.24 4 2011 38.36 We should get 418 as average value but I am getting 369.12

    – Sonisa Sharma
    Nov 15 '18 at 20:20












  • That's an incomplete dataset, it does not span from 4/2010 to 7/2011. However, when I use that data, I get 418.6538. For future discussions, data like that does poorly in comments, please edit your question and put it there in an easily consumed format, such as the output from dput(x), dput(head(x,n=?)) (top ? rows if large), data.frame(...), or read.table(text="...", ...).

    – r2evans
    Nov 15 '18 at 21:13











  • If you mean that the missing months should count as 0, then ... you need to provide a usable example in your question that states that and includes some missingness in the data.

    – r2evans
    Nov 15 '18 at 21:14











  • Is there a way that I can attach the data?

    – Sonisa Sharma
    Nov 15 '18 at 21:33















0














Using my own fabricated data (below), here's a solution:



sapply(years, function(yr) (Year == yr+1 & Month <= 7))$Rainfall)
)
# [1] 0.5421714 0.4412616 0.4867803


(for 2010, 2011, and 2012, respectively).



This does not strictly check to ensure we have all months (including 4 and 7) in each range, that's a different discussion.



For explanation:




  • seq(min(x$Year), max(x$Year)-1): iterate by year from the first to the second-to-last (assuming contiguous years);


  • (Year == yr & Month >= 4): include all data that is in this year and at or after month 4, or ...


  • | (Year == yr+1 & Month <= 7): next year and month at/before 7.

  • from there, simply sum( subset(...)$Rainfall )

The mid-step looks like this (with my data):



sapply(seq(min(x$Year), max(x$Year)-1), function(yr) 
subset(x, (Year == yr & Month >= 4) , simplify=F)
# [[1]]
# Month Year Rainfall
# 4 4 2010 0.1680519
# 5 5 2010 0.9438393
# 6 6 2010 0.9434750
# 7 7 2010 0.1291590
# 8 8 2010 0.8334488
# 9 9 2010 0.4680185
# 10 10 2010 0.5499837
# 11 11 2010 0.5526741
# 12 12 2010 0.2388948
# 13 1 2011 0.7605133
# 14 2 2011 0.1808201
# 15 3 2011 0.4052822
# 16 4 2011 0.8535485
# 17 5 2011 0.9763985
# 18 6 2011 0.2258255
# 19 7 2011 0.4448092
# [[2]]
# Month Year Rainfall
# 16 4 2011 0.85354845
# 17 5 2011 0.97639849
# 18 6 2011 0.22582546
# 19 7 2011 0.44480923
# 20 8 2011 0.07497942
# 21 9 2011 0.66189876
# 22 10 2011 0.38754954
# 23 11 2011 0.83688918
# 24 12 2011 0.15050144
# 25 1 2012 0.34727225
# 26 2 2012 0.48877323
# 27 3 2012 0.14924686
# 28 4 2012 0.35706259
# 29 5 2012 0.96264405
# 30 6 2012 0.13237200
# 31 7 2012 0.01041453
# [[3]]
# Month Year Rainfall
# 28 4 2012 0.35706259
# 29 5 2012 0.96264405
# 30 6 2012 0.13237200
# 31 7 2012 0.01041453
# 32 8 2012 0.16464224
# 33 9 2012 0.81019214
# 34 10 2012 0.86886104
# 35 11 2012 0.51428176
# 36 12 2012 0.62719629
# 37 1 2013 0.84442900
# 38 2 2013 0.28487057
# 39 3 2013 0.66722565
# 40 4 2013 0.15046975
# 41 5 2013 0.98172786
# 42 6 2013 0.29701074
# 43 7 2013 0.11508408



Data:



set.seed(2)
years <- 4
x <- data.frame(
Month = rep(1:12, times=years),
Year = rep(2009 + seq_len(years), each=12),
Rainfall = runif(12*years)
)
head(x)
# Month Year Rainfall
# 1 1 2010 0.1848823
# 2 2 2010 0.7023740
# 3 3 2010 0.5733263
# 4 4 2010 0.1680519
# 5 5 2010 0.9438393
# 6 6 2010 0.9434750





share|improve this answer

























  • It perfectly worked thank you.

    – Sonisa Sharma
    Nov 15 '18 at 19:50











  • Month Year Rainfall 4 2010 484.6 5 2010 630.32 6 2010 35.31 7 2010 637.64 8 2010 238.57 9 2010 1129.35 10 2010 376.78 11 2010 282.78 12 2010 324.58 1 2011 338.6 2 2011 859.37 3 2011 66.24 4 2011 38.36 We should get 418 as average value but I am getting 369.12

    – Sonisa Sharma
    Nov 15 '18 at 20:20












  • That's an incomplete dataset, it does not span from 4/2010 to 7/2011. However, when I use that data, I get 418.6538. For future discussions, data like that does poorly in comments, please edit your question and put it there in an easily consumed format, such as the output from dput(x), dput(head(x,n=?)) (top ? rows if large), data.frame(...), or read.table(text="...", ...).

    – r2evans
    Nov 15 '18 at 21:13











  • If you mean that the missing months should count as 0, then ... you need to provide a usable example in your question that states that and includes some missingness in the data.

    – r2evans
    Nov 15 '18 at 21:14











  • Is there a way that I can attach the data?

    – Sonisa Sharma
    Nov 15 '18 at 21:33













0












0








0







Using my own fabricated data (below), here's a solution:



sapply(years, function(yr) (Year == yr+1 & Month <= 7))$Rainfall)
)
# [1] 0.5421714 0.4412616 0.4867803


(for 2010, 2011, and 2012, respectively).



This does not strictly check to ensure we have all months (including 4 and 7) in each range, that's a different discussion.



For explanation:




  • seq(min(x$Year), max(x$Year)-1): iterate by year from the first to the second-to-last (assuming contiguous years);


  • (Year == yr & Month >= 4): include all data that is in this year and at or after month 4, or ...


  • | (Year == yr+1 & Month <= 7): next year and month at/before 7.

  • from there, simply sum( subset(...)$Rainfall )

The mid-step looks like this (with my data):



sapply(seq(min(x$Year), max(x$Year)-1), function(yr) 
subset(x, (Year == yr & Month >= 4) , simplify=F)
# [[1]]
# Month Year Rainfall
# 4 4 2010 0.1680519
# 5 5 2010 0.9438393
# 6 6 2010 0.9434750
# 7 7 2010 0.1291590
# 8 8 2010 0.8334488
# 9 9 2010 0.4680185
# 10 10 2010 0.5499837
# 11 11 2010 0.5526741
# 12 12 2010 0.2388948
# 13 1 2011 0.7605133
# 14 2 2011 0.1808201
# 15 3 2011 0.4052822
# 16 4 2011 0.8535485
# 17 5 2011 0.9763985
# 18 6 2011 0.2258255
# 19 7 2011 0.4448092
# [[2]]
# Month Year Rainfall
# 16 4 2011 0.85354845
# 17 5 2011 0.97639849
# 18 6 2011 0.22582546
# 19 7 2011 0.44480923
# 20 8 2011 0.07497942
# 21 9 2011 0.66189876
# 22 10 2011 0.38754954
# 23 11 2011 0.83688918
# 24 12 2011 0.15050144
# 25 1 2012 0.34727225
# 26 2 2012 0.48877323
# 27 3 2012 0.14924686
# 28 4 2012 0.35706259
# 29 5 2012 0.96264405
# 30 6 2012 0.13237200
# 31 7 2012 0.01041453
# [[3]]
# Month Year Rainfall
# 28 4 2012 0.35706259
# 29 5 2012 0.96264405
# 30 6 2012 0.13237200
# 31 7 2012 0.01041453
# 32 8 2012 0.16464224
# 33 9 2012 0.81019214
# 34 10 2012 0.86886104
# 35 11 2012 0.51428176
# 36 12 2012 0.62719629
# 37 1 2013 0.84442900
# 38 2 2013 0.28487057
# 39 3 2013 0.66722565
# 40 4 2013 0.15046975
# 41 5 2013 0.98172786
# 42 6 2013 0.29701074
# 43 7 2013 0.11508408



Data:



set.seed(2)
years <- 4
x <- data.frame(
Month = rep(1:12, times=years),
Year = rep(2009 + seq_len(years), each=12),
Rainfall = runif(12*years)
)
head(x)
# Month Year Rainfall
# 1 1 2010 0.1848823
# 2 2 2010 0.7023740
# 3 3 2010 0.5733263
# 4 4 2010 0.1680519
# 5 5 2010 0.9438393
# 6 6 2010 0.9434750
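The figures posted in the comments can be loaded in a similar, reproducible way with read.table(text = ...). The sketch below assumes those 13 rows are the whole dataset (the name obs is only illustrative) and applies the same filter for the range starting in April 2010.

```r
# The 13 monthly values posted in the comments (April 2010 to April 2011).
obs <- read.table(header = TRUE, text = "
Month Year Rainfall
4  2010 484.6
5  2010 630.32
6  2010 35.31
7  2010 637.64
8  2010 238.57
9  2010 1129.35
10 2010 376.78
11 2010 282.78
12 2010 324.58
1  2011 338.6
2  2011 859.37
3  2011 66.24
4  2011 38.36
")

# Same filter as in the answer, for the range starting in April 2010.
w <- subset(obs, (Year == 2010 & Month >= 4) | (Year == 2011 & Month <= 7))
avg <- mean(w$Rainfall)
nrow(w)   # 13: May to July 2011 are not in the posted data
avg       # about 418.65
```

Because the posted data stop at April 2011, the average is taken over 13 months rather than the full 16.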





edited Nov 15 '18 at 21:10

























answered Nov 15 '18 at 17:55









r2evans












  • It worked perfectly, thank you.

    – Sonisa Sharma
    Nov 15 '18 at 19:50



























