Identifying intervals
I have a data frame with 3523 observations and 92 variables.
Below is an example of the data frame, showing 10 observations and the columns from 04:00 to 05:00:
      04:00  04:15  04:30  05:00  ...  04:35
1     -      -      -      -
2     2      2      2      -      ...  -
3     2      -      -      2      ...
4     -      -      2      -      ...
5     -      -      -      -      ...
6     -      -      -      -      ...
7     -      -      -      -      ...
8     -      -      -      -      ...
9     -      -      -      -      ...
10    -      -      -      -      ...
...
3523  ...
The columns represent times over 24 hours, from 4:00 am to 4:00 am the next day, in 15-minute intervals. Each row is one observation.
Each row contains the values '-' and '2'.
I want to extract the beginning and the end of each interval marked with '2'.
For example:
2: 04:15-04:30;
3: 04:00 ; 05:00
4: 04:30
Could you please help me? Also, how can I export the output to an Excel or txt file?
Thank you
r matrix time extract tail
Please add a reproducible example. Take a look at other posts; we need to have a dataframe that we can reproduce ourselves (perhaps make a simple table and copy paste it here), and also clear output.
– arg0naut
Nov 11 at 9:53
edited Nov 11 at 14:20
asked Nov 11 at 9:45
RforDummies
1 Answer
accepted
Let's expand your example a bit. In the expanded example, row 1 contains no 2 at all, and there are several trickier rows, such as row 6, where we have a 2, then a break (-), then a sequence of two 2s, a -, and a 2 again.
04:00 04:15 04:30 05:00 05:15 05:30
1: - - - - - -
2: 2 2 2 - 2 2
3: 2 - - 2 2 2
4: - - 2 - 2 2
5: - - - - 2 2
6: 2 - 2 2 - 2
7: - - - - 2 2
8: 2 2 - 2 2 2
9: - - - - 2 2
10: 2 2 - 2 2 2
You can reproduce it if you type in:
WorkSchedulesDay1 <- structure(list(`04:00` = c("-", "2", "2", "-", "-", "2", "-",
"2", "-", "2"), `04:15` = c("-", "2", "-", "-", "-", "-", "-",
"2", "-", "2"), `04:30` = c("-", "2", "-", "2", "-", "2", "-",
"-", "-", "-"), `05:00` = c("-", "-", "2", "-", "-", "2", "-",
"2", "-", "2"), `05:15` = c("-", "2", "2", "2", "2", "-", "2",
"2", "2", "2"), `05:30` = c("-", "2", "2", "2", "2", "2", "2",
"2", "2", "2")), row.names = c(NA, -10L), class = c("data.table",
"data.frame"))
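Then run the following code (it needs the dplyr, tidyr and readr packages):
library(dplyr)
library(tidyr)
library(readr)

WorkSchedulesDay1 <- WorkSchedulesDay1 %>%
  # Number the rows so each observation can be tracked after reshaping
  group_by(rn = row_number()) %>%
  # Long format: one line per (row, time); 1:6 selects the six time columns of the example
  gather(time, val, 1:6) %>%
  # Sorting the time labels as text is enough here; labels that wrap past midnight would need an explicit order
  arrange(time) %>%
  # Within each row, tmp increases whenever the value changes, so every run gets its own id
  mutate(tmp = cumsum(coalesce(val != lag(val), FALSE))) %>%
  arrange(rn) %>%
  # Keep only the "2" cells
  filter(val != "-") %>%
  group_by(rn, tmp) %>%
  # Label each run by its first and last time, or by the single time if the run has length 1
  mutate(
    time = case_when(
      n() > 1 ~ paste(min(time), max(time), sep = " - "),
      TRUE ~ time
    )
  ) %>%
  ungroup() %>%
  distinct(rn, tmp, time) %>%
  group_by(rn) %>%
  # Join all runs of a row into one string
  mutate(
    intervals = case_when(
      n() > 1 ~ paste(time, collapse = ", "),
      TRUE ~ time
    )
  ) %>%
  distinct(rn, intervals) %>%
  # write_csv() returns its input invisibly, so the summary is also stored in WorkSchedulesDay1
  write_csv("WorkSchedulesDay1.csv")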
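The tmp column is the step that separates one run from the next. As a minimal sketch of the same idea in base R (not the dplyr code itself), here it is applied to row 6 of the example; the variable names `changed` and `run_id` are only illustrative:

```r
# Row 6 of the example, in column order (04:00 ... 05:30).
val <- c("2", "-", "2", "2", "-", "2")

# TRUE wherever a value differs from the one before it. The first element
# has no predecessor, so it counts as "no change" (FALSE).
changed <- c(FALSE, val[-1] != val[-length(val)])

# The running total of changes gives every run of equal values its own id.
run_id <- cumsum(changed)
print(run_id)
# [1] 0 1 2 2 3 4
```

The "2" cells carry the ids 0, 2, 2 and 4, so once the "-" cells are filtered out, grouping by the id leaves three separate intervals for this row.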
The pipeline returns the following for the example data:
     rn intervals
  <int> <chr>
      2 04:00 - 04:30, 05:15 - 05:30
      3 04:00, 05:00 - 05:30
      4 04:30, 05:15 - 05:30
      5 05:15 - 05:30
      6 04:00, 04:30 - 05:00, 05:30
      7 05:15 - 05:30
      8 04:00 - 04:15, 05:00 - 05:30
      9 05:15 - 05:30
     10 04:00 - 04:15, 05:00 - 05:30
There is no record for row 1, because it contains only -.
Similarly, 05:00 does not appear for row 2, because that cell holds a -.
In the same way, row 6 gives 04:00, 04:30 - 05:00, 05:30, because there is a - at both 04:15 and 05:15.
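For comparison, here is a rough base R sketch of the same task built on rle(). It is an alternative to the pipeline above, not a rewrite of it. The helper name `two_intervals`, the three-row sample (rows 1, 2 and 6 of the example) and the output file name are all assumptions made for illustration. Because it walks the columns in their existing order and never sorts the labels, it should also cope with a schedule that runs past midnight, provided the columns are already in chronological order.

```r
# Hypothetical helper: collapse one row of "-"/"2" flags into interval labels.
# `flags` is a character vector in column order, `times` the matching labels.
two_intervals <- function(flags, times) {
  runs   <- rle(unname(flags) == "2")
  ends   <- cumsum(runs$lengths)       # last position of every run
  starts <- ends - runs$lengths + 1    # first position of every run
  keep   <- runs$values                # TRUE for runs made of "2"
  if (!any(keep)) return(NA_character_)
  first  <- times[starts[keep]]
  last   <- times[ends[keep]]
  labels <- ifelse(first == last, first, paste(first, last, sep = " - "))
  paste(labels, collapse = ", ")
}

# Rows 1, 2 and 6 of the expanded example.
example <- data.frame(
  `04:00` = c("-", "2", "2"),
  `04:15` = c("-", "2", "-"),
  `04:30` = c("-", "2", "2"),
  `05:00` = c("-", "-", "2"),
  `05:15` = c("-", "2", "-"),
  `05:30` = c("-", "2", "2"),
  row.names = c("1", "2", "6"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

result <- data.frame(
  rn = rownames(example),
  intervals = unname(apply(example, 1, two_intervals, times = names(example))),
  stringsAsFactors = FALSE
)
# Drop rows without any "2", as the pipeline above does.
result <- result[!is.na(result$intervals), ]

# Tab-separated text file; the location and name are placeholders.
out_file <- file.path(tempdir(), "intervals.txt")
write.table(result, file = out_file, sep = "\t", quote = FALSE, row.names = FALSE)
```

A tab-separated .txt file like this can be opened directly in Excel. If a native .xlsx file is needed, a package such as writexl is one option.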
Comments are not for extended discussion; this conversation has been moved to chat.
– Samuel Liew♦
Nov 11 at 22:37
Dear @arg0naut, thank you for your help and time. When I run the code, however, I receive all intervals, not just the intervals defined by 2s. For the above example I receive 1: 04:00-05:30; 2: 04:00-04:30; 05:00; 05:15-05:30; and so on. Could you please help me get just the '2' intervals? Thank you
– RforDummies
Nov 12 at 18:38
edited Nov 11 at 18:02
answered Nov 11 at 10:12
arg0naut