Filter an interval of dates in dplyr (R)

I have the following simulated dataset in R:
library(tidyverse)
A = seq(from = as.Date("2021/1/1"),to=as.Date("2022/1/1"), length.out = 252)
length(A)
x = rnorm(252)
d = tibble(A,x);d
It looks like this:
# A tibble: 252 × 2
A x
<date> <dbl>
1 2021-01-01 0.445
2 2021-01-02 -0.793
3 2021-01-03 -0.367
4 2021-01-05 1.64
5 2021-01-06 -1.15
6 2021-01-08 0.276
7 2021-01-09 1.09
8 2021-01-11 0.443
9 2021-01-12 -0.378
10 2021-01-14 0.203
# … with 242 more rows
This is one year of 252 trading days. Let's say I have a date of interest:
start = as.Date("2021-05-23");start
I want to filter the dataset so that the result is a new dataset that begins at this start date and contains the next 20 index (trading) dates, not simply 20 calendar days, and then count the rows in the new dataset.
For example, keeping everything after the start date gives:
d1=d%>%
dplyr::filter(A>start)%>%
dplyr::summarise(n())
d1
# A tibble: 1 × 1
`n()`
<int>
1 98
But I want only the next 20 trading days after the start date. How can I do that?

Perhaps a brute-force attempt:
d %>%
filter(between(A, start, max(head(sort(A[A > start]), 20))))
# # A tibble: 20 x 2
# A x
# <date> <dbl>
# 1 2021-05-23 -0.185
# 2 2021-05-24 0.102
# 3 2021-05-26 0.429
# 4 2021-05-27 -1.21
# 5 2021-05-29 0.260
# 6 2021-05-30 0.479
# 7 2021-06-01 -0.623
# 8 2021-06-02 0.982
# 9 2021-06-04 -0.0533
# 10 2021-06-05 1.08
# 11 2021-06-07 -1.96
# 12 2021-06-08 -0.613
# 13 2021-06-09 -0.267
# 14 2021-06-11 -0.284
# 15 2021-06-12 0.0851
# 16 2021-06-14 0.355
# 17 2021-06-15 -0.635
# 18 2021-06-17 -0.606
# 19 2021-06-18 -0.485
# 20 2021-06-20 0.255
If you have duplicate dates, you may prefer to use head(sort(unique(A[A > start])),20), depending on what "20 index dates" means.
And to find the number of indices, you can summarise or count as needed.
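As a minimal sketch of the counting step, the block below rebuilds the simulated data from the question, keeps the first 20 rows dated after start, and counts them. The seed and the names window and n_rows are assumptions added here for illustration; they are not part of the original post.

```r
library(dplyr)

set.seed(1)  # assumption: seed added only to make the sketch reproducible
d <- tibble(
  A = seq(from = as.Date("2021-01-01"), to = as.Date("2022-01-01"), length.out = 252),
  x = rnorm(252)
)
start <- as.Date("2021-05-23")

# Sort by date, keep rows strictly after `start`, then take the first 20.
window <- d %>%
  arrange(A) %>%
  filter(A > start) %>%
  slice_head(n = 20)

# Number of index dates in the window.
n_rows <- nrow(window)
n_rows
```

If fewer than 20 rows follow the start date, slice_head() simply returns what is available, so the count also tells you whether the window was truncated.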

You could first sort by date, filter for dates later than the given date, and then take the first 20 records.
d1 = d %>%
arrange(A) %>%
filter(A > start) %>%
head(20)

Related

Aggregate daily data into weeks

I have data resembling the following structure, where the when variable denotes the day of measurement:
## Generate data.
set.seed(1986)
n <- 1000
y <- rnorm(n)
when <- as.POSIXct(strftime(seq(as.POSIXct("2021-11-01 23:00:00 UTC", tryFormats = "%Y-%m-%d"),
as.POSIXct("2022-11-01 23:00:00 UTC", tryFormats = "%Y-%m-%d"),
length.out = n), format = "%Y-%m-%d"))
dta <- data.frame(y, when)
head(dta)
#> y when
#> 1 -0.04625141 2021-11-01
#> 2 0.28000082 2021-11-01
#> 3 0.25317063 2021-11-01
#> 4 -0.96411077 2021-11-02
#> 5 0.49222664 2021-11-02
#> 6 -0.69874551 2021-11-02
I need to compute averages of y over time. For instance, the following computes daily averages:
## Compute daily averages of y.
library(dplyr)
daily_avg <- dta %>%
group_by(when) %>%
summarise(daily_mean = mean(y)) %>%
ungroup()
daily_avg
#> # A tibble: 366 × 2
#> when daily_mean
#> <dttm> <dbl>
#> 1 2021-11-01 00:00:00 0.162
#> 2 2021-11-02 00:00:00 -0.390
#> 3 2021-11-03 00:00:00 -0.485
#> 4 2021-11-04 00:00:00 -0.152
#> 5 2021-11-05 00:00:00 0.425
#> 6 2021-11-06 00:00:00 0.726
#> 7 2021-11-07 00:00:00 0.855
#> 8 2021-11-08 00:00:00 0.0608
#> 9 2021-11-09 00:00:00 -0.995
#> 10 2021-11-10 00:00:00 0.395
#> # … with 356 more rows
I am having a hard time computing weekly averages. Here is what I have tried so far:
## Fail - compute weekly averages of y.
library(lubridate)
dta$week <- week(dta$when) # This is wrong.
dta[165: 171, ]
#> y when week
#> 165 0.9758333 2021-12-30 52
#> 166 -0.8630091 2021-12-31 53
#> 167 0.3054031 2021-12-31 53
#> 168 1.2814421 2022-01-01 1
#> 169 0.1025440 2022-01-01 1
#> 170 1.3665411 2022-01-01 1
#> 171 -0.5373058 2022-01-02 1
Using the week function from the lubridate package ignores the fact that my data span more than one year. So, if I used code similar to what I used for the daily averages, I would aggregate observations that belong to different years but share the same week number. How can I solve this?
You can use %V (from ?strptime) for weeks, combining it with the year.
dta %>%
group_by(week = format(when, format = "%Y-%V")) %>%
summarize(daily_mean = mean(y)) %>%
ungroup()
# # A tibble: 54 x 2
# week daily_mean
# <chr> <dbl>
# 1 2021-44 0.179
# 2 2021-45 0.0477
# 3 2021-46 0.0340
# 4 2021-47 0.356
# 5 2021-48 0.0544
# 6 2021-49 -0.0948
# 7 2021-50 -0.0419
# 8 2021-51 0.209
# 9 2021-52 0.251
# 10 2022-01 -0.197
# # ... with 44 more rows
There are different variants of "week", depending on your preference.
%V
Week of the year as decimal number (01–53) as defined in ISO 8601.
If the week (starting on Monday) containing 1 January has four or more
days in the new year, then it is considered week 1. Otherwise, it is
the last week of the previous year, and the next week is week 1.
(Accepted but ignored on input.)
%W
Week of the year as decimal number (00–53) using Monday as the first
day of week (and typically with the first Monday of the year as day 1
of week 1). The UK convention.
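One caveat may be worth checking when pairing a year with %V. Because %V is the ISO 8601 week number, the matching year is the ISO week-based year, which ?strptime documents as %G, rather than the calendar year %Y. The two can differ for a few days around New Year. The base R sketch below compares both labels for three dates; the data frame and column names are assumptions chosen for illustration, and it assumes your platform supports %G on output.

```r
# Three dates around the 2021/2022 year boundary.
dates <- as.Date(c("2021-12-31", "2022-01-01", "2022-01-03"))

labels <- data.frame(
  date = dates,
  calendar_year_week = format(dates, "%Y-%V"),  # calendar year + ISO week
  iso_year_week      = format(dates, "%G-%V")   # ISO year + ISO week
)
labels
```

Here 2022-01-01 still belongs to ISO week 52 of 2021, so "%Y-%V" labels it "2022-52" while "%G-%V" labels it "2021-52". For data that runs into late December of the following year, the calendar-year label could merge those two periods into one group.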
You can extract year and week from the dates and group by both:
dta %>%
mutate(year = year(when),
week = week(when)) %>%
group_by(year, week) %>%
summarise(y_mean = mean(y)) %>%
ungroup()
# # A tibble: 54 x 3
# # Groups: year, week [54]
# year week y_mean
# <dbl> <dbl> <dbl>
# 1 2021 44 -0.222
# 2 2021 45 0.234
# 3 2021 46 0.0953
# 4 2021 47 0.206
# 5 2021 48 0.192
# 6 2021 49 -0.0831
# 7 2021 50 0.0282
# 8 2021 51 0.196
# 9 2021 52 0.132
# 10 2021 53 -0.279
# # ... with 44 more rows

How to separate daily data into weekly or monthly data in R

I have daily discharge data from a local stream near me. I am trying to sum and take the average of the daily data into weekly or monthly chunks so I can plot discharge_m3d(discharge) and Qs_sum(depletion) by weekly and monthly timeframes. Does anyone know how I can do this? I attached a figure of how my data frame looks.
People often use floor_date() from lubridate for these purposes. You can floor to a unit of month or week and then group by the resulting date column. Then you can use summarize() to compute the monthly or weekly sums/averages. From there you can use your plotting library of choice to visualize the result (like ggplot2, not shown).
This works even if you have more than one year of data (i.e. where the month or week number might repeat).
library(dplyr)
library(lubridate)
set.seed(123)
df <- tibble(
date = seq(
from = as.Date("2014-03-01"),
to = as.Date("2016-12-31"),
by = 1
),
Qs_sum = runif(length(date)),
discharge_m3d = runif(length(date))
)
df
#> # A tibble: 1,037 × 3
#> date Qs_sum discharge_m3d
#> <date> <dbl> <dbl>
#> 1 2014-03-01 0.288 0.560
#> 2 2014-03-02 0.788 0.427
#> 3 2014-03-03 0.409 0.448
#> 4 2014-03-04 0.883 0.833
#> 5 2014-03-05 0.940 0.720
#> 6 2014-03-06 0.0456 0.457
#> 7 2014-03-07 0.528 0.521
#> 8 2014-03-08 0.892 0.242
#> 9 2014-03-09 0.551 0.0759
#> 10 2014-03-10 0.457 0.391
#> # … with 1,027 more rows
df %>%
mutate(date = floor_date(date, unit = "month")) %>%
group_by(date) %>%
summarise(
n = n(),
qs_total = sum(Qs_sum),
qs_average = mean(Qs_sum),
discharge_total = sum(discharge_m3d),
discharge_average = mean(discharge_m3d),
.groups = "drop"
)
#> # A tibble: 34 × 6
#> date n qs_total qs_average discharge_total discharge_average
#> <date> <int> <dbl> <dbl> <dbl> <dbl>
#> 1 2014-03-01 31 18.1 0.585 15.3 0.494
#> 2 2014-04-01 30 12.9 0.429 15.2 0.507
#> 3 2014-05-01 31 15.5 0.500 15.3 0.493
#> 4 2014-06-01 30 15.8 0.525 16.3 0.542
#> 5 2014-07-01 31 15.1 0.487 13.9 0.449
#> 6 2014-08-01 31 14.8 0.478 16.2 0.522
#> 7 2014-09-01 30 15.3 0.511 13.1 0.436
#> 8 2014-10-01 31 15.6 0.504 14.7 0.475
#> 9 2014-11-01 30 16.0 0.532 15.1 0.502
#> 10 2014-12-01 31 14.2 0.458 15.5 0.502
#> # … with 24 more rows
# Treat Sunday as the start of the week,
# so each group covers Sunday through Saturday.
sunday <- 7L
df %>%
mutate(date = floor_date(date, unit = "week", week_start = sunday)) %>%
group_by(date) %>%
summarise(
n = n(),
qs_total = sum(Qs_sum),
qs_average = mean(Qs_sum),
discharge_total = sum(discharge_m3d),
discharge_average = mean(discharge_m3d),
.groups = "drop"
)
#> # A tibble: 149 × 6
#> date n qs_total qs_average discharge_total discharge_average
#> <date> <int> <dbl> <dbl> <dbl> <dbl>
#> 1 2014-02-23 1 0.288 0.288 0.560 0.560
#> 2 2014-03-02 7 4.49 0.641 3.65 0.521
#> 3 2014-03-09 7 3.77 0.539 3.88 0.554
#> 4 2014-03-16 7 4.05 0.579 3.45 0.493
#> 5 2014-03-23 7 4.43 0.632 3.08 0.440
#> 6 2014-03-30 7 4.00 0.572 4.74 0.677
#> 7 2014-04-06 7 2.50 0.357 3.15 0.449
#> 8 2014-04-13 7 2.48 0.355 2.44 0.349
#> 9 2014-04-20 7 2.30 0.329 2.45 0.349
#> 10 2014-04-27 7 3.44 0.492 4.40 0.629
#> # … with 139 more rows
Created on 2022-04-13 by the reprex package (v2.0.1)
One way to approach this is with the lubridate and dplyr packages from the tidyverse. I assume here that your dates are in year-month-day format, which they appear to be, and that you have only one calendar year of data (or at least no months or weeks repeated across two years).
monthly_discharge <- discharge %>%
filter(variable == "discharge") %>% # First select just the rows that represent discharge (not clear if that's necessary here)
mutate(date = ymd(date), # convert date to a lubridate date object
month = month(date), # extract the numbered month from the date
week = week(date)) %>% # extract the numbered week in a year from the date
group_by(month, stream) %>% # group your data by month and stream
summarize(discharge_summary = mean(discharge_m3d)) # summarize your data so that each month has a single row with a single (mean) discharge value
# you can include multiple summary variables within the summarize function
This should produce a data frame with one row per month for each stream and a summary value for discharge. You could summarize by week by changing the month label in group_by to week.
Make use of the functions week(), month() and year() from the package lubridate to get the corresponding values for your date column. Afterwards we can find the means per week, month or year. For illustration, I added a row with year 2015, since there was only year 2014 in your sample data. Furthermore, for plotting reasons, I added a column "Year_Month" that shows the abbreviated month followed by year (x axis of the plot).
library(dplyr)
library(lubridate)
data <- data %>% mutate(Week = week(date), Month = month(date), Year = year(date)) %>%
group_by(Year, Week) %>%
mutate(mean_Week_Qs = mean(Qs_sum)) %>%
ungroup() %>%
group_by(Year, Month) %>%
mutate(mean_Month_Qs = mean(Qs_sum)) %>%
ungroup() %>%
group_by(Year) %>%
mutate(mean_Year_Qs = mean(Qs_sum)) %>%
ungroup() %>%
mutate(Year_Month = paste0(lubridate::month(date, label = TRUE), " ", Year)) %>%
ungroup()
> data
# A tibble: 12 x 10
date discharge_m3d Qs_sum Week Month Year mean_Week_Qs mean_Month_Qs mean_Year_Qs Year_Month
<date> <dbl> <dbl> <int> <int> <int> <dbl> <dbl> <dbl> <chr>
1 2014-03-01 797 0 9 3 2014 0.0409 0.629 0.629 Mar 2014
2 2014-03-02 826 0.00833 9 3 2014 0.0409 0.629 0.629 Mar 2014
3 2014-03-03 3760 0.114 9 3 2014 0.0409 0.629 0.629 Mar 2014
4 2014-03-04 4330 0.292 10 3 2014 0.785 0.629 0.629 Mar 2014
5 2014-03-05 2600 0.480 10 3 2014 0.785 0.629 0.629 Mar 2014
6 2014-03-06 4620 0.656 10 3 2014 0.785 0.629 0.629 Mar 2014
7 2014-03-07 2510 0.816 10 3 2014 0.785 0.629 0.629 Mar 2014
8 2014-03-08 1620 0.959 10 3 2014 0.785 0.629 0.629 Mar 2014
9 2014-03-09 2270 1.09 10 3 2014 0.785 0.629 0.629 Mar 2014
10 2014-03-10 5650 1.20 10 3 2014 0.785 0.629 0.629 Mar 2014
11 2014-03-11 2530 1.31 11 3 2014 1.31 0.629 0.629 Mar 2014
12 2015-03-06 1470 1.52 10 3 2015 1.52 1.52 1.52 Mar 2015
Now we can plot, for example Qs_sum per year and month, and add the mean as a red dot:
ggplot(data, aes(Year_Month, Qs_sum)) +
theme_classic() +
geom_point(size = 2) +
geom_point(aes(Year_Month, mean_Month_Qs), color = "red", size = 5, alpha = 0.6)
To summarize the results by weekly or monthly averages, you can do as follows, using distinct():
data %>% distinct(Year, Week, mean_Week_Qs)
# A tibble: 4 x 3
Week Year mean_Week_Qs
<int> <int> <dbl>
1 9 2014 0.0409
2 10 2014 0.785
3 11 2014 1.31
4 10 2015 1.52
data %>% distinct(Year, Month, mean_Month_Qs)
# A tibble: 2 x 3
Month Year mean_Month_Qs
<int> <int> <dbl>
1 3 2014 0.629
2 3 2015 1.52
This can only be done after the mutate() and mean() commands above. If you want to get directly to summarized results, you can use summarize() directly on the initial dataframe:
data %>% group_by(Year, Week) %>% summarise(Week_Avg = mean(Qs_sum))
# A tibble: 4 x 3
# Groups: Year [2]
Year Week Week_Avg
<int> <int> <dbl>
1 2014 9 0.0409
2 2014 10 0.785
3 2014 11 1.31
4 2015 10 1.52
data %>% group_by(Year, Month) %>% summarise(Month_Avg = mean(Qs_sum))
# A tibble: 2 x 3
# Groups: Year [2]
Year Month Month_Avg
<int> <int> <dbl>
1 2014 3 0.629
2 2015 3 1.52
Note that for plotting, mutate() is preferred because it preserves the individual data points (black in the plot above). If we used summarise() instead, we would be left with only the red points.
Data
data <- structure(list(date = structure(16130:16140, class = "Date"),
discharge_m3d = c(797, 826, 3760, 4330, 2600, 4620, 2510,
1620, 2270, 5650, 2530), Qs_sum = c(0, 0.00833424, 0.114224781,
0.291812109, 0.479780482, 0.656321971, 0.816140731, 0.959334606,
1.087579095, 1.20284046, 1.30695595), Week = c(9L, 9L, 9L,
10L, 10L, 10L, 10L, 10L, 10L, 10L, 11L), Month = c(3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L)), row.names = c(NA, -11L
), class = c("tbl_df", "tbl", "data.frame"))

Dividing a number within a month with the last observation in the previous month using dplyr

I am struggling with finding the correct way of achieving the relative return within a month using the last observation in the previous month. Data for reference:
set.seed(123)
Date = seq(as.Date("2021/12/31"), by = "day", length.out = 90)
Returns = runif(90, min=-0.02, max = 0.02)
mData = data.frame(Date, Returns)
I would then like to add a relative return column. For example, the value for 3 February should be the return on 2022-02-03 divided by the return on 2022-01-31, minus 1; likewise, the value for 3 March should be the return on 2022-03-03 divided by the return on 2022-02-28, minus 1. So the question is: using dplyr, how can I keep each date's return within a month as the numerator while using the last observation of the previous month as the denominator?
This uses a tmp column that holds the previous row's value (assuming the data are sorted by date), so the first tmp value in each month is the last observation of the previous month. Grouping is done on year-month in group_by.
mData %>%
mutate(tmp=lag(Returns)) %>%
group_by(dat=strftime(Date, format="%Y-%m")) %>%
mutate(tmp=first(tmp), result=Returns/tmp-1) %>%
ungroup() %>%
select(-c(tmp, dat))
# A tibble: 90 × 5 # before select:
Date Returns result # tmp dat
<date> <dbl> <dbl> # <dbl> <chr>
1 2021-12-31 -0.00850 NA # NA 2021-12
2 2022-01-01 0.0115 -2.36 # -0.00850 2022-01
3 2022-01-02 -0.00364 -0.571 # -0.00850 2022-01
4 2022-01-03 0.0153 -2.80 # -0.00850 2022-01
5 2022-01-04 0.0176 -3.07 # -0.00850 2022-01
6 2022-01-05 -0.0182 1.14 # -0.00850 2022-01
7 2022-01-06 0.00112 -1.13 # -0.00850 2022-01
8 2022-01-07 0.0157 -2.85 # -0.00850 2022-01
9 2022-01-08 0.00206 -1.24 # -0.00850 2022-01
10 2022-01-09 -0.00174 -0.796 # -0.00850 2022-01
# … with 80 more rows
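A quick way to gain confidence in the lag-and-first approach is to compare one value against the formula from the question computed by hand. The sketch below rebuilds the question's data, repeats the pipeline while keeping the helper columns, and checks 3 February against Returns on 2022-02-03 divided by Returns on 2022-01-31, minus 1. The names res, feb3 and manual are assumptions introduced for this check.

```r
library(dplyr)

set.seed(123)
mData <- data.frame(
  Date    = seq(as.Date("2021-12-31"), by = "day", length.out = 90),
  Returns = runif(90, min = -0.02, max = 0.02)
)

res <- mData %>%
  arrange(Date) %>%                              # make the sort order explicit
  mutate(tmp = lag(Returns)) %>%                 # previous row's return
  group_by(dat = format(Date, "%Y-%m")) %>%      # one group per year-month
  mutate(tmp = first(tmp),                       # last return of the previous month
         result = Returns / tmp - 1) %>%
  ungroup()

# Value produced by the pipeline for 3 February.
feb3 <- res$result[res$Date == as.Date("2022-02-03")]

# Same quantity computed directly from the definition in the question.
manual <- mData$Returns[mData$Date == as.Date("2022-02-03")] /
  mData$Returns[mData$Date == as.Date("2022-01-31")] - 1

all.equal(feb3, manual)
```

The explicit arrange() is only a safeguard; the data in the question are already sorted by date.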
library(tidyverse)
library(lubridate)
set.seed(123)
Date = seq(as.Date("2021/12/31"), by = "day", length.out = 90)
Returns = runif(90, min=-0.02, max = 0.02)
mData = data.frame(Date, Returns)
mData |>
group_by(month(Date)) |>
mutate(last_return = last(Returns)) |>
ungroup() |>
nest(data = c(Date, Returns)) |>
mutate(last_return_lag = lag(last_return)) |>
unnest(data) |>
mutate(x = Returns/last_return_lag)
#> # A tibble: 90 × 6
#> `month(Date)` last_return Date Returns last_return_lag x
#> <dbl> <dbl> <date> <dbl> <dbl> <dbl>
#> 1 12 -0.00850 2021-12-31 -0.00850 NA NA
#> 2 1 0.0161 2022-01-01 0.0115 -0.00850 -1.36
#> 3 1 0.0161 2022-01-02 -0.00364 -0.00850 0.429
#> 4 1 0.0161 2022-01-03 0.0153 -0.00850 -1.80
#> 5 1 0.0161 2022-01-04 0.0176 -0.00850 -2.07
#> 6 1 0.0161 2022-01-05 -0.0182 -0.00850 2.14
#> 7 1 0.0161 2022-01-06 0.00112 -0.00850 -0.132
#> 8 1 0.0161 2022-01-07 0.0157 -0.00850 -1.85
#> 9 1 0.0161 2022-01-08 0.00206 -0.00850 -0.242
#> 10 1 0.0161 2022-01-09 -0.00174 -0.00850 0.204
#> # … with 80 more rows
Created on 2022-02-03 by the reprex package (v2.0.1)

Combining a loop with stacking dataframes created by a function

I'm doing some analysis with the baseballr package and want to combine dataframes using a loop.
For example, the following code using the standings_on_date_bref function gives me a table of division standings for the specified day (plus manually adding a column for the date of those standings):
library("baseballr")
library("dplyr")
standings_on_date_bref(date = "04-28-2021", division = "NL West") %>%
mutate(date = "04-28-2021")
Tm   W-L%   date
SFG  0.640  04-28-2021
LAD  0.640  04-28-2021
SDP  0.538  04-28-2021
ARI  0.500  04-28-2021
COL  0.375  04-28-2021
However, I'm interested in getting the standings for a whole range of days, which would end up being a dataframe with rows = 5 teams * x number of days. For example, for 04-28-2021 to 04-29-2021, I'm hoping it would look something like this:
Tm   W-L%   date
SFG  0.640  04-28-2021
LAD  0.640  04-28-2021
SDP  0.538  04-28-2021
ARI  0.500  04-28-2021
COL  0.375  04-28-2021
SFG  0.640  04-29-2021
LAD  0.615  04-29-2021
SDP  0.538  04-29-2021
ARI  0.520  04-29-2021
COL  0.360  04-29-2021
I have tried to do so by implementing some sort of loop. This is what I've come up with so far, but in the end it just gives me the standings for the end date.
start <- as.Date("04-01-21",format="%m-%d-%y")
end <- as.Date("04-03-21",format="%m-%d-%y")
theDate <- start
while (theDate <= end)
{
all_standings <- standings_on_date_bref(date = theDate, division = "NL West") %>%
mutate(date = theDate)
theDate <- theDate + 1
}
You can try purrr's map_dfr() function, which calls the function once per date and row-binds the results:
library(baseballr)
library(dplyr)
library(purrr)
date_seq <- seq(as.Date("04-01-21",format="%m-%d-%y"),
as.Date("04-03-21",format="%m-%d-%y"), by = "1 day")
map_dfr(.x = date_seq,
.f = function(x) {
standings_on_date_bref(date = x, division = "NL West") %>%
mutate(date = x)
})
#> # A tibble: 15 x 9
#> Tm W L `W-L%` GB RS RA `pythW-L%` date
#> <chr> <int> <int> <dbl> <chr> <int> <int> <dbl> <date>
#> 1 SDP 1 0 1 -- 8 7 0.561 2021-04-01
#> 2 COL 1 0 1 -- 8 5 0.703 2021-04-01
#> 3 ARI 0 1 0 1.0 7 8 0.439 2021-04-01
#> 4 SFG 0 1 0 1.0 7 8 0.439 2021-04-01
#> 5 LAD 0 1 0 1.0 5 8 0.297 2021-04-01
#> 6 SDP 2 0 1 -- 12 9 0.629 2021-04-02
#> 7 COL 1 1 0.5 1.0 14 16 0.439 2021-04-02
#> 8 SFG 1 1 0.5 1.0 13 11 0.576 2021-04-02
#> 9 LAD 1 1 0.5 1.0 16 14 0.561 2021-04-02
#> 10 ARI 0 2 0 2.0 9 12 0.371 2021-04-02
#> 11 SDP 3 0 1 -- 19 9 0.797 2021-04-03
#> 12 LAD 2 1 0.667 1.0 22 19 0.567 2021-04-03
#> 13 COL 1 2 0.333 2.0 19 22 0.433 2021-04-03
#> 14 SFG 1 2 0.333 2.0 13 15 0.435 2021-04-03
#> 15 ARI 0 3 0 3.0 9 19 0.203 2021-04-03
Created on 2022-01-02 by the reprex package (v2.0.1)
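The while loop in the question returns only the last day because all_standings is overwritten on every iteration. The underlying pattern is to keep one data frame per date and stack them at the end. Here is a minimal offline sketch of that pattern with lapply() and dplyr::bind_rows(); fake_standings() is a hypothetical stand-in for standings_on_date_bref() so the example runs without a network call, and its columns are invented for illustration.

```r
library(dplyr)

# Hypothetical stand-in for standings_on_date_bref(): returns a small
# data frame for a single date. Columns and values are made up.
fake_standings <- function(date) {
  data.frame(Tm = c("SFG", "LAD"), W = c(1L, 0L))
}

date_seq <- seq(as.Date("2021-04-01"), as.Date("2021-04-03"), by = "1 day")

# Build one data frame per date and collect them in a list
# instead of overwriting a single object.
standings_list <- lapply(seq_along(date_seq), function(i) {
  fake_standings(date_seq[i]) %>%
    mutate(date = date_seq[i])
})

# Stack the per-date data frames into one.
all_standings <- bind_rows(standings_list)
all_standings
```

With the real function, the body of the anonymous function would call standings_on_date_bref() as in the answer above; the collecting and stacking steps stay the same.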

Summing up Certain Sequences of a Dataframe in R

I have several data frames of daily rates of different regions by age-groups:
Date        0-14 Rate  15-29 Rate  30-44 Rate  45-64 Rate  65-79 Rate  80+ Rate
2020-23-12  0          33.54       45.68       88.88       96.13       41.28
2020-24-12  0          25.14       35.28       66.14       90.28       38.41
The data begin on a Wednesday (2020-23-12) and run from then up to the present.
I want to obtain weekly row sums of rates from each Wednesday to Tuesday.
There should be a neat way to combine the aggregate, seq and rowsum functions to do this in a few lines. Otherwise, I will end up with something very long-winded.
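I created some minimal data: three weeks with two arbitrary numeric columns (no missing values). You can use tidyverse language to sum over columns, create groups per week and sum over the row sums by week. Note that grouping by row number assumes one row per day with no gaps, starting on a Wednesday: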
# Minimal Data
MWE <- data.frame(date = c(outer(as.Date("12/23/20", "%m/%d/%y"), 0:20, `+`)),
column1 = runif(21,0,1),
column2 = runif(21,0,1))
library(tidyverse)
MWE %>%
# Calculate Row Sum Everywhere
mutate(sum = rowSums(across(where(is.numeric)))) %>%
# Create Week Groups
group_by(week = ceiling(row_number()/7)) %>%
# Sum Over All RowSums per Group
summarise(rowSums_by_week = sum(sum))
# Groups: week [3]
date column1 column2 sum week
<date> <dbl> <dbl> <dbl> <dbl>
1 2020-12-23 0.449 0.759 1.21 1
2 2020-12-24 0.423 0.0956 0.519 1
3 2020-12-25 0.974 0.592 1.57 1
4 2020-12-26 0.798 0.250 1.05 1
5 2020-12-27 0.870 0.487 1.36 1
6 2020-12-28 0.952 0.345 1.30 1
7 2020-12-29 0.349 0.817 1.17 1
8 2020-12-30 0.227 0.727 0.954 2
9 2020-12-31 0.292 0.209 0.501 2
10 2021-01-01 0.678 0.276 0.954 2
# ... with 11 more rows
# A tibble: 3 x 2
week rowSums_by_week
<dbl> <dbl>
1 1 8.16
2 2 6.02
3 3 6.82
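If some days could be missing, a variant is to derive the week index from the dates themselves rather than from the row number. The sketch below counts whole days since the first date (a Wednesday) and integer-divides by 7. The data frame rates and its values are assumptions chosen so the totals are easy to verify by hand.

```r
library(dplyr)

# Three weeks of made-up daily values starting on Wednesday 2020-12-23.
rates <- data.frame(
  date    = as.Date("2020-12-23") + 0:20,
  column1 = 1:21,
  column2 = rep(0.5, 21)
)

weekly <- rates %>%
  mutate(
    # Sum across the numeric columns for each day.
    row_total = rowSums(across(where(is.numeric))),
    # Week index from elapsed days, so gaps in the dates do not shift weeks.
    week = as.integer(date - min(date)) %/% 7 + 1
  ) %>%
  group_by(week) %>%
  summarise(
    week_start   = min(date),
    weekly_total = sum(row_total),
    .groups = "drop"
  )
weekly
```

This relies on the first date in the data being a Wednesday, as in the question; if it were not, min(date) would need to be replaced by the Wednesday on or before it.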
