I have two tables:
df1 = data.frame("dates" = c(seq(as.Date("2020-1-1"), as.Date("2020-1-10"), by = "days")))
df2 = data.frame("observations" = c("a", "b", "c", "d"), "start" = as.Date(c("2019-12-30", "2020-1-1", "2020-1-5","2020-1-10")), "end"=as.Date(c("2020-1-3", "2020-1-2", "2020-1-12","2020-1-14")))
I would like to know the number of observation periods that occur on each day of df1, based on the start/stop dates in df2. E.g. on 1/1/2020, observations a and b were in progress, hence "2".
The expected output would be as follows:
I've tried using sums
df1$number = sum(as.Date(df2$start) <= df1$dates & as.Date(df2$end)>=df1$dates)
But that returns a single total over the whole column instead of one count per date.
I've then tried to create a custom function for this:
df1$number = apply(df1, 1, function(x) sum(df2$start <= x & df2$end>=x))
But it returns an NA value.
I then tried to embed an "ifelse" within it, but I get the same issue with NAs
apply(df1, 1, function(x) sum(ifelse(df2$start <= x & df2$end>=x, 1, 0)))
Can anyone suggest what the issue is? Thanks!
edit: an interval join was suggested which is not what I'm trying to get - I think naming the observations with a numeric label was what caused confusion. I am trying to find out the TOTAL number of observations with periods that fall within the day, as compared to doing a 1:1 match.
Regards
Sing
Define the comparison in a function f, pass it through outer(), and take rowSums() of the resulting logical matrix.
f <- \(x, y) df1[x, 1] >= df2[y, 2] & df1[x, 1] <= df2[y, 3]
cbind(df1, number=rowSums(outer(1:nrow(df1), 1:nrow(df2), f)))
# dates number
# 1 2020-01-01 2
# 2 2020-01-02 2
# 3 2020-01-03 1
# 4 2020-01-04 0
# 5 2020-01-05 1
# 6 2020-01-06 1
# 7 2020-01-07 1
# 8 2020-01-08 1
# 9 2020-01-09 1
# 10 2020-01-10 2
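The same per-date count can also be written as an explicit loop over the dates, which some readers may find easier to follow than outer(). This is a minimal base R sketch using the question's data; vapply() is a swapped-in alternative here, not part of the answer above.

```r
df1 <- data.frame(dates = seq(as.Date("2020-1-1"), as.Date("2020-1-10"), by = "days"))
df2 <- data.frame(
  observations = c("a", "b", "c", "d"),
  start = as.Date(c("2019-12-30", "2020-1-1", "2020-1-5", "2020-1-10")),
  end   = as.Date(c("2020-1-3", "2020-1-2", "2020-1-12", "2020-1-14"))
)

# For each date, count the rows of df2 whose [start, end] interval contains it
df1$number <- vapply(
  seq_len(nrow(df1)),
  function(i) sum(df2$start <= df1$dates[i] & df2$end >= df1$dates[i]),
  integer(1)
)
```

Indexing df1$dates[i] keeps the Date class intact, which avoids the character conversion that apply() performs on a data frame.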
Here is a potential solution using dplyr/tidyverse functions and the %within% function from the lubridate package. The approach is similar to "Left Join Subset of Column Based on Date Interval", with two important differences: it uses summarise() instead of filter() so that dates where "number" == 0 are not lost, and it joins by 'character()' because the two datasets share no columns (in dplyr 1.1.0 and later, cross_join() does the same job):
library(dplyr)
library(lubridate)
df1 = data.frame("dates" = c(seq(as.Date("2020-1-1"),
as.Date("2020-1-10"),
by = "days")))
df2 = data.frame("observations" = c("1", "2", "3", "4"),
"start" = as.Date(c("2019-12-30", "2020-1-1", "2020-1-5","2020-1-10")),
"end"=as.Date(c("2020-1-3", "2020-1-2", "2020-1-12","2020-1-14")))
df1 %>%
full_join(df2, by = character()) %>%
mutate(number = dates %within% interval(start, end)) %>%
group_by(dates) %>%
summarise(number = sum(number))
#> # A tibble: 10 × 2
#> dates number
#> <date> <dbl>
#> 1 2020-01-01 2
#> 2 2020-01-02 2
#> 3 2020-01-03 1
#> 4 2020-01-04 0
#> 5 2020-01-05 1
#> 6 2020-01-06 1
#> 7 2020-01-07 1
#> 8 2020-01-08 1
#> 9 2020-01-09 1
#> 10 2020-01-10 2
Created on 2022-06-27 by the reprex package (v2.0.1)
Does this approach work with your actual data?
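If the real data is large, pairing every date with every interval can get expensive. One possible alternative derives the count from sorted endpoints: the number of periods active on a day equals the number that have started by that day minus the number that ended before it. The following is a hedged base R sketch, assuming whole-day dates and end >= start in every row; the variable names are illustrative only.

```r
dates <- seq(as.Date("2020-01-01"), as.Date("2020-01-10"), by = "days")
start <- as.Date(c("2019-12-30", "2020-01-01", "2020-01-05", "2020-01-10"))
end   <- as.Date(c("2020-01-03", "2020-01-02", "2020-01-12", "2020-01-14"))

d <- as.numeric(dates)  # days since the epoch

# findInterval(x, v) returns how many elements of sorted v are <= x
started <- findInterval(d, sort(as.numeric(start)))    # periods begun on or before each day
ended   <- findInterval(d - 1, sort(as.numeric(end)))  # periods finished strictly before each day

number <- started - ended
```

Because both lookups are binary searches on sorted vectors, this avoids building the full date-by-interval table.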
Related
ID Date
101 10-17-2021
101 10-19-2021
101 10-20-2021
101 10-31-2021
101 11-01-2021
For each ID I want to remove observations that are within 7 days of each other. I want to keep the earliest date of the dates that are within 7 days of each other. So in this case I would want to keep "10-17-2021" and "10-31-2021". This process would continue until I have unique dates for each ID that are at least 7 days apart and do not contain other dates in between.
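You can do it using the group_by() and slice() functions, but first the Date column must be converted with as.Date(). Here is code that removes observations within a 7-day interval and keeps only the earliest date. Note that it compares each date with the one immediately before it, which gives the desired result for the example data but can differ from comparing against the last kept date when several dates are closely spaced: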
library(tidyverse)
df$Date <- as.Date(df$Date, format = "%m-%d-%Y")
df %>%
group_by(ID) %>%
slice(c(1, which(c(0, diff(Date)) >= 7)))
output
ID Date
101 2021-10-17
101 2021-10-31
In your example, you can't evaluate every observation independently because some of them may be removed when compared to the first value. Perhaps I'm not thinking about it the right way, but I think you need a loop to do this. Here's what I came up with (note: I made the sequence of dates longer to make sure it works):
library(dplyr)
d <- tibble(
ID = 101,
Date = seq(lubridate::mdy("01-01-2023"),
lubridate::mdy("02-07-2023"), by="days")
)
i <- 1
while(i < nrow(d)){
d <- d %>% mutate(diff = Date - d$Date[i])
d <- d %>% filter(diff <= 0 | diff > 7)
if(i < nrow(d)){
i <- i+1
}
}
d <- d %>% select(-diff)
d
#> # A tibble: 5 × 2
#> ID Date
#> <dbl> <date>
#> 1 101 2023-01-01
#> 2 101 2023-01-09
#> 3 101 2023-01-17
#> 4 101 2023-01-25
#> 5 101 2023-02-02
Created on 2023-02-08 by the reprex package (v2.0.1)
Essentially, the loop starts with the first observation and removes every later observation within seven days of it. If more observations remain, it increments the counter, moves to the next remaining date, and evaluates all subsequent dates from there, keeping everything that came before.
These loops are difficult to do in the tidyverse, but you could split the data by group, run the loop on each group and then put the groups back together. Here's an example:
library(dplyr)
d <- tibble(
ID = 101,
Date = seq(lubridate::mdy("01-01-2023"),
lubridate::mdy("02-07-2023"), by="days")
)
d2 <- d %>% mutate(ID = 102)
alldat <- bind_rows(d, d2)
split_dat <- alldat %>%
group_by(ID) %>%
group_split()
result <- purrr::map(split_dat, function(d){
i <- 1
while(i < nrow(d)){
d <- d %>% mutate(diff = Date - d$Date[i])
d <- d %>% filter(diff <= 0 | diff > 7)
if(i < nrow(d)){
i <- i+1
}
}
d <- d %>% select(-diff)
d
})
result <- bind_rows(result)
result
#> # A tibble: 10 × 2
#> ID Date
#> <dbl> <date>
#> 1 101 2023-01-01
#> 2 101 2023-01-09
#> 3 101 2023-01-17
#> 4 101 2023-01-25
#> 5 101 2023-02-02
#> 6 102 2023-01-01
#> 7 102 2023-01-09
#> 8 102 2023-01-17
#> 9 102 2023-01-25
#> 10 102 2023-02-02
Created on 2023-02-08 by the reprex package (v2.0.1)
You can try using a recursive function as in this answer.
f <- function(d, ind = 1) {
ind.next <- dplyr::first(which(difftime(d, d[ind], units="days") > 7))
if (is.na(ind.next))
return(ind)
else
return(c(ind, f(d, ind.next)))
}
After the first date, the function will get the next index ind.next where the date is more than 7 days away. Recursively, add that index and get the next date after that. In the end, just return all the row indexes.
To use this function, group_by(ID) and then slice() to retain the rows whose indexes it returns.
library(dplyr)
df %>%
group_by(ID) %>%
slice(f(Date))
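For long date vectors, the same greedy selection can be written as a plain loop, which avoids deep recursion. This is a minimal base R sketch, assuming the dates are already sorted in ascending order; keep_spaced() and its gap argument are hypothetical names introduced only for illustration.

```r
# Hypothetical helper: returns the indexes of the dates to keep.
# Assumes `dates` is sorted ascending. A date is kept when it lies more than
# `gap` days after the most recently kept date.
keep_spaced <- function(dates, gap = 7) {
  keep <- integer(0)
  last <- NULL
  for (i in seq_along(dates)) {
    if (is.null(last) || as.numeric(dates[i] - last, units = "days") > gap) {
      keep <- c(keep, i)
      last <- dates[i]
    }
  }
  keep
}

# Dates from the question, parsed from month-day-year strings
dates <- as.Date(c("10-17-2021", "10-19-2021", "10-20-2021",
                   "10-31-2021", "11-01-2021"), format = "%m-%d-%Y")
kept <- dates[keep_spaced(dates)]
```

It returns row indexes just like f(), so it could be dropped into the same slice() call.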
I have daily-level data, as in the data frame below.
a = c("a","a","a","a","a","b","b","b","b","b")
a = as.data.frame(a)
a$date = seq.Date(as.Date("2022-06-01"), as.Date("2022-06-10"), by = 1)
a$value = c(8,7,7,7,8,9,9,9,7,8)
The desired output should be
a = c("a","a","a","b","b","b")
a = as.data.frame(a)
a$startdate = c("2022-06-01","2022-06-02","2022-06-05","2022-06-06","2022-06-09","2022-06-10")
a$enddate = c("2022-06-01","2022-06-04","2022-06-05","2022-06-08","2022-06-09","2022-06-10")
a$value = c(8,7,8,9,7,8)
Thanks
I have tried one solution involving 2 for loops and then aggregation but it is very slow. It would be of great help if I get a faster solution.
It looks like you want to filter to rows where there’s a change from the previous value?
library(dplyr)
a %>%
group_by(a) %>%
filter(value != lag(value, default = -Inf)) %>%
ungroup()
# A tibble: 6 × 3
a date value
<chr> <date> <dbl>
1 a 2022-06-01 8
2 a 2022-06-02 7
3 a 2022-06-05 8
4 b 2022-06-06 9
5 b 2022-06-09 7
6 b 2022-06-10 8
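The filter above keeps the first row of each run, so it yields the start dates but not the end dates shown in the desired output. One possible way to get both is to label each run of equal values and summarise per run. This is a sketch using the question's data; the run column is a helper introduced for illustration.

```r
library(dplyr)

a <- data.frame(a = rep(c("a", "b"), each = 5))
a$date <- seq.Date(as.Date("2022-06-01"), as.Date("2022-06-10"), by = 1)
a$value <- c(8, 7, 7, 7, 8, 9, 9, 9, 7, 8)

runs <- a %>%
  group_by(a) %>%
  # run increases by one every time value differs from the previous row
  mutate(run = cumsum(value != lag(value, default = first(value)))) %>%
  group_by(a, run, value) %>%
  summarise(startdate = min(date), enddate = max(date), .groups = "drop") %>%
  select(a, startdate, enddate, value)
```

Everything here is vectorised, so it should scale much better than nested for loops.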
I'm new to R and I'm facing a problem. I have a date vector and a dataframe containing sales values and coverage start and end dates.
I need to defer the sale value at each analysis date. For the first analysis period, I can write code that gives me the desired answer. However, my real data has 200K+ rows and 50+ analysis periods.
I'm not able to build a loop or find an alternative function in R that allows me to create the variables Aux[i] and Test[i] according to the number of dates present in the vec_date vector.
The following is an example of code that works for the first analysis period.
library(tidyverse)
library(lubridate)
df <- tibble(DateIn = c(ymd("2021-10-21", "2021-12-25", "2022-05-11")),
DateFin = c(ymd("2022-03-10", "2022-07-12", "2023-02-15")),
Premium = c(11000, 5000, 24500))
date <- ymd("2021-12-31")
vec_date <- date %m+% months(seq(0, 12, by = 6))
df_new <- df |>
mutate(duration = as.numeric(DateFin - DateIn),
Pr_day = Premium/duration,
Aux1 = if_else(DateIn > vec_date[1] | DateFin < vec_date[1], "N", "Y"),
test1 = if_else(Aux1 == "Y" & DateFin > vec_date[1], as.numeric(DateFin - vec_date[1])*Pr_day,
if_else(DateIn > vec_date[1], Premium, 0)))
Does anyone have any idea how I could build this loop, or is there any R function/package that allows me to perform this interaction between my df dataframe and vec_date vector?
Edit: an outline of the format I would need as a result would be:
df_final <- tibble(DateIn = c(ymd("2021-10-21", "2021-12-25", "2022-05-11")),
DateFin = c(ymd("2022-03-10", "2022-07-12", "2023-02-15")),
Premium = c(11000, 5000, 24500),
Aux1 = c("Y", "Y", "N"),
test1 = c(5421.429, 4849.246, 24500.000),
Aux2 = c("N", "Y", "Y"),
test2 = c(0.0000, 301.5075, 20125.0000),
Aux3 = c("N", "N", "Y"),
test3 = c(0, 0, 4025))
Where, Aux1 and test1 are the results referring to vec_date[1], 2 = vec_date[2], 3 = vec_date[3]. For me it is important to keep the resulting variables in the same dataframe because later analysis will be done.
As @Jon Spring suggests in the comments, probably the preferred approach here
would be to use tidyr::complete() to extend your data frame, repeating each
row in it for each of your analysis dates. Then, you can stick to vectorized
calculations and get the analysis date column in the resulting data, too.
Below is how to do just that with the example data you provided. I took the
liberty of renaming some columns, and simplifying the control-flow based
calculation according to my understanding of the problem, based on what you
shared.
First, the example data slightly reframed:
library(tidyverse)
library(lubridate)
policies <- tibble(
policy_id = seq_len(3),
start = ymd("2021-10-21", "2021-12-25", "2022-05-11"),
end = ymd("2022-03-10", "2022-07-12", "2023-02-15"),
premium = c(11000, 5000, 24500)
)
policies
#> # A tibble: 3 x 4
#> policy_id start end premium
#> <int> <date> <date> <dbl>
#> 1 1 2021-10-21 2022-03-10 11000
#> 2 2 2021-12-25 2022-07-12 5000
#> 3 3 2022-05-11 2023-02-15 24500
Then, finding remaining prorated premiums for policies at given dates:
start_date <- ymd("2021-12-31")
dates <- start_date %m+% months(seq(0, 12, by = 6))
policies %>%
mutate(
days = as.numeric(end - start),
daily_premium = premium / days
) %>%
crossing(date = dates) %>%
mutate(
days_left = pmax(0, end - pmax(start, date)),
premium_left = days_left * daily_premium
) %>%
select(policy_id, date, days_left, premium_left)
#> # A tibble: 9 x 4
#> policy_id date days_left premium_left
#> <int> <date> <dbl> <dbl>
#> 1 1 2021-12-31 69 5421.
#> 2 1 2022-06-30 0 0
#> 3 1 2022-12-31 0 0
#> 4 2 2021-12-31 193 4849.
#> 5 2 2022-06-30 12 302.
#> 6 2 2022-12-31 0 0
#> 7 3 2021-12-31 280 24500
#> 8 3 2022-06-30 230 20125
#> 9 3 2022-12-31 46 4025
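If the results need to stay in the original one-row-per-policy layout, as the question asks, the same calculation can be looped over the analysis dates, adding one column per date. This is a hedged base R sketch: the three analysis dates are hard-coded to the values that the %m+% expression above produces, and the test1, test2, test3 column names follow the question's outline.

```r
policies <- data.frame(
  start   = as.Date(c("2021-10-21", "2021-12-25", "2022-05-11")),
  end     = as.Date(c("2022-03-10", "2022-07-12", "2023-02-15")),
  premium = c(11000, 5000, 24500)
)
# Assumption: analysis dates written out by hand (2021-12-31 plus 0, 6, 12 months)
dates <- as.Date(c("2021-12-31", "2022-06-30", "2022-12-31"))

daily_premium <- policies$premium / as.numeric(policies$end - policies$start)

for (i in seq_along(dates)) {
  # Coverage still to run counts from the later of policy start and analysis date
  from      <- pmax(as.numeric(policies$start), as.numeric(dates[i]))
  days_left <- pmax(0, as.numeric(policies$end) - from)
  policies[[paste0("test", i)]] <- days_left * daily_premium
}
```

The loop runs once per analysis date, not once per row, so 50+ dates over 200K+ rows stays cheap.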
I have a table of data which includes, among others, an ID, a (somehow sorted) grouping column and a date. For each ID, based on the minimum value of the date for a given group, I would like to filter out the rows of another given group that occurred after that date.
I thought about using pivot_wider and pivot_longer, but I was not able to operate on columns containing list values and single values simultaneously.
How can I do it efficiently (using any tidyverse method, if possible)?
For instance, given
library(dplyr)
tbl <- tibble(id = c(rep(1,5), rep(2,5)),
type = c("A","A","A","B","C","A","A","B","B","C"),
dat = as.Date("2021-12-07") - c(3,0,1,2,0,3,6,2,4,3))
# A tibble: 10 × 3
# id type dat
# <int> <chr> <date>
# 1 1 A 2021-12-04
# 2 1 A 2021-12-07
# 3 1 A 2021-12-06
# 4 1 B 2021-12-05
# 5 1 C 2021-12-07
# 6 2 A 2021-12-04
# 7 2 A 2021-12-01
# 8 2 B 2021-12-05
# 9 2 B 2021-12-03
# 10 2 C 2021-12-04
I would like the following result, where I discarded A-typed elements that occurred after the first of the B-typed ones, but none of the C-typed ones:
# A tibble: 7 × 3
# id type dat
# <int> <chr> <date>
# 1 1 A 2021-12-04
# 2 1 B 2021-12-05
# 3 1 C 2021-12-07
# 4 2 A 2021-12-01
# 5 2 B 2021-12-05
# 6 2 B 2021-12-03
# 7 2 C 2021-12-04
I like to use pivot_wider and pivot_longer in this case. It does the trick, but maybe you are looking for something shorter.
library(dplyr)
library(tidyr)
tbl <- tibble(id = 1:5, type = c("A","A","A","B","C"), dat = as.Date("2021-12-07") - c(3,4,1,2,0)) %>%
pivot_wider(names_from = type, values_from = dat) %>%
filter(A < min(B, na.rm = TRUE) | is.na(A)) %>%
pivot_longer(2:4, names_to = "type", values_to = "dat") %>%
na.omit()
# A tibble: 4 × 3
id type dat
<int> <chr> <date>
1 1 A 2021-12-04
2 2 A 2021-12-03
3 4 B 2021-12-05
4 5 C 2021-12-07
An easy way, using SQL-like logic:
tbl_to_delete <- tbl %>% dplyr::filter(type == "A" & dat > min(tbl$dat[tbl$type=="B"]))
tbl2 <- tbl %>% dplyr::anti_join(tbl_to_delete,by=c("type","dat"))
First you isolate the rows you want to delete, then you discard them from your original data.
You can of course merge the two lines above into one for better code management:
tbl %>% anti_join(tbl %>% filter(type == "A" & dat > min(tbl$dat[tbl$type=="B"])),by=c("type","dat"))
Or, if you really hate base R:
tbl %>% anti_join(tbl %>% filter(type == "A" & dat > tbl %>% filter(type == "B") %>% pull(dat) %>% min()),by=c("type","dat"))
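The versions above take the earliest B date over the whole table, whereas the question asks for the comparison to be made within each id. A minimal sketch of a grouped variant follows, using the question's data and assuming every id has at least one B row.

```r
library(dplyr)

tbl <- tibble(id = c(rep(1, 5), rep(2, 5)),
              type = c("A", "A", "A", "B", "C", "A", "A", "B", "B", "C"),
              dat = as.Date("2021-12-07") - c(3, 0, 1, 2, 0, 3, 6, 2, 4, 3))

res <- tbl %>%
  group_by(id) %>%
  # Within each id, drop A rows dated after that id's earliest B row
  filter(!(type == "A" & dat > min(dat[type == "B"]))) %>%
  ungroup()
```

On the example data this keeps seven rows, the same ones listed in the question's expected result.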
I am struggling to pass column names inside my custom function while using dplyr - mutate_at.
I have a dataset "dt" with thousands of columns and I would like to perform mutate for some of these columns, but in a way which is dependent on the column name
I have this piece of code
Option 1:
relevantcols = c("A", "B", "C")
myfunc <- function(colname, x) {
#write different logic per column name
}
dt%>%
mutate_at(relevantcols, funs(myfunc(<what should i give?>,.)))
I tried approaching the problem in another way, i.e by iterating over relevantcols and applying mutate_at for each of the elements of the vector as follows
Option 2:
for (i in 1:length(relevantcols)){
dt%>%
mutate_at(relevantcols[i], funs(myfunc(relevantcols[i], .)))
}
I get the colnames in Option 2, but it is 10 times slower than Option 1. Can I get somehow the column names in Option 1?
Adding an example for more clarity
df = data.frame(employee=seq(1:5), Mon_channelA=runif(5,1,10), Mon_channelB=runif(5,1,10), Tue_channelA=runif(5,1,10),Tue_channelB=runif(5,1,10))
df
employee Mon_channelA Mon_channelB Tue_channelA Tue_channelB
1 1 5.234383 6.857227 4.480943 7.233947
2 2 7.441399 3.777524 2.134075 6.310293
3 3 7.686558 8.598688 9.814882 9.192952
4 4 6.033345 5.658716 5.167388 3.018563
5 5 5.595006 7.582548 9.302917 6.071108
relevantcols = c("Mon_channelA", "Mon_channelB")
myfunc <- function(colname, x) {
#based on the channel and weekday, compare the data from corresponding column with the same channel but different weekday and return T if higher else F
}
# required output
employee Mon_channelA Mon_channelB Tue_channelA Tue_channelB
1 1 T F 4.480943 7.233947
2 2 T F 2.134075 6.310293
3 3 F F 9.814882 9.192952
4 4 T T 5.167388 3.018563
5 5 F T 9.302917 6.071108
I left a comment about data types, but assuming that that is what you're looking for, here's the approach I take to these sorts of problems. I do this in a seemingly convoluted process of reshaping a few times, but it lets you set up the variables that you're trying to compare without hard-coding much. I'll break it into pieces.
library(tidyverse)
set.seed(928)
df <- data.frame(employee=seq(1:5), Mon_channelA=runif(5,1,10), Mon_channelB=runif(5,1,10), Tue_channelA=runif(5,1,10),Tue_channelB=runif(5,1,10))
First, I'd reshape it into a long shape and break the "Mon_channelA", etc apart into a day and a channel. This lets you use the channel designation to match values for comparison.
df %>%
gather(key, value, -employee) %>%
separate(key, into = c("day", "channel"), sep = "_") %>%
head()
#> employee day channel value
#> 1 1 Mon channelA 2.039619
#> 2 2 Mon channelA 8.153684
#> 3 3 Mon channelA 9.027932
#> 4 4 Mon channelA 1.161967
#> 5 5 Mon channelA 3.583353
#> 6 1 Mon channelB 7.102797
Then, bring it back into a wide format based on the days. Now you have a column for each day for each combination of employee and channel.
df %>%
gather(key, value, -employee) %>%
separate(key, into = c("day", "channel"), sep = "_") %>%
spread(key = day, value = value) %>%
head()
#> employee channel Mon Tue
#> 1 1 channelA 2.039619 9.826677
#> 2 1 channelB 7.102797 7.388568
#> 3 2 channelA 8.153684 5.848375
#> 4 2 channelB 6.299178 9.452274
#> 5 3 channelA 9.027932 5.458906
#> 6 3 channelB 7.029408 7.087011
Then do your comparison, and take the data long again. Note that because the value column has numeric values, everything becomes numeric and the logical values are converted to 1 or 0.
df %>%
gather(key, value, -employee) %>%
separate(key, into = c("day", "channel"), sep = "_") %>%
spread(key = day, value = value) %>%
mutate(Mon = Mon > Tue) %>%
gather(key = day, value = value, Mon, Tue) %>%
head()
#> employee channel day value
#> 1 1 channelA Mon 0
#> 2 1 channelB Mon 0
#> 3 2 channelA Mon 1
#> 4 2 channelB Mon 0
#> 5 3 channelA Mon 1
#> 6 3 channelB Mon 0
The last few steps are to stick the day and channel back together to make the labels as you had them, spread back to a wide format, and turn all the columns starting with "Mon" back into logicals.
df %>%
gather(key, value, -employee) %>%
separate(key, into = c("day", "channel"), sep = "_") %>%
spread(key = day, value = value) %>%
mutate(Mon = Mon > Tue) %>%
gather(key = day, value = value, Mon, Tue) %>%
unite("variable", day, channel) %>%
spread(key = variable, value = value) %>%
mutate_at(vars(starts_with("Mon")), as.logical)
#> employee Mon_channelA Mon_channelB Tue_channelA Tue_channelB
#> 1 1 FALSE FALSE 9.826677 7.388568
#> 2 2 TRUE FALSE 5.848375 9.452274
#> 3 3 TRUE FALSE 5.458906 7.087011
#> 4 4 FALSE FALSE 8.854263 8.946458
#> 5 5 FALSE FALSE 6.933054 8.450741
Created on 2018-09-28 by the reprex package (v0.2.1)
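In current tidyr, gather() and spread() are superseded by pivot_longer() and pivot_wider(). The reshape-and-compare core of the steps above might look as follows with the newer functions. This is a sketch on a small made-up data frame with fixed values in place of the random runif() data, and Mon_higher is an illustrative column name.

```r
library(dplyr)
library(tidyr)

# Made-up values so the result is deterministic
df <- data.frame(employee = 1:3,
                 Mon_channelA = c(5, 2, 9), Mon_channelB = c(1, 8, 3),
                 Tue_channelA = c(4, 6, 9), Tue_channelB = c(2, 7, 3))

flags <- df %>%
  # Split names such as "Mon_channelA" into a day and a channel column
  pivot_longer(-employee, names_to = c("day", "channel"), names_sep = "_") %>%
  # One column per day, one row per employee/channel pair
  pivot_wider(names_from = day, values_from = value) %>%
  mutate(Mon_higher = Mon > Tue)
```

Stopping at this shape keeps the logical result in its own column, which sidesteps the numeric coercion noted above.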
You can do things like :
L <- c("A","B")
df <- data.frame(A=rep(1:3,2),B=1:6,C=7:12)
df
# A B C
#1 1 1 7
#2 2 2 8
#3 3 3 9
#4 1 4 10
#5 2 5 11
#6 3 6 12
f <- function(x,y) x^y
df %>% mutate_at(L,funs(f(.,2)))
# A B C
#1 1 1 7
#2 4 4 8
#3 9 9 9
#4 1 16 10
#5 4 25 11
#6 9 36 12
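The call above passes a fixed extra argument, but not the column name itself. One way to hand both the name and the values to a function is base R's Map(), which walks over the names and the columns in parallel. This is a minimal sketch with made-up values; the Mon-to-Tue pairing inside myfunc() is an assumption based on the question's description. In recent dplyr versions, across() combined with cur_column() is another route to the same goal.

```r
df <- data.frame(employee = 1:3,
                 Mon_channelA = c(5, 2, 9), Mon_channelB = c(1, 8, 3),
                 Tue_channelA = c(4, 6, 9), Tue_channelB = c(2, 7, 3))
relevantcols <- c("Mon_channelA", "Mon_channelB")

# Receives the column name and its values; looks up the matching Tue column in df
myfunc <- function(colname, x) {
  partner <- sub("^Mon", "Tue", colname)
  x > df[[partner]]
}

# Map() calls myfunc(name, column) for each relevant column
df[relevantcols] <- Map(myfunc, relevantcols, df[relevantcols])
```

All columns are computed before the assignment happens, so each comparison sees the original data.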
This is an old question, but I just stumbled over one possible way to solve it using a custom mutate/case_when function in combination with purrr::reduce.
It's important to use non-standard evaluation (NSE) inside the mutate/case_when statement to match the variable names you need for your custom function.
I do not know a way to do something similar with mutate_at.
Below I provide two examples: the most basic form (using your original data), and a more advanced version (with three weekdays and two channels) that creates more than two variables. The latter requires an initial set-up using, for example, switch.
Basic example
library(tidyverse)
# your data
df <- data.frame(employee=seq(1:5),
Mon_channelA=runif(5,1,10),
Mon_channelB=runif(5,1,10),
Tue_channelA=runif(5,1,10),
Tue_channelB=runif(5,1,10)
)
# custom function which takes two arguments, df and a string variable name
myfunc <- function(df, x) {
mutate(df,
# overwrites all "Mon_channel" variables ...
!! paste0("Mon_", x) := case_when(
# ... with TRUE, when Mon_channel is smaller than Tue_channel, and FALSE else
!! sym(paste0("Mon_", x)) < !! sym(paste0("Tue_", x)) ~ T,
T ~ F
)
)
}
# define the variables you want to loop over
var_ls <- c("channelA", "channelB")
# use var_ls and myfunc with reduce on your data
df %>%
reduce(var_ls, myfunc, .init = .)
#> employee Mon_channelA Mon_channelB Tue_channelA Tue_channelB
#> 1 1 FALSE FALSE 3.437975 2.458389
#> 2 2 FALSE TRUE 3.686903 4.772390
#> 3 3 TRUE TRUE 5.158234 5.378021
#> 4 4 TRUE TRUE 5.338950 3.109760
#> 5 5 TRUE FALSE 6.365173 3.450495
Created on 2020-02-03 by the reprex package (v0.3.0)
More advanced example
library(tidyverse)
# your data plus one weekday with two channels
df <- data.frame(employee=seq(1:5),
Mon_channelA=runif(5,1,10),
Mon_channelB=runif(5,1,10),
Tue_channelA=runif(5,1,10),
Tue_channelB=runif(5,1,10),
Wed_channelA=runif(5,1,10),
Wed_channelB=runif(5,1,10)
)
# custom function which takes two arguments, df and a string variable name
myfunc <- function(df, x) {
# an initial set-up is needed
# id gets the original day
id <- str_extract(x, "^\\w{3}")
# based on id the day of comparison is mapped with switch
y <- switch(id,
"Mon" = "Tue",
"Tue" = "Wed")
# j extracts the channel name including the underscore
j <- str_extract(x, "_channel[A-Z]{1}")
# this makes the function definition rather easy:
mutate(df,
!! x := case_when(
!! sym(x) < !! sym(paste0(y, j)) ~ T,
T ~ F
)
)
}
# define the variables you want to loop over
var_ls <- c("Mon_channelA",
"Mon_channelB",
"Tue_channelA",
"Tue_channelB")
# use var_ls and myfunc with reduce on your data
df %>%
reduce(var_ls, myfunc, .init = .)
#> employee Mon_channelA Mon_channelB Tue_channelA Tue_channelB
#> 1 1 TRUE TRUE TRUE FALSE
#> 2 2 FALSE TRUE TRUE FALSE
#> 3 3 FALSE TRUE FALSE TRUE
#> 4 4 FALSE TRUE TRUE FALSE
#> 5 5 TRUE FALSE FALSE FALSE
#> Wed_channelA Wed_channelB
#> 1 9.952454 5.634686
#> 2 9.356577 4.514683
#> 3 2.721330 7.107316
#> 4 4.410240 2.740289
#> 5 5.394057 4.772162
Created on 2020-02-03 by the reprex package (v0.3.0)