full_join but with condition on matching - r

I am going to try to make this as simple as possible; I would like a dplyr solution if possible.
Let's say I have a data frame called f1 with 2 columns: reference, the reference number of an event, and date_begin, the begin date of the event:
f1
reference date_begin
1 01100144609598 2020-08-15
2 01100144692499 2020-08-12
3 01100144609598 2020-08-09
4 01100434045112 2020-08-26
5 01100434067379 2020-08-24
6 01100723546188 2020-08-16
I also have another data frame called f2 with 2 columns: reference, the reference number of an event, and date_end, the ending date of the event:
reference date_end
1 01100144609598 2020-09-06
2 01100144692499 2020-08-10
3 01100434121179 2020-08-25
4 01100578756185 2020-08-17
5 01100578757962 2020-08-31
6 01100578846401 2020-08-16
I want to use a full_join by reference, subject to the following conditions:
If there is an end date before there is a begin date, I want an NA instead of the begin date.
The end date has to be greater than the begin date.
If there are 2 end dates for the same reference that are later than a begin date, take the smaller end date.
If there is a begin date with no end date, the end date should be NA.
So in this reproducible example I should have an f3 that looks something like the following:
reference date_begin date_end
1 01100144609598 2020-08-15 2020-09-06
2 01100144692499 NA 2020-08-10
3 01100144692499 2020-08-12 NA
4 01100434121179 NA 2020-08-25
5 01100578756185 NA 2020-08-17
6 01100578757962 NA 2020-08-31
7 01100578846401 NA 2020-08-16
8 01100144609598 2020-08-09 NA
9 01100434045112 2020-08-26 NA
10 01100434067379 2020-08-24 NA
11 01100723546188 2020-08-16 NA
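As an aside, recent dplyr (>= 1.1.0) supports inequality conditions directly in joins via join_by(). A minimal sketch along those lines, assuming that version and spelling out the tie-breaking rules above explicitly (smallest end date per begin date, one begin date per end date); it happens to reproduce the expected f3 rows here, ordering aside, but the answers below do not depend on it:
library(dplyr)  # join_by() inequality conditions need dplyr >= 1.1.0
# Pair each begin date with the end dates that come after it,
# then apply the tie-breaking rules from the question.
pairs <- inner_join(f1, f2, by = join_by(reference, date_begin < date_end)) %>%
  group_by(reference, date_begin) %>%
  slice_min(date_end, n = 1) %>%    # smallest end date after each begin
  group_by(reference, date_end) %>%
  slice_max(date_begin, n = 1) %>%  # keep one begin date per end date
  ungroup()
# Bring back the unmatched rows from both sides with NAs.
f3 <- pairs %>%
  full_join(f1, by = c("reference", "date_begin")) %>%
  full_join(f2, by = c("reference", "date_end"))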

As Chuck P mentions, the conditions make this a bit complicated. Rather than use full_join, I've first combined f1 and f2 and then transformed to "long" format. We can then group by reference and sort by date to set up case_when to apply the conditions stated in the post (or other conditions as needed). The result is then transformed back to "wide" format to present as shown in the post. The code is:
library(tidyverse)
#
# combine f1 and f2 and pivot to long format
#
all <- bind_rows(f1, f2) %>%
  pivot_longer(cols = c(date_begin, date_end),
               names_to = "type", values_to = "date",
               values_drop_na = TRUE)
#
# group by reference, sort by date, and then use
# case_when function to pair begin and end dates
#
all <- all %>%
  group_by(reference) %>%
  arrange(date) %>%
  mutate(index = 1:n(),
         index = case_when(
           type == "date_end" & lag(type, n = 1) == "date_begin" ~ lag(index),
           TRUE ~ index))
#
# pivot back to wide format to format results as shown in post
#
result <- all %>%
  pivot_wider(names_from = type, values_from = date) %>%
  mutate(index = NULL)
The result is
> result
# A tibble: 11 x 3
# Groups: reference [9]
reference date_begin date_end
<chr> <date> <date>
1 01100144609598 2020-08-09 NA
2 01100144692499 NA 2020-08-10
3 01100144692499 2020-08-12 NA
4 01100144609598 2020-08-15 2020-09-06
5 01100723546188 2020-08-16 NA
6 01100578846401 NA 2020-08-16
7 01100578756185 NA 2020-08-17
8 01100434067379 2020-08-24 NA
9 01100434121179 NA 2020-08-25
10 01100434045112 2020-08-26 NA
11 01100578757962 NA 2020-08-31
where the results are sorted by date.
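If the residual grouping from group_by() gets in the way downstream, a final ungroup() (plus any preferred sort) tidies the result; a minimal sketch:
result <- result %>%
  ungroup() %>%
  arrange(reference, date_begin)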

This is more complex than it first appears because of the conditional logic. I broke it down into three steps that occur after we do the initial full_join to make f3:
library(dplyr)
library(tidyr)
library(purrr)
f3 <- full_join(f1, f2)
#> Joining, by = "reference"
f3
#> reference date_begin date_end
#> 1 01100144609598 2020-08-15 2020-09-06
#> 2 01100144692499 2020-08-12 2020-08-10
#> 3 01100144609598 2020-08-09 2020-09-06
#> 4 01100434045112 2020-08-26 <NA>
#> 5 01100434067379 2020-08-24 <NA>
#> 6 01100723546188 2020-08-16 <NA>
#> 7 01100434121179 <NA> 2020-08-25
#> 8 01100578756185 <NA> 2020-08-17
#> 9 01100578757962 <NA> 2020-08-31
#> 10 01100578846401 <NA> 2020-08-16
Step 1: set aside the rows where we don't have to do anything, because either the begin date or the end date is NA.
nothing_to_do <- f3 %>%
  filter(is.na(date_begin) | is.na(date_end))
Step 2: identify rows where we have a begin date after an end date, like "01100144692499". For these we actually have to add a row and then adjust the rows.
end_before_beginning <- f3 %>%
  filter(date_begin > date_end) %>%
  group_by(reference) %>%
  do(
    add_row(.,
            reference = .$reference,
            date_begin = .$date_begin,
            .after = 1)
  ) %>%
  ungroup() %>%
  mutate(date_begin =
           case_when(
             !is.na(date_end) ~ as.Date(NA_character_),
             TRUE ~ date_begin
           ))
Step 3: identify rows with multiple begin dates for the same end date, like "01100144609598", where we have to select the pairing with the shortest gap.
multiple_beginnings <- f3 %>%
  group_by(reference, date_end) %>%
  mutate(instances = n(),
         date_diff = date_end - date_begin) %>%
  filter(instances > 1) %>%
  mutate(date_end =
           case_when(
             date_diff != min(date_diff) ~ as.Date(NA_character_),
             TRUE ~ date_end
           )) %>%
  select(-instances, -date_diff)
Glue them all together:
final_answer <-
  list(nothing_to_do, end_before_beginning, multiple_beginnings) %>%
  reduce(full_join)
#> Joining, by = c("reference", "date_begin", "date_end")
#> Joining, by = c("reference", "date_begin", "date_end")
final_answer
#> reference date_begin date_end
#> 1 01100434045112 2020-08-26 <NA>
#> 2 01100434067379 2020-08-24 <NA>
#> 3 01100723546188 2020-08-16 <NA>
#> 4 01100434121179 <NA> 2020-08-25
#> 5 01100578756185 <NA> 2020-08-17
#> 6 01100578757962 <NA> 2020-08-31
#> 7 01100578846401 <NA> 2020-08-16
#> 8 01100144692499 <NA> 2020-08-10
#> 9 01100144692499 2020-08-12 <NA>
#> 10 01100144609598 2020-08-15 2020-09-06
#> 11 01100144609598 2020-08-09 <NA>
Your data...
f1 <- structure(list(reference = c("01100144609598", "01100144692499",
"01100144609598", "01100434045112", "01100434067379", "01100723546188"),
date_begin = structure(c(18489, 18486, 18483, 18500, 18498,
18490), class = "Date")), row.names = c(NA, -6L), class = "data.frame")
f2 <- structure(list(reference = c("01100144609598", "01100144692499",
"01100434121179", "01100578756185", "01100578757962", "01100578846401"),
date_end = structure(c(18511, 18484, 18499, 18491, 18505,
18490), class = "Date")), row.names = c(NA, -6L), class = "data.frame")

Related

Joining two data frames on the closest date in R

I have two datasets that I would like to join based on date. One is a survey dataset, and the other is a list of prices at various dates. The dates don't match exactly, so I would like to join on the nearest date in the survey dataset (the price data is weekly).
Here's a brief snippet of what the survey dataset looks like (there are many other variables, but here are the two most relevant):
ID          actual.date
20120377    2012-09-26
2020455822  2020-11-23
20126758    2012-10-26
20124241    2012-10-25
2020426572  2020-11-28
And here's the price dataset (also much larger, but you get the idea):
date        price.var1        price.var2
2017-10-30  2.74733926399869  2.73994826674735
2015-03-16  2.77028200438506  2.74079930272231
2010-10-18  3.4265947805337   3.41591263539176
2012-10-29  4.10095806545397  4.14717556976502
2012-01-09  3.87888859352037  3.93074237884497
What I would like to do is join the price dataset to the survey dataset, joining on the nearest date.
I've tried a number of different things, none of which have worked to my satisfaction.
#reading in sample data
library(data.table)
library(dplyr)
survey <- fread(" ID actual.date
1: 20120377 2012-09-26
2: 2020455822 2020-11-23
3: 20126758 2012-10-26
4: 20124241 2012-10-25
5: 2020426572 2012-11-28") %>% select(-V1)
price <- fread("date price.var1 price.var2
1: 2017-10-30 2.747339 2.739948
2: 2015-03-16 2.770282 2.740799
3: 2010-10-18 3.426595 3.415913
4: 2012-10-29 4.100958 4.147176
5: 2012-01-09 3.878889 3.930742") %>% select(-V1)
#using data.table
setDT(survey)[,DT_DATE := actual.date]
setDT(price)[,DT_DATE := date]
survey_price <- survey[price,on=.(DT_DATE),roll="nearest"]
#This works, and they join, but it drops a ton of observations, which won't work
#using dplyr
library(dplyr)
survey_price <- left_join(survey,price,by=c("actual.date"="date"))
#this joins them without dropping observations, but all of the price variables become NAs
You were almost there.
In the DT[i, on] syntax, i should be survey so that the join keeps all of survey's rows:
setDT(survey)
setDT(price)
survey_price <- price[survey,on=.(date=actual.date),roll="nearest"]
survey_price
date price.var1 price.var2 ID
<IDat> <num> <num> <int>
1: 2012-09-26 4.100958 4.147176 20120377
2: 2020-11-23 2.747339 2.739948 2020455822
3: 2012-10-26 4.100958 4.147176 20126758
4: 2012-10-25 4.100958 4.147176 20124241
5: 2020-11-28 2.747339 2.739948 2020426572
Convert the dates to numeric, find the closest survey date for each price date with Closest() from DescTools, and take that value.
Example datasets
survey <- tibble(
  ID = sample(20000:40000, 9, replace = TRUE),
  actual.date = seq(today() %m+% days(5),
                    today() %m+% days(5) %m+% months(2), "week")
)
price <- tibble(
  date = seq(today(), today() %m+% months(2), by = "week"),
  price_1 = sample(2:6, 9, replace = TRUE),
  price_2 = sample(2:6, 9, replace = TRUE)
)
survey
# A tibble: 9 x 2
ID actual.date
<int> <date>
1 34592 2022-05-07
2 37846 2022-05-14
3 22715 2022-05-21
4 22510 2022-05-28
5 30143 2022-06-04
6 34348 2022-06-11
7 21538 2022-06-18
8 39802 2022-06-25
9 36493 2022-07-02
price
# A tibble: 9 x 3
date price_1 price_2
<date> <int> <int>
1 2022-05-02 6 6
2 2022-05-09 3 2
3 2022-05-16 6 4
4 2022-05-23 6 2
5 2022-05-30 2 6
6 2022-06-06 2 4
7 2022-06-13 2 2
8 2022-06-20 3 5
9 2022-06-27 5 6
library(tidyverse)
library(lubridate)
library(DescTools)
price <- price %>%
  mutate(date = Closest(survey$actual.date %>% as.numeric,
                        date %>% as.numeric) %>%
           as_date())
# A tibble: 9 x 3
date price_1 price_2
<date> <int> <int>
1 2022-05-07 6 6
2 2022-05-14 3 2
3 2022-05-21 6 4
4 2022-05-28 6 2
5 2022-06-04 2 6
6 2022-06-11 2 4
7 2022-06-18 2 2
8 2022-06-25 3 5
9 2022-07-02 5 6
merge(survey, price, by.x = "actual.date", by.y = "date")
actual.date ID price_1 price_2
1 2022-05-07 34592 6 6
2 2022-05-14 37846 3 2
3 2022-05-21 22715 6 4
4 2022-05-28 22510 6 2
5 2022-06-04 30143 2 6
6 2022-06-11 34348 2 4
7 2022-06-18 21538 2 2
8 2022-06-25 39802 3 5
9 2022-07-02 36493 5 6
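For completeness, the same nearest-date idea works without DescTools. A minimal base-R sketch, using the example tibbles above (before the date column was overwritten) and assuming both frames are small enough to scan pairwise: for each survey date, take the price row with the smallest absolute day difference.
nearest <- vapply(survey$actual.date, function(d) {
  which.min(abs(as.numeric(price$date - d)))  # index of nearest price row
}, integer(1))
survey_price <- cbind(survey, price[nearest, c("price_1", "price_2")])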

Create new column using tidy evaluation on the left and right of mutate in R

I know there are many questions regarding tidy evaluation in R. However, I couldn't figure out a solution to this seemingly easy problem.
I have this data.frame:
structure(list(Date = c("25.02.2020", "26.02.2020", "27.02.2020",
"28.02.2020", "02.03.2020", "03.03.2020", "04.03.2020", "05.03.2020",
"06.03.2020", "09.03.2020", "10.03.2020", "11.03.2020", "12.03.2020",
"13.03.2020", "16.03.2020", "17.03.2020", "18.03.2020", "19.03.2020",
"20.03.2020", "23.03.2020", "24.03.2020", "25.03.2020", "26.03.2020",
"27.03.2020", "30.03.2020", "31.03.2020", "01.04.2020", "02.04.2020",
"03.04.2020", "06.04.2020"), change_AAPL_stocks = c("1", "0,95",
"0,93", "0,85", "0,94", "1,01", "0,99", "0,98", "0,94", "0,88",
"0,92", "0,92", "0,85", "0,88", "0,8", "0,82", "0,8", "0,82",
"0,82", "0,76", "0,79", "0,83", "0,82", "0,84", "0,83", "0,85",
"0,82", "0,8", "0,81", "0,83"), change_AMZN_stocks = c("1", "0,97",
"0,95", "0,9", "0,94", "0,97", "0,96", "0,95", "0,93", "0,88",
"0,92", "0,92", "0,85", "0,87", "0,81", "0,88", "0,86", "0,92",
"0,95", "0,9", "0,96", "0,95", "0,94", "0,95", "0,95", "0,97",
"0,95", "0,94", "0,94", "0,96")), row.names = c(NA, -30L), class = c("tbl_df",
"tbl", "data.frame"))
And I have these variables
date_col = "Date"
date_format = "%d.%m.%Y"
value_col = "change_AAPL_stocks"
And I'd like to write a function that can take arbitrary date_col and date_format values.
The code at the moment looks like this:
df %>%
  select(date_col, value_col) %>%
  mutate(
    {{date_col}} := as.Date({date_col}, format = date_format)
  )
This creates (overwrites) the column named Date. However, the as.Date(...) call does not work, and I am not entirely sure what to do about that.
We can use .data to subset:
library(dplyr)
df %>%
  select(all_of(date_col)) %>%
  mutate(!!date_col := as.Date(.data[[date_col]], format = date_format))
Output:
# A tibble: 30 × 1
Date
<date>
1 2020-02-25
2 2020-02-26
3 2020-02-27
4 2020-02-28
5 2020-03-02
6 2020-03-03
7 2020-03-04
8 2020-03-05
9 2020-03-06
10 2020-03-09
# … with 20 more rows
Use sym() and unquote it with !! on the right-hand side.
df %>%
  select(all_of(date_col)) %>%
  mutate(
    {{date_col}} := as.Date(!!sym(date_col), format = date_format)
  )
# A tibble: 30 x 1
Date
<date>
1 2020-02-25
2 2020-02-26
3 2020-02-27
4 2020-02-28
5 2020-03-02
6 2020-03-03
7 2020-03-04
8 2020-03-05
9 2020-03-06
10 2020-03-09
# ... with 20 more rows
You can generalize the function to take a symbol or a character as input as follows:
f <- function(data, date_col){
  if (rlang::is_symbol(date_col)) {
    rhs <- enquo(date_col)
    date_col <- as.character(date_col)
  } else {
    rhs <- sym(date_col)
  }
  data %>%
    select(all_of(date_col)) %>%
    mutate(
      {{date_col}} := as.Date(!!rhs, format = date_format)
    )
}
df %>%
f('Date')
# A tibble: 30 x 1
Date
<date>
1 2020-02-25
2 2020-02-26
3 2020-02-27
4 2020-02-28
5 2020-03-02
6 2020-03-03
7 2020-03-04
8 2020-03-05
9 2020-03-06
10 2020-03-09
# ... with 20 more rows
df %>%
f(date_col)
# A tibble: 30 x 1
Date
<date>
1 2020-02-25
2 2020-02-26
3 2020-02-27
4 2020-02-28
5 2020-03-02
6 2020-03-03
7 2020-03-04
8 2020-03-05
9 2020-03-06
10 2020-03-09
# ... with 20 more rows
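A related sketch: if the column name always arrives as a character string, no sym()/enquo() branching is needed at all; the .data pronoun plus glue-style name injection on the left-hand side covers it, assuming a reasonably recent rlang. The f2 name and the default format are just illustrative:
f2 <- function(data, date_col, date_format = "%d.%m.%Y") {
  data %>%
    # "{date_col}" := injects the string as the new column name
    mutate("{date_col}" := as.Date(.data[[date_col]], format = date_format))
}
df %>% f2("Date")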

Find if a date overlaps between multiple pairs of vectorised dates

I'm trying to find whether a date falls between any of multiple pairs of dates that are stored wide in my dataset. The number of pairs given here is just an example; the eventual number may be larger or smaller. I'm not sure this is the most sensible layout, but working long didn't seem to work. This is also a very common way to handle overlapping dates and date pairs in SPSS, where you can have multiple numbered variables like the dates here and it works through each numbered 'set' to give you a response.
Here is an example dataset:
person key_date 1_end_date 2_end_date 3_end_date 4_end_date 1_start_date 2_start_date 3_start_date 4_start_date
1 1 2019-09-30 2019-05-23 2019-09-30 2016-07-22 <NA> 2019-05-23 2019-09-30 2016-07-22 <NA>
2 2 2019-06-07 2019-05-16 2019-06-07 <NA> <NA> 2019-05-16 <NA> <NA> <NA>
3 3 2020-03-09 2016-06-02 2019-08-09 2020-05-27 2020-02-12 2016-06-02 2019-08-09 2020-05-27 2020-03-09
test <- structure(list(person = 1:3,
  key_date = structure(c(18169, 18054, 18330), class = "Date"),
  `1_end_date` = structure(c(18039, 18032, 16954), class = "Date"),
  `2_end_date` = structure(c(18169, 18054, 18117), class = "Date"),
  `3_end_date` = structure(c(17004, NA, 18409), class = "Date"),
  `4_end_date` = structure(c(NA, NA, 18304), class = "Date"),
  `1_start_date` = structure(c(18039, 18032, 16954), class = "Date"),
  `2_start_date` = structure(c(18169, NA, 18117), class = "Date"),
  `3_start_date` = structure(c(17004, NA, 18409), class = "Date"),
  `4_start_date` = structure(c(NA, NA, 18330), class = "Date")),
  row.names = c(NA, 3L), class = "data.frame")
The expected output would be just a binary flag to indicate that the key_date falls between any pair of start_date and end_date. In the example given, that would mean persons 1 and 3. Any ideas how to do this? Is this layout really inefficient?
tidyverse approach
library(tidyverse)
result <- test %>%
  mutate(across(ends_with("end_date"),
                ~ key_date <= . & key_date >= get(str_replace(cur_column(), "end", "start")),
                .names = '{.col}_flag')) %>%
  rowwise() %>%
  mutate(Flag1 = sum(c_across(ends_with("flag")), na.rm = TRUE)) %>%
  ungroup() %>%
  select(-ends_with("flag"))
> result$Flag1
[1] 1 0 0
The complete output will look like:
> result
# A tibble: 3 x 11
person key_date `1_end_date` `2_end_date` `3_end_date` `4_end_date` `1_start_date` `2_start_date` `3_start_date` `4_start_date` Flag1
<int> <date> <date> <date> <date> <date> <date> <date> <date> <date> <dbl>
1 1 2019-09-30 2019-05-23 2019-09-30 2016-07-22 NA 2019-05-23 2019-09-30 2016-07-22 NA 1
2 2 2019-06-07 2019-05-16 2019-06-07 NA NA 2019-05-16 NA NA NA 0
3 3 2020-03-09 2016-06-02 2019-08-09 2020-05-27 2020-02-12 2016-06-02 2019-08-09 2020-05-27 2020-03-09 0
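An alternative sketch that may scale better as the number of pairs grows: pivot the pairs into long format and check each one, assuming the columns always follow the "<n>_start_date" / "<n>_end_date" naming pattern (same tidyverse session as above):
test %>%
  pivot_longer(-c(person, key_date),
               names_to = c("set", ".value"),
               names_pattern = "(\\d+)_(start|end)_date") %>%
  group_by(person) %>%
  # flag is 1 if key_date falls inside any start/end pair
  summarise(flag = as.integer(any(key_date >= start & key_date <= end,
                                  na.rm = TRUE)))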

Creating a statement to check multiple dates between a start and end date

I have a dataframe like this in R:
Start date  End date    Date 1      Date 2      Date 3      Date 4
11/12/2018  29/11/2019  08/03/2021  NA          NA          NA
07/03/2018  24/04/2019  08/03/2021  12/09/2016  NA          NA
04/06/2018  23/04/2019  08/03/2021  02/10/2017  05/10/2018  NA
26/07/2018  29/08/2019  08/03/2021  03/08/2015  02/10/2017  23/01/2017
I want to create a new column in R that returns 1 if Date 1, Date 2, Date 3 or Date 4 is between Start date and End date, and 0 otherwise, as in the table below:
Start date  End date    Date 1      Date 2      Date 3      Date 4      Change
11/12/2018  29/11/2019  08/03/2021  NA          NA          NA          0
07/03/2018  24/04/2019  08/03/2021  12/09/2016  NA          NA          0
04/06/2018  23/04/2019  08/03/2021  02/10/2017  05/10/2018  NA          1
26/07/2018  29/08/2019  08/03/2021  03/08/2015  02/10/2017  23/01/2017  0
Does anyone have a suggestion on how to solve this? Thank you :)
It'll make it much easier for people to help you if you post code / data which we can run directly. The easiest way to do this is to use a handy R function called dput, which generates instructions to exactly recreate any R object. You might run dput(MY_DATA), or if your data is much larger than needed to demonstrate your question, dput(head(MY_DATA)) to get the first six rows, and paste the output into your question.
Here's code to generate your example data:
my_data <- data.frame(
  stringsAsFactors = FALSE,
  Start.date = c("11/12/2018", "07/03/2018", "04/06/2018", "26/07/2018"),
  End.date = c("29/11/2019", "24/04/2019", "23/04/2019", "29/08/2019"),
  Date.1 = c("08/03/2021", "08/03/2021", "08/03/2021", "08/03/2021"),
  Date.2 = c(NA, "12/09/2016", "02/10/2017", "03/08/2015"),
  Date.3 = c(NA, NA, "05/10/2018", "02/10/2017"),
  Date.4 = c(NA, NA, NA, "23/01/2017")
)
Here's a tidyverse approach: first convert your day/month/year dates into R's Date type using lubridate::dmy, then compare each of Date.1 through Date.4 against your start and end dates, and finally show whether there are any 1's (within range).
library(dplyr); library(lubridate)
my_data %>%
  mutate(across(.fns = ~dmy(.x))) %>%
  mutate(across(.cols = starts_with("Date"),
                .fns = ~coalesce(.x >= Start.date & .x <= End.date, FALSE)*1)) %>%
  mutate(Change = pmax(Date.1, Date.2, Date.3, Date.4))
coalesce(..., FALSE) used here to treat NA like FALSE.
(...)*1 to convert TRUE/FALSE to 1/0.
pmax(...) to grab the largest of the 1/0's, i.e. "are there any 1's?"
Edit: alternative to leave Date columns intact:
my_data %>%
  mutate(across(.fns = ~dmy(.x))) %>%
  mutate(across(.cols = starts_with("Date"),
                .names = "Check_{.col}",
                .fns = ~coalesce(.x >= Start.date & .x <= End.date, FALSE)*1)) %>%
  rowwise() %>%
  mutate(Change = max(c_across(starts_with("Check")))) %>%
  select(-starts_with("Check"))
Start.date End.date Date.1 Date.2 Date.3 Date.4 Change
<date> <date> <date> <date> <date> <date> <dbl>
1 2018-12-11 2019-11-29 2021-03-08 NA NA NA 0
2 2018-03-07 2019-04-24 2021-03-08 2016-09-12 NA NA 0
3 2018-06-04 2019-04-23 2021-03-08 2017-10-02 2018-10-05 NA 1
4 2018-07-26 2019-08-29 2021-03-08 2015-08-03 2017-10-02 2017-01-23 0
library(tidyverse)
library(lubridate)
df <- read.table(textConnection("start_date;end_date;date_1;date_2;date_3;date_4
11/12/2018;29/11/2019;08/03/2021;NA;NA;NA
07/03/2018;24/04/2019;08/03/2021;12/09/2016;NA;NA
04/06/2018;23/04/2019;08/03/2021;02/10/2017;05/10/2018;NA
26/07/2018;29/08/2019;08/03/2021;03/08/2015;02/10/2017;23/01/2017"),
  sep = ";",
  header = TRUE)
df %>%
  mutate(
    across(everything(), lubridate::dmy),
    change = ((date_1 > start_date & date_1 < end_date) |
              (date_2 > start_date & date_2 < end_date) |
              (date_3 > start_date & date_3 < end_date) |
              (date_4 > start_date & date_4 < end_date)) %>%
      coalesce(FALSE) %>%
      as.integer()
  )
#> start_date end_date date_1 date_2 date_3 date_4 change
#> 1 2018-12-11 2019-11-29 2021-03-08 <NA> <NA> <NA> 0
#> 2 2018-03-07 2019-04-24 2021-03-08 2016-09-12 <NA> <NA> 0
#> 3 2018-06-04 2019-04-23 2021-03-08 2017-10-02 2018-10-05 <NA> 1
#> 4 2018-07-26 2019-08-29 2021-03-08 2015-08-03 2017-10-02 2017-01-23 0
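A more compact variant of the same check, sketched on my_data from the first answer and assuming dplyr >= 1.0.4 (for if_any()):
my_data %>%
  mutate(across(.fns = dmy)) %>%
  # Change is 1 if any Date.* column lands inside [Start.date, End.date]
  mutate(Change = as.integer(if_any(starts_with("Date"),
                                    ~coalesce(.x >= Start.date & .x <= End.date,
                                              FALSE))))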

R: Reshape every 2 rows of data into 1 row based on factor of a single column

I have the below data:
TimeStamp Fab23.A start.end
1 2020-03-02 20:44:00 27.54236 start
2 2020-03-02 20:50:00 186.08670 end
3 2020-03-03 18:12:00 37.33132 start
4 2020-03-03 18:16:00 189.78060 end
5 2020-03-04 17:48:00 33.78360 start
6 2020-03-04 17:52:00 190.08100 end
.
.
.
I'm trying to reshape them to the format below, pairing rows based on the categorical value in the last column:
start.TimeStamp end.TimeStamp start.Fab23.A end.Fab23.A
1 2020-03-02 20:44:00 2020-03-02 20:50:00 27.54236 186.08670
2 2020-03-03 18:12:00 2020-03-03 18:16:00 37.33132 189.78060
3 2020-03-04 17:48:00 2020-03-04 17:52:00 33.78360 190.08100
.
.
.
I have tried the reshape and melt functions, but to no avail.
Will appreciate any advice.
You can use:
library(dplyr)
df %>%
  group_by(start.end) %>%
  mutate(row = row_number()) %>%
  tidyr::pivot_wider(names_from = start.end,
                     values_from = c(TimeStamp, Fab23.A)) %>%
  select(-row)
# A tibble: 3 x 4
# TimeStamp_start TimeStamp_end Fab23.A_start Fab23.A_end
# <chr> <chr> <dbl> <dbl>
#1 2020-03-0220:44:00 2020-03-0220:50:00 27.5 186.
#2 2020-03-0318:12:00 2020-03-0318:16:00 37.3 190.
#3 2020-03-0417:48:00 2020-03-0417:52:00 33.8 190.
Or using data.table:
library(data.table)
dcast(setDT(df), rowid(start.end) ~ start.end, value.var = c("TimeStamp", "Fab23.A"))
data
df <- structure(list(TimeStamp = c("2020-03-0220:44:00", "2020-03-0220:50:00",
"2020-03-0318:12:00", "2020-03-0318:16:00", "2020-03-0417:48:00",
"2020-03-0417:52:00"), Fab23.A = c(27.54236, 186.0867, 37.33132,
189.7806, 33.7836, 190.081), start.end = c("start", "end", "start",
"end", "start", "end")), class = "data.frame", row.names = c(NA, -6L))
