Turn list with dates into data frame in R [duplicate] - r

This question already has answers here:
Combine a list of data frames into one data frame by row
(10 answers)
Closed 5 years ago.
I am having trouble converting a list containing dates into a data.frame, as the dates are converted into integers when using the unlist command.
The list I work on looks similar to this, just with way more data frames:
library(lubridate)  # provides days()
lst <- list(
  data.frame(
    date = as.POSIXct(Sys.time() + days(seq(0, 4))),
    value = c(4,5,1,7,9)),
  data.frame(
    date = as.POSIXct(Sys.time() + days(seq(5, 9))),
    value = c(3,3,5,1,7))
)
I am looking for a method to convert it into a single data.frame that looks like this:
date value
1 2017-07-24 14:30:18 4
2 2017-07-25 14:30:18 5
3 2017-07-26 14:30:18 1
4 2017-07-27 14:30:18 7
5 2017-07-28 14:30:18 9
6 2017-07-29 14:30:18 3
7 2017-07-30 14:30:18 3
8 2017-07-31 14:30:18 5
9 2017-08-01 14:30:18 1
10 2017-08-02 14:30:18 7

We can use bind_rows
library(dplyr)
bind_rows(lst)
Or with base R
do.call(rbind, lst)
Or using data.table
library(data.table)
rbindlist(lst)
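The reason unlist() gets in the way is that it flattens everything to a plain vector and drops the POSIXct class, whereas row-binding keeps each column's class. A minimal base R sketch, assuming fixed example timestamps in UTC (chosen here only for reproducibility) in place of Sys.time():

```r
# Two small data frames with a POSIXct column (timestamps are illustrative).
start <- as.POSIXct("2017-07-24 14:30:18", tz = "UTC")
lst <- list(
  data.frame(date = start + 86400 * 0:4, value = c(4, 5, 1, 7, 9)),
  data.frame(date = start + 86400 * 5:9, value = c(3, 3, 5, 1, 7))
)

# unlist() drops the POSIXct class, leaving seconds since the epoch.
flat <- unlist(lapply(lst, `[[`, "date"))
class(flat)

# Row-binding keeps the column as POSIXct.
combined <- do.call(rbind, lst)
class(combined$date)
nrow(combined)
```

The same check should apply to bind_rows() and rbindlist(), which also combine column by column rather than flattening.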

Related

Selecting dates from two dataframes and creating a new dataframe in R [duplicate]

This question already has an answer here:
how to find dates that overlap from two different dataframes and subset
(1 answer)
Closed 4 years ago.
I would like to select the dates (in Date B) that are closest to Date A and then create a new dataframe with these matches. There can be multiple rows for each ID (i.e. multiple date combinations). I am using the dplyr and data.table packages.
dataframe A
ID DATE A
3 15/05/06
5 14/11/05
8 25/11/08
1 16/12/10
1 5/01/12
1 24/07/14
dataframe B
ID DATE B
3 12/12/05
3 17/04/06
5 25/07/05
5 26/09/05
5 1/12/05
8 12/09/08
8 13/11/08
8 23/12/08
8 31/03/09
1 26/11/10
1 12/08/11
1 12/11/11
1 14/03/14
1 8/08/14
Resultant dataframe:
ID DATE A DATE B
3 15/05/06 17/04/06
5 14/11/05 1/12/05
8 25/11/08 13/11/08
1 16/12/10 26/11/10
1 5/01/12 12/11/11
1 24/07/14 8/08/14
An idea is to merge on ID, take the absolute difference between the dates, and keep the minimum for each ID/DATE_A pair, i.e.
d1 <- transform(merge(df1, df2, by = 'ID'),
diff1 = abs(as.numeric(as.Date(DATE_A, format = '%d/%m/%y') - as.Date(DATE_B, format = '%d/%m/%y'))))
do.call(rbind, by(d1, paste(d1$ID, d1$DATE_A), function(i) i[which.min(i$diff1), ]))
which returns one row per ID and DATE_A, each paired with its closest DATE_B, matching the resultant dataframe above.
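For a self-contained check of the merge-and-minimise idea, the sketch below rebuilds the two example tables and keeps, for every ID/DATE_A pair, the DATE_B with the smallest absolute gap. The names df1, df2, DATE_A, DATE_B, gap and key are assumptions for illustration, and ave() is used here in place of by() simply to filter rows in one step:

```r
# Example data from the question (column names are assumed).
df1 <- data.frame(
  ID = c(3, 5, 8, 1, 1, 1),
  DATE_A = c("15/05/06", "14/11/05", "25/11/08", "16/12/10", "5/01/12", "24/07/14")
)
df2 <- data.frame(
  ID = c(3, 3, 5, 5, 5, 8, 8, 8, 8, 1, 1, 1, 1, 1),
  DATE_B = c("12/12/05", "17/04/06", "25/07/05", "26/09/05", "1/12/05",
             "12/09/08", "13/11/08", "23/12/08", "31/03/09", "26/11/10",
             "12/08/11", "12/11/11", "14/03/14", "8/08/14")
)

# All DATE_A/DATE_B combinations within each ID.
d1 <- merge(df1, df2, by = "ID")

# Absolute gap in days; format must be passed by name to parse dd/mm/yy.
d1$gap <- abs(as.numeric(
  as.Date(d1$DATE_A, format = "%d/%m/%y") - as.Date(d1$DATE_B, format = "%d/%m/%y")
))

# Keep the row(s) with the smallest gap for each ID/DATE_A pair.
key <- paste(d1$ID, d1$DATE_A)
closest <- d1[d1$gap == ave(d1$gap, key, FUN = min), c("ID", "DATE_A", "DATE_B")]
closest
```

If two DATE_B values were equally close to a DATE_A, this filter would keep both rows; which.min() would keep only the first.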

How to split each column into its own data frame? [duplicate]

This question already has answers here:
Split data.frame into groups by column name
(2 answers)
Closed 4 years ago.
I have a data frame with 3 columns, for example:
my.data <- data.frame(A=c(1:5), B=c(6:10), C=c(11:15))
I would like to split each column into its own data frame (so I'd end up with a list containing three data frames). I tried to use the "split" function but I don't know what I would set as the factor argument. I tried this:
data.split <- split(my.data, my.data[,1:3])
but that's definitely wrong and just gives me a bunch of empty data frames. It sounds fairly simple but after searching through previous questions I haven't come across a way to do this.
Not sure why you'd want to do that, since lapply already lets you operate on the columns directly, but you could do
lst <- split(t(my.data), 1:3);
names(lst) <- names(my.data);
lst;
#$A
#[1] 1 2 3 4 5
#
#$B
#[1] 6 7 8 9 10
#
#$C
#[1] 11 12 13 14 15
Turn vector entries into data.frames with
lapply(lst, as.data.frame);
You can use split.default, i.e.
split.default(my.data, seq_along(my.data))
$`1`
A
1 1
2 2
3 3
4 4
5 5
$`2`
B
1 6
2 7
3 8
4 9
5 10
$`3`
C
1 11
2 12
3 13
4 14
5 15
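If the list elements should be named after the columns rather than numbered, one option is to split on the column names instead of their positions. A small sketch; the factor() call with explicit levels is an assumption added here so that the original column order is kept instead of alphabetical order:

```r
my.data <- data.frame(A = 1:5, B = 6:10, C = 11:15)

# Split column-wise, using the column names as the grouping factor.
cols <- split.default(my.data, factor(names(my.data), levels = names(my.data)))

names(cols)      # one list element per column
str(cols$B)      # each element is a one-column data frame
```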

join/merge data frames in R [duplicate]

This question already has answers here:
Merge dataframes of different sizes
(4 answers)
Left join using data.table
(3 answers)
Closed 5 years ago.
I would like to join similar data frames:
input:
x <- data_frame(a=c(1,2,3,4),b=c(4,5,6,7),c=c(1,NA,NA,NA))
y <- data_frame(a=c(2,3),b=c(5,6),c=c(1,2))
desired output:
z <- data_frame(a=c(1,2,3,4),b=c(4,5,6,7),c=c(1,1,2,NA))
I tried
x <- data_frame(a=c(1,2,3,4),b=c(4,5,6,7),c=c(1,NA,NA,NA))
y <- data_frame(a=c(2,3),b=c(5,6),c=c(1,2))
z <- merge(x,y, all=TRUE)
but it has one inconvenience:
a b c
1 1 4 1
2 2 5 1
3 2 5 NA
4 3 6 2
5 3 6 NA
6 4 7 NA
It doubles rows where there are similarities. Is there a way to get desired output without deleting unwanted rows?
EDIT
I cannot simply delete rows with NA: the x data frame contains rows with NA that are not in the y data frame. If I did this, I would delete the 4th row of the x data frame (4 7 NA).
Thanks for help
You can use an update join with the data.table package:
# load the package and convert the dataframes to data.tables
library(data.table)
setDT(x)
setDT(y)
# update join
x[y, on = .(a, b), c := i.c][]
which gives:
a b c
1: 1 4 1
2: 2 5 1
3: 3 6 2
4: 4 7 NA
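For comparison, the same update can be sketched in base R by matching the key columns and overwriting c only where a match exists. This assumes plain data.frames and that the (a, b) pairs are unique in y; key_x, key_y, idx and hit are hypothetical helper names:

```r
x <- data.frame(a = c(1, 2, 3, 4), b = c(4, 5, 6, 7), c = c(1, NA, NA, NA))
y <- data.frame(a = c(2, 3), b = c(5, 6), c = c(1, 2))

# Build a combined key from the join columns.
key_x <- paste(x$a, x$b)
key_y <- paste(y$a, y$b)

# For each row of x, find the matching row of y (NA when there is none).
idx <- match(key_x, key_y)
hit <- !is.na(idx)

# Overwrite c only for rows of x that have a match in y.
x$c[hit] <- y$c[idx[hit]]
x
```

Rows of x without a partner in y, such as (4, 7, NA), are left untouched, so nothing has to be deleted afterwards.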

R Subset using first and last column names of interest [duplicate]

This question already has answers here:
refer to range of columns by name in R
(6 answers)
Closed 6 years ago.
> df
a b c d e
1 1 4 7 10 13
2 2 5 8 11 14
3 3 6 9 12 15
To subset the columns b,c,d we can use df[,2:4] or df[,c("b", "c", "d")]. However, I am looking for a solution which fetches the columns b,c,d using something like df[,b:d]. In other words, I want to simply use the first and last column names of interest to subset the data. I have been looking for a solution to this without success. All the examples I have seen to date refer to each and every specific column name while subsetting.
It's also simple in base R, e.g.:
subset(df, select=b:d)
Or roll your own:
df[do.call(seq, as.list(match(c("b","d"), names(df))) )]
If you are open to using dplyr:
dplyr::select(df, b:d)
b c d
1 4 7 10
2 5 8 11
3 6 9 12
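The reason b:d works inside subset() but not in df[, b:d] is that subset() evaluates the select expression in a list that maps each column name to its position. A rough sketch of that mechanism (nl and cols are illustrative names; this mirrors the idea rather than reproducing the full function):

```r
df <- data.frame(a = 1:3, b = 4:6, c = 7:9, d = 10:12, e = 13:15)

# Map each column name to its position: a = 1, b = 2, ...
nl <- as.list(seq_along(df))
names(nl) <- names(df)

# Evaluating b:d in that list turns the names into the index range.
cols <- eval(quote(b:d), nl)
cols

df[cols]
```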

Transpose multiple columns to rows in R [duplicate]

This question already has answers here:
How to reshape data from long to wide format
(14 answers)
Closed 6 years ago.
I have a data set as following, in which each ID has multiple rows for different attributes.
ID<-c(1,1, 2,2,3,3)
Score<-c(4,5, 5,7,8,9)
Attribute<-c("Att_1","Att_2", "Att_1","Att_2", "Att_1","Att_2")
T<-data.frame(ID, Score, Attribute)
Need to transform it to following format so each ID has one row:
ID Att_1 Att_2
1 4 5
2 5 7
3 8 9
There are threads on how to do this in Excel; I am just wondering if there is a neat way to do it in R. Thanks a lot!
You could try this:
library(reshape2)
dcast(T, ID ~ Attribute, value.var="Score")
# ID Att_1 Att_2
#1 1 4 5
#2 2 5 7
#3 3 8 9
This can be done with reshape():
reshape(data.frame(ID,Score,Attribute),idvar='ID',timevar='Attribute',dir='w');
## ID Score.Att_1 Score.Att_2
## 1 1 4 5
## 3 2 5 7
## 5 3 8 9
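The reshape() result carries a "Score." prefix on the new columns and keeps the original row numbers. If the exact layout from the question is wanted, a small clean-up step could follow; this is a sketch in which dat and wide are hypothetical names (dat is used instead of T to avoid masking TRUE's shorthand):

```r
dat <- data.frame(
  ID = c(1, 1, 2, 2, 3, 3),
  Score = c(4, 5, 5, 7, 8, 9),
  Attribute = c("Att_1", "Att_2", "Att_1", "Att_2", "Att_1", "Att_2")
)

# Long to wide: one row per ID, one column per Attribute.
wide <- reshape(dat, idvar = "ID", timevar = "Attribute", direction = "wide")

# Drop the "Score." prefix and reset the row names.
names(wide) <- sub("^Score\\.", "", names(wide))
rownames(wide) <- NULL
wide
```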
