Making a data frame that is a subset of two data frames - r

I am stumped again.
I have two data frames
dataframe1
a b c
[1] 21 12 22
[2] 11 9 6
[3] 4 6 7
and
dataframe2
f g h
[1] 21 12 22
[2] 11 9 6
[3] 4 6 7
I want to take the first column of dataframe1 and make three new data frames, with the second column being each of f, g and h in turn.
Obviously I could just do a subset over and over
subset1 <- cbind(dataframe1[,1], dataframe2[,1])
subset2 <- cbind(dataframe1[,1], dataframe2[,2])
but my data frames will have variable numbers of columns and a very large number of rows, so I am looking for something a little more general. My data frames will always be the same length.
The closest I have come was with apply and cbind, but I got either a set of three rows in which a and f, a and g, and a and h were each combined into a single numeric vector, or a single data frame with four columns: a, f, g, h.
Help is deeply appreciated.

You can use lapply to iterate over the columns of dataframe2 like so:
lapply(dataframe2, function(x) as.data.frame(cbind(dataframe1[,1], x)))
This will result in a list object where each entry corresponds to a column of dataframe2. For example:
$f
V1 x
1 21 21
2 11 11
3 4 4
$g
V1 x
1 21 12
2 11 9
3 4 6
$h
V1 x
1 21 22
2 11 6
3 4 7
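The output above has the generic column names V1 and x. As a minimal sketch of the same lapply idea that keeps the original column names, assuming the sample data from the question (the names key and pairs are made up for illustration):

```r
# Sample data copied from the question.
dataframe1 <- data.frame(a = c(21, 11, 4), b = c(12, 9, 6), c = c(22, 6, 7))
dataframe2 <- data.frame(f = c(21, 11, 4), g = c(12, 9, 6), h = c(22, 6, 7))

# Name of the column that is shared by every result.
key <- names(dataframe1)[1]

# Build one two-column data frame per column of dataframe2.
pairs <- lapply(names(dataframe2), function(nm) {
  out <- data.frame(dataframe1[[1]], dataframe2[[nm]])
  names(out) <- c(key, nm)
  out
})
names(pairs) <- names(dataframe2)

pairs$g
```

Because it loops over names(dataframe2), this works for any number of columns, and using data.frame() rather than cbind() avoids coercing everything to a single matrix type when the columns have different types.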

Related

How to split each column into its own data frame? [duplicate]

I have a data frame with 3 columns, for example:
my.data <- data.frame(A=c(1:5), B=c(6:10), C=c(11:15))
I would like to split each column into its own data frame (so I'd end up with a list containing three data frames). I tried to use the "split" function but I don't know what I would set as the factor argument. I tried this:
data.split <- split(my.data, my.data[,1:3])
but that's definitely wrong and just gives me a bunch of empty data frames. It sounds fairly simple but after searching through previous questions I haven't come across a way to do this.
Not sure why you'd want to do that, since lapply already lets you operate on the columns directly, but you could do
lst <- split(t(my.data), 1:3)
names(lst) <- names(my.data)
lst
#$A
#[1] 1 2 3 4 5
#
#$B
#[1] 6 7 8 9 10
#
#$C
#[1] 11 12 13 14 15
Turn vector entries into data.frames with
lapply(lst, as.data.frame)
You can use split.default, i.e.
split.default(my.data, seq_along(my.data))
$`1`
A
1 1
2 2
3 3
4 4
5 5
$`2`
B
1 6
2 7
3 8
4 9
5 10
$`3`
C
1 11
2 12
3 13
4 14
5 15
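If the list elements should be named after the columns rather than numbered, one possible sketch is to index the data frame by column name with single brackets. The variable cols is a made-up name; my.data is the example from the question.

```r
my.data <- data.frame(A = 1:5, B = 6:10, C = 11:15)

# my.data[nm] (single bracket, one index) keeps the data.frame class,
# so each list element is a one-column data frame.
cols <- lapply(names(my.data), function(nm) my.data[nm])
names(cols) <- names(my.data)

str(cols$B)
```

Each element keeps its original column name, which the split(t(my.data), 1:3) route loses until as.data.frame is applied.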

R - Subset rows of a data frame on a condition in all the columns

I want to subset rows of a data frame on a single condition in all the columns, avoiding the use of subset.
I understand how to subset on a single column, but I cannot generalize to all columns (without naming every column).
Initial data frame :
V1 V2 V3
1 1 8 15
2 2 0 16
3 3 10 17
4 4 11 18
5 5 0 19
6 0 13 20
7 7 14 21
In this example, I want to subset the rows without zeros.
Expected output :
V1 V2 V3
1 1 8 15
2 3 10 17
3 4 11 18
4 7 14 21
Thanks
# create your data
a <- c(1,2,3,4,5,0,7)
b <- c(8,0,10,11,0,13,14)
c <- c(15,16,17,18,19,20,21)
data <- cbind(a, b, c)
# keep only rows without a 0 (row minimum > 0; assumes no negative values)
data[apply(data, 1, min) > 0,]
A solution using the rowSums function after comparing to 0.
# creating your data
data <- data.frame(a = c(1,2,3,4,5,0,7),
                   b = c(8,0,10,11,0,13,14),
                   c = c(15,16,17,18,19,20,21))
# Selecting rows containing no 0.
data[which(rowSums(as.matrix(data)==0) == 0),]
Another way
data[-unique(row(data)[grep("^0$", unlist(data))]),]
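Note that a test of the form apply(data, 1, min) > 0 would also drop rows containing negative values. A sketch that tests for zero directly, assuming the data from the question and no NA values (keep and no_zeros are made-up names):

```r
data <- data.frame(V1 = c(1, 2, 3, 4, 5, 0, 7),
                   V2 = c(8, 0, 10, 11, 0, 13, 14),
                   V3 = c(15, 16, 17, 18, 19, 20, 21))

# data != 0 is a logical matrix; a row is kept only if every entry is TRUE.
keep <- apply(data != 0, 1, all)
no_zeros <- data[keep, ]

# Renumber the rows to match the expected output in the question.
rownames(no_zeros) <- NULL
no_zeros
```

This applies the condition to every column without naming any of them and without calling subset.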

How to merge subsets of data frames from a list (i.e., merge all of the first dfs from each list component)

I have seen a number of answers as to how to merge dataframes from a list when each list element is a single data frame. However, in my case, each list element contains two data frames. I want to merge all of the first together and all of the second. As a dummy example:
lst<-list()
lst[[1]]<-list(data.frame(cat=c(1:5), type=c(11:15)), data.frame(group=c("A","B","C"), num=c(1:3)))
lst[[2]]<-list(data.frame(cat=c(22:26), type=c(50:54)), data.frame(group=c("H","I","J"), num=c(7:9)))
I want to merge the first elements together and the second elements together, to yield two data frames:
df1:
cat type
1 1 11
2 2 12
3 3 13
4 4 14
5 5 15
6 22 50
7 23 51
8 24 52
9 25 53
10 26 54
df2:
group num
1 A 1
2 B 2
3 C 3
4 H 7
5 I 8
6 J 9
I am sure there is some straightforward way to do this (somehow with do.call and rbind??) but I cannot figure out how to reference the various elements within each list properly.
Clearly with this small example I could just do it manually by:
df1<-rbind(lst[[1]][[1]], lst[[2]][[1]])
However, my actual list includes hundreds of data frames. I can do it by creating a loop and rbinding in one at a time sequentially, but I'm sure there is a more efficient way...Thanks for any help!
You can use the Reduce function (which lets you customize how elements are combined) to rbind the data frames. Reduce takes two elements of the list at a time and reduces them to one element using your function. Since the two data frames in each element need to be bound separately, you can use Map for the customized rbind and put the two together:
Reduce(function(x, y) Map(rbind, x, y), lst)
# [[1]]
# cat type
# 1 1 11
# 2 2 12
# 3 3 13
# 4 4 14
# 5 5 15
# 6 22 50
# 7 23 51
# 8 24 52
# 9 25 53
# 10 26 54
# [[2]]
# group num
# 1 A 1
# 2 B 2
# 3 C 3
# 4 H 7
# 5 I 8
# 6 J 9
Or maybe a faster way:
lapply(1:2, function(i) do.call(rbind, lapply(lst, `[[`, i)))
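As a sketch of the same do.call approach that does not hard-code the number of data frames per element, assuming every list element holds the same number of data frames in the same order (merged, df1 and df2 are names chosen for illustration):

```r
lst <- list(
  list(data.frame(cat = 1:5, type = 11:15),
       data.frame(group = c("A", "B", "C"), num = 1:3)),
  list(data.frame(cat = 22:26, type = 50:54),
       data.frame(group = c("H", "I", "J"), num = 7:9))
)

# For each position i, pull the i-th data frame out of every list element
# and stack them with a single rbind call.
merged <- lapply(seq_along(lst[[1]]), function(i) {
  do.call(rbind, lapply(lst, `[[`, i))
})

df1 <- merged[[1]]
df2 <- merged[[2]]
```

With hundreds of data frames, a single do.call(rbind, ...) per position is likely to be cheaper than growing the result pairwise, though that would need timing on the real data.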

R break up data frame into list using vector of number of rows

I have a data.frame that I want to break up into a list of data.frames using a vector that will tell me how many rows should be in each consecutive list element.
Sample Data
vectornom <- c(1,2,4,3)
df <- data.frame(x=1:10,y=11:20)
Desired result
> new_list
[[1]]
x y
1 11
[[2]]
x y
2 12
3 13
[[3]]
x y
4 14
5 15
6 16
7 17
[[4]]
x y
8 18
9 19
10 20
I appreciate your help
You can use the (pretty awesome) split function for this, using vectornom to create the index on which to "split"
split(df, rep(1:length(vectornom), vectornom))
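To see how this works: rep repeats each group number as many times as the matching entry of vectornom, which gives one group label per row. A small sketch with the sample data, assuming the entries of vectornom sum to the number of rows in df (the names groups and new_list are illustrative):

```r
vectornom <- c(1, 2, 4, 3)
df <- data.frame(x = 1:10, y = 11:20)

# The grouping only lines up with the rows if the sizes add up.
stopifnot(sum(vectornom) == nrow(df))

# One label per row: 1 2 2 3 3 3 3 4 4 4
groups <- rep(seq_along(vectornom), vectornom)

new_list <- split(df, groups)
sapply(new_list, nrow)
```

seq_along(vectornom) is used here in place of 1:length(vectornom) only because it also behaves sensibly for an empty vector.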

Turn 3x3 data.frame into 1x9 data.frame while preserving row and column names

I am having trouble coming up with an elegant solution to this seemingly simple data manipulation problem. I can see a looped solution but I assume there is a 1-2 function single-line solution.
Here is what I have:
x <- data.frame(c1=c(1,2,3),
c2=c(4,5,6),
c3=c(7,8,9),
row.names = c("r1","r2","r3"))
> x
c1 c2 c3
r1 1 4 7
r2 2 5 8
r3 3 6 9
And here is what I want:
> y
c1.r1 c1.r2 c1.r3 c2.r1 c2.r2 c2.r3 c3.r1 c3.r2 c3.r3
1 1 2 3 4 5 6 7 8 9
How do I manipulate x to give me y?
Here's one way to do it:
R> unlist(lapply(x, setNames, rownames(x)))
c1.r1 c1.r2 c1.r3 c2.r1 c2.r2 c2.r3 c3.r1 c3.r2 c3.r3
1 2 3 4 5 6 7 8 9
A data.frame is a list, so lapply just loops over the columns. Then it sets the names of each vector to the rownames of the data.frame. Then unlist flattens the list to a vector (recursively, setting names, by default).
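The result above is a named vector rather than a data.frame. If a one-row data.frame is really needed, one possible sketch is to transpose the named vector into a 1x9 matrix and convert that (v and y are illustrative names):

```r
x <- data.frame(c1 = c(1, 2, 3),
                c2 = c(4, 5, 6),
                c3 = c(7, 8, 9),
                row.names = c("r1", "r2", "r3"))

# Named vector with names like "c1.r1", as in the answer above.
v <- unlist(lapply(x, setNames, rownames(x)))

# t() turns the vector into a one-row matrix whose column names are the
# vector's names; as.data.frame() then keeps them as column names.
y <- as.data.frame(t(v))
y
```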
