I have a list (list1) with n elements. Each element of list1 is a data frame (df1, df2, ..., dfn), and the data frames may have different numbers of columns.
Let the i-th element (data frame dfi) have column names x1, x2, x3, ..., xi.
I want to paste the column names into a formula like this:
x1+x2+x3+......+xi
and assign this formula as the i-th element of a second list (list2).
I want to do this for each data frame in list1.
How can I do this using R? I will be very glad for any help. Thanks a lot.
Ex: Let list1 have two elements(two data frames: df1 and df2)
list1[[1]]:
df1:
x1 x2 x3
-- -- --
43 12 7
3 6 5
and
list1[[2]]:
df2:
x1 x2
-- --
21 45
14 16
I want to return list2 which is:
list2[[1]]:
x1+x2+x3
and
list2[[2]]:
x1+x2
I am not interested in the contents of the data frames (df1 and df2), just in the column names.
If I understand your question correctly, this may do the job:
list2 <- lapply(list1, function(x) { return(paste(names(x), collapse = "+")) })
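As a minimal sketch of the approach above, using the two example data frames from the question: the column names are pasted into one string per data frame. The last step, turning each string into a one-sided formula object with as.formula, is an assumption about what "formula" is meant to be used for; skip it if plain strings are enough.

```r
# Example data from the question
df1 <- data.frame(x1 = c(43, 3), x2 = c(12, 6), x3 = c(7, 5))
df2 <- data.frame(x1 = c(21, 14), x2 = c(45, 16))
list1 <- list(df1, df2)

# One string per data frame, built from the column names only
list2 <- lapply(list1, function(x) paste(names(x), collapse = "+"))

# Optional (assumption): convert each string to a one-sided formula object
formulas <- lapply(list2, function(s) as.formula(paste("~", s)))
```

Here list2[[1]] is the string "x1+x2+x3" and list2[[2]] is "x1+x2", regardless of how many columns each data frame has.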
Does this do what you want:
list2 <- lapply(list1, function(x){apply(x, 1, sum)})?
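I tried to do something similar recently. I don't have an elegant solution, but the sketch below should work if all the data frames have the same number of columns (the "...." stands for the columns in between and has to be written out by hand).
length_list <- length(list1)
models <- vector("list", length_list)
for(i in 1:length_list)
{
  # number of columns in the i-th data frame
  cols_in_df <- ncol(list1[[i]])
  # pseudocode: "...." must be replaced by the remaining columns
  models[[i]] <- lm( list1[[i]][1]+ list1[[i]][2]+ ....+list1[[i]][cols_in_df])
}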
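If the goal is to fit one model per data frame, a sketch that does not depend on the number of columns could look like the following. It assumes (this is not in the original question) that every data frame has a response column called y and that all other columns are predictors; the data here is made up for illustration. reformulate() from the stats package builds the formula from the column names.

```r
set.seed(1)

# Hypothetical data: each data frame has a response "y" plus some predictors
list1 <- list(
  data.frame(y = rnorm(10), x1 = rnorm(10), x2 = rnorm(10), x3 = rnorm(10)),
  data.frame(y = rnorm(10), x1 = rnorm(10), x2 = rnorm(10))
)

# Build y ~ x1 + x2 + ... from the column names and fit one model per data frame
models <- lapply(list1, function(d) {
  f <- reformulate(setdiff(names(d), "y"), response = "y")
  lm(f, data = d)
})
```

The result is a plain list of fitted models, so models[[i]] belongs to list1[[i]] and no assign() calls are needed.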
I have a set of lists whose names are stored in all_list.
all_list=c("LIST1","LIST2")
From these, I would like to create a data frame such that
LISTn$findings${Coli}$character is entered into the n-th row, in column Coli, with the row name taken from LISTn$rowname.
DATA
LIST1=list()
LIST1[["findings"]]=list(s1a=list(character="a1",number=1,string="a1type",exp="great"),
s1b=list(number=2,string="b1type"),
in2a=list(character="c1",number=3,string="c1type"),
del3b=list(character="d1",number=4,string="d1type"))
LIST1[["rowname"]]="Row1"
LIST2=list()
LIST2[["findings"]]=list(s1a=list(character="a2",number=5,string="a2type",exp="great"),
s1b=list(character="b2",number=6,string="b2type"),
in2a=list(character="c2",number=7,string="c2type"),
del3b=list(character="d2",number=8,string="d2type"))
LIST2[["rowname"]]="Row2"
Please note that some "character" entries are missing; NA would suffice for those.
Desired output is this data frame:
s1a s1b in2a del3b
Row1 a1 NA c1 d1
Row2 a2 b2 c2 d2
There are about 1000 of these lists, so speed is a factor. Each list is about 50 MB after I load it through rjson::fromJSON(file=x).
The row and column names don't follow a particular pattern. They're names and attributes.
We can use a couple of lapply/sapply combinations to loop over the nested list and extract the elements that have "Row" as the name
do.call(rbind, lapply(mget(all_list), function(x)
sapply(lapply(x$findings[grep("^Row\\d+", names(x$findings))], `[[`,
"character"), function(x) replace(x, is.null(x), NA))))
Or it can be also done by changing the names to a single value and then extract all those
do.call(rbind, lapply(mget(all_list), function(x) {
x1 <- setNames(x$findings, rep("Row", length(x$findings)) )
sapply(x1[names(x1)== "Row"], function(y)
pmin(NA, y$character[1], na.rm = TRUE)[1])}))
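purrr has a powerful function called map_chr which is built for these tasks. Note that the pipes must go at the end of each line, otherwise R treats the first line as a complete expression.
library(purrr)
sapply(mget(all_list), function(x) purrr::map_chr(x$findings, "character", .default = NA_character_)) %>%
  t %>%
  data.frame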
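For comparison, here is a base R sketch of the same extraction on a trimmed-down version of the question's data. It assumes that every list has the same entries under findings, in the same order, so the per-list vectors can simply be stacked with rbind; get_chars is a helper name made up for this example.

```r
# Trimmed-down version of the example data (s1b in LIST1 has no "character")
LIST1 <- list(
  findings = list(
    s1a   = list(character = "a1", number = 1),
    s1b   = list(number = 2),
    in2a  = list(character = "c1", number = 3),
    del3b = list(character = "d1", number = 4)
  ),
  rowname = "Row1"
)
LIST2 <- list(
  findings = list(
    s1a   = list(character = "a2", number = 5),
    s1b   = list(character = "b2", number = 6),
    in2a  = list(character = "c2", number = 7),
    del3b = list(character = "d2", number = 8)
  ),
  rowname = "Row2"
)
lists <- list(LIST1, LIST2)

# Hypothetical helper: one named character vector per list, NA where missing
get_chars <- function(x) {
  vapply(x$findings, function(f) {
    if (is.null(f[["character"]])) NA_character_ else f[["character"]]
  }, character(1))
}

# Stack the vectors into a matrix and take the row names from $rowname
mat <- do.call(rbind, lapply(lists, get_chars))
rownames(mat) <- vapply(lists, function(x) x$rowname, character(1), USE.NAMES = FALSE)
result <- as.data.frame(mat, stringsAsFactors = FALSE)
```

Using f[["character"]] instead of f$character avoids partial matching of element names, and vapply with character(1) fails loudly if an entry is not a single string.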
I am a little confused about how to achieve the results I want.
I have an environment in R which consists of 5 data.frames called df[i].
So:
df1
df2
df3
df4
df5
Inside each of these data frames I have 5 columns called col[j]:
col1
col2
col3
col4
col5
In total I have 25 columns across 5 data frames (5 df x 5 col).
I also have a static variable called R, which is a vector of numbers.
I am trying to calculate a basic formula for each column of each data frame using a function/loop. The formula for column 1 of df1 would be:
Y = df1$col1 - R
I am trying to calculate this, repeat it for each col[j] in each df[i], and store the results in a new data.frame. My attempt so far:
j <- 1:5
i <- 1:5
fun <- function(x){
for(i in 1:col[j](df[i])){
Y[j] <- col[j] - R
}
}
EDIT: Added comment below for easier reading.
Y1a = df1$col1 - R
Y2a = df1$col2 - R
Y3a = df1$col3 - R
.....
.....
Y1b = df2$col1 - R
Y2b = df2$col2 - R
Y3b = df2$col3 - R
..... etc
# Put your data in a list:
dflist = mget(paste0("df", 1:5))
# Apply your function to every data frame
ylist = lapply(dflist, function(x) x - R)
# Name the resulting columns y1:y5
ylist = lapply(ylist, setNames, paste0("y", 1:5))
Have a look at "How to make a list of data frames" for examples and discussion of why using lists is better.
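A small worked example of the list approach, with two made-up data frames of two columns each instead of five of five. It assumes that R has one value per row, so that subtracting it from a data frame subtracts it from every column.

```r
# Hypothetical data: R has one value per row
R <- c(1, 2, 3)
df1 <- data.frame(col1 = c(10, 20, 30), col2 = c(1, 1, 1))
df2 <- data.frame(col1 = c(5, 5, 5), col2 = c(0, 2, 4))

# Put the data frames in a named list
dflist <- list(df1 = df1, df2 = df2)

# Subtract R from every column of every data frame
ylist <- lapply(dflist, function(x) x - R)

# Rename the resulting columns y1, y2
ylist <- lapply(ylist, setNames, paste0("y", 1:2))
```

ylist$df1$y1 then holds df1$col1 - R, ylist$df2$y2 holds df2$col2 - R, and so on.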
tidyverse version
dplyr::mutate_all applies a function to each column of a data.frame.
So I would do it like this:
library(dplyr)
library(purrr)
all_df <- list(df1, df2, df3, df4, df5)
map(all_df, function(x) mutate_all(x, function(y) y - R))
It should return a list of length 5, where each data frame contains your desired statistic.
I'm a total noob at R and I've tried (and retried) to search for an answer to the following problem, but I've not been able to get any of the proposed solutions to do what I'm interested in.
I have two lists of named elements, with each element pointing to data frames with identical layouts:
(EDIT)
df1 <- data.frame(A=c(1,2,3),B=c("A","B","C"))
df2 <- data.frame(A=c(98,99),B=c("Y","Z"))
lst1 <- c(X=df1,Y=df2)
df3 <- data.frame(A=c(4,5),B=c("D","E"))
lst2 <- c(X=df3)
(EDIT 2)
So it seems like storing multiple data frames in a list is a bad idea, as it will convert the data frames to lists. So I'll go out looking for an alternative way to store a set of named data frames.
In general the names of the elements in the two lists might overlap partially, completely, or not at all.
I'm looking for a way to merge the two lists into a single list:
<some-function-sequence>(lst1, lst2)
->
c(X=rbind(df1,df3),Y=df2)
-resulting in something like this:
[EDIT: Syntax changed to correctly reflect desired result (list-of-data frames)]
$X
A B
1 1 A
2 2 B
3 3 C
4 4 D
5 5 E
$Y
A B
1 98 Y
2 99 Z
I.e:
IF the lists contain identical element names, each pointing to a data frame, THEN I want to 'rbind' the rows from these two data frames and assign the resulting data frame to the same element name in the resulting list.
Otherwise the element names and data frames from both lists should just be copied into the resulting list.
I've tried the solutions from a number of discussions such as:
Can I combine a list of similar dataframes into a single dataframe?
Combine/merge lists by elements names
Simultaneously merge multiple data.frames in a list
Combine/merge lists by elements names (list in list)
Convert a list of data frames into one data frame
-but I've not been able to find the right solution. A general problem seems to be that the data frame ends up being converted into a list by the application of 'mapply/sapply/merge/...' - and usually also sliced and/or merged in ways which I am not interested in. :)
Any help with this will be much appreciated!
[SOLUTION]
The solution seems to be to change the use of c(...) when collecting data frames to list(...) after which the solution proposed by Pierre seems to give the desired result.
Here is a proposed solution using split and c to combine like terms. Please read the caveat at the bottom:
s <- split(c(lst1, lst2), names(c(lst1,lst2)))
lapply(s, function(lst) do.call(function(...) unname(c(...)), lst))
# $X.A
# [1] 1 2 3 4 5
#
# $X.B
# [1] "A" "B" "C" "D" "E"
#
# $Y.A
# [1] 98 99
#
# $Y.B
# [1] "Y" "Z"
This solution relies on the strings NOT being stored as factors. With factors it will not throw an error, but the factors will be converted to numbers. Below I show how I transformed the data to remove factors. Let me know if you require factors:
df1 <- data.frame(A=c(1,2,3),B=c("A","B","C"), stringsAsFactors=FALSE)
df2 <- data.frame(A=c(98,99),B=c("Y","Z"), stringsAsFactors=FALSE)
lst1 <- c(X=df1,Y=df2)
df3 <- data.frame(A=c(4,5),B=c("D","E"), stringsAsFactors=FALSE)
lst2 <- c(X=df3)
If the data frames are collected with list(...) instead of c(...), we can use:
lapply(split(c(lst1, lst2), names(c(lst1,lst2))), function(lst) do.call(rbind, lst))
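To make the difference between c(...) and list(...) concrete, here is a short sketch on the question's data. The unname() call is an addition of mine so that rbind produces plain row numbers instead of prefixed row names.

```r
df1 <- data.frame(A = c(1, 2, 3), B = c("A", "B", "C"), stringsAsFactors = FALSE)
df2 <- data.frame(A = c(98, 99), B = c("Y", "Z"), stringsAsFactors = FALSE)
df3 <- data.frame(A = c(4, 5), B = c("D", "E"), stringsAsFactors = FALSE)

# c() flattens the data frames into their columns: X.A, X.B, Y.A, Y.B
flattened <- c(X = df1, Y = df2)

# list() keeps each data frame intact
lst1 <- list(X = df1, Y = df2)
lst2 <- list(X = df3)

# Group the data frames by element name, then rbind each group
both <- c(lst1, lst2)
merged <- lapply(split(both, names(both)),
                 function(lst) do.call(rbind, unname(lst)))
```

merged$X is df1 and df3 stacked (5 rows), and merged$Y is df2 unchanged, since its name occurs in only one of the lists.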
The following solution is probably not the most efficient way. However, if I got your problem right this should work ;)
# Example data
# Some vectors
a <- 1:5
b <- 3:7
c <- rep(5, 5)
d <- 5:1
# Some dataframes, data1 and data3 have identical column names
data1 <- data.frame(a, b)
data2 <- data.frame(c, b)
data3 <- data.frame(a, b)
data4 <- data.frame(c, d)
# 2 lists
list1 <- list(data1, data2)
list2 <- list(data3, data4)
# Loop that checks the column names and rbinds data frames with the same column names
final_list <- list1
used_lists <- numeric()
for(i in 1:length(list1)) {
for(j in 1:length(list2)) {
if(sum(colnames(list1[[i]]) == colnames(list2[[j]])) == ncol(list1[[i]])) {
final_list[[i]] <- rbind(list1[[i]], list2[[j]])
used_lists <- c(used_lists, j)
}
}
}
# Adding the other dataframes, which did not have the same column names
for(i in 1:length(list2)) {
if((i %in% used_lists) == FALSE) {
final_list[[length(final_list) + 1]] <- list2[[i]]
}
}
# Final list, which includes all other lists
final_list
I have a named list in which each element is a character vector. I want to write this list into a single data frame with two columns: one with the name of the character vector and one with each element of the character vector. Any help would be appreciated.
NewList <- lapply(names(List), function(X) data.frame(Names=X, Characters=List[[X]]))
do.call(rbind, NewList)
Maybe
data.frame(vecname = rep(names(ll), sapply(ll, length)), chars = unlist(ll))
to have each element of each list component correspond to a row in the final dataframe.
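A runnable sketch of this idea on a small made-up list (ll and its contents are only for illustration). lengths() gives the number of elements per component, and use.names = FALSE keeps unlist() from generating row names like x11, x12.

```r
# Hypothetical named list of character vectors
ll <- list(x1 = c("a", "b", "c"), x2 = c("d", "e"))

# Repeat each name once per element, and flatten the values
out <- data.frame(
  vecname = rep(names(ll), lengths(ll)),
  chars   = unlist(ll, use.names = FALSE),
  stringsAsFactors = FALSE
)
```

The result has one row per element, five rows here, with the name of the originating vector in the first column.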
I'm wondering if stack provides the functions you need (using the example of Henrik)
ll <- list(x1 = c("a", "b", "c"), x2 = c("d", "e"))
stack(ll)
#-------
values ind
1 a x1
2 b x1
3 c x1
4 d x2
5 e x2
A very straightforward way would be to use cbind(), like this:
cbind(names(l),l)
This will result in the following matrix (cbind() returns a matrix here, not a data frame), assuming that l = list(a="ax", b="bx"):
l
a "a" "ax"
b "b" "bx"
Of course, you can rename the columns and rows by assigning to colnames() and rownames() of the result. In this example, the names of the list elements are automatically used as the row names of the result, so, depending on what you'd like to do with your data,
cbind(l)
might suffice, resulting in
l
a "ax"
b "bx"
Hope I could help.
I'm trying to make a data.frame from a "list in list"
l <- list(c("sam1", "GSM6683", "GSM6684", "GSM6687", "GSM6688"), c("sam2",
"GSM6681", "GSM6682", "GSM6685", "GSM6686"))
df <- data.frame(l)
1) I get a data.frame with weird column names, how can I avoid it?
2) I'd like to get the column names from the first element of the inner list in list
like so:
column names: sam1, sam2
row1 GSM6683 GSM6681
row2 GSM6684 GSM6682
row3 GSM6687 GSM6685
row4 GSM6688 GSM6686
You were almost there. Since you want sam1 and sam2 to be column names, you don't need to make them part of your list; just set them as the column names afterwards.
l <- list(c("GSM6683", "GSM6684", "GSM6687", "GSM6688"), c(
"GSM6681", "GSM6682", "GSM6685", "GSM6686"))
df <- data.frame(l)
colnames(df) <- c("sam1", "sam2")
If you're starting with the data structure in your example, do this:
df <- data.frame(lapply(l, function(x) x[-1]))
names(df) <- lapply(l, function(x) x[1])
If you have a choice on how to construct the data structure, do what R_Newbie says in his answer.
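Put together as a self-contained sketch, starting from the structure in the question: the first element of each inner vector becomes the column name and the rest become the column values. vapply is used instead of lapply for the names so that the result is a plain character vector; that substitution is mine.

```r
l <- list(c("sam1", "GSM6683", "GSM6684", "GSM6687", "GSM6688"),
          c("sam2", "GSM6681", "GSM6682", "GSM6685", "GSM6686"))

# Drop the first element of each vector to get the column values
df <- data.frame(lapply(l, function(x) x[-1]), stringsAsFactors = FALSE)

# Use the first element of each vector as the column name
names(df) <- vapply(l, function(x) x[1], character(1))
```

This gives a data frame with four rows and the columns sam1 and sam2.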