Convert nested list into flat dataframe

Let's say I have a nested list and I want to convert it into a flat data frame in R. See the picture for reference. What should I do?
[Image: nested list]

You may be able to use the unlist command:
l.ex <- list(a = list(1:5, LETTERS[1:5]), b = "Z", c = NA)
unlist(l.ex, recursive = FALSE)
unlist(l.ex, recursive = TRUE)
Also look into as.data.frame(do.call(cbind, vectorOfUnlist))
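As a rough illustration of the unlist idea, here is a minimal sketch on a small made-up list (the names nested, flat and flat_df are hypothetical, not from the question). It assumes a long name/value layout is acceptable; note that unlist coerces mixed types to character.

```r
# Hypothetical nested list, used only for illustration.
nested <- list(a = list(x = 1, y = "A"), b = list(x = 2, y = "B"))

# unlist() flattens recursively and builds dotted names such as "a.x".
# Mixed numeric/character values are coerced to character.
flat <- unlist(nested)

# One row per leaf element: the dotted path and its value.
flat_df <- data.frame(name = names(flat), value = unname(flat),
                      stringsAsFactors = FALSE)
print(flat_df)
```

Whether this layout fits depends on the shape of the real list, which the question only shows as a picture.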

Related

R reduced subsetting a List

I have a question about subsetting a nested list by names.
I have an example list like:
test_list <- list(a = list(A1 = c(1,2,3), A2 = c(4,5,6)),
b = c(7,8,9),
c = list(C1 = c(10,11,12), C2 = list(C21 =c(13,14,15))))
And I want to subset values based on a vector like lnames <- c('c','C2','C21'). The way I can think of doing this is using:
exp_str <- paste0('test_list','$',paste0(lnames, collapse = '$'))
eval(parse(text = exp_str))
But this seems a little bit clunky to me. I am just wondering if there's a functional way to do this, such as using a reduce function.

You can just do
test_list[[lnames]]
# [1] 13 14 15
This is somewhat cryptically described in the ?Extract help page.
[[ can be applied recursively to lists, so that if the single index i is a vector of length p, alist[[i]] is equivalent to alist[[i1]]...[[ip]] providing all but the final indexing results in a list.
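Since the question asked about a reduce-style approach, here is a sketch using base R's Reduce with the list itself as the starting value. It should give the same result as the recursive [[ call; the variable names via_reduce and via_index are made up for the comparison.

```r
test_list <- list(a = list(A1 = c(1, 2, 3), A2 = c(4, 5, 6)),
                  b = c(7, 8, 9),
                  c = list(C1 = c(10, 11, 12), C2 = list(C21 = c(13, 14, 15))))
lnames <- c("c", "C2", "C21")

# Start from test_list and descend one level per name:
# test_list[["c"]][["C2"]][["C21"]]
via_reduce <- Reduce(function(lst, nm) lst[[nm]], lnames, test_list)

# Recursive indexing with a character vector, as in the answer.
via_index <- test_list[[lnames]]

identical(via_reduce, via_index)
```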

Loop show head of several data frames

I have several data frames and I would like to run the head function over all of them. I tried the following but it doesn’t work, as it returns the name of the data frame but not the head of the data frame itself.
df.a <- data.frame(col1 = "a", col2 = 1)
df.b <- data.frame(col1 = "b", col2 = 2)
df.c <- data.frame(col1 = "c", col2 = 3)
list <- ls()
for (i in 1:length(list())){
head(list[i])
}
lapply(ls(),head)
Any idea on how to do it or why it is not working?

Put your data frames into a list, and add print to your loop.
my.list <- list(df.a, df.b, df.c)
for (i in seq_along(my.list)){
print(head(my.list[[i]]))
}

ls() only returns the object names as a character vector, so we need to get the values of those objects. If the object names share a pattern, pass that pattern to ls() and wrap the call in mget() to get the values in a list, then loop over the list with lapply and call head on each element.
lapply(mget(ls(pattern="df\\.")), head)
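To make the difference between names and values concrete, here is a small sketch that passes the names to mget explicitly instead of relying on a pattern. The variables frames and heads are hypothetical, and the code assumes it runs in the environment where the data frames were created.

```r
df.a <- data.frame(col1 = "a", col2 = 1)
df.b <- data.frame(col1 = "b", col2 = 2)

# A character vector of names is not the data itself...
obj_names <- c("df.a", "df.b")

# ...but mget() looks the names up and returns a named list of the objects.
frames <- mget(obj_names)

# Extra arguments to lapply() are passed on to head().
heads <- lapply(frames, head, n = 1)
print(heads)
```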

cbind equally named vectors in multiple data.frames in a list to a single data.frame

I have a list similar to this one:
set.seed(1602)
l <- list(data.frame(subst_name = sample(LETTERS[1:10]), perc = runif(10), crop = rep("type1", 10)),
data.frame(subst_name = sample(LETTERS[1:7]), perc = runif(7), crop = rep("type2", 7)),
data.frame(subst_name = sample(LETTERS[1:4]), perc = runif(4), crop = rep("type3", 4)),
NULL,
data.frame(subst_name = sample(LETTERS[1:9]), perc = runif(9), crop = rep("type5", 9)))
Question: How can I extract the subst_name-column of each data.frame and combine them with cbind() (or similar functions) to a new data.frame without messing up the order of each column? Additionally the columns should be named after the corresponding crop type (this is possible 'cause the crop types are unique for each data.frame)
EDIT: The output should look as follows:
Having read the comments, I'm aware that this doesn't make much sense within R, but for the sake of having a look at the output, the data.frame's View option is quite handy.

With the help of this SO question I came up with the following solution. (There's probably room for improvement.)
a <- lapply(l, '[[', 1) # extract the first column of each data frame in the list
a <- Filter(function(x) !is.null(unlist(x)), a) # remove NULLs
a <- lapply(a, as.character)
max.length <- max(sapply(a, length))
## Add NA values to list elements
b <- lapply(a, function(v) { c(v, rep(NA, max.length-length(v)))})
e <- as.data.frame(do.call(cbind, b)) # b holds the NA-padded columns
names(e) <- unlist(lapply(lapply(lapply(l, '[[', "crop"), '[[', 2), as.character))

It is not really correct to do this with the given example, because the number of rows is not the same in each of the list's data frames. But if you don't care, you can do:
nullElements = unlist(sapply(l,is.null))
l = l[!nullElements] #delete useless null elements in list
columns=lapply(l,function(x) return(as.character(x$subst_name)))
newDf = as.data.frame(Reduce(cbind,columns))
If you don't want recycled elements in the columns you can do
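for(i in seq_len(ncol(newDf))){
colLength = nrow(l[[i]])
# skip full-length columns: (colLength+1):nrow(newDf) would count backwards and overwrite the last row
if(colLength < nrow(newDf)) newDf[(colLength+1):nrow(newDf),i] = NA
}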
newDf = newDf[1:max(unlist(sapply(l,nrow))),] #remove possible extra NA rows
Note that I edited my previous code to remove NULL entries from l to simplify things
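Both answers pad shorter columns with NA before combining. A compact sketch of that same idea, using the `length<-` idiom on a small made-up list (a hypothetical stand-in for the one in the question, with fixed letters instead of sample()), might look like this:

```r
# Hypothetical stand-in for the question's list, including one NULL entry.
l <- list(
  data.frame(subst_name = c("B", "A", "C"), crop = "type1"),
  NULL,
  data.frame(subst_name = c("D", "E"), crop = "type2")
)

frames <- Filter(Negate(is.null), l)                        # drop NULL entries
cols <- lapply(frames, function(d) as.character(d$subst_name))

# Extending a vector with `length<-` fills the new positions with NA.
padded <- lapply(cols, `length<-`, max(lengths(cols)))

# Name each column after the crop type of its data frame.
names(padded) <- vapply(frames, function(d) as.character(d$crop[1]), character(1))

result <- as.data.frame(padded, stringsAsFactors = FALSE)
print(result)
```

Because every column is padded before the data frame is built, no recycling takes place and the original order within each column is kept.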

Add an element to a named list of data.frames in R

I am new to R and don't know the correct term for a list of that kind. I would call it a named list of data frames.
dfA = data.frame(a=c(1,2,3), b=c(1,2,3))
dfB = data.frame(a=c(1,2,3), b=c(1,2,3))
mylist = list ('a' = dfA, 'b' = dfB)
But now I want to add a new element to it
dfC = data.frame(a=c(1,2,3), b=c(1,2,3))
I don't know how to do this. I couldn't find examples fitting in my use case.

We can assign 'dfC' by either numeric index i.e. 3 or with a name
mylist[['c']] <- dfC
Or use c or append
mylist <- c(mylist, list(c= dfC))
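For comparison, here is a short sketch of the other variants mentioned above: assignment by numeric index and append. The copies by_index and by_append are hypothetical names used so that each variant starts from the same two-element list.

```r
dfA <- data.frame(a = c(1, 2, 3), b = c(1, 2, 3))
dfB <- data.frame(a = c(1, 2, 3), b = c(1, 2, 3))
dfC <- data.frame(a = c(1, 2, 3), b = c(1, 2, 3))
mylist <- list(a = dfA, b = dfB)

# By numeric index: the new element is added but gets an empty name.
by_index <- mylist
by_index[[3]] <- dfC
names(by_index)

# With append(): wrap dfC in list() so it is added as a single named element.
by_append <- append(mylist, list(c = dfC))
names(by_append)
```

Wrapping the data frame in list() matters for both c and append; without it, the data frame's columns would be added as separate elements.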

Nested named list to data frame

I have the following named list output from an analysis. The reproducible code is as follows:
list(structure(c(-213.555409754509, -212.033637890131, -212.029474755074,
-211.320398316741, -211.158815833294, -210.470525157849), .Names = c("wasn",
"chappal", "mummyji", "kmph", "flung", "movie")), structure(c(-220.119433774144,
-219.186901747536, -218.743319709963, -218.088361753899, -217.338920075687,
-217.186050877079), .Names = c("crazy", "wired", "skanndtyagi",
"andr", "unveiled", "contraption")))
I want to convert this to a data frame. I have tried unlist to data frame options using reshape2, dplyr and other solutions given for converting a list to a data frame but without much success. The output that I am looking for is something like this:
  Col1    Val1    Col2        Val2
1 wasn    -213.55 crazy       -220.11
2 chappal -212.03 wired       -219.18
3 mummyji -212.02 skanndtyagi -218.74
and so on and so forth. The actual output has multiple columns with paired values and runs into many rows. I have tried the following code already:
do.call(rbind, lapply(df, data.frame, stringsAsFactors = TRUE))
works partially: it puts all the character values in one column and the numeric values in a second.
data.frame(Reduce(rbind, df))
didn't work: it gives the names from the first list and the numbers from both lists as two different rows.
colNames <- unique(unlist(lapply(df, names)))
M <- matrix(0, nrow = length(df), ncol = length(colNames),
dimnames = list(names(df), colNames))
matches <- lapply(df, function(x) match(names(x), colNames))
M[cbind(rep(sequence(nrow(M)), sapply(matches, length)),
unlist(matches))] <- unlist(df)
M
didn't work correctly.
Can someone help?

Since the list elements are all of the same length, you should be able to stack them and then combine them by columns.
Try:
do.call(cbind, lapply(myList, stack))
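As a sketch of how the stack approach could be taken to the Col1/Val1/Col2/Val2 layout from the question, the following uses a shortened, rounded version of the data. The column renaming is an addition for illustration and not part of the answer above.

```r
# Shortened, rounded version of the list from the question.
myList <- list(c(wasn = -213.56, chappal = -212.03),
               c(crazy = -220.12, wired = -219.19))

# stack() turns each named vector into a data frame with columns
# `values` and `ind`; reorder them so the name comes first.
stacked <- lapply(myList, function(v) {
  s <- stack(v)
  data.frame(Col = as.character(s$ind), Val = s$values,
             stringsAsFactors = FALSE)
})

res <- do.call(cbind, stacked)

# Number the column pairs: Col1, Val1, Col2, Val2, ...
names(res) <- paste0(c("Col", "Val"), rep(seq_along(myList), each = 2))
print(res)
```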

Here's another way:
as.data.frame( c(col = lapply(x, names), val = lapply(x,unname)) )
How it works. lapply returns a list; two lists combined with c make another list; and a list is easily coerced to a data.frame, since the latter is just a list of vectors having the same length.
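The naming step in that one-liner is easy to miss, so here is a small sketch on a shortened, rounded version of the data. The intermediate name parts and the final column reordering are additions for illustration only.

```r
# Shortened, rounded version of the list from the question.
x <- list(c(wasn = -213.56, chappal = -212.03),
          c(crazy = -220.12, wired = -219.19))

# c() numbers the unnamed elements under each tag: col1, col2, val1, val2.
parts <- c(col = lapply(x, names), val = lapply(x, unname))
names(parts)

# A list of equal-length vectors coerces directly to a data frame.
df <- as.data.frame(parts, stringsAsFactors = FALSE)

# Optionally interleave the columns to pair each name with its value.
df <- df[, c("col1", "val1", "col2", "val2")]
print(df)
```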
Better than coercing to a data.frame is just modifying its class, effectively telling the list "you're a data.frame now":
L = c(col = lapply(x, names), val = lapply(x,unname))
library(data.table)
setDF(L)
The result doesn't need to be assigned anywhere with = or <- because L is modified "in place."