Convert a nested list to a list [duplicate] - r

This question already has an answer here:
How to convert from a list of lists to a list in R retaining names?
(1 answer)
Closed 9 years ago.
I have a brief question. I would like to unnest this nested list:
mylist <- list(a = list(A=1, B=5),
b = list(C= 1, D = 2),
c = list(E = 1, F = 3))
Expected result is:
> list(a=c(1, 5), b = c(1, 2), c = c(1, 3))
$a
[1] 1 5
$b
[1] 1 2
$c
[1] 1 3
Any suggestions?

Slight variation on everyone else's and keeping it in base:
lapply(mylist, unlist, use.names=FALSE)
## $a
## [1] 1 5
##
## $b
## [1] 1 2
##
## $c
## [1] 1 3
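As a quick check, the following sketch compares that result against the expected list from the question. The names expected, flat and named are only illustrative; the point is that use.names = FALSE drops the inner names (A, B, ...), which is what makes the result match exactly.

```r
mylist <- list(a = list(A = 1, B = 5),
               b = list(C = 1, D = 2),
               c = list(E = 1, F = 3))

# The target structure given in the question
expected <- list(a = c(1, 5), b = c(1, 2), c = c(1, 3))

# Flatten each inner list into an unnamed numeric vector
flat <- lapply(mylist, unlist, use.names = FALSE)
identical(flat, expected)

# Without use.names = FALSE the inner names are kept on each vector
named <- lapply(mylist, unlist)
names(named$a)
```

If the inner names matter later, keep the named version and strip them only where needed with unname().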

Take a look at the llply function from the plyr package:
> library(plyr)
> llply(mylist, unlist)
$a
A B
1 5
$b
C D
1 2
$c
E F
1 3
If you want to get rid of the names, then try:
> lapply(llply(mylist, unlist), unname)
$a
[1] 1 5
$b
[1] 1 2
$c
[1] 1 3

I think applying unlist() to each element in your list should give you what you're looking for:
> mylist <- list(a = list(A=1, B=5), b = list(C= 1, D = 2), c = list(E = 1, F = 3))
> mylist2 <- list(a=c(1, 5), b = c(1, 2), c = c(1, 3))
> data.frame(lapply(mylist,unlist))
a b c
A 1 1 1
B 5 2 3
> data.frame(mylist2)
a b c
1 1 1 1
2 5 2 3


Remove column name pattern in multiple dataframes in R

I have >100 dataframes loaded into R with column name prefixes in some but not all columns that I would like to remove. In the below example with 3 dataframes, I would like to remove the pattern x__ in the 3 dataframes but keep all the dataframe names and everything else the same. How could this be done?
df1 <- data.frame(`x__a` = rep(3, 5), `x__b` = seq(1, 5, 1), `x__c` = letters[1:5])
df2 <- data.frame(`d` = rep(5, 5), `x__e` = seq(2, 6, 1), `f` = letters[6:10])
df3 <- data.frame(`x__g` = rep(5, 5), `x__h` = seq(2, 6, 1), `i` = letters[6:10])
You could put the data frames in a list and use an anonymous function with gsub.
lst <- mget(ls(pattern='^df\\d$'))
lapply(lst, \(x) setNames(x, gsub('x__', '', names(x))))
# $df1
# a b c
# 1 3 1 a
# 2 3 2 b
# 3 3 3 c
# 4 3 4 d
# 5 3 5 e
#
# $df2
# d e f
# 1 5 2 f
# 2 5 3 g
# 3 5 4 h
# 4 5 5 i
# 5 5 6 j
#
# $df3
# g h i
# 1 5 2 f
# 2 5 3 g
# 3 5 4 h
# 4 5 5 i
# 5 5 6 j
If you no longer need the list, you can move the modified data frames back into .GlobalEnv using list2env, but I don't recommend it, since it overwrites the originals.
lapply(lst, \(x) setNames(x, gsub('x__', '', names(x)))) |> list2env(.GlobalEnv)
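If the overwriting is a concern, one option is to write the renamed data frames into a separate environment instead of .GlobalEnv. This is only a sketch: the names cleaned and staging are made up for illustration, and the pattern is anchored with ^ on the assumption that x__ only ever appears as a prefix.

```r
df1 <- data.frame(x__a = rep(3, 5), x__b = seq(1, 5, 1), x__c = letters[1:5])
df2 <- data.frame(d = rep(5, 5), x__e = seq(2, 6, 1), f = letters[6:10])
lst <- list(df1 = df1, df2 = df2)

# Strip the prefix only where it starts a column name
cleaned <- lapply(lst, function(x) setNames(x, sub("^x__", "", names(x))))

# Copy the results into a fresh environment (hypothetical name: staging)
staging <- new.env()
list2env(cleaned, envir = staging)

names(staging$df1)  # renamed copy
names(df1)          # original in the calling environment is untouched
```

The originals stay available for comparison, and the cleaned versions can be fetched with get() or mget() on staging when needed.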

Rearranging an R list

Suppose I have a list structured as such.
list(
a = list(
a1 = c(1, 2),
b1 = c(2, 3)
),
b = list(
a1 = c(3, 4),
b1 = c(4, 5)
)
)
What clever use of core R functions, without apply's or recursive functions, can I use to transform it to the following?
list(
a1 = list(
a = c(1, 2),
b = c(3, 4)
),
b1 = list(
a = c(2, 3),
b = c(4, 5)
)
)
Using stack() drops the inner indices.
Using unlist() merges both indices together.
OK. Here's an attempt:
L1 <- stack(unlist(L, recursive = FALSE))
L2 <- cbind(L1, do.call(rbind, strsplit(
as.character(L1$ind), ".", fixed = TRUE)))
c(by(L2[c("values", "1")], L2[["2"]],
FUN = function(x) split(x[["values"]], x[["1"]])))
# $a1
# $a1$a
# [1] 1 2
#
# $a1$b
# [1] 3 4
#
#
# $b1
# $b1$a
# [1] 2 3
#
# $b1$b
# [1] 4 5
I've wrapped the output of by with c to remove the by-related attributes and return the output to a basic list.
You could take advantage of the fact that within() has a method for lists, which evaluates the expression in a temporary environment built from the list.
Here x is the list.
> within(x, { a$b1 <- b$a1; b$a1 <- a$b1-1 })
$a
$a$a1
[1] 1 2
$a$b1
[1] 3 4
$b
$b$a1
[1] 2 3
$b$b1
[1] 4 5
Here are some other things that might be of interest. Not sure why people seem to steer away from the base R functions. They are very useful in these types of problems (and they make all of Ananda's loops work ;-).
Did everyone forget about recursive concatenation...
> str(x)
List of 2
$ a:List of 2
..$ a1: num [1:2] 1 2
..$ b1: num [1:2] 2 3
$ b:List of 2
..$ a1: num [1:2] 3 4
..$ b1: num [1:2] 4 5
From str(x) alone, you can plan the route down the list. In your list, it's a [2:1][1:2] reversal. By the way R is vectorized!
These things are also useful..
> do.call("names", list(c(x)))
#[1] "a" "b"
> do.call("names", list(c(x,recursive=TRUE)))
#[1] "a.a11" "a.a12" "a.b11" "a.b12" "b.a11" "b.a12" "b.b11" "b.b12"
> do.call("c", list(c(x,recursive=TRUE)))
#a.a11 a.a12 a.b11 a.b12 b.a11 b.a12 b.b11 b.b12
# 1 2 2 3 3 4 4 5
> do.call("c", list(c(x,recursive=TRUE,use.names=FALSE)))
#[1] 1 2 2 3 3 4 4 5
> do.call("as.expression", list(c(x)))
# expression(a = list(a1 = c(1, 2), b1 = c(2, 3)), b = list(a1 = c(3, 4), b1 = c(4, 5)))
> do.call("as.expression", list(c(x,recursive=TRUE)))
# expression(1, 2, 2, 3, 3, 4, 4, 5)
You'll want to do some kind of recursion. The .Primitive functions are coded entirely in C, and they're not slow by any means.
Here I'm at ground level with the vectors you want to change.
> c(x,recursive=TRUE)[3:6]
# a.b11 a.b12 b.a11 b.a12
# 2 3 3 4
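# ll is the nested list from the question
m <- do.call(rbind, ll)      # list-matrix: one row per outer name, one column per inner name
m2 <- split(m, col(m))       # splitting by column transposes the data
names(m2) <- names(ll[[1]])  # these two steps "transpose" the names
lapply(m2, setNames, names(ll))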
$a1
$a1$a
[1] 1 2
$a1$b
[1] 3 4
$b1
$b1$a
[1] 2 3
$b1$b
[1] 4 5
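For comparison, here is a plain loop version of the same transposition. It does use lapply, so it does not satisfy the "no apply" constraint in the question; it is only meant as a readable reference to check the cleverer answers against. It assumes every inner list has the same names as the first one, and the name flipped is illustrative.

```r
x <- list(a = list(a1 = c(1, 2), b1 = c(2, 3)),
          b = list(a1 = c(3, 4), b1 = c(4, 5)))

# Inner names become the new outer names
inner <- names(x[[1]])
flipped <- setNames(vector("list", length(inner)), inner)

for (nm in inner) {
  # Pull the element called nm out of every outer entry, keeping the outer names
  flipped[[nm]] <- lapply(x, `[[`, nm)
}

str(flipped)
```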

Nested lapply() in a list?

I have a list l, which has the following features:
It has 3 elements
Each element is a numeric vector of length 5
Each vector contains numbers from 1 to 5
l = list(a = c(2, 3, 1, 5, 1), b = c(4, 3, 3, 5, 2), c = c(5, 1, 3, 2, 4))
I want to do two things:
First
I want to know how many times each number occurs in the entire list and I want each result in a vector (or any form that can allow me to perform computations with the results later):
Code 1:
> a <- table(sapply(l, "["))
> x <- as.data.frame(a)
> x
Var1 Freq
1 1 3
2 2 3
3 3 4
4 4 2
5 5 3
Is there any way to do it without using the table() function? I would like to do it "manually". My attempt is right below.
Code 2: (I know this is not very efficient!)
x <- data.frame(
"1" <- sum(sapply(l, "[")) == 1
"2" <- sum(sapply(l, "[")) == 2
"3" <- sum(sapply(l, "[")) == 3
"4" <- sum(sapply(l, "[")) == 4
"5" <- sum(sapply(l, "[")) == 5)
I tried the following, but it did not work. I actually did not understand the result.
> sapply(l, "[") == 1:5
a b c
[1,] FALSE FALSE FALSE
[2,] FALSE FALSE FALSE
[3,] FALSE TRUE TRUE
[4,] FALSE FALSE FALSE
[5,] FALSE FALSE FALSE
> sum(sapply(l, "[") == 1:5)
[1] 2
Second
Now, I would like to get the number of times each number appears in the list, but now in each element $a, $b and $c. I thought about using the lapply() but I don't know how exactly. Following is what I tried, but it is inefficient just like Code 2:
lapply(l, function(x) sum(x == 1))
lapply(l, function(x) sum(x == 2))
lapply(l, function(x) sum(x == 3))
lapply(l, function(x) sum(x == 4))
lapply(l, function(x) sum(x == 5))
What I get with these 5 lines of code are 5 lists of 3 elements each containing a single numeric value. For example, the second line of code tells me how many times number 2 appears in each element of l.
Code 3:
> lapply(l, function(x) sum(x == 2))
$a
[1] 1
$b
[1] 1
$c
[1] 1
What I would like to obtain is a list with three elements containing all the information I am looking for.
Please, use the references "Code 1", "Code 2" and "Code 3" in your answers. Thank you very much.
Just use table(unlist(l)) for the first part and data.frame(lapply(l, tabulate)) for the second.
> table(unlist(l))
1 2 3 4 5
3 3 4 2 3
> data.frame(lapply(l, tabulate))
a b c
1 2 0 1
2 1 1 1
3 1 2 1
4 0 1 1
5 1 1 1
For code 1/2, you could use sapply to obtain the counts for whichever values you wanted:
l = list(a = c(2, 3, 1, 5, 1), b = c(4, 3, 3, 5, 2), c = c(5, 1, 3, 2, 4))
data.frame(number = 1:5,
freq = sapply(1:5, function(x) sum(unlist(l) == x)))
# number freq
# 1 1 3
# 2 2 3
# 3 3 4
# 4 4 2
# 5 5 3
For code 3, if you wanted to get the counts for lists a, b, and c, you could just apply your frequency function to each element of the list with the lapply function:
freqs = lapply(l, function(y) sapply(1:5, function(x) sum(unlist(y) == x)))
data.frame(number = 1:5, a=freqs$a, b=freqs$b, c=freqs$c)
# number a b c
# 1 1 2 0 1
# 2 2 1 1 1
# 3 3 1 2 1
# 4 4 0 1 1
# 5 5 1 1 1
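On the result the question could not explain: sapply(l, "[") == 1:5 does not compare every value against every number. The vector 1:5 is recycled down each column of the matrix, so the comparison only asks whether the i-th entry of each vector equals i. The sketch below shows that, and then one way to compare all pairs with outer(). The variable names are illustrative.

```r
l <- list(a = c(2, 3, 1, 5, 1), b = c(4, 3, 3, 5, 2), c = c(5, 1, 3, 2, 4))

m <- sapply(l, "[")   # 5 x 3 matrix, one column per list element

# 1:5 is recycled down each column: "is the i-th entry equal to i?"
positional <- m == 1:5
sum(positional)       # only b[3] and c[3] match their position

# Comparing every value with every candidate needs all pairs
hits <- outer(unlist(l), 1:5, "==")   # 15 x 5 logical matrix
total_counts <- colSums(hits)         # counts for 1..5 over the whole list

# The same idea applied to each element separately (Code 3)
per_element <- sapply(l, function(v) colSums(outer(v, 1:5, "==")))
```

total_counts should agree with table(unlist(l)), and per_element with data.frame(lapply(l, tabulate)).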
Here is another example with nested lapply().
Create the data:
list = NULL
list[[1]] = c(1:5)
list[[2]] = c(1:5)+3
list[[2]] = c(1:5)+4
list[[3]] = c(1:5)-1
list[[4]] = c(1:5)*3
list2 = NULL
list2[[1]] = rep(1,5)
list2[[2]] = rep(2,5)
list2[[3]] = rep(0,5)
The call below subtracts each element of list from every element of list2:
lapply(list, function(d){ lapply(list2, function(a,b) {a-b}, b=d)})
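To see what shape a nested lapply() returns, here is a smaller sketch with named inputs. The lists xs and ys are hypothetical and not taken from the answer above. The outer lapply() produces one entry per element of xs, and each entry is itself a list with one result per element of ys.

```r
xs <- list(p = 1:3, q = 4:6)
ys <- list(u = c(1, 1, 1), v = c(2, 2, 2))

# For every d in xs, subtract d from every a in ys
res <- lapply(xs, function(d) lapply(ys, function(a) a - d))

names(res)     # outer names come from xs
names(res$p)   # inner names come from ys
res$p$v        # c(2, 2, 2) - 1:3
```

Using named lists makes it easier to address a single combination afterwards, for example res$q$u.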

create a set of cumulative intersection counts

I want to count the intersection of var1[i] and union(var2[1],...,var2[i]).
Using this data
var1 <- list('2003' = 1:3, '2004' = c(4:3), '2005' = c(6,4,1), '2006' = 1:4 )
var2 <- list('2003' = 1:3, '2004' = c(4:5), '2005' = c(2,3,6), '2006' = 2:3 )
I would like to populate a results list with:
1. intersect(var1$`2003`, var2$`2003`)
2. intersect(var1$`2004`, union(var2$`2003`, var2$`2004`))
3. intersect(var1$`2005`, union(var2$`2005`, union(var2$`2003`, var2$`2004`)))
and so on, until 2012 (not shown in the example)
Disclaimer: due to editing, the comments below might not make sense.
Is something like this what you want?
# create the data
var1 <- list('2003' = 1:3, '2004' = c(4:3), '2005' = c(6,4,1), '2006' = 1:4 )
var2 <- list('2003' = 1:3, '2004' = c(4:5), '2005' = c(2,3,6), '2006' = 2:3 )
# lapply over the indices, with Reduce building the cumulative union
lapply(setNames(seq_along(var1), names(var1)),
function(i,l1,l2) length(intersect(l1[[i]], Reduce(union,l2[1:i]))),
l1 = var1,l2=var2)
$`2003`
[1] 3
$`2004`
[1] 2
$`2005`
[1] 3
$`2006`
[1] 4
Note that Reduce(union, var2) reduces the list var2 by successively combining the elements using union (see ?Reduce):
Reduce(union,var2)
[1] 1 2 3 4 5 6
EDIT: a more elegant alternative
Use the accumulate = TRUE argument in Reduce:
lapply(mapply(intersect,var1, Reduce(union, var2, accumulate=T)),length)
This works because:
Reduce(union, var2, accumulate = T)
## [[1]]
## [1] 1 2 3
##
## [[2]]
## [1] 1 2 3 4 5
##
## [[3]]
## [1] 1 2 3 4 5 6
##
## [[4]]
## [1] 1 2 3 4 5 6
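One caveat with the mapply() version: mapply() simplifies its result by default, so if every intersection happened to have the same length it would return a matrix and the lapply(..., length) step would no longer count per year. A sketch that avoids this uses Map(), which never simplifies, together with lengths(). The names cum_union and counts are illustrative.

```r
var1 <- list(`2003` = 1:3, `2004` = c(4:3), `2005` = c(6, 4, 1), `2006` = 1:4)
var2 <- list(`2003` = 1:3, `2004` = c(4:5), `2005` = c(2, 3, 6), `2006` = 2:3)

# Running union of var2 up to and including each year
cum_union <- Reduce(union, var2, accumulate = TRUE)

# Map() always returns a list; lengths() then gives one count per year
counts <- lengths(Map(intersect, var1, cum_union))
counts
```

The result is a named integer vector, which is convenient for further arithmetic. Wrap it in as.list() if a list is required.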
