How can I make a list of lists in R?

I don't know how to make a list of lists in R.
I have several lists, and I want to store them in one data structure to make them easier to access. However, it looks like you cannot use a list of lists in R: if I get list l1 out of another list, say l2, then I cannot access the elements of l1. How can I implement this?
EDIT: Here is an example of what does not work for me:
list1 <- list()
list1[1] = 1
list1[2] = 2
list2 <- list()
list2[1] = 'a'
list2[2] = 'b'
list_all <- list(list1, list2)
a = list_all[1]
a[2]
#[[1]]
#NULL
but a should be a list!

You can easily make lists of lists:
list1 <- list(a = 2, b = 3)
list2 <- list(c = "a", d = "b")
mylist <- list(list1, list2)
mylist is now a list that contains two lists. To access list1 you can use mylist[[1]]. If you want to be able to do something like mylist$list1, then you need to name the elements:
mylist <- list(list1 = list1, list2 = list2)
# Now you can do the following
mylist$list1
Edit: To reply to your edit, just use double-bracket indexing:
a <- list_all[[1]]
a[[1]]
#[1] 1
a[[2]]
#[1] 2
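A minimal sketch of the difference, using the lists from the question (the names sub and elem are made up for illustration): single brackets return a sub-list that still wraps the element, while double brackets return the element itself.

```r
list1 <- list(1, 2)
list2 <- list("a", "b")
list_all <- list(list1, list2)

sub  <- list_all[1]    # a list of length 1 whose only element is list1
elem <- list_all[[1]]  # list1 itself

length(sub)             # 1
length(elem)            # 2
identical(elem, list1)  # TRUE

# sub[2] is out of range, which is why the question saw NULL.
# To reach the inner values, chain double brackets:
list_all[[1]][[2]]      # 2
list_all[[2]][[1]]      # "a"
```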

Using your example:
list1 <- list()
list1[1] = 1
list1[2] = 2
list2 <- list()
list2[1] = 'a'
list2[2] = 'b'
list_all <- list(list1, list2)
Use '[[' to retrieve an element of a list:
b = list_all[[1]]
b
[[1]]
[1] 1
[[2]]
[1] 2
class(b)
[1] "list"

If you are trying to build up a list of lists (similar to Python's list.append()), then this might work:
a <- list(1,2,3)
b <- list(4,5,6)
c <- append(list(a), list(b))
> c
[[1]]
[[1]][[1]]
[1] 1
[[1]][[2]]
[1] 2
[[1]][[3]]
[1] 3
[[2]]
[[2]][[1]]
[1] 4
[[2]][[2]]
[1] 5
[[2]][[3]]
[1] 6
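A short sketch of why the list() wrappers matter here (the names flat and nested are only for illustration): without them, append() concatenates the elements instead of nesting the lists.

```r
a <- list(1, 2, 3)
b <- list(4, 5, 6)

flat   <- append(a, b)              # one flat list with 6 elements
nested <- append(list(a), list(b))  # a list holding 2 lists

length(flat)    # 6
length(nested)  # 2

# Adding a further list at the end, in the spirit of Python's append():
nested[[length(nested) + 1]] <- list(7, 8, 9)
length(nested)  # 3
```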

This example creates a named list of lists in a loop.
MyList <- list()
for (aName in c("name1", "name2")){
MyList[[aName]] <- list(aName)
}
MyList[["name1"]]
MyList[["name2"]]
To add another list named "name3", write:
MyList$name3 <- list(1, 2, 3)

As other answers pointed out in a more complicated way already, you did already create a list of lists! It's just the odd output of R that confuses (everybody?). Try this:
> str(list_all)
List of 2
$ :List of 2
..$ : num 1
..$ : num 2
$ :List of 2
..$ : chr "a"
..$ : chr "b"
And the simplest construction would be this:
> str(list(list(1, 2), list("a", "b")))
List of 2
$ :List of 2
..$ : num 1
..$ : num 2
$ :List of 2
..$ : chr "a"
..$ : chr "b"

Related

List inside of data frame cell, how to extract unique lists? R

I am trying to create a data frame in which some cells have a list of strings, while others have a single string. Ideally, from this data frame, I would then be able to extract all unique lists into a new list, vector, or one-row data frame. Any tips? Reprex below:
Data frame with some lists of strings within cells:
require(stringr)
table3 <- data.frame(U1 = I(list(c("b", "d"),
c("d"),
c(NA))),
U2 = I(list(c("a", "b", "d"),
c("b"),
c("b","d"))),
U3 = I(list(c(99),
c("a"),
c("a"))),
U4= I(list(c("a"),
c(NA),
c(NA))))
rownames(table3) <- c("C1", "C2", "C3")
What I want the output to look like:
table3.elem <- data.frame(C = I(list(99, "a", "b", "d", c("b","d"), c("a", "b", "d"))))
I'm trying to ultimately reproduce the calculations for Krippendorff's alpha for multi-valued data, published in Krippendorff & Cragg (2016). Unfortunately, now that Java is no longer a thing their downloadable program to calculate this version of Krippendorff's alpha doesn't work on my computer. So trying to create a version for R that at least I can use (and hopefully others too if I can get it working okay).
Thank you!
One option is to:
Convert the data.frame into a list (unclass)
Flatten the list (do.call + c)
Get the unique list elements (unique)
Filter out the list elements that are NA (Filter + complete.cases)
Create a data.frame with a list column (I)
out <- data.frame(C = I(Filter(function(x) all(complete.cases(x)),
unique(do.call(c, unclass(table3))))))
out <- out[order(lengths(out$C), !sapply(out$C, is.numeric),
sapply(out$C, head, 1)), , drop = FALSE]
row.names(out) <- NULL
-output
> out
C
1 99
2 a
3 b
4 d
5 b, d
6 a, b, d
> str(out)
'data.frame': 6 obs. of 1 variable:
$ C:List of 6
..$ : num 99
..$ : chr "a"
..$ : chr "b"
..$ : chr "d"
..$ : chr "b" "d"
..$ : chr "a" "b" "d"
..- attr(*, "class")= chr "AsIs"
You can use unlist with recursive = FALSE, take the unique elements, and remove the NA entries to extract the unique lists.
x <- unique(unlist(table3, FALSE))
x <- x[!is.na(x)]
x <- x[order(lengths(x), sapply(x, paste, collapse= ""))] #In case it should be ordered
data.frame(C = I(x))
# C
#1 99
#2 a
#3 b
#4 d
#5 b, d
#6 a, b, d
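To see what the flattening step does, here is a small sketch on a made-up two-row data frame (df, cells and uniq are illustrative names, not from the question). It assumes, as in the answers above, that unlist(..., recursive = FALSE) peels off only the column level and leaves each cell's vector intact.

```r
# Two list columns; each cell holds a character vector or NA
df <- data.frame(U1 = I(list(c("b", "d"), "d")),
                 U2 = I(list("d", NA)))

# Flatten one level: the result is a list with one entry per cell
cells <- unlist(df, recursive = FALSE)
length(cells)

# Drop duplicate cells, then drop cells that are a single NA
uniq <- unique(cells)
uniq <- uniq[!is.na(uniq)]
str(uniq)
```

Note that is.na() on a list only flags entries that are a single NA, so a cell such as c("b", NA) would be kept by this filter.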

List of named vector from list of tibbles in R

I have a seemingly very simple task but can't figure out what I'm doing wrong. I have a list of three X-by-2 tibbles (2x2 in the example), each with a character column and an integer column. I want to convert it to a list of three named vectors where the letters are the vector elements and the numbers are the names. Here is my approach:
library(tibble)
tbl <- tibble(numbers=c(1:2), letters=letters[1:2])
vec_names <- c("name1", "name2", "name3")
lst <- list(tbl, tbl, tbl)
names(lst) <- vec_names
lst_n <- lapply(lst, function(x) x[["letters"]])
lst_n <- sapply(vec_names,
function(x) names(lst_n[[x]]) <- lst[[x]]$numbers)
I get this result
lst_n
name1 name2 name3
[1,] 1 1 1
[2,] 2 2 2
and I can't see my mistake.
Doing
names(lst_n[["name1"]]) <- lst[["name1"]]$numbers
gives me exactly what I want for "name1" but why doesn't it work with sapply?
I had [] before and changed it to [[]] to access the tibbles inside the list instead of the list elements but it still doesn't work. Can anyone help? It seems like a very basic task.
Here's one way to do it, all in one anonymous function:
z = lapply(lst, function(x) {
result = x$letters
names(result) = x$numbers
return(result)
})
str(z)
# List of 3
# $ name1: Named chr [1:2] "a" "b"
# ..- attr(*, "names")= chr [1:2] "1" "2"
# $ name2: Named chr [1:2] "a" "b"
# ..- attr(*, "names")= chr [1:2] "1" "2"
# $ name3: Named chr [1:2] "a" "b"
# ..- attr(*, "names")= chr [1:2] "1" "2"
Your approach got stuck because, after extracting all the letters, you need to iterate over both the letters and the numbers to set the names, but lapply and sapply only let you iterate over one thing. (Assigning inside the anonymous function doesn't help either: the assignment only changes a local copy, and the only thing that matters is the returned object.)
If you can't use the approach above, which does everything in one pass through lst, you can use Map instead, which iterates over multiple lists. We'll use the setNames function instead of names<-(), which is what is called when you do names(x) <-.
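A small sketch of why the sapply call in the question returned the numbers (f and g are hypothetical helper names): when an assignment is the last expression in a function, the function returns the value of the right-hand side, not the modified object.

```r
# Last expression is an assignment: the function returns the
# right-hand side (1:2), and the named vector v is discarded.
f <- function() {
  v <- c("a", "b")
  names(v) <- 1:2
}

# Returning the object explicitly gives the named vector.
g <- function() {
  v <- c("a", "b")
  names(v) <- 1:2
  v
}

res_f <- f()  # 1:2
res_g <- g()  # c("a", "b") with names "1" and "2"
```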
Map(
f = setNames,
object = lapply(lst, "[[", "letters"),
nm = lapply(lst, "[[", "numbers")
)
# $`name1`
# 1 2
# "a" "b"
#
# $name2
# 1 2
# "a" "b"
#
# $name3
# 1 2
# "a" "b"

How to display a list of lists in a nice way

I have a list of lists, such as below.
Each list (e.g. list1, list2, list3) has two attributes: Variable and Time
list1 <- list(c("Color", "Price"), "Quarter")
list2 <- list(c("Price"), "Month")
list3 <- list(c("Color"), "Month")
total <- list(list1, list2, list3)
when we print total, we'll see:
[[1]]
[[1]][[1]]
[1] "Color" "Price"
[[1]][[2]]
[1] "Quarter"
[[2]]
[[2]][[1]]
[1] "Price"
[[2]][[2]]
[1] "Month"
[[3]]
[[3]][[1]]
[1] "Color"
[[3]][[2]]
[1] "Month"
How can I turn it into a data frame such as this one?
EDIT: I am able to accomplish it using this code. Any better suggestion is appreciated!
num <- length(total)
max <- 0
for(i in 1:num) {
if(length(total[[i]][1]) > max) {
max <- length(total[[i]])
}
}
for(i in 1:num) {
length(total[[i]][[1]]) <- max
for(j in 1:max) {
if(is.null(total[[i]][[1]][[j]])) {
total[[i]][[1]][[j]] <- " "
}
}
}
df <- data.frame(matrix(unlist(total), nrow=num, byrow=T))
This isn't just a nested-list problem, it's a nested problem. If I'm interpreting things correctly, the fact that Color and Price are in one list and Quarter is in another is meaningful. So really, you should be looking at how to turn the first element of each list into a data.frame, repeat for all other elements, then join the results. (This is where @divibisan's and @camille's suggestions come into play ... reduce the problem, use the duplicates' code, then combine.)
(The fact that I believe you will never have more than two elems in each list is not strictly a factor. Below is a general way of handling 1-or-more, not just "always 2".)
Your data:
str(total)
# List of 3
# $ :List of 2
# ..$ : chr [1:2] "Color" "Price"
# ..$ : chr "Quarter"
# $ :List of 2
# ..$ : chr "Price"
# ..$ : chr "Month"
# $ :List of 2
# ..$ : chr "Color"
# ..$ : chr "Month"
What we need to do is break this down by element-of-each-list. (I'm assuming that there will be symmetry here.) Let's start by just working on the first elem of each:
total1 <- lapply(total, `[[`, 1)
str(total1)
# List of 3
# $ : chr [1:2] "Color" "Price"
# $ : chr "Price"
# $ : chr "Color"
In order to use the suggestions from the dupes, we need to know how much to pad them. That is, they need to be the same length.
( maxlen <- max(sapply(total1, function(l) length(unlist(l)))) )
# [1] 2
Now we pad them:
total1 <- lapply(total1, function(l) { length(l) <- maxlen; l; })
str(total1)
# List of 3
# $ : chr [1:2] "Color" "Price"
# $ : chr [1:2] "Price" NA
# $ : chr [1:2] "Color" NA
(You can start to see the structure break out here.) The dupes suggested cbinding them, but you want to rbind them:
do.call(rbind, total1)
# [,1] [,2]
# [1,] "Color" "Price"
# [2,] "Price" NA
# [3,] "Color" NA
Now this is a matrix, not a data.frame, but it's a start. Let's work with naming at the end. Let's write a function to do what we just did, and then we'll use it on each level of total.
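The definition of func does not appear in the text here. A plausible reconstruction, assuming it simply wraps the three steps shown above (find the longest element, pad with NA, rbind), might look like this; the small example list is made up so the block runs on its own.

```r
# Assumed reconstruction of func: pad every element of x to the
# same length with NA, then stack the elements as matrix rows.
func <- function(x) {
  maxlen <- max(sapply(x, function(l) length(unlist(l))))
  x <- lapply(x, function(l) { length(l) <- maxlen; l })
  do.call(rbind, x)
}

# Same shape as the first elements of each list in the question
example <- list(c("Color", "Price"), "Price", "Color")
padded <- func(example)
dim(padded)  # 3 2
```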
In order to do this, though, we need to modify total, so that the new first element has all first elements, new second has all seconds, etc.
newtotal <- lapply(seq_len(max(sapply(total, length))), function(i) lapply(total, `[[`, i))
str(newtotal)
# List of 2
# $ :List of 3
# ..$ : chr [1:2] "Color" "Price"
# ..$ : chr "Price"
# ..$ : chr "Color"
# $ :List of 3
# ..$ : chr "Quarter"
# ..$ : chr "Month"
# ..$ : chr "Month"
m <- do.call(cbind, lapply(newtotal, func))
m
# [,1] [,2] [,3]
# [1,] "Color" "Price" "Quarter"
# [2,] "Price" NA "Month"
# [3,] "Color" NA "Month"
So this last point is pretty much what you need, though as a matrix. From here, it's easy enough to name things:
m <- do.call(cbind, lapply(newtotal, func))
colnames(m) <- c(paste0("Var", seq_len(ncol(m)-1L)), "Time")
df <- as.data.frame(m)
df$List <- paste0('List', seq_len(nrow(df)))
df
# Var1 Var2 Time List
# 1 Color Price Quarter List1
# 2 Price <NA> Month List2
# 3 Color <NA> Month List3

R strange apply returns

I use apply to a matrix in order to apply a function row by row.
My syntax is as follows :
res = apply(X,1,MyFunc)
The above function MyFunc returns a list of two values.
But the result of this apply call is a strange structure, where R seems to add some of its own (housekeeping?) data:
res = $`81`
$`81`$a
[1] 80.8078
$`81`$b
[1] 6247
Whereas the result I am expecting is simply:
res = $a
[1] 80.8078
$b
[1] 6247
I do not know why this strange 81 is inserted by R or how I can get rid of it.
Thanks for help
This is perfectly normal behaviour. You are applying a function over a matrix with named rows. Your function returns a list for each row, and each element in this new list of lists is named with the corresponding rowname.
Here is an example that reproduces what you describe:
x <- matrix(1:4, nrow=2)
rownames(x) <- 80:81
myFunc <- function(x)list(a=1, b=2)
xx <- apply(x, 1, myFunc)
xx
This returns:
$`80`
$`80`$a
[1] 1
$`80`$b
[1] 2
$`81`
$`81`$a
[1] 1
$`81`$b
[1] 2
Take a look at the structure of this list:
str(xx)
List of 2
$ 80:List of 2
..$ a: num 1
..$ b: num 2
$ 81:List of 2
..$ a: num 1
..$ b: num 2
To index the first element, simply use xx[[1]]:
xx[[1]]
$a
[1] 1
$b
[1] 2
Here is a guess as to what you may have intended... Rather than returning a list, if you return a vector, the result of the apply will be a matrix:
myFunc <- function(x)c(a=1, b=2)
xx <- apply(x, 1, myFunc)
xx
80 81
a 1 1
b 2 2
And to get a specific row, without names, do:
unname(xx[2, ])
[1] 2 2
It would help to know what your matrix (X) looks like. Let's try something like this:
mf <- function(x) list(a=sum(x),b=prod(x))
mat <- matrix(1:6,nrow=2)
Then:
> apply(mat,1,mf)
[[1]]
[[1]]$a
[1] 9
[[1]]$b
[1] 15
[[2]]
[[2]]$a
[1] 12
[[2]]$b
[1] 48
You need that first subscript to differentiate between the lists that each row will generate. I suspect that your rownames are numbered, which results in the $`81` that you are seeing.
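If the row names on the outer list are unwanted, one option is to strip them with unname(). This is only a sketch on a made-up matrix; the object name res and the anonymous function are illustrative.

```r
x <- matrix(1:4, nrow = 2)
rownames(x) <- 80:81

# One list per row; the outer list is named after the row names
res <- apply(x, 1, function(row) list(a = sum(row), b = prod(row)))
names(res)   # "80" "81"

# Drop the outer names but keep the inner names a and b
res <- unname(res)
names(res)   # NULL
res[[2]]$a   # 6, the sum of row 2 (2 + 4)
```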

Why does sapply return a matrix that I need to transpose, and then the transposed matrix will not attach to a dataframe?

I would appreciate insight into why this happens and how I might do this more elegantly.
When I use sapply, I would like it to return a 3x2 matrix, but it returns a 2x3 matrix. Why is this? And why is it difficult to attach the result to another data frame?
a <- data.frame(id=c('a','b','c'), var1 = c(1,2,3), var2 = c(3,2,1))
out <- sapply(a$id, function(x) out = a[x, c('var1', 'var2')])
#out is 2x3, but I would like it to be 3x2
#I then want to append t(out) (out as a 3x2 matrix) to b, a 3x1 dataframe
b <- data.frame(var3=c(0,0,0))
When I try to attach these,
b[,c('col2','col3')] <- t(out)
the warning that I get is:
Warning message:
In `[<-.data.frame`(`*tmp*`, , c("col2", "col3"), value = list(1, :
provided 6 variables to replace 2 variables
although the following appears to give the desired result:
rownames(out) <- c('col1', 'col2')
b <- cbind(b, t(out))
I can not operate on the variables:
b$var1/b$var2
returns
Error in b$var1/b$var2 : non-numeric argument to binary operator
Thanks!
To expand on DWin's answer: it would help to look at the structure of your out object. It explains why b$var1/b$var2 doesn't do what you expect.
> out <- sapply(a$id, function(x) out = a[x, c('var1', 'var2')])
> str(out) # this isn't a data.frame or a matrix...
List of 6
$ : num 1
$ : num 3
$ : num 2
$ : num 2
$ : num 3
$ : num 1
- attr(*, "dim")= int [1:2] 2 3
- attr(*, "dimnames")=List of 2
..$ : chr [1:2] "var1" "var2"
..$ : NULL
The apply family of functions are designed to work on vectors and arrays, so you need to take care when using them with data.frames (which are usually lists of vectors). You can use the fact that data.frames are lists to your advantage with lapply.
> out <- lapply(a$id, function(x) a[x, c('var1', 'var2')]) # list of data.frames
> out <- do.call(rbind, out) # data.frame
> b <- cbind(b,out)
> str(b)
'data.frame': 3 obs. of 3 variables:
$ var3: num 0 0 0
$ var1: num 1 2 3
$ var2: num 3 2 1
> b$var1/b$var2
[1] 0.3333333 1.0000000 3.0000000
First a bit of R notation. If you look at the code for sapply, you will find the answer to your question. The sapply function checks to see if the list lengths are all equal, and if so, it first "unlist()"s them and then takes that series of lists as the data argument to array(). Since array() (like matrix()) by default arranges its values in column-major order, that is what you get. The lists get turned on their side. If you don't like it, then you can define a new function tsapply that will return the transposed values:
> tsapply <- function(...) t(sapply(...))
> out <- tsapply(a$id, function(x) out = a[x, c('var1', 'var2')])
> out
var1 var2
[1,] 1 3
[2,] 2 2
[3,] 3 1
... a 3 x 2 matrix.
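A variation that may sidestep the "non-numeric argument" error, sketched under the assumption that each row's values can be returned as a plain numeric vector rather than a one-row data frame: sapply then builds an ordinary numeric matrix, and its transpose binds cleanly onto b. Indexing by row number instead of a$id is a choice made for this sketch, not part of the answer above.

```r
a <- data.frame(id = c("a", "b", "c"), var1 = c(1, 2, 3), var2 = c(3, 2, 1))

# Each call returns a named numeric vector, so sapply yields a
# numeric matrix with one column per row of a (column-major fill).
m <- sapply(seq_len(nrow(a)), function(i) c(var1 = a$var1[i], var2 = a$var2[i]))
dim(m)  # 2 3

# Transpose to 3 x 2 and attach to b
b <- data.frame(var3 = c(0, 0, 0))
b <- cbind(b, t(m))
ratio <- b$var1 / b$var2  # plain numeric columns, so division works
```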
Have a look at ddply from the plyr package
a <- data.frame(id=c('a','b','c'), var1 = c(1,2,3), var2 = c(3,2,1))
library(plyr)
ddply(a, "id", function(x){
out <- cbind(O1 = rnorm(nrow(x), x$var1), O2 = runif(nrow(x)))
out
})

Resources