Using modifyList() to merge more than two lists - r

I've been using modifyList() to combine two lists of similar structure, since other methods usually don't preserve my data structure. However, I now need to apply this process over multiple lists.
lst1 <- list("name" = c("paul", "mary", "jane"), "height" = c(188,177,166))
lst2 <- list("color" = c("pink", "grey", "black"), "value" = c(22,33,44))
res <- modifyList(lst1, lst2)
gives the desired outcome for two lists
> str(res)
List of 4
$ name : chr [1:3] "paul" "mary" "jane"
$ height: num [1:3] 188 177 166
$ color : chr [1:3] "pink" "grey" "black"
$ value : num [1:3] 22 33 44
but how do I apply this over > 2 lists dynamically, i.e.
lst1 <- list("name" = c("paul", "mary", "jane"), "height" = c(188,177,166))
lst2 <- list("color" = c("pink", "grey", "black"), "value" = c(22,33,44))
lst3 <- list("type" = c("good", "bad", "ugly"), "weight" = c(80,70,60))
The expected output in this case would be:
> str(res)
List of 6
$ name : chr [1:3] "paul" "mary" "jane"
$ height: num [1:3] 188 177 166
$ color : chr [1:3] "pink" "grey" "black"
$ value : num [1:3] 22 33 44
$ type : chr [1:3] "good" "bad" "ugly"
$ weight: num [1:3] 80 70 60

In the OP's example, the list elements are all disjoint elements which can be joined simply by c(lst1, lst2, lst3). Using another reproducible example
Reduce(modifyList, mget(ls(pattern = "foo\\d+")))
#$a
#[1] 1
#$b
#$b$c
#[1] "d"
#$b$d
#[1] TRUE
#$e
#[1] 2
#$g
#[1] 4
data
foo1 <- list(a = 1, b = list(c = "a", d = FALSE))
foo2 <- list(e = 2, b = list(d = TRUE))
foo3 <- list(g = 4, b = list(c = "d"))
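Applied to the three lists from the question, the same Reduce() pattern might look like the sketch below. The name all_lists is only an illustrative assumption; any list of lists, however it is collected, should work the same way.

```r
# The three lists from the question
lst1 <- list(name = c("paul", "mary", "jane"), height = c(188, 177, 166))
lst2 <- list(color = c("pink", "grey", "black"), value = c(22, 33, 44))
lst3 <- list(type = c("good", "bad", "ugly"), weight = c(80, 70, 60))

# Collect them in one list (hypothetical name) so the count can vary
all_lists <- list(lst1, lst2, lst3)

# Reduce() folds modifyList() over the lists from left to right:
# modifyList(modifyList(lst1, lst2), lst3)
res <- Reduce(modifyList, all_lists)
str(res)
```

When names overlap, entries from later lists replace (or, for nested lists, are merged into) entries from earlier ones, so the order of all_lists matters.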

Related

Renaming nested lists with lapply

I have a nested list:
my_list <- list(A = 1, B = 2, C = 3)
my_nested_list <- list(D = my_list, E = my_list, F = my_list)
I want to change the names of the inner most elements to the following:
my_names <- c("X", "Y", "Z")
So, A, B, C should become X, Y, Z.
Here is my attempt:
name_changer <- lapply(my_nested_list, FUN = function(x){
lapply(x, FUN = function(y){
names(y) <- my_names
})
})
Why does this not work?
In your example you have two lapply loops when you only need one.
Your original second loop tries to apply the 3-element "my_names" vector to each individual element of "my_list", which won't work, since the lengths don't match. But you don't need the second loop at all:
my_list <- list(A = 1, B = 2, C = 3)
my_nested_list <- list(D = my_list, E = my_list, F = my_list)
my_names <- c("X", "Y", "Z")
name_changer <- lapply(my_nested_list, FUN = function(x){
names(x) <- my_names
return(x)
})
str(name_changer)
List of 3
$ D:List of 3
..$ X: num 1
..$ Y: num 2
..$ Z: num 3
$ E:List of 3
..$ X: num 1
..$ Y: num 2
..$ Z: num 3
$ F:List of 3
..$ X: num 1
..$ Y: num 2
..$ Z: num 3
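The two problems with the original attempt can be reproduced in isolation. The following is a small sketch; rename_only and inner_result are hypothetical names used only for illustration.

```r
# 1. A length-1 element cannot take three names, so the inner assignment errors
y <- 1
inner_result <- tryCatch({
  names(y) <- c("X", "Y", "Z")
  "no error"
}, error = function(e) conditionMessage(e))
print(inner_result)

# 2. A function whose last expression is an assignment returns the
#    assigned value (here the names), not the modified object
rename_only <- function(x) {
  names(x) <- c("X", "Y", "Z")
}
returned <- rename_only(list(A = 1, B = 2, C = 3))
print(returned)
```

This is why the working version above ends with return(x): the renamed list has to be handed back explicitly.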
You can use setNames as well -
lapply(my_nested_list, setNames, my_names)
#$D
#$D$X
#[1] 1
#$D$Y
#[1] 2
#$D$Z
#[1] 3
#$E
#$E$X
#[1] 1
#$E$Y
#[1] 2
#$E$Z
#[1] 3
#$F
#$F$X
#[1] 1
#$F$Y
#[1] 2
#$F$Z
#[1] 3

bind list of lists based on the lists location

I have some data similar to mainList below.
List of 2
$ :List of 3
..$ :List of 1
.. ..$ :'data.frame': 3 obs. of 2 variables:
.. .. ..$ col1: chr [1:3] "1" "2" "3"
.. .. ..$ col2: chr [1:3] "a" "b" "c"
..$ :List of 1
.. ..$ :'data.frame': 3 obs. of 2 variables:
.. .. ..$ col1: chr [1:3] "3" "7" "4"
.. .. ..$ col2: chr [1:3] "e" "d" "g"
..$ :List of 1
.. ..$ :'data.frame': 3 obs. of 2 variables:
.. .. ..$ col1: chr [1:3] "2" "7" "4"
.. .. ..$ col2: chr [1:3] "l" "o" "i"
$ :List of 3
..$ :List of 1
.. ..$ :'data.frame': 3 obs. of 2 variables:
.. .. ..$ col1: chr [1:3] "8" "3" "4"
.. .. ..$ col2: chr [1:3] "r" "t" "q"
..$ :List of 1
.. ..$ :'data.frame': 3 obs. of 2 variables:
.. .. ..$ col1: chr [1:3] "7" "5" "2"
.. .. ..$ col2: chr [1:3] "h" "w" "p"
..$ :List of 1
.. ..$ :'data.frame': 3 obs. of 2 variables:
.. .. ..$ col1: chr [1:3] "9" "3" "6"
.. .. ..$ col2: chr [1:3] "x" "y" "z"
I want to merge, or bind the lists based on the lists location in the list of lists.
That is, I want to merge splt1 with splt11, and then merge splt2 with splt22 and finally splt3 with splt33.
So it would take the first data frame from the first List of 3 and merge it with the first data frame from the second List of 3.
This does not get what I want
mainList %>%
map(., ~bind_rows(., .id = "split"))
Since all of the splits are merged into a single data frame (I want them kept separate).
Data:
splt1 <- list(
data.frame(
col1 = c("1", "2", "3"),
col2 = c("a", "b", "c")
)
)
splt2 <- list(
data.frame(
col1 = c("3", "7", "4"),
col2 = c("e", "d", "g")
)
)
splt3 <- list(
data.frame(
col1 = c("2", "7", "4"),
col2 = c("l", "o", "i")
)
)
nestList1 <- list(
splt1,
splt2,
splt3
)
splt11 <- list(
data.frame(
col1 = c("8", "3", "4"),
col2 = c("r", "t", "q")
)
)
splt22 <- list(
data.frame(
col1 = c("7", "5", "2"),
col2 = c("h", "w", "p")
)
)
splt33 <- list(
data.frame(
col1 = c("9", "3", "6"),
col2 = c("x", "y", "z")
)
)
nestList2 <- list(
splt11,
splt22,
splt33
)
mainList <- list(
nestList1,
nestList2
)
EDIT:
[Screenshot of the lists]
I am trying to bind together all of the splits, i.e.
split1 will contain the results from 08001, 08003, 08005 ... 0801501 for each of the lists in catalunya_madrid.
split2 will contain the same results 08001, 08003, 08005 ... 0801501
and so on.
EDIT2:
# Function to invert the list structure
invertListStructure <- function(ll) {
nms <- unique(unlist(lapply(ll, function(X) names(X))))
ll <- lapply(ll, function(X) setNames(X[nms], nms))
ll <- apply(do.call(rbind, ll), 2, as.list)
lapply(ll, function(X) X[!sapply(X, is.null)])
}
invertedList <- map(analysis, ~invertListStructure(.) %>%
map(., ~bind_rows(.x, .id = "MITMA")))
You can use purrr::transpose() to group list elements with the same location (i.e. the first element in list 1 with the first element in list 2 and list 3 and so on) for any number of lists. In your case, transpose will convert 592 lists of 216 into 216 lists of 592, each properly titled. With transpose, l[[x]][[y]] becomes l[[y]][[x]].
library(tidyverse)
mainList %>% purrr::transpose() %>%
map(function(x) {
flatten(x) %>% bind_rows(.id = 'id')
})
# $splt1
# id col1 col2
# 1 1 1 a
# 2 1 2 b
# 3 1 3 c
# 4 2 8 r
# 5 2 3 t
# 6 2 4 q
#
# $splt2
# id col1 col2
# 1 1 3 e
# 2 1 7 d
# 3 1 4 g
# 4 2 7 h
# 5 2 5 w
# 6 2 2 p
#
# $splt3
# id col1 col2
# 1 1 2 l
# 2 1 7 o
# 3 1 4 i
# 4 2 9 x
# 5 2 3 y
# 6 2 6 z
Note that you only need to flatten if the data.frame is in a list of length 1, by itself. If you have a list of data.frames (as opposed to a list of lists, each of which contains one data.frame, as in your example data), you can ignore the flatten() command and just bind the rows.
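If purrr is not available, a similar positional grouping can probably be done in base R with Map(). This is a sketch on a cut-down version of the example data (one row per data frame, two splits per list); make_df and bound are hypothetical names.

```r
# Hypothetical helper to build small data frames like those in the question
make_df <- function(a, b) {
  data.frame(col1 = a, col2 = b, stringsAsFactors = FALSE)
}

# Reduced version of mainList: two outer lists, two splits each,
# every split wrapped in a list of length 1 as in the question
mainList <- list(
  list(list(make_df("1", "a")), list(make_df("3", "e"))),
  list(list(make_df("8", "r")), list(make_df("7", "h")))
)

# Map() walks the outer lists in parallel, so element i of every
# outer list is passed to the function together
bound <- do.call(Map, c(list(f = function(...) {
  pieces <- unlist(list(...), recursive = FALSE)  # drop the length-1 wrappers
  do.call(rbind, pieces)                          # stack the data frames
}), mainList))

bound
```

Unlike bind_rows(.id = 'id'), rbind() does not add an id column, so the source list would have to be recorded separately if it is needed.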
Your example dataset doesn't quite match your actual data, but if you make a list of two mainLists, it's closer. These types of operations are heavily dependent on the structure of the data, though, so I can't be sure this is what you need. All you need to do here is add a subscript.
mainList2 <- list(mainList, mainList) # First is Madrid, second is Valencia
# Operations are done on Madrid only
mainList2[[1]] %>%
transpose() %>%
map(function(x) {
flatten(x) %>% bind_rows(.id = 'id')
})
If you want to do this for both elements in mainList2, you can wrap the whole thing in map.
mainList2 %>% map(function(x) {
transpose(x) %>%
map(function(x) {
flatten(x) %>% bind_rows(.id = 'id')
})
})
You can combine the pairs in the following way:
Map(rbind, unlist(mainList[[1]], recursive = FALSE),
unlist(mainList[[2]], recursive = FALSE))
Or using purrr you can also add an id column easily.
library(purrr)
map2(mainList[[1]] %>% flatten,
mainList[[2]] %>% flatten, dplyr::bind_rows, .id = 'id')
#[[1]]
# id col1 col2
#1 1 1 a
#2 1 2 b
#3 1 3 c
#4 2 8 r
#5 2 3 t
#6 2 4 q
#[[2]]
# id col1 col2
#1 1 3 e
#2 1 7 d
#3 1 4 g
#4 2 7 h
#5 2 5 w
#6 2 2 p
#[[3]]
# id col1 col2
#1 1 2 l
#2 1 7 o
#3 1 4 i
#4 2 9 x
#5 2 3 y
#6 2 6 z

R - Appending multiple level 2 elements to each level 1 element of a list by name

I have a list called master, which contains three IDs:
master = list(p1 = list(id = 'abc'), p2 = list(id = 'def'), p3 = list(id = 'ghi'))
str(master)
List of 3
$ p1:List of 1
..$ id: chr "abc"
$ p2:List of 1
..$ id: chr "def"
$ p3:List of 1
..$ id: chr "ghi"
To each level 1 element of this list, I would like to append the corresponding value and radius elements from the val and rad lists:
val = list(p1 = list(value = 5), p3 = list(value = 8))
str(val)
List of 2
$ p1:List of 1
..$ value: num 5
$ p3:List of 1
..$ value: num 8
rad = list(p1 = list(radius = 2), p2 = list(radius = 10))
str(rad)
List of 2
$ p1:List of 1
..$ radius: num 2
$ p2:List of 1
..$ radius: num 10
I have to be careful to match the elements by name because val and rad do not have the same structure as master, i.e. val is missing a slot for p2 and rad is missing a slot for p3.
I can use the following to partially achieve the desired result:
master_final = lapply(X=names(master),function(x, master, val, rad) c(master[[x]], val[[x]], rad[[x]]), master, val, rad)
str(master_final)
List of 3
$ :List of 3
..$ id : chr "abc"
..$ value : num 5
..$ radius: num 2
$ :List of 2
..$ id : chr "def"
..$ radius: num 10
$ :List of 2
..$ id : chr "ghi"
..$ value: num 8
But I would like each element of the resulting list to have the same structure, i.e. an id, value and radius slot. I am not sure how to do this in a way that generalises to any number of lists? I don't like having to write [[x]] for each list in the lapply function: function(x, master, val, rad) c(master[[x]], val[[x]], rad[[x]]).
One way would be to convert the lists to dataframe and do a merge based on list name. We can then split the dataframe based on list_name.
df1 <- Reduce(function(x, y) merge(x, y, all = TRUE, by = "ind"),
list(stack(master), stack(val),stack(rad)))
names(df1) <- c("list_name", "id", "value", "radius")
lapply(split(df1[-1], df1$list_name), as.list)
#$p1
#$p1$id
#[1] "abc"
#$p1$value
#[1] 5
#$p1$radius
#[1] 2
#$p2
#$p2$id
#[1] "def"
#$p2$value
#[1] NA
#$p2$radius
#[1] 10
#$p3
#$p3$id
#[1] "ghi"
#$p3$value
#[1] 8
#$p3$radius
#[1] NA
This keeps the NA values in the list as they are; if we want to remove them, the code becomes a bit ugly.
lapply(split(df1[-1], df1$list_name), function(x)
{inds <- !is.na(x); as.list(setNames(x[inds], names(x)[inds]))})
You could first group all your lists in L and run
L = list(master,val,rad)
lapply(names(master),function(x) unlist(lapply(L,"[[",x)))
[[1]]
id value radius
"abc" "5" "2"
[[2]]
id radius
"def" "10"
[[3]]
id value
"ghi" "8"
Here is one way with tidyverse
library(dplyr)
library(purrr)
out <- list(master, rad, val) %>%
transpose %>%
map(flatten)
str(out)
#List of 3
# $ p1:List of 3
# ..$ id : chr "abc"
# ..$ radius: num 2
# ..$ value : num 5
# $ p2:List of 2
# ..$ id : chr "def"
# ..$ radius: num 10
# $ p3:List of 2
# ..$ id : chr "ghi"
# ..$ value: num 8
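If every element should end up with the same id, value and radius slots while keeping the original types, one possible base R sketch pads the missing slots with NA. It assumes NA is an acceptable placeholder; lists, fields and template are hypothetical names.

```r
master <- list(p1 = list(id = "abc"), p2 = list(id = "def"), p3 = list(id = "ghi"))
val <- list(p1 = list(value = 5), p3 = list(value = 8))
rad <- list(p1 = list(radius = 2), p2 = list(radius = 10))

# Any number of lists can be collected here
lists <- list(master, val, rad)

# Every slot name that occurs anywhere, in order of first appearance
fields <- unique(unlist(lapply(lists, function(l) {
  unlist(lapply(l, names), use.names = FALSE)
})))

# Template with one NA per slot
template <- setNames(rep(list(NA), length(fields)), fields)

master_final <- lapply(setNames(names(master), names(master)), function(nm) {
  # `[[` returns NULL for lists without this name; c() drops those NULLs
  combined <- do.call(c, lapply(lists, `[[`, nm))
  # Overwrite the NA placeholders with whatever values exist
  modifyList(template, combined)
})

str(master_final)
```

Because the lists are looked up by name inside a single lapply(), adding a fourth list only means adding it to lists.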

How to erase all attributes?

I want to erase all attributes from data and applied this solution. However neither one_entry() (the original) nor my one_entry2() will work and I don't see why.
one_entry2 <- function(x) {
attr(x, "label") <- NULL
attr(x, "labels") <- NULL
}
> lapply(df1, one_entry2)
$`id`
NULL
$V1
NULL
$V2
NULL
$V3
NULL
How can we do this?
Data:
df1 <- setNames(data.frame(matrix(1:12, 3, 4)),
c("id", paste0("V", 1:3)))
attr(df1$V1, "labels") <- LETTERS[1:4]
attr(df1$V1, "label") <- letters[1:4]
attr(df1$V2, "labels") <- LETTERS[1:4]
attr(df1$V2, "label") <- letters[1:4]
attr(df1$V3, "labels") <- LETTERS[1:4]
attr(df1$V3, "label") <- letters[1:4]
> str(df1)
'data.frame': 3 obs. of 4 variables:
$ id: int 1 2 3
$ V1: int 4 5 6
..- attr(*, "labels")= chr "A" "B" "C" "D"
..- attr(*, "label")= chr "a" "b" "c" "d"
$ V2: int 7 8 9
..- attr(*, "labels")= chr "A" "B" "C" "D"
..- attr(*, "label")= chr "a" "b" "c" "d"
$ V3: int 10 11 12
..- attr(*, "labels")= chr "A" "B" "C" "D"
..- attr(*, "label")= chr "a" "b" "c" "d"
To remove all attributes, how about this
df1[] <- lapply(df1, function(x) { attributes(x) <- NULL; x })
str(df1)
#'data.frame': 3 obs. of 4 variables:
# $ id: int 1 2 3
# $ V1: int 4 5 6
# $ V2: int 7 8 9
# $ V3: int 10 11 12
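As for why one_entry2() returned NULL for every column: an R function returns the value of its last expression, and the value of attr(x, "labels") <- NULL is NULL. A minimal sketch of a corrected version follows, under the hypothetical name one_entry_fixed and with a reduced copy of the example data.

```r
# Same idea as one_entry2(), but the modified vector is returned at the end
one_entry_fixed <- function(x) {
  attr(x, "label") <- NULL
  attr(x, "labels") <- NULL
  x
}

# Reduced version of the example data
df1 <- setNames(data.frame(matrix(1:12, 3, 4)),
                c("id", paste0("V", 1:3)))
attr(df1$V1, "labels") <- LETTERS[1:4]
attr(df1$V1, "label") <- letters[1:4]

# df1[] keeps the data.frame structure while replacing the columns
df1[] <- lapply(df1, one_entry_fixed)
str(df1)
```

This version removes only the two named attributes, whereas attributes(x) <- NULL removes all of them.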
Simplifying @maurits-evers' answer a bit:
df1[] <- lapply(df1, as.vector)
str(df1)
#'data.frame': 3 obs. of 4 variables:
# $ id: int 1 2 3
# $ V1: int 4 5 6
# $ V2: int 7 8 9
# $ V3: int 10 11 12
The original answer is by Prof. Brian Ripley in this R-Help post.
In tidyverse world:
df1 <- df1 %>% mutate(across(everything(), as.vector))
With data.table
library(data.table)
# Assuming
# setDT(df1) # or
# df1 <- as.data.table(df1)
df1 <- df1[, lapply(.SD, as.vector)]
Provided all the columns are the same type (as in your example) you can do
df1[] = c(df1, recursive=TRUE)
The PKPDmisc package has a dplyr friendly way to do this:
library(PKPDmisc)
df %>% strip_attributes(c("label", "labels"))
The following is a simple solution (and will not convert a date class to a numeric):
df1 <- data.frame(df1)
For certain situations, a modified version of the answer by @maurits-evers may be useful.
Create a function to remove attributes.
remove_attributes <- function(x) {attributes(x) <- NULL; return(x)}
To remove attributes from one element in a list.
df1$V1 <- remove_attributes(df1$V1)
To remove attributes from all elements in a list.
df1 <- lapply(df1, remove_attributes)

Unlisting nested lists without losing object classes

After a previous post regarding coercion of variables into their appropriate format, I realized that the problem is due to unlist():ing, which appears to kill off the object class of variables.
Consider a nested list (myList) of the following structure
> str(myList)
List of 2
$ lst1:List of 3
..$ var1: chr [1:4] "A" "B" "C" "D"
..$ var2: num [1:4] 1 2 3 4
..$ var3: Date[1:4], format: "1999-01-01" "2000-01-01" "2001-01-01" "2002-01-01"
$ lst2:List of 3
..$ var1: chr [1:4] "Q" "W" "E" "R"
..$ var2: num [1:4] 11 22 33 44
..$ var3: Date[1:4], format: "1999-01-02" "2000-01-03" "2001-01-04" "2002-01-05"
which contains different object types (character, numeric and Date) at the lowest level. I've been using
myNewLst <- lapply(myList, function(x) unlist(x,recursive=FALSE))
result <- do.call("rbind", myNewLst)
to get the desired structure of my resulting matrix. However, this yields a coercion into character for all variables, as seen here:
> str(result)
chr [1:2, 1:12] "A" "Q" "B" "W" "C" "E" "D" "R" "1" "11" "2" "22" "3" "33" "4" "44" "10592" "10593" "10957" "10959" "11323" "11326" ...
- attr(*, "dimnames")=List of 2
..$ : chr [1:2] "lst1" "lst2"
..$ : chr [1:12] "var11" "var12" "var13" "var14" ...
After reading a post on a similar issue, I've attempted to utilize do.call("c", x)
myNewLst <- lapply(myList, function(x) do.call("c", x))
result <- do.call("rbind", myNewLst)
Unfortunately, this also results in all variables being characters, as in my first attempt. So my question is: How do I unlist a nested list without losing the object class of my lower-level variables? Are there alternatives which will accomplish the desired result?
Reproducible code for myList:
myList <- list(
"lst1" = list(
"var1" = c("A","B","C","D"),
"var2" = c(1,2,3,4),
"var3" = c(as.Date('1999/01/01'),as.Date('2000/01/01'),as.Date('2001/01/01'),as.Date('2002/01/01'))
),
"lst2" = list(
"var1" = c("Q","W","E","R"),
"var2" = c(11,22,33,44),
"var3" = c(as.Date('1999/01/02'),as.Date('2000/01/03'),as.Date('2001/01/4'),as.Date('2002/01/05'))
)
)
You can use Reduce() or do.call() to combine all of them into one data frame. The code below should work:
Reduce(rbind,lapply(myList,data.frame,stringsAsFactors=F))
var1 var2 var3
1 A 1 1999-01-01
2 B 2 2000-01-01
3 C 3 2001-01-01
4 D 4 2002-01-01
5 Q 11 1999-01-02
6 W 22 2000-01-03
7 E 33 2001-01-04
8 R 44 2002-01-05
Also the class is maintained:
mapply(class,Reduce(rbind,lapply(myList,data.frame,stringsAsFactors=F)))
var1 var2 var3
"character" "numeric" "Date"
If your goal is to convert this list of lists into a single data frame, the following code should work:
result <- data.frame(var1 = unlist(lapply(myList, function(e) e[1]), use.names = FALSE),
var2 = unlist(lapply(myList, function(e) e[2]), use.names = FALSE),
var3 = as.Date(unlist(lapply(myList, function(e) e[3]), use.names = FALSE), origin = "1970-01-01"))
This gives:
> result
var1 var2 var3
1 A 1 1999-01-01
2 B 2 2000-01-01
3 C 3 2001-01-01
4 D 4 2002-01-01
5 Q 11 1999-01-02
6 W 22 2000-01-03
7 E 33 2001-01-04
8 R 44 2002-01-05
Of course, you could use a for-loop to make the code more succinct if there are multiple variables in each list.
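As an alternative to spelling out each variable or writing a for-loop, each inner list can be turned into a data frame and the pieces stacked with do.call(). This is a sketch on a shortened copy of myList (two rows per inner list); frames is a hypothetical name.

```r
# Shortened version of the question's myList
myList <- list(
  lst1 = list(
    var1 = c("A", "B"),
    var2 = c(1, 2),
    var3 = as.Date(c("1999-01-01", "2000-01-01"))
  ),
  lst2 = list(
    var1 = c("Q", "W"),
    var2 = c(11, 22),
    var3 = as.Date(c("1999-01-02", "2000-01-03"))
  )
)

# One data frame per inner list; columns keep their own classes
frames <- lapply(myList, as.data.frame, stringsAsFactors = FALSE)

# Stack them; rbind on data frames works column by column,
# so nothing is coerced to character
result <- do.call(rbind, frames)
rownames(result) <- NULL

sapply(result, class)
```

The key difference from the unlist() attempts is that a data frame stores each variable as its own column, so character, numeric and Date values never have to share one vector.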
