Unlist nested list by name - r

I import a nested list of unknown length (here 2) and unknown names (here iter1 and iter2) and get the names of the list:
iter1 <- list(1, 2, 3, 4)
iter2 <- list(1, 2, 3, 4)
nested_list <- list(iter1 = iter1, iter2 = iter2)
names <- names(nested_list)
The next thing I want to do is actually this:
unlist <- data.frame(x=unlist(nested_list$iter1))
But since I don't know the names beforehand, I want to do something like this:
unlist <- data.frame(x=unlist(nested_list$names[1]))
This does not work. There is no error, but the resulting data frame is empty.
In the end I want to do something like this:
for(i in 1:length(nested_list)) {
unlist <- data.frame(x=unlist(nested_list$names[i]))
print(unlist)
}

Using Map, avoiding the names vector.
data.frame(Map(unlist, nested_list)[1])
# iter1
# 1 1
# 2 2
# 3 3
# 4 4
Or, in order to give column names with mapply:
data.frame(x=mapply(unlist, nested_list)[,1])
# x
# 1 1
# 2 2
# 3 3
# 4 4
The 1 in brackets selects the first named element of the list; use 2 for the second, and so on.
Data
nested_list <- list(iter1 = list(1, 2, 3, 4), iter2 = list(1, 2, 3, 4))

Maybe you can try the code below
unlist <- data.frame(x=unlist(nested_list[names[1]]))
such that
x
iter11 1
iter12 2
iter13 3
iter14 4
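For what it's worth, the attempt in the question returns an empty data frame because `$` does not evaluate its right-hand side: nested_list$names looks for an element literally called "names" and yields NULL. A minimal sketch of the intended loop using [[ instead (list_names and frames are illustrative names, not from the question):

```r
nested_list <- list(iter1 = list(1, 2, 3, 4), iter2 = list(1, 2, 3, 4))

# Avoid calling the vector `names`, which shadows the base function name
list_names <- names(nested_list)

# `[[` accepts a character value, so the element name can come from a variable
frames <- lapply(list_names, function(nm) {
  data.frame(x = unlist(nested_list[[nm]]))
})
names(frames) <- list_names

# Print one data frame per top-level element
for (nm in list_names) {
  print(frames[[nm]])
}
```

Because [[ drops the outer list level, the row names stay plain (1 to 4) rather than iter11, iter12, and so on.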

I am not sure I understand the result you intended; could you clarify it if needed?
iter1 <- list(1, 2, 3, 4)
iter2 <- list(1, 2, 3, 4)
nested_list <- list(iter1 = iter1, iter2 = iter2)
names <- names(nested_list)
cbind.data.frame(lapply(nested_list, unlist))
#> iter1 iter2
#> 1 1 1
#> 2 2 2
#> 3 3 3
#> 4 4 4

Related

Return maximum of conditionally selected pairs from a vector in R

Reproducible example:
set.seed(1)
A <- round(runif(12, min = 1, max = 5))
> A
[1] 1 2 2 4 3 4 3 4 5 3 4 5
expectedResult <- c(max(A[1], A[4]), max(A[2], A[5]), max(A[3], A[6]), max(A[7], A[10]), max(A[8], A[11]), max(A[9], A[12]))
> expectedResult
[1] 4 3 4 3 4 5
A should be treated as a sequence of segments of 6 elements each. For example, A here has 2 segments, A[1:6] and A[7:12]. Within each segment, the first 3 elements are compared with the next 3 elements. Therefore I need to take max(A[1], A[4]), max(A[2], A[5]), max(A[3], A[6]), max(A[7], A[10]), max(A[8], A[11]), max(A[9], A[12]).
My original vector has far more elements than this example, so I need a simpler approach. Speed also matters for the original calculation, so I am looking for a fast solution as well.
We could create a function that splits the vector into chunks of 'n' elements, loops over the resulting list, builds a matrix with nrow set to 2 for each chunk, converts it to a data.frame, applies pmax to get the elementwise max, and unlists the result.
f1 <- function(vec, n) {
lst1 <- split(vec, as.integer(gl(length(vec), n, length(vec))))
unname(unlist(lapply(lst1, function(x)
do.call(pmax, as.data.frame(t(matrix(x, nrow = 2, byrow = TRUE)))))))
}
-output
> f1(A, 6)
[1] 4 3 4 3 4 5
If the length is not a multiple of 3 or 6, another option is to do a group by operation with tapply after splitting
unname(unlist(lapply(split(A, as.integer(gl(length(A), 6,
length(A)))), function(x) tapply(x, (seq_along(x)-1) %% 3 + 1, FUN = max))))
[1] 4 3 4 3 4 5
data
A <- c(1, 2, 2, 4, 3, 4, 3, 4, 5, 3, 4, 5)
Another option in base R:
a <- 6
unlist(tapply(A, gl(length(A)/a, a),
function(x) pmax(head(x, a/2), tail(x, a/2))),, FALSE)
[1] 4 3 4 3 4 5
or even
a <- 6
unlist(tapply(A, gl(length(A)/a, a),
function(x) do.call(pmax, data.frame(matrix(x, ncol = 2)))),, FALSE)
[1] 4 3 4 3 4 5
You can reshape the vector to a 3d array, split by column, and take the parallel max. This should be pretty efficient as far as base R goes.
do.call(pmax.int, asplit(`dim<-`(A, c(3,2,2)), 2))
[1] 4 3 4 3 4 5
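The dimensions c(3,2,2) above are hard-coded for a vector of length 12. A hedged sketch of a generalisation, assuming the segment size is even and the vector length is a multiple of it (pair_max and its arguments are hypothetical names, not from any answer here):

```r
pair_max <- function(vec, seg = 6) {
  # Assumption: segments split cleanly into two halves
  stopifnot(seg %% 2 == 0, length(vec) %% seg == 0)
  half <- seg / 2
  # Dimensions: position within half, which half, which segment
  arr <- array(vec, dim = c(half, 2, length(vec) / seg))
  # Elementwise max of the first half against the second half of every segment
  as.vector(pmax(arr[, 1, , drop = FALSE], arr[, 2, , drop = FALSE]))
}

A <- c(1, 2, 2, 4, 3, 4, 3, 4, 5, 3, 4, 5)
pair_max(A)
# [1] 4 3 4 3 4 5
```

This does a single pmax call over the whole vector, so no per-segment loop is needed.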

Is there a way to make a for loop repeat along elements of a list?

I am new to R. I have a data frame that contains start and end values for 45 types of items, and I used dplyr to subset that data into 45 separate data frames. I have written a for loop that outputs a sequence from start to end for each row of the data frame. I would like to use this for loop on all data frames without having to copy and paste the code 45 times and tailor it to the name of each data frame. See below for an example:
A_list <- list()
B_list <- list()
C_list <- list()
dfA <- data.frame(name = c("A", "A"), start = c(1, 3), end = c(6, 5))
dfB <- data.frame(name = c("B", "B"), start = c(2, 1), end = c(7, 8))
dfC <- data.frame(name = c("C", "C"), start = c(1, 2), end = c(4, 7))
for(i in seq_along(dfA$start)) {
output <- seq.int(dfA$start[i], dfA$end[i])
A_list[[i]] <- output
}
I tried making a list of names of each data frame and then referring to it in the for loop, but this didn't work.
list_df_names <- list(dfA, dfB, dfC)
seq.int(list_df_names[1:3]$start[i], list_df_names[1:3]$end[i])
Does anyone have any thoughts on how to do this?
We can loop over the list of data frames, then create the sequence between the 'start' and 'end' columns with Map, which gives a list of lists. If separate objects are needed (not recommended), use list2env after setting the names of the nested list to the preferred object names.
out <- lapply(list_df_names, function(x) Map(seq.int, x$start, x$end))
names(out) <- paste0(c('A', 'B', 'C'), "_list")
list2env(out, .GlobalEnv)
-output
A_list
#[[1]]
#[1] 1 2 3 4 5 6
#[[2]]
#[1] 3 4 5
B_list
#[[1]]
#[1] 2 3 4 5 6 7
#[[2]]
#[1] 1 2 3 4 5 6 7 8
C_list
#[[1]]
#[1] 1 2 3 4
#[[2]]
#[1] 2 3 4 5 6 7
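A minimal sketch of the same idea that keeps everything in one named list instead of creating separate objects in the global environment; the names dfs and seqs are made up for illustration:

```r
dfA <- data.frame(name = c("A", "A"), start = c(1, 3), end = c(6, 5))
dfB <- data.frame(name = c("B", "B"), start = c(2, 1), end = c(7, 8))
dfC <- data.frame(name = c("C", "C"), start = c(1, 2), end = c(4, 7))

# Store the data frames themselves (not their names) in a named list
dfs <- list(A = dfA, B = dfB, C = dfC)

# For each data frame, build one sequence per row from start to end
seqs <- lapply(dfs, function(d) Map(seq.int, d$start, d$end))

# Results are addressed by name and row position
seqs$A[[1]]
seqs[["B"]][[2]]
```

With 45 data frames the list could come straight from split() on the original data frame, so the subsetting step would not need to be repeated by hand.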

Unlisting a list of list while keeping second list names

I would like to unlist a list of list while keeping the names of the second list.
For example, if I have a list like this:
$`listA`
$`listA_a`
[1] 1 2
$`listA_g`
[1] 1 2
$`listB`
$`listB_b`
[1] 1 2
I would like to obtain this list:
$`listA_a`
[1] 1 2
$`listA_g`
[1] 1 2
$`listB_b`
[1] 1 2
I know unlist has an argument to keep names (use.names = TRUE, which is the default),
however it keeps the names of the first list and appends a number when there are several elements ("listA1", "listA2", "listB").
(This is just an example; in my code the elements of the list are plots, so I cannot use a data.frame or anything similar. I also cannot easily reconstruct the names, as they contain information about the data used for the plots.)
Thank you very much for your help!
Pernille
Try this approach. You can use unlist() with recursive = FALSE to keep the desired structure and then clean up the names. Here is the code:
#Data
List <- list(listA = list(listA_a = c(1, 2), listA_g = c(1, 2)), listB = list(
listB_b = c(1, 2)))
#Code
L <- unlist(List,recursive = F)
names(L) <- gsub(".*\\.","", names(L) )
L
Output:
L
$listA_a
[1] 1 2
$listA_g
[1] 1 2
$listB_b
[1] 1 2
Or a simpler version without regex (many thanks and credit to @markus):
#Code 2
L <- unlist(unname(List),recursive = F)
Output:
L
$listA_a
[1] 1 2
$listA_g
[1] 1 2
$listB_b
[1] 1 2
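One difference between the two versions may matter if the inner names themselves contain a dot: the greedy pattern ".*\\." removes everything up to the last dot, while the unname() version leaves inner names untouched. A small sketch with made-up data (List2 is hypothetical, not from the question):

```r
# Hypothetical data where one inner name contains a dot
List2 <- list(listA = list(a.b = 1), listB = list(c = 2))

# Regex version: the outer name and part of the inner name are stripped
L_regex <- unlist(List2, recursive = FALSE)
names(L_regex)
# [1] "listA.a.b" "listB.c"
names(L_regex) <- gsub(".*\\.", "", names(L_regex))
names(L_regex)
# [1] "b" "c"

# unname() version: inner names survive unchanged
L_unname <- unlist(unname(List2), recursive = FALSE)
names(L_unname)
# [1] "a.b" "c"
```

If the plot names encode file names or similar, the unname() variant is probably the safer choice.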
We could use rrapply from the rrapply package
library(rrapply)
rrapply(List, how = 'flatten')
#$listA_a
#[1] 1 2
#$listA_g
#[1] 1 2
#$listB_b
#[1] 1 2
data
List <- list(listA = list(listA_a = c(1, 2), listA_g = c(1, 2)), listB = list(
listB_b = c(1, 2)))
Another option is flatten from the purrr package
> purrr::flatten(List)
$listA_a
[1] 1 2
$listA_g
[1] 1 2
$listB_b
[1] 1 2
Another option would be to make use of e.g. Reduce to concatenate the sublists:
list_of_lists <- list(
listA = list(listA_a = c(1, 2), listA_g = c(1, 2)),
listB = list(listB_b = c(1, 2)
))
Reduce(c, list_of_lists)
#> $listA_a
#> [1] 1 2
#>
#> $listA_g
#> [1] 1 2
#>
#> $listB_b
#> [1] 1 2

Combining identical columns, concatenating the column names in R

I have a matrix for a minimal example:
data <- c(1, 1, 1, 1, 1, 1, 2, 2, 3, 3, 4, 4)
Matrix = matrix(data, nrow = 3, ncol=4)
colnames(Matrix) <-c("4","3","7","100")
rownames(Matrix) <-c("bob","foo","bar")
> Matrix
    4 3 7 100
bob 1 1 2   3
foo 1 1 2   4
bar 1 1 3   4
I want to combine any columns that are identical apart from their names, and update the colnames so that I know which original columns were identical. I have tried using loops to find the duplicates, but I can't get the part that combines the names.
The expected result would be something like the following matrix:
> Matrix
    4-3 7 100
bob   1 2   3
foo   1 2   4
bar   1 3   4
Here is another base R option
do.call(
cbind,
Map(
function(x) `colnames<-`(Matrix[, (nm<-names(x))[1], drop = FALSE], paste0(nm, collapse = "-")),
split(u <-unlist(Map(toString, as.data.frame(Matrix))), u)
)
)
which gives
    4-3 7 100
bob   1 2   3
foo   1 2   4
bar   1 3   4
We could split the columns into a list based on the pasted values of each column, then take the first column of each group, paste the column names together, and cbind the results
do.call(cbind, lapply(unname(split.default(as.data.frame(Matrix),
apply(Matrix, 2, paste, collapse = ''))),
function(x) matrix(x[,1],
dimnames = list(NULL, paste(colnames(x), collapse='-')))))
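Note that the version above sets the row dimnames to NULL, so bob, foo and bar are lost. A hedged sketch of a variant built on duplicated() that keeps the row names and the original column order; key, groups and merged are illustrative names, and the "|" separator assumes the cell values never contain that character:

```r
data <- c(1, 1, 1, 1, 1, 1, 2, 2, 3, 3, 4, 4)
Matrix <- matrix(data, nrow = 3, ncol = 4,
                 dimnames = list(c("bob", "foo", "bar"), c("4", "3", "7", "100")))

# One string per column describing its contents
key <- apply(Matrix, 2, paste, collapse = "|")

# Keep the first occurrence of every distinct column
merged <- Matrix[, !duplicated(key), drop = FALSE]

# Group the column names by content, in order of first appearance
groups <- split(colnames(Matrix), factor(key, levels = unique(key)))
colnames(merged) <- unname(vapply(groups, paste, character(1), collapse = "-"))

merged
```

Fixing the factor levels to unique(key) keeps the groups aligned with the columns retained by !duplicated(key).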

Extend/expand data frame with column of lists each into a row

I have a data frame of the following type:
df <- data.frame("col1" = c(1,2,3,4))
df$col2 <- list(list(1,1,1),list(2,2,2),list(3,3,3),list(4,4,4))
df$col3 <- list(c(1,1,1),c(2,2,2),c(3,3,3),c(4,4,4))
df
And get:
col1 col2 col3
1 1 1, 1, 1 1, 1, 1
2 2 2, 2, 2 2, 2, 2
3 3 3, 3, 3 3, 3, 3
4 4 4, 4, 4 4, 4, 4
Now I would like to manipulate this data frame to get something like:
col1 col3
1 1 1
1 1
1 1
2 2 2
2 2
2 2
3 3 3
3 3
3 3
...
Now I can do this with a simple loop. For each row I convert the list into a data frame. I then use rbind to append the data frames into a single one.
My question is: how do I do this with a vectorized function?
I have tried apply, sapply, mapply and Reduce, but with no success. apply was the only one that actually executed, but it produced incorrect results (I got only the first element of each list).
We can remove the first column (df[-1]), loop over the other columns, unlist each one, and then convert the resulting list to a data.frame
lst <- lapply(df[-1], unlist)
dfN <- data.frame(lst)
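The code above drops col1 entirely. If col1 should be carried along, one possible sketch is to repeat each col1 value by the length of the matching list element with lengths(); expanded is an illustrative name. In a data frame the col1 value is repeated on every row rather than left blank as in the desired printout:

```r
df <- data.frame(col1 = c(1, 2, 3, 4))
df$col3 <- list(c(1, 1, 1), c(2, 2, 2), c(3, 3, 3), c(4, 4, 4))

expanded <- data.frame(
  # Repeat each col1 value once per element of the corresponding col3 entry
  col1 = rep(df$col1, times = lengths(df$col3)),
  # Flatten the list column into one long vector
  col3 = unlist(df$col3)
)

head(expanded, 4)
```

This also works when the list elements have different lengths, since lengths() is computed per row.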

Resources