Average elements of sublists in a nested list in R

I need to average the elements of the nested sublists element-wise. In the example below I have the list lll. For lll[[1]] I would like to average across its sublists: (1+3+5+7)/4 = 4 and (2+4+6+8)/4 = 5. Similarly, for lll[[2]] the averages are (2+4+6+8)/4 = 5 and (1+3+5+7)/4 = 4. I can do this with a for loop, but the result is not in the shape I want: I would like a list or data frame laid out horizontally, for example list(c(4,5), c(5,4)). A for loop is also inefficient when the list has 5000 elements. I would really appreciate a smarter way to do this.
l1<-as.matrix(c(1,2))
l2<-as.matrix(c(3,4))
l3<-as.matrix(c(5,6))
l4<-as.matrix(c(7,8))
l5<-as.matrix(c(2,1))
l6<-as.matrix(c(4,3))
l7<-as.matrix(c(6,5))
l8<-as.matrix(c(8,7))
ll1<-list(l1,l2,l3,l4)
ll2<-list(l5,l6,l7,l8)
lll<-list(ll1,ll2)
### using for loop
sum_k_a_<-list()
sum_k_b_<-list()
for (l in 1:2){
sum_k_a<-0
sum_k_b<-0
for (k in 1:4){
sum_k_a=lll[[l]][[k]][1]+sum_k_a
sum_k_b=lll[[l]][[k]][2]+sum_k_b
}
sum_k_a_[[l]]<-sum_k_a/4
sum_k_b_[[l]]<-sum_k_b/4
}

A couple of options:
lapply(lll, function(x) Reduce(`+`, x)/length(x) )
#[[1]]
# [,1]
#[1,] 4
#[2,] 5
#
#[[2]]
# [,1]
#[1,] 5
#[2,] 4
lapply(lll, function(x) rowMeans(do.call(cbind, x)))
#[[1]]
#[1] 4 5
#
#[[2]]
#[1] 5 4
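If the "horizontal" layout from the question is wanted as a matrix or data frame rather than a list, the second option can be wrapped as in the sketch below. This is only one possible arrangement; the names means_by_element, means_wide and means_df are made up for illustration, and the use of vapply assumes every sublist holds length-2 column vectors as in the example.

```r
# Rebuild the example data from the question.
ll1 <- list(as.matrix(c(1, 2)), as.matrix(c(3, 4)),
            as.matrix(c(5, 6)), as.matrix(c(7, 8)))
ll2 <- list(as.matrix(c(2, 1)), as.matrix(c(4, 3)),
            as.matrix(c(6, 5)), as.matrix(c(8, 7)))
lll <- list(ll1, ll2)

# One column of means per top-level element.
# vapply checks that every result is a numeric vector of length 2 (assumption).
means_by_element <- vapply(
  lll,
  function(x) rowMeans(do.call(cbind, x)),
  numeric(2)
)

# Transpose so that each top-level element becomes one row.
means_wide <- t(means_by_element)
means_df <- as.data.frame(means_wide)
```

Here means_wide has one row per element of lll, so the first row holds the two averages for lll[[1]] and the second row those for lll[[2]].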

You could do it using lapply and sapply:
lapply(lll,function(x) rowSums(sapply(x,function(y) c(y[1],y[2]))/4))
This returns a list of 2 elements:
[[1]]
[1] 4 5
[[2]]
[1] 5 4
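The call above hard-codes both the divisor 4 and the two positions y[1] and y[2]. A possible variation that avoids both, assuming every inner element is a column vector of the same length, is sketched here; the name means is chosen only for this example.

```r
# Rebuild the example data from the question.
ll1 <- list(as.matrix(c(1, 2)), as.matrix(c(3, 4)),
            as.matrix(c(5, 6)), as.matrix(c(7, 8)))
ll2 <- list(as.matrix(c(2, 1)), as.matrix(c(4, 3)),
            as.matrix(c(6, 5)), as.matrix(c(8, 7)))
lll <- list(ll1, ll2)

# sapply(x, as.vector) turns each sublist into a matrix with one column per
# inner element; rowMeans then averages across those columns, so neither the
# number of inner elements nor their length has to be written out.
means <- lapply(lll, function(x) rowMeans(sapply(x, as.vector)))
```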

We can also use tidyverse syntax
library(tidyverse)
lll %>%
map(~Reduce(`+`, .)/length(.))
#[[1]]
# [,1]
#[1,] 4
#[2,] 5
#[[2]]
# [,1]
#[1,] 5
#[2,] 4

This would be much simpler with an implicit sapply loop that applies mean to the unlist-ed values that sit "deeper" in the list structure:
L_means <- sapply( lll, FUN=function(items) {mean( unlist(items))})
L_means
[1] 4.5 4.5
I guess I misunderstood the question, so this is what was desired:
(L_means <- sapply( lll, FUN=function(top){ apply( as.data.frame(top), 1, mean)}) )
[,1] [,2]
[1,] 4 5
[2,] 5 4

Related

Access an element of a list in the same manner how you access an element of a matrix

I have a matrix:
mat <- matrix(c(3,9,5,1,-2,8), nrow = 2)
[,1] [,2] [,3]
[1,] 3 5 -2
[2,] 9 1 8
I have a list:
lst <- as.list(data.frame(matrix(c(3,9,5,1,-2,8), nrow = 2)))
$X1
[1] 3 9
$X2
[1] 5 1
$X3
[1] -2 8
I can access my matrix with mat[i,j].
I can access my list with lst[[c(i,j)]].
But in the matrix, mat[1,2] gives 5, while the same numbers in the list, lst[[c(1,2)]], give 9.
Is there a way to get the same numbers when I access the list? Maybe by manipulating the list in a certain manner? When I use lst[[c(1,2)]] I want to get 5 instead of 9. I want to get the same numbers I get when using mat[i,j].
You can try
> list2DF(lst)[1, 2]
[1] 5
You can use transpose() from purrr to transpose a list.
lst2 <- purrr::transpose(lst)
lst2[[c(1,2)]]
# [1] 5
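The mismatch arises because lst[[c(i,j)]] indexes recursively: it takes element i of the list (a column of the original matrix) and then element j inside it. A minimal sketch that leaves the list untouched and simply swaps the indices is shown below; list_at is a hypothetical helper name, not an existing function.

```r
mat <- matrix(c(3, 9, 5, 1, -2, 8), nrow = 2)
lst <- as.list(data.frame(mat))

# Hypothetical helper: matrix-style (row, column) access on a list of columns.
# lst[[c(j, i)]] is the same as lst[[j]][[i]]: column j first, then row i.
list_at <- function(lst, i, j) lst[[c(j, i)]]

list_at(lst, 1, 2)  # same element as mat[1, 2]
```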

Row-bind specific rows that are in matrices in lists in R

I would like to row-bind specific rows that are rows in matrices in a list. For instance, I might have a list that has three matrices in them like:
> t
[[1]]
[,1] [,2]
[1,] 1 3
[2,] 2 4
[[2]]
[,1] [,2]
[1,] 5 7
[2,] 6 8
[[3]]
[,1] [,2]
[1,] 9 11
[2,] 10 12
Then what I'd like to do is calculate the distances between the corresponding rows of these matrices. For this small case I could just write out
dist(rbind(t[[1]][1,], t[[2]][1,], t[[3]][1,]))
dist(rbind(t[[1]][2,], t[[2]][2,], t[[3]][2,]))
But in my case I could have much larger matrices and larger lists, so I was wondering if there is a way to row bind the corresponding rows of the matrices in the list in a quicker way?
Many thanks for any help!
Sure, if l is your list, we may use, e.g., sapply in the following way:
t(sapply(l, `[`, 1,))
# [,1] [,2]
# [1,] 1 3
# [2,] 5 7
# [3,] 9 11
which is short for
t(sapply(l, `[`, i = 1, j =))
or
t(sapply(l, function(ll) ll[1, ]))
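To cover every row rather than only the first, the same idea can be wrapped in a loop over the row index. The sketch below assumes all matrices in the list have the same dimensions; row_dists is a name made up for this example.

```r
# The three matrices from the question.
l <- list(matrix(1:4, 2), matrix(5:8, 2), matrix(9:12, 2))

# For each row index i, row-bind row i of every matrix and compute the
# pairwise distances between the bound rows.
row_dists <- lapply(seq_len(nrow(l[[1]])), function(i) {
  dist(do.call(rbind, lapply(l, function(m) m[i, ])))
})
```

row_dists[[1]] then corresponds to the first dist() call written out in the question, and row_dists[[2]] to the second.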

List of lists to matrix

I have a list of lists and I want to convert it into a matrix such that each column = one sublist.
Mock example
list1 <- list(1, 2)
list2 <- list(1, 2, 3)
list3 <- list(1, 2, 3, 4)
list_lists <- list (list1, list2, list3)
I'm first equalizing the lengths of all the sublists (padding with NULLs if needed) so that all sublists have the length of the longest one. That is to avoid having R recycle data to fill in the rows of the final matrix (feel free to tell me if I can skip this step somehow).
max_length <- max(unlist(lapply (list_lists, FUN = length)))
list_lists <- lapply (list_lists, function (x) {length (x) <- max_length; return (x)})
My best attempt so far
mat <- lapply (list_lists, cbind)
mat does look superficially like what I want but it is actually not. It is not a matrix (and attempts to convert it into one using as.matrix are unsuccessful) and I cannot refer to columns/rows like I would do with a matrix.
I am expecting
[,1] [,2] [,3]
[1,] 1 1 1
[2,] 2 2 2
[3,] NULL 3 3
[4,] NULL NULL 4
What is weird to me is that
mat <- cbind (list_lists[[1]], list_lists[[2]], list_lists[[3]])
seems to work. I would bet these two lines are the same, how can they be different?
They are different: lapply returns a list (see the excerpts from the documentation below).
Use do.call instead of mat <- lapply (list_lists, cbind), as follows:
mat <- do.call("cbind",list_lists)
Here do.call is the same as cbind (list_lists[[1]], list_lists[[2]], list_lists[[3]]): it passes the sublists to a single cbind call as separate arguments, and each one becomes a column.
> do.call("cbind",list_lists)
[,1] [,2] [,3]
[1,] 1 1 1
[2,] 2 2 2
[3,] NULL 3 3
[4,] NULL NULL 4
>
Understanding do.call:
From documentation:
do.call constructs and executes a function call from a name or a
function and a list of arguments to be passed to it.
lapply returns a list of the same length as X, each element of which
is the result of applying FUN to the corresponding element of X.
Run ?do.call and ?lapply in the R console for the full documentation.
You can also read: do.call and lapply
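The difference between the two calls can be checked directly. The following sketch reuses the padded list from the question; per_element and combined are names chosen only for this illustration.

```r
list_lists <- list(list(1, 2), list(1, 2, 3), list(1, 2, 3, 4))

# Pad every sublist to the length of the longest one, as in the question.
max_length <- max(lengths(list_lists))
list_lists <- lapply(list_lists, function(x) {
  length(x) <- max_length
  x
})

# lapply calls cbind once per sublist, so the result is still a list,
# holding three separate one-column matrices.
per_element <- lapply(list_lists, cbind)

# do.call makes one call, cbind(sublist1, sublist2, sublist3),
# so the result is a single matrix with one column per sublist.
combined <- do.call(cbind, list_lists)

is.matrix(per_element)  # FALSE
dim(combined)           # 4 3
```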
Use sapply instead of lapply like this:
list_lists <- sapply (list_lists, function (x) {length (x) <- max_length; return (x)})
This should give you the matrix that you wanted. sapply applies the function to each sublist and then tries to simplify the equal-length results into a matrix, so the separate cbind line that you specified above is no longer needed.
The stri_list2matrix function should be able to handle this:
library(stringi)
stri_list2matrix(list_lists)
## [,1] [,2] [,3]
## [1,] "1" "1" "1"
## [2,] "2" "2" "2"
## [3,] NA "3" "3"
## [4,] NA NA "4"
Another option is to use your "max_length" to create the matrix:
ml <- max(lengths(list_lists))
do.call(cbind, lapply(list_lists, function(x) `length<-`(unlist(x), ml)))
## [,1] [,2] [,3]
## [1,] 1 1 1
## [2,] 2 2 2
## [3,] NA 3 3
## [4,] NA NA 4
A third option is to use melt and dcast from "reshape2":
library(reshape2)
dcast(melt(list_lists), L2 ~ L1)
## L2 1 2 3
## 1 1 1 1 1
## 2 2 2 2 2
## 3 3 NA 3 3
## 4 4 NA NA 4

sum rows in a nested list in R

I have a nested list of length 100 coming out of a program. I need to sum all elements of the first row and all elements of the second row. Here is a small reproducible example. What I need is the sums 1+3+5+7 = 16 and 2+4+6+8 = 20 as a vector or matrix.
l1<-as.matrix(c(1,2))
l2<-as.matrix(c(3,4))
l3<-as.matrix(c(5,6))
l4<-as.matrix(c(7,8))
ll1<-list(l1,l2)
ll2<-list(l3,l4)
lll<-list(ll1,ll2)
lll
[[1]]
[[1]][[1]]
[,1]
[1,] 1
[2,] 2
[[1]][[2]]
[,1]
[1,] 3
[2,] 4
[[2]]
[[2]][[1]]
[,1]
[1,] 5
[2,] 6
[[2]][[2]]
[,1]
[1,] 7
[2,] 8
I found the purrr package helpful for the function flatten() because it only removes one level of the hierarchy of the lists:
library(magrittr) #for pipes
library(purrr) #for flatten
lll %>% flatten %>% as.data.frame %>% rowSums
Based on akrun's answer it is similar to do.call(c, lll).
We can do this with base R by flattening the nested list into a single list with do.call(c, lll), then cbind the elements of that list and get the rowSums
rowSums(do.call(cbind, do.call(c, lll)))
#[1] 16 20
Or otherwise we can unlist, create a matrix with 2 columns, and get the colSums
colSums(matrix(unlist(lll), ncol=2, byrow=TRUE))
#[1] 16 20
Reduce in base R:
Reduce("+", lapply(Reduce(c, lll), rowSums))
#[1] 16 20
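Another base R route, shown here only as a sketch, flattens one level with unlist(recursive = FALSE) and does not need the number of rows to be spelled out. It assumes every inner element is a column vector of the same length; flat and row_totals are names made up for the example.

```r
lll <- list(
  list(as.matrix(c(1, 2)), as.matrix(c(3, 4))),
  list(as.matrix(c(5, 6)), as.matrix(c(7, 8)))
)

# Remove only the outer level of nesting: a flat list of four matrices.
flat <- unlist(lll, recursive = FALSE)

# One column per inner matrix, then sum across the columns.
row_totals <- rowSums(sapply(flat, as.vector))
row_totals
```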

Create a list of matrices with a loop and merging them in R

I have several matrices; let's make it simple and say I have 3 matrices. I want to create a list of them and then use rbind to put one over the other.
If I do it by hand, using the following code, it works:
list<-list(matrix1,matrix2,matrix3)
test<-do.call("rbind",list)
and I get a matrix of 97947 rows by 4 columns which is what I want.
but if I do a loop, it does not work:
list2<-list()
for (x in 1:3)
{
y<-paste0("matrix",x)
list2[[x]] <- y
}
test2<-do.call("rbind",list2)
And I get a 3x1 character matrix ???
Can someone please point me to the error?
Any comments would be greatly appreciated.
Thank you!!!!
Consider using a function like mget to get all of your matrix objects from the calling environment (the global environment when run at the top level) and put them in a list. You can then use your do.call method and avoid the loop. Here is a toy example:
# Some data
m1 <- matrix( 1:4 , 2 , byrow = TRUE )
m2 <- matrix( 1:4 , 2 , byrow = TRUE )
m3 <- matrix( 1:4 , 2 , byrow = TRUE )
# Use mget to put them in a list. By default mget looks up the object names given in its first argument in the calling environment (the global environment at top level)
list <- mget( paste0( "m" , 1:3 ) )
list
#$m1
# [,1] [,2]
#[1,] 1 2
#[2,] 3 4
#$m2
# [,1] [,2]
#[1,] 1 2
#[2,] 3 4
#$m3
# [,1] [,2]
#[1,] 1 2
#[2,] 3 4
# rbind them
do.call( rbind , list )
# [,1] [,2]
#[1,] 1 2
#[2,] 3 4
#[3,] 1 2
#[4,] 3 4
#[5,] 1 2
#[6,] 3 4
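As for why the loop in the question yields a 3x1 character matrix: paste0("matrix", x) only builds the name of the object as a string, so the list ends up holding three strings, and rbind stacks those. If the loop is to be kept, one possible repair is to look each name up with get. The sketch below uses small stand-in matrices, since the real matrix1, matrix2 and matrix3 are not shown in the question.

```r
# Stand-in data (assumption: the real matrices share the same number of columns).
matrix1 <- matrix(1:4, 2, byrow = TRUE)
matrix2 <- matrix(5:8, 2, byrow = TRUE)
matrix3 <- matrix(9:12, 2, byrow = TRUE)

list2 <- vector("list", 3)
for (x in 1:3) {
  # get() returns the object whose name is the string, not the string itself.
  list2[[x]] <- get(paste0("matrix", x))
}

test2 <- do.call(rbind, list2)
dim(test2)  # 6 2
```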
