Accessing dataframes within a nested list - r

I have a list of three different types of datasets, with ten datasets in each type. It looks like this:
mat1 <- replicate(n=10,data.frame(matrix(data=rnorm(25,0,1),nrow=5,ncol=5)),simplify=FALSE)
mat2 <- replicate(n=10,data.frame(matrix(data=rnorm(25,0,1),nrow=5,ncol=5)),simplify=FALSE)
mat3 <- replicate(n=10,data.frame(matrix(data=rnorm(25,0,1),nrow=5,ncol=5)),simplify=FALSE)
combined <- list(mat1,mat2,mat3)
I want to apply the same function to each of the datasets, but I can't figure out how to access them. I tried using map from purrr, but it only applies it to the first one in the list:
map(combined[[i]],~length(.))
[[1]]
[1] 5
[[2]]
[1] 5
[[3]]
[1] 5
[[4]]
[1] 5
[[5]]
[1] 5
[[6]]
[1] 5
[[7]]
[1] 5
[[8]]
[1] 5
[[9]]
[1] 5
[[10]]
[1] 5
How can I apply a function to all datasets in a nested list?
*The function is more complex than length - it's a function from another package that I need to access using ~function

You can apply lengths to each list in combined:
lapply(combined, lengths)
#[[1]]
# [1] 5 5 5 5 5 5 5 5 5 5
#[[2]]
# [1] 5 5 5 5 5 5 5 5 5 5
#[[3]]
# [1] 5 5 5 5 5 5 5 5 5 5
Using purrr's map :
purrr::map(combined, lengths)
If length is just an example and you want a general way to apply a function to each data frame in the nested list, you can use a nested lapply:
lapply(combined, function(x) lapply(x, function(y) length(y)))
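Or use rapply (note that rapply also descends into the data frames themselves, because a data frame is a list, so the function is applied to each column rather than to each data frame):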
rapply(combined, length, how = 'list')
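Since the real function is more complex than length, here is a minimal base R sketch of the nested lapply pattern with a custom function. The helper names make_group and summarise_df are hypothetical stand-ins, not part of the question, and the nested list is smaller than the original to keep the output short.

```r
# Build a small nested list: 3 groups, each holding 2 data frames (5 rows x 5 columns)
set.seed(1)
make_group <- function(n) {
  replicate(n, data.frame(matrix(rnorm(25), nrow = 5, ncol = 5)), simplify = FALSE)
}
combined <- list(make_group(2), make_group(2), make_group(2))

# Hypothetical stand-in for the "more complex" function: it takes one data frame
summarise_df <- function(df) c(rows = nrow(df), cols = ncol(df))

# The outer lapply walks the groups, the inner lapply walks the data frames in a group
result <- lapply(combined, function(group) lapply(group, summarise_df))
result[[1]][[2]]

# rapply, by contrast, descends into each data frame and visits its columns
per_column <- rapply(combined, length, how = "list")
length(per_column[[1]][[1]])  # one entry per column of the first data frame
```

The result keeps the shape of the input, so result[[g]][[d]] holds the value for data frame d of group g.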

Related

Alternative to nested for loop in R without all possible combinations

Imagine I have this bit of nested for loop, which prints all combinations of a and b
a = c(1,2,3,4)
b = c(2,3,4,5)
for(i in a){
  for(k in b){
    print(i + k)
  }
}
So the output looks like this
[1] 3
[1] 4
[1] 5
[1] 6
[1] 4
[1] 5
[1] 6
[1] 7
[1] 5
[1] 6
[1] 7
[1] 8
[1] 6
[1] 7
[1] 8
[1] 9
How do I loop through the two loops to get a result with only 4 items, the sum of elements from a and b with the same index, akin to looping through a dictionary in Python? I would like to have result like this:
[1] 3
[1] 5
[1] 7
[1] 9
or this
[1] 3 5 7 9
Whereby I simply add up a and b like adding up two columns in a dataframe to produce a third of the same length.
I appreciate any help.
Try mapply:
mapply(`+`, a, b)
# [1] 3 5 7 9
We can replace + with any other function, for example paste or *:
mapply(paste, a, b)
# [1] "1 2" "2 3" "3 4" "4 5"
mapply(`*`, a, b)
# [1] 2 6 12 20
In R, loops are wrapped into *apply functions, see:
Grouping functions (tapply, by, aggregate) and the *apply family
As pointed out in the comments, mathematical operators such as + are vectorized in R. This means that you can pass them vectors as arguments and they will walk through the elements of each input vector. Therefore simply doing a + b gives the desired result. If you really want to do this as a loop, you don't need to nest it: simply use a single index, i, to pull elements from both input vectors. Another option that might be helpful here is purrr::map2(), which applies the specified function across two input lists.
However it's worth noting that if you did want to see all pairwise combinations, you could use the outer() function.
# test vectors
a = c(1,2,3,4)
b = c(2,3,4,5)
# operate pairwise through the two vectors
a + b
#> [1] 3 5 7 9
# go through vectors as a loop
for(i in seq_along(a)){
  print(a[i] + b[i])
}
#> [1] 3
#> [1] 5
#> [1] 7
#> [1] 9
# for more complex function can use purrr::map2 to run on two input lists
purrr::map2_dbl(.x = a, .y = b, `+`)
#> [1] 3 5 7 9
# operate on all combinations
outer(a, b, `+`)
#>      [,1] [,2] [,3] [,4]
#> [1,]    3    4    5    6
#> [2,]    4    5    6    7
#> [3,]    5    6    7    8
#> [4,]    6    7    8    9
Created on 2022-04-13 by the reprex package (v2.0.1)
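When the operation is not vectorized, a + b is not available, but the same index-by-index pairing still applies. The sketch below assumes a hypothetical scalar-only function, label_pair, which is not from the answers above; it only illustrates how mapply, or vapply over a shared index, pairs the elements.

```r
a <- c(1, 2, 3, 4)
b <- c(2, 3, 4, 5)

# Hypothetical scalar-only function: `if` needs a single TRUE/FALSE,
# so it cannot be called on whole vectors directly
label_pair <- function(x, y) {
  if (x * 2 > y + 2) "large" else "small"
}

# mapply pairs a[1] with b[1], a[2] with b[2], and so on
tags <- mapply(label_pair, a, b)
tags

# vapply over the shared index does the same and also checks the result type
tags2 <- vapply(seq_along(a), function(i) label_pair(a[i], b[i]), character(1))
identical(tags, tags2)
```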

Using mapply to select elements from a nested list using multiple arguments

Apologies if this has already been answered somewhere, but I checked all the pages I could find and can’t find a solution to this specific problem.
I want to use an apply function to select elements from lists nested within a list. The element I want to select from each sub-list varies based on arguments contained in a separate list. Here is some example code to illustrate what I am trying to do:
# Set seed for replicable results
set.seed(123)
# Create list of lists populated with randomly generated numbers
list1 <- list()
for (i in 1:10) {
  list1[[i]] <- as.list(sample.int(20, 10))
}
# Create second randomly generated list
list2 <- as.list(sample.int(10, 10))
# For loop which uses values from list2 to call specific elements from sub-lists within list1
for (i in 1:10){
  print(list1[[i]][[list2[[i]]]])
}
####################################################################################
[1] 4
[1] 8
[1] 5
[1] 8
[1] 15
[1] 17
[1] 12
[1] 15
[1] 3
[1] 15
As you can see, I can use a for loop to successfully select elements from the sub-lists nested within list1 using values from list2 in combination with the iterating value i.
Solutions offered to questions like this (R apply function with multiple parameters), suggest that I should be able to achieve this same result using the mapply function. However, when I try to do this I get the following error:
# Attempt to replicate output using mapply
mapply(function(x,y,z) x <- x[[z]][[y[[z]]]], x=list1, y=list2, z=1:10 )
####################################################################################
Error in x[[z]][[y[[z]]]] : subscript out of bounds
My questions are:
How can my code be altered to achieve the desired outcome?
What is causing this error? I have had similar problems with mapply in the past, when I have tried to input one or more lists alongside a vector, and have never been able to work out why it sometimes fails.
Many thanks in advance!
Try this. The error occurs because mapply already hands the function one element from each argument at a time: on call i, x is list1[[i]] and y is list2[[i]]. Indexing them again with z therefore reaches past their ends. It is simpler to index x by y directly inside the function passed to mapply. Here is the code:
#Code
unlist(mapply(function(x,y) x[y],x=list1,y=list2))
Output:
[1] 4 8 5 8 15 17 12 15 3 15
Or if you want the output in a list:
#Code 2
List <- mapply(function(x,y) x[y],x=list1,y=list2)
Output:
List
[[1]]
[1] 4
[[2]]
[1] 8
[[3]]
[1] 5
[[4]]
[1] 8
[[5]]
[1] 15
[[6]]
[1] 17
[[7]]
[1] 12
[[8]]
[1] 15
[[9]]
[1] 3
[[10]]
[1] 15
Other simplified options (many thanks and all credit to @27ϕ9):
#Code3
mapply(`[[`, list1, list2)
Output:
[1] 4 8 5 8 15 17 12 15 3 15
Or:
#Code4
mapply(`[`, list1, list2)
Output:
[[1]]
[1] 4
[[2]]
[1] 8
[[3]]
[1] 5
[[4]]
[1] 8
[[5]]
[1] 15
[[6]]
[1] 17
[[7]]
[1] 12
[[8]]
[1] 15
[[9]]
[1] 3
[[10]]
[1] 15
If you look at your for loop, only one variable is changing, namely i. So in this case you can use lapply, or even sapply, since you are getting a single number back.
sapply(1:10, function(i) list1[[i]][[list2[[i]]]])
#[1] 4 8 5 8 15 17 12 15 3 15
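To see why the original mapply call failed, it can help to compare what each approach indexes. The following is a small sketch on made-up deterministic data (the values in list1 and list2 here are assumptions chosen for readability, not the random data from the question).

```r
list1 <- list(as.list(c(11, 12, 13)), as.list(c(21, 22, 23)), as.list(c(31, 32, 33)))
list2 <- list(3, 1, 2)

# Inside the function, x is already one sub-list and y is already one index
picked <- mapply(function(x, y) x[[y]], list1, list2)
picked

# The loop-style form indexes the full lists with i instead
from_loop <- sapply(seq_along(list1), function(i) list1[[i]][[list2[[i]]]])
identical(picked, from_loop)

# Re-indexing with z inside mapply fails, as in the question;
# tryCatch captures the error message instead of stopping
failed <- tryCatch(
  mapply(function(x, y, z) x[[z]][[y[[z]]]], list1, list2, seq_along(list1)),
  error = function(e) conditionMessage(e)
)
failed
```

In short, either let mapply do the element-wise pairing and index once, or iterate over the index i and index the full lists; mixing the two indexes everything twice.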

Replacing values in a list based on a condition

I have a list of values called squares and would like to replace all values which are 0 with 40.
I tried:
replace(squares, squares==0, 40)
but the list remains unchanged.
If it is a list, then loop through the list with lapply and use replace
squares <- lapply(squares, function(x) replace(x, x==0, 40))
squares
#[[1]]
#[1] 40 1 2 3 4 5
#[[2]]
#[1] 1 2 3 4 5 6
#[[3]]
#[1] 40 1 2 3
data
squares <- list(0:5, 1:6, 0:3)
I think for this purpose, provided every element of the list is a single value, you can just treat it as if it were a vector as follows:
squares=list(2,4,6,0,8,0,10,20)
squares[squares==0]=40
Output:
[[1]]
[1] 2
[[2]]
[1] 4
[[3]]
[1] 6
[[4]]
[1] 40
[[5]]
[1] 8
[[6]]
[1] 40
[[7]]
[1] 10
[[8]]
[1] 20
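One likely reason the original attempt appeared to do nothing is that replace returns a modified copy and does not change its input in place. The sketch below illustrates this under the assumption that the result was simply never assigned back; the variable names before and flat are made up for the example.

```r
squares <- list(0:5, 1:6, 0:3)

# replace() returns a new object; without assignment, squares is untouched
invisible(lapply(squares, function(x) replace(x, x == 0, 40)))
before <- squares[[1]][1]
before  # still 0

# Assigning the result back makes the change stick
squares <- lapply(squares, function(x) replace(x, x == 0, 40))
squares[[1]]

# For a flat list of single values, logical indexing on the list itself works
flat <- list(2, 0, 8, 0)
flat[flat == 0] <- 40
unlist(flat)
```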

How to check if the given value belong to the vectors in list?

Suppose we have a value y=4 and a list of vectors. I want to check whether this value belongs to any vector in the list; if it does, I want to add this value to all the elements of that vector.
y<-4
M<- list( c(1,3,4,6) , c(2,3,5), c(1,3,6) ,c(1,4,5,6))
> M
[[1]]
[1] 1 3 4 6
[[2]]
[1] 2 3 5
[[3]]
[1] 1 3 6
[[4]]
[1] 1 4 5 6
The outcomes will be similar to :
> R
[[1]]
[1] 5 7 8 10
[[2]]
[1] 5 8 9 10
We can use keep which only keeps elements that satisfy a predicate. In this case, it is only keeping the vectors that contain y.
We then add y to each of the vectors.
library('tidyverse')
keep(M, ~y %in% .) %>%
  map(~. + y)
Here is a simple hacky way to do this:
lapply(M[sapply(M, function(x){y %in% x})],function(x){x+y})
returning:
[[1]]
[1] 5 7 8 10
[[2]]
[1] 5 8 9 10
Logic: use sapply to work out which parts of M have a 4 in, then add 4 to those with lapply
You can do this with...
lapply(M[sapply(M, `%in%`, x=y)], `+`, y)
[[1]]
[1] 5 7 8 10
[[2]]
[1] 5 8 9 10
Here is a method with lapply and set functions.
# loop through M, check length of intersect
myList <- lapply(M, function(x) if(length(intersect(y, x)) > 0) x + y else NULL)
# now subset, dropping the NULL elements
myList <- myList[lengths(myList) > 0]
this returns
myList
[[1]]
[1] 5 7 8 10
[[2]]
[1] 5 8 9 10
Everyone has given great answers; I am just adding one that uses Map.
Map("+",M[unlist(Map("%in%", y,M))],y)
[[1]]
[1] 5 7 8 10
[[2]]
[1] 5 8 9 10
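The answers above all follow the same two steps: select the vectors that contain y, then add y to each of them. As a base R sketch of those two steps written out separately, using Filter (an alternative not mentioned in the answers; the name has_y is made up here):

```r
y <- 4
M <- list(c(1, 3, 4, 6), c(2, 3, 5), c(1, 3, 6), c(1, 4, 5, 6))

# Step 1: keep only the vectors that contain y
has_y <- Filter(function(v) y %in% v, M)

# Step 2: add y to every element of the kept vectors
R <- lapply(has_y, function(v) v + y)
R
```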

R: enumerating sequences of permutations

I want to enumerate the distinct sequences of different permutations, and I'm using the function permn. I understand for 2!, I can just use permn(2) and that will enumerate 1, 2 and 2, 1.
> library(combinat)
> permn(2)
[[1]]
[1] 1 2
[[2]]
[1] 2 1
I want to do the same thing for the numbers 7 and 8. So what should I pass into the function so that it will return something like this?
> permn(...)
[[1]]
[1] 7 8
[[2]]
[1] 8 7
Pass the values themselves as a vector:
permn(c(7,8))
#[[1]]
#[1] 7 8
#
#[[2]]
#[1] 8 7
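For readers without the combinat package, the same idea can be sketched in base R. The helper all_perms below is hypothetical (it is not part of combinat and is not how permn is implemented); it only illustrates that the permutations are taken over the elements of the vector that is passed in.

```r
# Hypothetical helper (not from combinat): all orderings of a vector
all_perms <- function(x) {
  if (length(x) <= 1) return(list(x))
  out <- list()
  for (i in seq_along(x)) {
    # Fix x[i] in front, then permute the remaining elements
    for (rest in all_perms(x[-i])) {
      out[[length(out) + 1]] <- c(x[i], rest)
    }
  }
  out
}

two <- all_perms(c(7, 8))
two
length(all_perms(c(7, 8, 9)))  # 3! orderings
```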
