Return list of lists from foreach loop in R

I have a function which returns a list of two objects (a list l and a number n). I want to loop over this function in a foreach loop.
create_lists <- function(){
l = sample(100, 5)
n = sample(100, 1)
return(list(l=l, n=n))}
Because create_lists returns a list, I was advised to use a custom combine function, which looks like this:
combine_custom <- function(list1, list2){
ls = c(list1$l, list2$l)
ns = c(list1$n, list2$n)
return(list(l = ls, n = ns))
}
So now my foreach loop looks like this:
m = foreach(i=1:5, .combine = combine_custom)%do%{
create_lists()}
My desired output would be:
m$l
[[1]]
[1] 100 25 86 21 28
[[2]]
[1] 78 37 79 41 61
[[3]]
[1] 73 22 78 94 13
[[4]]
[1] 15 28 76 78 52
[[5]]
[1] 32 93 92 2 1
m$n
[1] 52 56 3 79 82
But what I get is something like this:
$l
[1] 84 28 75 59 68 84 28 75 59 68
$n
[1] 31 91 18 98 39
So I have two problems:
1) Why is everything but two of the l lists dropped?
2) How can I make m$l to be a list of lists?
EDIT:
I tried another approach, which does not use c:
combine_custom <- function(list1, list2){
ls = list1$l[[length(list1$l)+1]] = list(list2$l)
ns = c(list1$n, list2$n)
return(list(l = ls, n = ns))
}
But this gave the same result as described above, to be exact:
$l
$l[[1]]
[1] 65 84 48 81 82
$n
[1] 88 79 92 36 71

I have found another way which avoids the problem mentioned above, namely that combine has to create a new list first and later only append lists.
Also, the real function I am using actually returns a list of lists, so the following proved useful:
combine_custom <- function(list1, list2) {
if (plotrix::listDepth(list1$l) > plotrix::listDepth(list2$l)) {
ls <- c(list1$l, list(list2$l))
} else {
ls <- c(list(list1$l), list(list2$l))
}
ns <- c(list1$n, list2$n)
return(list(l = ls, n = ns))
}
This is not perfect if the function can return lists of varying nesting depths, but it works in my case.

The combine part is giving a lot of trouble because on the first iteration it needs to make a list out of two lists, but on the second iteration it needs to append one list as an element to a list of lists.
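One way to remove that asymmetry is to have every result arrive already wrapped in a list, so that the combine step is a plain c() on every call. The following is only a sketch: create_wrapped and combine_wrapped are hypothetical names, and the pairwise combining that foreach performs is imitated here with base R's Reduce so the snippet runs without extra packages.

```r
# Hypothetical variant of create_lists(): `l` is wrapped in list() up front,
# so each result already looks like "a list holding one vector".
create_wrapped <- function() {
  list(l = list(sample(100, 5)), n = sample(100, 1))
}

# The combine step is now identical on the first and on every later call.
combine_wrapped <- function(acc, new) {
  list(l = c(acc$l, new$l), n = c(acc$n, new$n))
}

set.seed(1)
results <- lapply(1:5, function(i) create_wrapped())

# Reduce folds the results pairwise, left to right.
m <- Reduce(combine_wrapped, results)

length(m$l)   # number of vectors collected in m$l
length(m$n)   # number of values collected in m$n
```

The same combine_wrapped should be usable as the .combine argument of foreach, provided the loop body returns the wrapped form.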
Another approach (may or may not work depending on the size of your actual data/problem) is to use the purrr package for working with lists:
> m <- foreach(i=1:3)%do%{create_lists()}
> m
[[1]]
[[1]]$l
[1] 21 33 12 50 36
[[1]]$n
[1] 74
[[2]]
[[2]]$l
[1] 12 80 39 78 6
[[2]]$n
[1] 74
[[3]]
[[3]]$l
[1] 9 61 75 63 94
[[3]]$n
[1] 2
> purrr::transpose(m)
$l
$l[[1]]
[1] 21 33 12 50 36
$l[[2]]
[1] 12 80 39 78 6
$l[[3]]
[1] 9 61 75 63 94
$n
$n[[1]]
[1] 74
$n[[2]]
[1] 74
$n[[3]]
[1] 2
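If m$n should be a plain vector rather than a list, as in the desired output of the question, the same reshaping can be sketched in base R. This is only an illustration, assuming every element of m has exactly the fields l and n; create_lists is redefined here so the snippet runs on its own, and lapply stands in for the foreach loop.

```r
create_lists <- function() {
  list(l = sample(100, 5), n = sample(100, 1))
}

set.seed(42)
m <- lapply(1:3, function(i) create_lists())

reshaped <- list(
  # keep each `l` as its own list element
  l = lapply(m, `[[`, "l"),
  # sample() returns integers, so each `n` is a single integer
  n = vapply(m, `[[`, integer(1), "n")
)

str(reshaped)
```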
Hope that helps!

Thank you @Maria H., you solved my problem! The 'plotrix' package didn't work for me, but I used 'collapse' and it worked fine:
combine_custom1 <- function(a, b) {
if (collapse::ldepth(a) > collapse::ldepth(b)) {
ls <- c(a, list(b))
} else {
ls <- c(list(a), list(b))
}
return(ls)
}

Related

assign objects to dynamic lists in r

I have a nested loops which produce outputs that I want to store in list objects with dynamic names. A toy example of this would look as follows:
set.seed(8020)
names<-sample(LETTERS,5,replace = F)
for(n in names)
{
#Create the list
assign(paste0("examples_",n),list())
#Populate the list
get(paste0("examples_",n))[[1]]<-sample(100,10)
get(paste0("examples_",n))[[2]]<-sample(100,10)
get(paste0("examples_",n))[[3]]<-sample(100,10)
}
Unfortunately I keep getting the error:
Error in get(paste0("examples_", n))[[1]] <- sample(100, 10) :
target of assignment expands to non-language object
I have tried all kinds of assign, eval and get functions to parse the object, but haven't had any luck.
Expanding on my comment with a worked example:
examples <- vector(mode="list", length=length(names) )
names(examples) <- names # please change that to mynames
# or almost anything other than `names`
examples <- lapply( examples, function(L) {L[[1]] <- sample(100,10)
L[[2]] <- sample(100,10)
L[[3]] <- sample(100,10); L} )
# Top of the output:
> examples
$P
$P[[1]]
[1] 34 49 6 55 19 28 72 42 14 92
$P[[2]]
[1] 97 71 63 59 66 50 27 45 76 58
$P[[3]]
[1] 94 39 77 44 73 15 51 78 97 53
$F
$F[[1]]
[1] 12 21 89 26 16 93 4 13 62 45
$F[[2]]
[1] 83 21 68 74 32 86 52 49 16 13
$F[[3]]
[1] 14 45 40 46 64 85 88 28 53 42
This mode of programming does become more natural over time. It gets you out of writing clunky for-loops all the time. Develop your algorithms for a single list-node at a time and then use sapply or lapply to iterate the processing.
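The same idea can be written a little more compactly. This is a sketch rather than the only way to do it: list_names is a name introduced here to avoid masking the base function names(), and replicate() with simplify = FALSE is assumed to be an acceptable replacement for the three explicit assignments.

```r
set.seed(8020)
list_names <- sample(LETTERS, 5, replace = FALSE)

# One named list instead of five dynamically named objects.
examples <- lapply(
  setNames(list_names, list_names),
  function(nm) replicate(3, sample(100, 10), simplify = FALSE)
)

# Elements are reached by name, so no get()/assign() is needed.
examples[[list_names[1]]][[2]]
```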

Calculating mode with modeest package in R

I am using the below code for calculating the mode of a dataframe:
library(modeest)
apply(df[ ,2:length(df)], 1, mfv)
My data looks like this:
Item A B C
Book001 56 32 56
Book002 95 95 20
Book003 50 89 50
Book004 6 65 40
It gives me the following output:
[[1]]
[1] 56
[[2]]
[1] 95
[[3]]
[1] 50
[[4]]
[1] 6 40 65
This code is perfect only if the data contains a recurring term.
How can I display the mode as NA when there is no recurring term?
Let's try with a custom function:
foo <- function(x){
out <- mfv(x)
if(length(out) > 1) out <- NA
return(out)
}
apply(df[ ,2:length(df)], 1, foo)
# [1] 56 95 50 NA
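Note that foo returns NA whenever mfv reports more than one value, so a row with two equally frequent recurring values also becomes NA. For comparison, here is a sketch of the same rule in base R without modeest; row_mode is a hypothetical helper name and the data frame is retyped from the question.

```r
df <- data.frame(
  Item = c("Book001", "Book002", "Book003", "Book004"),
  A = c(56, 95, 50, 6),
  B = c(32, 95, 89, 65),
  C = c(56, 20, 50, 40)
)

# Most frequent value of x, or NA when several values tie for most frequent.
row_mode <- function(x) {
  counts <- table(x)
  top <- names(counts)[counts == max(counts)]
  if (length(top) > 1) NA_real_ else as.numeric(top)
}

res <- unname(apply(df[, -1], 1, row_mode))
res
```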

R - Apply function with different argument value for each row/column of a matrix

I am trying to apply a function to each row or column of a matrix, but I need to pass a different argument value for each row.
I thought I was familiar with lapply, mapply etc... But probably not enough.
As a simple example :
> a<-matrix(1:100,ncol=10);
> a
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 1 11 21 31 41 51 61 71 81 91
[2,] 2 12 22 32 42 52 62 72 82 92
[3,] 3 13 23 33 43 53 63 73 83 93
[4,] 4 14 24 34 44 54 64 74 84 94
[5,] 5 15 25 35 45 55 65 75 85 95
[6,] 6 16 26 36 46 56 66 76 86 96
[7,] 7 17 27 37 47 57 67 77 87 97
[8,] 8 18 28 38 48 58 68 78 88 98
[9,] 9 19 29 39 49 59 69 79 89 99
[10,] 10 20 30 40 50 60 70 80 90 100
Let's say I want to apply a function to each row, I would do :
apply(a, 1, myFunction);
However my function takes an argument, so :
apply(a, 1, myFunction, myArgument);
But if I want my argument to take a different value for each row, I cannot find the right way to do it.
If I define a 'myArgument' with multiple values, the whole vector will obviously be passed to each call of 'myFunction'.
I think that I would need a kind of hybrid between apply and the multivariate mapply. Does it make sense ?
One 'dirty' way to achieve my goal is to split the matrix by rows (or columns), use mapply on the resulting list and merge the result back to a matrix :
do.call(rbind, Map(myFunction, split(a,row(a)), as.list(myArgument)));
I had a look at sweep, aggregate and all the *apply variations, but I couldn't find the perfect match for my need. Did I miss it?
Thank you for your help.
You can use sweep to do that.
a <- matrix(rnorm(100),10)
rmeans <- rowMeans(a)
a_new <- sweep(a,1,rmeans,`-`)
rowMeans(a_new)
I don't think there are any great answers, but you can somewhat simplify your solution by using mapply, which handles the "rbind" part for you, assuming your function always returns a vector of the same size (also, Map is really just mapply with SIMPLIFY = FALSE):
a <- matrix(1:80,ncol=8)
myFun <- function(x, y) (x - mean(x)) * y
myArg <- 1:nrow(a)
t(mapply(myFun, split(a, row(a)), myArg))
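An alternative that avoids splitting the matrix is to loop over the row index and pick the matching argument inside the loop. This is a sketch under the same assumption as above, namely that the function returns a vector of length ncol(a) for every row; myFun and myArg are the example definitions from this answer.

```r
a <- matrix(1:80, ncol = 8)
myFun <- function(x, y) (x - mean(x)) * y
myArg <- seq_len(nrow(a))

# vapply builds one column per row of `a`, so transpose at the end.
res <- t(vapply(
  seq_len(nrow(a)),
  function(i) myFun(a[i, ], myArg[i]),
  numeric(ncol(a))
))

dim(res)
```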
I know the topic is quite old, but I had the same issue and I solved it this way:
# Original matrix
a <- matrix(runif(n=100), ncol=5)
# Different value for each row
v <- runif(n=nrow(a))
# Result matrix -> Add a column with the row number
o <- cbind(1:nrow(a), a)
fun <- function(x, v) {
idx <- 2:length(x)
i <- x[1]
r <- x[idx] / v[i]
return(r)
}
o <- t(apply(o, 1, fun, v=v))
By adding a column with the row number to the left of the original matrix, the index of the needed value from the argument vector can be received from the first column of the data matrix.
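For the specific case where the per-row operation is simple element-wise arithmetic, as with the division in this example, the row-number column may not be needed at all. The sketch below assumes the same shapes as above (one value in v per row of a) and compares sweep with plain recycling, which works because R fills matrices column by column.

```r
set.seed(1)
a <- matrix(runif(n = 100), ncol = 5)
v <- runif(n = nrow(a))

# Divide row i of `a` by v[i].
by_sweep <- sweep(a, 1, v, `/`)

# Because length(v) equals nrow(a), v is recycled down each column.
by_recycling <- a / v

all.equal(by_sweep, by_recycling)
```

The index-column approach is still the more general one when the function is not a simple vectorised operator.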

How to write a function that uses the single output from another function as the starting point for new analysis?

I'm having trouble writing a function that calls another function and uses the output as the basis for running new analysis in a loop (or equivalent). For example, let's say function 1 creates this output: 10. The second function would take that as a starting point to run new analysis. The single data point from the second output would then be the basis for the next round of analysis, and so on.
Here's a simple example. The question is how to create a for loop for this. Or perhaps there's a more efficient way using lapply. In any case, the first function might be as follows:
f.1 <-function(x) {
x
a <-seq(x,by=1,length.out=5)
a.1 <-tail(a,1)
}
The second function, which calls the first function, could run as follows:
f.2 <-function(x) {
f.1 <-function(x) {
a <-seq(x,by=1,length.out=5)
a.1 <-tail(a,1)
}
z <-f.1(x)
y=z+1
seq(y,by=1,length.out=5)
}
How can I modify f.2() so that it re-runs that computation using the previous output as the basis for the next round of analysis? To be precise, f.1(10) outputs:
[1] 14
In turn, f.2(10) results in:
[1] 15 16 17 18 19
How can I re-write f.2() so that it automatically computes f.2(19) on the next iteration, and continually do so for several loops. In the process, I'd like to collect the outputs in a separate file for review. Thanks much!
The magrittr library (which is used most notably by dplyr) makes this type of chaining somewhat simple. First, define the functions,
f.1 <-function(x) {
x
a <- seq(x, by=1, length.out=5)
a.1 <- tail(a,1)
}
f.2 <-function(x) {
y <- x+1
seq(y, by=1, length.out=5)
}
then
library(magrittr)
f.1(10) %>% f.2
# [1] 15 16 17 18 19
As @BondedDust mentioned, you could use Reduce, although it normally expects to apply the same function over and over, so you just need to flip the most common use case:
Reduce(function(x,f) f(x), list(f.1, f.2), init=10)
# [1] 15 16 17 18 19
You can try this with two arguments for f.2. The first argument is the x value that you need to initialize x with and n is the number of iterations that you want to do. The output of the function will be a matrix containing n rows and 5 columns.
f.2 <-function(x, n) {
c <- matrix(nrow=n, ncol=5)
for (i in 1:nrow(c))
{
z <-f.1(x) ##if you have already defined your f.1(x) beforehand, there is no need to define it again in f.2. you can simply use z <- f.1(x) like it is done here
y=z+1
c[i,] = seq(y, by=1, length.out=5)
x = c[i,5]
}
return(c)
}
The output of
f <- f.2(10, 10) ##initialising x with 10 and running 10 loops
f
[,1] [,2] [,3] [,4] [,5]
[1,] 15 16 17 18 19
[2,] 24 25 26 27 28
[3,] 33 34 35 36 37
[4,] 42 43 44 45 46
[5,] 51 52 53 54 55
[6,] 60 61 62 63 64
[7,] 69 70 71 72 73
[8,] 78 79 80 81 82
[9,] 87 88 89 90 91
[10,] 96 97 98 99 100
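The same iteration can also be sketched without an explicit for loop by using Reduce with accumulate = TRUE. This assumes the one-argument f.2 from the question (redefined here in compact form), uses three rounds for brevity, and writes the result to a temporary CSV file as one possible way of keeping the outputs for review; n_rounds and out_file are names introduced for the example.

```r
f.1 <- function(x) tail(seq(x, by = 1, length.out = 5), 1)
f.2 <- function(x) seq(f.1(x) + 1, by = 1, length.out = 5)

n_rounds <- 3

# Each round starts from the last value of the previous round.
rounds <- Reduce(
  function(prev, i) f.2(tail(prev, 1)),
  seq_len(n_rounds),
  accumulate = TRUE,
  init = 10
)[-1]  # drop the initial value itself

out <- do.call(rbind, rounds)
out

# Keep the results in a file for later review.
out_file <- tempfile(fileext = ".csv")
write.csv(out, out_file, row.names = FALSE)
```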

Apply over all columns and rows of two different dataframes in R

I am trying to apply a function over all rows and columns of two dataframes, but I don't know how to solve it with apply.
I think the following script explains what I intend to do and the way I tried to solve it. Any advice would be warmly appreciated! Please note that simplefunction is only intended as an example to keep things simple.
# some data and a function
df1<-data.frame(name=c("aa","bb","cc","dd","ee"),a=sample(1:50,5),b=sample(1:50,5),c=sample(1:50,5))
df2<-data.frame(name=c("aa","bb","cc","dd","ee"),a=sample(1:50,5),b=sample(1:50,5),c=sample(1:50,5))
simplefunction<-function(a,b){a+b}
# apply on a single row
simplefunction(df1[1,2],df2[1,2])
# apply over all columns
apply(?)
## apply over all columns and rows
# create df to receive results
df3<-df2
# loop it
for (i in 2:5)df3[i]<-apply(?)
My first mapply answer!! For your simple example you have...
mapply( FUN = `+` , df1[,-1] , df2[,-1] )
# a b c
# [1,] 60 35 75
# [2,] 57 39 92
# [3,] 72 71 48
# [4,] 31 19 85
# [5,] 47 66 58
You can extend it like so...
mapply( FUN = function(x,y,z,etc){ simplefunctioncodehere} , df1[,-1] , df2[,-1] , ... other dataframes here )
The dataframes will be passed to the function in order, so in this example df1 would be x, df2 would be y, and z and so on would be any other dataframes that you specify, in that order. Hopefully that makes sense. Because a dataframe is a list of columns, mapply iterates over columns: it calls the function with the first column of every dataframe, then with the second column of every dataframe, and so on. Each call receives whole column vectors, so the function should be vectorised over the rows (as `+` is).
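A tiny example makes it easier to see what each call receives. The two dataframes below are made up for illustration and are not the ones from the question.

```r
df1 <- data.frame(name = c("aa", "bb"), a = c(1, 2), b = c(3, 4))
df2 <- data.frame(name = c("aa", "bb"), a = c(10, 20), b = c(30, 40))

# Each call gets one whole column from each dataframe.
arg_lengths <- mapply(function(x, y) length(x), df1[, -1], df2[, -1])
arg_lengths

# Adding the matching columns gives a matrix with one column per pair.
res <- mapply(function(x, y) x + y, df1[, -1], df2[, -1])
res
```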
You can also use Reduce:
set.seed(45) # for reproducibility
Reduce(function(x,y) { x + y}, list(df1[, -1], df2[,-1]))
# a b c
# 1 53 22 23
# 2 64 28 91
# 3 19 56 51
# 4 38 41 53
# 5 28 42 30
You can just do :
df1[,-1] + df2[,-1]
Which gives :
a b c
1 52 24 37
2 65 63 62
3 31 90 89
4 90 35 33
5 51 33 45
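If the result should keep the name column, as the df3 in the question suggests, the sum can be bound back onto it. This is a sketch assuming both dataframes have the same rows in the same order; the seed is arbitrary.

```r
set.seed(45)
df1 <- data.frame(name = c("aa", "bb", "cc", "dd", "ee"),
                  a = sample(1:50, 5), b = sample(1:50, 5), c = sample(1:50, 5))
df2 <- data.frame(name = c("aa", "bb", "cc", "dd", "ee"),
                  a = sample(1:50, 5), b = sample(1:50, 5), c = sample(1:50, 5))

# Add the numeric columns element-wise and keep `name` in front.
df3 <- cbind(df1["name"], df1[-1] + df2[-1])
df3
```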
