foreach: Keep names - r

Is there a way to make foreach() return a named list/data.frame? E.g.
foo <- list(a = 1, b = 2)
bar <- foreach (x = foo) %do% { x * 2 }
returns list(2, 4). I would want it to return list(a = 2, b = 4).
Plus, is there a way to access the name from within the loop body?
(I'm not interested in a solution which assigns the names after the foreach loop.)
Regards

I was using your solution until I needed to use a nested foreach (with the %:% operator). I came up with this:
foo <- list(a = 1, b = 2)
bar <- foreach (x = foo, .final = function(x) setNames(x, names(foo))) %do% {
x * 2
}
The trick is to use the .final argument (a function that is applied once to the final result) to set the names. This is nicer because it doesn't use a temporary variable and is cleaner overall. It also works with nested lists, so you can preserve the names across several layers of the structure.
Note that this only works correctly if foreach is called with .inorder = TRUE (which is the default).
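The .final approach can be combined with a second iteration variable so that the name is also available inside the loop body. This is a minimal sketch, assuming the foreach package is installed; the variable n and the message() call are only illustrative.

```r
library(foreach)

foo <- list(a = 1, b = 2)

# x iterates over the values and n over the names, in lockstep
bar <- foreach(x = foo, n = names(foo),
               .final = function(res) setNames(res, names(foo))) %do% {
  message("Processing ", n)  # the element name is visible in the body
  x * 2
}

str(bar)
```

The names are still attached by .final; n is only there so the body can see which element it is working on.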

Thanks for your recommendations. This is what I came up with:
foo <- list(a = 1, b = 2)
bar <- foreach (x = foo, n = names(foo), .combine = c) %do% {
rv <- list()
rv[[n]] <- x * 2
rv
}

I used to work with foreach, but having switched to purrr I am not looking back.
purrr provides a family of mapping functions which make working with named lists (and lists, and data.frames...) a complete breeze. All mapping operations preserve names.
library("purrr")
foo <- list(a = 1, b = 2)
bar <- map(foo, ~ .x * 2)
# $a
# [1] 2
# $b
# [1] 4
Concise and clear, you can't really beat that.
Plus, is there a way to access the name from within the loop body?
There sure is: ?imap
bar <- imap(foo, function(elem, name) {
print(paste("Processing", name))
elem * 2
})
Or anonymous style:
bar <- imap(foo, ~ {print(paste("Processing", .y)) ; .x * 2})
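For comparison, roughly the same thing can be done without any package. This is a sketch in base R, not part of the purrr answer: Map takes its result names from its first list argument, and passing names(foo) as a second argument makes the name available in the body.

```r
foo <- list(a = 1, b = 2)

# Map() keeps the names of its first list argument;
# names(foo) is passed alongside so the body can see each name
bar <- Map(function(elem, name) {
  message("Processing ", name)
  elem * 2
}, foo, names(foo))

str(bar)
```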

Related

Return a named list with various elements from function call

Question
I have a function like this:
myfunc <- function(x){
a1 = 1
a2 = c(2,4)
a3 = data.frame(x = 1:10)
...
an = 'str'
res = list(a1 = a1,a2 = a2,..., an=an)
return(res)
}
As we can see, I return my results as a named list. However, if the number of elements is large, I don't want to type a_i = a_i one by one. I use the code snippet below to save about half of the typing (but I still need to put quotes around the element names, which is a waste of time):
res_short = sapply(c('a1','a2',...,'an'),FUN = function(x){list(get(x))})
return(res_short)
Note that there may not be any pattern in my element names; I only use a1, a2, ..., an to keep the example simple.
I think returning a named list is good, since a list can store elements of different types. Is there any other way to write my function's return value? I want it to be clear and time-saving!
mget Use mget as shown below. To return all variables use mget(ls()), or to return all variables except x use mget(setdiff(ls(), "x")). ls will not return object names that begin with a dot unless the all.names argument is used, i.e. ls(all.names = TRUE), so giving a variable a name with a leading dot is one way to keep it from being returned. Another possibility is to use the mode= argument of mget to restrict the objects returned to ones that are numeric, say. See ?mget. Yet another approach to restrict the objects returned is to use Filter on the result of mget. For example, res <- Filter(is.data.frame, mget(ls())) only returns data frames.
myfunc <- function(x){
a1 = 1
a2 = c(2,4)
a3 = data.frame(x = 1:10)
an = 'str'
res = mget(ls(pattern = "^a"))
return(res)
}
myfunc(3) # test
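The variations mentioned above (excluding the argument with setdiff, hiding a helper behind a leading dot, and filtering by type) can be sketched together as follows. The function name myfunc_all and the variables label and .scratch are made up for illustration.

```r
myfunc_all <- function(x) {
  a1 <- 1
  a3 <- data.frame(x = 1:10)
  label <- "str"
  .scratch <- x * 2  # leading dot: not listed by ls(), so not returned

  # every local variable except the argument x
  everything <- mget(setdiff(ls(), "x"))

  list(everything = everything,
       only_df = Filter(is.data.frame, everything))
}

out <- myfunc_all(3)
names(out$everything)
names(out$only_df)
```

Because ls() sorts its result, the returned list is in alphabetical order of the names rather than in order of creation.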
environment Another possibility is to return the environment within the executing function. All objects in the function (not just the ones beginning with a) will be in the environment.
myfunc2 <- function(x) {
a1 = 1
a2 = c(2,4)
a3 = data.frame(x = 1:10)
an = 'str'
res = environment()
return(res)
}
out <- myfunc2(3) # test
out$a1
within Another possibility is to use within. Only variables created in the within will be returned. x is used in the within but not created in the within so it is not returned.
myfunc3 <- function(x) {
res <- within(list(), {
a1 <- x
a2 <- BOD
})
return(res)
}
myfunc3(3) # test
Multiple ls Perform an ls() before and after the section creating the variables to be output and then mget the difference.
myfunc4 <- function(x) {
.excl <- ls()
a1 <- x
a2 <- BOD
res <- mget(setdiff(ls(), .excl))
return(res)
}
myfunc4(3) # test
If I understand it correctly, your requirements are very flexible. You have a bunch of variables with names that have no pattern. You want to apply a different computation for each variable. Well, you realize that you do need to type everything in at least once. One approach is to have a list of all possible variable names and their computations. You can then apply all of them, or a subset to your input. Here is an example for 3 names with 3 different computations.
mycomputer = list(
add5 = function(x) {
x + 5
},
mymean = function(x) {
mean(x)
},
square = function(x) {
x*x
}
)
computeall = function(x) {
result = lapply(names(mycomputer), function(f) {
mycomputer[[f]](x)
})
names(result) = names(mycomputer)
result
}
computeall(c(1,2,3))
## $add5
## [1] 6 7 8
##
## $mymean
## [1] 2
##
## $square
## [1] 1 4 9
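Since lapply keeps the names of the list it iterates over, the same idea can be written a little more compactly by looping over the functions themselves. This is a sketch of that variant; computeall2 and its keep argument are hypothetical names added here to show how a subset of the computations could be selected.

```r
mycomputer <- list(
  add5   = function(x) x + 5,
  mymean = function(x) mean(x),
  square = function(x) x * x
)

# lapply() over a named list returns a list with the same names
computeall2 <- function(x, keep = names(mycomputer)) {
  lapply(mycomputer[keep], function(f) f(x))
}

res_all  <- computeall2(c(1, 2, 3))
res_some <- computeall2(c(1, 2, 3), keep = c("add5", "square"))
str(res_some)
```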

iterate / compose function onto itself n times

Suppose I have an arbitrary function
foo = function(a,b) {a+b}
How can I iterate this function onto itself n times?
foo(foo(foo(foo(x, 1), 2), 3), 4)
I am looking at purrr::compose but it doesn't look hopeful for arbitrary n. purrr::reduce feels like it will come into play also... but I'm stumped.
Here is a pure purrr version that is really functional. As you said, reduce comes in handy here: compose is just a function and functions are just values, so you can reduce a list of functions by composing them. To fill in just one argument, use partial.
foo_n <- reduce(map(1:n, ~partial(foo, b=.x)), compose)
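The same fold can also be written directly over the values of b, without building intermediate functions. This is a sketch in base R using Reduce with an initial value; iterate_foo and start are illustrative names. One point to keep in mind about the one-liner above: as far as I can tell, purrr's compose applies functions right to left by default, so it calls foo with b = n first. That makes no difference for addition, but it would for a non-commutative foo, whereas the version below applies b = 1, 2, ..., n in that order.

```r
foo <- function(a, b) a + b

# Left fold: foo(foo(foo(foo(start, 1), 2), 3), 4) for n = 4
iterate_foo <- function(start, n) {
  Reduce(f = foo, x = seq_len(n), init = start)
}

iterate_foo(1, 4)
# [1] 11

# accumulate = TRUE keeps the intermediate results as well
Reduce(f = foo, x = seq_len(4), init = 1, accumulate = TRUE)
# [1]  1  2  4  7 11
```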
You can also just append results of each foo(a,b) function into a numeric vector and then pick up the last result.
Let x = 1 and let b take the values in 1:4:
x = 1
n = 4
out = vector("numeric")
steps = seq(1, 4, by = 1)
for( b in steps){
## initial value
if (length(out) == 0){
out = append(out, values = foo(x, b) )
}else{
out = append(out, values = foo( tail( out, 1), b) )
}
}
tail(out, 1)

List objects as function arguments with overridable list element defaults

I have an R function which takes a large number of arguments (18) which I would like to pass in as a list. When I am running this function by hand, so to speak, I generally want to use all the defaults but one or two, but I also want to run this same function many times with various combinations of default and non-default items, which I would like to assemble programmatically as lists.
I know that I could just have my 18+ arguments as individual formals and then assemble them into a list inside the function, but I wish I could have a list as a default for a formal, and then have the elements have defaults as well. Like this:
> f <<- function(x, y = list(a=0, b=3)) {with(y, (x + a + b))}
> f(1)
[1] 4
> f(x=1, y$a = 1)
Error: unexpected '=' in "f(x=1, y$a ="
(or alternatively)
In y$a <- 1 :
Error in eval(substitute(expr), data, enclos = parent.frame()) :
object 'a' not found
except with the output of 5 rather than an error. I suspect there is no way to do this, because R does not recognise the assignments in the list as creating defaults, but only as creating named elements. But maybe with the assignment form of formals? or through some clever use of do.call?
Here are some alternatives:
1) modifyList Use modifyList to process the defaults.
f1 <- function(x, y = list()) {
defaults <- list(a = 0, b = 3)
with(modifyList(defaults, y), {
x + a + b
})
}
f1(x = 1)
## [1] 4
f1(x = 1, y = list(a = 1))
## [1] 5
2) do.call Another possibility is to have two functions. The first does not use a list, and the second (which is the one the user calls) does, using do.call to invoke the first. Wrapping x in list() keeps it as a single argument even when x is a vector of length greater than one.
f2impl <- function(x, a = 0, b = 3) x + a + b
f2 <- function(x, y = list()) do.call("f2impl", c(list(x), y))
f2(x = 1)
## [1] 4
f2(x = 1, y = list(a = 1))
## [1] 5
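The question also mentions running the function many times with argument lists assembled programmatically. With the modifyList approach, that could look like the sketch below; the overrides list is made-up example data.

```r
f1 <- function(x, y = list()) {
  defaults <- list(a = 0, b = 3)
  with(modifyList(defaults, y), x + a + b)
}

# Each element overrides some (or none) of the defaults
overrides <- list(
  list(),              # all defaults
  list(a = 1),         # override a only
  list(a = 1, b = 0)   # override both
)

results <- vapply(overrides, function(o) f1(x = 1, y = o), numeric(1))
results
# [1] 4 5 2
```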

How to return elements of R list as independent objects in function environment?

Similar to this question:
Return elements of list as independent objects in global environment
I cannot seem to adapt the answer to assign the list elements when list2env is called inside a function:
E.g.
lst <- list(a = c(1, 2), b = c(3, 4))
tmp_fn <- function(lst) {
# do computations on list elements
# first assign each to the function environment
list2env(lst, parent = parent.frame()) # fails
# do stuff
...
}
I thought the parent = parent.frame() would work, but while debugging tmp_fn I only see that lst gets assigned to the function environment as a list, not the individual variables a and b.
1) Use envir= here rather than parent= like this. Also, as shown, you may wish to add envir as an argument for flexibility:
lst <- list(a = c(1, 2), b = c(3, 4))
tmp_fn <- function(lst, envir = parent.frame()) {
invisible(list2env(lst, envir = envir))
}
tmp_fn(lst)
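To see where the variables end up, here is a small sketch. With the default envir = parent.frame(), tmp_fn creates a and b in whichever environment called it. If the variables are wanted inside the function that does the computation, as in the question, passing envir = environment() to list2env is one way to do it. The functions caller and tmp_fn_local are hypothetical names for illustration.

```r
lst <- list(a = c(1, 2), b = c(3, 4))

# Assigns into the caller's environment by default
tmp_fn <- function(lst, envir = parent.frame()) {
  invisible(list2env(lst, envir = envir))
}

caller <- function() {
  tmp_fn(lst)  # a and b now exist inside caller()
  a + b
}
caller()
# [1] 4 6

# Assigns into the function's own environment instead
tmp_fn_local <- function(lst) {
  list2env(lst, envir = environment())
  a * b
}
tmp_fn_local(lst)
# [1] 3 8
```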
2) Another possibility is to use list[...]<- from the gsubfn package (development version):
devtools::install_github("ggrothendieck/gsubfn")
library(gsubfn)
func <- function(lst) lst
list[a, b] <- func(lst)
Now a and b will be in the current environment.

Apply family of functions for functions with multiple arguments

I would like to use a function from the apply family (in R) to apply a function of two arguments to two matrices. I assume this is possible. Am I correct? Otherwise, it would seem that I have to put the two matrices into one, and redefine my function in terms of the new matrix.
Here's an example of what I'd like to do:
a <- matrix(1:6,nrow = 3,ncol = 2)
b <- matrix(7:12,nrow = 3,ncol = 2)
foo <- function(vec1,vec2){
d <- sample(vec1,1)
f <- sample(vec2,1)
result <- c(d,f)
return(result)
}
I would like to apply foo to a and b.
(Strictly answering the question, not pointing you to a better approach for your particular use here....)
mapply is the function from the *apply family of functions for applying a function while looping through multiple arguments.
So what you want to do here is turn each of your matrices into a list of vectors that hold its rows or columns (you did not specify which). There are many ways to do that; I like to use the following function:
split.array.along <- function(X, MARGIN) {
require(abind)
lapply(seq_len(dim(X)[MARGIN]), asub, x = X, dims = MARGIN)
}
Then all you have to do is run:
mapply(foo, split.array.along(a, 1),
split.array.along(b, 1))
Like sapply, mapply tries to put your output into an array if possible. If instead you prefer the output to be a list, add SIMPLIFY = FALSE to the mapply call, or equivalently, use the Map function:
Map(foo, split.array.along(a, 1),
split.array.along(b, 1))
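If the abind package is not available, the rows of a matrix can also be split with base R. This is a sketch that assumes rows are wanted; split(a, row(a)) groups the entries of a by their row index. The result of foo is random, so no output is shown.

```r
a <- matrix(1:6, nrow = 3, ncol = 2)
b <- matrix(7:12, nrow = 3, ncol = 2)

foo <- function(vec1, vec2) {
  c(sample(vec1, 1), sample(vec2, 1))
}

# One list element per row: c(1, 4), c(2, 5), c(3, 6) for a
rows_a <- split(a, row(a))
rows_b <- split(b, row(b))

res <- Map(foo, rows_a, rows_b)
str(res)
```

Using col(a) in place of row(a) would split by columns instead.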
You could adjust foo to take one argument (a single matrix), and use apply in the function body.
Then you can use lapply on foo to sample from each column of each matrix.
> a <- matrix(1:6,nrow = 3,ncol = 2)
> b <- matrix(7:12,nrow = 3,ncol = 2)
> foo <- function(x){
apply(x, 2, function(z) sample(z, 1))
}
> lapply(list(a, b), foo)
## [[1]]
## [1] 1 6
## [[2]]
## [1] 8 12
