I have a problem with an ellipsis use case. My function accepts a list of objects; let's call them objects of class "X". Inside my function these objects are converted to class "Xs", so I end up with a list of "Xs" objects. A function that I import from another package can process multiple "Xs" objects at once, but they have to be passed as separate arguments (through the ellipsis, ...), not as a list. Is there a way to do this? I want something like this:
examplefun <- function(charlist){
nums <- lapply(charlist, as.numeric)
sum(... = nums)
}
Of course, the example above throws an error, but it shows what I want to achieve. I tried unlist with recursive = FALSE ("X" and "Xs" are themselves lists), but it does not work.
If there is no solution, then:
Let's assume I decided to accept ... instead of a list of "X" objects. Can I modify the ellipsis elements (change them to "Xs") and then pass them to a function that accepts an ellipsis? It would look like this:
examplefun2 <- function(...){
# (pseudocode) a function that modifies the objects in ... into "Xs" objects
sum(...)
}
In your first function, just call sum directly because sum works correctly on vectors of numbers instead of individual numbers.
examplefun <- function (charlist) {
nums <- vapply(charlist, as.numeric, numeric(1L))
sum(nums)
}
(Note the use of vapply instead of lapply: sum expects an atomic vector, we can’t pass a list.)
In your second function, you can capture ... and work with the captured variable:
examplefun2 <- function (...) {
nums <- as.numeric(c(...))
sum(nums)
}
For more complex arguments, Roland’s comment is a good alternative: Modify the function arguments as a list, and pass it to do.call.
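The do.call approach mentioned above could look roughly like the sketch below. The function names examplefun_docall and examplefun2_docall are made up for illustration, and as.numeric stands in for whatever conversion produces the "Xs" objects; sum stands in for the imported function that expects its inputs as separate arguments.

```r
# Hypothetical variant of examplefun: convert each element,
# then splice the resulting list into the call as separate arguments.
examplefun_docall <- function(charlist) {
  nums <- lapply(charlist, as.numeric)  # stand-in for the "X" -> "Xs" conversion
  do.call(sum, nums)                    # same as sum(nums[[1]], nums[[2]], ...)
}

# Hypothetical variant of examplefun2: capture ..., modify, pass on.
examplefun2_docall <- function(...) {
  nums <- lapply(list(...), as.numeric)
  do.call(sum, nums)
}

res1 <- examplefun_docall(list("1", "2", "3"))
res2 <- examplefun2_docall("1", "2", "3")
```

Both calls should give 6. Unlike the vapply version, this also works when the converted objects are not atomic, since do.call passes each list element on unchanged.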
When working with packages like openxlsx, I often find myself writing repetitive code, such as defining the wb and sheet arguments with the same values.
To respect the DRY principle, I would like to define one variable that contains multiple arguments. Then, when I call a function, I should be able to provide said variable to define multiple arguments.
Example:
foo <- list(a=1,b=2,c=3)
bar <- function(a,b,c,d) {
return(a+b+c+d)
}
bar(foo, d=4) # should return 10
How should the foo() function be defined to achieve this?
Apparently you are just looking for do.call, which allows you to create and evaluate a call from a function and a list of arguments.
do.call(bar, c(foo, d = 4))
#[1] 10
How should the foo() function be defined to achieve this?
You've got it slightly backwards. Rather than trying to wrangle the output of foo into something that bar can accept, write foo so that it takes input in a form that is convenient to you. That is, create a wrapper function that provides all the boilerplate arguments that bar requires, without you having to specify them manually.
Example:
bar <- function(a, b, c, d) {
return(a+b+c+d)
}
call_bar <- function(d=4) {
bar(1, 2, 3, d)
}
call_bar(42) # shorter than writing bar(1, 2, 3, 42)
I discovered a solution using rlang::exec.
First, we must have a function to structure the dots:
getDots <- function(...) {
out <- sapply(as.list(match.call())[-1], function(x) eval(parse(text=deparse(x))))
return(out)
}
Then we must have a function that executes our chosen function, feeding in our static parameters as a list (a, b, and c), in addition to d. (The %>% pipe used below comes from the magrittr package, which must be loaded.)
execute <- function(FUN, ...) {
dots <-
getDots(...) %>%
rlang::flatten()
out <- rlang::exec(FUN, !!!dots)
return(out)
}
Then calling execute(bar, abc, d=4), where abc is the list of static arguments (the list called foo in the question), returns 10, as it should.
Alternatively, we can write bar %>% execute(abc, d=4).
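For comparison, the same idea can be sketched in base R without rlang or a pipe. The name execute_base is hypothetical, and the sketch assumes that list arguments should be spliced exactly one level deep while all other arguments are passed through unchanged.

```r
bar <- function(a, b, c, d) {
  a + b + c + d
}

# Hypothetical base R counterpart of execute(): list arguments are
# spliced into the call, everything else is passed on as-is.
execute_base <- function(FUN, ...) {
  args <- list(...)
  pieces <- lapply(seq_along(args), function(i) {
    if (is.list(args[[i]])) args[[i]] else args[i]  # args[i] keeps the name
  })
  do.call(FUN, do.call(c, pieces))
}

abc <- list(a = 1, b = 2, c = 3)
res <- execute_base(bar, abc, d = 4)
```

Here res should be 10, the same value as do.call(bar, c(abc, d = 4)).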
Let me give you an example!
How to get two or more return values from a function
Method 1: Use global variables. If the function modifies several global variables, the caller sees the new values afterwards, which is equivalent to returning multiple values.
Method 2: Pass an array name as the parameter. Changes the function makes to the array's contents (sorting, adding, subtracting and so on) are still visible to the caller after the function returns, so this also returns a set of values.
Method 3: Use pointer variables. The principle is the same as in Method 2, because the array name itself is the address of the first element of the array.
Method 4: If you have learned C++, you can use reference parameters.
You can try these four methods. The problem seemed similar to yours, so I hope this helps!
Here is a simple example of a closure which is a function returning a function with embedded data (After http://adv-r.had.co.nz/Functional-programming.html#closures):
fFactory <- function(letter) {
function(Param) {
paste("Enclosed variable:", letter, "/ function parameter:", Param)
}
}
When the function is created, letter is used in the returned function:
> FUN <- fFactory("a")
> FUN("toto")
[1] "Enclosed variable: a / function parameter: toto"
It works because the variable letter is embedded in the environment of the function:
as.list(environment(FUN))
$letter
[1] "a"
If now we create functions in a list like this:
l <- list()
for(letter in letters) {
l[[letter]]$FUN <- fFactory(letter)
}
Running the function for the item "a" should return the same result as before, but that is not the case:
> l[["a"]]$FUN("toto")
[1] "Enclosed variable: z / function parameter: toto"
Obviously because the environment embedded in the function is not the one we expected:
> as.list(environment(l[["a"]]$FUN))
$letter
[1] "z"
Every closure in the list reports the value from the last iteration of the loop.
I assume that I did not misuse the R language here and that this is a bug in the language. Can anyone confirm that, or explain where my mistake is?
Force the evaluation of argument letter with, well, force().
fFactory2 <- function(letter) {
force(letter)
function(Param) {
paste("Enclosed variable:", letter, "/ function parameter:", Param)
}
}
l2 <- list()
for(letter in letters) {
l2[[letter]]$FUN <- fFactory2(letter)
}
l2[["a"]]$FUN("toto")
l2[["b"]]$FUN("toto")
l2[["w"]]$FUN("toto")
Here's an explanation (from @user2554330's answer):
In R, arguments to functions aren't evaluated until first used. So the arguments to all of the functions in your list are the global variable letter, which you change in the loop as you create them, but you never evaluate until you call them. So the functions first evaluate letter at the time of the first call, and you get strange results.
This is your error. @RuiBarradas gives you the fix. Here's an explanation:
In R, arguments to functions aren't evaluated until first used. So the arguments to all of the functions in your list are the global variable letter, which you change in the loop as you create them, but you never evaluate until you call them. So the functions first evaluate letter at the time of the first call, and you get strange results.
You can fix this problem in the way Rui said: force the argument to be evaluated before you create the function.
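The lazy evaluation described here can be seen in isolation with a small sketch. The names make_adder, make_adder_forced and k are invented for this illustration and are not part of the question.

```r
# The argument n is not evaluated when make_adder() is called,
# only when the returned function first uses it.
make_adder <- function(n) {
  function(x) x + n
}

# Same factory, but the argument is evaluated immediately.
make_adder_forced <- function(n) {
  force(n)
  function(x) x + n
}

k <- 1
lazy_add   <- make_adder(k)
forced_add <- make_adder_forced(k)
k <- 10                      # changed before the first call

lazy_result   <- lazy_add(1)    # n is evaluated now, when k is already 10
forced_result <- forced_add(1)  # n was fixed at 1 when the closure was created
```

lazy_result should be 11 and forced_result should be 2. This is the same effect as in the loop over letters: the unforced closures only look up letter when they are first called, by which time the loop has finished.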
This code is about inverting an index using clusters.
Unfortunately I do not understand the line with recognize<-...
I know that the function Vectorize applies the inner function element-wise, but I do not understand the inner function here.
The parameters (uniq, text) are not defined, so how can we apply which? Also, why is there a "uniq" as a string right after?
library(parallel)  # provides makeCluster() and parLapply()
slots <- as.integer(Sys.getenv("NSLOTS"))
cl <- makeCluster(slots, type = "PSOCK")
inverted_index4<-function(x){
y <- unique(x)
recognize <- Vectorize(function(uniq,text) which(text %in% uniq),"uniq",SIMPLIFY = F)
y2 <- parLapply(cl, y, recognize, x)
unlist(y2,recursive=FALSE)
}
The Vectorize() function just creates a new, element-wise (vectorized) version of the custom function function(uniq,text) which(text %in% uniq).
The "uniq" string names the argument of that function that you want to iterate over. You can then pass a vector of length greater than one for uniq and get back a list with one element for each element of uniq, holding the output of the function for that element.
I would suggest the author make the code a little clearer, better commented, etc. The Vectorize call doesn't necessarily need to be inside the function.
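Note: parLapply() comes from the parallel package. It calls recognize on each element of y, which becomes uniq, and passes any further arguments on to the function. The x in the call is therefore matched to the second argument, text, so text does not have to be defined anywhere else, such as in the global environment (.GlobalEnv).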
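To see what the vectorized function computes, here is a small sequential sketch that leaves out the cluster entirely. The input vector x is made up for the example, and recognize is called directly instead of through parLapply.

```r
x <- c("a", "b", "a", "c", "b")  # made-up input

# Same definition as in the question: iterate over uniq, keep text fixed.
recognize <- Vectorize(function(uniq, text) which(text %in% uniq),
                       "uniq", SIMPLIFY = FALSE)

# One call handles every unique value; x is matched to the text argument.
index <- recognize(unique(x), x)

positions_a <- index[[1]]  # positions of "a" in x
positions_b <- index[[2]]  # positions of "b" in x
positions_c <- index[[3]]  # positions of "c" in x
```

The result is an inverted index: for each unique value, the positions at which it occurs in x (here 1 and 3 for "a", 2 and 5 for "b", 4 for "c").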
If I want to create a named list, where I have named literals, I can just do this:
list(foo=1,bar=2,baz=3)
If instead I want to make a list with arbitrary computation, I can use lapply, so for example:
lapply(list(1,2,3), function(x) x)
However, the list generated by lapply will always be a regular numbered list. Is there a way to generate a named list using a function like lapply?
My idea is something along the lines of:
lapply(list("foo","bar","baz"), function(key) {key=5})
==>
list(foo=5,bar=5,baz=5)
That way I don't have to have the keys and values as literals.
I do know that I could do this:
res = list()
for(key in list("foo","bar","baz")) {
res[key] <- 5;
}
But I don't like having to create an empty list and mutate it to fill it in.
Edit: I would also like to do some computation based on the key. Something like this:
lapply(c("foo","bar","baz"), function(key) {paste("hello",key)=5})
sapply will use its argument for names if it is a character vector, so you can try:
sapply(c("foo","bar","baz"), function(key) 5, simplify=F)
Which produces:
$foo
[1] 5
$bar
[1] 5
$baz
[1] 5
If your list has names in the first place, lapply will preserve them:
lapply(list(a=1,b=2,c=3), function(x) x)
or you can set names before or after with setNames()
#before
lapply(setNames(list(1,2,3),c("foo","bar","baz")), function(x) x)
#after
setNames(lapply(list(1,2,3), function(x) x), c("foo","bar","baz"))
One other "option" is Map(). Map will try to take the names from the first parameter you pass in. You can ignore the value in the function and use it only for the side-effect of keeping the name
Map(function(a,b) 5, c("foo","bar","baz"), list(1:3))
But names cannot be changed during lapply/Map steps. They can only be copied from another location. If you need to mutate names, you'll have to do that as a separate step.
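For the case in the question's edit, where both the name and the value depend on the key, the separate naming step could be sketched as follows. The choice of nchar as the computation is only an illustration.

```r
keys <- c("foo", "bar", "baz")

# Compute the values with lapply, then attach computed names in a second step.
values <- lapply(keys, function(key) nchar(key))   # illustrative computation
res <- setNames(values, paste("hello", keys))

res_names <- names(res)
```

This should give a list with the names "hello foo", "hello bar" and "hello baz", each holding the value 3.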
I have a question regarding passing multiple arguments to a function, when using lapply in R.
When I use lapply with the syntax lapply(input, myfun), it is easy to understand, and I can define myfun like this:
myfun <- function(x) {
# doing something here with x
}
lapply(input, myfun);
and elements of input are passed as x argument to myfun.
But what if I need to pass some more arguments to myfun? For example, it is defined like this:
myfun <- function(x, arg1) {
# doing something here with x and arg1
}
How can I use this function with passing both input elements (as x argument) and some other argument?
If you look up the help page, one of the arguments to lapply is the mysterious .... When we look at the Arguments section of the help page, we find the following line:
...: optional arguments to ‘FUN’.
So all you have to do is include your other argument in the lapply call as an argument, like so:
lapply(input, myfun, arg1=6)
and lapply, recognizing that arg1 is not an argument it knows what to do with, will automatically pass it on to myfun. All the other apply functions can do the same thing.
An addendum: You can use ... when you're writing your own functions, too. For example, say you write a function that calls plot at some point, and you want to be able to change the plot parameters from your function call. You could include each parameter as an argument in your function, but that's annoying. Instead you can use ... (as an argument to both your function and the call to plot within it), and have any argument that your function doesn't recognize be automatically passed on to plot.
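A minimal sketch of forwarding ... from your own function is shown here with mean instead of plot, so that the result is easy to check. The wrapper name centre is made up for the example.

```r
# Hypothetical wrapper: anything it does not recognise goes on to mean().
centre <- function(x, ...) {
  x - mean(x, ...)
}

values <- c(1, 2, NA, 3)

# na.rm is not an argument of centre(); it is passed through ... to mean().
centred <- centre(values, na.rm = TRUE)
```

With na.rm = TRUE the mean is 2, so centred should be -1, 0, NA, 1. Without it, mean returns NA and every element of the result is NA.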
As suggested by Alan, the function mapply applies a function to multiple list or vector arguments:
mapply(myfun, arg1, arg2)
See man page:
https://stat.ethz.ch/R-manual/R-devel/library/base/html/mapply.html
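As a small illustration of the mapply call above, with made-up inputs and a made-up two-argument function:

```r
myfun <- function(x, y) x + y   # illustrative function

arg1 <- 1:3
arg2 <- c(10, 20, 30)

# Calls myfun(arg1[1], arg2[1]), myfun(arg1[2], arg2[2]), ...
paired <- mapply(myfun, arg1, arg2)
```

paired should be the vector 11, 22, 33. By default mapply simplifies the result to a vector where it can; pass SIMPLIFY = FALSE to always get a list.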
You can do it in the following way:
myfxn <- function(var1,var2,var3){
var1*var2*var3
}
lapply(1:3,myfxn,var2=2,var3=100)
and you will get the answer:
[[1]]
[1] 200
[[2]]
[1] 400
[[3]]
[1] 600
myfun <- function(x, arg1) {
# doing something here with x and arg1
}
x is a vector or a list and myfun in lapply(x, myfun) is called for each element of x separately.
Option 1
If you'd like to use whole arg1 in each myfun call (myfun(x[1], arg1), myfun(x[2], arg1) etc.), use lapply(x, myfun, arg1) (as stated above).
Option 2
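If, however, you'd like to call myfun on each element of arg1 alongside the corresponding element of x (myfun(x[1], arg1[1]), myfun(x[2], arg1[2]) etc.), lapply alone cannot do it. Instead, use mapply(myfun, x, arg1) (as stated above) or apply. Note that apply passes each row (or column) to the function as a single vector, so the two values have to be unpacked in a small wrapper:
apply(cbind(x,arg1), 1, function(v) myfun(v[1], v[2]))
or
apply(rbind(x,arg1), 2, function(v) myfun(v[1], v[2]))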
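The difference between the two options can be checked with a small sketch. The inputs and the body of myfun are made up so that the two results are easy to tell apart.

```r
myfun <- function(x, arg1) x + sum(arg1)   # illustrative body

x    <- 1:3
arg1 <- c(10, 20, 30)

# Option 1: the whole of arg1 is used in every call.
opt1 <- lapply(x, myfun, arg1)

# Option 2: x and arg1 are walked through in parallel.
opt2 <- mapply(myfun, x, arg1)
```

opt1 should be a list holding 61, 62 and 63 (each element of x plus the sum of all of arg1), while opt2 should be the vector 11, 22, 33.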