I want to use ellipsis (...) arguments inside the map function of the purrr package. This is a toy example:
f1<-function(x,a=NA,b=NA,prs=seq(0, 1, 0.25),SW=T){
if(SW){
res<-data.frame(name1=a,name2=b,t(quantile(x, prs, na.rm = T)), mean=mean(x, na.rm = T), sd=sd(x, na.rm = T),
NAs=length(x[is.na(x)]),n=length(x[!is.na(x)]),SWp=shapiro.test(x)$p.value,stringsAsFactors =F)
}else
{
res<-data.frame(name1=a,name2=b,t(quantile(x, prs, na.rm = T)), mean=mean(x, na.rm = T), sd=sd(x, na.rm = T),
NAs=length(x[is.na(x)]),n=length(x[!is.na(x)]),stringsAsFactors =F)
}
return(res)
}
f1(c(NA,rnorm(25),NA),SW=F)
f1(c(NA,rnorm(25),NA))
Now I want to use f1 inside another function, f2:
f2<-function(df,...){
res<-map_df(colnames(df),~f1(df[,.],a=.,...))
return(res)
}
Here ... is intended mainly to set the SW, a, or b parameters of f1. However, f2 is not doing what I want, as can be seen here:
f2(iris[,-5])
f2(iris[,-5],SW=F)
I would appreciate any guidance on how to use ... properly inside map.
You just need to pass the ellipses through the map_df() call as well. Otherwise they can't get into the inner f1() call.
f2 <- function(df, ...){
res <- map_df(colnames(df), ~f1(df[,.], a=., ...), ...)
return(res)
}
You can also capture the ellipses early on in your second function, and use do.call to add them to your first function later on. This makes it more explicit where and how they are used.
f2 <- function(df, ...){
params <- list(...)
res <- map_df(colnames(df), ~ do.call(
f1, c(list(x = df[,.], a=.), params)))
return(res)
}
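As a rough illustration of the capture-then-do.call pattern, here is a base R sketch that uses lapply instead of map_df so it runs without purrr. The names col_summary and summarise_cols are made up for this example, and col_summary is only a simplified stand-in for f1.

```r
# Hypothetical, simplified stand-in for f1: summarises one numeric vector.
col_summary <- function(x, a = NA, digits = 2, na.rm = TRUE) {
  data.frame(name = a,
             mean = round(mean(x, na.rm = na.rm), digits),
             stringsAsFactors = FALSE)
}

# Capture ... once, then splice it into every inner call with do.call().
summarise_cols <- function(df, ...) {
  params <- list(...)
  rows <- lapply(colnames(df), function(nm) {
    do.call(col_summary, c(list(x = df[[nm]], a = nm), params))
  })
  do.call(rbind, rows)
}

out <- summarise_cols(iris[, -5], digits = 1)
print(out)
```

Because params is an ordinary list, it can be inspected or modified (for example, to drop an argument the inner function does not accept) before it is forwarded.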
MrFlick's solution did not work for me. I think you do also need to pass ... to the anonymous function, which requires using function(x, ...) instead of ~ (as suggested by @dmi3kno).
That means you need the quite surprising triple ... call:
map(x, function(x, ...) mean(x, trim=0, ...), ...)
Example:
library(purrr)
x <- list(c(1,2), c(1,2,NA))
fo1 <- function(...) map(x, ~mean(., trim=0, ...), ...)
fo2 <- function(...) map(x, function(x, ...) mean(x, trim=0, ...), ...)
fo1()
#> Warning in if (na.rm) x <- x[!is.na(x)]: the condition has length > 1 and only
#> the first element will be used
#> Warning in if (na.rm) x <- x[!is.na(x)]: the condition has length > 1 and only
#> the first element will be used
#> [[1]]
#> [1] 1.5
#>
#> [[2]]
#> [1] 1.5
fo2()
#> [[1]]
#> [1] 1.5
#>
#> [[2]]
#> [1] NA
fo2(na.rm=TRUE)
#> [[1]]
#> [1] 1.5
#>
#> [[2]]
#> [1] 1.5
Created on 2020-11-16 by the reprex package (v0.3.0)
For this issue, I've found that rlang::exec() allows you to pass ... to purrr::map() when combined with an anonymous function, like this:
f2 <- function(df, ...){
res <- map(colnames(df), function(x) rlang::exec("f1", df[,x], ...))
return(res)
}
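One possible reason this works: R looks up ... lexically, so an ordinary function(x) defined inside the outer function can see the outer function's dots without them being passed through the mapping call. Below is a minimal base R sketch of that behaviour using lapply; mean_each is a made-up name, and I am assuming map treats a plain function(x) closure the same way.

```r
# The inner function has no ... of its own, so the ... it refers to
# is the one belonging to mean_each (lexical scoping).
mean_each <- function(xs, ...) {
  lapply(xs, function(x) mean(x, ...))
}

x <- list(c(1, 2), c(1, 2, NA))
res_default <- mean_each(x)               # second element is NA
res_narm    <- mean_each(x, na.rm = TRUE) # NA dropped before averaging
```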
Part of a custom function I am trying to create allows the user to provide a function as a parameter. For example:
#Custom function
result <- function(.func){
do.call(.func, list(x,y))
}
#Data
x <- 1:2
y <- 0:1
#Call function
result(.func = function(x,y){ sum(x, y) })
However, the code above assumes that the user is providing a function with arguments x and y. Is there a way to use do.call (or something similar) so that the user can provide a function with different arguments? I think that the correct solution might be along the lines of:
#Custom function
result <- function(.func){
do.call(.func, formals(.func))
}
#Data
m <- 1:3
n <- 0:2
x <- 1:2
y <- 0:1
z <- c(4,6)
#Call function
result(.func = function(m,n){ sum(m, n) })
result(.func = function(x,y,z){ sum(x,y,z) })
But this does not work.
1) Use formals/names/mget to get the values in a list. An optional argument, envir, will allow the user to specify the environment that the variables are located in so it knows where to look. The default if not specified is the parent frame, i.e. the caller.
result1 <- function(.func, envir = parent.frame()) {
do.call(.func, mget(names(formals(.func)), envir))
}
m <- 1:3
n <- 0:2
x <- 1:2
y <- 0:1
z <- c(4,6)
result1(.func = function(m,n) sum(m, n) )
## [1] 9
result1(.func = function(x,y,z) sum(x,y,z) )
## [1] 14
result1(function(Time, demand) Time + demand, list2env(BOD))
## [1] 9.3 12.3 22.0 20.0 20.6 26.8
1a) Another possibility is to evaluate the body. This also works if envir is specified as a data frame whose columns are to be looked up.
result1a <- function(.func, envir = parent.frame()) {
eval(body(.func), envir)
}
result1a(.func = function(m,n) sum(m, n) )
## [1] 9
result1a(.func = function(x,y,z) sum(x,y,z) )
## [1] 14
result1a(function(Time, demand) Time + demand, BOD)
## [1] 9.3 12.3 22.0 20.0 20.6 26.8
2) Another design which is even simpler is to provide a one-sided formula interface. Formulas have environments so we can use that to look up the variables.
result2 <- function(fo, envir = environment(fo)) eval(fo[[2]], envir)
result2(~ sum(m, n))
## [1] 9
result2(~ sum(x,y,z))
## [1] 14
result2(~ Time + demand, BOD)
## [1] 9.3 12.3 22.0 20.0 20.6 26.8
3) Even simpler yet is to just pass the result of the computation as an argument.
result3 <- function(x) x
result3(sum(m, n))
## [1] 9
result3(sum(x,y,z))
## [1] 14
result3(with(BOD, Time + demand))
## [1] 9.3 12.3 22.0 20.0 20.6 26.8
This works.
#Custom function
result <- function(.func){
do.call(.func, lapply(formalArgs(.func), as.name))
}
#Data
m <- 1:3
n <- 0:2
x <- 1:2
y <- 0:1
z <- c(4,6)
#Call function
result(.func = function(m,n){ sum(m, n) })
result(.func = function(x,y,z){ sum(x,y,z) })
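To show what the as.name approach is doing, here is a small sketch: the formal argument names are turned into symbols, and do.call then evaluates those symbols in the calling environment. add_two is a made-up function, and I use names(formals()) in place of formalArgs() only to avoid depending on the methods package.

```r
add_two <- function(m, n) sum(m, n)

# A list of symbols, one per formal argument: m and n.
arg_symbols <- lapply(names(formals(add_two)), as.name)

# The call that do.call effectively builds from those symbols.
built_call <- as.call(c(list(quote(add_two)), arg_symbols))
print(built_call)

m <- 1:3
n <- 0:2
total <- do.call(add_two, arg_symbols)  # symbols are looked up when the call is evaluated
```

If a variable with one of those names does not exist where the call is evaluated, the lookup fails with an "object not found" error.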
This seems like a bit of a pointless function, since the examples in your question imply that what you are trying to do is evaluate the body of the passed function using variables in the calling environment. You can certainly do this easily enough:
result <- function(.func){
eval(body(.func), envir = parent.frame())
}
This gives the expected results from your examples:
x <- 1:2
y <- 0:1
result(.func = function(x,y){ sum(x, y) })
#> [1] 4
and
m <- 1:3
n <- 0:2
x <- 1:2
y <- 0:1
z <- c(4,6)
result(.func = function(m,n){ sum(m, n) })
#> [1] 9
result(.func = function(x,y,z){ sum(x,y,z) })
#> [1] 14
But note that, when the user types:
result(.func = function(x,y){ ...user code... })
They get the same result they would already get if they didn't use your function and simply typed
...user code....
You could argue that it would be helpful with a pre-existing function like mean.default:
x <- 1:10
na.rm <- TRUE
trim <- 0
result(mean.default)
#> [1] 5.5
But this means users have to name their variables as the parameters being passed to the function, and this is just a less convenient way of calling the function.
It might be useful if you could demonstrate a use case where what you are proposing doesn't make the user's code longer or more complex.
You could also use ..., but like the other responses, I don't quite see the value, or perhaps I don't fully understand the use-case.
result <- function(.func, ...){
do.call(.func, list(...))
}
Create function
f1 <- function(a,b) sum(a,b)
Pass f1 and values to result()
result(f1, m,n)
Output:
[1] 9
Here is how I would do it based on your clarifying comments.
Since you say your function will take a data.frame as input, the function you are asking for essentially just reverses the order of the arguments to do.call(), which takes a function and then a list of arguments. A data.frame is just a special kind of list in which all elements (columns) are vectors of equal length (the number of rows).
result <- function(.data, .func) {
# .data is a data.frame, which is a list of argument vectors of equal length
do.call(.func, .data)
}
result(data.frame(a=1, b=1:5), function(a, b) a * b)
result(data.frame(c=1:10, d=1:10), function(c, d) c * d)
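A short sketch of how the column names interact with the function's arguments, under the assumption that every column should map to one named argument. apply_to_columns is a made-up name for the same idea as result() above.

```r
apply_to_columns <- function(.data, .func) {
  # Each column becomes a named argument, so names must match the formals.
  do.call(.func, .data)
}

prod_ab <- apply_to_columns(data.frame(a = 2, b = 1:3), function(a, b) a * b)

# A column with no matching argument makes the call fail; capture the message.
err <- tryCatch(
  apply_to_columns(data.frame(a = 1, b = 2, c = 3), function(a, b) a * b),
  error = function(e) conditionMessage(e)
)
```

If the user's function should tolerate extra columns, giving it a ... argument is one way to absorb them.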
Take the following code:
f2 <- function(...) {
print(list(...))
}
f1 <- function(x, y = 1, ...) {
z <- 20
f2(x, y, ...)
}
f1(5, k = 6)
If I change the arguments to f1, and I still want to pass all those arguments to f2, I would need to change the call to f2. Is it possible to write the call to f2 so that it does not name x and y explicitly? Something like the following (non-working code):
f1 <- function(x, y = 1, ...) {
z <- 20
do.call(f2, formals())
}
I can use environment(), but then I need to take care that I call it at the very beginning:
f1 <- function(x, y = 1, ...) {
argv <- c(as.list(environment()), ...)
z <- 20
do.call(f2, argv)
}
Is there maybe a simpler, more direct way?
It's not clear whether you wanted the variable z added to the call, but in either case you can achieve what you are looking for using match.call. You can simply swap in the quoted name of f2 as the first element in the matched call, and if you wish to add the missing defaults from the f1 formals, you can find them in formals() and write any missing ones into the matched call. Finally, you evaluate this call.
f1 <- function(x, y = 1, ...) {
mc <- match.call()
form <- names(formals())[!names(formals()) %in% names(mc)]
form <- form[form != "..."]
mc[[1]] <- quote(f2)
mc[form] <- formals()[form]
mc$z <- 20
eval(mc)
}
f2 <- function(...) {
print(list(...))
}
f1(5, k = 6)
#> $x
#> [1] 5
#>
#> $k
#> [1] 6
#>
#> $y
#> [1] 1
#>
#> $z
#> [1] 20
Created on 2020-09-29 by the reprex package (v0.3.0)
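To see why the defaults have to be copied in from formals() by hand, it may help to look at what match.call() records on its own. This is a minimal sketch; show_call is a made-up name.

```r
# match.call() keeps only the arguments the caller actually supplied.
show_call <- function(x, y = 1, ...) match.call()

cl <- show_call(5, k = 6)
print(cl)
print(names(cl))  # the first name is "" (the function itself); y is absent
```

The default y = 1 never appears in the matched call, which is why the answer above fills it in before evaluating.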
I would like to write a function which takes a list of variables out of a dataframe, say:
df <- data.frame(a = c(1,2,3,4,5), b = c(6,7,8,9,10))
and always applies the same calculation to each of them, say dividing by the standard deviation:
test.function <- function(var){
for (i in var) {
paste0(i, "_per_sd") <- i / sd(i)
}
}
The goal is to create a new variable a_per_sd, which is a divided by its standard deviation. Unfortunately, I am stuck and get this error: Error in paste0(i, "_per_sd") <- i/sd(i) : could not find function "paste0<-".
The expected usage should be:
test.function(df$a, df$b)
The expected result should be:
> df$a_per_sd
[1] 0.6324555 1.2649111 1.8973666 2.5298221 3.1622777
And for every other variable which was given.
I suspect I should use as.formula and/or eval, but I might be thinking about this the wrong way.
Thank you very much for your attention and help.
Is this what you are after?
df <- data.frame(a = c(1,2,3,4,5), b = c(6,7,8,9,10))
test.function <- function(...){
x <- list(...)
xn <- paste0(unlist(eval(substitute(alist(...)))),
"_per_sd")
setNames(lapply(x, function(y) y/sd(y)), xn)
}
cbind(df, test.function(df$a, df$b))
#> a b df$a_per_sd df$b_per_sd
#> 1 1 6 0.6324555 3.794733
#> 2 2 7 1.2649111 4.427189
#> 3 3 8 1.8973666 5.059644
#> 4 4 9 2.5298221 5.692100
#> 5 5 10 3.1622777 6.324555
Created on 2020-07-23 by the reprex package (v0.3.0)
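The column names above come out as df$a_per_sd because the unevaluated argument expressions are deparsed as typed. Here is a small sketch of that capture step on its own, plus one way to strip the df$ prefix if plain a_per_sd is wanted; arg_labels is a made-up name.

```r
# Capture the expressions passed in ... without evaluating them.
arg_labels <- function(...) {
  exprs <- eval(substitute(alist(...)))
  vapply(exprs, deparse, character(1))
}

labels <- arg_labels(df$a, df$b)           # "df$a" "df$b"; df need not exist here
short  <- sub("^.*\\$", "", labels)        # drop everything up to the last $
new_names <- paste0(short, "_per_sd")
```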
The question is not completely clear to me, but you might get sd of rows/columns or vectors by these approaches:
apply(as.matrix(df), MARGIN = 1, FUN = sd) #across rows
#[1] 3.535534 3.535534 3.535534 3.535534 3.535534
apply(as.matrix(df), MARGIN = 2, FUN = sd) #across columns
# a b
#1.581139 1.581139
lapply(df, sd) #if you provide list of vectors (columns of `df` in this case)
#$a
#[1] 1.581139
#
#$b
#[1] 1.581139
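If the aim is to add the scaled columns to the data frame itself, the lapply form can be assigned straight back. This is a sketch assuming every column of df is numeric and should be scaled.

```r
df <- data.frame(a = c(1, 2, 3, 4, 5), b = c(6, 7, 8, 9, 10))

# Build the new column names first, then assign all scaled columns at once.
new_names <- paste0(names(df), "_per_sd")
df[new_names] <- lapply(df, function(v) v / sd(v))

print(df$a_per_sd)
```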
I got this far. Is this what you are looking for?
test.function <- function(var)
{
newvar = paste0(deparse(substitute(var)), "_per_sd") # a single name, e.g. "df$a_per_sd"
assign(newvar, var/sd(var))
get(newvar)
}
Input:
test.function(df$a)
Result:
[1] 0.6324555 1.2649111 1.8973666 2.5298221 3.1622777
I got the idea from here: Assignment using get() and paste()
At the end this is what my code looks like:
test.function <- function(...){
x <- list(...)
xn <- paste0(unlist(eval(substitute(alist(...)))),
"_per_sd")
setNames(lapply(x, function(y) y/sd(y, na.rm = TRUE)), xn)
}
test.function.wrap <- function(..., dataframe) {
assign(deparse(substitute(dataframe)), cbind(dataframe, test.function(...)) , envir=.GlobalEnv)
}
test.function.wrap(df$a, df$b , dataframe = df)
To be able to assign the new variables to the existing dataframe, I put the (absolutely genius) tips together and wrapped the function in another function to do the trick. I am aware it might not be as elegant, but it does the work!
I have two simple functions: f1 and f2. Suppose we only have access to f2. How can I remove any piece of output in f2 that causes f1 to stop and return the rest of the output?
My desired output is shown below the code.
# FUNCTION #1:
f1 <- function(...){
r <- list(...)
lapply(seq_along(r), function(i) if(r[[i]] == 4) stop("Problem") else r[[i]] + 1)
}
# FUNCTION #2:
f2 <- function(...){
res <- try(f1(...), silent = TRUE)
# if any 'res' causes 'stop' remove it, and return the rest!
}
# EXAMPLE:
f2(1, 2, 4)
My Desired output is:
#[[1]]
#[1] 1
#[[2]]
#[1] 2
f1 stops everything as soon as any one of the ... inputs raises an error.
So, in f2, you could feed the inputs into f1 one at a time; each valid input then produces its correct output, and only the failing ones are lost.
f1 <- function(...){
r <- list(...)
lapply(seq_along(r), function(i) if(r[[i]] == 4) stop("Problem") else r[[i]] + 1)
}
# FUNCTION #2:
f2 <- function(...){
# res <- try(f1(...), silent = TRUE)
r <- list(...)
res = lapply(r, function(fluffybunny){
tmp =try(f1(fluffybunny))
if(inherits(tmp, "try-error")) tmp=NULL
return(tmp)
})
# if any 'res' causes 'stop' remove it, and return the rest!
res.remove_error =res[!sapply(res, is.null)]
return(res.remove_error)
}
# EXAMPLE:
result = f2(1, 2, 4)
#> Error in FUN(X[[i]], ...) : Problem
result
#> [[1]]
#> [[1]][[1]]
#> [1] 2
#>
#>
#> [[2]]
#> [[2]][[1]]
#> [1] 3
Created on 2019-10-29 by the reprex package (v0.3.0)
Edit: removed result with try-error from f2's output.
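The same one-input-at-a-time idea can also be written with tryCatch and Filter, which avoids printing the error message. This is only a sketch; risky_increment is a made-up single-value stand-in for f1, and keep_successes plays the role of f2.

```r
# Hypothetical stand-in for f1 applied to a single value.
risky_increment <- function(v) {
  if (v == 4) stop("Problem") else v + 1
}

keep_successes <- function(...) {
  results <- lapply(list(...), function(v) {
    # Turn a failure into NULL instead of aborting the whole loop.
    tryCatch(risky_increment(v), error = function(e) NULL)
  })
  Filter(Negate(is.null), results)  # drop the failed entries
}

out <- keep_successes(1, 2, 4)
```

Unlike the version above, this returns a flat list (one element per successful input) rather than a list of one-element lists.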
To make an R function more flexible, I need to pass it an argument that is itself a function (FUN). Could you please demonstrate how to declare a function parameter inside another function and how to call it? Something like:
MyOperation <- function(x, y, FUN){
int sum <- x+y
if (!missing(FUN)) sum<-FUN(sum)}
return sum
}
Res <- MyOperation(x=1, y=2, FUN=function(n){...})
You don't declare variables in R. Also you can specify a default value right in the formal argument list. You don't need to use missing in this situation.
This runs FUN(x + y) or returns x+y if FUN is not specified.
myOp2 <- function(x, y, FUN = identity) FUN(x + y)
myOp2(1, 2)
## [1] 3
myOp2(1, 3, sqrt)
## [1] 2
One enhancement might be to allow the function to be specified either as a function or as a character string:
myOp2a <- function(x, y, FUN = identity) {
FUN <- match.fun(FUN)
FUN(x + y)
}
myOp2a(1, 3, "sqrt")
## [1] 2
myOp2a(1, 3, sqrt)
## [1] 2
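A further variation, sketched here as an assumption about what might be useful rather than something the question asked for, is to forward extra arguments to FUN through .... apply_op is a made-up name.

```r
apply_op <- function(x, y, FUN = identity, ...) {
  FUN <- match.fun(FUN)  # accepts a function or its name as a string
  FUN(x + y, ...)        # any extra arguments go to FUN
}

plain   <- apply_op(1, 2)                         # identity: 3
rooted  <- apply_op(1, 3, "sqrt")                 # sqrt(4): 2
rounded <- apply_op(1.234, 2, round, digits = 1)  # round(3.234, 1): 3.2
```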
This sums x and y if FUN is not specified; otherwise, it runs FUN with the arguments x and y.
myOp3 <- function(x, y, FUN = sum) FUN(x, y)
myOp3(1, 2)
## [1] 3
myOp3(1, 2, min)
## [1] 1
You just have some basic R syntax problems there. There is no int in R, your function's closing bracket was in the wrong place, and return() is a function in R, not a keyword. Try this:
MyOperation<-function(x,y,FUN){
sum <- x+y
if (!missing(FUN)) sum<-FUN(sum)
return(sum)
}
MyOperation(x=1,y=2)
# [1] 3
MyOperation(x=1,y=2,FUN=function(n){n+100})
# [1] 103