rlang: using function argument in default assignment function - r

Using rlang, I'd like to have a function that works both when directly called and when passed arguments as part of constructing another function argument by default, e.g.:
refdf = data.frame(x=1:100, y=runif(100,-1,1))
test.helper <- function(z, df) {
qz <- enquo(z)
range(eval_tidy(qz, df))
}
test.helper(y, refdf) # works
test.main <- function(z, df, def = test.helper(z, df)) {
print(def)
}
test.main(y, refdf)
# doesn't work: Error in eval_tidy(qz, df) : object 'y' not found
If instead, I do
refdf = data.frame(x=1:100, y=runif(100,-1,1))
test.helper <- function(z, df) {
qz <- as_quosure(z)
range(eval_tidy(qz, df))
}
test.helper(y, refdf)
# doesn't work: Error in is_quosure(x) : object 'y' not found
test.main <- function(z, df, def = test.helper(enquo(z), df)) {
print(def)
}
test.main(y, refdf)
# now works
I feel like I'm missing something about what gets quoted when; is there an alternative syntax I can use to make both work? I know I could define a separate test.helper_quo or some such, but I'd really like to use the test.helper in the signature (as an extra hint to users about what functions are available).

This should work
library(rlang)
test.helper <- function(z, df) {
qz <- enquo(z)
range(eval_tidy(qz, df))
}
test.helper(y, refdf) # works
test.main <- function(z, df, def = test.helper(!!enquo(z), df)) {
print(def)
}
test.main(y, refdf) # works
# or with rlang >= 0.4.0
test.main <- function(z, df, def = test.helper({{z}}, df)) {
print(def)
}
test.main(y, refdf) # works
Note that in def, we need to capture the quosure passed as z and then unquote it into the call to test.helper, so that test.helper's own enquo can see the original symbol.
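To make the difference visible, here is a minimal sketch that inspects what enquo() actually captures with and without forwarding. The helper names show_quo, outer_plain and outer_forward are hypothetical and exist only for this illustration; it assumes rlang >= 0.4.0 for the {{ }} operator.

```r
library(rlang)

# Hypothetical helper: capture the argument and return the quosure.
show_quo <- function(z) enquo(z)

# Plain pass-through: the inner enquo() only sees the symbol `z`.
outer_plain <- function(z) show_quo(z)

# Forwarding with {{ }}: the caller's original expression is passed on.
outer_forward <- function(z) show_quo({{ z }})

plain_expr   <- quo_get_expr(outer_plain(y))
forward_expr <- quo_get_expr(outer_forward(y))

print(plain_expr)    # the symbol z
print(forward_expr)  # the symbol y
```

In the plain case the captured expression is z, which is why evaluating it in the data frame ends up forcing the outer promise and failing with "object 'y' not found".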

Related

Passing column name and data frame to custom function in R

I am trying to write a function in R that:
1) Receives a data frame and column name as parameters.
2) Performs an operation on the column in the data frame.
func <- function(col, df)
{
col = deparse(substitute(col))
print(paste("Levels: ", levels(df[[col]])))
}
func(Col1, DF)
func(Col2, DF)
mapply(func, colnames(DF)[1:2], DF)
Output
> func(Col1, DF)
[1] "Levels: GREEN" "Levels: YELLOW"
> func(Col2, DF)
[1] "Levels: 0.1" "Levels: 1"
> mapply(func, colnames(DF)[1:2], DF)
Error in `[[.default`(df, col) : subscript out of bounds
Two things:
In your function func, you apply deparse(substitute(col)) to col, which assumes col is an unquoted name and not a string. That is why func(Col1, DF) works. In your mapply() call, however, the column names from colnames(...) are strings, so the lookup fails. The same problem occurs with func('Col1', DF).
In a mapply() call, all arguments need to be vectors or lists, and each one is iterated over element by element, so a data frame passed directly is split into its columns. You need to pass list(DF, DF), or, if you don't want to replicate it, remove the df argument from func.
This is one alternative that should work:
func <- function(col, df)
{
print(paste("Levels: ", levels(df[,col])))
}
mapply(FUN = func, colnames(DF)[1:2], list(DF, DF))
Please have a look at the last comment of @demarsylvain. It may be a copy-paste error on your side; you should have run:
func <- function(col,df) {
print(paste("Levels: ", levels(df[,col])))
}
mapply(FUN = func, c('Species', 'Species'), list(iris, iris))
you did:
func <- function(col) {
print(paste("Levels: ", levels(df[,col])))
}
mapply(FUN = func, c('Species', 'Species'), list(iris, iris))
Please upvote and accept the solution of @demarsylvain; it works.
EDIT to address your comment:
To have a generic version for an arbitrary list of column names you can use this code, sorry for the loop :)
func <- function(col,df) {
print(paste("Levels: ", levels(df[,col])))
}
cnames = colnames(iris)
i <- 1
l = list()
while(i <= length(cnames)) {
l[[i]] <- iris
i <- i + 1
}
mapply(FUN = func, cnames, l)
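As an alternative to building a list of repeated data frames, mapply() has a MoreArgs argument for values that should be passed unchanged to every call. The sketch below is one possible variant, not the answerers' code; it returns the levels and does not print them, and uses SIMPLIFY = FALSE so the result stays a list.

```r
# Return the levels of one column; `col` is a column name given as a string.
func <- function(col, df) levels(df[[col]])

cnames <- c("Species", "Sepal.Length")

# MoreArgs passes `df` as-is to every call, so it is not iterated over.
res <- mapply(func, cnames, MoreArgs = list(df = iris), SIMPLIFY = FALSE)

print(res$Species)
```

Non-factor columns such as Sepal.Length simply yield NULL here, because levels() is only defined for factors.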

How to pass objects of previous function in R

I am creating some functions for myself and I don't know how to proceed in order to use an object (e.g. a value) returned from one function to another one, while the console is still running. As an example:
first <- function(x){
return(x)
}
second <- function(y){
z <- x + y
return(z)
}
So if you call these functions with a '+'...
first(x = 5) +
second(y = 5)
I would expect a value of 10. In this particular case, obviously the function second() can't find the object x, because the latter one was assigned in the first() environment.
This style of programming is similar to ggplot(), for example:
ggplot(aes(x = x, y = y), data = data) +
geom_point()
I know this type of programming implies the use of environments, but I can't get it work. Any suggestions?
Thanks!
EDIT
Looking to ggplot package in github I figured it out, I think:
hh_first <- function(data) {
h <- structure(list(data = data), class = c("hh"))
h
}
"+.hh" <- function(e1, e2) {
add_hh(e1, e2)
}
add_hh <- function(h, object) {
h$data <- paste(h$data, object, sep = "")
h$data
}
hh_second <- function(data) {
data
}
For example...
hh_first('Hi') +
hh_second(', how are you?')
Returns a string 'Hi, how are you?'. The plus operator in this case works with objects of class 'hh'.
Any suggestions regarding the code or perhaps possible errors that this kind of coding may produce are welcome.
Try:
first <- function(x){
return(x)
}
second <- function(x ,y){
z <- x + y
return(z)
}
second(first(5), 5)
OR
myX <- first(5)
second(myX, 5)
OR
library(magrittr) # Which uses pipes, %>%, to pass the results of a function to the first variable of the second function
first(5) %>% second(5)
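Regarding the "+.hh" method from the question's edit: because add_hh returns a plain string, a third term in the chain would no longer dispatch to the "hh" method. A minimal sketch of a variant that returns the object itself is shown below; the constructor name hh is hypothetical and chosen only for this example.

```r
# Hypothetical constructor for an object of class "hh".
hh <- function(data) structure(list(data = data), class = "hh")

# The method returns an "hh" object, so further `+` calls still dispatch here.
"+.hh" <- function(e1, e2) {
  rhs <- if (inherits(e2, "hh")) e2$data else e2
  e1$data <- paste0(e1$data, rhs)
  e1
}

msg <- hh("Hi") + ", how" + " are you?"
print(msg$data)
```

This is roughly the pattern ggplot2 follows: each `+` returns an object of the same class, so any number of layers can be chained.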

How do I write an adverb function that returns a function evaluated in a different environement?

This is closely related to the question: How do I pass ``...`` to a new environment in R?
And the thread here.
My ultimate goal is to be able to have a function that:
operates on a function and returns a function
The return function creates a new environment with .GlobalEnv as its parent
it evaluates the argument function inside the new environment.
And it solves the save-size problem described below.
The save size problem is the problem that lm (and other, for example ggplot) objects save their calling environments, which can often contain irrelevant information. The goal is to have a convenient wrapper that fixes this problem. To demonstrate:
saveSize <- function (object) {
tf <- tempfile(fileext = ".RData")
on.exit(unlink(tf))
save(object, file = tf)
file.size(tf)
}
tmp_fun <- function(){
iris_big <- lapply(1:10000, function(x) iris)
lm(Sepal.Length ~ Sepal.Width, data = iris)
}
out <- tmp_fun()
object.size(out)
# 48008
saveSize(out)
# 1002448 - Far too large as it contains iris_big.
Bill Dunlap proposes this solution which works:
tmp_fun_Bill <- function(){
iris_big <- lapply(1:10000, function(x) iris)
env <- new.env(parent = globalenv())
with(env, lm(Sepal.Length ~ Sepal.Width, data = iris))
}
out <- tmp_fun_Bill()
object.size(out)
# 48008
saveSize(out)
# 4478 - this works!
I would like to generalize Bill's approach into one of these functions that return functions (like purrr::safely).
My best attempt, with help from @MrFlick:
in_new_env <- function(.f){
function(...) {
params <- list(...)
env <- new.env(parent = globalenv())
# Change the environment of any formula objects
params <- lapply(params, function(x) {if (inherits("x","formula")) {environment(x)<-env}; x})
assign(".params.", params, envir = env)
env$.f <- .f
evalq(do.call(".f", .params.), envir=env)
}
}
tmp_fun_me <- function(){
iris_big <- lapply(1:10000, function(x) iris)
in_new_env(lm)(Sepal.Length ~ Sepal.Width, data = iris)
}
out <- tmp_fun_me()
object.size(out)
# 48008
saveSize(out)
# 1002448 - too big again
Can someone point out what's going wrong here?
The problem is really that the formula grabs the current environment. Here's a function that sets the formula's environment to a new, empty environment (with the global environment as its parent), together with a version of your function that uses it:
dropenv <- function(x) {
env <- new.env(parent = globalenv())
if (inherits(x,"formula")) {
environment(x)<-env
}
x
}
tmp_fun_drop <- function(){
iris_big <- lapply(1:10000, function(x) iris)
lm(dropenv(Sepal.Length ~ Sepal.Width), data = iris)
}
But this requires evaluating and processing the formula argument to lm() by hand. Here's a possible workaround for the calling style you want:
in_new_env <- function(.f){
function(formula, ...) {
formula <- dropenv(formula)
.f(formula, ...)
}
}
tmp_fun_drop <- function(){
iris_big <- lapply(1:10000, function(x) iris)
in_new_env(lm)(Sepal.Length ~ Sepal.Width, data = iris)
}
Now the function returned by in_new_env assumes that the first parameter will be a formula and will clear out the environment for that formula.
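The underlying behaviour can be checked directly. The following small sketch (make_formula and the variable big are made up for illustration, with big standing in for iris_big) shows that a formula created inside a function keeps that function's environment, and that swapping the environment as dropenv() does removes the reference.

```r
# A formula remembers the environment in which it was created.
make_formula <- function() {
  big <- seq_len(1e5)   # stands in for iris_big
  y ~ x
}

# Same idea as dropenv() above: swap in a fresh environment.
dropenv <- function(x) {
  if (inherits(x, "formula")) environment(x) <- new.env(parent = globalenv())
  x
}

f <- make_formula()
captured_before <- exists("big", envir = environment(f), inherits = FALSE)
captured_after  <- exists("big", envir = environment(dropenv(f)), inherits = FALSE)

print(c(before = captured_before, after = captured_after))
```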

How to write R function that can take either a vector or formula as first argument?

I'm writing a function that I want to be able to take both a vector and a formula as a first argument. If it is a vector, I do some single variable calculations, if it is a formula, I analyze the first variable by the second variable (the second variable would always be a factor).
Here is my current code:
fun = function(formula,data) {
if (class(with(data,formula))=="formula") {
mod = model.frame(formula,data)
n.group=names(mod)[2]
group <- eval(parse(text=paste("mod$",n.group,sep=""))) #x
response <- model.response(mod) # y
return(table(response,group))
}
else {
return(table(with(data,formula)))
}
}
data(iris)
fun(Sepal.Length~Species,iris) # works correctly
fun(Sepal.Length,iris) # returns an error
The return value is just for illustration.
Cheers!
Try this:
fun.formula <- function(formula, data) {
mod = model.frame(formula, data)
n.group <- names(mod)[2]
group <- eval(parse(text=paste("mod$",n.group,sep=""))) #x
response <- model.response(mod) # y
table(response, group)
}
fun <- function(formula, data) {
ret <- try(table(eval(substitute(formula), data)), silent = TRUE)
if (inherits(ret, "try-error")) fun.formula(formula, data) else ret
}
# tests
fun(Sepal.Length ~ Species, iris)
fun(Sepal.Length, iris)
That said, this is a rather unusual interface and, instead, it might be better to specify the case where formula is a variable by passing its name as a character string in which case a more usual S3 implementation is possible:
fun2 <- function(formula, data, ...) UseMethod("fun2")
fun2.formula <- fun.formula
fun2.character <- function(formula, data) table(data[[formula]])
# tests
fun2(Sepal.Length ~ Species, iris)
fun2("Sepal.Length", iris) # with this approach use a character string
REVISED Now we use try and added an S3 approach.
Ideally, I would have solved this using an S3 approach, but I couldn't figure out how to do that. The following got the job done:
fun <- function(x,data) {
mod = try(model.frame(x,data),silent=T)
if (inherits(mod, "try-error")) {
x=data[,deparse(substitute(x))]
return(table(x))
}
else {
mod = model.frame(x,data)
n.group=names(mod)[2]
group <- eval(parse(text=paste("mod$",n.group,sep=""))) #x
response <- model.response(mod) # y
return(table(response,group))
}
}
fun(Sepal.Length~Species,iris) # works correctly
fun(Sepal.Length,iris) # works!
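Another option, sketched here under the assumption that the formula is always typed literally in the call (not stored in a variable first), is to inspect the unevaluated argument and branch on whether it is a call to `~`. The name fun3 is hypothetical; the sketch also indexes the model frame with [[ ]], avoiding eval(parse(...)).

```r
fun3 <- function(x, data) {
  expr <- substitute(x)
  # A literal formula such as y ~ g is a call whose function is `~`.
  if (is.call(expr) && identical(expr[[1]], as.name("~"))) {
    mod <- model.frame(x, data)
    table(response = model.response(mod), group = mod[[2]])
  } else {
    # Otherwise treat the argument as a bare column name and look it up in data.
    table(eval(expr, data))
  }
}

by_group <- fun3(Sepal.Length ~ Species, iris)
single   <- fun3(Sepal.Length, iris)
print(dim(by_group))
print(dim(single))
```

This avoids using try() for control flow, at the cost of not recognising a formula that was assigned to a variable beforehand.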

Passing optional arguments inside a wrapper function to a sub-function

I have a wrapper function, where I need to pass optional arguments to the sub-function specified. But there are so many different possible sub-functions that I can't pre-specify them.
For reference, the sub-functions exist in the environment etc...
Consider:
funInFun<- function (x, method, ...) {
method.out <- function(this.x, FUN, ...) {
FUN <- match.fun(FUN)
c <- FUN(this.x, ...)
return(c)
}
d <- method.out(x, method)
return(d)
}
data<-seq(1,10)
funInFun(data, mean) # Works
data<-c(NA,seq(1,10))
funInFun(data, mean, na.rm=TRUE) # Should remove the NA
funInFun(c(seq(1,10)), quantile, probs=c(.3, .6)) # Should respect the probs option.
You need to pass the ... to method.out. Then it works fine:
funInFun<- function (x, method, ...) {
method.out <- function(this.x, FUN, ...) {
FUN <- match.fun(FUN)
c <- FUN(this.x, ...)
return(c)
}
d <- method.out(x, method, ...) # <<--- PASS `...` HERE
return(d)
}
data<-seq(1,10)
funInFun(data, mean) # Works
# [1] 5.5
data<-c(NA,seq(1,10))
funInFun(data, mean, na.rm=TRUE) # Should remove the NA
# [1] 5.5
funInFun(c(seq(1,10)), quantile, probs=c(.3, .6))
# 30% 60%
# 3.7 6.4
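If the inner helper is not needed for anything else, the same forwarding can be written more compactly. This is only a sketch of an equivalent wrapper (the name funInFun2 is made up here), showing that `...` has to appear in the call that should receive the optional arguments.

```r
# Resolve the function, then forward x and all optional arguments to it.
funInFun2 <- function(x, method, ...) {
  FUN <- match.fun(method)
  FUN(x, ...)
}

m <- funInFun2(c(NA, seq(1, 10)), mean, na.rm = TRUE)
q <- funInFun2(seq(1, 10), quantile, probs = c(.3, .6))
print(m)
print(q)
```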
In addition to Thomas' answer to the OP's question you might have to forward an optional argument that is an explicit argument of the wrapper function.
In this case, instead of repeating the default value of the wrapped function in the wrapper definition you can use missing to construct a call with a missing argument.
f <- function(s = "world!") cat("Hello", s)
f()
# Hello world!
g <- function(s = NULL) eval(substitute(
f(s = sub_me),
list(sub_me = if(missing(s)) quote(expr =) else s)))
g()
# Hello world!
g("you!")
# Hello you!
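When there is only one such argument, a plainer alternative is to branch on missing() directly. The sketch below is an assumption-level variant, not the answerer's code: g2 is a hypothetical name, and f uses paste() where the original used cat() so that the result can be inspected.

```r
f <- function(s = "world!") paste("Hello", s)

# Only pass `s` on when the caller actually supplied it,
# so f's own default applies otherwise.
g2 <- function(s) {
  if (missing(s)) f() else f(s = s)
}

print(g2())
print(g2("you!"))
```

The substitute()-based version scales better when several optional arguments have to be forwarded, since branching on every combination quickly becomes unwieldy.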
