I will use the following example to explain my question. But the question is not only about this specific example, but more general about meta-programming in R.
I have two specific functions to make plots
Specific function 1
draw_hists <- function(dts, indexs, title_prefix = 'sd = ') {
mapply(
function(dt, index)
{
hist(dt, main = paste(title_prefix, as.character(index)))
},
dts, indexs
)
}
plots histograms
sds <- c(0.1, 0.5, 5, 100)
raw_normals <- purrr::map(sds, ~rnorm(500, mean = 1, sd = .x))
draw_hists(raw_normals, sds)
Specific function 2
plots scatter plots of percentage ranks against raw data
draw_percentage <- function(dts, indexs, title_prefix = 'sd = ') {
mapply(
function(dt, index)
{
plot(dt, dplyr::percent_rank(dt), main = paste(title_prefix, as.character(index)))
},
dts, indexs
)
}
sds <- c(0.1, 0.5, 5, 100)
raw_normals <- purrr::map(sds, ~rnorm(500, mean = 1, sd = .x))
draw_percentage(raw_normals, sds)
Now assume I want to abstract out the general patterns of these functions and define a generic higher-order function that takes inputs of any arbitrary plotting function and its argument as an expression to be flexible enough drawing nearly whatever I want to draw. I thought something like this would work.
draw_generic <- function(dts, indexs, plfun, plfun_arguments_as_expr) {
....
}
The formal parameter plfun_arguments_as_expr would bind to an expression such like expr(dplyr::percent_rank(dt)) to make the plotting truly generic and flexible. I come up with the following solution.
draws_generic <- function(dts, indexs, plfun, title_prefix = 'sd =', ...) {
dots <- enquos(...)
mapply(
function(dt, index)
{
eval_tidy(
expr(
plfun(dt, main = paste(title_prefix, as.character(index)), !!!dots)
)
)
}
,
dts, indexs
)
}
draws_generic(raw_normals, sds, hist)
draws_generic(raw_normals, sds, plot, dplyr::percent_rank(dt))
The histogram works. But the percent_rank one gives me error
Error in x[!nas] : object of type 'closure' is not subsettable
In addition: Warning message:
In is.na(x) : is.na() applied to non-(list or vector) of type 'closure'
Called from: rank(x, ties.method = "min", na.last = "keep")
I think this might be related to the fact that the environment scope captured by enquos is global, but the expression contains a name dt for which its binding existed in local scope created by the anonymous function function(dt, index). Is this truly the reason of this error? If so, is there a neat and clean way to fix it that follows principles of "tidy evaluation"?
Update
Inspired by the comments, I modify here my question. In stead of using a pure functional abstraction to generalize procedures, what I really want is to achieve generalization by treating code as object and manipulate it freely in a R function or kind of macro programming. More precisely what I want is a draw_expression function to plot data against a given expression instead of previous draw_generic. Below are some of my attempts so far:
The 1st version plots a plotting expression with x as data argument against given data without additional indexs parameter and title. The code has been tested working.
draw_expression_1 <- function(dts, plexpr) {
plexpr <- enexpr(plexpr)
lapply(dts, eval(expr(function(x) !!plexpr)))
}
draw_expression_1(raw_normals, hist(x))
draw_expression_1(raw_normals, plot(x, dplyr::percent_rank(x))
The 2nd version adds additional indexs parameter and titles by modifying the given expression. The code has been tested working.
draw_expression_2 <- function(dts, indexs, plexpr, title_prefix = 'sd =') {
plexpr <- enexpr(plexpr)
mapply(eval(expr(function(x, index) {
UQ(rlang::call_modify(plexpr, main = quote(paste(title_prefix, as.character(index)))))
})), dts, indexs)
}
draw_expression_2(raw_normals, sds, hist(x))
draw_expression_2(raw_normals, sds, plot(x, dplyr::percent_rank(x))
The 3rd version is aimed at allowing the call expression to have any arbitrary formal parameter name instead of x. Release the assumption to be that the 1st parameter corresponds to the data to be plotted, but it can be named whatever users wished.
draw_expression_3 <- function(dts, indexs, plexpr, title_prefix = 'sd =') {
plexpr <- enexpr(plexpr)
first_arg_name <- rlang::call_args(plexpr)
mapply(eval(expr(function(first_arg_name, index) {
UQ(rlang::call_modify(plexpr, main = quote(paste(title_prefix, as.character(index)))))
})), dts, indexs)
}
draw_expression_3(raw_normals, sds, hist(x))
draw_expression_3(raw_normals, sds, plot(x, dplyr::percent_rank(x))
This prints me error:
Error in plot(x, dplyr::percent_rank(x), main = paste(title_prefix, as.character(index))) :
object 'x' not found
Apparently first_arg_name has to been unquoted in the expression. Thus I did this:
draw_expression_3 <- function(dts, indexs, plexpr, title_prefix = 'sd =') {
plexpr <- enexpr(plexpr)
first_arg_name <- rlang::call_args(plexpr)
mapply(eval(expr(function(UQ(first_arg_name), index) {
UQ(rlang::call_modify(plexpr, main = quote(paste(title_prefix, as.character(index)))))
})), dts, indexs)
}
draw_expression_3(raw_normals, sds, hist(x))
draw_expression_3(raw_normals, sds, plot(x, dplyr::percent_rank(x))
But I got weird syntax error:
Error: unexpected '}' in " }"
Now I don't understand why this happens. Any help?
Also I could not use enquo + eval_tidy here, since enquo will capture the environment of the call expression which is global, but the expression inside the function that I would like to modify and manipulate contains x which belongs to the inner scope. Thus this is not a tidy evaluation. But I am not perusing that anymore. I simply want do macro programming as freely as I can with base R plus some of convenient tools provided by rlang.
NOTE: I am not trying to do any production work. I am just trying to see the limit of this language and understand things better.
I don't know the "tidy evaluation" way to do this, but the simpler base R method is to pass a function rather than an expression. For example,
sds <- c(0.1, 0.5, 5, 100)
raw_normals <- purrr::map(sds, ~rnorm(500, mean = 1, sd = .x))
draws_generic2 <- function(dts, indexs, plfun, title_prefix = 'sd =') {
mapply(
function(dt, index)
{
plfun(dt, main = paste(title_prefix, as.character(index)))
},
dts, indexs
)
invisible(NULL)
}
par(mfrow=c(2,2))
draws_generic2(raw_normals, sds, hist)
draws_generic2(raw_normals, sds, function(dt, ...) plot(dt, dplyr::percent_rank(dt), ...))
Created on 2022-04-15 by the reprex package (v2.0.1)
I used dt in my function definition in the second example, but I could have used any variable name, e.g. this would give the same output except for the axis labels:
draws_generic2(raw_normals, sds,
function(x, ...) plot(x, dplyr::percent_rank(x), ...))
Related
I want to create a custom log function, that would get used in other functions. I am having issues with the custom function where arguments don't seem to flow through to the inner log function. My custom log function is inspired by the logger package but I am planning to expand this usage a bit further (so logger doesn't quite meet my needs)
log_fc <- function(type = c("INFO", "ERROR"), ...) {
print(
glue::glue("[{type} {Sys.time()}] ", ...)
)
}
Next I am planning to use log_fc in various other custom functions, one example:
test_fc <- function(forecast) {
log_fc(type = "INFO", "{forecast} is here")
#print(forecast)
}
If I test this, I get the following error:
> test_fc(forecast = "d")
Error in eval(parse(text = text, keep.source = FALSE), envir) :
object 'forecast' not found
I am not sure why argument forecast is not being picked up by the inner test_fc function. TIA
You could use the .envir argument:
log_fc <- function(type = c("INFO", "ERROR"), ...) {
env <- new.env(parent=parent.frame())
assign("type",type,env)
print(
glue::glue("[{type} {Sys.time()}] ", ...,.envir = env)
)
}
test_fc <- function(forecast) {
log_fc(type = "INFO", "{forecast} is here")
}
test_fc("My forecast")
#> [INFO 2022-12-18 12:44:11] My forecast is here
There are two things going on.
First, the name forecast is never passed to log_fc. The paste solution never needs the name, it just needs the value, so it still works. You'd need something like
log_fc(type = "INFO", "{forecast} is here", forecast = forecast)
to get the name into log_fc.
The second issue is more complicated. It's a design decision in many tidyverse functions. They want to be able to have code like f(x = 3, y = x + 1) where the x in the second argument gets the value that was bound to it in the first argument.
Standard R evaluation rules would not do that; they would look for x in the environment where f was called, so f(y = x + 1, x = 3) would bind the same values in the function as putting the arguments in the other order.
The tidyverse implementation of this non-standard evaluation messes up R's internal handling of .... The workaround (described here: https://github.com/tidyverse/glue/issues/231) is to tell glue() to evaluate the arguments in a particular location. You need to change your log function to fix this.
One possible change is shown below. I think #Waldi's change is actually better, but I'll leave this one to show a different approach.
log_fc <- function(type = c("INFO", "ERROR"), ...) {
# Get all the arguments from ...
args <- list(...)
# The unnamed ones are messages, the named ones are substitutions
named <- which(names(args) != "")
# Put the named ones in their own environment
e <- list2env(args[named])
# Evaluate the substitutions in the new env
print(
glue::glue("[{type} {Sys.time()}] ", ..., .envir = e)
)
}
test_fc <- function(forecast) {
log_fc(type = "INFO", "{forecast} is here", forecast = forecast)
}
test_fc(forecast = "d")
#> [INFO 2022-12-18 06:25:29] d is here
Created on 2022-12-18 with reprex v2.0.2
The reason for this is that when your test_fc function connects to the log_fc function, the forecats variable wouldn't be able to be found, because it's not a global function; thus, you can't access it from the other function.
The way to fix this is by defining a global variable:
log_fc <- function(type = c("INFO", "ERROR"), ...) {
print(
glue::glue("[{type} {Sys.time()}] ", ...)
)
}
test_fc <- function(forecast) {
forecast <<- forecast
log_fc(type = "INFO", "{forecast} is here")
}
print(test_fc(forecast = "d"))
Output:
d is here
Since you're already using glue you could use another glue::glue in test_fc to accomplish the pass-through, such as:
log_fc <- function(type = c("INFO", "ERROR"), ...) {
print(
glue::glue("[{type} {Sys.time()}] ", ...)
)
}
test_fc <- function(forecast) {
log_fc(type = "INFO", glue::glue("{forecast} is here"))
}
which yields
> test_fc('arctic blast')
[INFO 2022-12-21 15:56:18] arctic blast is here
>
I have a function which takes a dataframe and its columns and processes it in various ways (left out for simplicity). We can put in column names as arguments or transform columns directly inside function arguments (like here). I need to find out what object(s) are passed in the function.
Reproducible example:
df <- data.frame(x= 1:10, y=1:10)
myfun <- function(data, col){
col_new <- eval(substitute(col), data)
# magic part
object_name <- ...
# magic part
plot(col_new, main= object_name)
}
For instance, the expected output for myfun(data= df, x*x) is the plot plot(df$x*df$x, main= "x"). So the title is x, not x*x. What I have got so far is this:
myfun <- function(data, col){
colname <- tryCatch({eval(substitute(col))}, error= function(e) {geterrmessage()})
colname <- gsub("' not found", "", gsub("object '", "", colname))
plot(eval(substitute(col), data), main= colname)
}
This function gives the expected output but there must be some more elegant way to find out to which object the input refers to. The answer must be with base R.
Use substitute to get the expression passed as col and then use eval and all.vars to get the values and name.
myfun <- function(data, col){
s <- substitute(col)
plot(eval(s, data), main = all.vars(s), type = "o", ylab = "")
}
myfun(df, x * x)
Anothehr possibility is to pass a one-sided formula.
myfun2 <- function(formula, data){
plot(eval(formula[[2]], data), main = all.vars(formula), type = "o", ylab = "")
}
myfun2(~ x * x, df)
The rlang package can be very powerful when you get a hang of it. Does something like this do what you want?
library(rlang)
myfun <- function (data, col){
.col <- enexpr(col)
unname(sapply(call_args(.col), as_string))
}
This gives you back the "wt" column.
myfun(mtcars, as.factor(wt))
# [1] "wt"
I am not sure your use case, but this would work for multiple inputs.
myfun(mtcars, sum(x, y))
# [1] "x" "y"
And finally, it is possible you might not even need to do this, but rather store the expression and operate directly on the data. The tidyeval framework can help with that as well.
Simple question, I hope. I want to write a plotting function that has a default value for the y-axis label if the user doesn't specify. I'd also like to allow the ... argument for other plotting parameters, and allow the user to set ylab manually. But I can't figure out how to do this.
# simple scatterplot function with a default ylab
scatter <- function(x,y, ...) {
plot(x, y, ylab="Default y-axis label", ...)
}
# generate data
x <- rnorm(100)
y <- x+rnorm(100)
# use the default
scatter(x,y)
# here I want to use my own label, but I get an error!
scatter(x, y, ylab="New y-axis label")
The error I get is:
Error in plot.default(x, y, ylab = "Default y-axis label", ...) :
formal argument "ylab" matched by multiple actual arguments
I understand the problem, but I don't know the best way to fix it. Thanks for the help!
EDIT: I realize I can do something like
scatter <- function(x,y,ylab = "Default y-axis label", ...) {
plot(x, y, ylab= ylab, ...)
}
...but if I'm writing a package to submit to CRAN, and I have lots of default options I'd like to fiddle with, I don't want to have to document all these standard plotting arguments because they're used in my function definition.
Try doing this instead:
scatter <- function(x,y,ylab = "Default y-axis label", ...) {
plot(x, y, ylab= ylab, ...)
}
Expanding slightly on Arun's answer, this is a sketch of one route to take if you have many arguments:
def_args <- list(ylab = "Default Label",xlab = "Default Label")
scatter <- function(x,y, ...) {
cl <- as.list(match.call())[-1L]
do.call("plot",c(cl,def_args[!names(def_args) %in% names(cl)]))
}
Some thought would be needed to decide how you want to handle partial matching of arguments (if at all). e.g. perhaps something like this:
scatter <- function(x,y, ...) {
cl <- as.list(match.call())[-1L]
names(cl) <- match.arg(names(cl),
names(formals(plot.default)),several.ok = TRUE)
do.call("plot",c(cl,def_args[!names(def_args) %in% names(cl)]))
}
would handle partial matching of arguments.
One way using match.call to check if ylab has been specified as an argument:
scatter <- function(x,y, ...) {
mcall = as.list(match.call())[-1L]
if (!"ylab" %in% names(mcall))
plot(x, y, ylab="Default y-axis label", ...)
else plot(x, y, ...)
}
As mentioned under comment list(...) is a nicer way to get just the dots argument expanded than having to get all the formal arguments with match.call.
You might also try using pmatch instead of %in% for partial matching of arguments.
I use a function to build an argument list. In my case, I do not care about partially matching argument names, which is good because this won't support it.
# Create a list of input arguments.
# Allow arguments to be specified multiple times, first definition wins.
# The resulting list is intended to be passed to do.call().
make.args <- function(..., PRE.ARGS=list(), POST.ARGS=list()) {
a <- list()
l <- c(PRE.ARGS, list(...), POST.ARGS)
for (name in unique(names(l))) {
a[[name]] <- l[[name]] # First occurrence will be found.
}
return(a)
}
An example of its use:
plot.rate <- function(col, cond=NULL, ...) {
col <- paste(col, collapse=' + ')
f <- paste(col, '~ Rate')
if (!is.null(cond)) {
cond <- paste(cond, collapse=' + ')
f <- paste(f, cond, sep='|')
}
arg.list <- make.args(...
, x = as.formula(f)
, main=col
, grid=TRUE
, scales=list(x=list(alternating=1) # bottom(/left)
, y=list(alternating=3)) # both
, xlab='x RTM'
)
do.call(xyplot, arg.list)
}
This would seem to be an elementary question, but I can't seem to find an answer on stackoverflow.
How can I obtain the following effect:
f <- function(x = 1){x^2}
miracle(f)
[1] "x^2"
The context is a shiny app (package by RStudio) in which I have a textInput() function to which I supply an initial value x^2. While this works:
textInput(inputId = "inFun", label = h4("Enter a function:"), value = "x^2")
this doesn't:
textInput(inputId = "inFun", label = h4("Enter a function:"), value = f)
It appears that I need something like "x^2" on the rhs of value.
Below is a representative sample of several variations I have tried:
eval(parse(text = f))
Error in as.character(x) :
cannot coerce type 'closure' to vector of type 'character'
f(x = "x")
Error in x^2 : non-numeric argument to binary operator
`f`
function(x){x^2}
f(x = `x`)
Error in f(x = x) : object 'x' not found
Is there a built-in function for this?
I'd like to answer my own question, based on Roman Luštrik's comment, to invite suggestions for improvements rather than raising my meagre tally of "points".
Roman suggested the function body(), which I had never heard of. Here is what body() does to f:
f <- function(x = 1){x^2}
> body(f)
{
x^2
}
The curly brackets were unwanted, so I searched a little further. I managed to get rid of the curly brackets with this:
> gsub(' {2,}','',deparse(body(f))[2])
[1] "x^2"
The above, therefore, answers my own question. But is there a more elegant and shorter way?
Following Roman's suggestion to use body(), I came across this outstanding answer by joran, hadley, and several others, which provided me with a template:
How to create an R function programmatically?
There it explains how to create a function programmatically from an argument list, a body and an environment. I therefore decided to construct my function f with these 3 primitives and to call the body from inside shiny's textInput.
So I put this in my global.R file (the small-cap g is shorthand for global)
# Convenience function
make.function <- function(args = alist(a = 1, b = 2), body = quote(a + b),
env = parent.frame()) {
subs <- list(args = as.pairlist(args), body = body)
eval(substitute(`function`(args, body), subs), env)
}
gArg <- alist(a = 1, b = 2)
gBody <- quote(a + b)
gFun <- make.function(gArg, gBody)
Then in my server.R file, I have:
textInput(inputId = "inFun", label = h4("1. Enter a function:"),
value = deparse(body(gFun)))
And it works!
I was planning to write value = gBody or something to that effect, but my first success came with deparse(body(gFun)), so that's what I'm using now.
The use of make.function to produce a 'static' function in global.R is of course overkill, but I'm using make.function elsewhere inside server.R to process the user-supplied arguments and body to create new functions and plot them, so it's a very useful function to have.
Thanks Roman: if you write your own answer I'll accept yours.
I have a function like this dummy-one:
FUN <- function(x, parameter){
if (parameter == 1){
z <- DO SOMETHING WITH "x"}
if (parameter ==2){
z <- DO OTHER STUFF WITH "x"}
return(z)
}
Now, I would like to use the function on a dataset using apply.
The problem is, that apply(data,1,FUN(parameter=1))
wont work, as FUN doesn't know what "x" is.
Is there a way to tell apply to call FUN with "x" as the current row/col?
`
You want apply(data,1,FUN,parameter=1). Note the ... in the function definition:
> args(apply)
function (X, MARGIN, FUN, ...)
NULL
and the corresponding entry in the documentation:
...: optional arguments to ‘FUN’.
You can make an anonymous function within the call to apply so that FUN will know what "x" is:
apply(data, 1, function(x) FUN(x, parameter = 1))
See ?apply for examples at the bottom that use this method.
Here's a practical example of passing arguments using the ... object and *apply. It's slick, and this seemed like an easy example to explain the use. An important point to remember is when you define an argument as ... all calls to that function must have named arguments. (so R understands what you're trying to put where). For example, I could have called times <- fperform(longfunction, 10, noise = 5000) but leaving off noise = would have given me an error because it's being passed through ... My personal style is to name all of the arguments if a ... is used just to be safe.
You can see that the argument noise is being defined in the call to fperform(FUN = longfunction, ntimes = 10, noise = 5000) but isn't being used for another 2 levels with the call to diff <- rbind(c(x, runtime(FUN, ...))) and ultimately fun <- FUN(...)
# Made this to take up time
longfunction <- function(noise = 2500, ...) {
lapply(seq(noise), function(x) {
z <- noise * runif(x)
})
}
# Takes a function and clocks the runtime
runtime <- function(FUN, display = TRUE, ...) {
before <- Sys.time()
fun <- FUN(...)
after <- Sys.time()
if (isTRUE(display)) {
print(after-before)
}
else {
after-before
}
}
# Vectorizes runtime() to allow for multiple tests
fperform <- function(FUN, ntimes = 10, ...) {
out <- sapply(seq(ntimes), function(x) {
diff <- rbind(c(x, runtime(FUN, ...)))
})
}
times <- fperform(FUN = longfunction, ntimes = 10, noise = 5000)
avgtime <- mean(times[2,])
print(paste("Average Time difference of ", avgtime, " secs", sep=""))