This question is related to Passing variables to functions that use `enquo()`.
I have a higher-level function whose arguments are a tibble (dat) and the columns of interest in dat (variables_of_interest_in_dat). Within that function, I call another function to which I want to pass variables_of_interest_in_dat.
higher_function <- function(dat, variables_of_interest_in_dat){
  variables_of_interest_in_dat <- enquos(variables_of_interest_in_dat)
  lower_function(dat, ???variables_of_interest_in_dat???)
}
lower_function <- function(dat, variables_of_interest_in_dat){
  variables_of_interest_in_dat <- enquos(variables_of_interest_in_dat)
  dat %>%
    select(!!!variables_of_interest_in_dat)
}
What is the recommended way to pass variables_of_interest_in_dat to lower_function?
I have tried lower_function(dat, !!!variables_of_interest_in_dat), but running higher_function(mtcars, cyl) returns "Error: Can't use !!! at top level."
In the related post, higher_function did not call enquo() on the variables before passing them to lower_function.
Thank you
Is this what you want?
library(tidyverse)
LF <- function(df, var){
  newdf <- df %>% select({{var}})
  return(newdf)
}
HF <- function(df, var){
  LF(df, {{var}})
}
LF(mtcars, disp)
HF(mtcars, disp)
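The {{ }} (aka 'curly curly') operator, available since rlang 0.4.0, replaces the two-step approach of quoting with enquo() and unquoting with !!. Using it in both functions forwards the column from HF to LF without quoting it twice.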
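The question used enquos() and !!!, which suggests more than one column may be passed. A minimal sketch of two ways to forward several columns, assuming dplyr >= 1.0 and rlang >= 0.4; the names lower_function2 and higher_function2 are made up for illustration and are not from the original post:

```r
library(dplyr)

# Lower-level function: hands whatever columns it receives to select().
lower_function <- function(dat, ...) {
  dat %>% select(...)
}

# Higher-level function: forwards the dots unchanged; no enquos() or !!! needed.
higher_function <- function(dat, ...) {
  lower_function(dat, ...)
}

# Variant with one named argument that may hold several columns, e.g. c(cyl, disp).
# (lower_function2 / higher_function2 are hypothetical names.)
lower_function2 <- function(dat, vars) {
  dat %>% select({{ vars }})
}
higher_function2 <- function(dat, vars) {
  lower_function2(dat, {{ vars }})
}

res_dots <- higher_function(mtcars, cyl, disp)
res_vars <- higher_function2(mtcars, c(cyl, disp))
```

Forwarding ... is usually the simplest option when the column list is the only variable part of the call; the {{ vars }} variant is handy when the function already uses ... for something else.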
Related
I have two dataframes and a function, which works when I use it on a single variable.
library(tidyverse)
iris1<-iris
iris2<-iris
iris_fn <- function(df, species_type){
  df1 <- df %>%
    filter((Species == species_type))
  return(df1)
}
new_df<-iris_fn(df=iris1, species_type="setosa")
I want to pass a vector of variables to the function with the expected output being a list of dataframes (3), one filtered to each variable, for which I have been experimenting using lapply:
variables<-c("setosa","versicolor","virginica")
new_df<-lapply(df=iris1, species_type="setosa", FUN= iris_fn)
The error message is "Error in is.vector(X) : argument "X" is missing, with no default", which I don't understand because I have named the arguments of the function and given the name of the function.
Can anyone suggest a solution to get the desired output? I essentially need a version of lapply or purrr function that will allow a dataframe and a vector as inputs.
lapply expects an argument called X as the main input. You could re-write it so that the function expects X instead of species_type e.g.
iris_fn <- function(df, X){
  df1 <- df %>% filter((Species == X))
  return(df1)
}
variables <- c("setosa", "versicolor", "virginica")
new_df <- lapply(X=variables, FUN=iris_fn, df=iris1)
EDIT:
Alternatively, to avoid renaming the argument to X, make the argument you iterate over the first argument of the function, because lapply passes each element of its input as the first unnamed argument, e.g.
iris_fn <- function(species_type, df){
  df1 <- df %>% filter((Species == species_type))
  return(df1)
}
new_df <- lapply(variables, FUN=iris_fn, df=iris1)
Check out the split function for a convenient way to split a data.frame to a list e.g. split(iris, f=iris$Species)
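A short sketch of the split() approach mentioned above, using only base R and the built-in iris data; by_species is just an illustrative name:

```r
# split() returns a named list with one data frame per level of the factor.
by_species <- split(iris, f = iris$Species)

# The list names are the factor levels, so each piece can be accessed by name.
names(by_species)
head(by_species$setosa)
```

This avoids writing a filtering function at all when the goal is simply one data frame per group.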
From ?lapply, the signature is lapply(X, FUN, ...). Because you named all of your arguments, nothing is left to be matched to X, so there is no value to pass to the function as its first argument.
Try something like this:
library(dplyr)
iris1<-iris
# note the changed argument order
iris_fn <- function(species_type, df){
  df1 <- df %>%
    filter((Species == species_type))
  return(df1)
}
variables<-c("setosa","versicolor","virginica")
new_df_list <-lapply(variables, iris_fn, df=iris1 )
Or with just an anonymous function:
new_df_list <-lapply(variables, \(x) filter(iris1, Species == x))
As you already use Tidyverse, perhaps with purrr::map() instead:
library(purrr)
new_df_list <- map(variables, ~ filter(iris1, Species == .x))
Created on 2022-11-14 with reprex v2.0.2
I am trying to write a function that will apply a user-specified binary operator (e.g. < ) to a raster object. To do so is fairly simple. For example:
selection <- raster::overlay(x = data, fun = function(x) {return(x < 2)})
My issue is that this code would run inside a function, in which I would like to specify both the binary operator and the criterion value (2 in the example above) as arguments. For example:
my.func <- function(data, binary_operator, value){
  # pseudocode for what I want, not valid R
  selection <- raster::overlay(x=data, fun=function(x) {x binary_operator value})
  return(selection)
}
I have tried to construct the function as a call without success.
my.func <- function(data, binary_operator, value){
  selection <- raster::overlay(x=data, fun=function(x) {call(sprintf("x %s %s", binary_operator, value))})
  return(selection)
}
Is there a way to construct the call of the second function using variables in the first function?
Thanks for your help.
Write your code like this:
my.func <- function(data, binary_operator, value){
  selection <- raster::overlay(x=data, fun=function(x) binary_operator(x, value))
  return(selection)
}
You need to call this as
my.func(data, `<`, 2)
(with backticks around the operator, so that the function itself is passed). If you also want to allow the string "<" for the operator, you could use do.call:
my.func <- function(data, binary_operator, value){
  selection <- raster::overlay(x=data, fun=function(x)
    do.call(binary_operator, list(x, value)))
  return(selection)
}
This will work with either form of argument.
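To see the idea without needing the raster package, here is a minimal sketch on a plain numeric vector. apply_operator is a hypothetical helper, not part of raster, and it uses match.fun() rather than do.call(); match.fun() likewise accepts either a function or its name as a string:

```r
# Hypothetical helper: apply a binary operator, given as a function or a string.
apply_operator <- function(x, binary_operator, value) {
  # match.fun() returns a function unchanged and looks up a string by name.
  op <- match.fun(binary_operator)
  op(x, value)
}

v <- c(1, 2, 3)
from_string   <- apply_operator(v, "<", 2)
from_function <- apply_operator(v, `<`, 2)
```

Both calls give the same logical vector, which is the behaviour the do.call version relies on inside overlay.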
The example is probably simpler than the real case, but in the example you give, it would be more direct to do:
selection <- data < 2
SHORT SUMMARY
dplyr unquoting (!!) fails inside summarise when the quoted object is an argument of the function that calls summarise, and that argument is assigned in a for loop.
For Loop
for(j in 1:1){
  sumvar <- paste0("randnum",j)
  chkfunc(sumvar)
}
Function (abbreviated here, shown in full below)
chkfunc <- function(sumvar) {
sumvar <- enquo(sumvar)
[...]
summarise(mn = mean(!!sumvar))
LONG SUMMARY
I have two columns that sometimes contain NAs and I want to use dplyr non-standard evaluation and its famous unquoting (AKA bang bang !!) to summarise each column in one for loop.
library(dplyr)
set.seed(3)
randnum1 <- rnorm(10)
randnum1[randnum1<0] <- NA
randnum2 <- rnorm(10)
randnum2[randnum2<0] <- NA
randfrm <- data.frame(cbind(randnum1, randnum2))
print(randfrm)
We see below that the filter function processes the unquoting (!!) just fine but the summarise function fails, returning an "argument is not numeric or logical" error. The same occurs when I use := in the summarise function call (not shown here), which appeared in the "Programming with dplyr" vignette. Finally, I confirmed that the class of !!sumvar is numeric within function chkfunc.
chkfunc <- function(sumvar) {
  sumvar <- enquo(sumvar)
  message("filter function worked with !!sumvar")
  outfrm <- randfrm %>%
    filter(!is.na(!!sumvar))
  print(outfrm)
  message("summarise function failed with !!sumvar")
  outfrm <- randfrm %>%
    filter(!is.na(!!sumvar)) %>%
    summarise(mn = mean(!!sumvar))
}
# Just one iteration to avoid confusion
for(j in 1:1){
sumvar <- paste0("randnum",j)
chkfunc(sumvar)
}
While I would like an answer using dplyr, the following works with substitute and eval rather than using dplyr functions (answer adapted from Akrun's answer to StackOverflow question "Unquote string in R's substitute command"):
chkfunc <- function(sumvar) {
  outfrm <- eval(substitute(randfrm %>%
                              filter(!is.na(y)) %>%
                              summarise(mn = mean(y)),
                            list(y=as.name(sumvar))))
  print(outfrm)
}
for(j in 1:2){
sumvar <- paste0("randnum",j)
chkfunc(sumvar)
}
print(outfrm)
Finally, I'll note that while the pull function on !!sumvar showed the resulting class to be numeric (i.e., the same class and values as randfrm$randnum1), I figured out that !!sumvar is treated as a character string (i.e., "randnum1") in both my use of filter and my use of summarise, hence the "argument is not numeric or logical" warning.
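That observation points at the cause: enquo(sumvar) captures the expression sumvar, which evaluates to the string "randnum1", so filter sees !is.na("randnum1") (TRUE for every row) and summarise sees mean("randnum1"). A minimal sketch of one dplyr-based way around it, assuming a dplyr version that provides the .data pronoun; passing the data frame as a dat argument is my own change to keep the example self-contained:

```r
library(dplyr)

set.seed(3)
randfrm <- data.frame(randnum1 = rnorm(10), randnum2 = rnorm(10))
randfrm[randfrm < 0] <- NA  # same idea as the original: negatives become NA

# sumvar is a character string, so look the column up with .data[[ ]]
# instead of quoting and unquoting it.
chkfunc <- function(dat, sumvar) {
  dat %>%
    filter(!is.na(.data[[sumvar]])) %>%
    summarise(mn = mean(.data[[sumvar]]))
}

for (j in 1:2) {
  print(chkfunc(randfrm, paste0("randnum", j)))
}
```

An alternative along the same lines would be to convert the string to a symbol with rlang::sym() and unquote that with !!.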
I am trying to make a simple function which will find and remove outliers automatically. This is the function I have created so far:
fOutlier <- function(x, y) {
  outlier <- with(x, boxplot.stats(y)$out)
  subset(x, !(y %in% outlier))
}
data <- fOutlier(data, variable)
The problem is that the function does not read x as dataset name. It works if I use the following:
data <- fOutlier(data, data$variable)
Non-standard evaluation seems to be the culprit.
This is what I would personally do.
set.seed(1)
# mock data set
d <- data.frame(var1=rnorm(1000,500,50),
                var2=rnorm(1000,1000,100),
                var3=rnorm(1000,1000,100),
                var4=rnorm(1000,1000,100))
fOutlier <- function(dat, var_name){
  # var_name is a string, so standard indexing is enough
  var_vec <- dat[, var_name]
  outliers <- boxplot.stats(var_vec)$out
  # return the rows whose value is not flagged as an outlier
  dat[!(var_vec %in% outliers), ]
}
# test with different variables
d_var1_clean<-fOutlier(d, 'var1')
d_var2_clean<-fOutlier(d, 'var2')
d_var3_clean<-fOutlier(d, 'var3')
If you really like the non-standard evaluation, then you can add eval() and substitute() to maintain this functionality.
This function is a workable version of what you posted (note the creation of y_vec):
fOutlier2 <- function(x, y) {
  y_vec <- eval(substitute(y), eval(x))
  outlier <- boxplot.stats(y_vec)$out
  subset(x, !(y_vec %in% outlier))
}
d_var1_clean2<-fOutlier2(d, var1)
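If dplyr is an option, the same bare-name interface can be sketched with the {{ }} operator instead of eval() and substitute(). This is only an illustration under that assumption; fOutlier_tidy and the toy data are made-up names, not part of the answer above:

```r
library(dplyr)

# Hypothetical dplyr variant: var is given as a bare column name.
fOutlier_tidy <- function(dat, var) {
  dat %>%
    # keep rows whose value is not among the boxplot outliers of that column
    filter(!({{ var }} %in% boxplot.stats({{ var }})$out))
}

# Toy data with one obvious outlier.
toy <- data.frame(var1 = c(1:20, 1000))
cleaned <- fOutlier_tidy(toy, var1)
```

The string-based version above remains the simpler choice when the column name already lives in a character variable.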
I'm working with dplyr and created code to compute new data that is plotted with ggplot.
I want to wrap this code in a function. It should take the name of a column of the data frame that is manipulated by dplyr. However, passing column names does not work. Please consider the minimal example below:
df <- data.frame(A = seq(-5, 5, 1), B = seq(0,10,1))
library(dplyr)
foo <- function (x) {
  df %>%
    filter(x < 1)
}
foo(B)
Error in filter_impl(.data, dots(...), environment()) :
object 'B' not found
Is there any solution to use the name of a column as a function argument?
If you want to create a function which accepts the string "B" as an argument (as in your question's title)
foo_string <- function (x) {
  eval(substitute(df %>% filter(xx < 1), list(xx=as.name(x))))
}
foo_string("B")
If you want to create a function which captures the bare name B as an argument (as dplyr does)
foo_nse <- function (x) {
  # capture the argument without evaluating it
  x <- substitute(x)
  eval(substitute(df %>% filter(xx < 1), list(xx=x)))
}
foo_nse(B)
You can find more information in Advanced R
Edit
dplyr makes things easier as of version 0.3: functions with the suffix "_" accept a string or an expression as an argument. (These underscore verbs were later deprecated, starting with dplyr 0.7, in favour of tidy evaluation.)
foo_string <- function (x) {
  # construct the string
  string <- paste(x,"< 1")
  # use filter_ instead of filter
  df %>% filter_(string)
}
foo_string("B")
foo_nse <- function (x) {
  # capture the argument without evaluating it
  x <- substitute(x)
  # construct the expression
  expression <- lazyeval::interp(quote(xx < 1), xx = x)
  # use filter_ instead of filter
  df %>% filter_(expression)
}
foo_nse(B)
You can find more information in this vignette
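For readers on a current dplyr, where the underscore verbs are deprecated, the two functions can be sketched with tidy evaluation instead. This assumes dplyr >= 1.0 and rlang >= 0.4; the extra dat argument is my addition so the functions do not depend on a global df:

```r
library(dplyr)

df <- data.frame(A = seq(-5, 5, 1), B = seq(0, 10, 1))

# String version: look the column up by name with the .data pronoun.
foo_string <- function(dat, x) {
  dat %>% filter(.data[[x]] < 1)
}

# Bare-name version: embrace the argument with {{ }}.
foo_nse <- function(dat, x) {
  dat %>% filter({{ x }} < 1)
}

res_string <- foo_string(df, "B")
res_nse    <- foo_nse(df, B)
```

Both calls keep only the row where B is below 1.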
I remember a similar question which was answered by @Richard Scriven. I think you need to write something like this.
foo <- function(x,...)filter(x,...)
What @Richard Scriven mentioned was that you need to use ... here. If you type ?filter, you will find the signature filter(.data, ...); you can replace .data with x or any other name. To pick up the rows of your df whose values in B are smaller than 1, it looks like this.
foo <- function (x,...) filter(x,...)
foo(df, B < 1)