I would like to delay the evaluation of a function argument in R. Example:
my_func <- function(FUN){print(FUN); print(FUN)}
my_func(runif(1))
#> [1] 0.2833882
#> [1] 0.2833882
Created on 2019-07-21 by the reprex package (v0.2.1)
This works as documented because runif(1) is only evaluated once and its results printed twice.
Instead, I don't want runif(1) to be evaluated until it is within each print() statement. This would generate two different random numbers.
In other words, I don't want FUN to "resolve" --- if that is the right word --- to runif(1) until we are within a print() statement.
You can also achieve this with substitute and eval:
my_func <- function(FUN) {
print(eval(substitute(FUN)))
print(eval(substitute(FUN)))
}
my_func(runif(1))
#> [1] 0.09973534
#> [1] 0.8096205
my_func(runif(1))
#> [1] 0.2231202
#> [1] 0.5386637
NB: For additional details, check out this chapter Non-standard evaluation of Advanced R
Here is one trick with match.call and eval
my_func <- function(FUN){
print(eval(match.call()[[2]]))
print(eval(match.call()[[2]]))
}
my_func(runif(1))
#[1] 0.7439711
#[1] 0.5011816
my_func(runif(1))
#[1] 0.7864152
#[1] 0.730453
provide and expression
f = function(EXPR){
print(EXPR)
eval(EXPR)
}
EXPR = expression(runif(1))
> f(EXPR)
expression(runif(1))
[1] 0.1761139
provide an string
f2 = function(STR){
print(STR)
eval(parse(text = STR))
}
STR = "runif(1)"
> f2(STR)
[1] "runif(1)"
[1] 0.7630865
Related
I made almost the same question in another post, but asking just for column name, and received a perfect solution for that need. Now what I need is the variable full name. I reformulate here.
I use 'deparse(substitute(x))' from inside my function to get variable name passed as parameter. It works great... but not with 'lapply'
myfun <- function(x)
{
return(deparse(substitute(x)))
}
a <- c(1,2,3)
b <- c(4,5,5)
df<-data.frame(a,b)
myfun(df$a)
[1] "df$a"
but, with 'lapply'...
lapply(df, myfun)
$a
[1] "X[[i]]"
$b
[1] "X[[i]]"
How can I get the variable name inside 'lapply'?
Thanks
When you pass a data frame to lapply, it iterates through the columns by numerical indexing using the double square bracket, not name indexing using the $ accessor. It is equivalent to using the following loop:
X <- df
result <- list()
for(i in seq_along(X)) {
result[[i]] <- myfun(X[[i]])
}
names(result) <- names(X)
result
#> $a
#> [1] "X[[i]]"
#>
#> $b
#> [1] "X[[i]]"
So a simple deparse(substitute(x)) will not work inside lapply. You are not recovering the column name, but rather would need to reconstruct it from the call stack. This is full of caveats and gotchas, but a (relatively) simple approach would be:
myfun <- function(x) {
stack <- lapply(sys.calls(), function(x) sapply(as.list(x), deparse))
if(stack[[length(stack)]][1] == 'myfun') {
return(stack[[length(stack)]][2])
}
if(stack[[length(stack)]][1] == 'FUN') {
return(paste0(stack[[length(stack) - 1]][2], '$',
eval(quote(names(X)[i]), parent.frame())))
}
deparse(substitute(x))
}
This means your function will still work if called directly:
myfun(df$a)
#> [1] "df$a"
But will also work within lapply
lapply(df, myfun)
#> $a
#> [1] "df$a"
#>
#> $b
#> [1] "df$b"
lapply(iris, myfun)
#> $Sepal.Length
#> [1] "iris$Sepal.Length"
#>
#> $Sepal.Width
#> [1] "iris$Sepal.Width"
#>
#> $Petal.Length
#> [1] "iris$Petal.Length"
#>
#> $Petal.Width
#> [1] "iris$Petal.Width"
#>
#> $Species
#> [1] "iris$Species"
It is specifically written to cover direct use or use within lapply. If you wanted to expand its use to work within other functional calls like Map or the various purrr mapping functions, then these would have to be covered specifically by their own if clauses.
Here is another solution, its a bit verbose and Allen's solution is much better:
myfun <- function(x) {
pf <- parent.frame()
x_nm <- deparse(substitute(x))
frame_n <- sys.nframe()
apply <- FALSE
while(frame_n > 0) {
cl <- as.list(sys.call(frame_n))
if (grepl("apply", cl[[1]])) {
x_obj <- cl[[2]]
apply <- TRUE
break
}
frame_n <- frame_n - 1L
}
if (apply) {
idx <- parent.frame()$i[]
obj <- get(x_obj, envir = pf)
if (!is.null(names(obj)[idx])) {
nm_or_idx <- names(obj)[idx]
} else {
nm_or_idx <- idx
}
x_nm <- paste0(x_obj, '$', nm_or_idx)
}
return(x_nm)
}
myfun(df$a)
#> [1] "df$a"
lapply(df, myfun)
#> $a
#> [1] "df$a"
#>
#> $b
#> [1] "df$b"
Created on 2023-02-09 by the reprex package (v2.0.1)
We can define a character string 'col_name'to take the name of the data frame column in the function. For example, if col_name is "a", df[[col_name]] extracts "a" column from data frame.Then we can use the paste() function to concatenate the string 'df$' and 'col_name':
myfun <- function(col_name) {
col <- df[[col_name]]
return(paste("df$", col_name, sep = ""))
}
lapply(colnames(df), myfun)
output
[[1]]
[1] "df$a"
[[2]]
[1] "df$b"
If we would like to assign any data we could do the assignment and then run lapply for example:
df <- iris
lapply(colnames(df), myfun)
output
[[1]]
[1] "df$Sepal.Length"
[[2]]
[1] "df$Sepal.Width"
[[3]]
[1] "df$Petal.Length"
[[4]]
[1] "df$Petal.Width"
[[5]]
[1] "df$Species"
I hope this could helps.
I reduced some problem to the following toy code:
cc<-c("1","2")
ff<-function(x) { list(myname=x)}
aa<-unlist(lapply(cc,ff))
bb<-sapply(cc,ff)
I'd expect aa and bb to be identical, but:
> aa
myname myname
"1" "2"
> bb
$`1.myname`
[1] "1"
$`2.myname`
[1] "2"
I'm aware of the USE.NAMES argument to sapply, however -
it's documented as -
USE.NAMES logical; if TRUE and if X is character, use X as names for
the result unless it had names already.
and so should have no impact in this case,
Internally, it isn't even passed to simplify2array and thus neither to the final unlist.
What's going on here? Could this be an R issue?
Edit: after further investigation it turns out the root cause for the difference is that sapply is essentially equivalent not to
unlist(lapply(cc,ff)
but rather to
unlist(lapply(cc, ff), recursive = FALSE)
(This is the exact internal unlist call).
Look carefully at this:
lapply(cc, ff)
#> [[1]]
#> [[1]]$myname
#> [1] "1"
#>
#>
#> [[2]]
#> [[2]]$myname
#> [1] "2"
The output of lapply itself doesn't have names. Look:
a <- lapply(cc, ff)
names(a)
#> NULL
The output of the lapply is actually an unnamed list. Each element of a is a named list.
names(a[[1]])
#> [1] "myname"
names(a[[2]])
#> [1] "myname"
So in fact, USE.NAMES will apply, and sapply will assign the contents of cc as names for the output of the lapply for which sapply is a thin wrapper as stated in the documentation. It's quite straightforward to follow the code through:
sapply
#> function (X, FUN, ..., simplify = TRUE, USE.NAMES = TRUE)
#> {
#> FUN <- match.fun(FUN)
#> answer <- lapply(X = X, FUN = FUN, ...)
#> if (USE.NAMES && is.character(X) && is.null(names(answer)))
#> names(answer) <- X
#> if (!isFALSE(simplify) && length(answer))
#> simplify2array(answer, higher = (simplify == "array"))
#> else answer
#> }
#> <bytecode: 0x036ae7a8>
#> <environment: namespace:base>
I want to be able to determine if an argument to a function is a call to a function or not. Lets say I have two functions , f() and g():
f <- function() "foo"
g <- function(x){
???
}
I want the output to the calls as below:
g(f())
#> [1] TRUE
g("bar")
#> [1] FALSE
I can get this to work by quoting the function arguments:
f <- function() "foo"
g <- function(x) is.call(x)
g(quote(f()))
#> [1] TRUE
g(quote("bar"))
#> [1] FALSE
However this is sub-optimal as I don't want users of the function to have to do this. Any suggestions?
You can use substitute():
h <- function(x) is.call(substitute(x))
h(f())
# [1] TRUE
Within a function, how can we reliably return an object that contains the function itself?
For example with:
functionBuilder <- function(wordToSay) {
function(otherWordToSay) {
print(wordToSay)
print(otherWordToSay)
get(as.character(match.call()[[1]]))
}
}
I can build a function like so:
functionToRun <- functionBuilder("hello nested world")
... and run it ...
functionToRun("A")
#[1] "hello nested world"
#[1] "A"
#
#function(otherWordToSay) {
# print(wordToSay)
# print(otherWordToSay)
# get(as.character(match.call()[[1]]))
# }
#<environment: 0x1e313678>
... as you can see functionToRun returns itself. However, this approach appears to break if I call functionToRun via sapply:
> sapply(LETTERS, functionToRun)
#[1] "hello nested world"
#[1] "A"
#Error in get(as.character(match.call()[[1]])) : object 'FUN' not found
I can see that this is because the actual call when using sapply is FUN but that FUN doesn't exist at pos = -1 (the default for get). Code that works in that position looks like:
get(as.character(match.call()[[1]]),envir = sys.frame(sys.parent()))
But that same code fails if the function hasn't been called via sapply because sys.frame(sys.parent())) goes too far back and ends up referring to R_GlobalEnv.
From the documentation (R 3.2.2) I'd have expected dynGet to perhaps solve the issue of only going as far back in the stack as needed. Although this works for an sapply call of the function, it fails when the function is called on its own. (Besides, it is marked as 'somewhat experimental'). Inversely getAnywhere seems promising, but doesn't seem to work for the sapply called function.
Is there a reliable way to return the function that is currently being processed, i.e. works for both a bare and sapply wrapped function call?
What I'm doing right now is wrapping the attempt to grab the function in a tryCatch; but I'm a little uncertain whether I can trust that get(as.character(match.call()[[1]]),envir = sys.frame(sys.parent())) will work in all wrapping cases (not just sapply). So, I'm looking for a more reasonable way to approach this problem.
Potentially Related Questions:
How to access a variable stored in a function in R
How to get the name of the calling function inside the called routine?
I can't guarantee that this will work in all cases, but it looks okay:
fun <- function(x) {
print(x)
y <- exp(x)
print(y)
sys.function(0)
}
fun(1)
# [1] 1
# [1] 2.718282
# function(x) {
# print(x)
# y <- exp(x)
# print(y)
# sys.function(0)
# }
lapply(1:5, fun)[[3]]
# [1] 1
# [1] 2.718282
# [1] 2
# [1] 7.389056
# [1] 3
# [1] 20.08554
# [1] 4
# [1] 54.59815
# [1] 5
# [1] 148.4132
# function(x) {
# print(x)
# y <- exp(x)
# print(y)
# sys.function(0)
# }
Of course, I don't understand what you need this for.
Let's say we have a statement that produces integer(0), e.g.
a <- which(1:3 == 5)
What is the safest way of catching this?
That is R's way of printing a zero length vector (an integer one), so you could test for a being of length 0:
R> length(a)
[1] 0
It might be worth rethinking the strategy you are using to identify which elements you want, but without further specific details it is difficult to suggest an alternative strategy.
If it's specifically zero length integers, then you want something like
is.integer0 <- function(x)
{
is.integer(x) && length(x) == 0L
}
Check it with:
is.integer0(integer(0)) #TRUE
is.integer0(0L) #FALSE
is.integer0(numeric(0)) #FALSE
You can also use assertive for this.
library(assertive)
x <- integer(0)
assert_is_integer(x)
assert_is_empty(x)
x <- 0L
assert_is_integer(x)
assert_is_empty(x)
## Error: is_empty : x has length 1, not 0.
x <- numeric(0)
assert_is_integer(x)
assert_is_empty(x)
## Error: is_integer : x is not of class 'integer'; it has class 'numeric'.
Maybe off-topic, but R features two nice, fast and empty-aware functions for reducing logical vectors -- any and all:
if(any(x=='dolphin')) stop("Told you, no mammals!")
Inspired by Andrie's answer, you could use identical and avoid any attribute problems by using the fact that it is the empty set of that class of object and combine it with an element of that class:
attr(a, "foo") <- "bar"
identical(1L, c(a, 1L))
#> [1] TRUE
Or more generally:
is.empty <- function(x, mode = NULL){
if (is.null(mode)) mode <- class(x)
identical(vector(mode, 1), c(x, vector(class(x), 1)))
}
b <- numeric(0)
is.empty(a)
#> [1] TRUE
is.empty(a,"numeric")
#> [1] FALSE
is.empty(b)
#> [1] TRUE
is.empty(b,"integer")
#> [1] FALSE
if ( length(a <- which(1:3 == 5) ) ) print(a) else print("nothing returned for 'a'")
#[1] "nothing returned for 'a'"
On second thought I think any is more beautiful than length(.):
if ( any(a <- which(1:3 == 5) ) ) print(a) else print("nothing returned for 'a'")
if ( any(a <- 1:3 == 5 ) ) print(a) else print("nothing returned for 'a'")
You can easily catch integer(0) with function identical(x,y)
x = integer(0)
identical(x, integer(0))
[1] TRUE
foo = function(x){identical(x, integer(0))}
foo(x)
[1] TRUE
foo(0)
[1] FALSE
another option is rlang::is_empty (useful if you're working in the tidyverse)
The rlang namespace does not seem to be attached when attaching the tidyverse via library(tidyverse) - in this case you use purrr::is_empty, which is just imported from the rlang package.
By the way, rlang::is_empty uses user Gavin's approach.
rlang::is_empty(which(1:3 == 5))
#> [1] TRUE
isEmpty() is included in the S4Vectors base package. No need to load any other packages.
a <- which(1:3 == 5)
isEmpty(a)
# [1] TRUE