I tried to create a function and then immediately call it.
function(x){x+1}(3)
This produces some strange result. Fortunately, I already know where I went wrong. I should have let the function statement be evaluated first before attempting to call it.
(function(x){x+1})(3)
# 4
However, I am confused as to what the first line of code actually evaluates to. Is someone able to explain what is going on in the R code below?
a <- function(x){x+1}(3)
a
# function(x){x+1}(3)
class(a)
# [1] "function"
a(3)
# Error in a(3) : attempt to apply non-function
a()
# Error in a() : argument "x" is missing, with no default
(a)
# function(x){x+1}(3)
# <bytecode: 0x128b52c50>
# everything in brackets on the right don't seem to be evaluated
function(x){x+1}(1)(2)(a,b,c)[1:4,d:5,,,][seq_along(letters)]
# function(x){x+1}(1)(2)(a,b,c)[1:4,d:5,,,][seq_along(letters)]
(function(x){x+1}(1)(2)(a,b,c)[1:4,d:5,,,][seq_along(letters)])
# function(x){x+1}(1)(2)(a,b,c)[1:4,d:5,,,][seq_along(letters)]
((function(x){x+1}(1)(2)(a,b,c)[1:4,d:5,,,][seq_along(letters)]))
# function(x){x+1}(1)(2)(a,b,c)[1:4,d:5,,,][seq_along(letters)]
Actually the curly braces are a call, like a function.
The first line is equivalent to
function(x)`{`(x+1)()
As the function is not evaluated, it is not known wether the bracket "call" return a function, thus is valid code.
For example:
{mean}(1:2)
# 1.5
`{`(mean)(1:2)
# 1.5
When you run a function definition but do not evaluate it, you actually see the function definition (formals and body) as output.
See ?"{" for more detail.
Related
I am trying to understand how first class functions work in R. I had understood that functions were first class in R, but was sorely disappointed when applying that understanding. When a function is saved to a list, whether that be as an ordinary list, a vector or a dictionary style list or vector, it is no longer callable, leading to the following error:
Error: attempt to apply non-function
e.g.
print_func <- function() {
print('hi')
}
print_func()
[1] "hi"
my_list = list(print_func)
my_list[0]()
Error: attempt to apply non-function
my_vector = c(print_func)
my_vector[0]()
Error: attempt to apply non-function
my_map <- c("a" = print_func)
my_map["a"]()
Error: attempt to apply non-function
So why is this? Does R not actually treat functions as first class members in all cases, or is there another reason why this occurs?
I see that R vectors also do unexpected things (for me - perhaps not for experienced R users) to nested arrays:
nested_vector <- c("a" = c("b" = 1))
nested_vector["a"]
<NA>
NA
nested_vector["a.b"]
a.b
1
Here it makes sense to me that "a.b" might reference the sub-member of the key "b" under the key "a". But apparently that logic goes out the window when trying to call the upper level key "a".
R is 1-based; so, you refer to the first element of a vector using index 1 (and not 0 like in python).
There are two approaches to accessing list elements:
accessing list elements while keeping a list (return a list containing the desired elements)
pulling an element out of a list
In the first case, the subsetting is done using a single pair of brackets ([]) and you will always get a list back. Note that this is different from python where you get a list only if you select more than one element (lst = [fun1, fun2]; lst[0] return fun1 and not a one-element list like R while lst[0:2] returns a list).
In the second approach, the subsetting is done using a double pair of brackets ([[]]). you basically pull an element completely out of a list; more like subsetting one element out of a list in python.
print_func <- function() {
print('hi')
}
print_func()
my_list = list(print_func)
mode(my_list[1]) # return a list (not a function); so, it's not callable
[1] "list"
mode(my_list[[1]]) # return a function; so, it's callable
[1] "function"
my_list[1]() # error
my_list[[1]]() # works
[1] "hi"
#
my_vector = c(print_func)
mode(my_vector) # a list, so, not callable
[1] "list"
my_vector[1]() # error because it returns a list and not a function
my_vector[[1]]() # works
[1] "hi"
When subsetting with names, the same logic of single and double pair of brackets applies
my_map <- c("a" = print_func)
mode(my_map) # list, so, not callable
[1] "list"
my_map["a"]() # error
my_map[["a"]]() # works
[1] "hi"
Limey pointed out my 2 issues in the comments. I was using a 0-index and I was using single brackets. If I use a 1-index and double brackets it works, and functions are treated as first class variables.
My issue is resolved, and hopefully I won't make that same mistake again.
I'd like to wrap around the checkmate library's qassert to check multiple variable's specification at a time. Importantly, I'd like assertion errors to still report the variable name that's out of spec.
So I create checkargs to loop through input arguments. But to get the variable passed on to qassert, I use the same code for each loop -- that ambigious code string gets used for for the error message instead of the problematic variable name.
qassert() (via vname()) is getting what to display in the assertion error like deparse(eval.parent(substitute(substitute(x))). Is there any way to box up get(var) such that that R will see e.g. 'x' on deparse instead?
At least one work around is eval(parse()). But something like checkargs(x="n', system('echo malicious',intern=T),'") has me hoping for an alternative.
checkargs <- function(...) {
args<-list(...)
for(var in names(args))
checkmate::qassert(get(var,envir=parent.frame()),args[[var]])
# scary string interpolation alternative
#eval(parse(text=paste0("qassert(",var,",'",args[[var]], "')")),parent.frame())
}
test_checkargs <- function(x, y) {checkargs(x='b',y='n'); print(y)}
# checkargs is working!
test_checkargs(T, 1)
# [1] 1
# but the error message isn't helpful.
test_checkargs(1, 1)
# Error in checkargs(x = "b", y = "n") :
# Assertion on 'get(var, envir = parent.frame())' failed. Must be of class 'logical', not 'double'.
#
# want:
# Assertion on 'x' failed. ...
substitute() with as.name seems to do the trick. This still uses eval but without string interpolation.
eval(substitute(
qassert(x,spec),
list(x=as.name(var),
spec=args[[var]])),
envir=parent.frame())
I am not sure if it is possible, but I would like to be able to grab the default argument values of a function and test them and the code within my functions without having to remove the commas (this is especially useful in the case when there are many arguments).
In effect, I want to be able to have commas when sending arguments into the function but not have those commas if I copy and paste the arguments and run them by themselves.
For example:
foo=function(
x=1,
y=2,
z=3
) {
bar(x,y,z)
}
Now to test pieces of the function outside of the code block, copy and paste
x=1,
y=2,
z=3
bar(x,y,z)
But this gives an error because there is a comma after x=1
Perhaps I am not asking the right question. If this is strange, what is the preferred method for debugging functions?
Please note, just posted nearly identical question in Julia.
If you want to programmatically get at the arguments of a function and their default values, you can use formals:
fxn = function(a, b, d = 2) {}
formals(fxn)
# $a
#
#
# $b
#
#
# $d
# [1] 2
I suppose if you wanted to store the default value of every argument to your function into a variable of that name (which is what you're asking about in the OP), then you could do this with a for loop:
info <- formals(fxn)
for (varname in names(info)) {
assign(varname, info[[varname]])
}
a
# Error: argument "a" is missing, with no default
b
# Error: argument "b" is missing, with no default
d
# [1] 2
So for arguments without default values, you'll need to provide a value after running this code (as you would expect). For arguments with defaults, they'll now be set.
What is the difference between an expression and a call?
For instance:
func <- expression(2*x*y + x^2)
funcDx <- D(func, 'x')
Then:
> class(func)
[1] "expression"
> class(funcDx)
[1] "call"
Calling eval with envir list works on both of them. But Im curious what is the difference between the two class, and under what circumstances should I use expression or call.
You should use expression when you want its capacity to hold more than one expression or call. It really returns an "expression list". The usual situation for the casual user of R is in forming arguments to ploting functions where the task is forming symbolic expressions for labels. R expression-lists are lists with potentially many items, while calls never are such. It's interesting that #hadley's Advanced R Programming suggests "you'll never need to use [the expression function]": http://adv-r.had.co.nz/Expressions.html. Parenthetically, the bquote function is highly useful, but has the limitation that it does not act on more than one expression at a time. I recently hacked a response to such a problem about parsing expressions and got the check, but I thought #mnel's answer was better: R selectively style plot axis labels
The strategy of passing an expression to the evaluator with eval( expr, envir= < a named environment or list>) is essentially another route to what function is doing. A big difference between expression and call (the functions) is that the latter expects a character object and will evaluate it by looking for a named function in the symbol table.
When you say that processing both with the eval "works", you are not saying it produces the same results, right? The D function (call) has additional arguments that get substituted and restrict and modify the result. On the other hand evaluation of the expression-object substitutes the values into the symbols.
There seem to be "levels of evaluation":
expression(mean(1:10))
# expression(mean(1:10))
call("mean" , (1:10))
# mean(1:10)
eval(expression(mean(1:10)))
# [1] 5.5
eval(call("mean" , (1:10)))
# [1] 5.5
One might have expected eval(expression(mean(1:10))) to return just the next level of returning a call object but it continues to parse the expression tree and evaluate the results. In order to get just the unevaluated function call to mean, I needed to insert a quote:
eval(expression(quote(mean(1:10))))
# mean(1:10)
From the documentation (?expression):
...an R expression vector is a list of calls, symbols etc, for example as returned by parse.
Notice:
R> class(func[[1]])
[1] "call"
When given an expression, D acts on the first call. If func were simply a call, D would work the same.
R> func2 <- substitute(2 * x * y + x^2)
R> class(func2)
[1] "call"
R> D(func2, 'x')
2 * y + 2 * x
Sometimes for the sake of consistency, you might need to treat both as expressions.
in this case as.expression comes in handy:
func <- expression(2*x*y + x^2)
funcDx <- as.expression(D(func, 'x'))
> class(func)
[1] "expression"
> class(funcDx)
[1] "expression"
I run the following code
sapply( 0:3, function(x){ls(envir = sys.frame(x))} )
And get the following result
[[1]]
[1] "mat" "mat_inverse"
[[2]]
[1] "FUN" "simplify" "USE.NAMES" "X"
[[3]]
[1] "FUN" "X"
[[4]]
[1] "x"
It seems like it lists all the objects in the current enclosing environment; I do have mat and mat_inverse as two variables. But I am not sure what it returns for [[2]], [[3]], [[4]]. Is there a way to debug this code to track what this code does? Especially the following part:
envir = sys.frame(x)
is very confusing to me.
sys.frame allows you to go back through the calling stack. sys.frame(0) is the beginning of the stack (your initial workspace, so to speak). sys.frame(1) is nested one level deep, sys.frame(2) is nested two levels deep etc.
This code is a good demonstration of what happens when you call sapply. It goes through four environments (numbered 0-3) and prints the objects in each. sapply is in fact a wrapper around lapply. What environments do you get when you actually call this code?
Environment 0 is the beginning, i.e., your entire workspace.
Environment 1 is sapply. Type sapply to see its code. You'll see that the function header has simplify, one of the variables you see in [[2]].
Environment 2 is lapply. Once again, type lapply to see its code; the function header contains FUN and X.
Environment 3 is the function you defined for sapply to run. It only has one variable, x.
As an experiment, run
sapply(0:3, function(x) { howdy = 5; ls(envir = sys.frame(x)) } )
The last line will change to [1] "howdy" "x", because you defined a new variable within that final environment (the function inside lapply inside sapply).