x <- 1
test_call <- call(">", x, 0)
eval(test_call)
[1] TRUE
x <- -1
eval(test_call)
[1] TRUE
I have a list of call objects that I want to evaluate from time to time. However, as shown in this example, when test_call is being called the 2nd time, the value of x is not updated within test_call - The 2nd output is expected to be FALSE.
I know I can get around by using parse() instead of call() but parse() really makes the code hard to read... Any workaround if I want to stick with call?
Related
Note: This is separate from, though perhaps similar to, the
deparse-substitute trick
of attaining the name of a passed argument.
Consider the following situation: I have some function to be called, and the return value
is to be assigned to some variable, say x.
Inside the function, how can I capture that the name to be assigned to the
returned value is x, upon calling and assigning the function?
For example:
nameCapture <- function() {
# arbitrary code
captureVarName()
}
x <- nameCapture()
x
## should return some reference to the name "x"
What in R closest approximates captureVarName() referenced in the example?
My intuition was that there would be something in the call stack to do with
assign(), where x would be an argument and could be extracted, but
sys.call() yielded nothing of the sort; does it then occur internally, and if
so, what is a sensible way to attain something like captureVarName()?
My notion is that it would act in a similar manner to how the following works, though without the assign() function, using the <- operator instead:
nameCapture <- function() sys.call(1)[[2]]
assign("x", nameCapture())
x
# [1] "x"
I was wondering if there a way for R to detect the existence or absence of the sign * as used in the following objects?
In other words, can R understand that a has a * sign but b doesn't?
a = 3*4
b = 12
If you keep the expressions unevaluated, R can understand their internal complexity. Under normal circumstances, though, R evaluates expressions immediately, so there is no way to tell the difference between a <- 3*4 and b <- 12 once the assignments have been made. That means that the answer to your specific question is No.
Dealing with unevaluated expressions can get a bit complex, but quote() is one simple way to keep e.g. 3*4 from being evaluated:
> length(quote(3*4))
[1] 3
> length(quote(12))
[1] 1
If you're working inside a function, you can use substitute to retrieve the unevaluated form of the function arguments:
> f <- function(a) {
+ length(substitute(a))
+ }
> f(12)
[1] 1
> f(3*4)
[1] 3
In case you're pursuing this farther, you should be aware that counting complexity might not be as easy as you think:
> f(sqrt(2*3+(7*19)^2))
[1] 2
What's going on is that R stores expressions as a tree; the top level here is made up of sqrt and <the rest of the expression>, which has length 2. If you want to measure complexity you'll need to do some kind of collapsing or counting down the branches of the tree ...
Furthermore, if you first assign a <- 3*4 and then call f(a) you get 1, not 3, because substitute() gives you back just the symbol a, which has length 1 ... the information about the difference between "12" and "3*4" gets lost as soon as the expression is evaluated, which happens when the value is assigned to the symbol a. The bottom line is that you have to be very careful in controlling when expressions get evaluated, and it's not easy.
Hadley Wickham's chapter on expressions might be a good place to read more.
I am looping over an estimation and saving all the estimation objects and then pick the one with the lowest deviance. For this, I wanted to use the Filter/Map/Position functions, but I could not find a solution, because it always returns the first object of the list, instead of the second. I probably misunderstand something about how the Position function works, but would like to know what I missed.
MWE:
ls<-list(3,2,4)
Position(min,Map(function(x) {x^2}, ls))
I ended up using unlist and which.min
You ended up doing the correct procedure. Why? First, we'll pull up a salient extract from the help page:
Position(f, x, right = FALSE, nomatch = NA_integer_)
Find and Position are patterned after Common Lisp's find-if and position-if, respectively. If there is an element for which the predicate function gives true, then the first or last such element or its position is returned depending on whether right is false (default) or true, respectively. If there is no such element, the value specified by nomatch is returned. The current implementation is not optimized for performance.
So, Position() is going to apply f() to all elements of x and when the result of f() is TRUE (directly or via coercion) Position() will return the index of that element (if nothing ends up being TRUE then it returns the value assigned to nomatch).
Here's the actual Position() function source:
Position <- function (f, x, right=FALSE, nomatch=NA_integer_) {
ind <- seq_along(x)
if (right) ind <- rev(ind)
for (i in ind) {
if (f(x[[i]])) return(i)
}
nomatch
}
Following the source you can prbly see that min() is getting called on the first element of the list and it returns the value that the min() of the vector at list position 1 and that value is non-zero so it thinks it did a good job and returns the list index it was at.
If you had been doing:
Position(min, Map(function(x) {x-3}, dat))
then you would have seen the result be:
## [1] 2
and possibly have thought it was working, but it's only returning that since the first element of the list of 3 and 3-3 == 0 and 0 is coerced to FALSE.
NOTE: ls is also the name of a base function which is fine since R knows what to do based on usage context but I don't like potentially causing weird errors down the road in function calls by crushing very common core namespace elements so I used dat instead of ls.
The idea behind Position() is more for something like:
eqls4 <- function(x) x==4
Position(eqls4, Map(function(x) {x^2}, dat))
which does return:
## [1] 2
So, what you ended up doing was 100% correct.
Note also that the purrr package provides alternative functional idioms that (IMO) tend to be more readable + have ways of ensuring proper types are maintained and it exports %>% so piping is also readily available:
library(purrr)
map(dat, ~.^2) %>%
flatten_dbl() %>%
which.min()
I saw:
“To understand computations in R, two slogans are helpful:
• Everything that exists is an object.
• Everything that happens is a function call."
— John Chambers
But I just found:
a <- 2
is.object(a)
# FALSE
Actually, if a variable is a pure base type, it's result is.object() would be FALSE. So it should not be an object.
So what's the real meaning about 'Everything that exists is an object' in R?
The function is.object seems only to look if the object has a "class" attribute. So it has not the same meaning as in the slogan.
For instance:
x <- 1
attributes(x) # it does not have a class attribute
NULL
is.object(x)
[1] FALSE
class(x) <- "my_class"
attributes(x) # now it has a class attribute
$class
[1] "my_class"
is.object(x)
[1] TRUE
Now, trying to answer your real question, about the slogan, this is how I would put it. Everything that exists in R is an object in the sense that it is a kind of data structure that can be manipulated. I think this is better understood with functions and expressions, which are not usually thought as data.
Taking a quote from Chambers (2008):
The central computation in R is a function call, defined by the
function object itself and the objects that are supplied as the
arguments. In the functional programming model, the result is defined
by another object, the value of the call. Hence the traditional motto
of the S language: everything is an object—the arguments, the value,
and in fact the function and the call itself: All of these are defined
as objects. Think of objects as collections of data of all kinds. The data contained and the way the data is organized depend on the class from which the object was generated.
Take this expression for example mean(rnorm(100), trim = 0.9). Until it is is evaluated, it is an object very much like any other. So you can change its elements just like you would do it with a list. For instance:
call <- substitute(mean(rnorm(100), trim = 0.9))
call[[2]] <- substitute(rt(100,2 ))
call
mean(rt(100, 2), trim = 0.9)
Or take a function, like rnorm:
rnorm
function (n, mean = 0, sd = 1)
.Call(C_rnorm, n, mean, sd)
<environment: namespace:stats>
You can change its default arguments just like a simple object, like a list, too:
formals(rnorm)[2] <- 100
rnorm
function (n, mean = 100, sd = 1)
.Call(C_rnorm, n, mean, sd)
<environment: namespace:stats>
Taking one more time from Chambers (2008):
The key concept is that expressions for evaluation are themselves
objects; in the traditional motto of the S language, everything is an
object. Evaluation consists of taking the object representing an
expression and returning the object that is the value of that
expression.
So going back to our call example, the call is an object which represents another object. When evaluated, it becomes that other object, which in this case is the numeric vector with one number: -0.008138572.
set.seed(1)
eval(call)
[1] -0.008138572
And that would take us to the second slogan, which you did not mention, but usually comes together with the first one: "Everything that happens is a function call".
Taking again from Chambers (2008), he actually qualifies this statement a little bit:
Nearly everything that happens in R results from a function call.
Therefore, basic programming centers on creating and refining
functions.
So what that means is that almost every transformation of data that happens in R is a function call. Even a simple thing, like a parenthesis, is a function in R.
So taking the parenthesis like an example, you can actually redefine it to do things like this:
`(` <- function(x) x + 1
(1)
[1] 2
Which is not a good idea but illustrates the point. So I guess this is how I would sum it up: Everything that exists in R is an object because they are data which can be manipulated. And (almost) everything that happens is a function call, which is an evaluation of this object which gives you another object.
I love that quote.
In another (as of now unpublished) write-up, the author continues with
R has a uniform internal structure for representing all objects. The evaluation process keys off that structure, in a simple form that is essentially
composed of function calls, with objects as arguments and an object as the
value. Understanding the central role of objects and functions in R makes
use of the software more effective for any challenging application, even those where extending R is not the goal.
but then spends several hundred pages expanding on it. It will be a great read once finished.
Objects For x to be an object means that it has a class thus class(x) returns a class for every object. Even functions have a class as do environments and other objects one might not expect:
class(sin)
## [1] "function"
class(.GlobalEnv)
## [1] "environment"
I would not pay too much attention to is.object. is.object(x) has a slightly different meaning than what we are using here -- it returns TRUE if x has a class name internally stored along with its value. If the class is stored then class(x) returns the stored value and if not then class(x) will compute it from the type. From a conceptual perspective it matters not how the class is stored internally (stored or computed) -- what matters is that in both cases x is still an object and still has a class.
Functions That all computation occurs through functions refers to the fact that even things that you might not expect to be functions are actually functions. For example when we write:
{ 1; 2 }
## [1] 2
if (pi > 0) 2 else 3
## [1] 2
1+2
## [1] 3
we are actually making invocations of the {, if and + functions:
`{`(1, 2)
## [1] 2
`if`(pi > 0, 2, 3)
## [1] 2
`+`(1, 2)
## [1] 3
Consider the following simple function:
f <- function(x, value){print(x);print(substitute(value))}
Argument x will eventually be evaluated by print, but value never will. So we can get results like this:
> f(a, a)
Error in print(x) : object 'a' not found
> f(3, a)
[1] 3
a
> f(1+1, 1+1)
[1] 2
1 + 1
> f(1+1, 1+"one")
[1] 2
1 + "one"
Everything as expected.
Now consider the same function body in a replacement function:
'g<-' <- function(x, value){print(x);print(substitute(value))}
(the single quotes should be fancy quotes)
Let's try it:
> x <- 3
> g(x) <- 4
[1] 3
[1] 4
Nothing unusual so far...
> g(x) <- a
Error: object 'a' not found
This is unexpected. Name a should be printed as a language object.
> g(x) <- 1+1
[1] 4
1 + 1
This is ok, as x's former value is 4. Notice the expression passed unevaluated.
The final test:
> g(x) <- 1+"one"
Error in 1 + "one" : non-numeric argument to binary operator
Wait a minute... Why did it try to evaluate this expression?
Well the question is: bug or feature? What is going on here? I hope some guru users will shed some light about promises and lazy evaluation on R. Or we may just conclude it's a bug.
We can reduce the problem to a slightly simpler example:
g <- function(x, value)
'g<-' <- function(x, value) x
x <- 3
# Works
g(x, a)
`g<-`(x, a)
# Fails
g(x) <- a
This suggests that R is doing something special when evaluating a replacement function: I suspect it evaluates all arguments. I'm not sure why, but the comments in the C code (https://github.com/wch/r-source/blob/trunk/src/main/eval.c#L1656 and https://github.com/wch/r-source/blob/trunk/src/main/eval.c#L1181) suggest it may be to make sure other intermediate variables are not accidentally modified.
Luke Tierney has a long comment about the drawbacks of the current approach, and illustrates some of the more complicated ways replacement functions can be used:
There are two issues with the approach here:
A complex assignment within a complex assignment, like
f(x, y[] <- 1) <- 3, can cause the value temporary
variable for the outer assignment to be overwritten and
then removed by the inner one. This could be addressed by
using multiple temporaries or using a promise for this
variable as is done for the RHS. Printing of the
replacement function call in error messages might then need
to be adjusted.
With assignments of the form f(g(x, z), y) <- w the value
of z will be computed twice, once for a call to g(x, z)
and once for the call to the replacement function g<-. It
might be possible to address this by using promises.
Using more temporaries would not work as it would mess up
replacement functions that use substitute and/or
nonstandard evaluation (and there are packages that do
that -- igraph is one).
I think the key may be found in this comment beginning at line 1682 of "eval.c" (and immediately followed by the evaluation of the assignment operation's RHS):
/* It's important that the rhs get evaluated first because
assignment is right associative i.e. a <- b <- c is parsed as
a <- (b <- c). */
PROTECT(saverhs = rhs = eval(CADR(args), rho));
We expect that if we do g(x) <- a <- b <- 4 + 5, both a and b will be assigned the value 9; this is in fact what happens.
Apparently, the way that R ensures this consistent behavior is to always evaluate the RHS of an assignment first, before carrying out the rest of the assignment. If that evaluation fails (as when you try something like g(x) <- 1 + "a"), an error is thrown and no assignment takes place.
I'm going to go out on a limb here, so please, folks with more knowledge feel free to comment/edit.
Note that when you run
'g<-' <- function(x, value){print(x);print(substitute(value))}
x <- 1
g(x) <- 5
a side effect is that 5 is assigned to x. Hence, both must be evaluated. But if you then run
'g<-'(x,10)
both the values of x and 10 are printed, but the value of x remains the same.
Speculation:
So the parser is distinguishing between whether you call g<- in the course of making an actual assignment, and when you simply call g<- directly.