Checking for existence of a variable - r

I am really struggling to understand the following behaviour of R. Let's say we want to define a function f, which is supposed to return whether its argument exists as a variable; but we want to pass the argument without quotes. So for example to check whether variable y exists, we would call f(y).
f <- function(x) {
  xchar <- deparse(substitute(x))
  exists(xchar)
}
So I start a brand new R session and define f, but no other variables. I then get
f(y)
# [1] FALSE
f(z)
# [1] FALSE
f(f)
# [1] TRUE
f(x)
# [1] TRUE
The first three calls (on y, z, f) give the expected result. But there is no variable named x
exists("x")
# [1] FALSE
EDIT: I now realise that this is because of the use of substitute, which will create the variable x. But is there a way around this?

The object x does exist inside the function since it is the name of the parameter.
If you modify the function to take ... instead of a named parameter,
f <- function(...) {
  xchar <- deparse(substitute(...))
  exists(xchar)
}
you can see the expected output:
f(x)
# FALSE
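The reason the original version returns TRUE for f(x) can be checked directly: the parameter is bound in the function's own frame, and exists() looks there first. A minimal sketch, where show_local and some_undefined_name are made-up names used only for illustration:

```r
# Hypothetical helper: reports whether a binding called "x" exists in the
# function's own frame, without looking in any enclosing environment.
show_local <- function(x) {
  exists("x", inherits = FALSE)
}

# The argument is never evaluated, so passing an undefined name is harmless.
# The result is TRUE because the parameter itself is the binding being found.
result <- show_local(some_undefined_name)
print(result)
```

Renaming the parameter only moves the problem to a different name, which is why the ... version avoids it.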

You may want to just search the global environment
f <- function(x) {
  xchar <- deparse(substitute(x))
  exists(xchar, where = globalenv())
}
in which case you get:
> f(y)
[1] FALSE
> f(f)
[1] TRUE
> f(x)
[1] FALSE
> f(z)
[1] FALSE
> f(mean)
[1] TRUE
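If the check should also see variables local to whichever function calls it, one possible variation is to start the lookup in the caller's environment rather than the global one. This is only a sketch; exists_in_caller, g and the variable names are hypothetical:

```r
# Sketch: look the name up starting from the caller's environment, so the
# parameter name inside this function cannot shadow the result.
exists_in_caller <- function(x) {
  xchar <- deparse(substitute(x))
  caller <- parent.frame()
  exists(xchar, envir = caller)
}

# Hypothetical caller with a local variable that is not visible globally.
g <- function() {
  local_value <- 1
  exists_in_caller(local_value)
}

found_local <- g()
found_missing <- exists_in_caller(name_not_defined_anywhere)
print(c(found_local, found_missing))
```

Because exists() inherits by default, names defined in the global environment or in attached packages are still found this way.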

Related

Using R, how to get the environment object from the environment string?

Let's say I create a function:
x = function() {
  print(environmentName(environment()))
  envir = environment()
  print(envir)
  print(environmentName(envir))
  print(environmentName(environment()))
  print(environmentName(as.environment(environment())))
  print(environmentName(parent.env(environment())))
  envir
}
On my computer it prints out a unique hash of the environment in only one instance. That leads to a fundamental question about what an envir is and how the function environmentName(envir) actually works.
For example,
( env = x() );
outputs:
> ( env = x() );
[1] ""
<environment: 0x000001fec97c3eb0>
[1] ""
[1] ""
[1] ""
[1] "R_GlobalEnv"
<environment: 0x000001fec97c3eb0>
and reports an "" EMPTY environment name:
> environmentName(env)
[1] ""
I was expecting a string such as "0x000001fec97c3eb0". I guess every time I call the function x I get a new environment.
> environmentName(x)
[1] ""
> environmentName(environment(x))
[1] "R_GlobalEnv"
New environments
Maybe unless it is "FIXED" to a package or attached as an environment?
From (How to get environment of a variable in R)
a <- new.env(parent=emptyenv())
a$x <- 3
attach(a)
b <- new.env(parent=emptyenv())
b$x <- 4
This yields:
environmentName(a);
environmentName(b);
both EMPTY "" ???
It does seem to work on the following, called from GLOBAL:
> environmentName(environment());
[1] "R_GlobalEnv"
So there appears to be a fundamental question:
Fundamental Question: How exactly does environmentName work?
But that was not my primary question. The above example is a function that takes an envir as input and returns a string. How can I reverse that process? That is, if I have the string name R_GlobalEnv, how do I return the envir R object?
Primary Question: Is there a function that reverses the logic of environmentName?
That is, if I have a stringname of the environmentName, what function do I call to reverse it? Or what function can be written to enhance the R experience?
e.g.,
env.toName = function(envir=environment()) { environmentName(envir); }
env.fromName = function(envirstr) { XXVXX(envirstr); }
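Not all environments have names, which is why you see "" sometimes. Apart from a few special cases (the global, base and empty environments, plus package and namespace environments), environmentName(env) just returns the "name" attribute of env.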
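A short sketch of that behaviour, assuming a fresh session; make_env is a hypothetical helper that simply returns its own evaluation frame:

```r
# Hypothetical helper: each call returns the frame created for that call.
make_env <- function() environment()

e1 <- make_env()
e2 <- make_env()

same_env <- identical(e1, e2)    # each call creates a distinct environment
unnamed <- environmentName(e1)   # function frames carry no name

# A few built-in environments report fixed names.
builtin_names <- c(
  environmentName(globalenv()),
  environmentName(baseenv()),
  environmentName(emptyenv())
)

# Setting the "name" attribute is all it takes to give an environment a name.
attr(e1, "name") <- "my_frame"
named <- environmentName(e1)

print(list(same_env, unnamed, builtin_names, named))
```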
There is no standard function that can take the name and return the environment: to write one, you'd have to search all variables in view for the environments, then individually check for a name on each. You might find more than one with the same name attribute, e.g.
a <- new.env()
attr(a, "name") <- "myenv"
b <- new.env()
attr(b, "name") <- "myenv"
environmentName(a)
#> [1] "myenv"
environmentName(b)
#> [1] "myenv"
Created on 2022-09-17 with reprex v2.0.2
Names on environments are there just as decoration, not for unique identification. Check for equality using identical():
d <- a
environmentName(d)
#> [1] "myenv"
identical(a, b)
#> [1] FALSE
identical(a, d)
#> [1] TRUE
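For the narrower case of environments that sit on the search path, a partial inverse can be sketched by walking search() and comparing names. This is an illustration only: env_from_name is a hypothetical function, it returns the first match, and it cannot find function frames or other environments that are not attached.

```r
# Sketch of a partial inverse of environmentName(), limited to the empty
# environment and to environments on the search path.
env_from_name <- function(name) {
  # The empty environment is never on the search path, so handle it first.
  if (identical(name, "R_EmptyEnv")) return(emptyenv())
  for (pos in seq_along(search())) {
    candidate <- as.environment(pos)
    if (identical(environmentName(candidate), name)) return(candidate)
  }
  NULL  # nothing on the search path carries this name
}

round_trip <- env_from_name(environmentName(globalenv()))
print(identical(round_trip, globalenv()))
```

Since names are not unique, anything beyond this kind of lookup needs its own registry, for example a named list of environments maintained by the caller.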

Passing on missing quasiquotation arguments

I am trying to use quasiquotation to pass a raw variable name to a function that passes it on to another function. However, the argument is optional, so I need to test whether the first function was given the argument and, if it was not, pass that missing argument on to the second function.
In these examples b refers to a variable in a data.frame.
To test whether a function was passed a raw variable expression or no argument, I do:
library(rlang)  # provides is_missing(), enexpr() and maybe_missing()

foo <- function(a) {
  print(is_missing(enexpr(a)))
}
foo()
# [1] TRUE
foo(b)
# [1] FALSE
Without the enexpr, R will attempt to evaluate the variable b and, when it cannot be found, raise an error.
Next, I try to pass the missing argument to another function (bar), which will then test for its presence:
foo2 <- function(a) {
  print(is_missing(enexpr(a)))
  bar(maybe_missing(a))
}
bar <- function(a) {
  print(is_missing(enexpr(a)))
}
foo2()
# [1] TRUE
# [1] FALSE <-- "wrong" (but not unexpected)
foo2(b)
# [1] FALSE
# [1] FALSE
Question: How can I in bar test whether foo2 was passed an argument?
Running R 3.5.1 with rlang 0.3.0.1.
We could combine !! with enexpr in foo2, so that bar receives the captured expression (or the missing argument) itself:
foo2 <- function(a) {
  print(is_missing(enexpr(a)))
  bar(!!maybe_missing(enexpr(a)))
}
foo2()
#[1] TRUE
#[1] TRUE
foo2(b)
#[1] FALSE
#[1] FALSE
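For comparison, base R already propagates missingness when an argument is handed on unchanged, so a version without rlang can rely on missing() alone. This sketch swaps the rlang helpers for base R and uses the hypothetical names outer_fn and inner_fn; it only reports missingness and does not capture the expression:

```r
# inner_fn reports whether its argument is missing, without evaluating it.
inner_fn <- function(a) {
  missing(a)
}

# outer_fn passes its argument straight through. If `a` was not supplied
# here, missing(a) is also TRUE inside inner_fn.
outer_fn <- function(a) {
  inner_fn(a)
}

no_arg <- outer_fn()
with_arg <- outer_fn(b)   # `b` is never evaluated, so it need not exist
print(c(no_arg, with_arg))
```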

Determine if a Function Argument is a Function Call

I want to be able to determine whether an argument to a function is a call to a function or not. Let's say I have two functions, f() and g():
f <- function() "foo"
g <- function(x) {
  ???
}
I want the output to the calls as below:
g(f())
#> [1] TRUE
g("bar")
#> [1] FALSE
I can get this to work by quoting the function arguments:
f <- function() "foo"
g <- function(x) is.call(x)
g(quote(f()))
#> [1] TRUE
g(quote("bar"))
#> [1] FALSE
However this is sub-optimal as I don't want users of the function to have to do this. Any suggestions?
You can use substitute():
h <- function(x) is.call(substitute(x))
h(f())
# [1] TRUE
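A slightly fuller sketch of the same idea shows what else counts as a call. The variable name value is made up for illustration. Note that operators are calls too, which may or may not be what is wanted:

```r
f <- function() "foo"
h <- function(x) is.call(substitute(x))

value <- "bar"

results <- c(
  call     = h(f()),     # TRUE: an explicit function call
  string   = h("bar"),   # FALSE: a constant
  symbol   = h(value),   # FALSE: a bare name is a symbol, not a call
  operator = h(1 + 2),   # TRUE: `+` is a function, so this is a call
  negative = h(-1)       # TRUE: unary minus is parsed as a call as well
)
print(results)
```

If only calls to named functions should qualify, the captured expression would need further inspection, for example by looking at its first element.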

What/Where are the attributes of a function object?

By playing around with a function in R, I found out there are more aspects to it than meets the eye.
Consider this simple function assignment, typed directly in the console:
f <- function(x)x^2
The usual "attributes" of f, in a broad sense, are (i) the list of formal arguments, (ii) the body expression and (iii) the environment that will be the enclosure of the function evaluation frame. They are accessible via:
> formals(f)
$x
> body(f)
x^2
> environment(f)
<environment: R_GlobalEnv>
Moreover, str returns more info attached to f:
> str(f)
function (x)
- attr(*, "srcref")=Class 'srcref' atomic [1:8] 1 6 1 19 6 19 1 1
.. ..- attr(*, "srcfile")=Classes 'srcfilecopy', 'srcfile' <environment: 0x00000000145a3cc8>
Let's try to reach them:
> attributes(f)
$srcref
function(x)x^2
This is printed as text, but it's stored as a numeric vector:
> c(attributes(f)$srcref)
[1] 1 6 1 19 6 19 1 1
And this object also has its own attributes:
> attributes(attributes(f)$srcref)
$srcfile
$class
[1] "srcref"
The first one is an environment, with 3 internal objects:
> mode(attributes(attributes(f)$srcref)$srcfile)
[1] "environment"
> ls(attributes(attributes(f)$srcref)$srcfile)
[1] "filename" "fixedNewlines" "lines"
> attributes(attributes(f)$srcref)$srcfile$filename
[1] ""
> attributes(attributes(f)$srcref)$srcfile$fixedNewlines
[1] TRUE
> attributes(attributes(f)$srcref)$srcfile$lines
[1] "f <- function(x)x^2" ""
There you are! This is the string used by R to print attributes(f)$srcref.
So the questions are:
Are there any other objects linked to f? If so, how to reach them?
If we strip f of its attributes, using attributes(f) <- NULL, it doesn't seem to affect the function. Are there any drawbacks of doing this?
As far as I know, srcref is the only attribute typically attached to S3 functions. (S4 functions are a different matter, and I wouldn't recommend messing with their sometimes numerous attributes).
The srcref attribute is used for things like enabling printing of comments included in a function's source code, and (for functions that have been sourced in from a file) for setting breakpoints by line number, using utils::findLineNum() and utils::setBreakpoint().
If you don't want your functions to carry such additional baggage, you can turn off recording of srcref by doing options(keep.source=FALSE). From ?options (which also documents the related keep.source.pkgs option):
‘keep.source’: When ‘TRUE’, the source code for functions (newly
defined or loaded) is stored internally allowing comments to
be kept in the right places. Retrieve the source by printing
or using ‘deparse(fn, control = "useSource")’.
Compare:
options(keep.source=TRUE)
f1 <- function(x) {
  ## This function is needlessly commented
  x
}
options(keep.source=FALSE)
f2 <- function(x) {
  ## This one is too
  x
}
length(attributes(f1))
# [1] 1
f1
# function(x) {
#   ## This function is needlessly commented
#   x
# }
length(attributes(f2))
# [1] 0
f2
# function (x)
# {
#     x
# }
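To drop the source reference from a single function rather than changing the option globally, utils::removeSource() is one possibility. The sketch below builds the function with parse(keep.source = TRUE) so that it behaves the same whether or not keep.source is set in the session; the names with_src and without_src are made up:

```r
# Build a function that is guaranteed to carry a srcref, regardless of the
# current keep.source option.
with_src <- eval(parse(
  text = "function(x) {\n  ## a comment kept by the srcref\n  x^2\n}",
  keep.source = TRUE
))

had_srcref <- !is.null(attr(with_src, "srcref"))

# removeSource() returns a copy of the function without source references.
without_src <- utils::removeSource(with_src)
still_has_srcref <- !is.null(attr(without_src, "srcref"))

# The function itself behaves the same; only the stored source text is gone.
print(c(had_srcref, still_has_srcref, without_src(3) == with_src(3)))
```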
I just figured out that compiled functions (package compiler) have an attribute that is not available via attributes or str. It's the bytecode.
Example:
require(compiler)
f <- function(x){ y <- 0; for(i in 1:length(x)) y <- y + x[i]; y }
g <- cmpfun(f)
The result is:
> print(f, useSource=FALSE)
function (x)
{
    y <- 0
    for (i in 1:length(x)) y <- y + x[i]
    y
}
> print(g, useSource=FALSE)
function (x)
{
    y <- 0
    for (i in 1:length(x)) y <- y + x[i]
    y
}
<bytecode: 0x0000000010eb29e0>
However, this doesn't show with normal commands:
> identical(f, g)
[1] TRUE
> identical(f, g, ignore.bytecode=FALSE)
[1] FALSE
> identical(body(f), body(g), ignore.bytecode=FALSE)
[1] TRUE
> identical(attributes(f), attributes(g), ignore.bytecode=FALSE)
[1] TRUE
It seems to be accessible only via .Internal(bodyCode(...)):
> .Internal(bodyCode(f))
{
    y <- 0
    for (i in 1:length(x)) y <- y + x[i]
    y
}
> .Internal(bodyCode(g))
<bytecode: 0x0000000010eb29e0>

Assigning list attributes in an environment

The title is the self-contained question. An example clarifies it: Consider
x=list(a=1, b="name")
f <- function() {
  assign('y[["d"]]', FALSE, parent.frame())
}
g <- function(y) {f(); print(y)}
g(x)
$a
[1] 1
$b
[1] "name"
whereas I would like to get
g(x)
$a
[1] 1
$b
[1] "name"
$d
[1] FALSE
A few remarks. I know what is wrong in my original example, but am using it to make my objective clear. I want to avoid <<-, and want x to be changed in the parent frame.
I think my understanding of environments is primitive, and any references are appreciated.
The first argument to assign must be a variable name, not the character representation of an expression. Try replacing f with:
f <- function() with(parent.frame(), y$d <- FALSE)
Note that a, b and d are list components, not list attributes. If we wanted to add an attribute "d" to y in f's parent frame we would do this:
f <- function() with(parent.frame(), attr(y, "d") <- FALSE)
Also, note that depending on what you want to do it may (or may not) be better to have x be an environment or a proto object (from the proto package).
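Since assign() only accepts a plain variable name, another way to get the same effect is to fetch the object from the caller, modify the copy, and assign it back under its original name. This is a sketch under that assumption; add_component is a hypothetical name and, as in the question, the variable in the caller is taken to be called y:

```r
# Sketch: modify a list that lives in the calling function's frame.
add_component <- function() {
  caller <- parent.frame()
  y <- get("y", envir = caller)    # copy of the caller's y
  y[["d"]] <- FALSE                # add the new component to the copy
  assign("y", y, envir = caller)   # write the whole object back by name
  invisible(NULL)
}

g <- function(y) {
  add_component()
  y
}

x <- list(a = 1, b = "name")
result <- g(x)
str(result)
```

Only the y inside g is changed; x in the calling environment keeps its original two components.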
assign's first argument needs to be an object name. Your use of assign is basically the same as the counter-example at the end of the assign help page. Observe:
> x=list(a=1, b="name")
> f <- function(){
+ assign('x["d"]', FALSE, parent.frame() )
+ }
> g <- function(y) {f(); print(`x["d"]`)}
> g(x)
[1] FALSE # a variable with the name `x["d"]` was created
This may be where you want to use "<<-" but it's generally considered suspect.
> f <- function(){
+ x$d <<- FALSE
+ }
> g <- function(y) {f(); print(y)}
> g(x)
$a
[1] 1
$b
[1] "name"
$d
[1] FALSE
A further thought, offered without knowing the goal of this exercise, and setting aside the term "attributes", which, as Gabor has pointed out, has a specific meaning in R that may not be what you intended. If all you want is for the output to match your specs, then the following achieves that. Note, however, that x in the global environment is not altered.
> f <- function(){
+ assign('y', c(x, d=FALSE), parent.frame() )
+ }
> g <- function(y) {f(); print(y)}
> g(x)
$a
[1] 1
$b
[1] "name"
$d
[1] FALSE
> x # `x` is unchanged
$a
[1] 1
$b
[1] "name"
The parent.frame for f is what might be called the "interior" of g, but the alteration does not propagate out to the global environment.

Resources