Environment issues in R - r

Just to clarify, I'm not saying that R has issues. The problem is probably on my side, but I'm really confused. I have a function (make_a()) that creates a function a(). I also have a function that uses this function in its definition (fun_using_a()):
make_a <- function(x) {
  a <- function(y) {
    x + y
  }
  a
}
fun_using_a <- function(x) {
  a(x)/2
}
Now, I create another function that uses these two:
my_fun <- function(x) {
  a <- make_a(1)
  fun_using_a(x)
}
Calling my_fun(10) gives an error:
my_fun(10)
Error in a(x) : could not find function "a"
However, everything works fine if I do essentially the same thing in the global environment:
a <- make_a(1)
fun_using_a(10)
[1] 5.5
What's going on here? Why does my_fun(10) throw an error? It seems that my understanding of R environments must be a bit off somewhere, but I just can't figure it out. When I call my_fun(), shouldn't the function a() be defined in the execution environment after the first line and thus fun_using_a() should be able to find it there (due to lazy evaluation)?
Any help will be greatly appreciated. Thanks a lot!

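You would need to save the result of make_a under the name a in a place where fun_using_a can see it. There isn't a single "execution environment"; every function invocation creates a new one. R is lexically scoped, so fun_using_a looks for a first in its own execution environment and then in the environment where fun_using_a was defined (the global environment), never in the environment of its caller. In my_fun, a exists only in my_fun's execution environment, so fun_using_a cannot find it; in your second version a is saved in the global environment, where it can.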
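One way to avoid relying on where a happens to live is to pass it explicitly. The following is only a sketch of that idea, not something from the question: the extra parameter a on fun_using_a is an assumption introduced here for illustration.

```r
# make_a() builds a closure that remembers x
make_a <- function(x) {
  force(x)
  function(y) x + y
}

# Hypothetical variant: the function to apply is passed in as an argument
# instead of being looked up by name through the scoping chain
fun_using_a <- function(x, a) {
  a(x) / 2
}

my_fun <- function(x) {
  a <- make_a(1)
  fun_using_a(x, a)
}

result <- my_fun(10)
result  # (1 + 10) / 2 = 5.5
```

With this layout it no longer matters which environment a is stored in, because fun_using_a never has to search for it.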
By the way, make_a is likely to have a subtle bug: since x is never evaluated until the first call to a(), its value could change. For example,
x <- 1
a <- make_a(x)
x <- 5
fun_using_a(10)
will return 7.5, not 5.5, since the value of x in a(y) will be 5 instead of 1. To fix it, force the value of x in make_a:
make_a_2 <- function(x) {
  force(x)
  a <- function(y) {
    x + y
  }
  a
}
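To make the difference concrete, here is a small side-by-side sketch of the lazy and the forced versions. The names a_lazy and a_forced are illustrative only.

```r
# Lazy version: x stays an unevaluated promise until a() is first called
make_a <- function(x) {
  function(y) x + y
}

# Forced version: x is evaluated when make_a_2() is called
make_a_2 <- function(x) {
  force(x)
  function(y) x + y
}

x <- 1
a_lazy <- make_a(x)
a_forced <- make_a_2(x)
x <- 5  # changed before either closure has been called

a_lazy(10)    # 15: the promise is evaluated now and sees x = 5
a_forced(10)  # 11: x was fixed at 1 when the closure was created
```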

Related

Better way to deal with namespace when using glue::glue

I want to create a function that itself uses the awesome glue::glue function.
However, I found myself dealing with a name clash when I want to glue a variable that exists in both the function and the global environment:
x=1
my_glue <- function(x, ...) {
glue::glue(x, ...)
}
my_glue("foobar x={x}") #not the expected output
# foobar x=foobar x={x}
I'd rather keep the variable named x for package consistency.
I ended up doing something like this, which works pretty well so far but only postpones the problem (a lot, but still):
my_glue2 <- function(x, ...) {
x___=x; rm(x)
glue::glue(x___, ...)
}
my_glue2("foobar x={x}") #problem is gone!
# foobar x=1
my_glue2("foobar x={x___}") #very unlikely but still...
# foobar x=foobar x={x___}
Is there a better/cleaner way to do this?
Since the value x = 1 is nowhere passed to the function, in the current scenario a way to do this would be to evaluate the string in the global environment itself where the value of x is present before passing it to the function.
my_glue(glue::glue("foobar x={x}"))
#foobar x=1
my_glue(glue::glue("foobar x={x}"), " More text")
#foobar x=1 More text
Another option (and I think this is the answer that you are looking for) is to get the value of x from the parent environment. glue has an .envir parameter, which sets the environment in which the expressions are evaluated.
my_glue <- function(x, ...) {
glue::glue(x, ...,.envir = parent.frame())
}
my_glue("foobar x={x}")
#foobar x=1
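The reason parent.frame() helps can be seen without glue at all. This is a base R sketch, and lookup_demo is a hypothetical helper used only for illustration: the same name x resolves to different values depending on the environment in which it is evaluated.

```r
x <- 1

# Hypothetical helper: evaluates the symbol x in two different environments
lookup_demo <- function(x) {
  list(
    local  = eval(quote(x), environment()),   # the function's own argument
    caller = eval(quote(x), parent.frame())   # the x visible to the caller
  )
}

res <- lookup_demo("template")
res$local   # "template"
res$caller  # 1
```

Passing .envir = parent.frame() to glue makes it evaluate the braces the way the caller element above is evaluated, so the function's own argument no longer shadows the caller's x.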

How to create and initialize a non existing variable to a default value in R?

This question comes from curiosity; I have nothing to deliver based on it.
While mimicking pass-by-reference (question here), I noticed that both approaches described in the answers obviously fail when the variable does not exist and one tries to use or reference it.
Regardless of its actual usefulness, I would be curious to know whether there is a way to initialize the parameter x in the code below, and hence the "actual" parameter myVar, to a default value, using the desired type passed as a string, xtype. (Passing the type, and in such a basic form, is not a requirement; it is simply the first thing that came to mind for a non-advanced R programmer.)
The question that prompted this one (here) shows better code in its chosen answer; here I am using my own code because I understand it better.
myF <- function(x, xtype) {
varName <- deparse(substitute(x))
if (!exists(varName)) {
# here should initialize x to a default value
# of the type passed in xtype
# to avoid that x <- x ... fails
# this may not have any practical usefulness, just curious
}
x <- x+1
assign(varName,x,envir=parent.frame(n = 1))
NA # sorry this is not a function
# in real life sometimes you also need procedures
}
if (exists(deparse(substitute(myVar)))) {
rm(myVar)
}
myF(myVar, "numeric")
print(myVar)
Error in myF(myVar, "numeric") : object 'myVar' not found
# as expected
Maybe this is what you are looking for (even though it's a terrible idea to write a function like this in R).
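myF <- function(x, xtype) {
  varName <- deparse(substitute(x))
  caller <- parent.frame()
  # look in the caller's environment, which is also where the result is assigned
  if (!exists(varName, envir = caller)) {
    # variable is missing: start from a default value of the requested type
    x <- vector(xtype, 1)
  } else {
    x <- get(varName, envir = caller)
  }
  x <- x + 1
  assign(varName, x, envir = caller)
}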
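A short usage sketch of the same idea follows. The names init_and_increment and my_counter are hypothetical, and the lookup is restricted to the caller's own environment (inherits = FALSE), which is a design choice made here, not part of the answer.

```r
# Hypothetical helper: creates the variable in the caller if it is missing,
# then increments it there
init_and_increment <- function(x, xtype) {
  var_name <- deparse(substitute(x))
  caller <- parent.frame()
  value <- if (exists(var_name, envir = caller, inherits = FALSE)) {
    get(var_name, envir = caller, inherits = FALSE)
  } else {
    vector(xtype, 1)  # e.g. 0 for "numeric"
  }
  assign(var_name, value + 1, envir = caller)
  invisible(NULL)
}

init_and_increment(my_counter, "numeric")  # my_counter did not exist before
first_value <- my_counter                   # 1
init_and_increment(my_counter, "numeric")
second_value <- my_counter                  # 2
```

Note that this only makes sense for types where adding 1 is defined; with xtype = "character" the default value would be an empty string and the addition would fail.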

R: Global assignment of vector element works only inside a function

I'm working on a project where there are some global assignments, and I ran into something sort of odd. I was hoping someone could help me with it.
I wrote this toy example to demonstrate the problem:
x <- 1:3 ; x <- c(1, 2, 5) # this works fine
x <- 1:3 ; x[3] <- 5 # this works fine
x <<- 1:3 ; x <<- c(1, 2, 5) # this works fine
x <<- 1:3 ; x[3] <<- 5 # this does not work
# Error in x[3] <<- 5 : object 'x' not found
same.thing.but.in.a.function = function() {
x <<- 1:3
x[3] <<- 5
}
same.thing.but.in.a.function(); x
# works just fine
So, it seems it's not possible to change part of a vector using a global assignment -- unless that assignment is contained within a function. Can anyone explain why this is the case?
I figured out the problem.
Basically, in this manifestation of <<- (which is more accurately called the "superassignment operator" rather than the "global assignment operator"), it actually skips checking the global environment when trying to access the variable.
On page 19 of R Language Definition, it states the following:
x <<- data.frame(0, 0, 0) # (I added this so the code can be run)
names(x)[3] <<- "Three"
is equivalent to
x <<- data.frame(0, 0, 0) # (I added this so the code can be run)
`*tmp*` <<- get(x, envir=parent.env(), inherits=TRUE)
names(`*tmp*`)[3] <- "Three"
x <<- `*tmp*`
rm(`*tmp*`)
When I tried to run those four lines, it threw an error -- parent.env requires an argument and has no default. I can only assume that the documentation was written at a time when parent.env() contained a default value for its first argument. But I can safely guess that the default would have been environment() which returns the current environment. It then throws an error again -- x needs to be in quotes. So I fixed that too. Now, when I run the first line, it throws the same error message as I encountered originally, but with more detail:
# Error in get("x", envir = parent.env(environment()), inherits = TRUE) :
# object 'x' not found
This makes sense: at the top level, environment() returns .GlobalEnv, so parent.env(.GlobalEnv) skips the global environment entirely and instead returns the most recently attached package environment. Then, since inherits is set to TRUE, get() keeps going up the levels, searching each attached package environment before eventually reaching the empty environment, and at that point it has still not found x. Thus the error.
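The chain that get() walks in this situation can be listed directly. This is a minimal sketch; the variable names env and visited are illustrative.

```r
# Start where the replacement form of <<- starts when used at top level:
# the parent of the global environment
env <- parent.env(globalenv())
visited <- character(0)

# Walk up until the empty environment, recording each environment's name
while (!identical(env, emptyenv())) {
  visited <- c(visited, environmentName(env))
  env <- parent.env(env)
}

visited                      # attached packages, ending with "base"
"R_GlobalEnv" %in% visited   # FALSE: the global environment is never searched
```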
Since parent.env(environment()) will return .GlobalEnv (or another environment below it) as long as you start inside a local environment, this same problem does not occur when the same lines are run from inside a local environment:*
local({
x <<- data.frame(0, 0, 0) # (I added this so the code can be run)
`tmp` <<- get("x", envir=parent.env(environment()), inherits=TRUE)
names(`tmp`)[3] <- "Three"
x <<- `tmp`
rm(`tmp`)
})
x
# X0 X0.1 Three
# 1 0 0 0
# so, it works properly
In contrast, when <<- is used in general, there is no extra subsetting code that occurs behind the scenes, and it first attempts to access the value in the current environment (which might be the global environment), before moving upwards. So in that situation, it doesn't run into the problem where it skips the global environment.
* I had to change the variable from *tmp* to tmp because one of the behind-the-scenes operations in the code uses the *tmp* variable and then removes it, so *tmp* disappears in the middle of line 3 and so it throws an error when I then try to access it.
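For comparison, here is a sketch of the case that works: the same replacement form of <<- used inside a function. The function name set_third is hypothetical.

```r
x <- 1:3

# Hypothetical helper: inside a function, the lookup for x starts in the
# function's enclosing environment, which is where x was defined above
set_third <- function() {
  x[3] <<- 5
  invisible(NULL)
}

set_third()
x  # 1 2 5
```

At top level there is no enclosing function, so plain x[3] <- 5 is the appropriate form.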
If you change to single-arrow assignment, then it works:
x <<- 1:3 ; x[3] <- 5
BTW - I would suggest these wonderful discussions for better understanding and proper use of <<- operator -
How do you use "<<-" (scoping assignment) in R?
What is the difference between assign() and <<- in R?

In R, getting the following error: "attempt to replicate an object of type 'closure'"

I am trying to write an R function that takes a data set and outputs the plot() function with the data set read in its environment. This means you don't have to use attach() anymore, which is good practice. Here's my example:
mydata <- data.frame(a = rnorm(100), b = rnorm(100,0,.2))
plot(mydata$a, mydata$b) # works just fine
scatter_plot <- function(ds) { # function I'm trying to create
ifelse(exists(deparse(quote(ds))),
function(x,y) plot(ds$x, ds$y),
sprintf("The dataset %s does not exist.", ds))
}
scatter_plot(mydata)(a, b) # not working
Here's the error I'm getting:
Error in rep(yes, length.out = length(ans)) :
attempt to replicate an object of type 'closure'
I tried several other versions, but they all give me the same error. What am I doing wrong?
EDIT: I realize the code is not too practical. My goal is to understand functional programming better. I wrote a similar macro in SAS, and I was just trying to write its counterpart in R, but I'm failing. I just picked this as an example. I think it's a pretty simple example and yet it's not working.
There are a few small issues. ifelse is a vectorized function, but you just need a simple if. In fact, you don't really need an else -- you could just throw an error immediately if the data set does not exist. Note that your error message is not using the name of the object, so it will create its own error.
You are passing a and b instead of "a" and "b". Instead of the ds$x syntax, you should use the ds[[x]] syntax when you are programming (fortunes::fortune(312)). If that's the way you want to call the function, then you'll have to deparse those arguments as well. Finally, I think you want deparse(substitute()) instead of deparse(quote())
scatter_plot <- function(ds) {
  ds.name <- deparse(substitute(ds))
  if (!exists(ds.name))
    stop(sprintf("The dataset %s does not exist.", ds.name))
  function(x, y) {
    x <- deparse(substitute(x))
    y <- deparse(substitute(y))
    plot(ds[[x]], ds[[y]])
  }
}
scatter_plot(mydata)(a, b)
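The same pattern can be tried without opening a graphics device. In this sketch, make_selector and toy are hypothetical names, and returning the selected columns stands in for the call to plot().

```r
# Hypothetical variant of scatter_plot(): returns the two columns
# instead of plotting them
make_selector <- function(ds) {
  ds_name <- deparse(substitute(ds))
  if (!exists(ds_name))
    stop(sprintf("The dataset %s does not exist.", ds_name))
  function(x, y) {
    cols <- c(deparse(substitute(x)), deparse(substitute(y)))
    ds[cols]
  }
}

toy <- data.frame(a = 1:3, b = c(2, 4, 6))
picked <- make_selector(toy)(a, b)
names(picked)  # "a" "b"

# A missing data set is reported by name instead of failing obscurely
err <- tryCatch(make_selector(no_such_data), error = function(e) e)
conditionMessage(err)  # "The dataset no_such_data does not exist."
```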

How to view value of a variable inside a function? [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
General suggestions for debugging R?
When debugging, I would often like to know the value of a variable used in a function that has completed executing. How can that be achieved?
Example:
I have function:
MyFunction <- function() {
x <- rnorm(10) # assign 10 random normal numbers to x
return(mean(x))
}
I would like to know the values stored in x which are not available to me after the function is done executing and the function's environment is cleaned up.
You mentioned debugging, so I assume the values are not needed later in the script, you just want to check what is happening. In that case, what I always do is use browser:
MyFunction <- function() {
browser()
x <- rnorm(10) # assign 10 random normal numbers to x
return(mean(x))
}
This drops you into an interactive console inside the scope of the function, allowing you to inspect what is happening inside.
For general information about debugging in R I suggest this SO post.
MyFunction <- function() {
x <- rnorm(10) # assign 10 random normal numbers to x
return(list(x,mean(x)))
}
This returns a list whose first element is x and whose second element is its mean.
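A slightly more readable variant, sketched here with hypothetical element names values and mean, is to name the list elements so that callers do not depend on their positions:

```r
# Hypothetical variant of MyFunction(): returns both the intermediate
# values and the summary in a named list
my_function <- function() {
  x <- rnorm(10)
  list(values = x, mean = mean(x))
}

set.seed(42)   # only so that repeated runs give the same numbers
res <- my_function()
res$values     # the ten random numbers
res$mean       # their mean
```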
You have many options here. The easiest is to use the <<- operator when you assign to x. It's also the most likely to get you into trouble.
> test <- function() x <- runif(1)
> x <- NA
> test()
> x
[1] NA
> test <- function() x <<- runif(1)
> test()
> x
[1] 0.7753325
Edit
@PaulHeimstra points out that you'd like this for debugging. Here's a pointer to some general tricks:
General suggestions for debugging in R
I'd recommend either setting options(error=recover) or using trace() in combination with browser().
There are already some good solutions; I'd like to add one more possibility. I want to emphasize that you want to know the value of a variable used in a function that has completed executing. So there may be no need to assign those values, and you don't want (a priori) to stop execution. The solution is simply to use print. So that printing happens only when you want to debug rather than by default, the choice can be passed as a function argument:
MyFunction <- function(x, y, verbose = FALSE) {
a <- x * y
if (verbose) print(a)
b <- x - y
if (verbose) print(b)
return(a * b)
}
In general, you would run your function like this: MyFunction(10, 4) but when you want to see those intermediate results, do MyFunction(10, 4, verbose = TRUE).
