RStudio and Lazy Evaluation - r

I am mainting the package "hdm" and I encountered the following problem.
The following code runs in plain R and used to run in RStudio, but not anymore:
library(hdm)
attach(GrowthData)
fmla= "Outcome ~ ."
fmla.y= "Outcome ~ . - gdpsh465 "
rY= rlasso(fmla.y, data =GrowthData)
Error message:
Error in exists("homoscedastic", where = penalty) : object 'n' not
found
If no penalty in the function rlasso is specified it is set by default containing the variable "n", the sample size of x, which is evaluated later.
n is gotten by lazy evaluation and it seems that in RStudio the correct environment is not found anymore.
The error occurs here, but the problem is that penalty contains n which is not know
if (!exists("homoscedastic", where = penalty)) penalty$homoscedastic = "FALSE"
Somehow I am not sure to solve this and would like to ask if you have any idea.
Thanks a lot for your efforts in advance!
Best,
Martin

When x is a character object, the problem arises because n is not defined in the environment from which rlasso.formula is called, i.e. rlasso.character(), or its parents. This is roughly what's happening:
test <- function(x, ...) {
UseMethod("test")
}
test.character <- function(x, pen = list(alpha = n)) {
test.formula(x, pen = pen)
}
test.formula <- function(x, pen = list(alpha = n)) {
n <- 2
test.default(x, pen)
}
test.default <- function(x, pen = list(alpha = n)) {
n <- 3
exists("alpha", where = pen)
}
test("y ~ x")
# Error in exists("alpha", where = pen) : object 'n' not found
test(y ~ x)
# [1] TRUE
test(123)
# [1] TRUE
A workaround is to not specify pen in the call to the formula method if it's not defined when the character method is called:
test.character <- function(x, pen = list(alpha = n)) {
if (missing(pen))
test.formula(x)
else
test.formula(x, pen = pen)
}
test("y ~ x")
# [1] TRUE

Related

Nested use of call_modify

I am trying to create a call to a function f whose first argument is a call to another function (for which I've chosen dbinom as an example). The call to dbinom (passed on to f) does not include values for all the arguments as these should be finalised within f, and the completed call is returned by f. Here is my failed minimal attempt:
f <- function(a_call) {
call_modify(a_call, x=1)
}
a_call <- call2(dbinom, size=1, prob=0.5)
y <- call2(f, a_call)
The output for y is:
(function(a_call) {
call_modify(a_call, x=1)
})((function (x, size, prob, log = FALSE)
.Call(C_dbinom, x, size, prob, log))(size = 1, prob = 0.5))
This call will
call a_call without any arguments, and then;
pass this result on to f.
If I evaluate y, it errors because dinom's first argument is missing.
I similar-but-related construct:
> call2(call2(dbinom, x=1, size=1, prob=0.5))
((function (x, size, prob, log = FALSE)
.Call(C_dbinom, x, size, prob, log))(x = 1, size = 1, prob = 0.5))()
(function (x, size, prob, log = FALSE)
I get the sense there is something 'not even wrong' with what I'm trying here, and nesting a call modification is best done another way.
It seems that what you are trying to do is handled more naturally by purrr::partial(), which fills in one or more arguments of a function:
f <- function( a_fun ) {purrr::partial( a_fun, x=1 )}
a_fun <- purrr::partial( dbinom, size=1, prob=0.5 )
y <- f(a_fun)
y(...) is now effectively dbinom( x=1, size=1, prob=0.5, ... )
y() # 0.5
y(log=TRUE) # -0.6931472
The great thing about partial() is that it can be naturally chained with the %>% pipe:
z <- partial(dbinom, size=1) %>% partial(prob=0.5) %>% partial(x=1)
z(log=TRUE) # -0.6931472
If I understand correctly what you're trying to do,
then maybe this works better:
f <- function(a_call) {
call_modify(call_standardise(call2(ensym(a_call)),
caller_env()),
x=1)
}
Which you can use with or without characters:
f(print)
# print(x = 1)
f("print")
# print(x = 1)
eval(f(print))
# 1
Or with more indirection:
a_call <- expr(print)
eval(call2(f, a_call))
# print(x = 1)
eval(expr(f(!!a_call)))
# print(x = 1)
Since we do a bit of non-standard evaluation here,
things get a bit tricky.
call_standardise needs to be able to find the function you specify,
and it's very probable that it will be found in the environment that calls f,
and not necessarily in the environment that calls call_standardise,
which would be f's execution environment in this case.
That's why caller_env() is explicitly specified when calling call_standardise even though that's the default for the latter's env,
because default arguments are evaluated in the function's execution environment,
whereas explicit arguments are evaluated in the caller's environment.
Here's a contrived-looking example for this problem:
f2 <- function(a_call) {
call_modify(call_standardise(call2(ensym(a_call))),
x=1)
}
e <- new.env()
e$foo <- function(x) { x + 1 }
with(e, f(foo))
# foo(x = 1)
with(e, f2(foo))
# Error in eval_bare(node_car(expr), env) : object 'foo' not found
However, if you were to develop a package that provides f,
the example is no longer contrived:
f would live in your package's environment,
and other packages could call it for functions that are only available in their respective namespaces.
For more specifics and depictions,
check this reference,
and maybe try drawing the call tree for my example.
call2 constructs a call by passing evaluated ... arguments on to the callable object (the first argument). For example, the command below outputs to the console "y" as the second argument passed to call2 is evaluated,
> A <- call2(print, x=print('y'))
[1] "y"
and constructs a call to print which takes x="y" as its argument (not x=print("y")):
> A
(function (x, ...)
UseMethod("print"))(x = "y")
In order to get around a_call being evaluated and then passed (to f) in the constructed call, it can be quoted, e.g.
f <- function(a_call) {
call_modify(a_call, x=1)
}
a_call <- call2(dbinom, size=1, prob=0.5)
y <- call2(f, quote(a_call))
Now:
> y
(function(a_call) {
call_modify(a_call, x=1)
})(a_call)

R debug() "could not find function" even though it exists

When I try to debug a certain function (itself defined within the function NbCluster), I get a could not find function error. I have checked and the function in question is definitely loaded when debug is called.
> data("USArrests")
> arrests <- scale(USArrests)
> source("NbCluster_copy.R")
> NbCluster_copy(data = arrests, diss = NULL, distance = "euclidean", min.nc = 2, max.nc = 12,
+ method = "ward.D2", index = "gap", alphaBeale = 0.1)
[1] "Indice.Gap exists"
Error in debug(fun = "Indice.Gap") : could not find function "Indice.Gap"
And the issue does not happen if I manually step through the function (by selecting and running lines instead of calling the function).
I tried making a minimal example, but was unable to, so I don't think it is the nested functions that are the problem.
###This works as expected, when I run "wrapper", debug is called from within the function:
wrapper <- function(x){
wrapper <- function(x){
fun1 <- function(x){
fun0 <- function(x){
y = x + 1
return(y)
}
debug(fun0)
y = fun0(x) * 2
return(y)
}
fun1(x)
}
> wrapper(2)
debugging in: fun0(x)
debug at #3: {
y = x + 1
return(y)
}
Browse[2]>
debug at #4: y = x + 1
Browse[2]>
debug at #5: return(y)
Browse[2]>
exiting from: fun0(x)
[1] 6
This is the part I added into the NbClust function.
if(exists("Indice.Gap")){
print("Indice.Gap exists")
}
debug(fun = "Indice.Gap")
right before the first call of Indice.Gap:
resultSGAP <- Indice.Gap(x = jeu, clall = clall,
reference.distribution = "unif", B = 10, method = "ward.D2",
d = NULL, centrotypes = "centroids")
I only made very minor changes besides the one shown above, but if you want to look at the whole function, my copy is here: https://pastebin.com/wxKKDbHy
Just remove the quotes in debug and it should work:
debug(Indice.Gap)
should do the trick.
outer_fun <- function() {
inner_fun <- function() 1
## does not work
# debug("inner_fun")
## works
debug(inner_fun)
inner_fun()
}
outer_fun()
Funny enough on the top level you can provide the function name as string:
debug("outer_fun") # works
debug(outer_fun) # works

Change the argument name of a function, then draw the curve

For some reason I do not want to use x as the name of the variable.
mse <- function(h) {
h + (1/h)
}
The curve function in R seems to require x to be the name of the argument. So I do
cl <- quote(mse(h))
cl[[2]] <- parse(text = 'x')[[1]]
Now this works
curve(expr = eval(cl, list(x)))
But the following does not work, could anyone help explain why? Thank you.
curve(expr = function(x){eval(cl)})
As mentioned by #李哲源 and #r2evans, I will simply do:
curve(expr = mse(h), xname = 'h')

pretty log scale breaks with ggplot hex plot

I want to draw a hexbin plot with ggplot, but with log scale "pretty" breaks for the frequency. Consider
df = data.frame(a=rnorm(1000)); df$b <- df$a+rnorm(1000);
I used this answer to get pretty breaks on linear scale
ggplot(df, aes(a,b)) +
geom_hex(aes(fill=cut(..value..,breaks=pretty(..value.., n=10)))) +
scale_fill_discrete("Frequency")
This works. Now say I want to use log scale pretty breaks. So I used the idea from another answer to define
base_breaks <- function(n = 10){
function(x) {
axisTicks(log10(range(x, na.rm = TRUE)), log = TRUE, n = n)
}
}
and try to do
ggplot(df, aes(a,b)) +
geom_hex(aes(fill=cut(..value..,breaks=base_breaks(n=10)(..value..))))
but it is not able to find the function. It says:
Error in cut.default(value, breaks = base_breaks(n = 10)(value)) :
could not find function "base_breaks"
Even though base_breaks is defined.
> base_breaks(n=10)(c(1:1000))
[1] 1 5 10 50 100 500 1000
How can I make my function visible in whatever environment ggplot is calling it? I even defined it as a global variable with
base_breaks <<- function(n = 10){
function(x) {
axisTicks(log10(range(x, na.rm = TRUE)), log = TRUE, n = n)
}
}
but I still get the same error.
I am not sure about it, but you could try simplifying the function like this:
base_breaks <<- function(n = 10, x){
axisTicks(log10(range(x, na.rm = TRUE)), log = TRUE, n = n)
}
Maybe the problem is that you have a function whose result is another function, and that could be causing the error. With this aproach you would have the values more directly. Check it out!
I can't check it myself, since I get an error object 'value' not found...

Detecting an error in a loop

The following:
install.packages("quantreg")
require(quantreg)
y=rnorm(10)
x=rnorm(10)
summary(rq(y~x,tau=0.01),se="ker")
Generates the error Error in summary.rq(rq(y ~ x, tau = 0.01), se = "ker") :
tau - h < 0: error in summary.rq.
Say I loop over different y and x 1000 times. I want to be able to know when the error occurs and implement a fix mid-loop.
However all my attempts to work with summary(rq(y~x,tau=0.01),se="ker") using is() etc etc doesn't get anywhere. I've never worked with this object type before (and Google/SE searches haven't revealed the answer yet).
I want something like is.error(summary(rq(y~x,tau=0.01),se="ker")), which doesn't actually exist.
The following command will return a logical value indicating whether an error occured:
class(tryCatch(summary(rq(y ~ x,tau = 0.01),se = "ker"),
error = function(e) e))[1] == "simpleError"
You can use replicate instead of a for loop. It is more efficient. In the follwing example, a list including x, y, and the logical errorvalue is returned. The procedure is replicated two times. You could use n = 1000 to replicate it 1000 times.
replicate(n = 2,
expr = {y <- rnorm(10);
x <- rnorm(10);
error <- class(tryCatch(summary(rq(y ~ x,tau = 0.01),se = "ker"), error = function(e) e))[1] == "simpleError";
return(list(x = x, y = y, error = error))},
simplify = FALSE)
Elaborating on the answer from #SvenHohenstein one would like to return the result on successful evaluation, not just whether an error occurred. We'd likely also want to return the reason for the error message using conditionMessage. We'd like to catch errors of class simpleError, so we write a handlers specific to that type of condition. So
error <- FALSE # no error yet
result <- tryCatch({ # result from summary(), or from the error handler
summary(rq(y ~ x,tau = 0.01),se = "ker")
}, simpleError = function(e) { # only catch simpleErrors
error <<- TRUE # yes, error occurred
conditionMessage(e) # 'result' gets error message
})
we'd then return list(x = x, y = y, error=error, result=result).

Resources