R: debug mode inside with() function

For some reason, when the body of my function is inside a with() expression, the debug mode doesn't seem to let me step inside the with() part. Why is that, and is there a way around this issue? Below is a silly (but hopefully reproducible) demo.
ff=function(x){
print("Hello")
with(iris,{
y=x;
z=y+mean(Sepal.Width);
return(z);})
}
Now enter debug mode and try out the function:
debugonce(ff);debugonce(with);
ff(10)
Debug mode simply skips over the with() clause, and returns the answer (13.05733). How do I step INTO those inner clauses?

This works, it's just that what you expect it to do is not what it does. debug will look inside the with code, not inside the code you passed as an argument. Look closely:
> ff(10)
debugging in: ff(10)
debug at #1: {
print("Hello")
with(iris, {
y = x
z = y + mean(Sepal.Width)
return(z)
})
}
Browse[2]> n
debug at #2: print("Hello")
Browse[2]> n
[1] "Hello"
debug at #3: with(iris, {
y = x
z = y + mean(Sepal.Width)
return(z)
})
Browse[2]> n
Now look what's happening here, we are debugging in with:
debugging in: with(iris, {
y = x
z = y + mean(Sepal.Width)
return(z)
})
And this is the key:
debug: UseMethod("with")
Browse[3]> n
[1] 13.05733
What happened? Look at the with code:
> with
function (data, expr, ...)
UseMethod("with")
<bytecode: 0x00000000092f0e50>
<environment: namespace:base>
So as you can see, we did debug the single line in with. You can also debug with.default if you want to see what's happening inside with in more detail, but I doubt that will do what you want. I don't know how to step into the expression indirectly. Even if you could debug {, which I don't think you can, that wouldn't help: you would be looking at the code for {, not at the argument passed to {, the same as with with. As a hack, though, you can use browser():
ff=function(x){
print("Hello")
with(iris,{
browser() # <<<--- this will allow you to browse inside the expression
y=x;
z=y+mean(Sepal.Width);
return(z);})
}
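To make the point above more concrete, here is a minimal sketch of why debug stops at the function's own code rather than at the expression you pass in. It assumes that with() for data frames boils down to an eval() of the unevaluated argument; my_with and ff2 are hypothetical names used only for illustration, not part of base R.

```r
# Hypothetical stand-in for with(): the braces block is captured unevaluated
# and handed to eval(), so a debugger attached to my_with only ever sees
# this single eval() line, not the statements inside the block.
my_with <- function(data, expr) {
  eval(substitute(expr), data, enclos = parent.frame())
}

ff2 <- function(x) {
  my_with(iris, {
    y <- x
    z <- y + mean(Sepal.Width)
    z
  })
}

res <- ff2(10)
print(res)
```

Calling browser() inside the block works because it runs as part of the evaluated expression itself, so it does not depend on which function is flagged for debugging.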

Related

behavior of browser() function in R

I am debugging a function in R using browser() and have encountered something that I don't understand. Below is a simple example function:
testf<-function(x)
{
if(x==1) x=x+1
return(x)
}
If I run this it behaves as expected:
> testf(1)
[1] 2
> testf(2)
[1] 2
Now, I insert a browser() function to enter debug mode:
testf<-function(x)
{
browser()
if(x==1) x=x+1
return(x)
}
If I now run testf(1) and use the Next command feature in debug mode to step through the function, it produces the expected output of 2. However, if I run the if statement directly (e.g. by highlighting and pushing the Run button), x does not get incremented:
Browse[1]> x
[1] 1
Browse[1]> if(x==1) x=x+1
debug at #3: x = x + 1
Browse[3]> x
[1] 1
If I instead run x=x+1 by itself then x does get incremented:
Browse[3]> x
[1] 1
Browse[3]> x=x+1
Browse[3]> x
[1] 2
Browse[3]>
Why? My understanding of debug mode is that you can run any command and it will get executed as if you were running the function, but this seems not to be the case with the if statement above.
What am I missing here?
The if statement works in two separate steps:
evaluate condition
execute statement
Perhaps easier to visualize what happens this way:
testf <- function(x)
{
browser()
if(x==1) {
x=x+1
}
return(x)
}
I am going to provide a more detailed answer to my question, based on Waldi's answer above. The real heart of my question is this: Why does my if statement perform differently in normal and debug modes?
> x=1
> if(x==1) x=x+1
> x
[1] 2
> testf(1)
Called from: testf(1)
Browse[1]> x
[1] 1
Browse[1]> if(x==1) x=x+1
debug at #3: x = x + 1
Browse[3]> x
[1] 1
The answer can be seen by using the debugger to step through the function. When I use the Next tool to step through the if, it takes two steps to get through rather than the one step that we might expect. This is because the if involves two steps:
Evaluate condition
Execute command.
Normally, these are executed together at one time. However, the debugger separates them. This is presumably because an error could occur in either the evaluation step or the execution step. This is sensible, but it has a side effect: if I run the if at the command line in debug mode, it completes only the first part (evaluate) and then pauses at a nested browser prompt before the second part (execute), which runs only once I step forward again.
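The two-part structure described above can be inspected directly, outside the debugger. The following is only an illustrative sketch: it takes the if call apart and evaluates the condition and the statement separately in a scratch environment (env is a name made up for this example).

```r
# An if call is a language object with separate slots for the
# condition and the statement to run when the condition is TRUE.
e <- quote(if (x == 1) x <- x + 1)

env <- new.env()
env$x <- 1

cond <- eval(e[[2]], env)      # step 1: evaluate the condition, x == 1
x_after_cond <- env$x          # x is still unchanged at this point

if (cond) eval(e[[3]], env)    # step 2: execute the statement, x <- x + 1
x_after_stmt <- env$x

print(c(x_after_cond, x_after_stmt))
```

This mirrors what the browser transcript shows: after the first step x is still 1, and it only changes once the second step has run.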

`trace` After Last Statement

I want to insert a tracing expression after the last statement in a function that uses on.exit. Simplified version (actual function is in the locked namespace of a package):
f <- function() {
on.exit(NULL)
x <- 1
x <- 2
}
trace(f, at=4, quote(cat(x, "\n")))
f()
## Tracing f() step 4
## 1
trace(f, at=5, quote(cat(x, "\n")))
## Error in fBody[[i]] : subscript out of bounds
The idea is to get "2" printed to the screen by using trace. It seems there is no way to do this; I'm hoping I'm wrong.
Okay, in a desperate re-reading of the help page I realized I could use the edit argument, and came up with this horrible contraption:
trace_editor <- function(name, file, title, ...) {
body(name) <- bquote(
{
.res <- .(body(name))
.doTrace(cat(x, "\n"), "at end")
.res
}
)
name
}
old.edit <- options(editor=trace_editor)
trace(f, edit=TRUE)
options(old.edit)
f()
## Tracing f() at end
## 2
## [1] 2
Basically, I create a custom editor function to wrap the existing function body inside another expression that computes the body, runs the trace command, and returns the value.
One issue with this is that it doesn't handle visibility (the [1] 2 at the end should not be shown). This can be addressed by changing:
.res <- .(body(name))
to
.res <- withVisible(.(body(name)))
and adding some handling at the end to return invisible(.res$value) if necessary. This unfortunately comes at the cost of messing up error reporting since withVisible becomes part of the call stack. In normal traced functions error reporting works mostly transparently.
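The visibility handling mentioned above might look roughly like the sketch below. It is shown on a plain wrapper function rather than inside the trace editor, and wrap_visible, quiet and loud are hypothetical names chosen for this illustration.

```r
# Wrap a function so that a tracing action can run after its body
# while the original visibility of the return value is preserved.
wrap_visible <- function(fun) {
  function(...) {
    res <- withVisible(fun(...))
    # a tracing action (e.g. a cat() call) would go here
    if (res$visible) res$value else invisible(res$value)
  }
}

quiet <- wrap_visible(function() { x <- 1; x <- 2 })  # ends in an assignment
loud  <- wrap_visible(function() 2)                   # ends in a plain value

quiet_visible <- withVisible(quiet())$visible
loud_visible  <- withVisible(loud())$visible
print(c(quiet_visible, loud_visible))
```

As noted above, the extra withVisible frame shows up in the call stack, so error reports from a function wrapped this way are less clean than with an ordinary trace.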
If you don't restrict yourself to using trace(), it isn't too bad to add a tracing statement wherever you like. trace_last() defined below will add a trace after the function body. I originally thought you were trying to add a trace after the on.exit call, which is what trace_after() does.
f <- function() {
on.exit(message("exit"))
x <- 1
x <- 2
}
trace_last <- function(f, expr) {
body(f) <<- call("{", body(f), expr)
}
trace_after <- function(f, expr) {
body(f) <<- call("{", body(f), bquote(on.exit(.(expr), add = TRUE)))
f()
}
f()
#> exit
trace_last(f, quote(message(x)))
f()
#> 2
#> exit
trace_after(f, quote(message("after on exit")))
#> 2
#> exit
f()
#> 2
#> exit
#> after on exit
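The helpers above modify the global f through <<-. As a variation, here is a small sketch that returns a modified copy instead of assigning into the enclosing environment; add_trace_last, g and g2 are hypothetical names for this example. Note that, like trace_last(), it makes the traced expression the last one evaluated, so the return value of the function changes.

```r
# Return a copy of `fun` whose body runs `expr` after the original body.
add_trace_last <- function(fun, expr) {
  body(fun) <- call("{", body(fun), expr)
  fun
}

g <- function() {
  on.exit(message("exit"))
  x <- 1
  x <- 2
}

g2 <- add_trace_last(g, quote(message(x)))

# Collect the messages so their order can be inspected.
msgs <- character()
withCallingHandlers(
  g2(),
  message = function(m) {
    msgs <<- c(msgs, conditionMessage(m))
    invokeRestart("muffleMessage")
  }
)
print(msgs)
```

The original g is left untouched, which may be preferable when the function lives in a locked namespace and cannot be reassigned in place.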

Environments in R, mapply and get

Let x<-2 in the global env:
x <-2
x
[1] 2
Let a be a function that defines another x locally and uses get:
a<-function(){
x<-1
get("x")
}
This function correctly gets x from the local environment:
a()
[1] 1
Now let's define a function b as below, that uses mapply with get:
b<-function(){
x<-1
mapply(get,"x")
}
If I call b, it seems that mapply makes get not search the function environment first. Instead, it tries to get x directly from the global environment, and if x is not defined in the global env, it gives an error message:
b()
x
2
rm(x)
b()
Error in (function (x, pos = -1L, envir = as.environment(pos), mode = "any", :
object 'x' not found
The solution to this is to explicitly define envir=environment().
c<-function(){
x<-1
mapply(get,"x", MoreArgs = list(envir=environment()))
}
c()
x
1
But I would like to know what exactly is going on here. What is mapply doing? (And why? Is this the expected behavior?) Is this "pitfall" common in other R functions?
The problem is that get looks into the environment that it is called from, but here we are passing get to mapply, so get ends up being called from the local environment within mapply. If x is not found within the mapply local environment, the search moves to the parent environment of that, i.e. environment(mapply) (the lexical environment that mapply was defined in, which is the base namespace environment); if it is not there either, it looks into the parent of that, which is the global environment, i.e. your R workspace.
This is because R uses lexical scoping, as opposed to dynamic scoping.
We can show this by getting a variable that exists within mapply.
x <- 2
b2<-function(){
x<-1
mapply(get, "USE.NAMES")
}
b2() # it finds USE.NAMES in mapply
## USE.NAMES
## TRUE
In addition to the workaround involving MoreArgs shown in the question, this also works, since it causes the search to look into the local environment within b3 after failing to find x in mapply. (This is just for illustrating what is going on; in actual practice we would prefer the workaround shown in the question.)
x <- 2
b3 <-function(){
x<-1
environment(mapply) <- environment()
mapply(get, "x")
}
b3()
## 1
Also note that we can view the chain of environments like this:
> debug(get)
> b()
debugging in: (function (x, pos = -1L, envir = as.environment(pos), mode = "any",
inherits = TRUE)
.Internal(get(x, envir, mode, inherits)))(dots[[1L]][[1L]])
debug: .Internal(get(x, envir, mode, inherits))
Browse[2]> envir
<environment: 0x0000000021ada818>
Browse[2]> ls(envir) ### this shows that envir is the local env in mapply
[1] "dots" "FUN" "MoreArgs" "SIMPLIFY" "USE.NAMES"
Browse[2]> parent.env(envir) ### the parent of envir is the base namespace env
<environment: namespace:base>
Browse[2]> parent.env(parent.env(envir)) ### and grandparent of envir is the global env
<environment: R_GlobalEnv>
Thus, the ancestry of environments potentially followed is this (where the arrow points to the parent):
local environment within mapply --> environment(mapply) --> .GlobalEnv
where environment(mapply) equals asNamespace("base"), the base namespace environment.
R is lexically scoped, not dynamically scoped, meaning that when you search through parent environments to find a value, you are searching through the lexical parents (as written in the source code), not through the dynamic parents (as invoked). Consider this example:
x <- "Global!"
fun1 <- function() print(x)
fun2 <- function() {
x <- "Local!"
fun1a <- function() print(x)
fun1() # fun2() is dynamic but not lexical parent of fun1()
fun1a() # fun2() is both dynamic and lexical parent of fun1a()
}
fun2()
outputs:
[1] "Global!"
[1] "Local!"
In this case fun2 is the lexical parent of fun1a, but not of fun1. Since mapply is not defined inside your functions, your functions are not the lexical parents of mapply and the xs defined therein are not directly accessible to mapply.
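The same lexical-scoping argument suggests another way to write the function from the question: wrap get in a small closure that is defined inside the calling function. This is only a sketch; d is a hypothetical name, and the global x is set here just to show that it is not the value picked up.

```r
x <- 2  # global value, deliberately different from the local one

d <- function() {
  x <- 1
  # The anonymous function is created inside d, so d's environment is its
  # lexical parent. get() is called from the closure's frame and, failing
  # to find x there, continues the search in d's environment.
  mapply(function(nm) get(nm), "x")
}

res <- d()
print(res)
```

Because the closure carries d's environment with it, no envir argument has to be threaded through MoreArgs.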
The issue is an interplay with built-in C code. Namely, considering the following:
fx <- function(x) environment()
env <- NULL; fn <- function() { env <<- environment(); mapply(fx, 1)[[1]] }
Then
env2 <- fn()
identical(env2, env)
# [1] FALSE
identical(parent.env(env2), env)
# [1] FALSE
identical(parent.env(env2), globalenv())
# [1] TRUE
More specifically, the problem lies in the underlying C code, which fails to consider executing environment, and hands it off to an as-is underlying C eval call which creates a temp environment branching directly off of R_GlobalEnv.
Note this really is what is going on, since no level of stack nesting fixes the issue:
env <- NULL; fn2 <- function() { env <<- environment(); (function() { mapply(fx, 1)[[1]] })() }
identical(parent.env(fn2()), globalenv())
# [1] TRUE

Getting the parse tree for a predefined function in R

I feel as if this is a fairly basic question, but I can't figure it out.
If I define a function in R, how do I later use the name of the function to get its parse tree? I can't just use substitute, as that will just return the parse tree of its argument, in this case just the function name.
For example,
> f <- function(x){ x^2 }
> substitute(f)
f
How should I access the parse tree of the function using its name? For example, how would I get the value of substitute(function(x){ x^2 }) without explicitly writing out the whole function?
I'm not exactly sure which of these meets your desires:
eval(f)
#function(x){ x^2 }
identical(eval(f), get("f"))
#[1] TRUE
identical(eval(f), substitute( function(x){ x^2 }) )
#[1] FALSE
deparse(f)
#[1] "function (x) " "{" " x^2" "}"
body(f)
#------
{
x^2
}
#---------
eval(parse(text=deparse(f)))
#---------
function (x)
{
x^2
}
#-----------
parse(text=deparse(f))
#--------
expression(function (x)
{
x^2
})
#--------
get("f")
# function(x){ x^2 }
The print representation may not display the full features of the values returned.
class(substitute(function(x){ x^2 }) )
#[1] "call"
class( eval(f) )
#[1] "function"
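If what is wanted is a call object (as substitute(function(x){ x^2 }) would give) rather than the closure itself, one possible approach is to rebuild the call from the function's formals and body. This is a sketch under that assumption; tree and g are names invented for the example, and the rebuilt call does not carry the original source formatting.

```r
f <- function(x){ x^2 }

# Rebuild an unevaluated `function` call from the pieces of the closure.
tree <- call("function", formals(f), body(f))

tree_class <- class(tree)   # a language object, not a closure
tree_head  <- tree[[1]]     # the symbol `function`

# Evaluating the call yields a working function again.
g <- eval(tree)
val <- g(3)
print(val)
```

The individual components can then be walked like any other call, for example tree[[3]] for the body.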
The function substitute can substitute in values bound to an environment. The odd thing is that its env argument does not possess a default value, but it defaults to the evaluation environment. This behavior seems to make it fail when the evaluation environment is the global environment, but works fine otherwise.
Here is an example:
> a = new.env()
> a$f = function(x){x^2}
> substitute(f, a)
function(x){x^2}
> f = function(x){x^2}
> environment()
<environment: R_GlobalEnv>
> substitute(f, environment())
f
> substitute(f, globalenv())
f
As demonstrated, when using the global environment as the second argument the functionality fails.
A further demonstration that it works correctly using a but not the global environment:
> evalq(substitute(f), a)
function(x){x^2}
> evalq(substitute(f), environment())
f
Quite puzzling.
Apparently that's indeed some weird quirk of substitute and is mentioned here:
/* do_substitute has two arguments, an expression and an
environment (optional). Symbols found in the expression are
substituted with their values as found in the environment. There is
no inheritance so only the supplied environment is searched. If no
environment is specified the environment in which substitute was
called is used. If the specified environment is R_GlobalEnv it is
converted to R_NilValue, for historical reasons. In substitute(),
R_NilValue signals that no substitution should be done, only
extraction of promise expressions. Arguments to do_substitute
should not be evaluated.
*/
And you have already found a way of circumventing it:
e = new.env()
e$fn = f
substitute(fn, e)

Lazy evaluation of supplied arguments

Say I have the following function:
foo <- function(x, y = min(m)) {
m <- 1:10
x + y
}
When I run foo(1), the returned value is 2, as expected. However, I cannot run foo(1, y = max(m)) and receive 11, since lazy evaluation only works for default arguments. How can I supply an argument but have it evaluate lazily?
The simple answer is that you can't and shouldn't try to. That breaks scope and could wreak havoc if it were allowed. There are a few ways to think about the problem differently.
First, pass y as a function:
foo<-function(x,y=min){
m<-1:10
x+y(m)
}
If a simple function does not work, you can move m to an argument with a default:
foo<-function(x,y=min(m),m=1:10){
x+y
}
Since this is a toy example I would assume that this would be too trivial. If you insist on breaking scope then you can pass it as an expression that is evaluated explicitly.
foo<-function(x,y=expression(min(m))){
m<-1:10
x+eval(y)
}
Then there is the option of returning a function from another function. And that might work for you as well, depending on your purpose.
bar<-function(f)function(x,y=f(m)){
m<-1:10
x+y
}
foo.min<-bar(min)
foo.min(1) #2
foo.max<-bar(max)
foo.max(1) #11
But now we are starting to get into the ridiculous.
My solution was to just change the default argument:
R> formals(foo)$y <- call("max", as.name("m"))
R> foo(1)
[1] 11
You can use a substitute, eval combination.
foo <- function(x, y = min(m)) {
y <- substitute(y)
m <- 1:10
x + eval(y)
}
foo(1)
## [1] 2
foo(1, y = max(m))
## [1] 11
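For contrast, the sketch below shows what happens with the original definition of foo: a supplied argument is evaluated in the caller's environment, while a default is evaluated inside the function. foo_orig and caller are hypothetical names used only for this illustration.

```r
# Original definition, without the substitute/eval step.
foo_orig <- function(x, y = min(m)) {
  m <- 1:10
  x + y
}

# The default min(m) is evaluated inside foo_orig, so it sees m <- 1:10.
default_result <- foo_orig(1)

# A supplied max(m) is evaluated where the call is written, so it sees
# the caller's m, not the one defined inside foo_orig.
caller <- function() {
  m <- c(100, 200)
  foo_orig(1, y = max(m))
}
supplied_result <- caller()

print(c(default_result, supplied_result))
```

The substitute version above sidesteps this by capturing the unevaluated expression and evaluating it after m has been defined inside the function.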
