What is the difference between finish and continue in browser()? - r

In the help file for browser, there are two options that seem very similar:
f
finish execution of the current loop or function
c
exit the browser and continue execution at the next statement.
What is the difference between them and in what situations is the difference apparent?
Some clues about what may be the difference - I wrote a script called browse.R with the following contents:
for (i in 1:2){
browser()
print(i)
}
This is the results of usingc vs f:
> source("browse.R")
Called from: eval(expr, envir, enclos)
Browse[1]> c
[1] 1
Called from: eval(expr, envir, enclos)
Browse[1]> c
[1] 2
> source("browse.R")
Called from: eval(expr, envir, enclos)
Browse[1]> f
[1] 1
Browse[2]> f
[1] 2
Note that the level of Browse[n] changes. This still doesn't highlight any practical difference between them.
I also tried to see if perhaps things would disappear from the browser environment:
for (i in 1:2){
a <- "not modified"
browser()
print(a)
}
Called from: top level
Browse[1]> a <- "modified"
Browse[1]> f
[1] "modified"
Browse[1]> a
[1] "not modified"
Browse[1]> a <- "modified"
Browse[1]> c
[1] "modified"
So there's no difference there either.

There is a small difference.
c immediately exits the browser (and debug mode) and after that executes the rest of the code in the normal way.
f on the contrary stays in the browser (and debug mode) while executing the rest of the function/loop. After the function/loop is finished, he also returns to the normal execution mode.
Source: R-source (line 1105-1117) and R-help
This has a few implications:
c closes the browser. This means that a new browser call is called from a function. Therefore you will see the line: Called from: function(). f on the other hand will not close the browser and therefore you will not see this line. The source code for this behavior is here: https://github.com/wch/r-source/....
Because f stays in the browser, f also keeps track of the contextlevel:
The browser prompt is of the form Browse[n]>: here var{n} indicates the ‘browser level’. The browser can be called when browsing (and often is when debug is in use), and each recursive call increases the number. (The actual number is the number of ‘contexts’ on the context stack: this is usually 2 for the outer level of browsing and 1 when examining dumps in debugger)
These differences can be tested with the code:
> test <- function(){
browser()
browser()
}
> test()
Called from: test()
Browse[1]> c
Called from: test()
Browse[1]> c
> test()
Called from: test()
Browse[1]> f
Browse[2]> f
As far as I see it, there is no practical difference between the two, unless there lies a practical purpose in the context stack. The debugging mode has no added value. The debug flag only opens the browser when you enter the function but since you are already inside the function, it will not trigger another effect.

Difference Between Browser and Continue
At least for me, I feel the answer can be mapped out as a table, however, let's first frame up the usage of browser(), for those who may not yet have encountered it.
The browser function is the basis for the majority of R debugging techniques. Essentially, a call to browser halts execution and starts a special interactive session where you can inspect the current state of the computations and step through the code one command at a time.
Once in the browser, you can execute any R command. For example, one might view the local environment by using ls(); or choose to set new variables, or change the values assigned to variables simply by using the standard methods for assigning values to variables. The browser also understands a small set of
commands specific to it. Which leads us to a discussion on Finish and continue...
The subtlety in relation to Finish and continue is that:
Finish, or f: finishes execution of the current loop or function.
Continue, c: leaves interactive debugging and continues regular
execution of the function. This is useful if you’ve fixed the bad
state and want to check that the function proceeds correctly.
essentially, we talking about a subtlety in mode.
Browser / Recover Overview
At least for me, you have to view this in the context of debugging a program written in R. Specifically, how you might apply Finish and continue. I am sure many understand this, but I include for completeness as I personally really didn't for a long time.
browser allows you to look at the objects in the function in which the browser call is placed.
recover allows you to look at those objects as well as the objects in the caller of that function and all other active functions.
Liberal use of browser, recover, cat and print while you are writing functions allows your expectations and R's expectations to converge.
A very handy way of doing this is with trace. For example, if browsing at
the end of the myFun function is convenient, then you can do:
trace(myFun, exit=quote(browser()))
You can customize the tracing with a command like:
trace(myFun, edit=TRUE)
If you run into an error, then debugging is the appropriate action. There are at
least two approaches to debugging. The first approach is to look at the state of
play at the point where the error occurs. Prepare for this by setting the error
option. The two most likely choices are:
options(error=recover)
or
options(error=dump.frames)
The difference is that with recover you are automatically thrown into debug
mode, but with dump.frames you start debugging by executing:
debugger()
In either case you are presented with a selection of the frames (environments)
of active functions to inspect.
You can force R to treat warnings as errors with the command:
options(warn=2)
If you want to set the error option in your .First function, then you need a
trick since not everything is in place at the time that .First is executed:
options(error=expression(recover()))
or
options(error=expression(dump.frames()))
The second idea for debugging is to step through a function as it executes. If
you want to step through function myfun, then do:
debug(myfun)
and then execute a statement involving myfun. When you are done debugging,
do:
undebug(myfun)
A more sophisticated version of this sort of debugging may be found in the
debug package.
References:
R Inferno
http://www.burns-stat.com/pages/Tutor/R_inferno.pdf (Great Reference)
Circle 8 - Believing It Does as Intended (Page 45).
Author: Patrick Burns
R Programming for Bioinformatics
Author: Robert Gentleman

You can think of finish as a break in other languages. What happens is that you no longer care about the other items in the iteration because of a certain condition such as finding a specific item or an item that would cause an error.
continue, on the other hand, will stop at the current line of the loop, ignore the rest of the code block, and continue to the next item in the iteration. You would use this option if you intend to go through every item in the iteration and just ignore the items that satisfy the condition to skip over.

Related

R: How make dump.frames() include all variables for later post-mortem debugging with debugger()

I have the following code which provokes an error and writes a dump of all frames using dump.frames() as proposed e. g. by Hadley Wickham:
a <- -1
b <- "Hello world!"
bad.function <- function(value)
{
log(value) # the log function may cause an error or warning depending on the value
}
tryCatch( {
a.local.value <- 42
bad.function(a)
bad.function(b)
},
error = function(e)
{
dump.frames(to.file = TRUE)
})
When I restart the R session and load the dump to debug the problem via
load(file = "last.dump.rda")
debugger(last.dump)
I cannot find my variables (a, b, a.local.value) nor my function "bad.function" anywhere in the frames.
This makes the dump nearly worthless to me.
What do I have to do to see all my variables and functions for a decent post-mortem analysis?
The output of debugger is:
> load(file = "last.dump.rda")
> debugger(last.dump)
Message: non-numeric argument to mathematical functionAvailable environments had calls:
1: tryCatch({
a.local.value <- 42
bad.function(a)
bad.function(b)
2: tryCatchList(expr, classes, parentenv, handlers)
3: tryCatchOne(expr, names, parentenv, handlers[[1]])
4: value[[3]](cond)
Enter an environment number, or 0 to exit
Selection:
PS: I am using R3.3.2 with RStudio for debugging.
Update Nov. 20, 2016: Note that it is not an R bug (see answer of Martin Maechler). I did not change my answer for reproducibility. The described work around still applies.
Summary
I think dump.frames(to.file = TRUE) is currently an anti pattern (or probably a bug) in R if you want to debug errors of batch jobs in a new R session.
You should better replace it with
dump.frames()
save.image(file = "last.dump.rda")
or
options(error = quote({dump.frames(); save.image(file = "last.dump.rda")}))
instead of
options(error = dump.frames)
because the global environment (.GlobalEnv = the user workspace you normally create your objects) is included then in the dump while it is missing when you save the dump directly via dump.frames(to.file = TRUE).
Impact analysis
Without the .GlobalEnv you loose important top level objects (and their current values ;-) to understand the behaviour of your code that led to an error!
Especially in case of errors in "non-interactive" R batch jobs you are lost without .GlobalEnv since you can debug only in a newly started (empty) interactive workspace where you then can only access the objects in the call stack frames.
Using the code snippet above you can examine the object values that led to the error in a new R workspace as usual via:
load(file = "last.dump.rda")
debugger(last.dump)
Background
The implementation of dump.frames creates a variable last.dump in the workspace and fills it with the environments of the call stack (sys.frames(). Each environment contains the "local variables" of the called function). Then it saves this variable into a file using save().
The frame stack (call stack) grows with each call of a function, see ?sys.frames:
.GlobalEnv is given number 0 in the list of frames. Each subsequent
function evaluation increases the frame stack by 1 and the [...] environment for evaluation of that function are returned by [...] sys.frame with the appropriate index.
Observe that the .GlobalEnv has the index number 0.
If I now start debugging the dump produced by the code in the question and select the frame 1 (not 0!) I can see a variable parentenv which points (references) the .GlobalEnv:
Browse[1]> environmentName(parentenv)
[1] "R_GlobalEnv"
Hence I believe that sys.frames does not contain the .GlobalEnv and therefore dump.frames(to.file = TRUE) neither since it only stores the sys.frames without all other objects of the .GlobalEnv.
Maybe I am wrong, but this looks like an unwanted effect or even a bug.
Discussions welcome!
References
https://cran.r-project.org/doc/manuals/R-exts.pdf
Excerpt from section 4.2 Debugging R code (page 96):
Because last.dump can be looked at later or even in another R session,
post-mortem debug- ging is possible even for batch usage of R. We do
need to arrange for the dump to be saved: this can be done either
using the command-line flag
--save to save the workspace at the end of the run, or via a setting such as
options(error = quote({dump.frames(to.file=TRUE); q()}))
Note that it is often more productive to work with the R Core team rather than just telling that R has a bug. It clearly has no bug, here, as it behaves exactly as documented.
Also there is no problem if you work interactively, as you have full access to your workspace (which may be LARGE) there, so the problem applies only to batch jobs (as you've mentioned).
What we rather have here is a missing feature and feature requests (and bug reports!) should happen on the R bug site (aka _'R bugzilla'), https://bugs.r-project.org/ ... typically however after having read the corresponding page on the R website: https://www.r-project.org/bugs.html.
Note that R bugzilla is searchable, and in the present case, you'd pretty quickly find that Andreas Kersting made a nice proposal (namely as a wish, rather than claiming a bug),
https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17116
and consequently I had added the missing feature to R, on Aug.16, already.
Yes, of course, the development version of R, aka R-devel.
See also today's thread on the R-devel mailing list,
https://stat.ethz.ch/pipermail/r-devel/2016-November/073378.html

R script line numbers at error 2016 [duplicate]

I get an error when using an R function that I wrote:
Warning messages:
1: glm.fit: algorithm did not converge
2: glm.fit: algorithm did not converge
What I have done:
Step through the function
Adding print to find out at what line the error occurs suggests two functions that should not use glm.fit. They are window() and save().
My general approaches include adding print and stop commands, and stepping through a function line by line until I can locate the exception.
However, it is not clear to me using those techniques where this error comes from in the code. I am not even certain which functions within the code depend on glm.fit. How do I go about diagnosing this problem?
I'd say that debugging is an art form, so there's no clear silver bullet. There are good strategies for debugging in any language, and they apply here too (e.g. read this nice article). For instance, the first thing is to reproduce the problem...if you can't do that, then you need to get more information (e.g. with logging). Once you can reproduce it, you need to reduce it down to the source.
Rather than a "trick", I would say that I have a favorite debugging routine:
When an error occurs, the first thing that I usually do is look at the stack trace by calling traceback(): that shows you where the error occurred, which is especially useful if you have several nested functions.
Next I will set options(error=recover); this immediately switches into browser mode where the error occurs, so you can browse the workspace from there.
If I still don't have enough information, I usually use the debug() function and step through the script line by line.
The best new trick in R 2.10 (when working with script files) is to use the findLineNum() and setBreakpoint() functions.
As a final comment: depending upon the error, it is also very helpful to set try() or tryCatch() statements around external function calls (especially when dealing with S4 classes). That will sometimes provide even more information, and it also gives you more control over how errors are handled at run time.
These related questions have a lot of suggestions:
Debugging tools for the R language
Debugging lapply/sapply calls
Getting the state of variables after an error occurs in R
R script line numbers at error?
The best walkthrough I've seen so far is:
http://www.biostat.jhsph.edu/%7Erpeng/docs/R-debug-tools.pdf
Anybody agree/disagree?
As was pointed out to me in another question, Rprof() and summaryRprof() are nice tools to find slow parts of your program that might benefit from speeding up or moving to a C/C++ implementation. This probably applies more if you're doing simulation work or other compute- or data-intensive activities. The profr package can help visualizing the results.
I'm on a bit of a learn-about-debugging kick, so another suggestion from another thread:
Set options(warn=2) to treat warnings like errors
You can also use options to drop you right into the heat of the action when an error or warning occurs, using your favorite debugging function of choice. For instance:
Set options(error=recover) to run recover() when an error occurs, as Shane noted (and as is documented in the R debugging guide. Or any other handy function you would find useful to have run.
And another two methods from one of #Shane's links:
Wrap an inner function call with try() to return more information on it.
For *apply functions, use .inform=TRUE (from the plyr package) as an option to the apply command
#JoshuaUlrich also pointed out a neat way of using the conditional abilities of the classic browser() command to turn on/off debugging:
Put inside the function you might want to debug browser(expr=isTRUE(getOption("myDebug")))
And set the global option by options(myDebug=TRUE)
You could even wrap the browser call: myBrowse <- browser(expr=isTRUE(getOption("myDebug"))) and then call with myBrowse() since it uses globals.
Then there are the new functions available in R 2.10:
findLineNum() takes a source file name and line number and returns the function and environment. This seems to be helpful when you source() a .R file and it returns an error at line #n, but you need to know what function is located at line #n.
setBreakpoint() takes a source file name and line number and sets a breakpoint there
The codetools package, and particularly its checkUsage function can be particularly helpful in quickly picking up syntax and stylistic errors that a compiler would typically report (unused locals, undefined global functions and variables, partial argument matching, and so forth).
setBreakpoint() is a more user-friendly front-end to trace(). Details on the internals of how this works are available in a recent R Journal article.
If you are trying to debug someone else's package, once you have located the problem you can over-write their functions with fixInNamespace and assignInNamespace, but do not use this in production code.
None of this should preclude the tried-and-true standard R debugging tools, some of which are above and others of which are not. In particular, the post-mortem debugging tools are handy when you have a time-consuming bunch of code that you'd rather not re-run.
Finally, for tricky problems which don't seem to throw an error message, you can use options(error=dump.frames) as detailed in this question:
Error without an error being thrown
At some point, glm.fit is being called. That means one of the functions you call or one of the functions called by those functions is using either glm, glm.fit.
Also, as I mention in my comment above, that is a warning not an error, which makes a big difference. You can't trigger any of R's debugging tools from a warning (with default options before someone tells me I am wrong ;-).
If we change the options to turn warnings into errors then we can start to use R's debugging tools. From ?options we have:
‘warn’: sets the handling of warning messages. If ‘warn’ is
negative all warnings are ignored. If ‘warn’ is zero (the
default) warnings are stored until the top-level function
returns. If fewer than 10 warnings were signalled they will
be printed otherwise a message saying how many (max 50) were
signalled. An object called ‘last.warning’ is created and
can be printed through the function ‘warnings’. If ‘warn’ is
one, warnings are printed as they occur. If ‘warn’ is two or
larger all warnings are turned into errors.
So if you run
options(warn = 2)
then run your code, R will throw an error. At which point, you could run
traceback()
to see the call stack. Here is an example.
> options(warn = 2)
> foo <- function(x) bar(x + 2)
> bar <- function(y) warning("don't want to use 'y'!")
> foo(1)
Error in bar(x + 2) : (converted from warning) don't want to use 'y'!
> traceback()
7: doWithOneRestart(return(expr), restart)
6: withOneRestart(expr, restarts[[1L]])
5: withRestarts({
.Internal(.signalCondition(simpleWarning(msg, call), msg,
call))
.Internal(.dfltWarn(msg, call))
}, muffleWarning = function() NULL)
4: .signalSimpleWarning("don't want to use 'y'!", quote(bar(x +
2)))
3: warning("don't want to use 'y'!")
2: bar(x + 2)
1: foo(1)
Here you can ignore the frames marked 4: and higher. We see that foo called bar and that bar generated the warning. That should show you which functions were calling glm.fit.
If you now want to debug this, we can turn to another option to tell R to enter the debugger when it encounters an error, and as we have made warnings errors we will get a debugger when the original warning is triggered. For that you should run:
options(error = recover)
Here is an example:
> options(error = recover)
> foo(1)
Error in bar(x + 2) : (converted from warning) don't want to use 'y'!
Enter a frame number, or 0 to exit
1: foo(1)
2: bar(x + 2)
3: warning("don't want to use 'y'!")
4: .signalSimpleWarning("don't want to use 'y'!", quote(bar(x + 2)))
5: withRestarts({
6: withOneRestart(expr, restarts[[1]])
7: doWithOneRestart(return(expr), restart)
Selection:
You can then step into any of those frames to see what was happening when the warning was thrown.
To reset the above options to their default, enter
options(error = NULL, warn = 0)
As for the specific warning you quote, it is highly likely that you need to allow more iterations in the code. Once you've found out what is calling glm.fit, work out how to pass it the control argument using glm.control - see ?glm.control.
So browser(), traceback() and debug() walk into a bar, but trace() waits outside and keeps the motor running.
By inserting browser somewhere in your function, the execution will halt and wait for your input. You can move forward using n (or Enter), run the entire chunk (iteration) with c, finish the current loop/function with f, or quit with Q; see ?browser.
With debug, you get the same effect as with browser, but this stops the execution of a function at its beginning. Same shortcuts apply. This function will be in a "debug" mode until you turn it off using undebug (that is, after debug(foo), running the function foo will enter "debug" mode every time until you run undebug(foo)).
A more transient alternative is debugonce, which will remove the "debug" mode from the function after the next time it's evaluated.
traceback will give you the flow of execution of functions all the way up to where something went wrong (an actual error).
You can insert code bits (i.e. custom functions) in functions using trace, for example browser. This is useful for functions from packages and you're too lazy to get the nicely folded source code.
My general strategy looks like:
Run traceback() to see look for obvious issues
Set options(warn=2) to treat warnings like errors
Set options(error=recover) to step into the call stack on error
After going through all the steps suggested here I just learned that setting .verbose = TRUE in foreach() also gives me tons of useful information. In particular foreach(.verbose=TRUE) shows exactly where an error occurs inside the foreach loop, while traceback() does not look inside the foreach loop.
Mark Bravington's debugger which is available as the package debug on CRAN is very good and pretty straight forward.
library(debug);
mtrace(myfunction);
myfunction(a,b);
#... debugging, can query objects, step, skip, run, breakpoints etc..
qqq(); # quit the debugger only
mtrace.off(); # turn off debugging
The code pops up in a highlighted Tk window so you can see what's going on and, of course you can call another mtrace() while in a different function.
HTH
I like Gavin's answer: I did not know about options(error = recover). I also like to use the 'debug' package that gives a visual way to step through your code.
require(debug)
mtrace(foo)
foo(1)
At this point it opens up a separate debug window showing your function, with a yellow line showing where you are in the code. In the main window the code enters debug mode, and you can keep hitting enter to step through the code (and there are other commands as well), and examine variable values, etc. The yellow line in the debug window keeps moving to show where you are in the code. When done with debugging, you can turn off tracing with:
mtrace.off()
Based on the answer I received here, you should definitely check out the options(error=recover) setting. When this is set, upon encountering an error, you'll see text on the console similar to the following (traceback output):
> source(<my filename>)
Error in plot.window(...) : need finite 'xlim' values
In addition: Warning messages:
1: In xy.coords(x, y, xlabel, ylabel, log) : NAs introduced by coercion
2: In min(x) : no non-missing arguments to min; returning Inf
3: In max(x) : no non-missing arguments to max; returning -Inf
Enter a frame number, or 0 to exit
1: source(<my filename>)
2: eval.with.vis(ei, envir)
3: eval.with.vis(expr, envir, enclos)
4: LinearParamSearch(data = dataset, y = data.frame(LGD = dataset$LGD10), data.names = data
5: LinearParamSearch.R#66: plot(x = x, y = y.data, xlab = names(y), ylab = data.names[i])
6: LinearParamSearch.R#66: plot.default(x = x, y = y.data, xlab = names(y), ylab = data.nam
7: LinearParamSearch.R#66: localWindow(xlim, ylim, log, asp, ...)
8: LinearParamSearch.R#66: plot.window(...)
Selection:
At which point you can choose which "frame" to enter. When you make a selection, you'll be placed into browser() mode:
Selection: 4
Called from: stop(gettextf("replacement has %d rows, data has %d", N, n),
domain = NA)
Browse[1]>
And you can examine the environment as it was at the time of the error. When you're done, type c to bring you back to the frame selection menu. When you're done, as it tells you, type 0 to exit.
I gave this answer to a more recent question, but am adding it here for completeness.
Personally I tend not to use functions to debug. I often find that this causes as much trouble as it solves. Also, coming from a Matlab background I like being able to do this in an integrated development environment (IDE) rather than doing this in the code. Using an IDE keeps your code clean and simple.
For R, I use an IDE called "RStudio" (http://www.rstudio.com), which is available for windows, mac, and linux and is pretty easy to use.
Versions of Rstudio since about October 2013 (0.98ish?) have the capability to add breakpoints in scripts and functions: to do this, just click on the left margin of the file to add a breakpoint. You can set a breakpoint and then step through from that point on. You also have access to all of the data in that environment, so you can try out commands.
See http://www.rstudio.com/ide/docs/debugging/overview for details. If you already have Rstudio installed, you may need to upgrade - this is a relatively new (late 2013) feature.
You may also find other IDEs that have similar functionality.
Admittedly, if it's a built-in function you may have to resort to some of the suggestions made by other people in this discussion. But, if it's your own code that needs fixing, an IDE-based solution might be just what you need.
To debug Reference Class methods without instance reference
ClassName$trace(methodName, browser)
I am beginning to think that not printing error line number - a most basic requirement - BY DEFAILT- is some kind of a joke in R/Rstudio. The only reliable method I have found to find where an error occurred is to make the additional effort of calloing traceback() and see the top line.

Turning off debugging shortcuts

I am examining a package by debugging in RStudio and there are objects I would like to examine - so I type the name into the console. However if the name starts with one of s,f,c or q then a debugging action is carried out as these correspond to the shortcuts.
i.e. If I want to see the contents of object q I type q and the debugger ends as this is the shortcut for quit
Is it possible to turn off these shortcuts or perhaps reassign them to something like alt + q for example?
These shortcuts are hard-coded into R itself, so you can't change or reassign them in RStudio.
However, it's easy to work around the problem: just use get("s") instead of s. E.g.:
> s <- 12
Now entering the debugger and typing s steps out:
> browser()
Called from: top level
Browse[1]> s
>
Using get("s") to see the value:
> browser()
Called from: top level
Browse[1]> get("s")
[1] 12

Error message from somewhere unknown - how to debug? [duplicate]

I get an error when using an R function that I wrote:
Warning messages:
1: glm.fit: algorithm did not converge
2: glm.fit: algorithm did not converge
What I have done:
Step through the function
Adding print to find out at what line the error occurs suggests two functions that should not use glm.fit. They are window() and save().
My general approaches include adding print and stop commands, and stepping through a function line by line until I can locate the exception.
However, it is not clear to me using those techniques where this error comes from in the code. I am not even certain which functions within the code depend on glm.fit. How do I go about diagnosing this problem?
I'd say that debugging is an art form, so there's no clear silver bullet. There are good strategies for debugging in any language, and they apply here too (e.g. read this nice article). For instance, the first thing is to reproduce the problem...if you can't do that, then you need to get more information (e.g. with logging). Once you can reproduce it, you need to reduce it down to the source.
Rather than a "trick", I would say that I have a favorite debugging routine:
When an error occurs, the first thing that I usually do is look at the stack trace by calling traceback(): that shows you where the error occurred, which is especially useful if you have several nested functions.
Next I will set options(error=recover); this immediately switches into browser mode where the error occurs, so you can browse the workspace from there.
If I still don't have enough information, I usually use the debug() function and step through the script line by line.
The best new trick in R 2.10 (when working with script files) is to use the findLineNum() and setBreakpoint() functions.
As a final comment: depending upon the error, it is also very helpful to set try() or tryCatch() statements around external function calls (especially when dealing with S4 classes). That will sometimes provide even more information, and it also gives you more control over how errors are handled at run time.
These related questions have a lot of suggestions:
Debugging tools for the R language
Debugging lapply/sapply calls
Getting the state of variables after an error occurs in R
R script line numbers at error?
The best walkthrough I've seen so far is:
http://www.biostat.jhsph.edu/%7Erpeng/docs/R-debug-tools.pdf
Anybody agree/disagree?
As was pointed out to me in another question, Rprof() and summaryRprof() are nice tools to find slow parts of your program that might benefit from speeding up or moving to a C/C++ implementation. This probably applies more if you're doing simulation work or other compute- or data-intensive activities. The profr package can help visualizing the results.
I'm on a bit of a learn-about-debugging kick, so another suggestion from another thread:
Set options(warn=2) to treat warnings like errors
You can also use options to drop you right into the heat of the action when an error or warning occurs, using your favorite debugging function of choice. For instance:
Set options(error=recover) to run recover() when an error occurs, as Shane noted (and as is documented in the R debugging guide. Or any other handy function you would find useful to have run.
And another two methods from one of #Shane's links:
Wrap an inner function call with try() to return more information on it.
For *apply functions, use .inform=TRUE (from the plyr package) as an option to the apply command
#JoshuaUlrich also pointed out a neat way of using the conditional abilities of the classic browser() command to turn on/off debugging:
Put inside the function you might want to debug browser(expr=isTRUE(getOption("myDebug")))
And set the global option by options(myDebug=TRUE)
You could even wrap the browser call: myBrowse <- browser(expr=isTRUE(getOption("myDebug"))) and then call with myBrowse() since it uses globals.
Then there are the new functions available in R 2.10:
findLineNum() takes a source file name and line number and returns the function and environment. This seems to be helpful when you source() a .R file and it returns an error at line #n, but you need to know what function is located at line #n.
setBreakpoint() takes a source file name and line number and sets a breakpoint there
The codetools package, and particularly its checkUsage function can be particularly helpful in quickly picking up syntax and stylistic errors that a compiler would typically report (unused locals, undefined global functions and variables, partial argument matching, and so forth).
setBreakpoint() is a more user-friendly front-end to trace(). Details on the internals of how this works are available in a recent R Journal article.
If you are trying to debug someone else's package, once you have located the problem you can over-write their functions with fixInNamespace and assignInNamespace, but do not use this in production code.
None of this should preclude the tried-and-true standard R debugging tools, some of which are above and others of which are not. In particular, the post-mortem debugging tools are handy when you have a time-consuming bunch of code that you'd rather not re-run.
Finally, for tricky problems which don't seem to throw an error message, you can use options(error=dump.frames) as detailed in this question:
Error without an error being thrown
At some point, glm.fit is being called. That means one of the functions you call or one of the functions called by those functions is using either glm, glm.fit.
Also, as I mention in my comment above, that is a warning not an error, which makes a big difference. You can't trigger any of R's debugging tools from a warning (with default options before someone tells me I am wrong ;-).
If we change the options to turn warnings into errors then we can start to use R's debugging tools. From ?options we have:
‘warn’: sets the handling of warning messages. If ‘warn’ is
negative all warnings are ignored. If ‘warn’ is zero (the
default) warnings are stored until the top-level function
returns. If fewer than 10 warnings were signalled they will
be printed otherwise a message saying how many (max 50) were
signalled. An object called ‘last.warning’ is created and
can be printed through the function ‘warnings’. If ‘warn’ is
one, warnings are printed as they occur. If ‘warn’ is two or
larger all warnings are turned into errors.
So if you run
options(warn = 2)
then run your code, R will throw an error. At which point, you could run
traceback()
to see the call stack. Here is an example.
> options(warn = 2)
> foo <- function(x) bar(x + 2)
> bar <- function(y) warning("don't want to use 'y'!")
> foo(1)
Error in bar(x + 2) : (converted from warning) don't want to use 'y'!
> traceback()
7: doWithOneRestart(return(expr), restart)
6: withOneRestart(expr, restarts[[1L]])
5: withRestarts({
.Internal(.signalCondition(simpleWarning(msg, call), msg,
call))
.Internal(.dfltWarn(msg, call))
}, muffleWarning = function() NULL)
4: .signalSimpleWarning("don't want to use 'y'!", quote(bar(x +
2)))
3: warning("don't want to use 'y'!")
2: bar(x + 2)
1: foo(1)
Here you can ignore the frames marked 4: and higher. We see that foo called bar and that bar generated the warning. That should show you which functions were calling glm.fit.
If you now want to debug this, we can turn to another option to tell R to enter the debugger when it encounters an error, and as we have made warnings errors we will get a debugger when the original warning is triggered. For that you should run:
options(error = recover)
Here is an example:
> options(error = recover)
> foo(1)
Error in bar(x + 2) : (converted from warning) don't want to use 'y'!
Enter a frame number, or 0 to exit
1: foo(1)
2: bar(x + 2)
3: warning("don't want to use 'y'!")
4: .signalSimpleWarning("don't want to use 'y'!", quote(bar(x + 2)))
5: withRestarts({
6: withOneRestart(expr, restarts[[1]])
7: doWithOneRestart(return(expr), restart)
Selection:
You can then step into any of those frames to see what was happening when the warning was thrown.
To reset the above options to their default, enter
options(error = NULL, warn = 0)
As for the specific warning you quote, it is highly likely that you need to allow more iterations in the code. Once you've found out what is calling glm.fit, work out how to pass it the control argument using glm.control - see ?glm.control.
So browser(), traceback() and debug() walk into a bar, but trace() waits outside and keeps the motor running.
By inserting browser somewhere in your function, the execution will halt and wait for your input. You can move forward using n (or Enter), run the entire chunk (iteration) with c, finish the current loop/function with f, or quit with Q; see ?browser.
With debug, you get the same effect as with browser, but this stops the execution of a function at its beginning. Same shortcuts apply. This function will be in a "debug" mode until you turn it off using undebug (that is, after debug(foo), running the function foo will enter "debug" mode every time until you run undebug(foo)).
A more transient alternative is debugonce, which will remove the "debug" mode from the function after the next time it's evaluated.
traceback will give you the flow of execution of functions all the way up to where something went wrong (an actual error).
You can insert code bits (i.e. custom functions) in functions using trace, for example browser. This is useful for functions from packages and you're too lazy to get the nicely folded source code.
My general strategy looks like:
Run traceback() to see look for obvious issues
Set options(warn=2) to treat warnings like errors
Set options(error=recover) to step into the call stack on error
After going through all the steps suggested here I just learned that setting .verbose = TRUE in foreach() also gives me tons of useful information. In particular foreach(.verbose=TRUE) shows exactly where an error occurs inside the foreach loop, while traceback() does not look inside the foreach loop.
Mark Bravington's debugger which is available as the package debug on CRAN is very good and pretty straight forward.
library(debug);
mtrace(myfunction);
myfunction(a,b);
#... debugging, can query objects, step, skip, run, breakpoints etc..
qqq(); # quit the debugger only
mtrace.off(); # turn off debugging
The code pops up in a highlighted Tk window so you can see what's going on and, of course you can call another mtrace() while in a different function.
HTH
I like Gavin's answer: I did not know about options(error = recover). I also like to use the 'debug' package that gives a visual way to step through your code.
require(debug)
mtrace(foo)
foo(1)
At this point it opens up a separate debug window showing your function, with a yellow line showing where you are in the code. In the main window the code enters debug mode, and you can keep hitting enter to step through the code (and there are other commands as well), and examine variable values, etc. The yellow line in the debug window keeps moving to show where you are in the code. When done with debugging, you can turn off tracing with:
mtrace.off()
Based on the answer I received here, you should definitely check out the options(error=recover) setting. When this is set, upon encountering an error, you'll see text on the console similar to the following (traceback output):
> source(<my filename>)
Error in plot.window(...) : need finite 'xlim' values
In addition: Warning messages:
1: In xy.coords(x, y, xlabel, ylabel, log) : NAs introduced by coercion
2: In min(x) : no non-missing arguments to min; returning Inf
3: In max(x) : no non-missing arguments to max; returning -Inf
Enter a frame number, or 0 to exit
1: source(<my filename>)
2: eval.with.vis(ei, envir)
3: eval.with.vis(expr, envir, enclos)
4: LinearParamSearch(data = dataset, y = data.frame(LGD = dataset$LGD10), data.names = data
5: LinearParamSearch.R#66: plot(x = x, y = y.data, xlab = names(y), ylab = data.names[i])
6: LinearParamSearch.R#66: plot.default(x = x, y = y.data, xlab = names(y), ylab = data.nam
7: LinearParamSearch.R#66: localWindow(xlim, ylim, log, asp, ...)
8: LinearParamSearch.R#66: plot.window(...)
Selection:
At which point you can choose which "frame" to enter. When you make a selection, you'll be placed into browser() mode:
Selection: 4
Called from: stop(gettextf("replacement has %d rows, data has %d", N, n),
domain = NA)
Browse[1]>
And you can examine the environment as it was at the time of the error. When you're done, type c to bring you back to the frame selection menu. When you're done, as it tells you, type 0 to exit.
I gave this answer to a more recent question, but am adding it here for completeness.
Personally I tend not to use functions to debug. I often find that this causes as much trouble as it solves. Also, coming from a Matlab background I like being able to do this in an integrated development environment (IDE) rather than doing this in the code. Using an IDE keeps your code clean and simple.
For R, I use an IDE called "RStudio" (http://www.rstudio.com), which is available for windows, mac, and linux and is pretty easy to use.
Versions of Rstudio since about October 2013 (0.98ish?) have the capability to add breakpoints in scripts and functions: to do this, just click on the left margin of the file to add a breakpoint. You can set a breakpoint and then step through from that point on. You also have access to all of the data in that environment, so you can try out commands.
See http://www.rstudio.com/ide/docs/debugging/overview for details. If you already have Rstudio installed, you may need to upgrade - this is a relatively new (late 2013) feature.
You may also find other IDEs that have similar functionality.
Admittedly, if it's a built-in function you may have to resort to some of the suggestions made by other people in this discussion. But, if it's your own code that needs fixing, an IDE-based solution might be just what you need.
To debug Reference Class methods without instance reference
ClassName$trace(methodName, browser)
I am beginning to think that not printing error line number - a most basic requirement - BY DEFAILT- is some kind of a joke in R/Rstudio. The only reliable method I have found to find where an error occurred is to make the additional effort of calloing traceback() and see the top line.

Debugging a function in a different source file in R

I'm using RStudio and I want to be able to stop the code execution at a specific line.
The functions are defined in the first script file and called from a second.
I source the first file into the second one using source("C:/R/script1.R")
I used run from beginning to line: where I start running from the second script which has the function calls and have highlighted a line in the first script where the function definitions are.
I then use browser() to view the variables. However this is not ideal as there are some large matrices involved. Is there a way to make these variables appear in RStudio's workspace?
Also when I restart using run from line to end it only runs to the end of the called first script file it does not return to the calling function and complete the running of the second file.
How can I achieve these goals in RStudio?
OK here is a trivial example the function adder below is defined in one script
adder<-function(a,b) {
browser()
return(a+b)
}
I than call is from a second script
x=adder(3,4)
When adder is called in the second script is starts browser() in the first one. From here I can use get("a") to get the value of a, but the values of a and b do not appear in the workspace in RStudio?
In the example here it does not really matter but when you have several large matrices it does.
If you assign the data, into the .GlobalEnv it will be shown in RStudio's "Workspace" tab.
> adder(3, 4)
Called from: adder(3, 4)
Browse[1]> a
[1] 3
Browse[1]> b
[1] 4
Browse[1]> assign('a', a, pos=.GlobalEnv)
Browse[1]> assign('b', b, pos=.GlobalEnv)
Browse[1]> c
[1] 7
> a
[1] 3
> b
[1] 4
What you refer to as RStudio's workspace is the global environment in an R session. Each function lives in its own small environment, not exposing its local variables to the global environment. Therefore a is not present in the object inspector of RStudio.
This is good programming practice as it shields sections of a larger script from each other, reducing the amount of unwanted interaction. For example, if you use i as a counter in one function, this does not influence the value of a counter i in another function.
You can inspect a when you are in the browser session by using any of the usual functions. For example,
head(a)
str(a)
summary(a)
View(a)
attributes(a)
One common tactic after calling browser is to get a summary of all variables in the current (parent) environment. Make it a habit that every time you stop code with browser, you immediately type ls.str() at the command line.

Resources