Is there any way to reproduce the environment which is used by devtools::check?
My tests pass with devtools::test() but fail within devtools::check(), and I don't know how to track down the cause. The check report prints only the last few lines of the error log, and I can't find the complete output of the test run.
checking tests ... ERROR
Running the tests in ‘tests/testthat.R’ failed.
Last 13 lines of output:
...
I know that check uses a different environment than test, but I don't know how to debug these problems since I can't reproduce them. In particular, these tests were passing a few months ago, so I'm not sure where to look for the problem.
EDIT
Actually, I tried to locate my problem and found a solution. But to post it, I have to add more details.
My test always failed because I was testing whether a markdown script runs without errors, and afterwards checking whether some variables in the environment are set correctly. These were results that the script calculates as well as standard settings that I set. So I wanted to get a warning if I forgot to change some of my settings after developing...
Anyway, since it is a markdown script, I had to extract the code. I used comments from the post "knitr: run all chunks in an Rmarkdown document", with knitr::purl to get the code and sys.source to execute it.
runAllChunks <- function(rmd, envir=globalenv()){
  # as found here https://stackoverflow.com/questions/24753969
  tempR <- tempfile(tmpdir = '.', fileext = ".R")
  on.exit(unlink(tempR))
  knitr::purl(rmd, output=tempR, quiet=TRUE)
  sys.source(tempR, envir=envir)
}
For some reason, this started producing an error a few weeks ago (I'm not sure which new packages I installed lately...). But a new comment there says that I can just use knitr::knit, which also executes the code. This worked as expected, and now my test no longer complains.
So in the end, I don't know where the problem exactly was, but this is now working.
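On the original question of where the complete test output ends up: as far as I know, R CMD check (which devtools::check() wraps) writes each test script's output into the tests/ subfolder of its <package>.Rcheck directory, and the output of a failing script is saved with a .Rout.fail suffix. Recent devtools versions accept a check_dir argument to keep that directory around. A minimal sketch for collecting those logs; the helper name find_failed_test_logs and the demo directory are made up for illustration:

```r
# Hypothetical helper: list the full logs of failed test scripts under a
# check directory such as "mypkg.Rcheck".
find_failed_test_logs <- function(check_dir) {
  list.files(
    file.path(check_dir, "tests"),
    pattern = "\\.Rout\\.fail$",
    full.names = TRUE
  )
}

# Demo on a fake check directory so the sketch runs without a real package.
demo_dir <- file.path(tempfile("demo_"), "mypkg.Rcheck")
dir.create(file.path(demo_dir, "tests"), recursive = TRUE)
writeLines(
  "Error: something failed",
  file.path(demo_dir, "tests", "testthat.Rout.fail")
)

logs <- find_failed_test_logs(demo_dir)
cat(readLines(logs[1]), sep = "\n")
```

With a real package you would point the helper at the .Rcheck directory that the check left behind.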
I recently had a similar issue with my tests breaking (succeeding with devtools::test() but failing with devtools::check()). I don't know if this solution necessarily fixes the problem above, but it should help to track down similar problems.
In my case, the problem ultimately came down to using a function that needed a package listed in Suggests rather than in Imports/Depends. In particular, my function called httr::content(), which broke when I tried to pass it the as = "parsed" argument. It turns out that as = "parsed" uses a suggested package, readr to read a csv, and I needed to add it to my dependencies for devtools::check() to work.
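One way to make such hidden dependencies visible is to ask explicitly whether each package your code relies on can be loaded. A minimal sketch, assuming you maintain the list of package names yourself; the helper name missing_packages and the second package name are made up for illustration:

```r
# Return the subset of `pkgs` that cannot be loaded in this R installation.
missing_packages <- function(pkgs) {
  available <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!available]
}

# "stats" ships with R; the second name is deliberately bogus.
not_found <- missing_packages(c("stats", "aPackageThatDoesNotExist"))
print(not_found)
```

Inside package code the same requireNamespace() call can guard a code path that needs a suggested package.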
This is a known issue with testthat. The workaround is to add the following as the 1st line in tests/testthat.R:
Sys.setenv(R_TESTS="")
In case it helps someone else, this is what worked for me
Re-install all relevant packages. E.g. install.packages(c("testthat", "dplyr", "lubridate", "stringr")) (I included all packages my package uses; note that the names must be passed as one character vector)
Close RStudio and reopen
Then all tests passed
I spent much too long looking into this error, so I'm hoping that I can help someone out in the future. I was getting this error while using ggplot2::autoplot() in my function, and the fix was to add @import ggfortify to the roxygen block of my function.
I ran into the same issue, with my tests failing under devtools::check() while not failing under devtools::test().
None of the above applied to my problem, so I decided to post my issue plus solution here as well. But first a note from my experience:
devtools::check() does - so it seems - deeper error checking than your own written tests.
Now to my code setup. I had a function that was built to retrieve values from two different files. Those files contained named profiles with a set of values per profile. But the profiles were named differently, depending on the file:
Example files:
Content of file_one:
[default]
value_A = "foo"
value_B = "bar"
value_C = "baz"
[peter]
value_A = "oof"
value_B = "rab"
value_C = "zab"
Content of file_two:
[default]
value_X = "fuzzly"
value_Z = "puzzly"
[profile peter]
value_X = "fuzzly"
value_Z = "puzzly"
As you can see, file two follows a different naming convention for the named profiles. The profiles are written in "[]" and the default profile is always '[default]' in both files. But a named profile is just '[name]' in one file and '[profile name]' in the other.
Now I've built the function like this (simplified):
get_value <- function(file, what, profile) {
file_content <- readr::read_lines(file)
all_profiles_at <- grep("\\[.*\\]", file_content)
profile_regex <- paste0("\\[",if(file_content == "file_two" && profile != "default") "profile ",profile,"\\]")
profile_at <- grep(profile_regex, file_content)
profile_ends_at <- if(profile_at == max(all_profiles_at)) length(file_content) else all_profiles_at[grep(paste0("^",profile_at,"$"), all_profiles_at) + 1] -1
profile_content <- file_content[profile_at:profile_ends_at]
whole_what <- stringr::str_replace_all(profile_content[grep(paste0("^",what,".*"), profile_content)], " ", "")
return(stringr::str_sub(whole_what, stringr::str_length(paste0(what,"=."))))
}
With this code my tests ran smoothly and even check() found no issues.
As the code evolved, I figured that I should read the files' content beforehand and pass only the already read-in content to the function, to avoid duplication in my code. So I changed the function like so:
get_value <- function(file_content, what, profile) {
is_file_two <- is_file_two(file_content)
all_profiles_at <- grep("\\[.*\\]", file_content)
profile_regex <- paste0("\\[",if(file_content == "file_two" && profile != "default") "profile ",profile,"\\]")
profile_at <- grep(profile_regex, file_content)
profile_ends_at <- if(profile_at == max(all_profiles_at)) length(file_content) else all_profiles_at[grep(paste0("^",profile_at,"$"), all_profiles_at) + 1] -1
profile_content <- file_content[profile_at:profile_ends_at]
whole_what <- stringr::str_replace_all(profile_content[grep(paste0("^",what,".*"), profile_content)], " ", "")
return(stringr::str_sub(whole_what, stringr::str_length(paste0(what,"=."))))
}
As you might notice, I only changed the first line of the function body and left the if-condition unchanged - my mistake!
But my tests didn't throw an error, because the if-condition still worked, even though the 'file_content == "file_two"' part now generated a logical vector and if() ... else ... normally throws a warning when the condition has length > 1. The construct with && doesn't trigger that warning, as it returns a length-1 logical:
# with warning
if(c(FALSE, FALSE, FALSE)) "Done!" else "Not done!"
# no warning:
if(c(FALSE, FALSE, FALSE) && TRUE) "Done!" else "Not done!"
That's why my tests with devtools::test() still worked.
But devtools::check() saw this flaw in my code and the tests failed!
And that part of the FAILURE_REPORT showed me my errors:
[...]
where 41: test_check("my_package_name")
--- value of length: 18 type: logical ---
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[13] FALSE FALSE FALSE FALSE FALSE FALSE
--- function from context ---
[...]
Conclusion:
devtools::test() is great! It checks whether or not your code still runs. But devtools::check() goes far deeper, and when your tests pass with devtools::test() but fail with devtools::check(), then you've probably got some deeper bugs and flaws in your code that you MUST attend to!
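How strictly R treats such a condition depends on the R version: as far as I know, a length > 1 condition in if() has been an error since R 4.2.0, and a length > 1 operand of && since R 4.3.0. A minimal sketch that reports what the running R version does and shows one scalar-safe way to write the check; the file_content vector is made-up sample data, and reducing with any() is only one possible fix:

```r
# Made-up sample data standing in for the lines read from a profile file.
file_content <- c("[default]", "value_X = \"fuzzly\"", "[profile peter]")

# The buggy comparison yields one logical per line, not a single value.
n_compared <- length(file_content == "file_two")

# Report how this R version treats a length > 1 value on the left of `&&`.
outcome <- tryCatch(
  {
    if (file_content == "file_two" && TRUE) "Done!" else "Not done!"
  },
  warning = function(w) "warning",
  error = function(e) "error"
)
print(outcome)

# Scalar-safe alternative: reduce the vector to one value explicitly.
any_line_matches <- any(file_content == "file_two")
if (any_line_matches && TRUE) "Done!" else "Not done!"
```

Reducing explicitly also documents the intent, which the silent first-element behaviour of older R versions did not.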
So, as I briefly mentioned above, I changed my code to use knitr::knit instead of knitr::purl, and this solved my problem.
expect_error(f <- runAllChunks('010_main_lfq_analysis.Rmd'), NA)
expect_error(f <- knitr::knit('010_main_lfq_analysis.Rmd', output='jnk.R', quiet=TRUE, envir=globalenv()), NA)
This could also happen in the following scenario: you have a library already loaded in your R session and you refer to a function from that library without its namespace prefix. For example, suppose you use the nnzero() function from the Matrix package in a test file and happen to have loaded Matrix beforehand with library(Matrix). Then devtools::test() will pass but devtools::check() will fail. Using Matrix::nnzero() should fix the problem.
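A minimal sketch of the namespace-qualified call, which does not depend on what happens to be attached in the session. It assumes the Matrix package is installed and skips the call otherwise; the small matrix is made-up data:

```r
# Made-up input: a 2 x 2 matrix with two non-zero entries.
m <- matrix(c(0, 1, 0, 2), nrow = 2)

# Whether Matrix is attached should not matter for a qualified call.
matrix_attached <- "package:Matrix" %in% search()

n_nonzero <- NA
if (requireNamespace("Matrix", quietly = TRUE)) {
  n_nonzero <- Matrix::nnzero(m)
}
print(n_nonzero)
```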
I get an error when using an R function that I wrote:
Warning messages:
1: glm.fit: algorithm did not converge
2: glm.fit: algorithm did not converge
What I have done:
Step through the function
Adding print to find out at what line the error occurs suggests two functions that should not use glm.fit. They are window() and save().
My general approaches include adding print and stop commands, and stepping through a function line by line until I can locate the exception.
However, it is not clear to me using those techniques where this error comes from in the code. I am not even certain which functions within the code depend on glm.fit. How do I go about diagnosing this problem?
I'd say that debugging is an art form, so there's no clear silver bullet. There are good strategies for debugging in any language, and they apply here too (e.g. read this nice article). For instance, the first thing is to reproduce the problem...if you can't do that, then you need to get more information (e.g. with logging). Once you can reproduce it, you need to reduce it down to the source.
Rather than a "trick", I would say that I have a favorite debugging routine:
When an error occurs, the first thing that I usually do is look at the stack trace by calling traceback(): that shows you where the error occurred, which is especially useful if you have several nested functions.
Next I will set options(error=recover); this immediately switches into browser mode where the error occurs, so you can browse the workspace from there.
If I still don't have enough information, I usually use the debug() function and step through the script line by line.
The best new trick in R 2.10 (when working with script files) is to use the findLineNum() and setBreakpoint() functions.
As a final comment: depending upon the error, it is also very helpful to set try() or tryCatch() statements around external function calls (especially when dealing with S4 classes). That will sometimes provide even more information, and it also gives you more control over how errors are handled at run time.
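The tryCatch() idea from the final comment can be sketched as a small wrapper that returns the error message and the failing call instead of aborting. The names try_with_context and risky are made up for illustration:

```r
# Evaluate `expr`; on error, return the message and the call that raised it.
try_with_context <- function(expr) {
  tryCatch(
    list(ok = TRUE, value = expr),
    error = function(e) {
      list(
        ok = FALSE,
        message = conditionMessage(e),
        call = paste(deparse(conditionCall(e)), collapse = " ")
      )
    }
  )
}

# Stand-in for an external function that may fail.
risky <- function(x) {
  if (!is.numeric(x)) stop("x must be numeric")
  sqrt(x)
}

good <- try_with_context(risky(4))
bad <- try_with_context(risky("a"))
str(bad)
```

Because the handler returns a list, the caller can log the details and decide whether to continue.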
These related questions have a lot of suggestions:
Debugging tools for the R language
Debugging lapply/sapply calls
Getting the state of variables after an error occurs in R
R script line numbers at error?
The best walkthrough I've seen so far is:
http://www.biostat.jhsph.edu/%7Erpeng/docs/R-debug-tools.pdf
Anybody agree/disagree?
As was pointed out to me in another question, Rprof() and summaryRprof() are nice tools to find slow parts of your program that might benefit from speeding up or moving to a C/C++ implementation. This probably applies more if you're doing simulation work or other compute- or data-intensive activities. The profr package can help visualize the results.
I'm on a bit of a learn-about-debugging kick, so another suggestion from another thread:
Set options(warn=2) to treat warnings like errors
You can also use options to drop you right into the heat of the action when an error or warning occurs, using your favorite debugging function of choice. For instance:
Set options(error=recover) to run recover() when an error occurs, as Shane noted (and as is documented in the R debugging guide). Any other handy function you would like to have run works here too.
And another two methods from one of @Shane's links:
Wrap an inner function call with try() to return more information on it.
For plyr's *ply functions, use .inform=TRUE as an option to the apply command
@JoshuaUlrich also pointed out a neat way of using the conditional abilities of the classic browser() command to turn debugging on and off:
Put inside the function you might want to debug browser(expr=isTRUE(getOption("myDebug")))
And set the global option by options(myDebug=TRUE)
You could even wrap the browser call: myBrowse <- function() browser(expr=isTRUE(getOption("myDebug"))) and then call it with myBrowse(), since it relies on a global option. Note that the browser then opens inside the wrapper's own frame.
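A minimal sketch of the option-gated browser() call. The function scale_values is made up for illustration, and the option is switched off here so the code runs without stopping:

```r
scale_values <- function(x) {
  # Opens the browser only after options(myDebug = TRUE) has been set.
  browser(expr = isTRUE(getOption("myDebug")))
  x / max(x)
}

old <- options(myDebug = FALSE)   # debugging switched off
scaled <- scale_values(c(1, 2, 4))
options(old)                      # restore the previous setting
print(scaled)
```

In an interactive session, running options(myDebug = TRUE) before the call would drop you into the browser at that line.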
Then there are the new functions available in R 2.10:
findLineNum() takes a source file name and line number and returns the function and environment. This seems to be helpful when you source() a .R file and it returns an error at line #n, but you need to know what function is located at line #n.
setBreakpoint() takes a source file name and line number and sets a breakpoint there
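A minimal sketch of findLineNum() on a throwaway script. The script content and the function name add_then_double are made up, and the script is sourced into its own environment so that the search stays small:

```r
# Write a small script to a temporary file.
script <- tempfile(fileext = ".R")
writeLines(c(
  "add_then_double <- function(x) {",
  "  y <- x + 1",
  "  y * 2",
  "}"
), script)

# Source it with source references kept, into a dedicated environment.
env <- new.env()
source(script, local = env, keep.source = TRUE)

# Ask which function contains line 3 of that file.
hits <- utils::findLineNum(script, 3, envir = env, lastenv = env)
print(hits)
```

setBreakpoint() takes the same file and line arguments and additionally installs a browser() call at that location via trace().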
The codetools package, and in particular its checkUsage function, can be very helpful in quickly picking up syntax and stylistic errors that a compiler would typically report (unused locals, undefined global functions and variables, partial argument matching, and so forth).
setBreakpoint() is a more user-friendly front-end to trace(). Details on the internals of how this works are available in a recent R Journal article.
If you are trying to debug someone else's package, once you have located the problem you can over-write their functions with fixInNamespace and assignInNamespace, but do not use this in production code.
None of this should preclude the tried-and-true standard R debugging tools, some of which are above and others of which are not. In particular, the post-mortem debugging tools are handy when you have a time-consuming bunch of code that you'd rather not re-run.
Finally, for tricky problems which don't seem to throw an error message, you can use options(error=dump.frames) as detailed in this question:
Error without an error being thrown
At some point, glm.fit is being called. That means one of the functions you call, or one of the functions called by those functions, is using either glm or glm.fit.
Also, as I mention in my comment above, that is a warning, not an error, which makes a big difference. You can't trigger any of R's debugging tools from a warning (with default options, before someone tells me I am wrong ;-)).
If we change the options to turn warnings into errors then we can start to use R's debugging tools. From ?options we have:
‘warn’: sets the handling of warning messages. If ‘warn’ is
negative all warnings are ignored. If ‘warn’ is zero (the
default) warnings are stored until the top-level function
returns. If fewer than 10 warnings were signalled they will
be printed otherwise a message saying how many (max 50) were
signalled. An object called ‘last.warning’ is created and
can be printed through the function ‘warnings’. If ‘warn’ is
one, warnings are printed as they occur. If ‘warn’ is two or
larger all warnings are turned into errors.
So if you run
options(warn = 2)
then run your code, R will throw an error. At which point, you could run
traceback()
to see the call stack. Here is an example.
> options(warn = 2)
> foo <- function(x) bar(x + 2)
> bar <- function(y) warning("don't want to use 'y'!")
> foo(1)
Error in bar(x + 2) : (converted from warning) don't want to use 'y'!
> traceback()
7: doWithOneRestart(return(expr), restart)
6: withOneRestart(expr, restarts[[1L]])
5: withRestarts({
.Internal(.signalCondition(simpleWarning(msg, call), msg,
call))
.Internal(.dfltWarn(msg, call))
}, muffleWarning = function() NULL)
4: .signalSimpleWarning("don't want to use 'y'!", quote(bar(x +
2)))
3: warning("don't want to use 'y'!")
2: bar(x + 2)
1: foo(1)
Here you can ignore the frames marked 4: and higher. We see that foo called bar and that bar generated the warning. That should show you which functions were calling glm.fit.
If you now want to debug this, we can turn to another option to tell R to enter the debugger when it encounters an error, and as we have made warnings errors we will get a debugger when the original warning is triggered. For that you should run:
options(error = recover)
Here is an example:
> options(error = recover)
> foo(1)
Error in bar(x + 2) : (converted from warning) don't want to use 'y'!
Enter a frame number, or 0 to exit
1: foo(1)
2: bar(x + 2)
3: warning("don't want to use 'y'!")
4: .signalSimpleWarning("don't want to use 'y'!", quote(bar(x + 2)))
5: withRestarts({
6: withOneRestart(expr, restarts[[1]])
7: doWithOneRestart(return(expr), restart)
Selection:
You can then step into any of those frames to see what was happening when the warning was thrown.
To reset the above options to their default, enter
options(error = NULL, warn = 0)
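If you would rather not change global options, a calling handler can record the call stack at the moment the warning is signalled. A minimal sketch reusing the foo/bar example from above; the variable names warning_calls and stack_text are only illustrative:

```r
foo <- function(x) bar(x + 2)
bar <- function(y) warning("don't want to use 'y'!")

warning_calls <- NULL
withCallingHandlers(
  foo(1),
  warning = function(w) {
    # Record the stack as it is while the warning is being signalled.
    warning_calls <<- sys.calls()
    # Continue without printing the warning.
    invokeRestart("muffleWarning")
  }
)

# One line of text per recorded call, outermost call first.
stack_text <- vapply(
  warning_calls,
  function(cl) paste(deparse(cl), collapse = " "),
  character(1)
)
print(stack_text)
```

The recorded calls include the handler machinery itself, so look for the first entries that belong to your own code.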
As for the specific warning you quote, it is highly likely that you need to allow more iterations in the code. Once you've found out what is calling glm.fit, work out how to pass it the control argument using glm.control - see ?glm.control.
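Passing a larger iteration limit could look like the following. A minimal sketch on the built-in mtcars data, which is only a stand-in for your own model; maxit = 100 is an arbitrary illustrative value:

```r
# Logistic regression with an explicit iteration limit for glm.fit.
fit <- glm(
  am ~ wt,
  data = mtcars,
  family = binomial(),
  control = glm.control(maxit = 100)
)

fit$converged   # did the fitting algorithm converge?
fit$iter        # number of iterations actually used
```

If the call to glm sits inside another function, that function has to forward the control argument for this to take effect.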
So browser(), traceback() and debug() walk into a bar, but trace() waits outside and keeps the motor running.
By inserting browser somewhere in your function, the execution will halt and wait for your input. You can move forward using n (or Enter), run the entire chunk (iteration) with c, finish the current loop/function with f, or quit with Q; see ?browser.
With debug, you get the same effect as with browser, but this stops the execution of a function at its beginning. Same shortcuts apply. This function will be in a "debug" mode until you turn it off using undebug (that is, after debug(foo), running the function foo will enter "debug" mode every time until you run undebug(foo)).
A more transient alternative is debugonce, which will remove the "debug" mode from the function after the next time it's evaluated.
traceback will give you the flow of execution of functions all the way up to where something went wrong (an actual error).
You can insert code bits (i.e. custom functions) into functions using trace, for example browser. This is useful for functions from packages when you're too lazy to get the nicely folded source code.
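The on/off behaviour of debug() and undebug() can be checked without entering the browser, because the flag only takes effect when the function is called. A minimal sketch; foo is a made-up function:

```r
foo <- function(x) x + 1

debug(foo)                          # every call to foo() would now open the browser
flag_after_debug <- isdebugged(foo)

undebug(foo)                        # back to normal execution
flag_after_undebug <- isdebugged(foo)

result <- foo(1)                    # runs straight through again
print(c(flag_after_debug, flag_after_undebug))
```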
My general strategy looks like:
Run traceback() to look for obvious issues
Set options(warn=2) to treat warnings like errors
Set options(error=recover) to step into the call stack on error
After going through all the steps suggested here I just learned that setting .verbose = TRUE in foreach() also gives me tons of useful information. In particular foreach(.verbose=TRUE) shows exactly where an error occurs inside the foreach loop, while traceback() does not look inside the foreach loop.
Mark Bravington's debugger, which is available as the package debug on CRAN, is very good and pretty straightforward.
library(debug);
mtrace(myfunction);
myfunction(a,b);
#... debugging, can query objects, step, skip, run, breakpoints etc..
qqq(); # quit the debugger only
mtrace.off(); # turn off debugging
The code pops up in a highlighted Tk window so you can see what's going on and, of course you can call another mtrace() while in a different function.
HTH
I like Gavin's answer: I did not know about options(error = recover). I also like to use the 'debug' package that gives a visual way to step through your code.
require(debug)
mtrace(foo)
foo(1)
At this point it opens up a separate debug window showing your function, with a yellow line showing where you are in the code. In the main window the code enters debug mode, and you can keep hitting enter to step through the code (and there are other commands as well), and examine variable values, etc. The yellow line in the debug window keeps moving to show where you are in the code. When done with debugging, you can turn off tracing with:
mtrace.off()
Based on the answer I received here, you should definitely check out the options(error=recover) setting. When this is set, upon encountering an error, you'll see text on the console similar to the following (traceback output):
> source(<my filename>)
Error in plot.window(...) : need finite 'xlim' values
In addition: Warning messages:
1: In xy.coords(x, y, xlabel, ylabel, log) : NAs introduced by coercion
2: In min(x) : no non-missing arguments to min; returning Inf
3: In max(x) : no non-missing arguments to max; returning -Inf
Enter a frame number, or 0 to exit
1: source(<my filename>)
2: eval.with.vis(ei, envir)
3: eval.with.vis(expr, envir, enclos)
4: LinearParamSearch(data = dataset, y = data.frame(LGD = dataset$LGD10), data.names = data
5: LinearParamSearch.R#66: plot(x = x, y = y.data, xlab = names(y), ylab = data.names[i])
6: LinearParamSearch.R#66: plot.default(x = x, y = y.data, xlab = names(y), ylab = data.nam
7: LinearParamSearch.R#66: localWindow(xlim, ylim, log, asp, ...)
8: LinearParamSearch.R#66: plot.window(...)
Selection:
At which point you can choose which "frame" to enter. When you make a selection, you'll be placed into browser() mode:
Selection: 4
Called from: stop(gettextf("replacement has %d rows, data has %d", N, n),
domain = NA)
Browse[1]>
And you can examine the environment as it was at the time of the error. When you're done, type c to return to the frame selection menu; from there, as it tells you, type 0 to exit.
I gave this answer to a more recent question, but am adding it here for completeness.
Personally I tend not to use functions to debug. I often find that this causes as much trouble as it solves. Also, coming from a Matlab background I like being able to do this in an integrated development environment (IDE) rather than doing this in the code. Using an IDE keeps your code clean and simple.
For R, I use an IDE called "RStudio" (http://www.rstudio.com), which is available for Windows, Mac, and Linux and is pretty easy to use.
Versions of RStudio since about October 2013 (0.98ish?) can add breakpoints in scripts and functions: just click on the left margin of the file to add a breakpoint. You can set a breakpoint and then step through from that point on. You also have access to all of the data in that environment, so you can try out commands.
See http://www.rstudio.com/ide/docs/debugging/overview for details. If you already have RStudio installed, you may need to upgrade - this is a relatively new (late 2013) feature.
You may also find other IDEs that have similar functionality.
Admittedly, if it's a built-in function you may have to resort to some of the suggestions made by other people in this discussion. But, if it's your own code that needs fixing, an IDE-based solution might be just what you need.
To debug Reference Class methods without an instance reference:
ClassName$trace(methodName, browser)
I am beginning to think that not printing the error line number (a most basic requirement) BY DEFAULT is some kind of joke in R/RStudio. The only reliable method I have found to find where an error occurred is to make the additional effort of calling traceback() and looking at the top line.
I'd like to test that a function returns the expected data.frame. The data.frame is too large to define in the R file (eg, using something like structure()). I'm doing something wrong with the environments when I try a simple retrieval from disk, like:
test_that("SO example for data.frame retreival", {
path_expected <- "./inst/test_data/project_longitudinal/expected/default.rds"
actual <- data.frame(a=1:5, b=6:10) #saveRDS(actual, file=path_expected)
expected <- readRDS(path_expected)
expect_equal(actual, expected, label="The returned data.frame should be correct")
})
The lines execute correctly when run in the console. But when I run devtools::test(), the following error occurs when the rds/data.frame is read from a file.
1. Error: All Records -Default ----------------------------------------------------------------
cannot open the connection
1: withCallingHandlers(eval(code, new_test_environment), error = capture_calls, message = function(c) invokeRestart("muffleMessage"),
warning = function(c) invokeRestart("muffleWarning"))
2: eval(code, new_test_environment)
3: eval(expr, envir, enclos)
4: readRDS(path_expected) at test-read_batch_longitudinal.R:59
5: gzfile(file, "rb")
To make this work, what adjustments are necessary to the environment? If there's not an easy way, what's a good way to test large data.frames?
I suggest you check out the excellent ensurer package. You can include these functions inside the function itself (rather than as part of the testthat test set).
It will throw an error if the dataframe (or whatever object you'd like to check) doesn't fulfill your requirements, and will just return the object if it passes your tests.
The difference with testthat is that ensurer is built to check your objects at runtime, which probably circumvents the entire environment problem you are facing, as the object is tested inside the function at runtime.
See the end of this vignette, to see how to test the dataframe against a template that you can make as detailed as you like. You'll also find many other tests you can run inside the function. It looks like this approach may be preferable over testthat in this case.
Based on the comment by @Gavin Simpson, the problem didn't involve environments, but instead the file path. Changing the snippet's second line worked.
path_qualified <- base::file.path(
  devtools::inst(name="REDCapR"),
  "test_data/project_longitudinal/expected/dummy.rds"
)
The file's location is found whether I'm debugging interactively, or testthat is running (and thus whether inst is in the path or not).
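A related way to build such a path is base R's system.file(), which looks files up inside an installed package. A minimal sketch that uses the stats package as a stand-in, since REDCapR may not be installed; the second file name is made up to show the not-found case:

```r
# A file that is known to exist in an installed package.
existing <- system.file("DESCRIPTION", package = "stats")

# A file that does not exist yields "" rather than an error.
absent <- system.file("test_data", "no_such_file.rds", package = "stats")

print(nzchar(existing))
print(nzchar(absent))
```

Passing mustWork = TRUE turns the empty-string result into an error, which tends to give a clearer message than a failing readRDS().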
I am trying to understand the way the YourCast R package works and make it work with my data.
For example, if a function produces errors, I
get the source code of that function using YourCast:::bad.fn
add output of critical values at critical stages
use reassignInPackage(name="original.fn", pkgName="YourCast", value="my.fn")
Once I find the cause of the error, I fix it in the function and reassign it in the package.
However, for some strange reason this does not work for non-hidden functions.
For example:
install.packages("YourCast")
library(YourCast)
YourCast:::check.depvar
This will print the hidden function check.depvar. One line if (all(ix == 1:3)) will produce an error message if any of the x is missing.
Thus, I change the whole function to the following and replace the original formula:
mzuba.check.depvar <- function(formula)
{
return (grepl("log[(]",as.character(formula)[2]))
}
reassignInPackage("check.depvar",
pkgName="YourCast",
mzuba.check.depvar)
rm(mzuba.check.depvar)
Now YourCast:::check.depvar will print my version of that function, and everything is fine.
However
YourCast::yourcast or YourCast:::yourcast or simply yourcast will print the non-hidden function yourcast. Suppose I want to change that function as well.
reassignInPackage(name="yourcast",
pkgName="YourCast",
value=test)
Now, YourCast::yourcast and YourCast:::yourcast will print the new, modified version but yourcast still gives the old version!
That might not be a problem if I could simply call YourCast::yourcast instead of yourcast, but that produces some kind of error that I can't trace back, because suddenly RStudio no longer prints error messages at all, although it still evaluates valid input:
> Uagh! do something!
> 1 + 1
[1] 2
> Why no error msg?
>
Restarting the R-session will solve the error-msg problem, though.
So my question is: How do I reassign non-hidden functions in packages?
Furthermore (this would facilitate testing a lot), is there a way to make all hidden functions available without using the ::: operator? I.e., how do I export all functions from a package?
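For the second part of the question, the non-exported functions of a namespace can at least be listed and fetched programmatically. A minimal sketch that uses the stats package as a stand-in for YourCast. As far as I understand, an attached package keeps its own bindings of the exported functions, separate from the namespace, which may be why a plain yourcast still shows the old version after the reassignment:

```r
ns <- asNamespace("stats")

# Everything defined in the namespace versus what it exports.
all_names <- ls(ns, all.names = TRUE)
exported <- getNamespaceExports("stats")
hidden <- setdiff(all_names, exported)

# Keep only the hidden objects that are functions.
hidden_functions <- Filter(
  function(nm) is.function(get(nm, envir = ns)),
  hidden
)
head(hidden_functions)

# Fetch one of them without typing `:::`.
fn <- utils::getFromNamespace(hidden_functions[1], "stats")
```

During development, devtools::load_all() is another option, since it makes internal functions available in the session.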
I have a function that will set up some folders for the rest of my workflow
library(testthat)
analysisFolderCreation<-function(projectTitle=NULL,Dated=FALSE,destPath=getwd(),SETWD=FALSE){
  stopifnot(length(projectTitle)>0,is.character(projectTitle),is.logical(Dated),is.logical(SETWD))
  # scrub any characters that might cause trouble
  projectTitle<-gsub("[[:space:]]|[[:punct:]]","",projectTitle)
  rootFolder<-file.path(destPath,projectTitle)
  # file.path() returns character(0) when an argument is NULL, so branch explicitly
  executionFolder<-if (Dated) file.path(rootFolder,format(Sys.Date(),"%Y%m%d")) else rootFolder
  subfolders<-c("rawdata","intermediates","reuse","log","scripts","results")
  dir.create(path=executionFolder, recursive=TRUE)
  sapply(file.path(executionFolder,subfolders),dir.create)
  # the argument is named SETWD, not Setwd
  if(SETWD) setwd(executionFolder)
}
I am trying to unit test it and my error tests work fine:
test_that("analysisFolderCreation: Given incorrect inputs, error is thrown",
{
# scenario: No arguments provided
expect_that(analysisFolderCreation(),throws_error())
})
But my tests for success do not...
test_that("analysisFolderCreation: Given correct inputs the function performs correctly",
{
  # scenario: One argument provided - new project name
  analysisFolderCreation(projectTitle="unittest")
  expect_that(file.exists(file.path(getwd(),"unittest","log")),
              is_true())
})
Errors with
Error: analysisFolderCreation: Given correct inputs the function performs correctly ---------------------------------------------------------------
could not find function "analysisFolderCreation"
As I am checking for a folder's existence, I'm unsure how to go about testing this in an expectation format that includes the function analysisFolderCreation actually inside it.
I am running in dev_mode() and executing the test file explicitly with test_file()
Is anyone able to provide a way of rewriting the test to work, or provide an existence checking expectation?
The problem appeared to be the use of test_file(). Using test() over the whole suite of unit tests does not require the function to already exist in the workspace, unlike test_file().
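As for the existence check itself, running it in a temporary directory keeps the test from leaving folders in the working directory. A minimal base-R sketch; create_project_folders is a simplified, made-up stand-in for analysisFolderCreation(), and inside a testthat test the final check could be written as expect_true(all(created)):

```r
# Simplified stand-in: create the standard subfolders under a project root.
create_project_folders <- function(project_title, dest_path) {
  root <- file.path(dest_path, project_title)
  subfolders <- c("rawdata", "intermediates", "reuse", "log", "scripts", "results")
  for (path in file.path(root, subfolders)) {
    dir.create(path, recursive = TRUE, showWarnings = FALSE)
  }
  invisible(root)
}

# Work in a throwaway directory instead of getwd().
sandbox <- tempfile("unittest_")
dir.create(sandbox)

root <- create_project_folders("unittest", dest_path = sandbox)
created <- dir.exists(file.path(root, c("rawdata", "log", "results")))
print(created)

# Clean up so repeated runs start from scratch.
unlink(sandbox, recursive = TRUE)
```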