Debugging: search packages for a specific line of code in R

I've been investigating this issue to find out which package throws the following error:
Error in if (dif < 0 && any(overlap)) { : missing value where TRUE/FALSE needed
Long story short, I was not able to locate the bug via traceback(), debug(), or any other debugging tool, so I downloaded the sources of the package itself and its dependencies, unpacked them, and ran
grep any\(overlap\) */R/*
which resulted in
directlabels/R/utility.function.R: if(dif<0&&any(overlap)){
and that's where the error is thrown, so the task is complete.
I have two major questions here, actually.
Is there a simpler way to get the same result? Imagine I had ten dependencies; downloading all those sources would have been a pain. I'd love to have something like
search_packages_source <- function(list_of_packages, string)
Is anything like that available? (A rough sketch of such a helper is included after the edit below.)
Why do traceback() and debug() fail to point at the function that throws the error? traceback() shows a trace for the plot rendering only, and with debug() I had to guess whether to step into or over each call, so I gave up quickly (see the edit below for more detail). I have also tried RStudio's debugging tools; none of them helped to locate the error.
Debugging in R is a rare task for me, so I'm probably missing something very simple. Would be delighted to hear your tips & tricks!
EDIT: here's a reproducible example and a bit of clarification:
library(ggplot2)
library(directlabels)
p1 <- ggplot(mtcars, aes(disp, mpg, col=cyl)) + geom_line()
p2 <- ggplot(subset(mtcars, cyl==4), aes(disp, mpg, col=cyl)) + geom_line()
# OK
direct.label(p1, "last.bumpup")
# error
direct.label(p2, "last.bumpup")
Now, if I run traceback(), I get a long output with no indication of which utility function failed. With debug() the output is even longer and also leads nowhere. So the problem is not only that my debugging technique is lacking, but also that the error is buried in the cross-dependencies between packages.
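For what it's worth, a rough sketch of such a helper is below; search_packages_source is the signature imagined above, not an existing function, and the sketch assumes source packages are available from the configured repository. It downloads the source tarballs, unpacks them into a temporary directory, and greps the unpacked .R files:
search_packages_source <- function(list_of_packages, string) {
  src.dir <- tempfile("pkg-src-")
  dir.create(src.dir)
  # download.packages() returns a matrix; column 2 holds the tarball paths
  info <- download.packages(list_of_packages, destdir = src.dir, type = "source")
  for (tarball in info[, 2]) untar(tarball, exdir = src.dir)
  r.files <- list.files(src.dir, pattern = "\\.R$",
                        recursive = TRUE, full.names = TRUE)
  # keep files whose contents contain the literal (fixed = TRUE) string
  r.files[vapply(r.files, function(f)
    any(grepl(string, readLines(f, warn = FALSE), fixed = TRUE)),
    logical(1))]
}
search_packages_source("directlabels", "any(overlap)")
As for the second question, options(error = recover) is often more useful than traceback() in this situation: it drops you into an interactive browser at any frame of the failing call stack, including frames inside dependency packages.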

Related

ggplot2 error: Error in default + theme : non-numeric argument to binary operator

I have been getting the error Error in default + theme : non-numeric argument to binary operator. I have been using and teaching R for a long time, but I can't pin this problem down. I have included a reproducible example that fails this way below:
library(tibble)
library(ggplot2)
brains <- as_tibble(brains)
brains <- brains[1:10, ]
brains
ggplot(brains, aes(x = BodyWt, y = BrainWt)) +
geom_point()
The error occurs when executing the ggplot() statement.
My hardware is an HP Laptop 15-ef0xxx. I am running Windows 10 Home version 2004, the RStudio community edition "Water Lily", and R x64 4.0.2.
I know this is a simple error and it is driving me crazy.
So I finally solved this problem. On the GitHub issue I had opened, Hiroaki commented that "One possibility is that you might set a invalid default theme in your .Rprofile, but I'm not sure..." (see the link to the issue below).
I'm not sure if your R file is part of a project but mine is.
So I went back and deleted the theme_set() line in my R file, double-checked all my project options, and selected the options "Disable .Rprofile execution on session start/resume" and "Quit child processes on exit". Then I restarted the R session and now everything works, including in the default R console.
I'm not sure if all those steps are necessary but that seemed to do the trick for me! Hope it helps.
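If you suspect the same cause, a quick sanity check along these lines may help (a minimal sketch; theme_get(), theme_set() and theme_grey() are real ggplot2 functions, the broken-default scenario is the assumption):
library(ggplot2)
# the default theme should be a ggplot2 theme object; anything else set
# from .Rprofile can break the internal default + theme addition
class(theme_get())      # expect c("theme", "gg")
theme_set(theme_grey()) # restore the stock default for this session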
I thought this was an RStudio issue, but it seems it's possibly a ggplot2 problem. I have verified using two different datasets that the same error comes up when I try using ggplot2 in RStudio or using the default R console. I get the exact same error with code that's been working fine but now suddenly won't. I have opened an issue on GitHub (ggplot2) with a reprex. Might be worth checking there: https://github.com/tidyverse/ggplot2/issues/4177
I know this is not an answer per se, but I don't have enough reputation points to add a comment to the previous answer, and I thought linking to the issue on GitHub might help.
I think you have numbers quoted somewhere and you are trying to perform a mathematical operation on character values. Consider:
x <- c("5", "6")
y <- x/1
# Error in x/1 : non-numeric argument to binary operator
Now try converting x to numeric and performing the same operation:
y <- as.numeric(x)/1
y
# [1] 5 6
So, you need to use as.numeric on your variable.
The following should resolve the issue:
ggplot(brains, aes(x = as.numeric(BodyWt), y = as.numeric(BrainWt))) +
  geom_point()

Error in file(con, "w") : cannot open the connection [Using RStudio to plot interactive bar graphs using rCharts, knitr]

I am getting an error when I try to run the code below in RStudio (R 3.3.2) on a Mac (macOS Sierra):
devtools::install_github('ramnathv/rCharts')
install.packages("knitr")
require(rCharts)
require(knitr)
haireye <- as.data.frame(HairEyeColor)
n1 <- nPlot(Freq ~ Hair, group = 'Eye', type = 'multiBarChart',
data = subset(haireye, Sex == 'Male')
)
n1$save('fig/n1.html', cdn = TRUE)
cat('<iframe src="fig/n1.html" width="100%" height="600"></iframe>')
Please see the output below:
Error in file(con, "w") : cannot open the connection
In addition: Warning message: In file(con, "w") : cannot open file 'fig/n1.html': No such file or directory
But I am able to generate the required bar graph in the viewer when I use:
n1$show(cdn = TRUE)
in lieu of n1$save('fig/n1.html', cdn = TRUE)
To take care of write-permission issues (if any), I also tried including the line below, altering the working-directory path wherever necessary.
knitr::knit2html('Users/documents/n1.html')
But it did not help. I see the n1.html file created but it only opens an empty browser.
Any help to resolve this is appreciated.
Best,
S
A lot of times we face this error due to caching in RStudio, and in that case actual code errors don't show up. Restart RStudio and this error will be gone; the actual code errors will then show.
You have two separate problems.
The connection error appears because the fig/ folder does not exist. Create the folder and the save command will work. R has functions to check the existence of a directory and create a new one if you would like to do it in your code; see the sketch below.
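For instance, a minimal guard in base R (dir.exists() and dir.create() are standard functions; the fig path comes from the question):
# create the fig/ output folder relative to the working directory if missing
if (!dir.exists("fig")) dir.create("fig", recursive = TRUE)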
The second problem comes from the way you save: you should use n1$save('fig/n1.html', standalone = TRUE).
As a side note, rCharts is not currently developed or maintained at all, so I would recommend using another library for your charts; in my opinion Plotly is quite nice. rCharts brought the NVD3 project to R, and its chart style is, in my opinion, really nice. However, as far as I know both projects are stopped, so I would look for a library that is still alive.
I have fixed this problem with good old rm(list=ls()). I have fallen into sequences where an error stops execution of my script; I fix the error, and then the script still won't run. This is likely due to lazy evaluation, but it is a near-impossible problem to diagnose, so the solution at the top works almost all the time.

How do I close unused connections after read_html in R

I am quite new to R and am trying to access some information on the internet, but am having problems with connections that don't seem to be closing. I would really appreciate it if someone here could give me some advice...
Originally I wanted to use the WebChem package, which theoretically delivers everything I want, but when some of the output data is missing from the webpage, WebChem doesn't return any data from that page. To get around this, I took most of the code from the package but altered it slightly to fit my needs. This worked fine for about the first 150 usages, but now, although I have changed nothing, when I use the command read_html, I get the warning message "closing unused connection 4 (http:....". Although this is only a warning, read_html doesn't return anything after this warning is generated.
I have written a simplified version of the code, given below, which has the same problem.
Closing R completely (or even rebooting my PC) doesn't seem to make a difference: the warning message now appears the second time I use the code. I can run the queries one at a time outside the loop with no problems, but as soon as I use the loop, the error occurs again on the second iteration.
I have tried to vectorise the code, and again it returned the same error message.
I tried showConnections(all=TRUE), but only got connections 0-2 for stdin, stdout, stderr.
I have tried searching for ways to close the HTML connection, but I can't define the URL as a connection, and close(qurl) and close(ttt) also don't work (they return the errors no applicable method for 'close' applied to an object of class "character" and no applicable method for 'close' applied to an object of class "c('xml_document', 'xml_node')", respectively).
Does anybody know a way to close these connections so that they don't break my routine? Any suggestions would be very welcome. Thanks!
PS: I am using R version 3.3.0 with RStudio Version 0.99.902.
library(xml2)  # provides read_html(), xml_find_all() and xml_text()

CasNrs <- c("630-08-0","463-49-0","194-59-2","86-74-8","148-79-8")
tit <- character()
for (i in seq_along(CasNrs)){
  CurrCasNr <- as.character(CasNrs[i])
  baseurl <- 'http://chem.sis.nlm.nih.gov/chemidplus/rn/'
  qurl <- paste0(baseurl, CurrCasNr, '?DT_START_ROW=0&DT_ROWS_PER_PAGE=50')
  ttt <- try(read_html(qurl), silent = TRUE)
  tit[i] <- xml_text(xml_find_all(ttt, "//head/title"))
}
After researching the topic I came up with the following solution:
library(xml2)
con <- url("https://website_example.com", "rb")  # open the connection explicitly
html <- read_html(con)
close(con)  # close it yourself once the document has been parsed
# + whatever you want to do with the html, since it's already saved!
I haven't found a good answer for this problem. The best work-around that I came up with is to include the function below, with Secs = 3 or 4. I still don't know why the problem occurs or how to stop it without building in a large delay.
CatchupPause <- function(Secs){
  Sys.sleep(Secs)        # pause to let the connection finish its work
  closeAllConnections()  # then close every user connection
  gc()                   # and free the underlying resources
}
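For example, calling CatchupPause(3) after each read_html() request inserts a three-second pause and then clears whatever connections the request left open.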
I found this post as I was running into the same problems when I tried to scrape multiple datasets in the same script. The script would get progressively slower, and I felt it was due to the connections. Note that closeAllConnections() takes no arguments, so a plain call after each scrape is all that is needed:
for (i in seq_along(df$URLs)){
  # ... scrape df$URLs[i] here ...
  closeAllConnections()
}

Problems building an R package (NAs introduced by coercion)

I have written the Boggler package, which includes a Play.Boggle() function that calls, on line 87, a progress-bar script using shell:
shell(cmd = sprintf('Rscript.exe R/progress_bar.R "%i"', time.limit + 1), wait=FALSE)
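For illustration, with time.limit set to 30 (an assumed value for this example) the sprintf() call expands like this:
sprintf('Rscript.exe R/progress_bar.R "%i"', 30 + 1)
# [1] "Rscript.exe R/progress_bar.R \"31\""
so the script receives the quoted number as a trailing command-line argument.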
Everything works fine when sourcing the files individually and then calling the main Play.Boggle() function, but when I try to check/build the package (under Win7-64 using RStudio), I get a failure message -- here's what the 00install.out reports:
** preparing package for lazy loading
Warning in eval(expr, envir, enclos) : NAs introduced by coercion
Error in time.limit:0 : NA/NaN argument
To make sure the argument "%i" (time.limit + 1) was correctly passed to progress_bar.R, I added a cat(time.limit) to the script (commenting the rest out so the package would build without errors) and redirected its output to a log file like this:
'Rscript.exe R/progress_bar.R "%i" > out.log'
Conclusion: the time limit is indeed passed along as expected. So I can't figure out why I get this "NA/NaN argument" error message. It must have something to do with lazy loading, a concept I haven't fully got my head around yet.
So my question is: what can I do to successfully check/build this package with full functionality (including progress_bar.R)?
Note: on GitHub, the progress_bar.R script is there, but all its content is commented out so that the package can be installed successfully. The shell(...) function call is still active, doing nothing but executing an empty script.
So the problem arises when trying to build or check, in which case all R scripts are executed, as pointed out by Roland. A simple workaround allows the package to check/build without any problems: just add the following lines to progress_bar.R after it tries to retrieve the command args (lines 10-11):
if(time.limit %in% c(NA, NaN))  # %in% matches NA and NaN, where == would not
  time.limit <- 10              # or any minimal number
There are surely other ways to go about this, but this being a game programmed for fun, I'll happily go with that patch. Hopefully this can be of help to someone down the road, and I won't have wasted 50 precious rep points in vain on that bounty! :D
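For context, here is a hedged sketch of how the top of progress_bar.R could parse its argument defensively (the variable names are assumptions; only the commandArgs() pattern is standard R):
args <- commandArgs(trailingOnly = TRUE)
# when the file is sourced during R CMD check/build there is no usable
# trailing argument, so the coercion yields NA instead of a number
time.limit <- suppressWarnings(as.integer(args[1]))
if (is.na(time.limit))  # is.na() is TRUE for both NA and NaN
  time.limit <- 10      # minimal fallback so sourcing the file succeeds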

Disable storing tracebacks on error

Is there a way to disable the storing of tracebacks on error in R temporarily (for a session)?
The reason I ask is that ggplot2 has a long-standing problem that they've been unable to fix: somehow the entire dataset gets stored in the traceback, and if you work with very large datasets, a mistyped variable name can leave you with a 10-minute hang.
Especially when I make complex plots for very large data, this is crippling. Usually these are all small typos; I don't ever need tracebacks, just the error message would be fine.
I tried
options(error = expression(NULL))
but apparently that handler is called after the traceback is stored (the hang persists).
Reproducible example:
library(ggplot2)
data(diamonds)
diamonds <- diamonds[sample(x = nrow(diamonds), size = 200000, replace = TRUE), ]
qplot(data=diamonds, wrong, var)
One obvious thing that I hadn't thought about is to wrap the call in tryCatch, like this:
tryCatch({
  print(qplot(data = diamonds, wrong, var))
}, error = function(e) warning(e))
It's important to print your plot inside the tryCatch; otherwise the error occurs only when the returned plot object is automatically printed.
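Wrapped up as a small helper (safe_print is a hypothetical name for this sketch), the same idea turns any plotting error into a plain warning, so no traceback and no copy of the data are kept:
safe_print <- function(p) {
  # print inside tryCatch so the error is caught before auto-printing
  tryCatch(print(p),
           error = function(e) warning(conditionMessage(e), call. = FALSE))
}
safe_print(qplot(data = diamonds, wrong, var))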
I would still be interested in the reverse equivalent of options(warn = 2): instead of turning warnings into errors so they can be traced, it would turn errors into warnings so they don't generate a huge traceback.
