How to pass errors back to the rmarkdown::render function?

I am trying to render an R Markdown file from an R script (code for both files below). What I would like to do is pass information back to the calling script depending on where the error occurred, e.g. that the file can't read the input dataset. I want this because I plan to run the script as a cron job and would like it to send me an email telling me why I might need to re-run the code, or what the error is.
I have read some similar Stack Overflow questions but, after some testing, couldn't see how they did what I wanted.
The R script (I have attempted something like the following):
rm(list = ls())
setwd("C:/Users/joel.kandiah/Downloads")
a <- print(try(rmarkdown::render("test.Rmd", quiet = TRUE), silent = TRUE))
#> [1] "C:/Users/joel.kandiah/Downloads/test.nb.html"
cat(eval(a))
#> C:/Users/joel.kandiah/Downloads/test.nb.html
The Rmarkdown document:
if(!exists("data_raw")) simpleError("Dataset has not been loaded")
#> <simpleError: Dataset has not been loaded>
What I would like is to see the simple error as an object in the R script. Something akin to an exit code might also be acceptable.

A possible approach is wrapping tryCatch around render in your R script.
R Script
# Render the markdown document ####
tryCatch(
  expr = rmarkdown::render(
    "markdown.Rmd",
    clean = TRUE
  ),
  error = function(cond) {
    message(cond)
  },
  warning = function(cond) {
    message(cond)
  }
)
R Markdown
# Force an error;
stop("You do not have permission to render. Admin password is needed.")
This will return the same error message to your script.
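To get the error back as an object (for the cron/email use case), have tryCatch return the condition instead of just printing it. Note also that in the Rmd above, simpleError() only constructs a condition object without signalling it; call stop() inside the document so the calling script actually has something to catch. A minimal sketch, where send_email() is a hypothetical helper you would supply yourself:
# In test.Rmd: actually signal the error
if (!exists("data_raw")) stop("Dataset has not been loaded")

# In the R script: capture either the output path or the error object
result <- tryCatch(
  rmarkdown::render("test.Rmd", quiet = TRUE),
  error = function(cond) cond
)

if (inherits(result, "error")) {
  # e.g. from the cron job, mail yourself the reason
  send_email(subject = "Render failed",           # hypothetical helper
             body = conditionMessage(result))
} else {
  message("Rendered: ", result)  # result is the path to the output file
}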

Related

How to call a parallelized script from command prompt?

I'm running into this issue and, for the life of me, I can't figure out how to solve it.
Quick summary before example:
I have several hundred data sets from which I want to create reports every day. To do this efficiently, I parallelized the process with doParallel. From within RStudio the process works fine, but when I try to make it automatic via Task Scheduler on Windows, I can't seem to get it to work.
The process within RStudio is:
I call a script that sources all of my other scripts; each individual script has a header section that performs the appropriate package imports. For instance, it looks like:
get_files <- function(){
  get_files.create_path() -> path
  for (file in path) {
    if (!(file.info(paste0(path, file))[['isdir']])) {
      source(paste0(path, file))
    }
  }
}

get_files.create_path <- function(){
  return(<path to directory>)
}

# self call
get_files()
This is simply run with "Source on Save" and brings everything I need into the .GlobalEnv.
From there, I can simply type parallel_report(), which calls a script that sources another script housing the parallelization of the report generation. There was an issue a while back with calling the parallelization directly (I wonder if this is related?), so I had to make the doParallel script a non-function script; it therefore couldn't be brought in with the get_files script, which would have started the report generation every time everything was sourced. Thus I had to put it in its own script, saved elsewhere, to be called when necessary. The parallel_report() function is simply:
parallel_report <- function(){
  source(<path to script>)
}
Then the script that is sourced is the real parallelization script, and would look something like:
doParallel::registerDoParallel(cl = (parallel::detectCores() - 1))

foreach(name = report.list$names,
        .packages = c('tidyverse', 'knitr', 'lubridate', 'stringr', 'rmarkdown'),
        .export = c('generate_report'),
        .errorhandling = 'remove') %dopar% {
  tryCatch(expr = {
    generate_report(name)
  }, error = function(e){
    error_handler(error = e, caller = paste0("generate report for ", name, " from parallel"), line = 28)
  })
}

doParallel::stopImplicitCluster()
The generate_report function is simply an .Rmd and render() caller:
generate_report <- function(<arguments>){
  # stuff
  generate_report.render(<arguments>)
  # stuff
}

generate_report.render <- function(<arguments>){
  rmarkdown::render(
    paste0(data.information@location, 'report_generator.Rmd'),
    params = list(
      name = name,
      date = date,
      thoughts = thoughts,
      auto = auto),
    output_file = paste0(str_to_upper(stock), '_report_', str_remove_all(date, '-'))
  )
}
So to recap, in RStudio I simply do the following:
1 - "Source on Save" the script to bring everything in
2 - type parallel_report()
2.a - this directly calls the doParallel-ization of generate_report
2.b - generate_report calls an .Rmd file that houses the required function calls to produce the reports
And the process starts and completes without a hitch.
To make this automatic via the Task Scheduler, I made a script that the Task Scheduler can call, named automatic_caller:
source(<path to the get_files script>) # this brings in all the scripts and data into the global
                                       # environment, just as if it were being done manually

tryCatch(
  expr = {
    parallel_report()
  }, error = function(e){
    error_handler(error = e, caller = "parallel_report from automatic_caller", line = 39)
  })
The error_handler function is just an in-house script used to log errors throughout.
Then, in the Task Scheduler's task, I have Rscript.exe called with automatic_caller as its argument. Everything within automatic_caller works except the report generation.
The process completes almost automatically, and the only output I get is an error:
"pandoc version 1.12.3 or higher is required and was not found (see the help page ?rmarkdown::pandoc_available)."
But rmarkdown is in the .packages argument of the foreach call, it is loaded in the scripts that use it explicitly, and in generate_report.render it is called directly via rmarkdown::render().
So - I am at a complete loss.
Thoughts and suggestions would be completely appreciated.
So pandoc is apparently an executable that converts files from one format to another. RStudio comes with its own pandoc executable, so when running the scripts from RStudio, R knew where to look when pandoc was required.
From the command prompt, the system did not know to look inside RStudio's installation, so simply downloading pandoc as a standalone executable gives the system the proper pointer.
Downloaded pandoc and everything works fine.
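An alternative to installing pandoc separately is to point rmarkdown at the copy bundled with RStudio before calling render(). A minimal sketch, assuming a typical Windows install path for RStudio's pandoc (adjust it to your machine):
# rmarkdown consults the RSTUDIO_PANDOC environment variable when
# looking for pandoc, so set it before any render() call.
Sys.setenv(RSTUDIO_PANDOC = "C:/Program Files/RStudio/bin/pandoc")

rmarkdown::pandoc_available()  # should now return TRUE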

Calling stop() within a function causes R CMD check to throw an error

I am attempting to call stop() from within an internal package function (stop_quietly()), which should break out of the function and return to the top level. This works, except that R CMD check treats it as an error because I am forcing a stop.
How do I get around R CMD check interpreting this as an error? The function needs to stop, since it requires user confirmation before it creates a file directory tree at a given location. The code currently produces a message and stops the function.
tryCatch({
  path = normalizePath(path = where, winslash = "\\", mustWork = TRUE)
  message(paste0("This will create research directories in the following directory: \n", path))
  confirm = readline(prompt = "Please confirm [y/n]:")
  if (tolower(stringr::str_trim(confirm)) %in% c("y", "yes", "yes.", "yes!", "yes?")) {
    .....
    dir.create(path, ... [directories])
    .....
    message("There, I did some work, now you do some work.")
  } else {
    message("Okay, fine then. Don't do your research. See if I care.")
    stop_quietly()
  }
}, error = function(e){
  message("This path does not work, please enter an appropriate path \n or set the working directory with setwd() and null the where parameter.")
})
stop_quietly is an exit function I took from this post, with the modification of error = NULL, which stops R from running the error handler (e.g. dropping into the browser). I do not want the function to terminate into a browser; I just want it to quit without throwing an error in R CMD check.
stop_quietly <- function() {
  opt <- options(show.error.messages = FALSE, error = NULL)
  on.exit(options(opt))
  stop()
}
Here is the relevant part of the error R CMD check produces:
-- R CMD check results ------------------------------------------------ ResearchDirectoR 1.0.0 ----
Duration: 12.6s
> checking examples ... ERROR
Running examples in 'ResearchDirectoR-Ex.R' failed
The error most likely occurred in:
> base::assign(".ptime", proc.time(), pos = "CheckExEnv")
> ### Name: create_directories
> ### Title: Creates research directories
> ### Aliases: create_directories
>
> ### ** Examples
>
> create_directories()
This will create research directories in your current working directory:
C:/Users/Karnner/AppData/Local/Temp/RtmpUfqXvY/ResearchDirectoR.Rcheck
Please confirm [y/n]:
Okay, fine then. Don't do your research. See if I care.
Execution halted
Since your function has global side effects, I think check isn't going to like it. It would be different if you required the user to put tryCatch at the top level, and then let it catch the error. But think about this scenario: a user defines f() and calls it:
f <- function() {
  call_your_function()
  do_something_essential()
}
f()
If your function silently caused it to skip the second line of f(), it could cause a lot of trouble for the user.
What you could do is tell the user to wrap the call to your function in tryCatch(), and have it catch the error:
f <- function() {
  tryCatch(call_your_function(), error = function(e) ...)
  do_something_essential()
}
f()
This way the user will know that your function failed, and can decide whether or not to continue.
From discussion in the comments and your edit to the question, it seems like your function is only intended to be used interactively, so the above scenario isn't an issue. In that case, you can avoid the R CMD check problems by skipping the example unless it is being run interactively. This is fairly easy: in the help page for a function like create_directories(), set up your example as
if (interactive()) {
  create_directories()
  # other stuff if you want
}
The checks are run with interactive() returning FALSE, so this will stop the error from ever happening in the check. You could also use tryCatch within create_directories() to catch the error coming up from below if that makes more sense in your package.
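A minimal sketch of that last idea, catching the quiet stop inside create_directories() itself so a declined prompt never surfaces as an error (the body here is illustrative, not the package's real code):
create_directories <- function(where = getwd()) {
  tryCatch({
    confirm <- readline(prompt = "Please confirm [y/n]: ")
    if (!tolower(trimws(confirm)) %in% c("y", "yes")) stop_quietly()
    dir.create(file.path(where, "research"), recursive = TRUE)  # illustrative directory
  },
  error = function(e) {
    # Swallow the quiet stop; callers (and check examples) see no error.
    invisible(NULL)
  })
}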

Is there a function in R to check if there is an error in an R script or in a log?

I'm trying to create an if statement that checks whether there are any errors in my R script (or displayed on the console) or in log files, putting "error" in a variable if there are and "no error" in the same variable if there aren't.
I looked at is.error(), but I want to check whether an error is shown on the console or in a log file.
There is no one-stop solution to the best of my knowledge. There are several things you can try:
1) Incorporate your script into your code and use tryCatch or try to catch any errors. More information on error catching and debugging in R can be found here.
2) Execute your script in the system shell via the system command and inspect the output caught by setting intern = TRUE (see the sketch below).
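A minimal sketch of the second option, using system2() (which captures stderr portably; system(..., intern = TRUE) works similarly but only captures stdout), assuming the script lives at script.R and Rscript is on your PATH:
# stdout/stderr are returned as a character vector; a non-zero exit
# code is attached as a "status" attribute, which is absent on success.
out <- system2("Rscript", "script.R", stdout = TRUE, stderr = TRUE)

status <- attr(out, "status")
result <- if (is.null(status)) "no error" else "error"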
You can also source the script in a new environment:
testscript <- function(scriptpath) {
  tryCatch({
    # Test whether the script runs without error
    source(scriptpath, local = new.env())
    message("Script OK")
  },
  error = function(cond){
    message('Script not OK')
    message(cond)
  })
}
For example, with this content in script.R:
x <- 1
y <- 2
x + z
testscript('script.R')
Script not OK
object 'z' not found

How can knitr know whether the R code evaluation has error?

knitr always evaluates the R code before formatting the output, so I'm just wondering how I can know whether the R code evaluation had an error. Thanks.
Basically it boils down to three lines of code in the evaluate package. The key is withCallingHandlers(), which can be used to capture errors, messages, and warnings, etc. A minimal example:
withCallingHandlers(1 + 'a', error = function(e) {
  cat('An error occurred! The error object is:\n')
  str(e)
})
If you don't want the error to halt R, you can wrap the code in try().
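Combining the two: try() keeps the error from halting R, while the calling handler still runs for its side effect. A small sketch:
res <- try(
  withCallingHandlers(1 + 'a', error = function(e) {
    cat('An error occurred! The error object is:\n')
    str(e)
  }),
  silent = TRUE  # suppress try()'s own error printing
)
inherits(res, "try-error")  # TRUE, so downstream code can branch on it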

R Markdown error in code when knit to HTML

I am trying to run code chunks in my R Markdown document. I have an R script that runs all the code I need without any issues. But when I copy and paste the code into the markdown document, the code runs within the chunk, yet fails when I try to knit to an output document (HTML/PDF).
I had to create a safe.ifelse function to prevent R from converting my dates to a numeric format, as discussed here.
The error appears to be with the code:
safe.ifelse = function(cond, yes, no){
  structure(ifelse(cond, yes, no), class = class(yes))
}
The error message I get is:
Line 121 Error in structure(ifelse(cond, yes, no), class = class(yes)) :
  could not find function "days"
Calls: ... transform.data.frame -> eval -> eval -> safe.ifelse -> structure
Execution halted
The line of code following my safe.ifelse function is
seminoma1 = transform(seminoma1,
                      recur.date = safe.ifelse(salvage.tx == "Yes",
                                               date.diagnosis + days(pmax(time.rad, time.chemo, na.rm = TRUE)),
                                               NA))
Any help would be appreciated. Thanks.
I'm still too new to comment, but the only time I get an error like that is when I forget to define a function/variable or forget to load a package.
Since days() comes from lubridate, not base R, and knitting runs in a fresh R session where packages attached in your console are not available, I think you need to add:
```{r echo = FALSE}
library("lubridate")
```
