rmarkdown::render() output to stdout

When I render() an *.Rmd file locally in RStudio, the output from the render() function is displayed in the console:
Basic .Rmd file:
---
title: "test"
output: html_document
---
```{r setup, include=FALSE}
sink("./output.txt", type = 'output')
knitr::opts_chunk$set(echo = TRUE)
```
## Summary
```{r cars}
summary(cars)
```
## Error
```{r}
dbGetQuery()
```
To build:
library(rmarkdown)
render('./test.rmd')
Output:
This is great when I'm creating reports locally: I can see the progress and any errors thrown. I need to monitor this output on stdout (or stderr), but I can't sink() it to that location because knitr uses capture.output(), which itself uses sink() (see the first comment). I even tried sinking to a file instead, but although the file output.txt is created, nothing is recorded in it.
This is an issue for me because I'm using render() in a Docker container and I can't send the chunk output from the .Rmd file in the container to stderr or stdout. I need to monitor the chunk output of the R code inside the .Rmd file for errors (to diagnose db connection problems), and sending those chunks to stdout or stderr is the only way to do that without logging in to the container, which in my use case (deployed to AWS) is impossible.
I've reviewed the knitr chunk options and there doesn't seem to be any option I can set to force the chunk output to a file, to stdout, or to stderr.
Is there some way I can write all of the chunk output to stdout or stderr from inside render()? This several-years-old question is similar to mine (if not identical), but the accepted answer does not fit my use case.

There are two approaches I would take to this problem, depending on what you want.
If you want to see the stderr/stdout output, as you would in a console, the simplest way to accomplish this is to use a shell script to render your doc and redirect the output to a text file.
The littler package contains an example shell script, render.r, that is very useful for this purpose. After installing littler and ensuring its scripts are available on your path, you can run:
render.r test.rmd > output.txt
or
render.r test.rmd > output.txt 2>&1
to include stderr in the output file.
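If you'd rather not install littler, Rscript (which ships with R) gives a broadly equivalent one-liner:
Rscript -e "rmarkdown::render('test.rmd')" > output.txt 2>&1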
However, I've found that this console output is often not sufficiently detailed for debugging and troubleshooting purposes.
So, another option is to edit the Rmd file to log more details about progress/errors & to direct the logger output to an external file.
This is particularly helpful in the context of a cloud or docker-container compute environment since the log can be directed to a datastore allowing for real-time logging, and searching logs across many jobs. Personally, I do this using the futile.logger package.
Using a logger created by futile.logger, this would work as follows:
In your Rmd file or in functions called by code in your Rmd file, redirect important error & warning messages to the logger. The best way to do this is the topic of another question, and in my experience varies by task.
At a minimum, this results in inserting a series of R commands like the following in your Rmd file:
library(futile.logger)
flog.info('Querying data .. ')
data <- tryCatch(dbGetQuery(...),
                 warning = function(war) {flog.warn(war)},
                 error = function(err) {flog.error(err)},
                 ...)
possibly editing the logged messages to provide more context.
A more thorough solution would apply globally, either to a code chunk or to the whole file. I have not tested this personally in the context of an Rmd, but it might involve using withCallingHandlers, or setting options(error = custom_logging_function).
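As a rough illustration of the withCallingHandlers route (untested in an Rmd, per the caveat above; with_logging is just a hypothetical helper name), you could wrap chunk code so that messages and warnings are copied to the logger as they occur:
library(futile.logger)

# hypothetical helper: forward messages/warnings to futile.logger,
# then muffle them so they are not also printed the usual way
with_logging <- function(expr) {
  withCallingHandlers(
    expr,
    message = function(m) {
      flog.info(conditionMessage(m))
      invokeRestart("muffleMessage")
    },
    warning = function(w) {
      flog.warn(conditionMessage(w))
      invokeRestart("muffleWarning")
    }
  )
}

# usage inside a chunk, e.g. with_logging(dbGetQuery(con, sql))
# ('con' and 'sql' are placeholders)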
In your R session, before rendering your Rmd, redirect the logger output to a file or to your desired destination.
This looks something like:
library(futile.logger)
flog.logger(name = 'ROOT',
            appender = appender.file('render.log'))
rmarkdown::render('my_document.Rmd')
As the document is rendering, you will see the logger output printed to the render.log file.
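If you also want the same messages echoed to the console (handy when the container's stdout/stderr is already being collected), futile.logger's appender.tee() writes to both the console and a file:
flog.appender(appender.tee('render.log'), name = 'ROOT')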
I would note that, while I actively use the futile.logger package, it has since been succeeded by a newer package called logger. I haven't tried this approach with logger specifically, but I suspect it would work just as well, if not better. The differences between logger and futile.logger are described very well in the migration vignette in the logger docs.

I believe the usual way to do this is to run a separate R instance and capture its output. Without any error checking:
output <- system2("R", "-e \"rmarkdown::render('test.Rmd')\"",
                  stdout = TRUE, stderr = TRUE)
This puts all of the output into the output vector. You could then run analysis code in the Docker container to look for problems in it.
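Building on that, a minimal sketch of what the container code might then do with the captured vector (the error patterns are just examples):
# forward the captured render log to stderr so the container's log driver picks it up
writeLines(output, con = stderr())

# or flag only the lines that look like errors
errs <- grep("Error|Quitting from lines", output, value = TRUE)
if (length(errs) > 0) message("render problems:\n", paste(errs, collapse = "\n"))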

Related

R markdown keeps losing its working directory

I have set the root.dir in the settings:
```{r setup, include=TRUE}
knitr::opts_chunk$set(echo = TRUE)
knitr::opts_knit$set(root.dir = "E:/something/Lists")
```
With getwd() I can check that it worked.
On top of that, I have set the working directory to exactly the same path in the RStudio tools.
This works without problems for the code chunks. E.g. I load a table and it appears correctly in the global environment:
```{r, include = TRUE, echo = FALSE}
fevofe<-read.table(file = "112_Auswertg.csv", header = T, sep=";", dec=".")
```
However, I have some inline code in order to have some numbers generated within my plain text.
"All in all we found `r length(fevove[,1])` specimens..."
And this inline code does not react to setting root.dir. As soon as I try to run these code bits, R Markdown keeps telling me that it hasn't found the object, although it is already in the global environment because it was loaded in the previous code chunk.
After executing this inline code and getting the error, I ask R Markdown for getwd() and suddenly it is my Documents folder!
Consequently, when knitting, the process is cancelled because the object is lost, and I get the error:
Error in eval(parse_only(code), envir=envir) : object 'fevofe' not found calls <Anonymous>... inline_exec -> hook_eval -> withvisible -> eval -> eval
Does anybody have an idea what causes this stubborn resetting of the working directory?
Any hint is welcome.
So, this is an interesting case:
The basic problem was a typo: fevofe was spelled fevove in the problematic lines...
Correcting this allowed knitr to complete the job.
However, it seems strange that this causes R Markdown to reset the working directory to something that is not even the default set in the options.
This topic remains interesting...

RStudio Global Changes: show chunk output in console via CLI?

How can I force RStudio (v1.1.383) to evaluate R chunks always in console (instead of inline) when working with Rmarkdown documents, using a script?
I know I can set chunk output to go to the console by clicking on it:
According to this RStudio support post, I could also un-check 'Show output inline for all R Markdown documents' under 'Tools -> Global Options...':
But is there a way to do it from a command line?
The reason I ask is that I often work on my university's machines and they all restore to defaults after each reset. Each time in class, we have to manually go through the menus.
Knowing how to do it via a console command would be as useful as starting each of my classes with
rm(list=ls())
There's not currently an elegant way to do this. This preference is stored inside an internal RStudio state file, in %localappdata%\RStudio-Desktop\monitored\user-settings. If you're sufficiently motivated you can write a script which sets the rmd_chunk_output_inline preference, but it's going to be unpleasant.
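For what it's worth, here is a very rough sketch of such a script. It makes two assumptions you should verify against your own installation: that the settings file inside that directory is itself named user-settings, and that the preference is stored in it as a JSON-style "rmd_chunk_output_inline" : true entry (this is an undocumented internal format, so back the file up first):
# CAUTION: this edits an internal, undocumented RStudio state file; back it up first.
# Assumes the file inside the monitored\user-settings directory is named 'user-settings'.
settings_file <- file.path(Sys.getenv("LOCALAPPDATA"), "RStudio-Desktop",
                           "monitored", "user-settings", "user-settings")
txt <- readLines(settings_file, warn = FALSE)
# Assumes the preference appears as a JSON-style key/value pair somewhere in the file.
txt <- gsub('"rmd_chunk_output_inline"\\s*:\\s*true',
            '"rmd_chunk_output_inline" : false', txt)
writeLines(txt, settings_file)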
One thing you can do is set the chunk output type in the YAML header, like this:
---
editor_options:
  chunk_output_type: console
---
You could also use an R Markdown document template with this set up for you (maybe your script could write this out).
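A tiny sketch of that template idea: have the setup script you already run in class write out a skeleton document (the filename and title are arbitrary) whose YAML header carries the editor_options setting:
# write a minimal Rmd skeleton whose header already requests console chunk output
writeLines(c(
  "---",
  'title: "Class exercise"',
  "output: html_document",
  "editor_options:",
  "  chunk_output_type: console",
  "---"
), "exercise_template.Rmd")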
Finally, there's an open issue for this on RStudio's github page which you might comment on and/or vote for:
https://github.com/rstudio/rstudio/issues/1607

How can I print to the console when using knitr?

I am trying to print to the console (or the output window) for debugging purposes. For example:
\documentclass{article}
\begin{document}
<<foo>>=
print(getwd())
message(getwd())
message("ERROR:")
cat(getwd(), file=stderr())
not_a_command() # Does not throw an error?
stop("Why doesn't this throw an error?")
@
\end{document}
I get the results in the output PDF, but my problem is that I have a script that is not completing (so there is no output PDF to check), and I'm trying to understand why. There also appears to be no log file output if the knitting doesn't complete successfully.
I am using knitr 1.13 and Rstudio 0.99.896.
EDIT: The above code will correctly output (and break) if I change to Sweave, so that makes me think it is a knitr issue.
This question has several aspects, and it's partly an XY problem. At the core, the question is (as I read it):
How can I see what's wrong if knitr fails and doesn't produce an output file?
In the case of PDF output, compiling the PDF quite often fails after an error has occurred, but there is still the intermediate TeX file. Opening this file may reveal error messages.
As suggested by Gregor, you can run the code in the chunks line by line (or chunk by chunk) in the console. However, this may not reproduce all problems, especially if they are related to the working directory or the environment.
capture.output() can be used to print debug information to an external file.
Finally (as opposed to my earlier comment), it is possible to print to RStudio's progress window (or whatever it is called): messages from hooks will be printed in the progress window. Basically, the message must come from knitr itself, not from the code knitr evaluates.
How can I print debug information on the progress window in RStudio?
The following example prints all objects in the environment after each chunk with debug = TRUE:
\documentclass{article}
\begin{document}
<<>>=
knitr::knit_hooks$set(debug = function(before, options, envir) {
  if (!before) {
    message(
      paste(names(envir), as.list(envir),
            sep = " = ", collapse = "\n"))
  }
})
@
<<debug = TRUE>>=
a <- 5
foo <- "bar"
@
\end{document}
The progress window reads:
Of course, for documents with more or larger objects the hook should be adjusted to selectively print (parts of) objects.
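For example, a variant of the hook above that reports only each object's name and class, rather than its full contents, might look like this (an untested sketch):
knitr::knit_hooks$set(debug = function(before, options, envir) {
  if (!before) {
    objs <- ls(envir)
    cls  <- vapply(objs, function(nm) class(get(nm, envir = envir))[1], character(1))
    message(paste(objs, cls, sep = ": ", collapse = "\n"))
  }
})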
Use stop(). For example, stop("Hello World").

How to request an early exit when knitting an Rmd document?

Let's say you have an R markdown document that will not render cleanly.
I know you can set the knitr chunk option error to TRUE to request that evaluation continue, even in the presence of errors. You can do this for an individual chunk via error = TRUE or in a more global way via knitr::opts_chunk$set(error = TRUE).
But sometimes there are errors that are still fatal to the knitting process. Two examples I've recently encountered: trying to unlink() the current working directory (oops!) and calling rstudioapi::getVersion() from inline R code when RStudio is not available. Is there a general description of these sorts of errors, i.e. the ones beyond the reach of error = TRUE? Is there a way to tolerate errors in inline R code vs in chunks?
Also, are there more official ways to halt knitting early or to automate debugging in this situation?
To exit early from the knitting process, you may use the function knitr::knit_exit() anywhere in the source document (in a code chunk or inline expression). Once knit_exit() is called, knitr will ignore all the rest of the document and write out the results it has collected so far.
There is no way to tolerate errors in inline R code at the moment. You need to make sure inline R code always runs without errors [1]. If errors do occur, you should see the range of lines that produced the error in the knitr log in the console, of the form Quitting from lines x1-x2 (filename.Rmd). Then you can go to the file filename.Rmd and see what is wrong with the lines from x1 to x2. The same thing applies to code chunks with the chunk option error = FALSE.
Beyond the types of errors mentioned above, it may be tricky to find the source of the problem. For example, when you unintentionally unlink() the current directory, it should not stop the knitting process, because unlink() succeeded anyway. You may run into problems after the knitting process, e.g., LaTeX/HTML cannot find the output figure files. In this case, you can try to apply knit_exit() to all code chunks in the document one by one. One way to achieve this is to set up a chunk hook to run knit_exit() after a certain chunk. Below is an example of using linear search (you can improve it by using bisection instead):
#' Render an input document chunk by chunk until an error occurs
#'
#' @param input the input filename (an Rmd file in this example)
#' @param compile a function to compile the input file, e.g. knitr::knit, or
#'   rmarkdown::render
knit_debug = function(input, compile = knitr::knit) {
  library(knitr)
  lines = readLines(input)
  chunk = grep(all_patterns$md$chunk.begin, lines)  # line numbers of chunk headers
  knit_hooks$set(debug = function(before) {
    if (!before) {
      chunk_current <<- chunk_current + 1
      if (chunk_current >= chunk_num) knit_exit()
    }
  })
  opts_chunk$set(debug = TRUE)
  # try to exit after the i-th chunk and see which chunk introduced the error
  for (chunk_num in seq_along(chunk)) {
    chunk_current = 0  # a chunk counter, incremented after each chunk
    res = try(compile(input))
    if (inherits(res, 'try-error')) {
      message('The first error came from line ', chunk[chunk_num])
      break
    }
  }
}
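Assuming the function above has been sourced, a call would look something like this (the filename is hypothetical):
knit_debug("analysis.Rmd", compile = rmarkdown::render)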
[1] This is by design. I think it is a good idea to have error = TRUE for code chunks, since sometimes we want to show errors, for example, for teaching purposes. However, if I allowed errors for inline code as well, authors might fail to recognize fatal errors in the inline code. Inline code is normally used to embed values inline, and I don't think it makes much sense if an inline value is an error. Imagine a sentence in a report like "The P-value of my test is ERROR": if knitr didn't signal the error, it would require the authors to read the report output very carefully to spot this issue. I think it is a bad idea to have to rely on human eyes to find such mistakes.
IMHO, difficulty debugging an Rmd document is a warning that something is wrong. I have a rule of thumb: Do the heavy lifting outside the Rmd. Do rendering inside the Rmd, and only rendering. That keeps the Rmd code simple.
My large R programs look like this.
data <- loadData()
analytics <- doAnalytics(data)
rmarkdown::render("theDoc.Rmd", envir=analytics)
(Here, doAnalytics returns a list or environment. That list or environment gets passed to the Rmd document via the envir parameter, making the results of the analytics computations available inside the document.)
The doAnalytics function does the complicated calculations. I can debug it using the regular tools, and I can easily check its output. By the time I call rmarkdown::render, I know the hard stuff is working correctly. The Rmd code is just "print this" and "format that", easy to debug.
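A minimal sketch of this pattern (all function, object, and file names here are illustrative, not from any package):
loadData <- function() {
  read.csv("input.csv")  # illustrative data source
}

doAnalytics <- function(data) {
  # the heavy lifting happens here, where it is easy to test and debug interactively
  fit <- lm(dist ~ speed, data = data)
  # return an environment so render() can evaluate the document's chunks inside it
  list2env(list(data = data, fit = fit), envir = new.env())
}

analytics <- doAnalytics(loadData())
rmarkdown::render("theDoc.Rmd", envir = analytics)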
This division of responsibility has served me well, and I can recommend it. Especially compared to the mind-bending task of debugging complicated calculations buried inside a dynamically rendered document.

Is there a way to do test-driven development with literate programming?

I'm learning to do my first unit tests with R, and I write my code in R Markdown files to make delivering short research reports easy. At the same time, I would like to test the functions I use in these files to make sure the results are sane.
Here's the problem: R Markdown files are meant to go into HTML weavers, not the RUnit test harness. If I want to load a function into the test code, I have a few choices:
- Copy-paste the code chunk from the Markdown file, which decouples the code in the Markdown doc from the tested code
- Put my test code inside the Markdown file, which makes the report difficult to understand (perhaps at the end would be tolerable)
- Write the code, test it first, and then include it as a library in the Markdown code, which takes away the informative character of having the code in the body of the report
Is there a more sensible way to go about this that avoids the disadvantages of each of these approaches?
You could do something like this:
## Rmarkdown file with tests
```{r definefxn}
foo <- function(x) x^2
```
Test fxn
```{r testfxn}
library(testthat)
expect_is(foo(8), "numeric")
expect_equal(foo(8), 6)  # fails on purpose (foo(8) is 64), to show the failure message
```
Where of course the tests that pass don't print anything, but the tests that fail print meaningful messages about what failed.
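If you want failures grouped under a descriptive label, you could also wrap the expectations in test_that() inside the same chunk (a sketch reusing the hypothetical foo):
library(testthat)
test_that("foo squares its input", {
  expect_is(foo(8), "numeric")
  expect_equal(foo(8), 64)
})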
