kable all tables in markdown report [duplicate] - r

Using knitr and R Markdown, I can produce a tabularised output from a matrix using the following command:
```{r results='asis'}
kable(head(x))
```
However, I’m searching for a way to make the kable code implicit since I don’t want to clutter the echoed code with it. Essentially, I want this:
```{r table=TRUE}
head(x)
```
… to yield a formatted tabular (rather than the normal output='markdown') output.
I actually thought this must be pretty straightforward since it’s a pretty obvious requirement, but I cannot find any way to achieve this, either via the documentation or on the web.
My approach to create an output hook fails because once the data arrives at the hook, it’s already formatted and no longer the raw data. Even when specifying results='asis', the hook obtains the output as a character string and not as a matrix. Here’s what I’ve tried:
default_output_hook <- knit_hooks$get('output')
knit_hooks$set(output = function (x, options)
if (! is.null(options$table))
kable(x)
else
default_output_hook(x, options)
)
But like I said, this fails since x is not the original matrix but rather a character string, and it doesn’t matter which value for the results option I specify.

Nowadays one can set df_print in the YAML header:
---
output:
html_document:
df_print: kable
---
```{r}
head(iris)
```

I think other answers are from a time when the following didn't work, but now we can just do :
```{r results='asis', render=pander::pander}
head(x)
```
Or set this for all chunks in the setup chunk, for instance :
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE, render=pander::pander)
```

Lacking a better solution I’m currently re-parsing the character string representation that I receive in the hook. I’m posting it here since it kind of works. However, parsing a data frame’s string representation is never perfect. I haven’t tried the following with anything but my own data and I fully expect it to break on some common use-cases.
reparse <- function (data, comment, ...) {
# Remove leading comments
data <- gsub(sprintf('(^|\n)%s ', comment), '\\1', data)
# Read into data frame
read.table(text = data, header = TRUE, ...)
}
default_output_hook <- knit_hooks$get('output')
knit_hooks$set(output = function (x, options)
if (is.null(options$table))
default_output_hook(x, options)
else {
extra_opts <- if (is.list(options$table)) options$table else list()
paste(kable(do.call(reparse, c(x, options$comment, extra_opts))),
collapse = '\n')
}
)
This will break if the R markdown comment option is set to a character sequence containing a regular expression special char (e.g. *), because R doesn’t seem to have an obvious means of escaping a regular expression.
Here’s a usage example:
```{r table=TRUE}
data.frame(A=1:3, B=4:6)
```
You can pass extra arguments to the deparse function. This is necessary e.g. when the table contains NA values because read.table by default interprets them as strings:
```{r table=list(colClasses=c('numeric', 'numeric'))}
data.frame(A=c(1, 2, NA, 3), B=c(4:6, NA))
```
Far from perfect, but at least it works (for many cases).

Not exactly what you are looking for, but I am posting an answer here (that could not fit in a comment) as your described workflow is really similar to what my initial goal and use-case was when I started to work on my pander package. Although I really like the bunch of chunk options that are available in knitr, I wanted to have an engine that makes creating documents really easy, automatic and without any needed tweaks. I am aware of the fact that knitr hooks are really powerful, but I just wanted to set a few things in my Rprofile and let the literate programming tool its job without further trouble, that ended to be Pandoc.brew for me.
The main idea is to specify a few options (what markdown flavour are you using, what's your decimal mark, favorite colors for your charts etc), then simply write your report in a brew syntax without any chunk options, and the results of your code would be automatically transformed to markdown. Then convert that to pdf/docx/odt etc. with Pandoc.

Related

How to Customize figure LaTeX with knitr/rmarkdown

I want to create a footer within the float for a figure created with ggplot2 in an rmarkdown document that is generating a .pdf file via LaTeX.
My question: Is there a way within rmarkdown/knitr to add more LaTeX commands within the figure environment?
Specifically, I'd like to find a way to insert custom LaTeX using either the floatrow or caption* macro as described in https://tex.stackexchange.com/questions/56529/note-below-figure within the figure environment.
When I looked at the chunk options (https://yihui.org/knitr/options/#plots), something like out.extra seems close to what I want, but that is used as an extra option to \includegraphics while I want access to put extra LaTeX within the figure environment, outside of any other LaTeX command.
The solution to your question is perhaps quite similar to this one. However, I believe yours is a bit more general, so I'll try to be a bit more general as well...
As far as I know, there's no simple solution to add extra LaTeX code within the figure environment. What you can do is update the knit (or output) hook (i.e. the LaTeX code output generated by the figure chunk).
The source code for the LaTeX figure output hook can be found here (hook_plot_tex). The output generated can be found starting at line 159. Here we can see how the output is structured and we're able to modify it before it reaches the latex engine.
However, we only want to modify it for relevant figure chunks, not all. This is where Martin Schmelzer's answer comes in handy. We can create a new chunk option which allows for control over when it is activated. As an example enabling the use of caption* and floatrow we can define the following knit hook
defOut <- knitr::knit_hooks$get("plot")
knitr::knit_hooks$set(plot = function(x, options) {
#reuse the default knit_hook which will be updated further down
x <- defOut(x, options)
#Make sure the modifications only take place when we enable the customplot option
if(!is.null(options$customplot)) {
x <- gsub("caption", "caption*", x) #(1)
inter <- sprintf("\\floatfoot{%s}\\end{figure}", options$customplot[1]) #(2)
x <- gsub("\\end{figure}", inter, x, fixed=T) #(3)
}
return(x)
})
What we're doing here is (1) replacing the \caption command with \caption*, (2) defining the custom floatfoot text input, (3) replacing \end{figure} with \floatfoot{custom text here}\end{figure} such that floatfoot is inside the figure environment.
As you can probably tell, sky's the limit for what you can add/replace in the figure environment. Just make sure it is added inside the environment and is in the apropriate location. See the example below how the chunk option is used to enable floatfoot and caption*. (You can also split the customplot option into e.g. starcaption and floatfoot by simply dividing up the !is.null(options$customplot) condition. This should allow for better control)
Working example:
---
header-includes:
- \usepackage[capposition=top]{floatrow}
- \usepackage{caption}
output: pdf_document
---
```{r, echo = F}
library(ggplot2)
defOut <- knitr::knit_hooks$get("plot")
knitr::knit_hooks$set(plot = function(x, options) {
x <- defOut(x, options)
if(!is.null(options$customplot)) {
x <- gsub("caption", "caption*", x)
inter <- sprintf("\\floatfoot{%s}\\end{figure}", options$customplot[1])
x <- gsub("\\end{figure}", inter, x, fixed=T)
}
return(x)
})
```
```{r echo = F, fig.cap = "Custom LaTeX hook chunk figure", fig.align="center", customplot = list("This is float text using floatfoot and floatrow")}
ggplot(data = iris, aes(x=Sepal.Length, y=Sepal.Width))+
geom_point()
```
PS
The example above requires the fig.align option to be enabled. Should be fairly easy to fix, but I didn't have the time to look into it.
#henrik_ibsen gave the answer that got me here. I made some modifications to the code that I ended up using to make it work a bit more simply:
hook_plot_tex_footer <- function(x, options) {
x_out <- knitr:::hook_plot_tex(x, options)
if(!is.null(options$fig.footer)) {
inter <- sprintf("\\floatfoot{%s}\n\\end{figure}", options$fig.footer[1])
x_out <- gsub(x=x_out, pattern="\n\\end{figure}", replacement=inter, fixed=TRUE)
}
x_out
}
knitr::knit_hooks$set(plot=hook_plot_tex_footer)
Overall, this does the same thing. But, it uses knitr:::hook_plot_tex() instead of defOut() so that if rerun in the same session, it will still work. And, since it's going into a \floatfoot specifically, I named the option fig.footer. But, those changes are minor and the credit for the answer definitely goes to #henrik_ibsen.

how to tell if code is executed within a knitr/rmarkdown context?

Based on some simple tests, interactive() is true when running code within rmarkdown::render() or knitr::knit2html(). That is, a simple .rmd file containing
```{r}
print(interactive())
```
gives an HTML file that reports TRUE.
Does anyone know of a test I can run within a code chunk that will determine whether it is being run "non-interactively", by which I mean "within knit2html() or render()"?
As Yihui suggested on github isTRUE(getOption('knitr.in.progress')) can be used to detect whether code is being knitted or executed interactively.
A simple suggestion for rolling your own: see if you can access current output format:
```{r, echo = FALSE}
is_inside_knitr = function() {
!is.null(knitr::opts_knit$get("out.format"))
}
```
```{r}
is_inside_knitr()
```
There are, of course, many things you could check--and this is not the intended use of these features, so it may not be the most robust solution.
I suspect (?) you might just need to roll your own.
If so, here's one approach which seems to perform just fine. It works by extracting the names of all of the functions in the call stack, and then checks whether any of them are named "knit2html" or "render". (Depending on how robust you need this to be, you could do some additional checking to make sure that these are really the functions in the knitr and rmarkdown packages, but the general idea would still be the same.)
```{r, echo=FALSE}
isNonInteractive <- function() {
ff <- sapply(sys.calls(), function(f) as.character(f[[1]]))
any(ff %in% c("knit2html", "render"))
}
```
```{r}
print(isNonInteractive())
```

How to write an article with an abstract referencing data that has not yet been calculated?

UPDATE: Seems like my question is actually a very near duplicate of this question
and according to that thread, there is currently no "easy" solution. However, that question is over a year old now, and the time may have changed (one can hope!).
My original question follows:
I'm thinking that I need some kind of mechanism to re-order the text and or R chunks in the document as it is being knit. What I want to be able to do is to write an "article" style document with an abstract and summary at the beginning, before I get into any R code, but that contains "forward"-references to things that will be calculated in the R code.
So my exec summary at the beginning might be
We found a `r final_correlation/100`% correlation between x and y...
but "final_correlation" will be calculated at the back end of the document as I go through all of the steps of the reproducible research.
Indeed, when I read about reproducible research, I often see comments that the documentation can often be better presented out-of programming sequence.
I believe that in other literate programming frameworks the chunks can be tangled into a different order from that in which they were presented. How can I achieve that in knitr? Or is there some other completely different workflow or pattern I could adopt to achieve the outcome I want?
There is no way to define the order to evaluate all the code chunks in knitr at the moment. One idea I can think of is to write the abstract in the end of the article, and include it in the beginning. An outline:
article.Rmd
abstract.Rmd
In article.Rmd:
Title.
Author.
Abstract.
```{r echo=FALSE, results='asis'}
if (file.exists('abstract.md')) {
cat(readLines('abstract.md'), sep = '\n')
} else {
cat('Abstract not ready yet.')
}
```
More code chunks.
```{r}
x <- 1:10
y <- rnorm(10)
final_correlation <- cor(x, y)
```
Body.
```{r include=FALSE}
knitr::knit('abstract.Rmd') # generates abstract.md
```
In abstract.Rmd:
We found a `r final_correlation/100`% correlation between x and y...

Including R help in knitr output

Is it possible to include R documentation in knitr output? When using stock datasets, it would be nice to just include the builtin documentation without having to copy and paste it in. The problem appears to be that ? works by side effect and so there is no "result" in a meaningful sense. For example,
```{r}
?mtcars
```
has no output that is trapped by knitr.
Using help(...,help_type) instead of ? doesn't help either. I've tried:
```{r, results='markup'}
help(mtcars, help_type="text")
```
and
```{r, results='asis'}
help(mtcars, type="html")
```
with the same result. (In the latter case, knitr did trap the output ## starting httpd help server ... done, which is basically just a message about the side effect.)
In other words, is there a way to extract R help in plain text or HTML?
To answer your specific question, "Is there a way to extract R help in plain text or HTML?", the answer would be to use a combination of Rd2HTML or Rd2txt from the "tools" package, with a little bit of help from .getHelpFile from "utils".
For HTML:
tools:::Rd2HTML(utils:::.getHelpFile(help(mtcars)))
For txt:
tools:::Rd2txt(utils:::.getHelpFile(help(mtcars)))
By the sounds of it, though, you should be able to use the function I've linked to in the comment above. For instance, to include the text from the "Description" section of the "mtcars" help page, you would use something along the lines of:
```{r, echo=FALSE, results='asis'}
cat(helpExtract(mtcars, section = "Desc", type = "m_text"))
```
I think you can get what you want by hacking the pager option as follows:
pfun <- function(files, header, title, delete.file) {
all.str <- do.call("c",lapply(files,readLines))
cat(all.str,sep="\n")
}
orig_pager <- options(pager=pfun)
help("mtcars")
options(orig_pager)
(you can return the character vector from the function instead of cat()ing it if you prefer).
Use printr, e.g.
library(printr)
help(mtcars)
detach('package:printr', unload = TRUE)

Use hooks to format table in output

Using knitr and R Markdown, I can produce a tabularised output from a matrix using the following command:
```{r results='asis'}
kable(head(x))
```
However, I’m searching for a way to make the kable code implicit since I don’t want to clutter the echoed code with it. Essentially, I want this:
```{r table=TRUE}
head(x)
```
… to yield a formatted tabular (rather than the normal output='markdown') output.
I actually thought this must be pretty straightforward since it’s a pretty obvious requirement, but I cannot find any way to achieve this, either via the documentation or on the web.
My approach to create an output hook fails because once the data arrives at the hook, it’s already formatted and no longer the raw data. Even when specifying results='asis', the hook obtains the output as a character string and not as a matrix. Here’s what I’ve tried:
default_output_hook <- knit_hooks$get('output')
knit_hooks$set(output = function (x, options)
if (! is.null(options$table))
kable(x)
else
default_output_hook(x, options)
)
But like I said, this fails since x is not the original matrix but rather a character string, and it doesn’t matter which value for the results option I specify.
Nowadays one can set df_print in the YAML header:
---
output:
html_document:
df_print: kable
---
```{r}
head(iris)
```
I think other answers are from a time when the following didn't work, but now we can just do :
```{r results='asis', render=pander::pander}
head(x)
```
Or set this for all chunks in the setup chunk, for instance :
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE, render=pander::pander)
```
Lacking a better solution I’m currently re-parsing the character string representation that I receive in the hook. I’m posting it here since it kind of works. However, parsing a data frame’s string representation is never perfect. I haven’t tried the following with anything but my own data and I fully expect it to break on some common use-cases.
reparse <- function (data, comment, ...) {
# Remove leading comments
data <- gsub(sprintf('(^|\n)%s ', comment), '\\1', data)
# Read into data frame
read.table(text = data, header = TRUE, ...)
}
default_output_hook <- knit_hooks$get('output')
knit_hooks$set(output = function (x, options)
if (is.null(options$table))
default_output_hook(x, options)
else {
extra_opts <- if (is.list(options$table)) options$table else list()
paste(kable(do.call(reparse, c(x, options$comment, extra_opts))),
collapse = '\n')
}
)
This will break if the R markdown comment option is set to a character sequence containing a regular expression special char (e.g. *), because R doesn’t seem to have an obvious means of escaping a regular expression.
Here’s a usage example:
```{r table=TRUE}
data.frame(A=1:3, B=4:6)
```
You can pass extra arguments to the deparse function. This is necessary e.g. when the table contains NA values because read.table by default interprets them as strings:
```{r table=list(colClasses=c('numeric', 'numeric'))}
data.frame(A=c(1, 2, NA, 3), B=c(4:6, NA))
```
Far from perfect, but at least it works (for many cases).
Not exactly what you are looking for, but I am posting an answer here (that could not fit in a comment) as your described workflow is really similar to what my initial goal and use-case was when I started to work on my pander package. Although I really like the bunch of chunk options that are available in knitr, I wanted to have an engine that makes creating documents really easy, automatic and without any needed tweaks. I am aware of the fact that knitr hooks are really powerful, but I just wanted to set a few things in my Rprofile and let the literate programming tool its job without further trouble, that ended to be Pandoc.brew for me.
The main idea is to specify a few options (what markdown flavour are you using, what's your decimal mark, favorite colors for your charts etc), then simply write your report in a brew syntax without any chunk options, and the results of your code would be automatically transformed to markdown. Then convert that to pdf/docx/odt etc. with Pandoc.

Resources