Plots generated in chunk don't appear with rmarkdown::render_site()

I am creating a website with
Rscript -e "rmarkdown::render_site()"
I am generating both html and pdf versions of documents. A plot generated in a chunk does not appear unless the pdf doc is generated before the html doc.
Here are the files:
index.Rmd
---
title: "My Website"
---
* [Test1 page](test1.html)
* [Test2 page](test2.html)
_site.yml
name: "my-website"
test1.Rmd (html generated first)
---
output:
  html_document: default
  pdf_document: default
---
```{r, message=FALSE, echo=FALSE}
library(ggplot2)
ggplot(mtcars, aes(x=mpg, y=disp)) + geom_point()
```
test2.Rmd (pdf generated first)
---
output:
  pdf_document: default
  html_document: default
---
```{r, message=FALSE, echo=FALSE}
library(ggplot2)
ggplot(mtcars, aes(x=mpg, y=disp)) + geom_point()
```
After running render_site() via Rscript, test1.html is blank: there is no test1_files subdirectory. However, test2.html shows the plot (and of course test2_files exists).
This happens with both Rmarkdown 1.10 and 1.10.14, the development version as of October 31.
In a more complicated real life example, the plots don't appear even if I switch the document order, but I am hoping that the answer to this problem will help with the more complicated one.
UPDATE: In addition to the suggestions by @giocomai, a workaround is to compile test1.Rmd twice:
Rscript -e "rmarkdown::render_site()"
Rscript -e "rmarkdown::render_site('test1.Rmd')"
This seems to work even if you compile multiple single files. Presumably the clean-up is less aggressive in the single-file case.

I could replicate your problem. I think it happens because rmarkdown::render() cleans up the figure files after it creates the pdf output, since it considers them no longer needed, whereas render_site() copies files to the _site folder only after all output formats have been rendered.
In rmarkdown::render() there is an option to set clean=FALSE, but it does not seem to be available to rmarkdown::render_site(), as arguments are not passed to render.
I think it would be worth filing it as an issue to Rmarkdown, as it shouldn't be too difficult to pass over the argument.
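As a rough sketch of where the clean argument goes, a single document can be rendered directly with rmarkdown::render(), outside render_site(). The helper name render_keep_files is made up for illustration, and this does not copy anything into _site; it only keeps the intermediate files next to the Rmd file so you can inspect them.

```r
# Hypothetical helper (not part of rmarkdown): render one document in all
# the formats listed in its YAML header, keeping intermediate files.
render_keep_files <- function(input) {
  rmarkdown::render(
    input,
    output_format = "all",  # every format declared under `output:`
    clean = FALSE           # do not delete intermediate files after rendering
  )
}

# Usage (assumes test1.Rmd is in the working directory):
# render_keep_files("test1.Rmd")
```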
As a workaround, you can force cache = TRUE in the chunks of the relevant Rmd document. So, for example, the code chunk in your test1.Rmd would look like:
```{r, message=FALSE, echo=FALSE, cache = TRUE}
library(ggplot2)
ggplot(mtcars, aes(x=mpg, y=disp)) + geom_point()
```
Notice the cache = TRUE in the chunk options. With the cache enabled, the folder is preserved and it is correctly copied to the _site folder.
You can also set knitr::opts_chunk$set(cache = TRUE) for all chunks.
This solves your problem, but there are probably more elegant solutions.

Related

In R markdown, how do I prevent plots from non-cached chunks from being saved separately?

When knitting an R markdown file, the plots outputted from any chunk with cache=TRUE are saved independently from the HTML output. This makes sense to me. However, if even a single chunk has the cache=TRUE option set, all chunks, including those with cache=FALSE, have their plots saved independently. For example, the following code saves image files for both chunks:
---
title: "Cache Plot Test"
output:
  html_document:
    df_print: paged
---
```{r test_plot1, cache = FALSE}
library(ggplot2)
ggplot(airquality, aes(x = Temp, y = Wind)) +
  geom_point()
```
```{r test_plot2, cache = TRUE}
library(ggplot2)
ggplot(airquality, aes(x = Month, y = Ozone)) +
  geom_point()
```
Is there any way to prevent this if someone wants to implement caching on particular chunks but doesn't want to independently save every single plot in the output? If there isn't such an option and this is by design, what's the rationale? Why would it be necessary to save the plots from chunks that don't implement caching?
The plots are always written out to a file. You can see that for the cached block, the image is not modified when you re-knit the document, but the image in the non-cached block is rewritten (check the modified dates). R isn't re-running the code that generates the image for the cached block. If you don't have any caching enabled, rmarkdown will "clean" up after its run and delete all images. But because rmarkdown doesn't track side effects on a per-block level, when caching is enabled it can't clean up after itself anymore because it doesn't know which images came from which block. So it keeps them all to be safe.
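To check the modified dates yourself, something like the following can be run after each knit. This is only a sketch: figure_mtimes is a made-up helper name, and the folder name in the usage line is an assumption that depends on what your Rmd file is called.

```r
# List figure files under a directory with their last-modified times.
# `fig_dir` is an assumption: point it at your document's *_files folder.
figure_mtimes <- function(fig_dir) {
  files <- list.files(fig_dir, recursive = TRUE, full.names = TRUE)
  data.frame(
    file  = files,
    mtime = file.mtime(files),
    stringsAsFactors = FALSE
  )
}

# Usage: call once after each knit and compare the two results.
# figure_mtimes("cache_plot_test_files")
```

If the explanation above holds, the file from the cached chunk should keep its timestamp between knits while the file from the non-cached chunk gets a new one.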

Write pdf to figure folder, but delete pngs

This is a follow-up question about this answer. I have set the knitr chunk options to output a png and pdf version of plots in a folder, as well as use the pngs in the knitted report.
However, I'd only like to keep the pdf version of the figure and discard the png file. Is there a knitr-equivalent of on.exit() to clean up the pngs after knitting? Or an option I overlooked?
With the rmarkdown document below, how do I automatically clean up the png version of the plot after knitting? (Or not produce it as a standalone file in the first place)
---
title: "Untitled"
author: "Me"
date: "21/10/2021"
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(
  echo = TRUE,
  dev = c("png", "pdf"),
  fig.path = here::here(
    "figures",
    gsub("\\.Rmd$", "\\\\", basename(knitr::current_input()))
  )
)
```
```{r my_plot}
library(ggplot2)
ggplot(mpg, aes(displ, hwy)) +
  geom_point()
```
This is not exactly what you are looking for, but manually removing eval=FALSE from the following chunk deletes the unwanted files:
```{r eval=FALSE, include=FALSE}
fList <- dir("figures")
fList <- fList[stringr::str_detect(fList,"\\.png$")]
file.remove(paste0("figures/",fList))
```
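A variant that needs no extra packages is to wrap the deletion in a small function and call it from a script once rendering has finished. This is a sketch under assumptions: remove_pngs is a made-up name, "figures" is taken from the question's fig.path, and it relies on the HTML output being self-contained (the html_document default), so the pngs are already embedded by the time they are deleted.

```r
# Delete every .png below `fig_dir`, leaving the .pdf versions in place.
# Returns the paths that were removed (invisibly).
remove_pngs <- function(fig_dir = "figures") {
  pngs <- list.files(fig_dir, pattern = "\\.png$", recursive = TRUE,
                     full.names = TRUE)
  unlink(pngs)
  invisible(pngs)
}

# Usage, after rendering (the file name is an assumption):
# rmarkdown::render("report.Rmd")
# remove_pngs("figures")
```

Calling it from inside the document while it is still knitting is riskier, because pngs from earlier chunks may be deleted before they are embedded in the output.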

Non English characters in ggplot within a knitr document

I'm trying to knit an .Rmd file containing Lithuanian characters such as ąčęėįšųž to pdf in RStudio. Knitting to html works properly and the ggplot title shows the Lithuanian characters, but when knitting to pdf, ggplot emits warnings and drops these characters.
Reproducible example:
---
title: "Untitled"
output:
  pdf_document:
    includes:
      in_header: header_lt_text.txt
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
library(ggplot2)
```
## Lithuanian char: ĄČĘĖĮŠŲŪžąčęėįšųūž
```{r}
ggplot(iris, aes(Sepal.Length, Sepal.Width)) +
  geom_point(aes(col = Species)) +
  labs(title = "Lithuanian char: ĄČĘĖĮŠŲŪžąčęėįšųūž")
```
I pass header_lt_text.txt with the following contents:
\usepackage[utf8]{inputenc}
\usepackage[L7x]{fontenc}
\usepackage[lithuanian]{babel}
\usepackage{setspace}
\onehalfspacing
Any suggestions on how to make ggplot create correct labels?
The problem is with the pdf device and is only apparent when saving the picture as pdf (which you want because it looks much, much better). This is why it seems to "work" in some cases: the image is not rendered as pdf but e.g. as png. Thanks to @Konrad for correctly identifying the source of the problem.
To solve this, you need to pass the correct encoding to the pdf device.
Fortunately, the pdf device (?pdf) takes an encoding argument and there is a chunk option to pass arguments to the device: dev.args
On Windows, an appropriate encoding is CP1257.enc (Baltic):
```{r dev="pdf", dev.args=list(encoding="CP1257.enc")}
ggplot(iris, aes(Sepal.Length, Sepal.Width)) +
  geom_point(aes(col = Species)) +
  labs(title = "Lithuanian char: ĄČĘĖĮŠŲŪžąčęėįšųūž")
```
You can see the other encodings available out of the box with: list.files(system.file("enc", package = "grDevices"))
This works well on my Linux machine.
Alternatively, if you're happy to get png images inserted in the pdf, you can simply use dev="png" in your chunk option. Doesn't look as good though.
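Before putting an encoding into dev.args, it may help to confirm that the file actually ships with your R installation. A minimal check in an R session, using only the call already mentioned above (the variable name encodings is just for illustration):

```r
# Encoding files bundled with the grDevices package.
encodings <- list.files(system.file("enc", package = "grDevices"))
print(encodings)

# TRUE if the Baltic code page used above is available on this installation.
"CP1257.enc" %in% encodings
```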

Modularized R markdown structure

There are a few questions about this already, but they are either unclear or provide solutions that don't work, perhaps because they are outdated:
Proper R Markdown Code Organization
How to source R Markdown file like `source('myfile.r')`?
http://yihui.name/knitr/demo/externalization/
Modularized code structure for large projects
R Markdown/Notebook is nice, but the way it's presented, there is typically a single file that has all the text and all the code chunks. I often have projects where such a single file structure is not a good setup. Instead, I use a single .R master file that loads the other .R files in order. I'd like to replicate this structure using R Notebook i.e. such that I have a single .Rmd file that I call the code from multiple .R files from.
The nice thing about working with a project this way is that it allows for the nice normal workflow with RStudio using the .R files but also the neat output from R Notebook/Markdown without duplicating the code.
Minimal example
This is simplified to make the example as small as possible. Two .R files and one master .Rmd file.
start.R
# libs --------------------------------------------------------------------
library(pacman)
p_load(dplyr, ggplot2)
#normally load a lot of packages here
# data --------------------------------------------------------------------
d = iris
#use iris for example, but normally would load data from file
# data manipulation tasks -------------------------------------------------
#some code here to extract useful info from the data
setosa = dplyr::filter(d, Species == "setosa")
plot.R
#setosa only
ggplot(setosa, aes(Sepal.Length)) +
  geom_density()
#all together
ggplot(d, aes(Sepal.Length, color = Species)) +
  geom_density()
And then the notebook file:
notebook.Rmd:
---
title: "R Notebook"
output:
  html_document: default
  html_notebook: default
---
First we load some packages and data and do slight transformation:
```{r start}
#a command here to load the code from start.R and display it
```
```{r plot}
#a command here to load the code from plot.R and display it
```
Desired output
The desired output is what one gets from manually copying the code from start.R and plot.R into the code chunks in notebook.Rmd, with both the code and its results displayed.
Things I've tried
source
This loads the code, but does not display it. It just displays the source command:
knitr::read_chunk
This command was mentioned here, but actually it does the same as source as far as I can tell: it loads the code but displays nothing.
How do I get the desired output?
The solution is to use knitr's chunk option code. According to knitr docs:
code: (NULL; character) if provided, it will override the code in the
current chunk; this allows us to programmatically insert code into the
current chunk; e.g. a chunk option code =
capture.output(dump('fivenum', '')) will use the source code of the
function fivenum to replace the current chunk
No example is provided, however. It sounds like one has to feed it a character vector, so let's try readLines:
```{r start, code=readLines("start.R")}
```
```{r plot, code=readLines("plot.R")}
```
This produces the desired output and thus allows for a modularized project structure.
Feeding it a file directly does not work (i.e. code="start.R"), but would be a nice enhancement.
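To see why readLines fits here, it can help to look at what it returns outside of knitr. The sketch below uses a temporary two-line script as a stand-in for start.R; the file contents and variable names are made up for illustration, and the eval step only approximates what knitr does with the lines.

```r
# Write a two-line script to a temporary file (stand-in for start.R).
script <- tempfile(fileext = ".R")
writeLines(c("d <- iris", "n <- nrow(d)"), script)

# readLines() returns one string per line: the shape the `code` option expects.
code_lines <- readLines(script)
is.character(code_lines)

# Evaluating the lines runs the script's code in a separate environment.
env <- new.env()
eval(parse(text = code_lines), envir = env)
env$n
```

A bare file name such as "start.R" is also a character vector, but knitr would treat it as the code itself rather than as a path, which matches the behaviour described above.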
For interoperability with R Notebooks, you can use knitr's read_chunk method as described above. In a notebook, you must call read_chunk in the setup chunk; since you can run notebook chunks in any order, this ensures that the external code will always be available.
Here's a minimal example of using read_chunk to bring code from an external R script into a notebook:
example.Rmd
```{r setup}
knitr::read_chunk("example.R")
```
```{r chunk}
```
example.R
## ---- chunk
1 + 1
When you execute the empty chunk in the notebook, code from the external file will be inserted, and the results displayed inline, as though the chunk contained that code.
As per my comment above, I use the here library to work with projects in folders:
```{r setup, echo=FALSE, message=FALSE, warning=FALSE, results='asis'}
library(here)
# Read an external script from the project's report folder as a character vector
insert <- function(filename){
  readLines(here::here("massive_report_folder", filename))
}
```
and then each chunk looks like
```{r extra_file, echo=FALSE, message=FALSE, warning=FALSE, results='asis', code=insert("extra_file.R")}
```

Changing page size within Rmarkdown document

I have a very large phylogenetic tree that I'd quite like to insert into a supplementary material I'm writing using Rmarkdown and knitr. I dislike splitting trees across pages and I doubt anybody would print this out anyway so I thought I'd just have a large page in the middle of the pdf I'm generating.
The question is how do I change page size for one page in an otherwise A4 document? I'm pretty new to knitr and I've found global paper size options but I'm struggling to find ways of setting up what would be the equivalent of sections in Word.
(Update) Hi does anybody else have a suggestion? I tried the pdfpages package but this seems to result in an equally small figure on a page the size of the pdf that is being pasted in i.e. if I make a 20in by 20in pdf figure then paste the page in using \includepdf then I get a 20in by 20in page with a much smaller figure on it (the same as the \eject example above). It seems like knitr or Latex is forcing the graphics to have a specific size regardless of page size. Any ideas? Here's a reproducible example:
---
title: "Test"
output:
  pdf_document:
    latex_engine: xelatex
header-includes:
  - \usepackage{pdfpages}
---
```{r setup, include=FALSE}
require(knitr)
knitr::opts_chunk$set(echo = FALSE,
warning=FALSE,
message=FALSE,
dev = 'pdf',
fig.align='center')
```
```{r mtcars, echo=FALSE}
library(ggplot2)
pdf("myplot.pdf", width=20, height=20)
ggplot(mtcars, aes(mpg, wt)) + geom_point()
dev.off()
```
#Here's some text on a normal page, the following page is bigger but has a tiny figure.
\newpage
\includepdf[fitpaper=true]{myplot.pdf}
You should be able to use \eject like this:
---
output: pdf_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
## R Markdown
This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see <http://rmarkdown.rstudio.com>.
When you click the **Knit** button a document will be generated that includes both content as well as the output of any embedded R code chunks within the document. You can embed an R code chunk like this:
```{r cars}
summary(cars)
```
## Including Plots
You can also embed plots, for example:
```{r pressure, echo=FALSE}
plot(pressure)
```
Note that the `echo = FALSE` parameter was added to the code chunk to prevent printing of the R code that generated the plot.
\eject \pdfpagewidth=20in \pdfpageheight=20in
```{r mtcars}
library(ggplot2)
ggplot(mtcars, aes(mpg, wt)) + geom_point()
```
\eject \pdfpagewidth=210mm \pdfpageheight=297mm
Back
(I can only remember the A4 height in mm for some reason)
I have just faced the same struggle when dealing with a large phylogenetic tree.
Based on hrbrmstr's answer, I came up with the snippet below.
The trick is to make knitr generate the figure but not include it in the intermediate .tex right away (hence fig.show="hide"). Then, because figure paths are predictable, you can insert it with some latex after the code chunk.
However, the new page height is not taken into account when positioning page numbers, so they tend to be printed over your image. I have tried to overcome this behavior multiple times, but in the end I simply turned them off with \thispagestyle{empty}.
---
output: pdf_document
---
```{r fig_name, fig.show="hide", fig.height=30, fig.width=7}
plot(1)
```
\clearpage
\thispagestyle{empty}
\pdfpageheight=33in
\begin{figure}[p]
\caption{This is your caption.}\label{fig:fig_name}
{\centering \includegraphics[height=30in, keepaspectratio]{main_files/figure-latex/fig_name-1} }
\end{figure}
\clearpage
\pdfpageheight=11in

Resources