How to break a long R markdown document

I often find that in using RStudio R markdown, the resulting HTML can get quite long and heavy. What strategies can someone use to break down or otherwise manage long documents?

Create a table of contents (TOC) to move quickly to the different sections of your document. You do it by requesting a toc in the doc header, something like this:
---
title: "MyDoc"
output:
  html_document:
    toc: yes
---
Instead of leaving the TOC stuck at the top of a long document, you can also produce a "floating TOC" by adding a custom CSS file. Have a look at this example.
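As an alternative to a custom CSS file, rmarkdown's html_document format also has a built-in toc_float option. A minimal sketch in R, assuming the rmarkdown package is installed; MyDoc.Rmd is a hypothetical file name used only for illustration:

```r
library(rmarkdown)

# Build an HTML output format whose table of contents floats beside the text.
fmt <- html_document(toc = TRUE, toc_float = TRUE, toc_depth = 3)

# Render the (hypothetical) document with that format, if it exists.
if (file.exists("MyDoc.Rmd")) {
  render("MyDoc.Rmd", output_format = fmt)
}
```

The same option can be set in the header instead, as toc_float: true under html_document.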

Splitting the HTML output into several pages is the default behavior of bookdown. See Chapter 3 of its documentation for details. In particular, look for the option split_by in 3.1.1 and 3.4.

Related

R Markdown not producing bibliographies for PDF

I am about to give up on LaTeX through Rmarkdown. I mean honestly what is the point when it is just a constant stream of errors that take an hour to solve. Anyway, the last thing that I need to solve is the bibliography. It does not appear, no matter what I do. I am not sure what esoteric knowledge is required to make this work but simply following instructions through Bookdown is apparently not enough. My header is set up like so:
output:
  pdf_document:
    toc: true
    toc_depth: 3
    citation_package: biblatex
bibliography: zika.bib
And the final section I am outputting is as follows:
# References
nocite: |
  @*
When I do this, I get "References" in my table of contents in a weird location.
And I also get the default references shown instead of the references in my bib, as well as an additional references section with the @* parameter.
I am trying to just cite all bib references for now as a test and I will add in a few more once I get that working. I will probably just cite all at the end instead of referencing throughout, although I tried in-text citations and they were not added to the references section at the end either. I have tried dozens of different ways of writing this out in the YAML header as well as changing the nocite call at the end. I often have the problem shown here, or no references and no errors or I commonly get an error saying that the bibliography cannot be built from either biblatex or biber. I'm not sure what causes one or the other as it seems to be random. I'm guessing that my .bib is somehow not being linked to the .rmd, but I'm not sure what I need to do to change this. I've tried relative and absolute paths, but this produces one of the issues I mention above. Does anyone know what I might be missing?

Output options ignored when using 'render_book' ('preamble.tex' ignored)

I'm having some trouble compiling an entire document from many Rmd files using the bookdown approach.
If I knit individual .Rmd files then 'preamble.tex' included in YAML options is taken into account.
If I render the book (with both approaches described here), then 'preamble.tex' is ignored.
To make things concrete, consider the following mwe:
preamble.tex:
\usepackage{times}
index.Rmd:
---
title: "My paper"
site: "bookdown::bookdown_site"
output:
  bookdown::pdf_document2:
    includes:
      in_header: "preamble.tex"
---
01-intro.Rmd:
# Introduction
This chapter is an overview of the methods that we propose to solve an **important problem**.
Then, by knitting 'index.Rmd' or '01-intro.Rmd' the font indicated in 'preamble.tex' is used.
However when rendering with bookdown::render_book('index.Rmd',"bookdown::pdf_book", new_session = T) it is simply ignored.
What is more, in my actual project there are other output options that end up ignored. For example, I use toc: false and it works when knitting single files, but fails when rendering the document.
In this simple example it would be okay to use a single file, but my actual project has many chapters with R chunks within each of them. Thus, building a single file doesn't seem a good idea.
I appreciate any hints on what I am missing here.
Thanks in advance.
What you are missing here is that in your YAML header, preamble.tex is included for the bookdown::pdf_document2 output format and not for bookdown::pdf_book, the format you pass to the output_format argument in bookdown::render_book(). For the same reason, other YAML options (like toc: false) do not take effect either.
Running
bookdown::render_book('index.Rmd', "bookdown::pdf_document2", new_session = T)
instead should work.
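One way to avoid this kind of mismatch is to read the format name from the YAML header rather than typing it a second time. A rough sketch, assuming the rmarkdown package is available; declared_formats is a hypothetical helper written for this example, not part of bookdown or rmarkdown:

```r
library(rmarkdown)

# Hypothetical helper: list the output formats declared in a file's YAML header.
declared_formats <- function(rmd_file) {
  out <- yaml_front_matter(rmd_file)$output
  # "output: word_document" parses to a string; nested formats parse to a named list.
  if (is.character(out)) out else names(out)
}

# Possible usage with the example project above:
# formats <- declared_formats("index.Rmd")
# bookdown::render_book("index.Rmd", output_format = formats[1], new_session = TRUE)
```

With the index.Rmd from the question, the helper should report bookdown::pdf_document2, which is the name to pass as output_format.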

rmarkdown journal article templates

I recently started to use r-markdown to prepare journal articles with the handy templates from rticles. However, we ended up submitting in Word rather than LaTeX. Collaborators prefer Word, and at the moment, not all journals accept LaTeX but all journals accept Word. Is there a place to get a collection of r-markdown templates to generate journal articles as Word documents, like rticles does for LaTeX?
Most likely you already found the option of editing the YAML:
output:
  word_document:
    reference_docx: template.docx
However, from experience, the best is to export the PDF as a word document (without figures) and add them later on.
I think what you are stating is really true and is certainly holding back R Markdown adoption.
Cheers

Hack in R Markdown or Bookdown for including LaTeX environments which appear in html or docx output?

I'd like to include LaTeX environments (e.g., algorithmic from algorithmicx, mini from optidef, dcases from mathtools, etc.) in my .Rmd bookdown file. This is no problem for pdf output. But these do not render for html or docx output.
My current hack solution:
Generate the .pdf output.
Screen shot, edit, save images of interest as png
Include images conditional on output not being LaTeX
Downsides:
Obviously doesn't scale
Images are ugly in docx and html output
Screws with figure cross-referencing
There has to be a better approach, right? I was thinking that there's a way to tell rmarkdown/LaTeX that, when rendering as pdf, certain code chunks should be saved in some image format. That way they could be added back into the document as images conditional on the output document being docx or html. Is that even possible?
UPDATE: An answer to "Standalone diagrams with TikZ" suggests an approach involving the LaTeX standalone package. But unfortunately, as discovered over at "standalone does not work with algorithms", this does not work for the algorithm environment. Any ideas?
index.Rmd
---
title: "Bookdown"
header-includes:
  - \usepackage{float}
  - \floatplacement{figure}{!htb}
  - \usepackage{algorithm}
  - \usepackage{algpseudocode}
output:
  bookdown::gitbook:
    split_by: none
  bookdown::pdf_book:
    fig_caption: yes
    keep_tex: yes
    toc: no
  bookdown::word_document2: default
site: bookdown::bookdown_site
---
```{r setup, include=FALSE, }
knitr::opts_chunk$set(echo = TRUE)
```
Hello zero
# First Chapter
Hello one
\begin{algorithm}
\caption{My Algo}
\begin{algorithmic}[1]
\State Do this.
\State Do that.
\end{algorithmic}
\end{algorithm}
```{r myalgo, echo=FALSE, eval = !knitr::is_latex_output(), fig.cap="Must have text here. For cross-referencing to work."}
knitr::include_graphics("myalgo.png")
```
Hello two.
Check out this picture: \@ref(fig:myalgo)
myalgo.png
For math, R Markdown uses MathJax, and only a subset of LaTeX is available. This subset includes the basic math macros and environments, and allows you to define new macros, but doesn't support everything necessary to let you use arbitrary LaTeX packages. See http://docs.mathjax.org/en/latest/tex.html for details.
You might be able to create an environment that looks something like algorithm or algorithmic, but it's going to be a lot of work, and likely won't look as nice.
You should probably choose between PDF output with all of LaTeX available for formatting, or some flavour of HTML output with less style. For example, you could write your algorithm as
******
**Algorithm 1**: My algo
******
1. Do this.
2. Do that.
******
and it will display as
Algorithm 1: My algo
1. Do this.
2. Do that.
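If you want to keep both outputs from a single source, a chunk with results='asis' can emit the LaTeX environment for PDF and the Markdown fallback for everything else. This is only a sketch: algo_block is a hypothetical helper written for this example, and it relies on knitr::is_latex_output() to detect the target format.

```r
# Hypothetical helper: LaTeX for PDF output, plain Markdown for HTML/docx.
algo_block <- function(latex_output = knitr::is_latex_output()) {
  if (latex_output) {
    paste(
      "\\begin{algorithm}",
      "\\caption{My Algo}",
      "\\begin{algorithmic}[1]",
      "\\State Do this.",
      "\\State Do that.",
      "\\end{algorithmic}",
      "\\end{algorithm}",
      sep = "\n"
    )
  } else {
    paste(
      "******",
      "**Algorithm 1**: My algo",
      "",
      "1. Do this.",
      "2. Do that.",
      "",
      "******",
      sep = "\n"
    )
  }
}

# Inside a chunk declared with results='asis':
# cat(algo_block())
```

The fallback will not look as polished as the algorithm environment, and it does not solve cross-referencing, but it avoids maintaining screenshots by hand.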

Knitr & Rmarkdown docx tables

When using knitr and rmarkdown together to create a word document you can use an existing document to style the output.
For example in my yaml header:
output:
  word_document:
    reference_docx: style.docx
    fig_caption: TRUE
Within this style document I have created a default table style; the goal here is to have the kable table output in the correct style.
When I knit the Word document and use style.docx, the tables are not styled according to that table style.
Using the style inspector has not been helpful so far, unsure if the default table style is the incorrect style to modify.
Example Code:
```{r kable}
n <- 100
x <- rnorm(n)
y <- 2*x + rnorm(n)
out <- lm(y ~ x)
library(knitr)
kable(summary(out)$coef, digits=2, caption = "Test Captions")
```
I do not have a stylized document I can upload for testing unfortunately.
TL;DR: Want to stylise table output from rmarkdown and knitr automatically (via kable)
Update: So far I have found that changing the 'compact' style in the docx will alter the text contents of the table automatically - but this does not address the overall table styling such as cell colour and alignment.
Update 2: After more research and creation of styles I found that knitr seems to have no problem accessing paragraph styles. However table styles are not under that style category and don't seem to apply in my personal testing.
Update 3: Dabbled with the ReporteRs package. Whilst it was able to produce the tables as desired, the syntax required to do so is laborious. I would much rather the style be applied automatically.
Update 4: You cannot change TableNormal style, nor does setting a Table Normal style work. The XML approach is not what we are looking for. I have a VBA macro that will do the trick, just want to remove that process if possible.
This is essentially a combination of the answer that recommends TableNormal, this post on rmarkdown.rstudio.com and my own experiments to show how to use a TableNormal style to customize tables like those generated by kable:
RMD:
---
output:
  word_document
---
```{r}
knitr::kable(cars)
```
1. Click "Knit Word" in RStudio. → The document opens in Word, without any custom styles yet.
2. In that document (not in a new document), add the required styles. This article explains the basics. The key is not to apply direct formatting but to modify the styles. See this article on support.office.com on Style basics in Word.
3. Specifically, to style a table you need to add a table style. My version of Word is non-English, but according to the article linked above table styles are available via "the Design tab, on the Table Tools contextual tab".
4. Choose TableNormal as style name and define the desired styles. In my experiments most styles worked, however some did not. (Adding a color to the first column and making the first row bold was no problem; highlighting every second row was ignored.) The last screenshot in this answer illustrates this step.
5. Save the document, e.g. as styles.docx.
6. Modify the header in the RMD file to use the reference DOCX (see here; don't screw up the indentation, it took me 10 minutes to find this mistake):
   ---
   output:
     word_document:
       reference_docx: styles.docx
   ---
7. Knit to DOCX again; the style should now be applied.
Following the steps I described above yields this output:
And here a screenshot of the table style dialog used to define TableNormal. Unfortunately it is in German, but maybe someone can provide an English version of it:
As this does not seem to work for most users (anyone but me …), I suggest we test this systematically. Essentially, there are 4 steps that can go wrong:
Wrong RMD (unlikely).
Differences in the initially generated DOCX.
Differences in how the TableNormal style is saved in the DOCX.
Differences in how the reference DOCX is used to format the final DOCX.
I therefore suggest using the same minimal RMD posted above (full code on pastebin) to find out where the results start to differ:
My initially generated DOCX.
The same document with TableNormal added: reference.docx
The final document.
The three files are generated on the following system: Windows 7 / R 3.3.0 / RStudio 0.99.896 / pandoc 1.15.2 / Office 2010.
I get the same results on a system with Windows 7 / R 3.2.4 / RStudio 0.99.484 / pandoc 1.13.1 / Office 2010.
I suppose the most likely culprits are the pandoc and the Office versions. Unfortunately, I cannot test other configurations at the moment. Now it would be interesting to see the following: For users where it does not work, what happens …
… if you start from my initial.docx?
If that does not work, what if you use my reference.docx as reference document?
If nothing works, are there eye-catching differences in the generated XML files (inside the DOCX container)? Please share your files and exact version information.
With a number of users running these tests it should be possible to find out what is causing the problems.
This was actually a known issue. Fortunately, it was fixed in pandoc v2.0 and later releases.
I have tested a newer version and found that there is a newly added hidden style called "Table". Following @CL.'s suggestions, but changing the "Table" style in reference.docx instead, works now.
In addition, look at this entry of pandoc's v2.0 release notes:
Use Table rather than Table Normal for table style (#3275). Table Normal is the default table style and can’t be modified.
As of 2021, I could not get any of the other suggested answers to work.
However, I did discover the {officedown} package, which, amongst other things, supports the styling of tables in .docx documents. You can install {officedown} with remotes::install_github("davidgohel/officedown")
To use {officedown} to render .Rmd to .docx you must replace
output:
  word_document
in your document header with
output:
  officedown::rdocx_document
In addition to this the {officedown} package must be loaded in your .Rmd.
As with the word_document output format, {officedown} allows us to use styles and settings from template documents, again with the reference_docx parameter.
With a reference document styles.docx, a minimal example .Rmd may look like:
---
date: "2038-01-19"
author: "The Reasonabilists"
title: "The end of time as we know it"
output:
  officedown::rdocx_document:
    reference_docx: styles.docx
---
```{r setup, include = FALSE}
# Don't forget about me: I'm important!
library("officedown")
```
{officedown} allows us to go one step further and specify the name of the table style to use in the document's front matter. This table style could be a custom style we created in styles.docx, or it could be one of Word's in-built styles you prefer.
Let's say we created a style My Table:
We could tell {officedown} to use this table style in our front matter as:
output:
  officedown::rdocx_document:
    reference_docx: styles.docx
    tables:
      style: My Table
Putting this all together, knitting the minimal .Rmd:
---
date: "2038-01-19"
author: "The Reasonabilists"
title: "The end of time as we know it"
output:
  officedown::rdocx_document:
    reference_docx: styles.docx
    tables:
      style: My Table
---
```{r setup, include = FALSE}
# Don't forget about me: I'm important!
library(officedown)
```
```{r}
head(mtcars)
```
Resulting in a .docx document which looks like:
TableNormal doesn't work for me either.
On my Dutch version of Word 2016 (Office 365), I found out that I could markup tables with the style Compact.
Input (refdoc.docx contains the Compact style):
---
title: "Titel"
subtitle: "Ondertitel"
author: "`r Sys.getenv('USERNAME')`"
output:
  word_document:
    toc: true
    toc_depth: 2
    fig_width: 6.5
    fig_height: 3.5
    fig_caption: true
    reference_docx: "refdoc.docx"
---
And RMarkdown:
# Methoden {#methoden}
```{r}
# Call kable via its namespace so the chunk works without library(knitr)
knitr::kable(cars)
```
Output:
You need to have a reference_docx: style.docx which has the "Table" style in it (see @Liang Zhang's explanation and links above).
Create a basic reference document using pandoc (source). On the command line (or cmd.exe on Windows) run:
pandoc -o custom-reference.docx --print-default-data-file reference.docx
In the newly created custom-reference.docx file, find the table it contains (a basic 1 row table with a caption).
While the table is selected, click "Table Design" and find "Modify Table Style":
Modify the style of the table as you wish and use this reference document in your RMD document (see the first answer by @CL.).
Using this reference document, you can also change the table and figure caption styles.
I was able to get my word output to use a default table style that I defined in a reference .docx.
Instead of 'TableNormal', the table style it defaulted to was 'Table'.
I discovered this by knitting an rmarkdown with a kable.
---
date: "December 1, 2017"
output:
  word_document:
    reference_docx: Template.docx
---
`r knitr::kable(source)`
Then I took a look at that generated document's XML to see what style it had defaulted to.
library(XML)
docx.file <- "generated_doc.docx"
## unzip the docx converted by Pandoc (utils::unzip needs no external unzip binary)
unzip(docx.file, exdir = "temp_dir")
document.xml <- file.path("temp_dir", "word", "document.xml")
doc <- xmlParse(document.xml)
## collect every table style node referenced in the document body
tblStyle <- getNodeSet(xmlRoot(doc), "//w:tblStyle")
tblStyle
I defined the 'Table' style to put some color and borders in the reference docx. This works for one standard table style throughout the document; I haven't found a way to use different styles within the same document.
This stayed true even after I opened the reference doc and edited it.
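The same check can be done without the XML package. A small sketch in base R that reads word/document.xml directly from the archive and lists the table styles it references. Both function names are hypothetical, and the regular expression assumes the usual w:tblStyle w:val="..." markup, so treat it as a quick diagnostic rather than a robust parser:

```r
# Hypothetical helper: pull table style names out of raw document.xml text.
extract_table_styles <- function(xml_text) {
  pattern <- '<w:tblStyle[^>]*w:val="([^"]*)"'
  hits <- regmatches(xml_text, gregexpr(pattern, xml_text, perl = TRUE))[[1]]
  # Keep only the captured style name from each match, without duplicates.
  unique(sub(pattern, "\\1", hits, perl = TRUE))
}

# Hypothetical wrapper: read word/document.xml straight from the .docx archive.
docx_table_styles <- function(docx_file) {
  con <- unz(docx_file, "word/document.xml")
  on.exit(close(con))
  xml_text <- paste(readLines(con, warn = FALSE, encoding = "UTF-8"), collapse = "")
  extract_table_styles(xml_text)
}

# Possible usage on a knitted file:
# docx_table_styles("generated_doc.docx")
```

Whatever name this reports is the style to define in the reference docx.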
