R markdown to MS Word: keep paragraphs together - r

I am creating a report in MS Word and am using a RMarkdown document. I have managed to use a reference.docx file where I adjusted the styles of titles, headers, text, and figure captions to my need. Now I would like to make sure some lines, paragraphs and pictures are kept together on the same page. Is there a way to do this?
Here's my sample code
---
title: "Test"
output:
word_document:
reference_docx: reference.docx
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
Here is some text.
### I Use This as Figure Caption Above the Figure
(Index, 2015=100, seasonally adjusted series)
![Source: World Bank](https://previews.123rf.com/images/lovjane/lovjane1610/lovjane161000009/64921112-hand-drawn-sun-with-face-and-eyes-alchemy-medieval-occult-mystic-symbol-of-sun-vector-illustration-.jpg){ width=10cm }
The last 3 lines are the ones I would like to be kept together in my MS-Word output file, i.e. Caption+(Comment)+Figure.

You can include certain properties in your style definitions - that's the best way. You'll want to test this in the Word environment to get a feel for how these work as they can be confusing.
For all paragraphs that should stay together on the same page (as long as the total length does not exceed the space available on the page):
Paragraph dialog box (P-dlg)/Line and Page Breaks tab (LPB-tab)/Keep with Next
In the Word object model corresponds to: Paragraph.KeepWithNext = true/false
For the last paragraph in this group, make sure to remove this property. This means a separate style for the last paragraph!
To force lines to stay together:
P-dlg/LPB-tab/Keep lines together
In the Word object model corresponds to: Paragraph.KeepTogether = true/false
The same commands will apply to any pictures formatted in-line with the text. You may want to define separate styles for pictures if they need special alignment or spacing.
For pictures with text-wrap formatting the trick to keeping them together with specific text is to lock the anchor to the Range of that text. This cannot be part of a style, however.

Related

R Markdown Grouping of Figures to Prevent Pagebreak

I'm having a problem with assigning LaTeX environments within an RMarkdown for-loop code-chunk.
In short, I've written an R Markdown document and a series of R-scripts to automatically generate PDF reports at the end of a long data analysis pipeline. The main section of the report can have a variable number of sections that I'm generating using a for-loop, with each section containing a \subsection heading, a datatable and plot generated by ggplot. Some of these sections will be very long (spanning several pages) and some will be very short (~1/4 of a page).
At the moment I'm just inserting a \pagebreak at the end of each for-loop iteration, but that leaves a lot of wasted space with the shorter sections, so I'm trying to "group" each section (i.e. the heading, table and chart) so that there can be several per page, but they will break to a new page if the whole section won't fit.
I've tried using a figure or minipage environment, but for some reason those commands are printed as literal text when the plot is included; these work as expected with the heading and data table, but aren't returned properly in the presence of the image.
I've also tried to create a LaTeX samepage environment around the whole subsection (although not sure this will behave correctly with multi-page sections?) and then it appears that the Markdown generated for the plot is not interpreted correctly somewhere along the way (Pandoc?) when it's within that environment and throws an error when compiling the TeX due to the raw Markdown ![]... image tag.
Finally, I've also tried implementing \pagebreak[x] and \nopagebreak[y] hints at various points in the subsection but can't seem get these to be produce the desired page breaking behaviour.
I've generated an MWE that reproduces my issues below.
I'd be really grateful for any suggestions on how to get around this, or better ways of approaching "grouping" of elements that are generated in a dynamic fashion like this?
---
title: "Untitled"
author: "I don't know what I'm doing"
date: "26/07/2020"
output:
pdf_document:
latex_engine: xelatex
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = FALSE, dev = "cairo_pdf")
```
```{r cars, results='asis'}
for (i in 1:5){
cat("\\begin{figure}")
cat(paste0("\\subsection{This is subsection ",i,"}"))
cat("\\Huge Here's some bulk text that would represent a data table... kasvfkwsvg fiauwe grfiwgiu iudaldbau iausbd ouasbou asdbva asdbaisd i iuahihai hiuh iaiuhqijdblab ihlibljkb liuglugu h uhi uhi uhqw iuh qoijhoijoijoi qwegru wqe grouw egq\\newline")
plot(mtcars$wt,mtcars[,i])
cat("\\end{figure}")
}
```
Edit to add: interestingly these figure and minipage environments seems to work as expected when executing the same example in an .Rnw using knitr... so does that narrow it down to an issue with Pandoc? Again, any help much appreciated!
What happens is that the raw TeX commands are not treated as TeX when going through Markdown. You can fix that by explicitly marking the relevant snippets as LaTeX:
for (i in 1:5){
cat("`\\begin{figure}`{=latex}")
cat(paste0("\\subsection{This is subsection ",i,"}"))
cat("\\Huge Here's some bulk text that would represent a data table... kasvfkwsvg fiauwe grfiwgiu iudaldbau iausbd ouasbou asdbva asdbaisd i iuahihai hiuh iaiuhqijdblab ihlibljkb liuglugu h uhi uhi uhqw iuh qoijhoijoijoi qwegru wqe grouw egq\\newline")
plot(mtcars$wt,mtcars[,i])
cat("`\\end{figure}`{=latex}")
}
See the generic raw attribute section in the pandoc manual for details.

How to reference two times to a single footnote in rmarkdown?

I try to reference to a single footnote in a few places in the text. However, with the code below, I've got two footnotes with the same content.
---
title: "My document"
output: html_document
---
One part of the text [^1].
Two pages later [^1].
[^1]: My footnote
Is it possible to reference more than once to a specific footnote using rmarkdown?
I had the same problem. I used html tags which worked really well. Whatever you want as superscript just put between the <\sup> like below:
<sup> text </sup>
jan 6 2023 update:
now I use quarto which is eally easy as you can just use do[^1]
[^1]footnote text
My workaround here is to just do it manually using inline latex mathmode (e.g, \(^2\) ).. Annoying, but even if they had a solution, you'd have to remember the citation number anyways...
I would suggest you to go with latex solution if you do not have many footnotes per page. By latex solution I mean:
(in Markdown, use Latex to superscript the footnote number)
First part$^1$
(and the next one)
Second part$^2$
(at the end of your text add *** to create a line across the document)
(under the line, add the text below:)
1, 2: Text for your footnote
On the other hand, there is a thread created on this specific R-Markdown bug. Maybe take a look at it, in this link.
Hope I helped somehow.

R markdown: can I insert a pdf to the r markdown file as an image?

I am trying to insert a pdf image into an r markdown file. I know it is possible to insert jpg or png images. I was just wondering if it is also possible to insert a pdf image. Thanks very much!
If you are just trying to insert an image that has been exported from, for example, some R analysis into a pdf image, you can also use the standard image options from the knitr engine.
With something like:
```{r, out.width="0.3\\linewidth", include=TRUE, fig.align="center", fig.cap=c("your caption"), echo=FALSE}
knitr::include_graphics("./images/imagename.pdf")
```
Unfortunately you can't specify the initial dimensions of your image output (fig.width and fig.height), which you would need to pre-define in your initial output, but you can specify the ultimate size of the image in your document (out.width). As noted below, however, this is limited to scaling down.
You could also of course leave out the initial directory specification if your files are in the same working directory. Just be aware of operating system differences in specifying the path to the image.
An alternative method is to use Markdown syntax noted by #hermestrismegistus on this post:
![Image Title](./path/to/image.pdf){width=65%}
This can also be collected for multiple images side-by side:
![Image Title](./path/to/image.pdf){width=33%}![Image2 Title](./path/to/image2.pdf){width=33%}![Image3 Title](./path/to/image3.pdf){width=33%}
Edit:
After working more extensively with in-text referencing, I have found that using r chunks and the include_graphics option to be most useful. Also because of the flexibility in terms of image alignment (justification).
As an example:
```{r image-ref-for-in-text, echo = FALSE, message=FALSE, fig.align='center', fig.cap='Some cool caption', out.width='0.75\\linewidth', fig.pos='H'}
knitr::include_graphics("./folder/folder/plot_file_name.pdf")
```
The reference can later be used in-text, for example, Figure \#ref(fig:image-ref-for-in-text) illustrates blah blah.
Some important things to note using this format:
You can only expand PDF images via a code chunk up to the out.width and out.height conditions set in the original .pdf file. So I would recommend setting them slightly on the larger side in your original image (just note that any chart text will scale accordingly).
The in-text reference code (in this case image-ref-for-in-text) CANNOT contain any underscores (_) but can contain dashes (-). You will know if you get this wrong by an error message stating ! Package caption Error: \caption outside float.
To stop your plots drifting to the wrong sections of your document, but in a way that unfortunately will generate some white space, the above example includes fig.pos='H'. Where H refers to "hold" position. The same can be achieved for the former Markdown option by placing a full-stop (period .) immediately after the last curly bracket.
Example:
![Image Title](./path/to/image.pdf){width=75%}.
Unfortunately, this latter option results in some unsightly full-stops. Another reason I prefer the include_graphics option.
Sorry, I found that there is a similar post before:
Add pdf file in Rmarkdown file
Basically, I can use something like below works well for the html output:
<img src="myFirstAlignment2.pdf" alt="some text" width="4200" height="4200">
And something like below works well for the pdf output:
(1)possible solution
\begin{center} <br>
\includegraphics[width=8in]{myFirstAlignment2.pdf} <br>
\end{center}
(2)possible solution
![Alt](myFirstAlignment2.pdf)
The myFirstAlignment2.pdf should be replaced with path\myFirstAlignment2.pdf if the pdf file is not in your working directory.
In relation to the comment of the best answer, there is a way to use the second option, and the output not come out tiny.
Use the following syntax below with the height being a large number. Having text in the brackets is necessary for it to work.
![Alt](./file.pdf){width=100% height=400}
None of the answers outlined worked well for me in terms of sizing the pdf, so adding another answer using the code chunk options for out.height and out.width to control the size:
```{r out.height = "460px", out.width='800px', echo=F}
knitr::include_graphics("./images/imagename.pdf")
```

Overlapping tables from grid.arrange in gridExtra

I'm trying to build an automated report that will have three charts right underneath each other without margin space between each other.
I've mocked up my problem with the following Rmd script
---
output: pdf_document
---
```{r setup, include=FALSE}
library(gridExtra)
```
```{r, echo=FALSE}
car_tbl <- tableGrob(mtcars[1:10,])
grid.arrange(car_tbl, car_tbl, car_tbl)
```
You can see how the tables overlap each other. There seems like there are actually a few issues comprising my problem.
How do I use the options of tableGrob and grid.arrange to keep the tables from overlapping.
How do I make sure nothing is cut off? In other words, how do I set the graphic to take the whole page if I need it too?
How can I re-actively shrink the text of the plot to fit on one page?
How can I set the size of the page to whatever size I want? Are there options set the knitr document to print to a pdf page of any size I want? Perhaps poster size if I need it to?
There are different angles to approach your question. Here's a working example with three tables fitting in one chunk placed on a pdf of size A2.
---
output: pdf_document
geometry: paper=a2paper
---
```{r setup, include=FALSE}
library(gridExtra)
```
```{r, echo=FALSE, fig.height=10}
car_tbl <- tableGrob(mtcars[1:10,])
grid.arrange(car_tbl, car_tbl, car_tbl)
```
To address your questions specifically,
1. How do I use the options of tableGrob and grid.arrange to keep the tables from overlapping.
Their overlap had more to do with the default fig.height set by knitr.
2. How do I make sure nothing is cut off? In other words, how do I set the graphic to take the whole page if I need it too?
Make sure the figure height and page are big enough for the table.
3. How can I re-actively shrink the text of the plot to fit on one page?
It's not trivial, but I guess it could be done. I doubt it's a good idea though (for typographical reasons): typically the font size should be set, and if the table doesn't fit, it should be split into several pages (there's an example in a link I provided).
3. How can I set the size of the page to whatever size I want? Are there options set the knitr document to print to a pdf page of any size I want? Perhaps poster size if I need it to?
It's a latex question, which in the context of Rmd documents means that you could pass these layout options to latex via rmarkdown and pandoc, typically as options for the geometry package.

knitr .Rmd -> Word document: control details of figures

I'm producing a solutions manual for a book, using .Rmd files with the following YAML header:
---
title: "DDAR: Solutions and Hints for Exercises"
date: "`r Sys.Date()`"
output:
word_document:
reference_docx: solutions-setup.docx
---
where I control the general layout of the document with the reference_docx to get an output Word document.
There will be many figures, and I'd like to set some global graphics parameters to give relatively tight bounding boxes and reasonable font sizes
in the figures without having to tweak each one from what I see in a PDF document.
I tried the following, but the par() setting doesn't seem to have any effect:
{r setup, echo=FALSE}
options(digits=4)
par(mar=c(5,4,1,1)+.1)
Instead I get images like the following in my document with larger bounding boxes than I would like and with much larger font sizes than I would like.
I know how to control all this in .Rnw files produced with LaTeX, but I
can't find how to do it in .Rmd -> Word. Is there a chunk hook I could
use? I don't think that there is an out.width chunk option that re-scales
a figure as in LaTeX.
#scoa's answer shows how to use a hook to set some graphical parameters at the beginning of each chunk. This is necessary because "by default, knitr opens a new graphics device to record plots and close it after evaluating the code, so par() settings will be discarded", i.e. graphical parameters for later chunks cannot be set in an early setup-chunk but need to be set for each chunk separately.
If this behavior is not wanted, the package option global.par = TRUE can be used:
opts_knit$set(global.par = TRUE)
Finding the correct values for the margins is sometimes quite painful. In these cases, hook_pdfcrop can help. In all chunks where the option crop = TRUE, white margins will be removed. To apply this to all chunks, use
library(knitr)
knit_hooks$set(crop = hook_pdfcrop)
opts_chunk$set(crop = TRUE)
This works for docx output as well because "when the plot format is not PDF (e.g. PNG), the program convert in ImageMagick is used to trim the white margins" (from ?hook_pdfcrop).
Note that under some circumstances, cropping plots has the side effect of sometimes apparently different "zoom" factors of plots: This happens in cases where we start with identical sized elements on two plots but larger white margins around one of the plots. If then both are resized to a fixed output width after cropping, elements on the plot with larger margins look larger. However, this is not relevant for docx output because out.width/out.height cannot be used in that case.
The knitr documentation for hooks actually uses small margins as an example of what you can do with hooks. Here is a solution (adapted from this documentation).
---
output: word_document
---
```{r setup, echo=FALSE}
library(knitr)
knit_hooks$set(small.mar = function(before, options, envir) {
if (before) par(mar=c(5,4,1,1)+.1) # smaller margin on top and right
})
opts_chunk$set(small.mar=TRUE)
```
```{r}
plot(iris$Sepal.Length)
```
Using opts_chunk$set(small.mar=TRUE) is a way to avoid passing it to every chunk in the document.
The margin appears fixed (screenshot from the docx output in libreoffice with default reference-docx).

Resources