Converting Rmarkdown to PDF without RStudio

I would like to convert a *.Rmd document to PDF without RStudio being available.
Current approach
The current approach has the following steps:
The *.Rmd document is passed to knitr: knit(input = "report.Rmd")
The resulting md file is converted via pandoc:
# Convert
pandoc --smart --to latex \
--latex-engine pdflatex \
-s report.md \
-o report.PDF
Problems
This results in the following problems. The top section of the Rmarkdown document:
---
title: "Report Title"
author: "Person"
output: pdf_document
classoption: landscape
---
does not render as intended, and all text is centered, whereas I would like it to be left-aligned.
Possible approach
I would like to make use of the rmarkdown::render; however, despite setting RSTUDIO_PANDOC (as discussed here), the command fails on pandoc not being available.
Desired outcome
I don't care much whether the mechanism used relies on rmarkdown::render. What I want to achieve is:
Landscape page layout across all pages
Left-aligned text
Ability to exercise minimum control over the document by controlling default fonts
Ideally, I would like to do as much as possible in the *.Rmd file, without the need to add parameters to the pandoc command.
Updates, following comments
I'm working on Linux and pandoc is installed; I can execute the pandoc command, pass files to it and generate exports with no problems. It only fails when called through rmarkdown::render.
Concerning the hooks and *.Rmd files, this is what I'm trying to understand, as I see that the first section of my *.Rmd file is ignored. The current process looks as follows:
*.Rmd (not much in it, just title section and dummy text and code that renders but wrongly justified) >
*.R file running one line, knit(input = "report.Rmd") >
*.sh file running pandoc command and generating PDF
Concerning:
if all that is in place, it is indeed just a call to
rmarkdown::render(...)
The rmarkdown::render(...) fails:
Error: pandoc version 1.12.3 is required and was not found ...
However:
> rmarkdown::pandoc_available()
[1] TRUE
and:
$ pandoc -v
pandoc 1.9.4.1 (...)
The RSTUDIO_PANDOC points to pandoc.

A few things:
"the command fails on pandoc not being available." Well, you must have pandoc installed in order to call it, but you didn't say what OS you have. On Linux it is pretty trivial to install pandoc from the package manager; otherwise jgm has binaries for you on the site; it "should" be similar on OS X
for different styling you need to modify the LaTeX code, which you can do via numerous hooks to include macro files; see the RMarkdown cheat sheets for detail
if you want to exercise more control, you can supply your own template; I have done so in the tint package (which is also on CRAN)
if all that is in place, it is indeed just a call to rmarkdown::render(...)

Error: pandoc version 1.12.3 is required and was not found
I think the error says it plainly: you need pandoc 1.12.3 and you have pandoc 1.9.4.1.
I do not know, however, why such a specific version is required.
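The mismatch can also be checked from the shell by comparing the two version strings directly. The sketch below assumes that 1.12.3 is a minimum version rather than an exact one, and that `sort -V` (GNU version sort) is available; the function name `version_at_least` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeed if version $1 is greater than or equal to $2.
# Relies on `sort -V`, which orders dotted version strings numerically.
version_at_least() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

required="1.12.3"     # version named in the rmarkdown error message
installed="1.9.4.1"   # version reported by `pandoc -v` in the question

if version_at_least "$installed" "$required"; then
    echo "pandoc $installed satisfies the $required requirement"
else
    echo "pandoc $installed is older than the required $required"
fi
```

With the versions from the question this takes the second branch, since 9 sorts before 12 when the components are compared as numbers.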

Related

Adding a new bibliography style with R Markdown and TinyTex

I'm writing a paper using R Markdown and TinyTex, using Biblatex for referencing. It works fine with default referencing styles, but I need to add a custom bibliography and citation style for the journal I'm writing for.
I need to follow the Unified Stylesheet for Linguistics, for which there is a Biblatex implementation available on Github here, containing a .bbx and .cbx file.
I've tried adding those .bbx and .cbx files to my local copy of TinyTex, inside Library/TinyTex/texmf-local/tex/latex/biblatex. My YAML header includes:
output:
  pdf_document:
    citation_package: biblatex
biblatexoptions: [bibstyle=biblatex-sp-unified, citestyle=sp-authoryear-comp]
When I knit the document, I get the following error:
tlmgr search --file --global '/biblatex-dm.cfg'
! Package keyval Error: bibstyle undefined.
I don't have a biblatex-dm.cfg file (nor do I really understand what that would be). I would have thought the .bbx and .cbx files would be sufficient, based on the regular installation instructions in the style's Github repo.
Where should I put .bbx and .cbx files, so that tlmgr can find them? And/or what additional steps do I need to take to use this style with my paper?
====================================================================
UPDATE: The problem seems to be coming from the Pandoc LaTeX template that R Markdown uses.
Setting aside R Markdown, I created a smaller minimal LaTeX example:
main.tex
references.bib
Where main.tex is:
\documentclass{article}
\usepackage[utf8]{inputenc}
\usepackage[bibstyle=biblatex-sp-unified,citestyle=sp-authoryear-comp]{biblatex}
\addbibresource{references.bib}
\begin{document}
Something something \citep{darwin_origin_1859}.
\printbibliography
\end{document}
And references.bib is:
@book{darwin_origin_1859,
location = {London},
title = {On the Origin of Species by Means of Natural Selection},
publisher = {J. Murray},
author = {Darwin, Charles},
date = {1859}
}
I had success compiling this example using the sequence of commands pdflatex, biber, pdflatex, pdflatex. Thus it seems my local TeX installation knows about the biblatex-sp-unified.bbx and sp-authoryear-comp.cbx files I added and can use them just fine.
Subsequently, I created an equivalent minimal R Markdown document with the YAML header:
title: "Untitled"
output:
  pdf_document:
    citation_package: biblatex
bibliography: references.bib
biblatexoptions: [bibstyle=biblatex-sp-unified, citestyle=sp-authoryear-comp]
and body:
Something something [@darwin_origin_1859].
This time, I got the same old error message from before:
tlmgr search --file --global '/biblatex-dm.cfg'
! Package keyval Error: bibstyle undefined.
This would seem to suggest that the problem is caused by something in Pandoc's LaTeX template, but I don't know what.
Just to confirm that it's definitely the Pandoc template and not my own installation/setup, I took the .tex file that gets produced when I knit the minimal R Markdown example above, and tried to compile it in Overleaf (with biblatex-sp-unified.bbx and sp-authoryear-comp.cbx files added). I reproduced the same error.
Although I think I've localised the problem, I'd still very much like to understand what and where the problem is in the Pandoc template. I'd also be keen to hear if anyone has any fixes (other than just using a different template or writing my own).
UPDATE: This seems to be an issue with using an out-of-date version of R Markdown and/or Pandoc.
I was using rmarkdown package v.1. At time of writing, the most up-to-date version is 2.1.
I updated all my packages and updated Rstudio (which currently ships with Pandoc v2.3.1) and no longer experience problems. I also upgraded R (from 3.5.something to 3.6.2) and did a fresh re-install of tinytex while I was at it, but I'm not sure whether those things had an effect for this particular problem.
Now, when I put biblatexoptions: [bibstyle=biblatex-sp-unified, citestyle=sp-authoryear-comp] in my YAML header, it's correctly converted into the LaTeX command \usepackage[bibstyle=biblatex-sp-unified,citestyle=sp-authoryear-comp]{biblatex}, rather than the \ExecuteBibliographyOptions command as described below.
Ralf Stubner initially suggested I check my R Markdown/Pandoc versions in the comments. Please give his comments an upvote if you find them useful as well.
Problem recap:
I'm writing a document in R Markdown and I have a particular referencing style that I'd like to use with biblatex. I have a .bbx and .cbx file defining the style, available on Github (linked above). The problem is that the document fails to compile, saying biblio/citation styles are undefined (even when the style files are in the project folder itself).
I've found that the problem was caused by the way I was passing options to biblatex. In my YAML Header, the line:
biblatexoptions: [bibstyle=biblatex-sp-unified, citestyle=sp-authoryear-comp]
gets converted to the latex command:
\ExecuteBibliographyOptions{bibstyle=biblatex-sp-unified,citestyle=sp-authoryear-comp}
I'm not sure why, but when this command is included, it produces the errors I was observing.
Installing new Biblatex style:
I'm finding that TeX doesn't know about the .bbx and .cbx files when they're in my ~/Library/TinyTex/texmf-local/tex/latex/biblatex directory (which is where I expected to put them based on the Github installation instructions).
To get the referencing style recognised by the system, I placed .bbx and .cbx files inside ~/Library/TinyTex/texmf-dist/tex/latex/biblatex/bbx and ~/Library/TinyTex/texmf-dist/tex/latex/biblatex/cbx respectively. Then, in the terminal, I ran sudo mktexlsr.
(Alternatively, for use only with a particular document, the .bbx and .cbx files could simply be kept in the project directory with the R Markdown file)
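The installation steps above can be sketched as a small shell helper. This is only an illustration: the function name `install_biblatex_style` is hypothetical, the tree root is passed in rather than assumed, and whether `mktexlsr` needs `sudo` depends on who owns the TeX tree.

```shell
#!/bin/sh
# Hypothetical helper: copy biblatex style files into a TeX tree.
#   $1      root of the TeX tree (e.g. ~/Library/TinyTex/texmf-dist, as above)
#   $2...   .bbx and .cbx files to install
install_biblatex_style() {
    texmf_root=$1
    shift
    mkdir -p "$texmf_root/tex/latex/biblatex/bbx" \
             "$texmf_root/tex/latex/biblatex/cbx"
    for style_file in "$@"; do
        case "$style_file" in
            *.bbx) cp "$style_file" "$texmf_root/tex/latex/biblatex/bbx/" ;;
            *.cbx) cp "$style_file" "$texmf_root/tex/latex/biblatex/cbx/" ;;
            *)     echo "skipping $style_file (not .bbx or .cbx)" >&2 ;;
        esac
    done
}

# Example (not run here):
#   install_biblatex_style ~/Library/TinyTex/texmf-dist \
#       biblatex-sp-unified.bbx sp-authoryear-comp.cbx
#   sudo mktexlsr                          # refresh the filename database
#   kpsewhich biblatex-sp-unified.bbx      # prints the path if TeX can find it
```

Running `kpsewhich` afterwards is a quick way to confirm that TeX can actually locate the style before knitting again.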
Original hacky answer (but see update above):
Instead of using biblatexoptions in the YAML header of the R Markdown document, I simply knitted it with citation_package: biblatex (and no extra options). I also added keep_tex: yes. Then, I opened the resulting tex file, found the \usepackage{biblatex} command and added the desired options, so it read \usepackage[bibstyle=biblatex-sp-unified,citestyle=sp-authoryear-comp]{biblatex}.
Finally, I ran pdflatex and biber on the tex file in the terminal. Clearly far from ideal, but it will technically produce the desired output.
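For anyone repeating this workaround, the manual edit could be scripted roughly as follows. It is a sketch that assumes the kept .tex file contains a bare \usepackage{biblatex} line exactly as described; the function name `add_biblatex_options` and the file name report.tex are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: add options to a bare \usepackage{biblatex} line.
#   $1  path to the .tex file kept by keep_tex: yes
#   $2  comma-separated biblatex options (must not contain "/" or "&",
#       which are special in the sed replacement)
add_biblatex_options() {
    tex_file=$1
    options=$2
    tmp_file=$(mktemp) || return 1
    # Write to a temporary file first; in-place sed flags differ between
    # GNU and BSD sed, so they are avoided here.
    sed "s/\\\\usepackage{biblatex}/\\\\usepackage[$options]{biblatex}/" \
        "$tex_file" > "$tmp_file" &&
        cat "$tmp_file" > "$tex_file"
    status=$?
    rm -f "$tmp_file"
    return $status
}

# Example (not run here):
#   add_biblatex_options report.tex \
#       "bibstyle=biblatex-sp-unified,citestyle=sp-authoryear-comp"
#   pdflatex report && biber report && pdflatex report && pdflatex report
```

If the line in the .tex file already carries options or is formatted differently, the substitution simply leaves the file unchanged.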

How to add a new language for highlight in bookdown

I'm writing a document in rmarkdown with the thesisdown template.
Related to the issue thesisdown-41: how can I add a new language for highlighting which is currently not supported?
The project mentioned in the link is derived from bookdown
Under the hood, bookdown uses pandoc for transforming markdown to HTML/PDF/.... From pandoc's manual at http://pandoc.org/MANUAL.html#syntax-highlighting we get:
The library used for highlighting is skylighting.
The list of available languages can be retrieved with pandoc --list-highlight-languages
Slightly off topic, but I just worked out how to do this in RMarkdown rather than Bookdown. I suspect you'll need this and maybe a little more.
Passing extra arguments to Pandoc via the YAML front-matter:
output:
  html_document:
    highlight: haddock
    pandoc_args: ["--syntax-definition", "cobol.xml"]
Obtain the XML syntax definition file from somewhere (or create it). I got my COBOL one from:
wget http://kde.6490.n7.nabble.com/attachment/1163657/0/cobol.xml.gz
The syntax of the highlighting file is the one used by the Kate project in KDE.
Obtain the prerequisite language.dtd file; this is some deep dependency within pandoc.
wget https://raw.githubusercontent.com/jgm/highlighting-kate/master/xml/language.dtd
I've just added the two files to my git repo, plus the YAML lines to my RMarkdown, and everything then worked on other developers' machines.
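Before adding a syntax definition it may be worth checking whether pandoc already supports the language. The sketch below wraps the --list-highlight-languages flag mentioned earlier; it assumes a pandoc recent enough to have that flag, and the function name `highlight_supported` and the PANDOC override variable are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeed if pandoc already highlights language $1.
# Set PANDOC to use a pandoc binary that is not first on the PATH.
highlight_supported() {
    # -i ignore case, -x match the whole line, -F treat the name literally
    "${PANDOC:-pandoc}" --list-highlight-languages | grep -qixF "$1"
}

# Example (not run here):
#   highlight_supported cobol || echo "cobol needs --syntax-definition"
```

If the check fails, the --syntax-definition route described above is the way to add the language.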

knit HTML does not save html in vignettes/

So I have a vignette, vignettes/test-vignette3.Rmd:
---
title: "Sample Document"
output:
  html_document:
    highlight: kate
    theme: spacelab
    toc: yes
  pdf_document:
    toc: yes
---
Header
=========
When I hit the knit HTML button, I get the following:
processing file: test-vignette3.Rmd
output file: test-vignette3.knit.md
Output created: /tmp/RtmpKVpegL/preview-5ef42271c0d5.dir/test-vignette3.html
However, if I copy this file to inst/doc and hit the knit HTML button, I get:
processing file: test-vignette3.Rmd
output file: test-vignette3.knit.md
Output created: test-vignette3.html
My questions are:
How do I get RStudio to save the output from knit HTML on vignettes/test-vignette3.Rmd to the vignettes directory?
How do I get RStudio to not delete test-vignette3.knit.md during the knit HTML procedure? (I'd like to have the .md so people can read it on my github repo.)
I'm running RStudio version 0.98.836, rmarkdown version 0.1.98 and knitr version 1.5.
Actually you should not keep the .html output under vignettes/, because the vignette output is supposed to be generated by R CMD build. R may fail to recompile your vignettes if the HTML output files are already there when you build the source package, which means you are likely to see old (and possibly wrong) results because the HTML file was not generated from the latest version of the .Rmd file. Therefore RStudio intentionally avoids writing the HTML files in the vignettes directory.
If you choose to ignore the warning above, you can certainly run rmarkdown::render('your-vignette.Rmd') in the R console.
For the second question, I do not recommend doing that either, because Github renders the markdown to HTML differently (compared to the Pandoc conversion done through the rmarkdown package). Normally the package vignettes are shown on CRAN; see, for example, the knitr page on CRAN. However, because the rmarkdown package is not on CRAN yet, you cannot use the vignette engine knitr::rmarkdown at the moment (I guess we are not too far away from the CRAN release now). You can consider pushing the HTML files to Github pages, though.

Pandoc conversion of markdown to latex with default filename

I'm using the R package knitr to generate a markdown file test.md. This file is then processed by pandoc to produce a variety of output formats, such as html and pdf. Because I want to use bibtex when generating the pdf through latex, I believe I have to tell pandoc to stop at the intermediate latex output, and then run bibtex and pdflatex myself (twice). Here's where I found a slight annoyance in my workflow: the only way I found for pandoc to keep the intermediate tex file, and not go all the way to the pdf, was to specify a hard-coded filename through the -o option with a .tex extension. This is problematic for me because I'm using a config file to run pandoc('test.md', "latex", "config.pandoc") via knitr with options, which I would like to keep generic without hard-coded output filename:
format: latex
o: test.tex
s:
S:
biblio: refs.bib
biblatex:
template: 'template.tex'
default-image-extension: pdf
which in turn becomes the following command for pandoc,
pandoc -s -S --biblio=refs.bib --default-image-extension=pdf --biblatex --template='template.tex' -f markdown -t latex -o test.tex 'test.md'
If I skip the o: test.tex option, pandoc produces a pdf and doesn't keep the intermediate latex file. How can I keep the tex file, without specifying this hard-coded filename?
To solve this problem on my side, I added a new argument ext to the pandoc() function. It is available on Github now (knitr development version 1.3.6). You can override the default file extension, e.g.
library(knitr)
pandoc(..., ext = 'tex')
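If pandoc is called from a shell script rather than through knitr, a similar effect can be had by deriving the output name from the input name. This is a sketch, not part of knitr: the function names `tex_name_for` and `md_to_tex` are hypothetical, and any extra pandoc options are simply passed through.

```shell
#!/bin/sh
# Hypothetical helper: map an input such as test.md to test.tex.
# A name without the .md suffix just gets .tex appended.
tex_name_for() {
    printf '%s\n' "${1%.md}.tex"
}

# Hypothetical wrapper: convert Markdown to standalone LaTeX, deriving the
# output name from the input; remaining arguments go to pandoc unchanged.
md_to_tex() {
    input=$1
    shift
    pandoc -s -f markdown -t latex "$@" -o "$(tex_name_for "$input")" "$input"
}

# Example (not run here):
#   md_to_tex test.md --biblatex --bibliography=refs.bib --template=template.tex
```

Because the output ends in .tex, pandoc stops at the LaTeX stage and the script stays free of hard-coded file names.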

pandoc mmd_title_block appears not to load

I am new to pandoc and am attempting to use it to convert some simple mmd files to docx. These mmd files contain an mmd-style title block in the following form:
Author: Author_name
Title: Title_name
Date: Date_name
I prefer this style to the pandoc style title blocks, so I would like to keep them in the multimarkdown style. The pandoc documentation indicates that there is an extension that will allow me to use them, but when I attempt to use the extension it has no effect on the output. I have tried many permutations of the command to no avail, but an example looks like this:
pandoc -f markdown-pandoc_title_block+mmd_title_block -o test.docx testinput.txt
If I convert the title block to use pandoc's style, the output properly converts the title blocks to the correct format in the resulting Word file, so I know the reference file is okay. Also, when I keep the title block in pandoc's style but use the markdown-pandoc_title_block command, it properly ignores the title block, so I know the problem is not in the disabling of pandoc title blocks.
Suggestions on what I might be doing wrong?
If you upgrade to Pandoc 1.11.1 and try running the following command, it should work fine:
pandoc -f markdown_mmd -t docx test.md -o test.docx
It preserves title, author and date fields.
