pandoc skip latex environment - docx

I'm writing mainly in LaTeX, but some co-authors prefer MS Word. To facilitate their work a bit, I would like to convert the .tex file (or the .pdf) to a .docx. The formatting does not need to be perfect, but all of the text, equations, figures etc should be perfectly readable.
I'm currently thinking of taking the .tex document, replacing all the essential stuff and then letting Pandoc do its magic. For this I would preferably implement my additions as a Pandoc filter. E.g., my tikz pictures would be converted to png using the tikz.py filter provided with Pandoc. The problem I'm facing with this approach is that Pandoc tries to interpret the tikz environment upon conversion from tex into its internal language, and the filters take this internal language as an input. The result is that the tikz code is lost. Is there a way to tell Pandoc to leave any tikzpicture environments alone?
Edit:
See the MWE below:
MWE.tex contents:
\documentclass{article}
\usepackage{tikz}
\begin{document}
\begin{tikzpicture}
\draw (0,0) -- (2,2);
\end{tikzpicture}
\end{document}
Output of pandoc -t native MWE.tex
[Para [Str "(0,0)",Space,Str "\8211",Space,Str "(2,2);"]]
The \draw command has completely disappeared as you can see.

I found that pandoc does not skip code encapsulated in \iffalse ... \fi (whereas LaTeX itself does), so you can redefine the tikzpicture environment there (or in any other way you might like):
\documentclass{article}
\usepackage{tikz}
\iffalse
\renewenvironment{tikzpicture}%
{\par---start tikzpicture---\\}%
{\\---end tikzpicture---\par}
\renewcommand{\node}{node:}
\fi
\begin{document}
\begin{tikzpicture}
\node {foo};
\end{tikzpicture}
\end{document}
With pandoc 2.5 this results in a docx file containing:
—start tikzpicture—
node:foo;
—end tikzpicture—
This feels very wrong, and I wish I knew a nicer way.
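A possible alternative, in the spirit of the preprocessing idea from the question: pull every tikzpicture environment out of the .tex source before Pandoc sees it and leave an \includegraphics placeholder behind. The following is only a minimal Python sketch under a few assumptions: the function name extract_tikz and the tikz-figure-N.png file names are made up for illustration, nested or commented-out environments are not handled, and rendering the extracted code to PNG is left to a separate step.

```python
import re

# Matches one complete tikzpicture environment, including \begin and \end.
TIKZ_ENV = re.compile(
    r"\\begin\{tikzpicture\}.*?\\end\{tikzpicture\}", re.DOTALL
)


def extract_tikz(tex_source, stem="tikz-figure"):
    """Replace each tikzpicture with an \\includegraphics placeholder.

    Returns the rewritten source and a dict that maps each (hypothetical)
    image file name to the TikZ code that still has to be rendered into it.
    """
    pictures = {}

    def replace(match):
        name = f"{stem}-{len(pictures) + 1}.png"
        pictures[name] = match.group(0)
        return "\\includegraphics{" + name + "}"

    return TIKZ_ENV.sub(replace, tex_source), pictures


# The MWE from the question.
mwe = r"""\documentclass{article}
\usepackage{tikz}
\begin{document}
\begin{tikzpicture}
\draw (0,0) -- (2,2);
\end{tikzpicture}
\end{document}
"""

rewritten, pictures = extract_tikz(mwe)
print(rewritten)
```

The rewritten source can then be handed to Pandoc once each extracted snippet has been compiled to the image file named in its placeholder.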

Related

Rmd to PDF compiling error: Package geometry \paperwidth (0.0pt) too short

I am writing a paper in R markdown and need to format it with this .cls file supplied by an academic journal.
A minimal .tex file compiles perfectly well with the above cls file.
My .tex file (compiled on ShareLaTeX with clv3.cls saved in same directory):
\documentclass[shortpaper]{clv3}
\usepackage[utf8]{inputenc}
\title{Paper title}
\author{Name Surname}
\date{May 2018}
\begin{document}
\maketitle
\section{Introduction}
Some text.
\end{document}
However, a comparable minimal document in R markdown, using the same cls file, fails to compile in RStudio, with the following error: ! Package geometry Error: \paperwidth (0.0pt) too short.
My Rmd file (with clv3.cls file saved in same directory):
---
title: "Paper title"
author: "Name Surname"
documentclass: clv3
classoption: shortpaper
output: pdf_document
---
# Introduction
Some text.
Why is this error induced when I try to use this class file with an R markdown document and how may I fix it?
I've tried manually specifying a pagewidth setting in the YAML header, but I don't really know what I'm doing. This seems undesirable anyway, since the normal LaTeX document works fine without it (and surely page width is something that should be specified by a journal, not manually overwritten by an author).
I do not know where exactly the clv3.cls class and the default pandoc template clash. However, that template does so many things that do not make sense when writing with a specific style that it is best to use your own template. Using clv3-template.tex
\documentclass[shortpaper]{clv3}
\usepackage[utf8]{inputenc}
$if(title)$
\title{$title$}
$else$
\title{}
$endif$
$if(author)$
\author{$for(author)$$author$$sep$ \\ $endfor$}
$else$
\author{}
$endif$
\begin{document}
$if(title)$
\maketitle
$endif$
$body$
\end{document}
together with
---
title: "Paper title"
author: "Name Surname"
output:
  pdf_document:
    template: clv3-template.tex
---
# Introduction
Some text.
should be a good starting point.
The accepted answer works perfectly for the minimal example presented. However, it breaks again pretty quickly as the document is made more complex (for example, inserting a bibliography and in-text citations). I'd like to expand a little on my solution for the potential benefit of any future readers, as I found it a bit of a steep learning curve:
The issue here is that Pandoc has a LaTeX template, which it uses to produce PDF documents. This is separate from a .cls class file, which defines a document class. Like Ralf Stubner says, something about my particular class file was not cooperating with Pandoc's default template. This is probably super basic and obvious to many, but I had not appreciated this extra step nor understood the distinction between these files.
If one does not wish to deal with raw LaTeX, it seems there are quite a few templates out there for various kinds of documents (see, for example, the rticles package). Using R Markdown to produce a PDF document in a particular custom format (such as for a particular journal) will require construction of a LaTeX template, however. This can be done in either of two ways:
Tinkering with an existing template until you get what you need, either by finding Pandoc's default template and starting from there (see comment by user2554330 for location) or by cloning someone else's template on Github etc.
Writing a template from scratch. In this case, Ralf Stubner's minimal example above plus this section of the Pandoc manual will be informative.
In my case, I went with the latter option. I have saved my eventual template as an R package which can be installed using devtools::install_github("JaydenM-C/CLtemplate"). So if anyone else would ever like to author a document for Computational Linguistics using this particular document style, this may save you some time.

Include TikZ code in bookdown figure environment

I'd like to add a TikZ figure to a bookdown document in order to include some fancy graphics.
My primary output format is LaTeX which means that I could essentially just include the TikZ graphics verbatim in the Rmarkdown file and it would render fine. However, two problems are haunting me:
I'd like for the TikZ graphics to be part of a figure environment (for the numbering, caption etc).
I'd like to be able to render the same code to both PDF (LaTeX) and Gitbook (HTML).
Right now I have the following chunk which nicely produces the relevant graph as a figure when I render to pdf.
```{r, echo=FALSE, engine='tikz', out.width='90%', fig.ext='pdf', fig.cap='Some caption.'}
\begin{tikzpicture}[scale=.7]
\draw [fill=gray!30,very thick] (0,-1) rectangle (5,1);
\draw [very thick] (5, 0) -- (13,0);
\node [below] at (2,-1) {\large Hello};
\node [below, align=center] at (0,-1) {\large Two\\ lines};
\end{tikzpicture}
```
However, there are two problems with the code:
I do not get any output when rendering to gitbook (using knitr and bookdown). I do get the figure caption, however, and if I render to html_document then it works too and I can see the graph.
For PDF the text is rendered using the computer modern font. I'd really like to change this, and the main font in the LaTeX document has already been set to something else. However, because the code is rendered locally by the TikZ engine and then inserted, it is not part of the full LaTeX document. Can I add some LaTeX options, packages etc. that are included by the TikZ engine before the code is rendered?
If there are other ways to include the TikZ code as part of a figure environment then I'd be happy to know.
Update: I guess the second point could be fixed by setting engine.opts = list(template = "latex/tikz2pdf.tex") where the necessary setup for LaTeX is included in the tikz2pdf.tex file. That file is read using LaTeX but I'd like to use xelatex to parse the file since I'm using the fontspec LaTeX package. Can that be changed in any way?
I think I found an answer to both of my questions. It did take - as Yihui pointed out - quite some time. I'm including the answer here in case someone else turns out to need this (or myself at a later point).
Re 1) Render TikZ code to both pdf and gitbook
This turned out to be easier than I anticipated. Setting the argument fig.ext=if (knitr:::is_latex_output()) 'pdf' else 'png' as part of the chunk arguments helps this along. If I'm not knitting to PDF then imagemagick or some other software automatically converts it to PNG.
Re 2) Modifying the font
As listed in my updated question this can be set by tweaking the file tikz2pdf.tex that is part of knitr. A copy of it is included below so you don't have to search for it yourself. Setting the chunk argument engine.opts = list(template = "latex/tikz2pdf.tex") enables you to put any desired fonts, LaTeX packages etc in preamble before the TikZ code is rendered.
Looking through the knitr code, you can see that texi2dvi is used to parse the tikz2pdf.tex file with the TikZ code inserted. texi2dvi calls pdflatex which messes things up in case you need to use XeLaTeX or LuaLaTeX to include TrueType fonts using fontspec.
I'm sure it would be possible to fix that somehow in the texi2dvi code but a much simpler solution (at least for me) was to change the environment. If I set the following two environment variables before starting R and rendering the book, then xelatex is automatically used for compiling all the code. In my bash terminal this is done using
export LATEX="xelatex"
export PDFLATEX="xelatex"
Voila!
The chunk becomes
```{r, echo=FALSE, engine='tikz', out.width='90%', fig.ext=if (knitr:::is_latex_output()) 'pdf' else 'png', fig.cap='Some caption.', engine.opts = list(template = "latex/tikz2pdf.tex")}
\begin{tikzpicture}[scale=.7]
\draw [fill=gray!30,very thick] (0,-1) rectangle (5,1);
\draw [very thick] (5, 0) -- (13,0);
\node [below] at (2,-1) {\large Hello};
\node [below, align=center] at (0,-1) {\large Two\\ lines};
\end{tikzpicture}
```
and tikz2pdf.tex is
\documentclass{article}
\include{preview}
\usepackage[pdftex,active,tightpage]{preview}
\usepackage{amsmath}
\usepackage{tikz}
\usetikzlibrary{matrix}
%% INSERT YOUR OWN CODE HERE
\begin{document}
\begin{preview}
%% TIKZ_CODE %%
\end{preview}
\end{document}
I'm still surprised at the whole flexibility of knitr and related packages. Nice work Yihui!
I had problems compiling plots with pgfplots and bookdown. Using engine='tikz' gave the error message: 'axis environment does not exist'. I fixed it by changing \usepackage{tikz} to \usepackage{pgfplots} in the tikz2pdf.tex file and using engine.opts = list(template ="Tikz2pdf.tex") as suggested.
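To make the template mechanism described above a bit more concrete, here is a rough Python sketch of the idea: the chunk's TikZ code is dropped into the template at the %% TIKZ_CODE %% marker, and the result is compiled as a separate standalone document, which is why fonts and packages have to be set up in the template rather than in the main book. This is not knitr's actual implementation. The function name fill_template is invented, the template is an abridged copy of the one shown above, treating the "INSERT YOUR OWN CODE HERE" comment as a placeholder is an assumption of this sketch (in practice you edit the file by hand), and the font name is only an example.

```python
# Abridged copy of the template shown above.
TEMPLATE = r"""\documentclass{article}
\usepackage[active,tightpage]{preview}
\usepackage{tikz}
%% INSERT YOUR OWN CODE HERE
\begin{document}
\begin{preview}
%% TIKZ_CODE %%
\end{preview}
\end{document}
"""


def fill_template(template, tikz_code, extra_preamble=""):
    """Build the standalone document that is compiled for one chunk."""
    # Assumption of this sketch: the comment line doubles as a placeholder.
    filled = template.replace("%% INSERT YOUR OWN CODE HERE", extra_preamble)
    # The chunk body replaces the TIKZ_CODE marker.
    return filled.replace("%% TIKZ_CODE %%", tikz_code)


chunk = r"""\begin{tikzpicture}
\node {Hello};
\end{tikzpicture}"""

standalone = fill_template(
    TEMPLATE,
    chunk,
    # Example preamble only; the font name is hypothetical.
    extra_preamble="\\usepackage{fontspec}\n\\setmainfont{TeX Gyre Pagella}",
)
print(standalone)
```

Swapping \usepackage{tikz} for \usepackage{pgfplots}, as in the fix above, is the same kind of change: it only affects this standalone document, not the main one.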

knitr: automate producing multiple versions of PDF for beamer slides

For lectures, I am using knitr to produce LaTeX beamer slides as a PDF. For a given lecture, I also want to produce (a) a 1-up handout (using the handout option), and (b) the same handout formatted 4-up.
I find I have to run knitr 3 times to do this as shown below. Is there a way to simplify this workflow?
A lecture stub:
\documentclass[10pt,table]{beamer}
\input{inputs/beamer-setup}
\input{inputs/defs}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\begin{document}
...
\end{document}
And I run knitr as
knit2pdf("Lecture1.Rnw")
To get the 1-up handout (which suppresses the separate pages when you use transitions), I edit the first line to:
\documentclass[10pt,table,handout]{beamer}
and run
knit2pdf("Lecture1.Rnw", output="Lecture1-1up.tex")
Finally, to get the 2 x 2 version, I use the LaTeX pgfpages package,
\documentclass[10pt,table,handout]{beamer}
\input{inputs/beamer-setup}
\input{inputs/defs}
\usepackage{pgfpages}
\pgfpagesuselayout{4 on 1}[letterpaper,landscape]
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\begin{document}
And run:
knit2pdf("Lecture1.Rnw", output="Lecture1-4up.tex")
(I found that with beamer, I could not simply print the PDF 4-up using Adobe Acrobat -- it generated a corrupt PDF file. I was forced to use pgfpages)
Then, of course, I have to revert my .Rnw file to the original if I need to redo the slides. Very tedious. There must be a better way.
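One way to avoid editing the .Rnw file by hand is to generate the two handout variants from the original source with a small script and then knit each generated file. The Python sketch below only does the text rewriting. The function name make_variants and the "-1up"/"-4up" suffixes are my own choices, and it assumes the \documentclass line already has an options list, as in the stub above.

```python
import re


def make_variants(rnw_source):
    """Return {file name suffix: source text} for the three versions."""
    # Add "handout" to the existing \documentclass options.
    handout = re.sub(
        r"\\documentclass\[([^\]]*)\]\{beamer\}",
        lambda m: "\\documentclass[" + m.group(1) + ",handout]{beamer}",
        rnw_source,
        count=1,
    )
    # For the 4-up version, load pgfpages just before \begin{document}.
    layout = (
        "\\usepackage{pgfpages}\n"
        "\\pgfpagesuselayout{4 on 1}[letterpaper,landscape]\n"
    )
    four_up = handout.replace(
        "\\begin{document}", layout + "\\begin{document}", 1
    )
    return {"": rnw_source, "-1up": handout, "-4up": four_up}


# The lecture stub from the question.
stub = r"""\documentclass[10pt,table]{beamer}
\input{inputs/beamer-setup}
\input{inputs/defs}
\begin{document}
...
\end{document}
"""

variants = make_variants(stub)
print(variants["-4up"])
```

Each variant could be written next to the original (for example as Lecture1-1up.Rnw and Lecture1-4up.Rnw) and passed to knit2pdf in turn, so the original file never has to be edited or reverted.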

Can I use a knitr inline expression in the title of a latex document?

I'd like to use a Knitr/Sweave in-line call (\Sexpr{}) in the title of a LaTeX document, after the \begin{document} command but before the \maketitle command. The in-line R code would extract one or two pieces of information from an R data-frame created early in the R script I'm embedding in LaTeX.
I have a couple of Knitr chunks that create a data.frame from which I derive the information I want to put in the Title. I've tried placing these chunks between LaTeX's \begin{document} call and the \title code, like this:
\documentclass
[LaTex Preamble]
\begin{document}
[%% Knitr chunks that initialize an R data-frame]
\title \Sexpr{--a snippet of R code that extracts an element from the data-frame --}
\maketitle
... (rest of the LaTeX document)
and I've also tried putting the Knitr chunks in the preamble to the LaTeX code before \begin{document} statement.
But Knitr seems to ignore code (other than initialization) that is placed ahead of the \maketitle call in LaTeX, so the in-line snippets included in the title look like errors to LaTeX and it halts output.
I can't find any information in the Knitr documentation on including in-line code in the Title of a LaTeX document.
Any ideas?
OK: Found the solution thanks to the hint from @ben-bolker below. Ben uses the formatting of R chunks before output to an RNW file (in a 2-step Knitr process: latex -> rnw -> pdf). But I'm compiling the LaTeX file to PDF in one step without going to an RNW file from inside TeXShop (on Mac OSX). I found that I could get Ben's example to work using the RNW delimiters (<<>>=) and one-step compiling. But I couldn't mix the usual LaTeX chunk delimiters (%% begin.rcode and %% end.rcode) and the RNW in-line statement hook (\Sexpr{}). The latter didn't work no matter how I fiddled with it. Eventually I found that the correct in-line hook for LaTeX is \rinline{}.
It's not very clear in the Knitr documentation that this is the required format for LaTeX and I found it eventually mainly thanks to Ben's example. Best, Peter
Update 2 ... and then there's RTFM (or the 'cheat sheet' in this case): http://cran.r-project.org/web/packages/knitr/vignettes/knitr-refcard.pdf
Hmm. The following file works for me:
\documentclass{article}
<<echo=FALSE>>=
x <- 5
@
\title{The number is \Sexpr{x^2}}
\begin{document}
\maketitle
Some stuff
\end{document}
with knitr version 0.8 on Ubuntu 10.04, via knit2pdf("knitr_title.Rnw") ...

Getting Sweave code chunks inside some framed box?

I would like to make an R code chunk (in Sweave) printed inside a framed box in the resulting pdf.
Is there an easy solution for doing that?
The short answer is that yes, there is an easy way. Just add the following lines, or something like them to the preamble of your Sweave document:
\DefineVerbatimEnvironment{Sinput}{Verbatim} {xleftmargin=2em,
frame=single}
\DefineVerbatimEnvironment{Soutput}{Verbatim}{xleftmargin=2em,
frame=single}
This works because the appearance of code (and output) chunks is controlled by the definition of the Sinput and Soutput environments. These are both Verbatim environments as provided by the LaTeX package fancyvrb. (The fancyvrb documentation, a 73-page PDF, describes the numerous options that the package provides.)
A quick look in the file Sweave.sty reveals the default definition of those two environments:
\DefineVerbatimEnvironment{Sinput}{Verbatim}{fontshape=sl}
\DefineVerbatimEnvironment{Soutput}{Verbatim}{}
\DefineVerbatimEnvironment{Scode}{Verbatim}{fontshape=sl}
To change those definitions, just add \DefineVerbatimEnvironment statements of your own devising either: (a) at the end of the Sweave.sty file; or (b) at the start of your *.Snw document.
Finally, here's an example to show what this looks like in practice:
\documentclass[a4paper]{article}
\usepackage{Sweave}
\DefineVerbatimEnvironment{Sinput}{Verbatim} {xleftmargin=2em,
frame=single}
\DefineVerbatimEnvironment{Soutput}{Verbatim}{xleftmargin=2em,
frame=single}
\title{Sweave with boxes}
\begin{document}
\maketitle
<<echo=FALSE>>=
options(width=60)
@
Here is an example of a code chunk followed by an output chunk,
both enclosed in boxes.
<<>>=
print(rnorm(99))
@
\end{document}
knitr, a successor of Sweave, by default outputs all echoed R code in boxes, and also formats it to the margins. Other nice features include syntax coloring and PGF integration.
Sweave code of average complexity needs only minor, if any, adaptations to run with knitr.
