knitr parses extra verbatim documentclass

I'd like to include an example of complete LaTeX code inside an Rnw document that is parsed using knitr.
My .Rnw file is shown here (I really want to include a bunch of R code as well, but this minimal example shows my problem):
\documentclass{article}
\begin{document}
Recursiveness, see
\begin{verbatim}
\documentclass{article}
\begin{document}
Recursiveness
\end{document}
\end{verbatim}
\end{document}
When I run knitr on test.Rnw I get a .tex file in which the second \documentclass is replaced (see below, where the top of the .tex file is shown). I'm pretty sure it is knitr that replaces the literal string \documentclass with its own macros, without realizing that the string should not be parsed when it is inside the verbatim environment.
\documentclass{article}
\IfFileExists{upquote.sty}{\usepackage{upquote}}{}
\begin{document}
Recursiveness, see
\begin{verbatim}
\documentclass{article}\usepackage[]{graphicx}\usepackage[]{color}
%% maxwidth is the original width if it is less than linewidth
%% otherwise use linewidth (to make sure the graphics do not exceed the margin)
\makeatletter
\def\maxwidth{ %
\ifdim\Gin@nat@width>\linewidth
\linewidth
\else
\Gin@nat@width
\fi
}
\makeatother
Is it possible to circumvent this? I tried to insert a zero-width space character in the middle of documentclass but that gave me a bunch of headaches with the parsing.

This is a bug and should be fixed in the current development version of knitr (>= 1.12.22) now.
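For knitr versions older than that, one possible workaround (not part of the answer above, and offered only as a sketch) is to keep the literal string out of the .Rnw source entirely: store the example in a separate file and typeset it with \verbatiminput from the verbatim package. The file name example.tex is hypothetical; it is assumed to sit next to the .Rnw file and to contain the four-line example document from the question.

```latex
\documentclass{article}
\usepackage{verbatim}  % provides \verbatiminput
\begin{document}
Recursiveness, see
% example.tex (hypothetical name) holds the complete example document.
% knitr only processes the .Rnw source, so the document-class line
% inside example.tex never reaches its preamble insertion.
\verbatiminput{example.tex}
\end{document}
```

The typeset result should match the verbatim environment in the question, at the cost of maintaining one extra file.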

Related

Pandoc version 2.7.3 fails to convert knitr .tex file to .docx

I've been using knitr with .Rnw files in RStudio to generate both PDF and docx files without any issue until today. The PDF conversion runs natively in RStudio, and for the docx conversion I simply call pandoc under the hood, passing it the .tex file that results from knitting the .Rnw file. So far I've been using pandoc version 1.19.2.1, which works just fine. However, after sharing some of my code with a colleague, I realized that this strategy fails with a newer version of pandoc (2.7.3).
I've tried updating knitr and making sense of the error, without much success. The issue appears only when the generated .tex file contains shaded areas, typically when echo=TRUE is set.
This is my Rnw file (min_reproducible_example.Rnw):
\documentclass{article}
\usepackage{multirow}
\setlength\parindent{0pt}
\usepackage{geometry}
\usepackage{longtable}
\usepackage{float}
\usepackage{verbatim}
\usepackage{hyperref}
\geometry{left=1.5cm,right=1.5cm,top=1.5cm,bottom=1.5cm}
\title{Docx from tex file example}
\begin{document}
\maketitle
<<chunk1,echo=TRUE,message=FALSE>>=
library(survival)
str(lung)
@
\end{document}
which after hitting 'Compile PDF' in Rstudio generates files: min_reproducible_example.pdf and min_reproducible_example.tex.
Just in case, the .tex output of the .Rnw file (min_reproducible_example.tex) is
\documentclass{article}\usepackage[]{graphicx}\usepackage[]{color}
% maxwidth is the original width if it is less than linewidth
% otherwise use linewidth (to make sure the graphics do not exceed the margin)
\makeatletter
\def\maxwidth{ %
\ifdim\Gin@nat@width>\linewidth
\linewidth
\else
\Gin@nat@width
\fi
}
\makeatother
\definecolor{fgcolor}{rgb}{0.345, 0.345, 0.345}
\newcommand{\hlnum}[1]{\textcolor[rgb]{0.686,0.059,0.569}{#1}}%
\newcommand{\hlstr}[1]{\textcolor[rgb]{0.192,0.494,0.8}{#1}}%
\newcommand{\hlcom}[1]{\textcolor[rgb]{0.678,0.584,0.686}{\textit{#1}}}%
\newcommand{\hlopt}[1]{\textcolor[rgb]{0,0,0}{#1}}%
\newcommand{\hlstd}[1]{\textcolor[rgb]{0.345,0.345,0.345}{#1}}%
\newcommand{\hlkwa}[1]{\textcolor[rgb]{0.161,0.373,0.58}{\textbf{#1}}}%
\newcommand{\hlkwb}[1]{\textcolor[rgb]{0.69,0.353,0.396}{#1}}%
\newcommand{\hlkwc}[1]{\textcolor[rgb]{0.333,0.667,0.333}{#1}}%
\newcommand{\hlkwd}[1]{\textcolor[rgb]{0.737,0.353,0.396}{\textbf{#1}}}%
\let\hlipl\hlkwb
\usepackage{framed}
\makeatletter
\newenvironment{kframe}{%
\def\at@end@of@kframe{}%
\ifinner\ifhmode%
\def\at@end@of@kframe{\end{minipage}}%
\begin{minipage}{\columnwidth}%
\fi\fi%
\def\FrameCommand##1{\hskip\@totalleftmargin \hskip-\fboxsep
\colorbox{shadecolor}{##1}\hskip-\fboxsep
% There is no \\@totalrightmargin, so:
\hskip-\linewidth \hskip-\@totalleftmargin \hskip\columnwidth}%
\MakeFramed {\advance\hsize-\width
\@totalleftmargin\z@ \linewidth\hsize
\@setminipage}}%
{\par\unskip\endMakeFramed%
\at@end@of@kframe}
\makeatother
\definecolor{shadecolor}{rgb}{.97, .97, .97}
\definecolor{messagecolor}{rgb}{0, 0, 0}
\definecolor{warningcolor}{rgb}{1, 0, 1}
\definecolor{errorcolor}{rgb}{1, 0, 0}
\newenvironment{knitrout}{}{} % an empty environment to be redefined in TeX
\usepackage{alltt}
\usepackage{multirow}
\setlength\parindent{0pt}
\usepackage{geometry}
\usepackage{longtable}
\usepackage{float}
\usepackage{verbatim}
\usepackage{hyperref}
\geometry{left=1.5cm,right=1.5cm,top=1.5cm,bottom=1.5cm}
\title{Docx from tex file example}
\IfFileExists{upquote.sty}{\usepackage{upquote}}{}
\begin{document}
\maketitle
\begin{knitrout}
\definecolor{shadecolor}{rgb}{0.969, 0.969, 0.969}\color{fgcolor}\begin{kframe}
\begin{alltt}
\hlkwd{library}\hlstd{(survival)}
\end{alltt}
\end{kframe}
\end{knitrout}
\end{document}
Next, I call a wrapper function that runs the following on the command line to generate the docx file:
path/to/pandoc/pandoc -o min_reproducible_example.docx min_reproducible_example.tex
I am working on Windows, so I have not checked whether the issue occurs on other operating systems.
A couple of lines from the output may be informative. I believe this line is the culprit:
Error at "source" (line 68, column 67):
unexpected end of input
\definecolor{shadecolor}{rgb}{0.969, 0.969, 0.969}\color{fgcolor}\begin{kframe}
From what I've been able to dig up, kframe is a LaTeX environment that knitr defines during knitting. This line leads to the following error from pandoc:
Warning message:
In shell(command) :
'"C:/pandoc/pandoc" -o min_reproducible_example.docx min_reproducible_example.tex --default-image-extension=png' execution failed with error code 65
I have no idea what error code 65 means. Threads about previous pandoc issues suggest looking directly at the source code to understand the error. I can do that if need be, but it seems odd to me that the older pandoc version works while the newer one fails. I've decided to post here in case anybody has run into the same issue.
I'm posting one possible solution for future reference, based on another conversation at pandoc-discuss.
After some back and forth, John MacFarlane kindly suggested redefining the environment that was causing pandoc trouble, kframe, simply as:
\renewenvironment{kframe}{}{}
So I changed a custom R function of mine that calls pandoc internally. Only the relevant lines are included below.
## read the original .tex file (.Rnw output)
tx <- readLines(paste0(fname, '.tex'), warn = FALSE)
## redefine the kframe environment as empty, as suggested by John MacFarlane in the pandoc-discuss thread
tx2 <- gsub(pattern = "\\begin{document}",
            replacement = "\\renewenvironment{kframe}{}{}\\begin{document}",
            x = tx, fixed = TRUE)
## write a copy of the file with the kframe workaround and use it in the pandoc call below
zz <- file(paste0(fname, '_cp.tex'), "wb")
writeLines(tx2, con = zz)
close(zz)
## pdwd is the directory that contains the pandoc executable
command <- paste0('"', pdwd, '/pandoc" -o ', fname, '.docx ', fname, '_cp.tex ',
                  "--default-image-extension=png ")
shell(command)
## remove the temporary copy
file.remove(paste0(fname, '_cp.tex'))
After that, pandoc runs without complaints. I noticed that the earlier version of pandoc (1.19.2.1), although it runs without error, produces a docx file that lacks the contents of the kframe environment, whereas this fix gives a more accurate representation of the PDF.
I haven't tested this fix extensively, so please leave a comment if you find any issue.
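For reference, this is roughly what the patched copy looks like to pandoc. The sketch below is a hand-trimmed version of the .tex file from the question, not actual knitr output: the framed-based kframe definition is replaced by an empty stand-in (an assumption made to keep the example short), and only the macros that the body uses are kept.

```latex
\documentclass{article}
\usepackage{color}
\usepackage{alltt}
\definecolor{fgcolor}{rgb}{0.345, 0.345, 0.345}
\definecolor{shadecolor}{rgb}{.97, .97, .97}
\newcommand{\hlstd}[1]{\textcolor[rgb]{0.345,0.345,0.345}{#1}}%
\newcommand{\hlkwd}[1]{\textcolor[rgb]{0.737,0.353,0.396}{\textbf{#1}}}%
% Stand-in for knitr's framed-based kframe definition (elided in this sketch)
\newenvironment{kframe}{}{}
\newenvironment{knitrout}{}{}
% The workaround: make kframe an empty environment just before the body
\renewenvironment{kframe}{}{}
\begin{document}
\begin{knitrout}
\definecolor{shadecolor}{rgb}{0.969, 0.969, 0.969}\color{fgcolor}\begin{kframe}
\begin{alltt}
\hlkwd{library}\hlstd{(survival)}
\end{alltt}
\end{kframe}
\end{knitrout}
\end{document}
```

In the R code above, gsub() puts the \renewenvironment line on the same line as \begin{document}; it is split across two lines here only for readability.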

tikzDevice does not use LaTeX preamble when used in RMarkdown document

I want to use tikz as graphics device in RMarkdown and I want it to include the generated LaTeX preamble.
In the past, I already used tikzDevice within knitr documents. The tex file generated by tikzDevice usually included the whole preamble from my knitr/LaTeX document. When I use it with RMarkdown, I get the standard preamble (see below).
RMarkdown file:
---
title: "Title"
author: "Me"
fontsize: 12pt
documentclass: scrartcl
output:
bookdown::pdf_document2:
toc: true
fig_caption: true
keep_tex: true
---
# Introduction
```{r plot, dev="tikz"}
plot(rnorm(50))
```
Beginning of generated tex file (plot-1.tex):
% Created by tikzDevice version 0.12.3 on 2019-06-16 16:09:40
% !TEX encoding = UTF-8 Unicode
\documentclass[10pt]{article}
Desired/expected beginning of plot-1.tex:
% Created by tikzDevice version 0.12.3 on 2019-06-16 16:09:40
% !TEX encoding = UTF-8 Unicode
\documentclass[12pt]{scrartcl}
I'm not sure you really want what you're asking for. The figure will be produced as a separate document containing nothing except the figure, which will be rendered as a PDF. The differences between scrartcl and article shouldn't matter for the figure, they matter for the document as a whole.
But if you really do need that document class, you get it by specifying options(tikzDocumentDeclaration = "\\documentclass[12pt]{scrartcl}") in an R chunk early in your document. When I do that I can see in the source that it worked, but the output looks pretty much the same as it did with the default class. It's also possible to specify this using chunk options, but there's unlikely to be any advantage to doing that.
I think I figured it out:
My problem was that while using RMarkdown the options tikzDocumentDeclaration, tikzLatexPackages ... (nearly all options for tikzDevice) were not set automatically. When you use knitr the options for tikzDevice get set up in the process of splitting up markup and code chunks from the source file. With RMarkdown there is no LaTeX code to extract and use with tikz because pandoc generates it after the graphic is rendered. So one can either define the tikz... options manually or use the chunk option external=FALSE like user2554330 suggested.
Example minimal_knitr.Rnw:
\documentclass[fontsize=12pt]{scrartcl}
\usepackage[utf8]{inputenc}
\usepackage[T1]{fontenc}
\usepackage{lmodern}
\begin{document}
<<r plot, dev='tikz', echo=FALSE>>=
plot(rnorm(50))
@
\end{document}

pandoc skip latex environment

I'm writing mainly in LaTeX, but some co-authors prefer MS Word. To facilitate their work a bit, I would like to convert the .tex file (or the .pdf) to a .docx. The formatting does not need to be perfect, but all of the text, equations, figures etc should be perfectly readable.
My current plan is to take the .tex document, replace all the essential stuff and then let Pandoc do its magic. For this I would preferably implement my additions as a Pandoc filter. E.g., my tikz pictures would be converted to png using the tikz.py filter provided with Pandoc. The problem with this approach is that Pandoc tries to interpret the tikz environment when converting from tex into its internal representation, and the filters take this internal representation as input. The result is that the tikz code is lost. Is there a way to tell Pandoc to leave any tikzpicture environments alone?
Edit:
See the MWE below:
MWE.tex contents:
\documentclass{article}
\usepackage{tikz}
\begin{document}
\begin{tikzpicture}
\draw (0,0) -- (2,2);
\end{tikzpicture}
\end{document}
Output of pandoc -t native MWE.tex
[Para [Str "(0,0)",Space,Str "\8211",Space,Str "(2,2);"]]
The \draw command has completely disappeared as you can see.
I found that pandoc does not skip code enclosed in \iffalse ... \fi (whereas LaTeX itself does), so you can redefine the tikzpicture environment there, like this or in any other way you might like:
\documentclass{article}
\usepackage{tikz}
\iffalse
\renewenvironment{tikzpicture}%
{\par---start tikzpicture---\\}%
{\\---end tikzpicture---\par}
\renewcommand{\node}{node:}
\fi
\begin{document}
\begin{tikzpicture}
\node {foo};
\end{tikzpicture}
\end{document}
With pandoc 2.5 this results in a docx file containing:
—start tikzpicture—
node:foo;
—end tikzpicture—
This feels very wrong, and I wish I knew a nicer way.

Avoid code-chunks from breaking in Knitr? ( preferably using a chunk option )

When using knitr to create a PDF, code chunks break across pages wherever a page break falls. Usually this is exactly what I want, but in some cases I would like to avoid it, e.g. by making a code chunk jump to the next page if it does not fit on the current one. I would prefer to do this with a chunk option, i.e. without using \newpage and the like.
The following is an example of a code-chunk that breaks. How do I avoid this?
\documentclass{article}
\usepackage[english]{babel}
\usepackage{lipsum}
\begin{document}
\lipsum[1-3] \textbf{The following chunk will break. How do I avoid this breaking? }
<<echo=TRUE>>=
(iris)[1:20,]
@
\end{document}
I left an empty environment knitrout in the knitr design for such purposes. You can redefine this environment to achieve what you want. There are many LaTeX environments that are non-breakable, such as a figure environment. Below I use the minipage environment as an example:
\documentclass{article}
\renewenvironment{knitrout}{\begin{minipage}{\columnwidth}}{\end{minipage}}
% alternatively, you can use `figure`
% \renewenvironment{knitrout}{\begin{figure}}{\end{figure}}
\begin{document}
\begin{figure}
\caption{One figure.}
\end{figure}
% placeholder
\framebox{\begin{minipage}[t][0.3\paperheight]{1\columnwidth}%
nothing
\end{minipage}}
<<echo=TRUE>>=
(iris)[1:20,]
@
\begin{figure}
\caption{Another one.}
\end{figure}
\end{document}

Getting Sweave code chunks to stay inside page margins?

Sometimes I end up with an R code chunk (in Sweave) that is wider than the page margins. Is there a way to force it to wrap onto the next line when that happens?
Here is a simple example of that happening:
\documentclass[a4paper]{article}
\usepackage{Sweave}
\DefineVerbatimEnvironment{Sinput}{Verbatim} {xleftmargin=2em,
frame=single}
\DefineVerbatimEnvironment{Soutput}{Verbatim}{xleftmargin=2em,
frame=single}
\title{Sweave with boxes}
\begin{document}
\maketitle
<<echo=FALSE>>=
options(width=60)
@
Here is an example of a code chunk followed by an output chunk,
both enclosed in boxes.
<<>>=
print(rnorm(99))
@
<<>>=
print("aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa")
@
\end{document}
This is a difficult and extreme case, because you do not have spaces among those a's, so LaTeX may not be able to wrap the words. If you do have spaces, knitr will be able to produce the output with the long lines wrapped with tidy=TRUE, highlight=TRUE (so will Sweave, I think, if you set keep.source=FALSE).
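If the string really contains no spaces, one option that lies outside Sweave and knitr themselves is the fvextra package, which extends fancyvrb (the package behind the Verbatim-based environments in the question) with line-breaking options. The following is a minimal sketch, assuming fvextra is installed; it has not been tested together with Sweave.sty, and the environment name WrappedOutput is made up for illustration.

```latex
\documentclass[a4paper]{article}
\usepackage{fvextra}  % loads fancyvrb and adds breaklines/breakanywhere
% Hypothetical environment: same box settings as in the question,
% plus breaking of long lines even where there is no space.
\DefineVerbatimEnvironment{WrappedOutput}{Verbatim}{xleftmargin=2em,
  frame=single, breaklines=true, breakanywhere=true}
\begin{document}
\begin{WrappedOutput}
[1] "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"
\end{WrappedOutput}
\end{document}
```

Whether the same options can simply be added to the Sinput and Soutput definitions in a Sweave document would need to be checked.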
