Automatically generated LaTeX beamer slides with R/knitr - r

I am working on a LaTeX report template that automatically generates a beamer document, pulling in figures from specified directories and placing them one per slide.
Here is an example of the code that I am using for this, as a code chunk in my .Rnw document:
<<results='asis',echo=FALSE>>=
suppressPackageStartupMessages(library("Hmisc"))
# get the plots from the common directory
Barplots_dir <- "/home/figure/barplots"
Barplots_files <- dir(Barplots_dir)
# create a beamer slide for each plot
# use R to output LaTeX markup into the document
for (i in seq_along(Barplots_files)) { # seq_along() also copes with an empty directory
  GroupingName <- gsub("_alignment_barplot.pdf", "", Barplots_files[i], fixed = TRUE) # strip this from the filename
  file <- paste0(Barplots_dir, "/", Barplots_files[i]) # path to the figure
  cat("\\subsubsection{", latexTranslate(GroupingName), "}\n", sep="") # don't forget you need double '\\' because one gets eaten by R !!
  cat("\\begin{frame}{", latexTranslate(GroupingName), " Alignment Stats}\n", sep="")
  cat("\\includegraphics[width=0.9\\linewidth,height=0.9\\textheight,keepaspectratio]{", file, "}\n", sep="")
  cat("\\end{frame}\n\n")
}
@
However, I recently came across this article by Yihui Xie, which includes a remark about cat("\\includegraphics{}") being a bad idea. Is there a reason for this, and is there a better option?
To be clear, these figures are generated by other programs as part of a larger pipeline; generating them within the document is not an option, but I need the document to be able to dynamically find and insert them into the report. I know that there are some capabilities to do this directly from within LaTeX itself but cat'ing out the LaTeX markup I need seemed like an easier and more flexible task.

cat("\\includegraphics{}") is likely to be a bad idea if you are from the old Sweave world (where one might need to open a graphics device, draw a plot, close the device, and cat("\\includegraphics{}")). No kittens will be killed as long as you understand what you are doing. Your use case seems to be very reasonable to me, and I don't have a better approach.
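
If the cat() approach is kept, one option is to move the markup into a small function that returns the LaTeX for a single frame as a string, so the output can be inspected before it is written. This is only a sketch under assumptions: frame_for_plot and escape_latex are hypothetical names (not knitr or Hmisc functions), and the escaping covers just a few common characters, whereas Hmisc's latexTranslate from the question handles more.

```r
# Hypothetical helper: escape a few characters that are special in LaTeX.
# Not exhaustive; backslashes, braces, ~ and ^ are not handled.
escape_latex <- function(x) {
  gsub("([_%&#$])", "\\\\\\1", x)
}

# Hypothetical helper: build the LaTeX for one beamer frame showing one figure.
frame_for_plot <- function(file, suffix = "_alignment_barplot.pdf") {
  name <- sub(suffix, "", basename(file), fixed = TRUE)
  paste0(
    "\\begin{frame}{", escape_latex(name), " Alignment Stats}\n",
    "\\includegraphics[width=0.9\\linewidth,height=0.9\\textheight,keepaspectratio]{",
    file, "}\n",
    "\\end{frame}\n"
  )
}

# Inside a chunk with results='asis', emit one frame per PDF that is found.
plot_dir <- "/home/figure/barplots"  # directory from the question
for (f in list.files(plot_dir, pattern = "\\.pdf$", full.names = TRUE)) {
  cat(frame_for_plot(f), "\n")
}
```

Because list.files() returns an empty vector for an empty or missing directory, the loop simply produces no slides in that case.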

Related

Input .tex in Rmarkdown

I'm using Rmarkdown/Bookdown to write a paper/PDF document, which is an amazing tool (@Yihui, thanks!). Now I'm trying to include a table I have already written in LaTeX by reading in an external .tex file. However, when knitting in RStudio with \include{some-file.tex} or \input{some-file.tex} in the body of the .Rmd, outside of a chunk, the error "LaTeX Error: Can be used only in preamble." is produced and the process stops. I also haven't found a way to input the file directly through knitr or from within a chunk.
I found this question here: Rmarkdown v2, embed Latex document, although while the question is similar, there is no answer which would reflect how to input/include .tex-files into an .Rmd.
Why would I want this? Sometimes LaTeX tables offer more layout options than building them directly in R, for example for tables that contain only text rather than R-computed numbers. Also, when running models on a cluster, exporting results directly into .tex files ready for compilation saves a lot of computation compared to having to open all these heavy .RData files just to get the results into a PDF. Similarly, when there are multiple types of reports for different audiences, keeping the full R code in one main .Rmd file and integrating only the necessary results into the other files reduces complexity, because the steps do not have to be redone in each file. This way, I can keep one report with the full picture and do not have to check whether I included every little change in various documents simultaneously.
So finally the question is how to get prepared .tex-Files into a .Rmd-document?
Thanks for your answers!

Knitr:spin - how to add text without manually adding #' every line?

My understanding is that knitr::spin allows me to work on my plain, vanilla, regular ol' good R script, while keeping the ability to generate a full document that understands markdown syntax (see https://yihui.name/knitr/demo/stitch/).
Indeed, the rmarkdown feature in RStudio, while super neat, is actually really a hassle because
I need to duplicate my code and break it in chunks which is super boring + inefficient as it is hard to keep track of code changes.
On top of that rmarkdown cannot read my current workspace. This is somehow surprising but it is what it is.
All in all this is very constraining... See here for a related discussion Is there a way to knitr markdown straight out of your workspace using RStudio?.
As discussed here (http://deanattali.com/2015/03/24/knitrs-best-hidden-gem-spin/), spin seems to be the solution.
Indeed, knitr:spin syntax looks like the following:
#' This is a special R script which can be used to generate a report. You can
#' write normal text in roxygen comments.
#'
#' First we set up some options (you do not have to do this):
#+ setup, include=FALSE
library(knitr)
in a regular workspace!
BUT note how each line of text is preceded by #'.
My problem here is that it is also very inefficient to add #' at the start of every single line of text. Is there a way to do so automatically?
Say I select a whole block of text and RStudio adds this #' to every row? Maybe in the same spirit as commenting out a whole block of code lines?
Am I missing something?
Thanks!
In RStudio v 1.1.28, starting a line with #' causes the next line to start with #' when I hit enter in a *.R file on my machine (Ubuntu Linux 16.04LTS).
So as long as you start a text chunk with it, it will continue. But for previously existing R scripts, it looks like you would have to use find -> replace, or write a function to modify the required file. The following worked for me in a very simple test.
comment_replace <- function(in_file, out_file = in_file){
  # readLines() keeps blank lines, which scan() would silently drop
  in_text <- readLines(in_file)
  # turn ordinary "# " comments at the start of a line into "#' " text lines
  out_text <- gsub("^# ", "#' ", in_text)
  writeLines(out_text, out_file)
}
Note that this function does not explicitly check for preexisting #' lines; you may want to build that in. I added a space to the regular expression so that lines already starting with #' are not matched again.
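
To see what the regular expression does and does not touch, the substitution can be tried on a character vector before running it over a file. A small sketch; the sample lines are made up for illustration:

```r
# Made-up sample lines covering the cases of interest.
lines <- c(
  "# plain comment",          # should become a #' text line
  "#' already roxygen",       # should stay as it is
  "#+ setup, include=FALSE",  # chunk header, should stay as it is
  "x <- 1 # trailing comment" # comment not at line start, should stay
)

# Same pattern as in comment_replace(): "# " anchored at the start of the line.
converted <- sub("^# ", "#' ", lines)
print(converted)
```

Only the first line changes, because "#'" and "#+" do not match "# " with its trailing space, and the anchor ^ ignores comments later in a line.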
With an RMarkdown document, you would write something like this:
As you can see I have some fancy code below, and text right here.
```{r setup}
# R code here
library(ggplot2)
```
And I have more text here...
This gist offers a quick introduction to RMarkdown and knitr's features. I think you don't entirely understand what RMarkdown really is: it's a markdown document with R sprinkled in between, not (as you said) an R script with markdown sprinkled in between.
Edit: For those who are downvoting, please read the comments below this... OP didn't specify he was using spin earlier.

How to avoid generating pdf-file per figure in sweave?

Sorry if I'm asking a stupid question, but I'm kinda new to R/Sweave.
I have noticed that if I run my file, RStudio automatically generates a pdf-file for each figure plotted (as well as a pdf-file containing all generated figures from the Sweave-file). For example, suppose I have the following chunk of code in RStudio (simplified version):
\begin{figure}[htbp]
\centering
<<fig1, fig=TRUE, echo=FALSE>>=
plot(pts.X,1:length(pts.X),
main = "Type I error for X-var IT")
@
\caption{}
\label{X-var}
\end{figure}
Then, RStudio saves a pdf-file called R/SweaveFileName-fig1.pdf as well as a pdf-file Rplots.pdf which will also contain any other figure included in the Sweave-file. Since my R/Sweave files contain a lot of figures, I was wondering whether it is possible to change this option in R/Sweave. And, if not, is it possible to redirect these pdf-files into a separate folder?
You can't avoid generating the figures. RStudio isn't really doing much of the work here; it's just directing other software to do it.
R generates the figure, and the LaTeX source code to import it.
LaTeX imports the figure and produces the final .pdf for the whole document.
You can tell R to put the files in a particular place using \SweaveOpts{prefix.string = figs/}. Put this into your document somewhere pretty early, and all figures will be put into a directory called "figs" (which must exist for this to work).
For more details about the options in Sweave, see the vignette in the utils package.
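
Since the target directory has to exist before the figures are written, it can be created from R ahead of running Sweave. A minimal sketch, assuming the directory is called "figs" as above; ensure_fig_dir is a hypothetical helper name, and the demonstration uses a temporary location rather than a real project folder:

```r
# Hypothetical helper: create the figure directory if it is missing and
# report (invisibly) whether it exists afterwards.
ensure_fig_dir <- function(path = "figs") {
  if (!dir.exists(path)) {
    dir.create(path, recursive = TRUE)
  }
  invisible(dir.exists(path))
}

# Demonstrated under tempdir() so the sketch has no side effects on a project.
fig_dir <- file.path(tempdir(), "figs")
ensure_fig_dir(fig_dir)

# In a real project this would be called from the directory that holds the
# .Rnw file, just before running Sweave on it.
```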

R2HTML or knitr for dynamic report generation?

I want to write an R function which processes some data and then automatically outputs an html report. This report should contain some fixed text, some text changing according to the underlying data and some figures.
What is the best way to go?
R2HTML or knitr?
What are the advantages of one over the other?
As far as I understand, R2HTML allows me to build the html file sequentially, while knitr operates on a predefined .Rhtml file.
So, either use R2HTML or stitch and spin from knitr for on the fly report generation.
I would appreciate any suggestions or hints.
I grab this nice opportunity to promote pander a bit :)
This package was written for similar reasons as @Yihui's great knitr, although I wanted to let users really concentrate on the text and R code without dealing with chunk options etc. So it lets users generate pretty HTML, pdf or even docx or odt output automatically with some predefined options.
These options affect e.g. the cache engine (handling dependencies without any chunk options) or the default plot options (be it "base" R graphics, lattice or ggplot2), so that you do not have to set the color palette or the minor grid in each of your plots, just once - or live with the package defaults :)
The package captures the results (besides errors/warnings and other messages and the output) of every R expression that is run and can convert them to Pandoc's markdown automatically. There are some helper functions that let you convert the resulting document, written in a brew-like syntax, automatically to e.g. HTML if you have pandoc installed, or export R objects to markdown/HTML/any other supported format in a live R session with a reference class.
Short demo:
brew file
Pandoc.brew('file_name.brew', output = 'foo.html', convert = 'html')
HTML output
knitr, every time. Handles graphics, lets you write your report with markdown instead of having to write html everywhere (if you want), caches things, makes coffee for you etc.
You can also build an HTML file sequentially as long as you have a decent text editor like Emacs/ESS or RStudio, etc. R2HTML is excellent in terms of its wide support for many R objects (see methods(HTML)), but I'd probably frown on RweaveHTML() because it is rooted in Sweave().
That said, I think it may be a good idea to combine R2HTML and knitr, e.g.
# A LOESS Example
```{r loess-demo, results='asis'}
cars.lo <- loess(dist ~ speed, cars)
library(R2HTML)
HTML(cars.lo, file = '')
```
I was using the R Markdown syntax in the above example. The key is results='asis', which means writing raw HTML code into the output.
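
The effect of results='asis' does not depend on R2HTML: anything printed with cat() inside such a chunk is passed through unchanged. A minimal sketch using base R only; html_table is a hypothetical helper written for this illustration, not a function from knitr or R2HTML:

```r
# Hypothetical helper: turn a data frame into a minimal HTML table string.
# Values are not HTML-escaped, so this is only suitable for trusted data.
html_table <- function(df) {
  header <- paste0("<th>", names(df), "</th>", collapse = "")
  rows <- apply(df, 1, function(r) {
    paste0("<tr>", paste0("<td>", r, "</td>", collapse = ""), "</tr>")
  })
  paste0("<table>\n<tr>", header, "</tr>\n",
         paste(rows, collapse = "\n"), "\n</table>\n")
}

# In a chunk with results='asis', cat() writes the markup straight into the output.
cat(html_table(head(cars, 2)))
```

Without results='asis', the same output would be wrapped as verbatim text and shown as source rather than rendered as a table.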
I believe that you can also use Sweave to create HTML files, though I have heard that knitr is easier to use.

practically getting started with Sweave

My question(s) might be less general than the title suggests. I am running R on Mac OS X with a MySQL database to store the data. I have been working with Komodo / Sciviews-R for some time. Recently I had the need for auto-generated reports and looked into Sweave. I guess StatET / Eclipse appears to be the "standard" solution for Sweavers.
1) Is it reasonable to switch from Komodo to StatET / Eclipse? I tried StatET before but chose Komodo over StatET because I liked the calltip / autosuggest and the more convenient config of Komodo so much.
2) What's a reasonable workflow to generate Sweave files? Usually I develop my R code first and then care about the report later. I just learned today that there is one file in Sweave that contains R code and LaTeX code at once, and that the .tex document is created from this file. While the example files look handy, I can't really imagine how to enter my 250+ lines of R code into a file and mix them up with LaTeX.
Is it possible to just enter the qplot() and ggplot() statements into such a document and source the functionality like database connection and intermediate results somehow?
Or is it just a matter of getting used to the mix of LaTeX and R code?
Thx for any suggestions, hints, links and back-to-the-roots-shout-outs…
You've asked several questions, so here are several answers:
Is StatEt/Eclipse the right way to do Sweave?
Not necessarily (note: I'm an avid StatEt/Eclipse user, use it for both pure R and Sweave/R, and love it; I haven't used Komodo / Sciviews-R). You should be able to run the Sweave command from any R command line, which will generate a .tex file. You can then turn the .tex file into something readable (like pdf) from any TeX environment.
What's a good Sweave workflow ?
When I have wanted to turn an R script into a Sweave report, I generally start with an empty Sweave template and copy/paste my entire R script into a Sweave R block just after the title, i.e.:
<<label=myEntireRScript, echo=false, include=false>>=
#Insert code here
myTable<-data.frame(...)
myPlot<-qplot(....)
@
Then I go through and find the parts I want to report. For instance, if I want to put a table into the report, I'll cut it from the R block and put an xtable block in, and the same for variables and plots.
<<label=myEntireRScript, echo=false, include=false>>=
#Insert code here
@
Put any text I want before my table here, maybe with a \Sexpr{variable} to print the value of a variable inline
<<label=myTable, results=tex>>=
myTable<-data.frame(...)
print(xtable(myTable,...),...)
@
Any text I want before my figure
<<label=myPlot, fig=TRUE>>=
myPlot<-qplot(....)
print(myPlot)
@
You may want to look at these related SO posts. The rest of my post relates to your question 2.
When creating reports with Sweave, I usually keep most of the R code and the report text separate. If the R code is fast to run, I will include something like the following at the start of the .Rnw file:
<<>>=
source('/path/to/script.r')
@
On the other hand, if the R code takes a long time, I will often include something like the following at the end of the R script:
Sweave('/path/to/report.Rnw'); system('pdflatex report.tex')
That way, I can re-generate the report quickly, without needing to run all the R code again. Then, the only work R has to do in the Sweave file is print tables, make graphs and maybe extract a few figures.
Like nullglob, I prefer to keep the R and Sweave files separate, but I prefer to save the workspace with save.image() rather than to source() the file. This avoids running the R calculations with each .Rnw file compiling (and I always end up tinkering with the typesetting more than I'd like).
My general workflow is to do each paper/project in its own folder with its own R file(s). When the calculation side is "done", I save.image() to store all the workspace variables as-is.
Then, in the .Rnw file in the same directory, I set the working directory with setwd() and load all variables with load(".RData"). Of course, you can change the name you use for your workspace, but I do one workspace per folder and keep the default name. Oh, and if you tinker with the R file, be sure to save the workspace image again, and watch out for variables that linger in the workspace and .Rnw file but are no longer part of the R file... this is where the save.image() approach can cause some headaches.
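
The save-then-load step can be sketched as follows. This is an illustration under assumptions rather than the exact workflow described above: the file name analysis.RData and the results object are made up, and save() with explicit object names is used in place of save.image() as one way to limit the lingering-variable problem.

```r
# Stand-in for the output of the long-running analysis script.
results <- data.frame(group = c("a", "b"), mean = c(1.5, 2.5))

# Save once at the end of the analysis. save.image() would store the whole
# workspace; save() with explicit names stores only what the report needs.
workspace_file <- file.path(tempdir(), "analysis.RData")  # hypothetical file name
save(results, file = workspace_file)

# In the .Rnw file: load into a fresh environment so that nothing in the
# current session is silently overwritten.
report_env <- new.env()
load(workspace_file, envir = report_env)
print(report_env$results)
```

Loading into a separate environment also makes it easy to check with ls(report_env) exactly which objects the report depends on.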
I am on a Mac and I suggest TextMate if you're mildly geeky and emacs/ess if you're really geeky. I use vim and command line R, but emacs/ess works best for most. If you're in this for the long haul, I doubt you'll regret learning emacs/ess for R, Sweave, and LaTeX.
