Produce two plots from same chunk / statement in knitr - r

Is it possible to have plot-generating code output two versions of the same figure, at different sizes, from a .Rmd document? Either through chunk options (I didn't see anything that works directly here), or through a custom knitr hook? Preferably this would be done with the png device.
My motivation: I'd like to be able to output a figure at one size, which would fit inline in a compiled HTML document, and another figure that a user could show after clicking (think fancybox). I think I'll be able to handle the scripting necessary to make that work; however, first I need to convince R / knitr to output two versions of the figure.
Although I'm sure there are workarounds, it would be best if there was some way to get it to 'just work' behind the scenes, e.g. through a knitr hook. That way, we don't have to do anything special to the R code within a chunk, we just modify how we parse / evaluate that chunk.
Alternatively, one could use SVG graphics that would scale nicely, but then we lose the nice inference of good sizes for the plot labels, and vector graphics aren't great for plots with many many points.

I thought there was no solution and was about to say no to @baptiste, but a hack soon came to mind. Below is an R Markdown example:
```{r test, dev='png', fig.ext=c('png', 'large.png'), fig.height=c(4, 10), fig.width=c(4, 10)}
library(ggplot2)
qplot(speed, dist, data=cars)
```
See the [original plot](figure/test.png) and
a [larger version](figure/test.large.png).
The reason I thought the vectorized version of dev would not work was: for dev=c('png', 'png'), the second png file will overwrite the first one because the figure filename is the same. Then I realized fig.ext was also vectorized, and a file extension like large.png does not really destroy the file extension png; this is why it is a hack.
Anyway, by passing vectors to dev, fig.ext, fig.height, and fig.width, you can save the same plot in multiple versions. If you use a deterministic pattern for the figure file extensions, I think you can also cook up some JavaScript code to automatically attach fancy boxes onto images.
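As a rough illustration of the "deterministic pattern" idea, the link markup can also be generated from R instead of JavaScript. This is only a sketch: fig_link() is a hypothetical helper (not part of knitr), the "fancybox" class name is an assumption, and the paths follow the figure/test.png and figure/test.large.png links shown above. Newer knitr versions may add a figure number to the file name (e.g. test-1.png), so adjust the pattern to what your version writes.

```r
# Hypothetical helper (not part of knitr): build an HTML link from the small
# figure to its large counterpart, assuming the file naming used above.
fig_link <- function(label, fig_path = "figure/",
                     small_ext = "png", large_ext = "large.png") {
  small <- paste0(fig_path, label, ".", small_ext)
  large <- paste0(fig_path, label, ".", large_ext)
  # The small image is shown inline; clicking it opens the large one.
  sprintf('<a class="fancybox" href="%s"><img src="%s" alt="%s"/></a>',
          large, small, label)
}

cat(fig_link("test"), "\n")
```

In an R Markdown document this would be called from a chunk with results='asis' so that the HTML is written into the output as-is.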

If you just need the small and large figure, can you just do:
<<plotSmall, fig.height=6, fig.width=8, out.width='.1\\textwidth'>>=
plot(...)
@
<<plotBig, fig.height=6, fig.width=8, out.width='.99\\textwidth'>>=
plot(...)
@
Or more simply:
<<plotBoth, fig.height=6, fig.width=8, out.width=c('.1\\textwidth', '.9\\textwidth')>>=
plot(...)
plot(...)
@
(I'm sure you know this, but the chunks above use the .Rnw syntax, which is for LaTeX, whereas .Rmd is for Markdown and .Rhtml is for HTML; the chunk syntax is slightly different in each.)

Related

Automatically generated LaTeX beamer slides with R/knitr

I am working on a LaTeX report template that automatically generates a beamer document, pulling in figures from specified directories and placing them one per slide.
Here is an example of the code that I am using for this, as a code chunk in my .Rnw document:
<<results='asis',echo=FALSE>>=
suppressPackageStartupMessages(library("Hmisc"))
# get the plots from the common directory
Barplots_dir<-"/home/figure/barplots"
Barplots_files<-dir(Barplots_dir)
# create a beamer slide for each plot
# use R to output LaTeX markup into the document
for(i in 1:length(Barplots_files)){
GroupingName<-gsub("_alignment_barplot.pdf", "", Barplots_files[i]) # strip this from the filename
file <- paste0(Barplots_dir,"/",Barplots_files[i]) # path to the figure
cat("\\subsubsection{", latexTranslate(GroupingName), "}\n", sep="") # don't forget you need double '\\' because one gets eaten by R !!
cat("\\begin{frame}{", latexTranslate(GroupingName), " Alignment Stats}\n", sep="")
cat("\\includegraphics[width=0.9\\linewidth,height=0.9\\textheight,keepaspectratio]{", file, "}\n", sep="")
cat("\\end{frame}\n\n")
}
@
However I recently came across this article by Yihui Xie which includes a remark about cat("\\includegraphics{}") being a bad idea. Is there a reason for this, and is there a better option?
To be clear, these figures are generated by other programs as part of a larger pipeline; generating them within the document is not an option, but I need the document to be able to dynamically find and insert them into the report. I know that there are some capabilities to do this directly from within LaTeX itself but cat'ing out the LaTeX markup I need seemed like an easier and more flexible task.
cat("\\includegraphics{}") is likely to be a bad idea if you are from the old Sweave world (where one might need to open a graphics device, draw a plot, close the device, and cat("\\includegraphics{}")). No kittens will be killed as long as you understand what you are doing. Your use case seems to be very reasonable to me, and I don't have a better approach.
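If it helps, the same cat() approach can be wrapped in a small vectorized function. This is only a sketch of one way to organize it: make_frames() is a hypothetical helper, and the plain underscore escaping below is a simplified stand-in for Hmisc's latexTranslate(). Building the strings for all files at once also avoids the 1:length(x) pattern, which iterates over c(1, 0) when the directory is empty.

```r
# Hypothetical helper: return one beamer frame (as a LaTeX string) per figure file.
make_frames <- function(files, suffix = "_alignment_barplot.pdf") {
  if (length(files) == 0) return(character(0))
  # Strip the common suffix from the file name to get the grouping name.
  name <- sub(suffix, "", basename(files), fixed = TRUE)
  # Simplified escaping (assumption): only underscores are escaped here.
  title <- gsub("_", "\\_", name, fixed = TRUE)
  paste0(
    "\\begin{frame}{", title, " Alignment Stats}\n",
    "\\includegraphics[width=0.9\\linewidth,height=0.9\\textheight,keepaspectratio]{",
    files, "}\n",
    "\\end{frame}\n"
  )
}

# Directory taken from the question; an empty or missing directory prints nothing.
files <- dir("/home/figure/barplots", full.names = TRUE)
cat(make_frames(files), sep = "\n")
```

As in the original chunk, this belongs in a chunk with results='asis' and echo=FALSE.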

have knitr output figures in different formats

Let's say I have a long report that produces a lot of figures which knitr makes as pdfs and it works really well with LaTeX. At the end of the project, my co-authors would like to have also raster based figures. One option would be to convert everything using ImageMagick. Another option would be to specify for each chunk dev = c("jpg", "pdf"), but given the number of figures, this could be cumbersome.
Is there a global switch to make knitr produce figures in pdf and other formats at the same time?
I think setting this once near the top of the document
opts_chunk$set(dev = c("pdf", "jpeg"))
should do (inside an R chunk, of course). Note that the knitr device is called "jpeg", not "jpg".
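A minimal sketch of what such a setup chunk might contain is below. The dpi value is an assumption added for illustration (it only affects bitmap devices); the last line just confirms that the default was registered.

```r
library(knitr)

# Set once, e.g. in a setup chunk near the top of the document; later chunks
# inherit these defaults unless they override them.
# The device name is "jpeg" (there is no "jpg" device).
opts_chunk$set(dev = c("pdf", "jpeg"), dpi = 300)

# Check the current global default for the dev option.
opts_chunk$get("dev")
```

Each figure chunk should then write one file per device, so the PDF versions keep working with LaTeX while the raster versions are produced alongside them.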

Making flattened pdfs in Sweave

So I am creating pdfs using Sweave that include some graphs that have a ton of points on them. I can get the pdf well enough, but it seems to have created it with a ton of layers, so it's hard to open the file in Acrobat or Reader. When I do, I literally can watch the points load on the document.
Is there a way to flatten the pdf in Sweave so that it's not so bulky?
(Note that I am using RStudio. I know I should probably be using something else, but I haven't found anything that has worked this smoothly yet.)
There is no need to switch to Knitr for this, though there are plenty of advantages in doing so.
One solution is just to arrange for the plot file to be produced and then include it yourself rather than rely on Sweave to do it for you
<<gen_fig, echo=true, eval=true>>=
png("path/to/fig/location/my_fig.png")
plot(1:10)
dev.off()
@
\includegraphics[options_here]{path/to/fig/location/my_fig}
Another option is to consider whether a plot with a "ton of points" is a useful figure - can you see all the points? Is the density of the points of interest? Alternatives include plotting via the hexbin package or generating a 2-d density of the points and plotting that as a lower-density set of points. The ggplot2 package has plenty of this functionality built in, see e.g. stat_bin2d() or stat_binhex() for examples.
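A minimal sketch of the binning idea with ggplot2, using simulated data as a stand-in for the real points. It uses stat_bin_2d(), the spelling used by newer ggplot2 releases for the same stat; the bin count of 50 is an arbitrary choice for illustration.

```r
library(ggplot2)

set.seed(1)
n <- 100000
DF <- data.frame(x = rnorm(n), y = rnorm(n))

# Count points in a 50 x 50 grid of rectangles and draw the counts as tiles,
# so the figure holds at most 2500 shapes instead of 100000 points.
p <- ggplot(DF, aes(x, y)) + stat_bin_2d(bins = 50)
print(p)
```

Because the PDF then stores tiles rather than individual points, it should stay small and render quickly even as a vector graphic.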
As Gavin said, there is no need to switch to knitr for this, though there are other advantages to doing so. However, you don't even need to write your own saving and including code; Sweave can do that for you. If the initial document is:
\documentclass{article}
\usepackage[american]{babel}
\begin{document}
<<>>=
n <- 100000
DF <- data.frame(x=rnorm(n), y=rnorm(n))
@
<<gen_fig, fig=TRUE>>=
plot(DF)
@
\end{document}
Then just by changing the arguments to the figure chunk, you can get a PNG instead of a PDF:
<<gen_fig, fig=TRUE, png=TRUE, pdf=FALSE>>=
plot(DF)
@
In this simple example, it shrinks my final PDF from 685K to 70K.
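To check the size difference outside of Sweave, the same plot can be written with both devices and the file sizes compared. This is a rough sketch that assumes a working png() device on your system; the exact numbers will differ from the ones above because they depend on the device defaults and the random data.

```r
set.seed(1)
n <- 100000
DF <- data.frame(x = rnorm(n), y = rnorm(n))

pdf_file <- tempfile(fileext = ".pdf")
png_file <- tempfile(fileext = ".png")

# Vector output: every one of the n points is stored in the file.
pdf(pdf_file)
plot(DF)
invisible(dev.off())

# Raster output: the file size depends on the pixel dimensions, not on n.
png(png_file)
plot(DF)
invisible(dev.off())

# File sizes in bytes.
sizes <- c(pdf = file.size(pdf_file), png = file.size(png_file))
print(sizes)
```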
As has already been mentioned you should probably switch to knitr, which makes swapping between pdfs and other formats much nicer. In particular, you should look at:
the transition guide between knitr and sweave
global options: that way you can easily swap between pdfs, high-res png and low-res pngs.
caching: only generate the figures when needed.
Here is an example of using the PNG device:
\documentclass{article}
\begin{document}
<<gen_fig, dev='png'>>=
n <- 100000
DF <- data.frame(x=rnorm(n), y=rnorm(n))
plot(DF)
@
\end{document}
There is no need to specify fig=TRUE for knitr. If the image quality of the PNG device in the graphics package is not enough, you can easily switch to other PNG devices, e.g. dev='CairoPNG' or 'Cairo_png'. In Sweave you just write more code to do the same thing.

R2HTML or knitr for dynamic report generation?

I want to write an R function which processes some data and then automatically outputs an html report. This report should contain some fixed text, some text changing according to the underlying data and some figures.
What is the best way to go?
R2HTML or knitr?
What are the advantages of one over the other?
As far as I understand, R2HTML allows me to build the HTML file sequentially, while knitr operates on a predefined .Rhtml file.
So, either use R2HTML or stitch and spin from knitr for on the fly report generation.
I would appreciate any suggestions or hints.
I grab this nice opportunity to promote pander a bit :)
This package was written for reasons similar to those behind @Yihui's great knitr, although I wanted to let users really concentrate on the text and R code without dealing with chunk options etc., so that they can generate pretty HTML, PDF, or even docx or odt output automatically with some predefined options.
These options affect e.g. the cache engine (handling dependencies without any chunk options) or the default plot options (be it "base" R graphics, lattice or ggplot2), so that you do not have to set the color palette or the minor grid in each of your plots, just once - or live with the package defaults :)
The package captures the results (as well as errors, warnings, other messages and the output) of every R expression it runs and can convert them to Pandoc's markdown automatically. There are some helper functions that let you convert the resulting document, written in a brew-like syntax, automatically to e.g. HTML if you have pandoc installed, or export R objects to markdown/HTML/any other supported format in a live R session with a reference class.
Short demo:
brew file
Pandoc.brew('file_name.brew', output = 'foo.html', convert = 'html')
HTML output
knitr, every time. Handles graphics, lets you write your report with markdown instead of having to write html everywhere (if you want), caches things, makes coffee for you etc.
You can also build an HTML file sequentially as long as you have a decent text editor like Emacs/ESS or RStudio, etc. R2HTML is excellent in its wide support for many R objects (see methods(HTML)), but I would probably frown on RweaveHTML() because it is built on Sweave().
That said, I think it may be a good idea to combine R2HTML and knitr, e.g.
# A LOESS Example
```{r loess-demo, results='asis'}
cars.lo <- loess(dist ~ speed, cars)
library(R2HTML)
HTML(cars.lo, file = '')
```
I was using the R Markdown syntax in the above example. The key is results='asis', which means writing raw HTML code into the output.
I believe that you can also use Sweave to create HTML files, though I have heard that knitr is easier to use.
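For the on-the-fly case mentioned in the question, knitr's spin() can turn an ordinary R script into a report source without a predefined template. The sketch below only converts a small script (held in a character vector, with made-up content) to R Markdown text; with knit = FALSE nothing is evaluated, and rendering to HTML would be a further step.

```r
library(knitr)

# A tiny R script: lines starting with #' become prose, the rest become code chunks.
script <- c(
  "#' # Speed and stopping distance",
  "#' A quick look at the built-in cars data.",
  "summary(cars)",
  "plot(cars)"
)

# knit = FALSE only converts the script to R Markdown text; no code is run here.
rmd <- spin(text = script, knit = FALSE)
cat(rmd, sep = "\n")
```

Inside a reporting function, the script lines could be assembled from the data at hand before being passed to spin().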

Looping knitr options

I'm curious, is there a way to incorporate changes in knitr options in a loop? For example, if I wanted to loop through and see how the same block of code looked in all different knitr themes, my first guess would be:
\documentclass{article}
\begin{document}
<<test>>=
themes<-knit_theme$get()
for (a.theme in themes){
knit_theme$set(a.theme)
a <- 3+5
b<- sum(1:10, na.rm=T)
for(g in 1:10) z<-0
}
@
\end{document}
And yet, this produces some pretty odd output. Is there a way to use loops like this, to dynamically change output, or perhaps dynamically include or not include certain chunks?
That is not possible. One document only supports one theme, since all the colors are defined in the preamble, and there can only be one set of definitions. If you want different themes, they must live in different documents. See this gist for how to do it in HTML as well as a gallery of built-in themes in knitr: https://gist.github.com/3422133
With LaTeX/PDF, you can also loop over an Rnw document to generate PDFs for different themes and use \includegraphics{} to include them in the main TeX document. You can probably figure it out since it is not very different from the HTML example above.
