Reusing chunks in knitr

I'm having lots of fun with Knitr but noticed I am reusing code in a bad way - cut and paste. In my example I want to load a dataset, calculate some statistics and print those, and plot the dataset -- easy to do with a few chunks, but if I want to do the same thing with another dataset I have to copy and paste the chunks and change only the name of the dataset.
Suppose I have something like this :
<p>Load the dataset <tt>dataset01</tt></p>
<!--begin.rcode load-dataset01
# Create an alias so there is no need to change it several times in the
# chunks
myDataset <- dataset01
a <- calcSomeStats(myDataset)
input <- myDataset[,1:2]
ideal <- class.ind(myDataset$label)
end.rcode-->
<p>Now let's plot it</p>
<!--begin.rcode plot-dataset01, fig.width=10, fig.height=10
neurons <- 1
NNET = nnet(input, ideal, size=neurons,softmax=TRUE)
plotnet(NNET)
par(pty="s",xpd=T, mar=par()$mar+c(0,0,0,2))
axis(1, at = seq(bbox[1],bbox[2], by = 2), las=1)
axis(2, at = seq(bbox[1],bbox[2], by = 2), las=2)
points(myDataset$x,myDataset$y,
col=myPal[unclass(myDataset$label)],cex=2,pch=16)
legend("topright", levels(factor(myDataset$label)),fill=myPal,inset=c(-0.1,0))
end.rcode-->
Code is not really complete, there are other parts that I am still developing, but it is working.
My question is: considering the two chunks shown above, what is the best (or most idiomatic R) way to reuse them? Suppose I have a list of dozens of datasets and want to run the same chunks on each of them, possibly also substituting the non-R (HTML) parts. Is that possible?
I naively tried to wrap a chunk inside a function, like this:
<!--begin.rcode
abc <- function(n)
{
<!--begin.rcode howdoInamethischunkwithanuniquename
n <- n*2
end.rcode-->
}
end.rcode-->
It did not work (error: unexpected end of input).
thanks
Rafael
Edit: there are similar questions with answers in "Using loops with knitr to produce multiple pdf reports... need a little help to get me over the hump" and https://github.com/yihui/knitr/issues/435, but they cover LaTeX and/or R Markdown, not HTML.
Another edit: things I've tried after @Yihui's comment:
Using the same label for both chunks
<!--begin.rcode chunkA, echo=TRUE, results='hide'
x <- rnorm(100)
end.rcode-->
<p>Plot it?</p>
<!--begin.rcode chunkA, echo=FALSE, results='markup'
mean(x)
end.rcode-->
With this I get the "Error in parse_block(g[-1], g[1], params.src) : duplicate label 'chunkA'" message.
Using chunk option ref.label
<!--begin.rcode chunkA, echo=TRUE, results='hide'
x <- rnorm(100)
end.rcode-->
<p>Plot it?</p>
<!--begin.rcode chunkB, ref.label='chunkA', echo=FALSE, results='markup'
mean(x)
end.rcode-->
With this I get the R code (x <- rnorm(100)), "Plot it?" and then nothing. Changing echo to TRUE just repeats the code (x <- rnorm(100)).
More information
My scenario is having several small data frames that have the same structure (x,y,label) and I want to process them in a chunk "A" and plot them with similar parameters in another chunk "B". If I do this without reusing code, I have to copy-and-paste chunks "A" and "B" several times, which is not a really good idea.
I know I cannot pass a parameter to an HTML chunk, and the recipes at http://yihui.name/knitr/demo/reference/ seem close to what I need, but I cannot figure out how to apply them in R+HTML.

OK, I got it, and am posting this to serve as an example.
From what I understand, it is not possible to create a knitr chunk that works as a function. So, this is not possible:
<!--begin.rcode fakeFunction
# do something with myData, assume it is defined!
end.rcode-->
<!--begin.rcode myPlot1, ref.label='fakeFunction'
myData <- iris
# Assume fakeFunction will be executed somehow with iris
end.rcode-->
<!--begin.rcode myPlot2, ref.label='fakeFunction'
myData <- cars
# Assume fakeFunction will be executed somehow with cars
end.rcode-->
What will work is something like this:
<!--begin.rcode
myData <- iris
end.rcode-->
<!--begin.rcode plot
summary(myData)
end.rcode-->
<!--begin.rcode
myData <- cars
end.rcode-->
<!--begin.rcode plot2, ref.label='plot'
end.rcode-->
Basically we're saying that chunk plot2 will "paste" the code from chunk plot. We don't need to define anything else in plot2, and I guess it will be ignored anyway.
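When there are dozens of datasets, the repeated chunks could also be generated rather than typed. The following is only a sketch, assuming knitr's knit_expand() is used to fill a small template; the template text, the summary-{{name}} labels and the dataset list are illustrative and not part of the original example.

```r
library(knitr)

# The chunk markers are assembled with paste0() so that this code can itself
# sit inside an HTML chunk without the literal end marker closing it early
# (an assumption about how knitr scans HTML input).
chunk_open  <- paste0("<!--begin", ".rcode")
chunk_close <- paste0("end", ".rcode-->")

# Illustrative template: one heading plus one chunk per dataset.
# {{name}} is filled in by knit_expand().
template <- c(
  "<h2>Dataset {{name}}</h2>",
  paste(chunk_open, "summary-{{name}}"),
  "myData <- {{name}}",
  "summary(myData)",
  chunk_close
)

datasets <- c("iris", "cars")  # hypothetical list of dataset names

# One expanded block of HTML + chunk source per dataset
src <- vapply(
  datasets,
  function(name) knit_expand(text = template, name = name),
  character(1)
)

cat(src, sep = "\n\n")
```

The expanded text in src could then be knitted from a chunk with results='asis', for example with knit_child(text = src, quiet = TRUE). I have not checked that last step in an R+HTML document, so treat it as a starting point.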
I haven't figured out one detail, though. Suppose the chunk plot works fine (imagine dozens of lines of R code) and I want slightly different behavior in plot2 that affects a single line of code. From what I understand, I won't be able to do this with knitr. Does anyone know how to reuse code by writing chunks as procedures or functions?
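One way to get procedure-like reuse is to put the shared code in an ordinary R function, defined once in an early chunk, and let each later chunk call it with its own arguments. A minimal sketch follows; summarise_data() and its columns argument are hypothetical names chosen for illustration.

```r
# Hypothetical helper: the code that used to live in chunk "plot",
# with the part that varies exposed as an argument.
summarise_data <- function(data, columns = names(data)) {
  summary(data[, columns, drop = FALSE])
}

# Each later chunk then needs a single line:
summarise_data(iris)
summarise_data(cars, columns = "speed")  # slightly different behavior
```

With this layout the chunks stay short and uniquely named, and the variation lives in the function arguments instead of in copied code.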

I got similar errors.
What I did was to give each chunk a different name, or not name the chunks at all.
For example
{r, echo=F }
some code here
this is an example of a default code chunk without a name
{r setup, echo=F }
some code here
this is a chunk with name "setup."
Basically, if all your chunks are unnamed, or every named chunk has a different name, you'll be fine.

Related

Remove progress bar from knitr output

I'm analyzing some data and would like to run a Simpson's paradox analysis in R. I've installed the Simpsons package and loaded the library. Here is an example based on the package documentation:
---
output: html_document
---
```{r}
library(Simpsons)
#generating data
Coffee1=rnorm(100,100,15)
Neuroticism1=(Coffee1*.8)+rnorm(100,15,8)
g1=cbind(Coffee1, Neuroticism1)
Coffee2=rnorm(100,170,15)
Neuroticism2=(300-(Coffee2*.8)+rnorm(100,15,8))
g2=cbind(Coffee2, Neuroticism2)
Coffee3=rnorm(100,140,15)
Neuroticism3=(200-(Coffee3*.8)+rnorm(100,15,8))
g3=cbind(Coffee3, Neuroticism3)
data2=data.frame(rbind(g1,g2,g3))
colnames(data2) <- c("Coffee","Neuroticism")
example <- Simpsons(Coffee,Neuroticism,data=data2)
plot(example)
```
This returns a plot with 3 clusters (exactly what I need). However, when I knit the Rmd file to HTML, I get a lot of equals signs (======) with a percentage next to them, like a progress bar, which I would like to remove from my final output.
You can suppress printed output in the knitted document by setting knitr chunk options. If we wish to hide all code output other than plots, we can use the following solution:
---
output: html_document
---
```{r echo=FALSE, results='hide', fig.keep='all', message = FALSE}
library(Simpsons)
#generating data
Coffee1=rnorm(100,100,15)
Neuroticism1=(Coffee1*.8)+rnorm(100,15,8)
g1=cbind(Coffee1, Neuroticism1)
Coffee2=rnorm(100,170,15)
Neuroticism2=(300-(Coffee2*.8)+rnorm(100,15,8))
g2=cbind(Coffee2, Neuroticism2)
Coffee3=rnorm(100,140,15)
Neuroticism3=(200-(Coffee3*.8)+rnorm(100,15,8))
g3=cbind(Coffee3, Neuroticism3)
data2=data.frame(rbind(g1,g2,g3))
colnames(data2) <- c("Coffee","Neuroticism")
example <- Simpsons(Coffee,Neuroticism,data=data2)
plot(example)
```
I would note that this package seems to print a lot more content than most packages, and therefore the combination of options is quite long.
An easier method would probably be to move the plot to a separate chunk and have all the analysis run before it. The include argument can be used to suppress all outputs, but this includes plots, hence why we must use two chunks:
```{r, include = FALSE}
# your code to build model
```
```{r}
plot(example)
```
The knitr documentation has the full list of chunk options.
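To see why message = FALSE alone is not enough here, it may help to look at a progress bar in isolation. The sketch below uses base R's txtProgressBar() as a stand-in; whether the Simpsons package draws its bar the same way is an assumption. The bar is ordinary printed output and not a message, so suppressMessages() leaves it in place, which is why results='hide' (or include = FALSE) is needed.

```r
# Stand-in for a function that reports progress on standard output
# (hypothetical; not taken from the Simpsons package).
slow_task <- function(n = 10) {
  pb <- txtProgressBar(min = 0, max = n, style = 3)
  for (i in seq_len(n)) setTxtProgressBar(pb, i)
  close(pb)
  invisible(n)
}

# suppressMessages() does not touch the bar: it is still printed,
# and capture.output() shows that by collecting it as text.
printed <- capture.output(suppressMessages(slow_task()))
any(grepl("=", printed, fixed = TRUE))
```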

knitr: add to previous plot in new code chunk

I am using the knitr package for R to produce a LaTeX document combining text with embedded R plots and output.
It is common to write something like this:
We plot y vs x in a scatter plot and add the least squares line:
<<scatterplot>>=
plot(x, y)
fit <- lm(y~x)
abline(fit)
@
which works fine.
(For those not familiar with knitr or Sweave, this echoes the code and output in a LaTeX verbatim environment and also adds the completed plot as a figure in the LaTeX document.)
But now I would like to write more detailed line-by-line commentary like:
First we plot y vs x with a scatterplot:
<<scatterplot>>=
plot(x, y)
@
Then we regress y on x and add the least squares line to the plot:
<<addline>>=
fit <- lm(y~x)
abline(fit)
@
The problem is that there are now two knitr code chunks for the same plot. The second code chunk addline fails because the plot frame created in the first code chunk scatterplot is not visible to the code in the second code chunk. The plotting window doesn't seem to be persistent from one code chunk to the next.
Is there any way that I can tell knit() to keep the plot window created by plot() active for the second code chunk?
If that is not possible, how else might I achieve LaTeX-style commentary on code lines that add to existing plots?
One Day Later
I can now see that essentially the same question has been asked before, see:
How to build a layered plot step by step using grid in knitr? from 2013 and
Splitting a plot call over multiple chunks from 2016.
Another question from 2013 is also very similar:
How to add elements to a plot using a knitr chunk without original markdown output?
You can set knitr::opts_knit$set(global.device = TRUE), which means all code chunks share the same global graphical device. A full example:
\documentclass{article}
\begin{document}
<<setup, include=FALSE>>=
knitr::opts_knit$set(global.device = TRUE)
@
First we plot y vs x with a scatterplot:
<<scatterplot>>=
x = rnorm(10); y = rnorm(10)
plot(x, y)
@
Then we regress y on x and add the least squares line to the plot:
<<addline>>=
fit <- lm(y~x)
abline(fit)
@
\end{document}
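Another route that is sometimes suggested, and that does not need a shared global device, is to snapshot the plot with base R's recordPlot() in the first chunk and redraw it with replayPlot() in the second. The sketch below is written as a standalone script; the two device lines are only there so it runs outside knitr, on the assumption that knitr enables the display list on its own chunk device.

```r
set.seed(10)
x <- rnorm(10)
y <- rnorm(10)

# Only needed when running this outside knitr: open an off-screen device and
# switch on the display list so that recordPlot() has something to record.
pdf(NULL)
dev.control(displaylist = "enable")

# "First chunk": draw the scatter plot and keep a snapshot of it.
plot(x, y)
scatter <- recordPlot()

# "Second chunk": redraw the snapshot, then add the least squares line.
replayPlot(scatter)
fit <- lm(y ~ x)
abline(fit)

invisible(dev.off())
```

Inside a knitr document the pdf(NULL), dev.control() and dev.off() lines should be dropped, since they would divert the plot away from the device knitr is recording.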
You can show code without evaluating it by adding the chunk option eval=FALSE. If you only want to show the final version of the plot with the regression line added, then use eval=FALSE for the first plot(x,y).
Then we add two chunks for the regression line: One is the complete code needed to render the plot, but we don't want to display this code, because we don't want to repeat the plot(x,y) call. So we add a second chunk that we echo, to display the code, but don't evaluate.
---
output: pdf_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
```{r data}
set.seed(10)
x=rnorm(10)
y=rnorm(10)
```
First we plot y vs x with a scatterplot:
```{r scatterplot, eval=FALSE}
plot(x, y)
```
Then we regress y on x and add the least squares line to the plot:
```{r addline, eval=FALSE}
fit <- lm(y~x)
abline(fit)
```
```{r echo=FALSE}
plot(x,y)
fit <- lm(y~x)
abline(fit)
```

Null output when printing heatmap.2 object in rmarkdown

I'm using rmarkdown via RStudio and want to plot a heatmap with heatmap.2. When I change the angle of the column labels via the srtCol option, I get a NULL message printed before the heatmap in the output PDF file.
Here is minimal code to reproduce the problem:
{r, message=FALSE,warning=FALSE, echo=FALSE}
require(gplots)
data(mtcars)
x <- as.matrix(mtcars)
heatmap.2(x,srtCol=0)
Is there any way to remove this NULL from the PDF output?
Try the following modification using capture.output. This did not print NULL for me.
```{r, message=FALSE,warning=FALSE, echo=FALSE}
require(gplots)
data(mtcars)
x <- as.matrix(mtcars)
res <- capture.output(heatmap.2(x,srtCol=0))
```
There may be a better way using some option of heatmap.2, but I didn't see one in the documentation. This was based on the following SO post: "Suppress one command's output in R".
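For readers unfamiliar with capture.output(), here is a small base R sketch of what it does with a call that both prints and plots. noisy_plot() is a hypothetical stand-in for the heatmap.2 call and is not meant to reproduce its internals.

```r
# Hypothetical stand-in for a plotting call that also prints a stray NULL.
noisy_plot <- function() {
  print(NULL)
  plot(1:10)
  invisible("done")
}

# The printed text is collected into a character vector instead of reaching
# the document; the plot is still drawn, and an assignment inside the call
# keeps the function's return value.
printed <- capture.output(value <- noisy_plot())
printed
value
```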

Description and plot for every variable in R Markdown

I have a data frame data of n observations of several numeric and factor variables. I would like to produce an HTML report in which class and describe are reported and a histogram (qplot or ggplot) is plotted for every variable.
How can I do that?
Is it possible in R Markdown to produce an automatic header preceding every variable analysis?
Thank you for your help.
Corrado
You can put a loop in the R chunks of your Markdown file. Something like this, for example:
```{r, echo=FALSE}
library(ggplot2)
```
This is an introductory sentence with absolutely no interest.
```{r, results="asis", eval=TRUE, echo=FALSE}
data(cars)
for (varname in names(cars)) {
var <- cars[,varname]
cat(paste0("<h2>",varname,"</h2>"))
cat(paste0("Class : <pre>",class(var),"</pre>"))
cat("Summary : <pre>")
print(summary(var))
cat("</pre>")
if (is.numeric(var)) print(qplot(var, binwidth=diff(range(var))/30))
}
```
This is an astonishing conclusion.
This gives the following result: http://rpubs.com/juba/mdloop
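Since the question also mentions factor variables, the loop body can be moved into a small helper that picks a plot by type. This is a sketch under assumptions: report_variable() is a hypothetical name, it uses base graphics (hist and barplot) in place of qplot to stay self-contained, and it is meant to be called from a chunk with results="asis" as in the example above.

```r
# Hypothetical helper: emits an HTML header, the class and a summary for one
# column, then draws a histogram (numeric) or a bar chart (factor).
report_variable <- function(data, varname) {
  var <- data[[varname]]
  cat("<h2>", varname, "</h2>\n", sep = "")
  cat("Class: <pre>", class(var)[1], "</pre>\n", sep = "")
  cat("Summary: <pre>\n")
  print(summary(var))
  cat("</pre>\n")
  if (is.numeric(var)) {
    hist(var, main = varname, xlab = varname)
  } else if (is.factor(var)) {
    barplot(table(var), main = varname)
  }
  invisible(NULL)
}

# iris has four numeric columns and one factor column
for (varname in names(iris)) report_variable(iris, varname)
```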

knitr - How to align code and plot side by side

Is there a simple way (e.g., via a chunk option) to get a chunk's source code and the plot it produces side by side, as on page 8 (among others) of this document?
I tried using out.width="0.5\\textwidth", fig.align='right', which makes the plot correctly occupy only half the page and be aligned to the right, but the source code is displayed on top of it, which is the normal behaviour.
I would like to have it on the left side of the plot.
Thanks
Sample code:
<<someplot, out.width="0.5\\textwidth", fig.align='right'>>=
plot(1:10)
@
Well, this ended up being trickier than I'd expected.
On the LaTeX side, the adjustbox package gives you great control over alignment of side-by-side boxes, as nicely demonstrated in this excellent answer over on tex.stackexchange.com. So my general strategy was to wrap the formatted, tidied, colorized output of the indicated R chunk with LaTeX code that: (1) places it inside of an adjustbox environment; and (2) includes the chunk's graphical output in another adjustbox environment just to its right. To accomplish that, I needed to replace knitr's default chunk output hook with a customized one, defined in section (2) of the document's <<setup>>= chunk.
Section (1) of <<setup>>= defines a chunk hook that can be used to temporarily set any of R's global options (and in particular here, options("width")) on a per-chunk basis. See here for a question and answer that treat just that one piece of this setup.
Finally, Section (3) defines a knitr "template", a bundle of several options that need to be set each time a side-by-side code-block and figure are to be produced. Once defined, it allows the user to trigger all of the required actions by simply typing opts.label="codefig" in a chunk's header.
\documentclass{article}
\usepackage{adjustbox} %% to align tops of minipages
\usepackage[margin=1in]{geometry} %% a bit more text per line
\begin{document}
<<setup, include=FALSE, cache=FALSE>>=
## These two settings control text width in codefig vs. usual code blocks
partWidth <- 45
fullWidth <- 80
options(width = fullWidth)
## (1) CHUNK HOOK FUNCTION
## First, to set R's textual output width on a per-chunk basis, we
## need to define a hook function which temporarily resets global R's
## option() settings, just for the current chunk
knit_hooks$set(r.opts=local({
ropts <- NA
function(before, options, envir) {
if (before) {
ropts <<- options(options$r.opts)
} else {
options(ropts)
}
}
}))
## (2) OUTPUT HOOK FUNCTION
## Define a custom output hook function. This function processes _all_
## evaluated chunks, but will return the same output as the usual one,
## UNLESS a 'codefig' argument appeared in the chunk's header. In that
## case, wrap the usual textual output in LaTeX code placing it in a
## narrower adjustbox environment and setting the graphics that it
## produced in another box beside it.
defaultChunkHook <- environment(knit_hooks[["get"]])$defaults$chunk
codefigChunkHook <- function (x, options) {
main <- defaultChunkHook(x, options)
before <-
"\\vspace{1em}\n
\\adjustbox{valign=t}{\n
\\begin{minipage}{.59\\linewidth}\n"
after <-
paste("\\end{minipage}}
\\hfill
\\adjustbox{valign=t}{",
paste0("\\includegraphics[width=.4\\linewidth]{figure/",
options[["label"]], "-1.pdf}}"), sep="\n")
## Was a codefig option supplied in chunk header?
## If so, wrap code block and graphical output with needed LaTeX code.
if (!is.null(options$codefig)) {
return(sprintf("%s %s %s", before, main, after))
} else {
return(main)
}
}
knit_hooks[["set"]](chunk = codefigChunkHook)
## (3) TEMPLATE
## codefig=TRUE is just one of several options needed for the
## side-by-side code block and a figure to come out right. Rather
## than typing out each of them in every single chunk header, we
## define a _template_ which bundles them all together. Then we can
## set all of those options simply by typing opts.label="codefig".
opts_template[["set"]](
codefig = list(codefig=TRUE, fig.show = "hide",
r.opts = list(width=partWidth),
tidy = TRUE,
tidy.opts = list(width.cutoff = partWidth)))
@
A chunk without \texttt{opts.label="codefig"} set...
<<A>>=
1:60
@
\texttt{opts.label="codefig"} \emph{is} set for this one
<<B, opts.label="codefig", fig.width=8, cache=FALSE>>=
library(raster)
library(RColorBrewer)
## Create a factor raster with a nice RAT (Rast. Attr. Table)
r <- raster(matrix(sample(1:10, 100, replace=TRUE), ncol=10, nrow=10))
r <- as.factor(r)
rat <- levels(r)[[1]]
rat[["landcover"]] <- as.character(1:10)
levels(r) <- rat
## To get a nice grid...
p <- as(r, "SpatialPolygonsDataFrame")
## Plot it
plot(r, col = brewer.pal("Set3", n=10),
legend = FALSE, axes = FALSE, box = FALSE)
plot(p, add = TRUE)
text(p, label = getValues(r))
@
\texttt{opts.label="codefig"} not set, and all settings back to ``normal''.
<<C>>=
lm(mpg ~ cyl + disp + hp + wt + gear, data=mtcars)
@
\end{document}
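The save-and-restore trick used by the r.opts chunk hook in section (1) can be tried on its own, outside knitr. The sketch below mirrors that closure in plain R; width_hook() and its new_opts argument are hypothetical names, and the real hook receives knitr's (before, options, envir) arguments instead.

```r
# Same closure pattern as the r.opts hook: `saved` lives in the enclosing
# environment, so it survives between the "before" and "after" calls.
width_hook <- local({
  saved <- NULL
  function(before, new_opts) {
    if (before) {
      saved <<- options(new_opts)  # options() returns the previous values
    } else {
      options(saved)               # put the previous values back
    }
    invisible(NULL)
  }
})

original <- getOption("width")
width_hook(before = TRUE, new_opts = list(width = 45))
inside <- getOption("width")    # narrowed while the "chunk" runs
width_hook(before = FALSE)
restored <- getOption("width")  # back to the original value
```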
I see three possibilities:
For beamer presentations, I'd go for \begin{columns} ... \end{columns} as well.
If it is only one such plot: minipages.
Here I used a table (column code and column result). (This example is "normal" Sweave)
For all three, the chunk options would have include = FALSE, and the plot would "manually" be put to the right place by \includegraphics[]{}.
You can display the text in a 'textplot' from package PerformanceAnalytics or gplots.
A (small) downside: to my knowledge, syntax highlighting is not possible.
Sample Code:
```{r fig.width=8, fig.height=5, fig.keep = 'last', echo=FALSE}
suppressMessages(library(PerformanceAnalytics))
layout(t(1:2))
textplot('plot(1:10)')
plot(1:10)
```
