How to wrap text in R source with tidy and knitr

I've been working with knitr lately, and while most aspects of that have gone quite smoothly, there's one formatting issue with including R code in the finished document that I haven't figured out. I often need to create relatively long text strings in my R chunks, e.g. captions for xtable() functions. While tidy generally does a great job at wrapping R code and keeping it in the shaded boxes in LaTeX, it doesn't know what to do with text strings, so it doesn't wrap them, and they flow off the right side of the page.
I would be most happy with a solution that has tidy doing all the work. However, I'd also be satisfied with a solution that I can apply manually to long strings in R chunks in my Rnw source. I just don't want to have to edit the tex file created by knitr.
Below is a minimal working example.
\documentclass[12pt, english, oneside]{amsart}
\begin{document}
<<setup, include=FALSE, cache=FALSE, tidy=TRUE>>=
options(tidy=TRUE, width=50)
@
<<>>=
x <- c("This","will","wrap","nicely","because","tidy","knows","how","to","deal","with","it.","So","nice","how","it","stays","in","the","box.")
longstr <- "This string will flow off the right side of the page, because tidy doesn't know how to wrap it."
@
\end{document}

This is an extremely manual solution, but one which I have used.
You build the string up using paste0, which gives tidy a chance to split it.
longstr <- paste0("This string will flow off the right side"," of the page, because tidy doesn't know how to wrap it.")

The other solution is to use strwrap.
> longstr <- "This string will flow off the right side of the page, because tidy doesn't know how to wrap it."
> strwrap(longstr, 70)
[1] "This string will flow off the right side of the page, because tidy" "doesn't know how to wrap it."
> str(strwrap(longstr, 70))
chr [1:2] "This string will flow off the right side of the page, because tidy" "doesn't know how to wrap it."
Unfortunately, I do not know whether this will work with tidy, but it works extremely well with knitr's HTML output.
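The two ideas above can be combined in a minimal sketch, using only base R. The width of 50 is an assumption taken from the width option in the question, and the variable names wrapped and rejoined are only illustrative.

```r
# Build the long string from shorter pieces so each source line stays short.
longstr <- paste0(
  "This string will flow off the right side",
  " of the page, because tidy doesn't know how to wrap it."
)

# Break the text into pieces that fit within the target width.
wrapped <- strwrap(longstr, width = 50)

# Print one piece per line to check how the wrapped text looks.
cat(wrapped, sep = "\n")

# Joining the pieces with single spaces recovers the original text.
rejoined <- paste(wrapped, collapse = " ")
```

Note that strwrap returns a character vector, so it has to be collapsed again before being used wherever a single string is expected, such as a caption.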

This answer is a bit late to the party, but I have found that even when I set tidy.opts = list(width.cutoff = 60) in an early chunk (using RStudio and a .Rnw script) and then include tidy = TRUE in each chunk's option list, lines still overflow. My overflowing lines are in sections of code that create ggplot2 plots. Through trial and error I discovered that if I add a carriage return after the + at the end of a line, I have no overflow problems. The extra line does not show up in the PDF that LaTeX creates.

Related

Knitr:spin - how to add text without manually adding #' every line?

My understanding is that knitr:spin allows me to work on my plain, vanilla, regular ol' good R script, while keeping the ability to generate a full document that understands markdown syntax. (see https://yihui.name/knitr/demo/stitch/)
Indeed, the rmarkdown feature in Rstudio, while super neat, is actually really a hassle because
I need to duplicate my code and break it in chunks which is super boring + inefficient as it is hard to keep track of code changes.
On top of that rmarkdown cannot read my current workspace. This is somehow surprising but it is what it is.
All in all this is very constraining... For a related discussion, see "Is there a way to knitr markdown straight out of your workspace using RStudio?".
As discussed here (http://deanattali.com/2015/03/24/knitrs-best-hidden-gem-spin/), spin seems to be the solution.
Indeed, knitr:spin syntax looks like the following:
#' This is a special R script which can be used to generate a report. You can
#' write normal text in roxygen comments.
#'
#' First we set up some options (you do not have to do this):
#+ setup, include=FALSE
library(knitr)
in a regular workspace!
BUT note how each line of text is preceded by #'.
My problem here is that it is also very inefficient to add #' before every single line of text. Is there a way to do so automatically?
Say I select a whole block of text and RStudio adds #' to every row, in the same spirit as commenting out a whole block of code lines?
Am I missing something?
Thanks!
In RStudio v 1.1.28, starting a line with #' causes the next line to start with #' when I hit enter in a *.R file on my machine (Ubuntu Linux 16.04LTS).
So as long as you start a text chunk with it, it will continue. For previously existing R scripts, though, it looks like you would have to use find -> replace, or write a function to modify the file. The following worked for me in a very simple test.
comment_replace <- function(in_file, out_file = in_file) {
  # readLines() keeps blank lines, which scan() would silently drop
  in_text <- readLines(in_file)
  # turn plain "# " comment lines into spin-style "#' " text lines
  out_text <- gsub("^# ", "#' ", in_text)
  writeLines(out_text, out_file)
}
Note that this function does not explicitly check for preexisting #' lines. The space in the regular expression keeps it from touching them, since "^# " does not match a line that starts with #', but you may want to build in a stricter check.
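As a rough illustration of that kind of conversion, here is a small sketch run against a temporary file. The helper name to_spin_comments and the sample script lines are hypothetical, made up for this example only.

```r
# Hypothetical helper: convert plain "# " comments to spin-style "#' " text lines.
# The pattern is anchored at the line start and requires a space after "#",
# so "#'" text lines, "#+" chunk headers and trailing comments are left alone.
to_spin_comments <- function(lines) {
  sub("^# ", "#' ", lines)
}

# Made-up script content covering the cases of interest.
script <- c(
  "# Plain comment that should become report text",
  "#' Already a spin text line",
  "#+ setup, include=FALSE",
  "",
  "x <- 1  # trailing comments are left alone"
)

# Write the sample script to a temporary file, convert it, and write it back.
tmp <- tempfile(fileext = ".R")
writeLines(script, tmp)
converted <- to_spin_comments(readLines(tmp))
writeLines(converted, tmp)

print(readLines(tmp))
```

Only the first line changes here. Whether every plain comment in a real script should become report text is a judgment call, so it is worth reviewing the converted file before running spin on it.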
With an RMarkdown document, you would write something like this:
As you can see I have some fancy code below, and text right here.
```{r setup}
# R code here
library(ggplot2)
```
And I have more text here...
This gist offers a quick introduction to RMarkdown and knitr's features. I think you don't entirely understand what RMarkdown really is: it's a markdown document with R sprinkled in between, not (as you said) an R script with markdown sprinkled in between.
Edit: For those who are downvoting, please read the comments below this... OP didn't specify he was using spin earlier.

Displaying dataframes in R Markdown

I can't find a method to remove the hash marks and row numbers from dataframes outputted to a Word document in R Markdown. I'd like to be able to present only the data without those features.
The knitr website, specifically the page on chunk options, suggests using a separate chunk (before you want to display a data.frame in this manner) to change the default for the chunk option comment, perhaps like this:
```{r global_options}
knitr::opts_chunk$set(comment = NA) # default value is '##'
```
to disable the inserting of comment characters on output. Realize that this setting of the comment option is applicable to all chunks that follow this chunk; this chunk itself will not be affected by it.
This does give the textual representation of the data.frame (as if it were on the terminal), and not a more refined representation. I second @PierreLafortune's suggestion to look at knitr::kable.
Check out the sjPlot package, specifically the view_df function.
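For the row numbers specifically, base R's print method for data frames accepts row.names = FALSE. The sketch below uses a made-up data frame and captures the printed lines only so they can be inspected; in a chunk, the print call alone would be enough.

```r
# Made-up example data.
df <- data.frame(name = c("a", "b"), value = c(10, 20))

# row.names = FALSE drops the leading row-number column from the printout.
out <- capture.output(print(df, row.names = FALSE))
cat(out, sep = "\n")
```

This only changes the plain-text printout; the comment chunk option discussed above is still what controls the hash marks.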

Automatically generated LaTeX beamer slides with R/knitr

I am working on a LaTeX report template that automatically generates a beamer document, pulling in figures from specified directories and placing them one per slide.
Here is an example of the code that I am using for this, as a code chunk in my .Rnw document:
<<results='asis',echo=FALSE>>=
suppressPackageStartupMessages(library("Hmisc"))
# get the plots from the common directory
Barplots_dir<-"/home/figure/barplots"
Barplots_files<-dir(Barplots_dir)
# create a beamer slide for each plot
# use R to output LaTeX markup into the document
for(i in seq_along(Barplots_files)){ # seq_along() also behaves when the directory is empty
GroupingName<-gsub("_alignment_barplot.pdf", "", Barplots_files[i]) # strip this from the filename
file <- paste0(Barplots_dir,"/",Barplots_files[i]) # path to the figure
cat("\\subsubsection{", latexTranslate(GroupingName), "}\n", sep="") # don't forget you need double '\\' because one gets eaten by R !!
cat("\\begin{frame}{", latexTranslate(GroupingName), " Alignment Stats}\n", sep="")
cat("\\includegraphics[width=0.9\\linewidth,height=0.9\\textheight,keepaspectratio]{", file, "}\n", sep="")
cat("\\end{frame}\n\n")
}
@
However I recently came across this article by Yihui Xie which includes a remark about cat("\\includegraphics{}") being a bad idea. Is there a reason for this, and is there a better option?
To be clear, these figures are generated by other programs as part of a larger pipeline; generating them within the document is not an option, but I need the document to be able to dynamically find and insert them into the report. I know that there are some capabilities to do this directly from within LaTeX itself but cat'ing out the LaTeX markup I need seemed like an easier and more flexible task.
cat("\\includegraphics{}") is likely to be a bad idea if you are from the old Sweave world (where one might need to open a graphics device, draw a plot, close the device, and cat("\\includegraphics{}")). No kittens will be killed as long as you understand what you are doing. Your use case seems to be very reasonable to me, and I don't have a better approach.
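For readers who want to check the generated markup outside of a knitr run, the same cat-based approach can be sketched with the markup built by a small function. The helper name frame_markup, the temporary directory, and the two empty stand-in PDF files are all hypothetical. Titles are assumed to be LaTeX-safe already, whereas the question uses Hmisc's latexTranslate for that.

```r
# Hypothetical helper: LaTeX markup for one beamer frame holding one figure.
# `title` is assumed to be LaTeX-safe already.
frame_markup <- function(title, file) {
  paste0(
    "\\begin{frame}{", title, " Alignment Stats}\n",
    "\\includegraphics[width=0.9\\linewidth,height=0.9\\textheight,keepaspectratio]{",
    file, "}\n",
    "\\end{frame}\n\n"
  )
}

# Stand-in for the real figure directory, so the sketch runs anywhere.
fig_dir <- file.path(tempdir(), "barplots")
dir.create(fig_dir, showWarnings = FALSE)
invisible(file.create(file.path(
  fig_dir,
  c("groupA_alignment_barplot.pdf", "groupB_alignment_barplot.pdf")
)))

# Find the figures and derive a title from each file name.
files <- list.files(fig_dir, pattern = "\\.pdf$", full.names = TRUE)
titles <- sub("_alignment_barplot.pdf", "", basename(files), fixed = TRUE)

# Build one frame per figure, then emit the markup as the chunk would.
slides <- vapply(
  seq_along(files),
  function(i) frame_markup(titles[i], files[i]),
  character(1)
)
cat(slides, sep = "")
```

In the .Rnw document this would still sit in a chunk with results='asis' and echo=FALSE, as in the question.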

Define commands for frequently used text in knitr

Is there a way to define a command that can be used as a short cut for frequently used text or html commands in knitr when compiling to html?
I use knitr to compile an R Markdown file (.Rmd), and the output is an HTML file (i.e., I press Knit HTML in RStudio).
To be more specific, let me add an example: I want to separate the percent sign by a hair space from the number before, which I achieve by typing, e.g., 5 %. It would be very convenient, if I could define a command, let's say \perc, that I can use instead, such that 5\perc would be equivalent to 5 %.
Is this at all possible and if yes, how can it be done?
You can define an R function and then call it inline. For example:
```{r}
perc <- function(){
" %"
}
```
This is inline r code 5`r perc()`
I think you could also use it in chunks where the result would be 'asis'.
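A possible variant takes the number as an argument, so the value and its unit are produced together. This is only a sketch: it assumes the hair space is written as the HTML numeric entity &#8202; (Unicode U+200A), which may not be the exact form the question used.

```r
# Variant of perc(): returns the number followed by a hair space and a percent sign.
# "&#8202;" is the HTML numeric entity for a hair space (an assumption here).
perc <- function(x) {
  paste0(x, "&#8202;%")
}

perc(5)
```

Inline, this would be called as perc(5) inside an inline r expression instead of typing the number separately.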

Produce two plots from same chunk / statement in knitr

Is it possible to have plot-generating code output two versions of the same figure, at different sizes, from a .Rmd document? Either through chunk options (I didn't see anything that works directly here), or through a custom knitr hook? Preferably this would be done with the png device.
My motivation: I'd like to be able to output a figure at one size, which would fit inline in a compiled HTML document, and another figure that a user could show after clicking (think fancybox). I think I'll be able to handle the scripting necessary to make that work; however, first I need to convince R / knitr to output two versions of the figure.
Although I'm sure there are workarounds, it would be best if there was some way to get it to 'just work' behind the scenes, e.g. through a knitr hook. That way, we don't have to do anything special to the R code within a chunk, we just modify how we parse / evaluate that chunk.
Alternatively, one could use SVG graphics that would scale nicely, but then we lose the nice inference of good sizes for the plot labels, and vector graphics aren't great for plots with many many points.
I thought there was no solution and was about to say no to @baptiste, but a hack soon came to mind. Below is an R Markdown example:
```{r test, dev='png', fig.ext=c('png', 'large.png'), fig.height=c(4, 10), fig.width=c(4, 10)}
library(ggplot2)
qplot(speed, dist, data=cars)
```
See the [original plot](figure/test.png) and
a [larger version](figure/test.large.png).
The reason I thought the vectorized version of dev would not work was: for dev=c('png', 'png'), the second png file will overwrite the first one because the figure filename is the same. Then I realized fig.ext was also vectorized, and a file extension like large.png does not really destroy the file extension png; this is why it is a hack.
Anyway, by using the vectorized versions of dev, fig.ext, fig.height, and fig.width, you can save the same plot in multiple versions. If you use a deterministic pattern for the figure file extensions, I think you can also cook up some JavaScript code to automatically attach fancy boxes onto images.
If you just need the small and large figure, can you just do:
<<plotSmall, fig.height=6, fig.width=8, out.width='.1\\textwidth'>>=
plot(...)
@
<<plotBig, fig.height=6, fig.width=8, out.width='.99\\textwidth'>>=
plot(...)
@
Or more simply:
<<plotBoth, fig.height=6, fig.width=8, out.width=c('.1\\textwidth', '.9\\textwidth')>>=
plot(...)
plot(...)
@
(I'm sure you know this, but .Rnw is for LaTeX, while .Rhtml is for HTML; the .Rhtml syntax is slightly different.)

Resources