OpenCPU and multi-page plots - r

I'm trying to capture a multi-plot pdf from a function. In R, this gives me a three page PDF:
pdf(file='test.pdf', onefile=TRUE)
lapply(1:3, 'plot')
dev.off()
Using OpenCPU:
$ curl http://localhost:6977/ocpu/library/base/R/lapply -H 'Content-Type: application/json' -d '{"X":[1,2,3], "FUN":"plot"}'
/ocpu/tmp/x0dc3dad0/R/.val
/ocpu/tmp/x0dc3dad0/graphics/1
/ocpu/tmp/x0dc3dad0/graphics/2
/ocpu/tmp/x0dc3dad0/graphics/3
/ocpu/tmp/x0dc3dad0/stdout
/ocpu/tmp/x0dc3dad0/source
/ocpu/tmp/x0dc3dad0/console
/ocpu/tmp/x0dc3dad0/info
I can get any of the individual pages as a single-page PDF file, but not as one combined file.
Two possible work-arounds, not without their share of problems:
Use par(mfrow), layout(), or a similar mechanism, though this will create a monster image in the end (I'm dealing with more than three images in my code).
Use tempfile, create an Rmd file on-the-fly, return the filename in the session (have not tested this yet), and use OpenCPU's processing of Rmd files. Unfortunately, this now uses LaTeX's geometries and page numbering (workarounds exist for this).
Are there other ways to do this?

Good question. OpenCPU captures graphics using evaluate which stores each graphic individually. The API itself doesn't support combining multiple graphics within a single file. I would personally do this sort of PDF post processing in the application layer (i.e. with non-R tools), but perhaps it would be useful to support this in the API.
Some suggestions:
Any file that your R function/script saves to the working directory (i.e. getwd()) will also become available through the API. So one thing you could do is in your R code manually create your combined pdf file and save it to the working directory and then download it through opencpu.
Graphics are actually recordedPlot objects, and besides png, pdf and svg, you can also retrieve the graphic as rds or rda. So you could write an R function that downloads the recordedPlot object from the API and then prints it. Not sure if that would be helpful in your use case.

Related

Creating an editable slideshow (ideally a powerpoint) from R

As part of a contract, the team I work in has to produce a monthly powerpoint filled with KPI's and other requested values, which is then passed on to another team who write a commentary on last months performance. At the moment the values are created (mostly in SAS) exported to an excel file and then copy and pasted into a powerpoint. This is an old approach which clearly needs updating.
What I would ideally like to do is to automate the presentation using RMarkdown and save myself the hassle of copy and pasting values. The issue is that RMarkdown from what I can see can't produce a .ppt file, or another editable format that the commentary team could add to without having to use R.
From googling around the topic I found packages such as rcom, RDCOMclient, and R2PPT but they don't appear to have been recently updated or maintained.
TLDR; Need a way of making a powerpoint/slideshow in R where the text can be edited afterwards outside of R.
This can be easily done with RStudio and pandoc 2.1:
1) install Pandoc 2 from pandoc.org (this is higher version than the one which currently comes with rstudio )
2) create your RMarkdown file in RStudio,
---
title: 'Some title'
author: 'author'
output:
md_document: default
---
3) knit to md
4) call pandoc to convert to pptx
system("cmd", input = "C:\\Users\\janvy\\AppData\\Local\\Pandoc\\pandoc -f markdown -t pptx -o myfile.pptx myfile.md" )
I used to work for a company that had all their presentations linked to excel sheets and it worked fine for some broad definition of fine.
If you have to keep Powerpoint as a presentation format, I'd advise to not use R for creating it. There are some packages that create great connections to Office products, but in my experience they break easily with development of packages and R versions. (one of the ones I played in the past, would recreate ggplot2 plots with actual shapes and lines on Powerpoint, resulting in a huge file)
With that in mind, I'd advise you to create the results programatically and dump it in an excel spread sheet and build the presentation linked to that spread sheet. To keep you sane, I'd do one file per month (if that is the kpi's periodicity). There are many nice packages that create excel spreadsheets, but I'd stick to csv files for their simplicity.
I recommend that you take a look at SlideMight, a utility for merging data with PowerPoint templates; both text and images, in slides and tables. The usage is in principle similar to mail merge, with some more advanced stuff.
Possibly a solution for you would be to have the R program first write your data in a YAML, JSON or XML file; then it invokes the command-line version of SlideMight.
See www.slidemight.com.
Disclaimer: I am the developer and seller of SlideMight.

Is there a way to knitr markdown straight out of your workspace using RStudio?

I wonder whether I can use knitr markdown to just create a report on the fly with objects stemming from my current workspace. Reproducibility is not the issue here. I also read this very fine thread here.
But still I get an error message complaining that the particular object could not be found.
1) Suppose I open a fresh markdown document and save it.
2) write a chunk that refers to some lm object in my workspace. call summary(mylmobject)
3) knitr it.
Unfortunately the report is generated but the regression output cannot be shown because the object could not be found. Note, it works in general if i just save the object to .Rdata and then load it directly from the markdown file.
Is there a way to use objects in R markdown that are in the current workspace?
This would be really nice to show non R people some output while still working.
RStudio opens a new R session to knit() your R Markdown file, so the objects in your current working space will not be available to that session (they are two separate sessions). Two solutions:
file a feature request to RStudio, asking them to support knitting in the current R session instead of forcibly starting a new session;
knit manually by yourself: library(knitr); knit('your_file.Rmd') (or knit2html() if you want HTML output in one step, or rmarkdown::render() if you are using R Markdown v2)
Might be easier to save you data from your other session using:
save.image("C:/Users/Desktop/example_candelete.RData")
and then load it into your MD file:
load("C:/Users/Desktop/example_candelete.RData")
The Markdownreports package is exactly designed for parsing a markdown document on the fly.
As Julien Colomb commented, I've found the best thing to do in this situation is to save the large objects and then load them explicitly while I'm tailoring the markdown. This is a must if your data is coming through an ODBC and you don't want to run the entire queries repeatedly as you tinker with fonts and themes.

Create and save R's default codebooks as a pdf

If I load data(mtcars) it comes with a very neat codebook that I can call using ?mtcars.
I'm interested to document my data in the same way and, furthermore, save that neat codebook as a pdf.
Is it possible to save the 'content' of ?mtcars and how is it created?
Thanks, Eric
P.S. I did read this thread.
update 2012-05-14 00:39:59 PDT
I am looking for a solution using only R; unfortunately I cannot rely on other software (e.g. Tex)
update 2012-05-14 09:49:05 PDT
Thank you very much everyone for the many answers.
Reading these answers I realized that I should have made my priorities much clearer. Therefore, here is a list of my priorities in regard to this question.
R, I am looking for a solution that is based exclusively on R.
Reproducibility, that the codebook can be part of a automated script.
Readability, the text should be easy to read.
Searchability, a file that can be open with any standard software and searched (this is why I thought pdf would be a good solution, but this is overruled by 1 through 3).
I am currently labeling my variables using label() from the Hmisc package and might end up writing a .txt codebook using Label() from the same package.
(I'm not completely sure what you're after, but):
Like other package documentation, the file for mtcars is an .Rd file. You can convert it into other formats (ASCII) than pdf, but the usual way of producing a pdf does use pdflatex.
However, most information in such an .Rd file is written more or less by hand (unless you use yet another R package like roxygen/roxygen2 help you to generate parts of it automatically.
For user-data, usually Noweb is much more convenient.
.Rnw -Sweave-> -> .tex -pdflatex-> pdf is certainly the most usual way with such files.
However, you can use it e.g. with Openoffice (if that is installed) or use it with plain ASCII files instead of TeX.
Have a look at package knitr which may be easier with pure-ASCII files. (I'm not an expert, just switching over from Sweave)
If html is an option, both Sweave and knitr can work with that.
I don't know how to get the pdf of individual data sets but you can build the pdf of the entire datasets package from the LaTeX version using:
path <- find.package('datasets')
system(paste(shQuote(file.path(R.home("bin"), "R")),"CMD",
"Rd2pdf",shQuote(path)))
I'm not sure on this but it only makes sense you'd have to have some sort of LaTeX program like MikTex. Also I'm not sure how this will work on different OS as mine is windows and this works for me.
PS this is only a partial answer to your question as you want to do this for your data, but if nothing else it may get the ball rolling.
The help page that is displayed when entering ?mtcars is generated from an .Rd file, which is a LaTeX-like file that is used for all of R's help pages. Although .Rd files are LaTeX-like, you don't actually need to know LaTeX to read or write them. The actual mtcars.Rd file is available here: http://commondatastorage.googleapis.com/jthetzel-public/mtcars.Rd , which can be viewed with any text editor.
.Rd files included in the ./man directory of a package are converted to .html files when installing the package. They are converted by functions in the "tools" package.. If you would like functionality like ?mtcars for your datasets, you would need to create a package for them. That might sound complicated if you have never created a package before, but it is easy enough to learn and will make you a better R programmer. There are a number of examples of dataset-only packages on CRAN, for example msProstate: http://cran.r-project.org/web/packages/msProstate/index.html . Consider downloading the package source to see how it is organized.
For more information on creating your own packages, writing .Rd files, and building packages:
http://cran.r-project.org/doc/manuals/R-exts.html, especially "1.1.5 Data in packages".
Edit
And if you want to convert the .Rd file in your package to a .pdf, you can do so when building your package, but you will need a LaTeX compiler. If you are on Windows, see here: http://cran.r-project.org/bin/windows/Rtools/ .
You can't create a PDF with just R; you need to use other software that creates PDFs.
You could use a combination of utils::promptData, tools::Rd2HTML, and a simple custom function to open the created HTML file in the users' browser.
It would probably be easier to just make a package containing your data sets. Look at the "datasets" package for an example.
It looks like that if you want to generate a pdf, an external tool like LaTeX is always needed. I would recommend using a simple ASCII text format to generate such a file. In principle the .Rd files are also ASCII text, but I do not find them particularly readable.
Instead, I would recommend using a plain text ASCII format such as Markdown (which is e.g. used on StackOverflow) to write the text file. Such a file is already much more readable than an .Rd formatted file, and as a bonus it can quite easily be processed into a PDF should you choose to do so later on. The knitr package I think is capable of generating PDF files from Markdown sources. In addition, knitr allows you to mix in R code in the Markdown text. This code can be evaluated and the results (even figures) added to the resulting PDF.
In practice you can use sprintf to generate character vectors that you can pipe to a file in order to dynamically generate the markdown text. Just write the template one time, and mark the places for the text you want to add later like this:
base_text = "
First header
============
This document was generated on %s, by %s.
"
text_forfile = sprintf(text, some_date, some_name)
Just dump the text in text_forfile to a .md file and your done, no external tools needed. See this post on SO for how dump text to a file.

practically getting started with Sweave

my question(s) might be less general than the title suggests. I am running R on Mac OS X with a MySQL database to store the data. I have been working with the Komodo / Sciviews-R for some time. Recently I had the need for auto-generated reports and looked into Sweave. I guess StatET / Eclipse appears to be the "standard" solution for Sweavers.
1) Is it reasonable to switch from Komodo to StatET Eclipse? I tried StatET before but chose Komodo over StatET because I liked the calltip / autosuggest and the more convenient config from Komodo so much.
2) What´s a reasonable workflow to generate Sweave files? Usually I develop my R code first and then care about the report later. I just learned today that there is one file in Sweave that contains R code and Latex code at once and that from this file the .tex document is created. While the example files look handily and can't really imagine how to enter my 250 + lines of R code to a file and mixed it up with Latex.
Is it possible to just enter the qplot() and ggplot() statements to a such a document and source the functionality like database connection and intermediate results somehow?
Or is it just a matter of being used to the mix of Latex and R code?
Thx for any suggestions, hints, links and back-to-the-roots-shout-outs…
You've asked several questions, so here's several answers;
Is StatEt/Eclipse the right way to do Sweave ?
Not nessarily (note: I'm an avid StatEt/Eclipse user, and use it for both pure R and Sweave/R and love it, I haven't used Komodo / sciviews-R). You should be able to run the sweave command from any R command line which will generate a .tex file. You can then turn the .tex file into something readable (like pdf) from any tex environment.
What's a good Sweave workflow ?
When I have wanted to turn an r script into a sweave report I generaly start with an empty sweave template and copy/paste my entire R script into a sweave R block just after the title, i.e;
<<label=myEntireRScript, echo=false, include=false>>
#Insert code here
myTable<-dataframe(...)
myPlot<-qplot(....)
#
Then I go through and find the parts I want to report. For instance, if i want to put a table into the report, I'll cut the R block and put an xtable block in, and the same for variables and plots.
<<label=myEntireRScript, echo=false, include=false>>=
#Insert code here
#
Put any text I want before my table here, maybe with a \Sexpr{print(variable)} named variable
<<label=myTable, result=Tex>>=
myTable<-dataframe(...)
print(xtable(mytable,...),...)
#
Any text I want before my figure
<label=myplot, result=figure>>=
myPlot<-qplot(....)
print(qplot)
#
You may want to look at these related SO posts. The rest of my post relates to your question 2.
When creating reports with Sweave, I usually keep most of the R code and the report text separate. If the R code is fast to run, then I prefer I will include something like the following at the start of the .Rnw file:
<<>>
source('/path/to/script.r')
#
On the other hand, if the R code takes a long time, I will often include something like the following at the end of the R script:
Sweave('/path/to/report.Rnw'); system('pdflatex report.tex')
That way, I can re-generate the report quickly, without needing to run all the R code again. Then, the only work R has to do in the Sweave file is print tables, make graphs and maybe extract a few figures.
Like nullglob, I prefer to keep the R and Sweave files separate, but I prefer to save the workspace with save.image() rather than to source() the file. This avoids running the R calculations with each .Rnw file compiling (and I always end up tinkering with the typesetting more than I'd like).
My general work flow is to do each paper/project in it's own folder with it's own R file(s). When the calculation side is "done", I save.image() to store all the workspace variables as-is.
Then, in the .Rnw file in the same directory I set the working directory with setwd() and load all variables with load(".Rdata"). Of course, you can change the name you use for your workspace, but I do one workspace per folder and keep the default name. Oh, and if you tinker with the R file, be sure save the workspace image and watch out for variables that linger in the workspace and .Rnw file, but are no longer part of the R file... this is where the save.image() approach can cause some headaches.
I am on a Mac and I suggest TextMate if you're mildly geeky and emacs/ess if you're really geeky. I use vim and command line R, but emacs/ess works best for most. If you're in this for the long haul, I doubt you'll regret learning emacs/ess for R, Sweave, and LaTeX.

R: dev.copy2pdf, multiple graphic devices to a single file, how to append to file?

I have a script that makes barplots, and opens a new window when 6 barplots have been written to the screen and keeps opening new graphic devices whenever necessary.
Depending on the input, this leaves me with a potential large number of openened windows (graphic devices) which I would like to write to a single PDF file.
Considering my Perl background, I decided to iterate over the different graphics devices, printing them out one by one. I would like to keep appending to a single PDF file, but I do not know how to do this, or if this is even possible. I would like to avoid looping in R. :)
The code I use:
for (i in 1:length(dev.list())
{
dev.set(which = dev.list()[i]
dev.copy2pdf(device = quartz, file = "/Users/Tim/Desktop/R/Filename.pdf")
}
However, this is not working as it will overwrite the file each time. Now is there an append function in R, like there is in Perl. Which allows me to keep adding pages to the existing pdf file?
Or is there a way to contain the information in a graphic window to a object, and keep adding new graphic devices to this object and finally print the whole thing to a file?
Other possible solutions I thought about:
writing different pdf files, combining them after creation (perhaps even possible in R, with the right libraries installed?)
copying the information in all different windows to one big graphic device and then print this to a pdf file.
Quick comments:
use the onefile=TRUE argument which gets passed through to pdf(), see the help pages for dev.copypdf and pdf
as a general rule, you may find it easier to open the devices directly; again see help(pdf)
So in sum, add onefile=TRUE to you call and you should be fine but consider using pdf() directly.
To further elaborate on the possibility to append to a pdf. Although, multiples graphs can be put easaly into one file it turns out that it is impossiple or at least not simple to really append a pdf once finished by dev.off() - see here.
I generate many separate pages and then join them with something like system('pdfjam pages.pdf -o output.pdf' )*

Resources