Getting foreach() and ggplot2 to get along

I have a set of survey data, and I'd like to generate plots of a particular variable, grouped by the respondent's country. The code I have written to generate the plots so far is:
countries <- isplit(drones, drones$v3)
foreach(country = countries) %dopar% {
png(file = paste(output.exp, "/Histogram of Job Satisfaction in ", country$key[[1]], ".png", sep = ""))
country.df <- data.frame(country) #ggplot2 doesn't appreciate the lists nextElem() produces
ggplot(country.df, aes(x = value.v51)) + geom_histogram()
dev.off()
}
The truly bizarre thing? I can run the isplit(), set country <- nextElem(countries), and then run through the code without sending the foreach line - and get a lovely plot. If I send the foreach, I get some blank .png files.
I can definitely do this with standard R loops, but I'd really like to get a better grasp on foreach.

You need to print the plot if you want it to display:
print(ggplot(country.df, aes(x = value.v51)) + geom_histogram())
By default, a ggplot command returns a plot object but does not display it; displaying is done by print(). When you run code interactively, the result of a top-level command is printed automatically, which is why you often don't need the explicit print. Inside the body of a foreach, however, results are not echoed, so you need to call print() explicitly.
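As a minimal sketch of that fix, assuming the built-in mtcars data as a stand-in for the survey data, tempdir() as a hypothetical output directory, and %do% in place of %dopar% so that no parallel backend has to be registered:

```r
library(foreach)
library(iterators)
library(ggplot2)

out_dir <- tempdir()  # hypothetical output directory (assumption)

# isplit() on a data frame yields list(value = <subset>, key = <group values>)
groups <- isplit(mtcars, mtcars$cyl)

files <- foreach(g = groups, .combine = c) %do% {
  path <- file.path(out_dir, paste0("hist_mpg_cyl_", g$key[[1]], ".png"))
  png(filename = path)
  # print() is what actually draws the plot on the open device
  print(ggplot(g$value, aes(x = mpg)) + geom_histogram(bins = 10))
  dev.off()
  path  # return the file name so the loop result lists what was written
}
```

The same body should work under %dopar% once a backend is registered, since each worker opens and closes its own device.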

Related

Using a plot call in a function without evaluating it

In an R Markdown document, I want to accelerate the knitting process by only building the plots when they have not already been built and saved.
I did this using the code below, with a simple example.
x = rnorm(10)
if (! "figurename.png" %in% dir("figure")) {
png("figure/figurename.png")
hist(x)
dev.off()
}
Now I want to make a function, that does the above command automatically, with a plot call as input. Also, the plot call should not have been evaluated (too slow!). I learned about the substitute command and I wrote this :
x = rnorm(10)
plot_call = substitute(hist(x))
function(plot_call, figurename){
if (! figurename %in% dir("figure")) {
png(file.path("figure", figurename))
eval(plot_call)
dev.off()
}
knitr::include_graphics(file.path("figure", figurename))
}
I have two issues with this :
it does not seem to work with multiple lines plot calls
it seems like a dubious hack
What do you think? Is there a better way ?
A more formal way to cache code blocks is to leverage chunk options. Adding cache = TRUE to the chunk header means the plot is re-evaluated only when the chunk options or the code itself change; otherwise the cached result is reused:
```{r expensive_plot, cache = TRUE}
# Some expensive plot
df %>%
ggplot(aes(x, y)) +
geom_point()
```
If the plot needs to be recomputed each time the underlying data changes, you can invalidate the cache whenever the file's modification time changes by adding cache.extra = file.mtime('your-csv.csv') to the chunk options.
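If a hand-written helper is still preferred over chunk caching, one possible sketch captures the plotting code with substitute() inside the function, so the caller can pass a multi-line block in braces. The name cached_plot and the tempdir() location are assumptions for illustration, not part of knitr:

```r
# Hypothetical helper: draws `plot_expr` into `path` only if the file is missing.
cached_plot <- function(plot_expr, path) {
  expr <- substitute(plot_expr)      # capture the code without evaluating it
  if (!file.exists(path)) {
    png(path)
    on.exit(dev.off(), add = TRUE)   # close the device even if plotting fails
    eval(expr, envir = parent.frame())
  }
  invisible(path)
}

x <- rnorm(10)
fig <- file.path(tempdir(), "figurename.png")  # hypothetical location

# A multi-line plot call is passed as one braced block
cached_plot({
  hist(x)
  abline(v = mean(x), lty = 2)
}, fig)
```

In an R Markdown chunk, the returned path could then be handed to knitr::include_graphics(). On a second call with the same path the expression is never evaluated, because the file already exists.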

R: Create a User-defined Function that Plots data and then Exports the plot

Simple objective: I'm trying to create a user-defined fn (udf) that will allow me to combine a few lines of code so it's condensed and easier to read.
Let's take an easy example. In one fn, I'd like to plot some data and then export the plot using the tiff fn (or any similar fn: png, jpeg, ...). However, I want to be able to use any plot function of my choice, so I thought passing a code block (let's call it 'plotcode') to a fn call would work just fine. Something along the lines of:
udf <- function(plotcode, title){
plotcode
tiff(filename=title)
plotcode
dev.off()
}
The ultimate goal being to turn 4 lines of code into 1, i.e.:
plot(dat) # ====> udf(plotcode = plot(dat), title = "plot.tiff")
tiff("myplot.tiff")
plot(dat)
dev.off()
Unfortunately, it only plots the data and does not export it to a file. From what I understand, it's because I'm opening a graphics device when I use the export function, so it doesn't recognize the plotcode anymore?
I did find one alternative solution, the do.call fn, but it makes you separate the plotcode into two pieces: (1) fn name & (2) arguments. I'm hoping to keep the plotcode together so it's easier to copy/paste later.
Maybe there's even another option I'm not thinking of?
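The following code lets you pass the plotting code as a string, evaluate it, and save the resulting plot. It relies on ggsave(), so it applies to ggplot2 plots rather than base graphics.
library(tidyverse)
udf <- function(plotcode, title){
p <- eval(parse(text = plotcode)) # parse the string into a call, then evaluate it
ggsave(title, plot = p) # save that plot explicitly
}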
udf('qplot(mtcars$mpg)', 'plot.jpg')
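For base graphics, a minimal sketch could rely on R's lazy evaluation of arguments instead. In the question's version the first bare plotcode forces the argument before the file device is open, and the second reference only reuses the already-computed value, so nothing is drawn into the file. Opening the device first and forcing the argument afterwards avoids that. The device parameter and the png default are assumptions added for illustration; tiff or jpeg could be passed the same way:

```r
# Hypothetical rewrite of udf(): open the device first, then force the argument.
udf <- function(plotcode, title, device = png) {
  device(title)                   # open the file device (png, tiff, jpeg, ...)
  on.exit(dev.off(), add = TRUE)  # always close it again
  force(plotcode)                 # the plotting code is evaluated here, on the open device
  invisible(title)
}

dat <- data.frame(x = 1:10, y = (1:10)^2)
out <- file.path(tempdir(), "plot.png")  # hypothetical output file
udf(plotcode = plot(dat), title = out)
```

Because an argument is evaluated only once, this version writes to the file but does not also draw on screen; doing both would need the code captured with substitute() and evaluated twice.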

Why am I unable to see multiple plots appear with a for loop?

This is my code which is part of a larger script.
for(d1 in names(survD)){
survfit1 <- survfit(Surv(time=survD[[d1]][,"time"],
event=survD[[d1]][,"death"],type='right')~1)
png(paste(survPath,"/surv_",d1,".png",sep=""))
plot(survfit1,xlab="Years",ylab="Survival probability",xmax=xmax1)
}
I don't have a good idea of what this code does yet, so I'm trying to look at each individual plot to see what it is. The problem is, whenever I run this from the R command line in a Linux terminal, nothing appears. I have to use dev.off() multiple times and then rerun this code:
plot(survfit1)
for something to appear. How can I see all the plots?
Sounds like this is really what you want:
for(d1 in names(survD)){
survfit1 <- survfit(Surv(time=survD[[d1]][,"time"],
event=survD[[d1]][,"death"],type='right')~1)
x11() ## open up new graphical window for each plot (to avoid overwriting)
plot(survfit1,xlab="Years",ylab="Survival probability",
xmax=xmax1, main = d1) ## use different titles to distinguish those plots
}
This will produce plots on normal graphical windows.
If you want to keep the original code, close each device with dev.off() after plotting:
for(d1 in names(survD)){
survfit1 <- survfit(Surv(time=survD[[d1]][,"time"],
event=survD[[d1]][,"death"],type='right')~1)
png(paste(survPath,"/surv_",d1,".png",sep=""))
plot(survfit1,xlab="Years",ylab="Survival probability",xmax=xmax1)
dev.off()
}
Then, have a look at the directory given by survPath (resolved relative to getwd() if it is a relative path). All the plots are saved there as png files.
Calling Sys.sleep(.1) might help during the for loop. Maybe try:
for(d1 in names(survD)){
survfit1 <- survfit(Surv(time=survD[[d1]][,"time"],
event=survD[[d1]][,"death"],type='right')~1)
Sys.sleep(.1)
png(paste(survPath,"/surv_",d1,".png",sep="", collapse=""))
plot(survfit1,xlab="Years",ylab="Survival probability",xmax=xmax1)
dev.off()
}
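The need to call dev.off() several times in the question comes from each png() call leaving a device open. A small sketch of how that can be inspected and cleaned up, using only base R (the file names under tempdir() are hypothetical):

```r
graphics.off()  # start from a clean state with no open devices

for (i in 1:3) {
  png(file.path(tempdir(), paste0("leak_", i, ".png")))
  plot(1:10)
  # no dev.off() here, so the device stays open after each iteration
}

n_open <- length(dev.list())  # number of devices still open
graphics.off()                # close all of them in one call
remaining <- dev.list()       # NULL once nothing is open
```

dev.list() is a quick way to check whether earlier code left devices behind before plotting to the screen.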

Creating Multiple Plots in Multiple files with ggplot2 in R

Having an unusual problem with creating multiple files in R with ggplot2.
I've got multiple plots to create for multiple people, so I'm creating all the plots for each person in a pdf. So it goes something like this...
for(i in 1:10)
{
pdf(paste("person",i,".pdf",sep=""))
ggplot2(...)+.........
ggplot2(...)+.........
ggplot2(...)+.........
ggplot2(...)+.........
dev.off()
}
I've verified that all the code to create the plots is working and that creating a single pdf works, no problems there. The problem arises when I try to run the loop: it creates the files, but they're blank. I've tried everything I can think of and can't seem to find any information about this. I've tried in RStudio (Windows) and command line (ubuntu), both create the same issue.
Any insight or an alternative would be appreciated, thanks
You need to call print() on each plot you want to output into the pdf.
library(ggplot2)
dat = data.frame(x1=rnorm(10), x2=rnorm(10))
for(i in 1:2){
pdf(paste("person",i,".pdf",sep=""))
p1 = ggplot(dat, aes(x=x1)) + geom_histogram()
p2 = ggplot(dat, aes(x=x2)) + geom_histogram()
print(p1)
print(p2)
dev.off()
}
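An alternative that sidesteps the explicit print() is ggsave(), which takes the plot object directly. This is only a sketch: it writes one file per plot rather than one multi-page PDF per person, and the file names and tempdir() location are assumptions:

```r
library(ggplot2)

dat <- data.frame(x1 = rnorm(10), x2 = rnorm(10))
out_dir <- tempdir()  # hypothetical output directory

files <- character(0)
for (v in names(dat)) {
  # .data[[v]] looks the column up by name inside aes()
  p <- ggplot(dat, aes(x = .data[[v]])) + geom_histogram(bins = 5)
  path <- file.path(out_dir, paste0("hist_", v, ".png"))
  ggsave(path, plot = p, width = 5, height = 4)  # no device handling or print() needed
  files <- c(files, path)
}
```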

Print to PDF in a for loop

I want to loop over a plot and put the result of the plot in a PDF.
The code below loops 3 times and plots 3 different plots from the iris dataset, then should save them to the C:/ drive. The PDF files are created, but are corrupted.
for(i in 1:3){
pdf(paste("c:/", i, ".pdf", sep=""))
plot(cbind(iris[1], iris[i]))
dev.off()
}
To draw lattice plots on the device, one needs to print the object produced by a call to one of the lattice graphics functions. Normally, in interactive use, R auto-prints objects if they are not assigned. In loops, however, auto-printing does not work, so one must arrange for the object to be printed, usually by wrapping it in print().
Here is an example (please excuse my abuse of the formula notation ;-):
require(lattice)
for(i in 1:3) {
pdf(paste("plot", i, ".pdf", sep = ""))
print(xyplot(iris[,1] ~ iris[,i], data = iris))
dev.off()
}
This produces the three plots on a pdf device.
Is a file name that contains "c:/" a valid file name on your OS? That looks like part of the working directory that you'd want to set before calling pdf. I get an error telling me it can't open that file:
Error in pdf(paste("c:/", i, ".pdf", sep = "")) :
cannot open file 'c:/1.pdf'
If I drop the "c:/" bit from the file name, three PDFs are generated properly. Also, if you move the pdf() call before the for loop and dev.off() after it, you'll get a single PDF with three pages instead of three PDFs. May or may not be what you want...
for(i in 1:3){
pdf(paste("plot", i,".pdf",sep=""))
plot(cbind(iris[1],iris[i]))
dev.off()
}
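For the single-file variant, a minimal sketch would open the device once before the loop and close it once afterwards, so that each plot() call starts a new page. The file name and the choice of columns 2 to 4 are assumptions for illustration:

```r
out <- file.path(tempdir(), "iris_plots.pdf")  # hypothetical file name

pdf(out)  # open one device before the loop
for (i in 2:4) {
  # each high-level plot() call starts a new page in the same PDF
  plot(iris[[1]], iris[[i]], xlab = names(iris)[1], ylab = names(iris)[i])
}
dev.off()  # close the device once, after the last page
```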
