I am trying to create a PDF with several plots, four per page. I have the following code in R, which works but leaves the first page empty:
pdf("Plots/plots_numeric_four_in_page.pdf",paper="a4r",width = 14)
graphlist <- lapply(3:NCOL(agg_num), function(i) {
force(i)
tempColName=dataName_num[i]
print (tempColName)
p<-qplot(Group.1,agg_num[[tempColName]],data = agg_num,color=Group.2,geom = "line",main=tempColName) + xlab("Date") + ylab(paste("Count of ", tempColName)) + geom_line(size=1.5)+ scale_x_date(labels = date_format("%m/%Y"))+
theme(legend.position="bottom",legend.direction="horizontal")+ guides(col=guide_legend(ncol=3))
})
do.call("marrangeGrob",c(graphlist,ncol=2,nrow=2))
dev.off()
It correctly writes around 50 plots to the PDF, 4 per page. However, it leaves the first page empty and starts from the second. I looked at the marrangeGrob options, but I couldn't find anything that addresses the problem. Do you know of any workaround or way to resolve this issue?
There's a known bug between ggplot2 and gridExtra that causes this for some marrangeGrob results that contain ggplots. Manually overriding the grid.draw.arrangelist function (src) (marrangeGrob returns an arrangelist object) may fix it (suggested here).
grid.draw.arrangelist <- function(x, ...) {
  for(ii in seq_along(x)){
    if(ii>1) grid.newpage() # skips grid.newpage() call the first time around
    grid.draw(x[[ii]])
  }
}
It may be safer to define a new class for the arrangelist object in question and apply the fix to that class than to override grid.draw for every marrangeGrob call in scope.
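The subclass idea could be sketched roughly as follows. This is only an illustration under assumptions: the class name arrangelist_no_blank and the output path are made up for the example, and simple text grobs stand in for the ggplot objects in graphlist. The method body is the same loop as the override above.

```r
library(grid)
library(gridExtra)

# Hypothetical subclass name: only objects tagged with it use the patched loop,
# so other arrangelist objects keep gridExtra's default drawing behaviour.
grid.draw.arrangelist_no_blank <- function(x, ...) {
  for (ii in seq_along(x)) {
    if (ii > 1) grid.newpage()  # no new page before the first one
    grid.draw(x[[ii]])
  }
  invisible(x)
}

# Six placeholder grobs standing in for the ggplot objects in graphlist
grobs <- lapply(1:6, function(i) textGrob(paste("plot", i)))
pages <- marrangeGrob(grobs = grobs, ncol = 2, nrow = 2)
class(pages) <- c("arrangelist_no_blank", class(pages))

out_file <- file.path(tempdir(), "four_per_page.pdf")  # hypothetical output path
pdf(out_file, paper = "a4r", width = 14)
grid.draw(pages)  # dispatches to the method defined above
dev.off()
```

Note that the object has to be drawn with an explicit grid.draw() call. Relying on auto-printing would go through the print method for arrangelist objects instead of the method defined here.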
I have recently been going through some of the modules on Software Carpentry's "Programming in R" lesson, and I have experienced problems displaying some of my plots in RStudio.
Here is the link to the site:
http://swcarpentry.github.io/r-novice-inflammation/04-cond/index.html
Here is the page with the download link for the inflammation data files (r-novice-inflammation-data.zip) if necessary:
http://swcarpentry.github.io/r-novice-inflammation/setup.html
And the practice problem that I am on is called "Choosing Plots Based on Data"
I was supposed to create a function that creates a box plot or stripchart for a file on patient inflammation data based on whether the vector of that data that is called meets a certain threshold. Here is my code:
dat <- read.csv(file = "data/inflammation-01.csv", header = FALSE)
plot_dist <- function(x, threshold) {
if (length(x) > threshold) {
boxplot(x)
} else {
stripchart(x)
}
}
plot_dist(dat[, 10], threshold = 10)
When I call the function, however, no plots are displayed. I checked the Plots pane in RStudio, but found nothing there either. On my console, the output after calling this function is something like:
Does anyone have any ideas why my plots aren't being displayed? I don't think there is anything wrong with the function, since I don't get any errors after calling it, it just doesn't plot anything. Thanks much!
In an R Markdown document, I want to accelerate the knitting process by building the plots only when they have not already been built and saved.
I did this using the code below, with a simple example.
x = rnorm(10)
if (! "figurename.png" %in% dir("figure")) {
png("figure/figurename.png")
hist(x)
dev.off()
}
Now I want to write a function that does the above automatically, taking a plot call as input. The plot call should not be evaluated beforehand (too slow!). I learned about the substitute command and wrote this:
x = rnorm(10)
plot_call = substitute(hist(x))
function(plot_call, figurename){
if (! figurename %in% dir("figure")) {
png(file.path("figure", figurename))
eval(plot_call)
dev.off()
}
knitr::include_graphics(file.path("figure", figurename))
}
I have two issues with this :
it does not seem to work with multiple lines plot calls
it seems like a dubious hack
What do you think? Is there a better way ?
A more formal way to cache code blocks is to use chunk options. With cache = TRUE in the chunk header, the plot is re-evaluated only when the chunk options or the code itself change; otherwise the cached result is reused:
```{r expensive_plot, cache = TRUE}
# Some expensive plot
df %>%
ggplot(aes(x, y)) +
geom_point()
```
If the plot needs to be recomputed whenever the underlying data changes, you can invalidate the cache each time the file's modification time changes by adding cache.extra = file.mtime('your-csv.csv') to the chunk options.
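Put together, a chunk using both options might look like the sketch below. The chunk label is made up for the example, 'your-csv.csv' is the placeholder file name from above, and the plot call is just a stand-in for whatever expensive plot you have.

```{r expensive_plot_from_file, cache = TRUE, cache.extra = file.mtime('your-csv.csv')}
# Hypothetical chunk label; replace 'your-csv.csv' with the real data file
df <- read.csv('your-csv.csv')
plot(df)  # stand-in for the expensive plot
```

The cache is then rebuilt when the code, the chunk options, or the file's modification time changes.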
The following code produces an image. No problem.
change <- function(score, d, k, p) {k*(score - 1/(1+k^(d/p)))}
parameters <- c(10:110)
colorshelf <-rainbow(length(parameters), start=1/6) #yellow is low
for(i in seq_along(parameters)) {
curve(change(score=1, d=x, k=parameters[i], p=-800), from=-500, to=500, add=T, ylim=c(0, 100), col=colorshelf[i], xlab="rating difference", ylab="gain for winning")
}
legend.index <- round(quantile(seq_along(parameters)))
legend.param <- legend.index + min(parameters)
legend.color <- colorshelf[legend.index]
legend("right", title="k-factor", lty=c(1,1), legend=legend.param, col=legend.color)
Now I would like to save the image to a file with specified resolution. So I add:
png(filename="gain by ratingdiff.png", res=30, width = 1000, height = 1000)
and
dev.off()
before and after the code block. But then I get two errors complaining that plot.new has not been called yet.
I know this issue has come up a million times, and there are many posts about it here on Stack Overflow, but none of them really helped me. I tried adding plot.new() at different places in the code, but that did not help.
The help page on plot.new() reads:
"This function (frame is an alias for plot.new) causes the completion of plotting in the current plot (if there is one) and an advance to a new graphics frame. This is used in all high-level plotting functions and also useful for skipping plots when a multi-figure region is in use. "
But is this really what I want? I mean, I want to draw everything in one graphics device, so why would I want to cause the completion of plotting, except maybe at the end of the code.
Others have suggested, the problem is related to the usage of RStudio, but I do not use RStudio. I use Notepad++ in combination with NppToR.
Also, someone suggested to add { } around the code block (did not work).
Please help.
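curve() is called here with add=T, which draws onto an existing plot. On a freshly opened png() device no plot exists yet, so plot() has to be called before curve(). That is why the problem only appears when saving the plot.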
Before running:
for(i in seq_along(parameters)) {
curve(change(score=1, d=x, k=parameters[i], p=-800), from=-500, to=500, add=T, ylim=c(0, 100), col=colorshelf[i], xlab="rating difference", ylab="gain for winning")}
you need to call plot() to set up the plotting region, axis limits, and labels for your figure.
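A minimal sketch of that fix, assuming an empty plot (type = "n") is an acceptable way to set up the axes. The output path in tempdir() is made up for the example; the function, parameters, and axis settings are taken from the question.

```r
change <- function(score, d, k, p) k * (score - 1 / (1 + k^(d / p)))
parameters <- 10:110
colorshelf <- rainbow(length(parameters), start = 1/6)  # yellow is low

out_file <- file.path(tempdir(), "gain_by_ratingdiff.png")  # hypothetical path
png(filename = out_file, res = 30, width = 1000, height = 1000)

# Set up an empty plot first so that curve(..., add = TRUE) has something to draw on
plot(0, 0, type = "n", xlim = c(-500, 500), ylim = c(0, 100),
     xlab = "rating difference", ylab = "gain for winning")

for (i in seq_along(parameters)) {
  curve(change(score = 1, d = x, k = parameters[i], p = -800),
        from = -500, to = 500, add = TRUE, col = colorshelf[i])
}

dev.off()
```

The legend code from the question can go between the loop and dev.off() unchanged. An alternative with the same effect would be to pass add = (i > 1) so that the first curve() call creates the plot itself.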
I'm trying to save a ggplot within a function using graphics devices. But I found the code produces empty graphs. Below is a very very simple example.
library(ggplot2)
ff <- function(){
jpeg("a.jpg")
qplot(1:20, 1:20)
dev.off()
}
ff()
If I only run the content of the function, everything is fine. I know that using ggsave() will do the thing that I want, but I am just wondering why jpeg() plus dev.off() doesn't work. I tried this with different versions of R, and the problem persists.
You should use ggsave instead of the jpeg(); print(p); dev.off() sequence. ggsave is a wrapper that does exactly what you intend to do with your function, except that it offers more options and versatility. You can specify the type of output explicitly, e.g. jpg or pdf, or it will guess from your filename extension.
So your code might become something like:
p <- qplot(1:20, 1:20)
ggsave(filename="a.jpg", plot=p)
See ?ggsave for more details
The reason why your original code did not work is indeed a frequently asked question (on Stack Overflow as well as in the R FAQ on CRAN). You need to insert a print statement to print the plot. In the interactive console, the print is silently executed in the background.
These plots have to be printed:
ff <- function(){
  jpeg("a.jpg")
  p <- qplot(1:20, 1:20)
  print(p)
  dev.off()
}
ff()
This is a very common mistake.
I have a set of survey data, and I'd like to generate plots of a particular variable, grouped by the respondent's country. The code I have written to generate the plots so far is:
countries <- isplit(drones, drones$v3)
foreach(country = countries) %dopar% {
png(file = paste(output.exp, "/Histogram of Job Satisfaction in ", country$key[[1]], ".png", sep = ""))
country.df <- data.frame(country) #ggplot2 doesn't appreciate the lists nextElem() produces
ggplot(country.df, aes(x = value.v51)) + geom_histogram()
dev.off()
}
The truly bizarre thing? I can run the isplit(), set country <- nextElem(countries), and then run through the code without sending the foreach line - and get a lovely plot. If I send the foreach, I get some blank .png files.
I can definitely do this with standard R loops, but I'd really like to get a better grasp on foreach.
You need to print the plot if you want it to display:
print(ggplot(country.df, aes(x = value.v51)) + geom_histogram())
By default, ggplot commands return a plot object but the command itself does not actually display the plot; that is done with the print command. Note that when you run code interactively, results of commands get printed which is why you often don't need the explicit print. But when wrapping in a foreach, you need to explicitly print since the results of the commands in the body will not be echoed.
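The same auto-printing rule can be seen without ggplot2 or foreach. The sketch below uses a plain for loop as a stand-in for the foreach body (an assumption made to keep the example dependency-free) and captures what would be echoed to the console.

```r
# A value that is merely computed inside a loop body is never echoed
out_loop <- capture.output(for (i in 1:3) i)

# An explicit print() inside the loop body does produce output
out_print <- capture.output(for (i in 1:3) print(i))

length(out_loop)   # nothing was captured
out_print          # one line per iteration
```

A ggplot object behaves like the bare i here: unless it is passed to print(), nothing is drawn on the open device, so dev.off() writes an empty file.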