Plotting to a file from a function in R - r

Background
Hey everyone!
I'm new to using R, and became interested in using it after having a team member give a tutorial of how useful it can be in an academic setting.
I am trying to write a script to automatically read my data from multiple files and then plot the resultant graphs to multiple files, so that they can be easily added to a manuscript (PowerPoint, latex, etc.)
Problem
I have found that the following code will allow me to produce a graph
p = qplot(factor(step), y, data=x, colour=c))
p = p + theme_bw()
# etc...
wrapping this around a png call will allow me to output the plot to a PNG:
png("test.png")
p = qplot(factor(step), y, data=x, colour=c))
p = p + theme_bw()
# etc...
p
dev.off()
I wanted to put the graph creation into a function so that I can create graphs and consequent seperate PNGs. So I put everything into a function:
func <- function()
{
png("test.png")
p = qplot(factor(step), y, data=x, colour=c))
p = p + theme_bw()
# etc...
p
dev.off()
}
If I call func() A PNG is created, but it is empty. Is there any specific reason why I can do this without the function but can't when I'm calling it from a function?

When using ggplot2 or lattice non-interactively (i.e. not from the command-line), you need to explicitly print() plots that you've constructed. So just do print(p) in the final line of your code, and all should be fine.
This is unintuitive enough that it's one of the most frequent of all FAQs.

Related

How to save several plots from an own function into a list in R?

I have created one function that contains two types of plots and it will give you one image. However, the title of this image will change depending on one list, so, you will have several plots but different title.
(The original function will change the numbers that the plot uses but, in essence, is what I need).
This is the example that I have created.
list_genes <- c("GEN1", "GEN2", "GEN3")
myfunction <- function(x,y){
for(gene in list_genes){
# This to draw both plots
par(mfrow=c(2,1))
plot(x,y, main=paste0("Plot of ", gene))
hist(x, main=paste0("Plot of ", gene))
}
}
myfunction(x=c(1,5,6,2,4),y=c(6,10,53,1,5))
Since the list has 3 elements, we get 3 plots.
However, as I need to create a presentation with all the plots that I generate, I found this post with a solution to create slides for several plots inside a for loop. It is what I want, but for that, I need to save my plots into a list/variable.
object <- myfunction(x=c(1,5,6,2,4),y=c(6,10,53,1,5))
> object
NULL
I found this post (which gives you an interesting solution) but, the plots still cannot be saved into an object.
calling_myfunc <- function(){
myfunction(x=c(1,5,6,2,4),y=c(6,10,53,1,5))
}
calling_myfunc()
object <- calling_myfunc()
> object
NULL
My final objective is to create a presentation (automatically) with all of the plots that I generate from my function. As I saw in this post. But I need to save the plots into a variable.
Could anyone help me with this?
Thanks very much in advance
Although I couldn't find the way to save the plots into an object, I found a way to create a presentation with those images thanks to this post and the export package.
library(export)
list_genes <- c("GEN1", "GEN2", "GEN3")
myfunction <- function(x,y){
for(gene in list_genes){
# This to draw both plots
par(mfrow=c(2,1))
plot(x,y, main=paste0("Plot of ", gene))
hist(x, main=paste0("Plot of ", gene))
graph2ppt(file="plots.pptx", width=6, height=5,append=TRUE) } }
myfunction(x=c(1,5,6,2,4),y=c(6,10,53,1,5))
Of course, the width and height of the plots can be changed or put them as a parameter in the function.
Since the package is not available in CRAN for my current R version (4.1.2), I downloaded it from GitHub:
devtools::install_github("tomwenseleers/export")
In addition, I have found another package that I can use for the same purpose (although it adds one extra slide at the beginning, I don't know why)
library(eoffice)
list_genes <- c("GEN1", "GEN2", "GEN3")
myfunction <- function(x,y){
for(gene in list_genes){
# This to draw both plots
par(mfrow=c(2,1))
plot(x,y, main=paste0("Plot of ", gene))
hist(x, main=paste0("Plot of ", gene))
topptx(file="plots.pptx", width=6, height=5,append=TRUE)
}
}
myfunction(x=c(1,5,6,2,4),y=c(6,10,53,1,5))
PS: I found the solution to create a presentation --> How to create a presentation in R with several plots obtained by a function?

Is there a ggplot2 analogue to the avPlots function in R?

When undertaking regression modelling it is useful to produce added variable plots for the explanatory variables in the model, to check whether the posited relationships to the response variable are appropriate to the data. The avPlots function in the car package in R takes a model input, and produces a grid of added-variable plots using the base graphics system. This function is extremely user-friendly, insofar as all you need to do is put in the model object as an argument, and it automatically produces all the added variable plots for each explanatory variable. This matrix of plots contains all the desired information, but unfortunately the plots look poor, owing to the fact that it uses the base graphics system rather than the ggplot2 package. For example, using data found here (downloaded as the file Trucking.csv) here is the output of the avPlots function.
#Load required libraries
library(car);
#Import data, fit model, and show AV plots
DATA <- read.csv('Trucking.csv');
MODEL <- lm(log(PRICPTM) ~ DISTANCE + PCTLOAD + ORIGIN + MARKET + DEREG + PRODUCT,
data = DATA);
avPlots(MODEL);
Question: Is there an equivalent function in ggplot2 that produces a matrix of each of the added-variable plots for a model, but with "prettier" plots? Is it possible to produce these plots, but then customise them using standard ggplot syntax?
I am not aware of any automated function that produces the added variable plots using ggplot. However, as well as giving a plot output as a side-effect of the function call, the avPlots function produces an object that is a list containing the data values used in each of the added variable plots. It is relatively simple to extract data frames of these variables and use these to generate added variable plots using ggplot. This can be done for a general model object using the following functions.
avPlots.invis <- function(MODEL, ...) {
ff <- tempfile()
png(filename = ff)
OUT <- car::avPlots(MODEL, ...)
dev.off()
unlink(ff)
OUT }
ggAVPLOTS <- function(MODEL, YLAB = NULL) {
#Extract the information for AV plots
AVPLOTS <- avPlots.invis(MODEL)
K <- length(AVPLOTS)
#Create the added variable plots using ggplot
GGPLOTS <- vector('list', K)
for (i in 1:K) {
DATA <- data.frame(AVPLOTS[[i]])
GGPLOTS[[i]] <- ggplot2::ggplot(aes_string(x = colnames(DATA)[1],
y = colnames(DATA)[2]),
data = DATA) +
geom_point(colour = 'blue') +
geom_smooth(method = 'lm', se = FALSE,
color = 'red', formula = y ~ x, linetype = 'dashed') +
xlab(paste0('Predictor Residual \n (',
names(DATA)[1], ' | others)')) +
ylab(paste0('Response Residual \n (',
ifelse(is.null(YLAB),
paste0(names(DATA)[2], ' | others'), YLAB), ')')) }
#Return output object
GGPLOTS }
The function ggAVPLOTS will take an input model and produce a list of ggplot objects for each of the added variable plots. These have been constructed to give "pretty" plots with blue points and a dashed red regression line through each plot. If you want all the added variable plots to show up in a single plot, it is relatively simple to do this using the grid.arrange function in the gridExtra package. Below we apply this to your model and show the resulting plot.
#Produce matrix of added variable plots
library(gridExtra)
PLOTS <- ggAVPLOTS(MODEL)
K <- length(PLOTS)
NCOL <- ceiling(sqrt(K))
AVPLOTS <- do.call("arrangeGrob", c(PLOTS, ncol = NCOL, top = 'Added Variable Plots'))
ggsave('AV Plots - Trucking.jpg', width = 10, height = 10)
It is possible to make whatever alterations you want to these plots in the ggplot code above, so if a user prefers to change the colours, font sizes, etc., this is done using standard syntax in ggplot. This method works by importing the data for the added variable plots from the avPlots function, but once you have done that, you can use this data to produce any kind of plot.

Ggplot does not show plots in sourced function

I've been trying to draw two plots using R's ggplot library in RStudio. Problem is, when I draw two within one function, only the last one displays (in RStudio's "plots" view) and the first one disappears. Even worse, when I run ggsave() after each plot - which saves them to a file - neither of them appear (but the files save as expected). However, I want to view what I've saved in the plots as I was able to before.
Is there a way I can both display what I'll be plotting in RStudio's plots view and also save them? Moreover, when the plots are not being saved, why does the display problem happen when there's more than one plot? (i.e. why does it show the last one but not the ones before?)
The code with the plotting parts are below. I've removed some parts because they seem unnecessary (but can add them if they are indeed relevant).
HHIplot = ggplot(pergame)
# some ggplot geoms and misc. here
ggsave(paste("HHI Index of all games,",year,"Finals.png"),
path = plotpath, width = 6, height = 4)
HHIAvePlot = ggplot(AveHHI, aes(x = AveHHI$n_brokers))
# some ggplot geoms and misc. here
ggsave(paste("Average HHI Index of all games,",year,"Finals.png"),
path = plotpath, width = 6, height = 4)
I've already taken a look here and here but neither have helped. Adding a print(HHIplot) or print(HHIAvePlot) after the ggsave() lines has not displayed the plot.
Many thanks in advance.
Update 1: The solution suggested below didn't work, although it works for the answer's sample code. I passed the ggplot objects to .Globalenv and print() gives me an empty gray box on the plot area (which I imagine is an empty ggplot object with no layers). I think the issue might lie in some of the layers or manipulators I have used, so I've brought the full code for one ggplot object below. Any thoughts? (Note: I've tried putting the assign() line in all possible locations in relation to ggsave() and ggplot().)
HHIplot = ggplot(pergame)
HHIplot +
geom_point(aes(x = pergame$n_brokers, y = pergame$HHI)) +
scale_y_continuous(limits = c(0,10000)) +
scale_x_discrete(breaks = gameSizes) +
labs(title = paste("HHI Index of all games,",year,"Finals"),
x = "Game Size", y = "Herfindahl-Hirschman Index") +
theme(text = element_text(size=15),axis.text.x = element_text(angle = 0, hjust = 1))
assign("HHIplot",HHIplot, envir = .GlobalEnv)
ggsave(paste("HHI Index of all games,",year,"Finals.png"),
path = plotpath, width = 6, height = 4)
I'll preface this by saying that the following is bad practice. It's considered bad practice to break a programming language's scoping rules for something as trivial as this, but here's how it's done anyway.
So within the body of your function you'll create both plots and put them into variables. Then you'll use ggsave() to write them out. Finally, you'll use assign() to push the variables to the global scope.
library(ggplot2)
myFun <- function() {
#some sample data that you should be passing into the function via arguments
df <- data.frame(x=1:10, y1=1:10, y2=10:1)
p1 <- ggplot(df, aes(x=x, y=y1))+geom_point()
p2 <- ggplot(df, aes(x=x, y=y2))+geom_point()
ggsave('p1.jpg', p1)
ggsave('p2.jpg', p2)
assign('p1', p1, envir=.GlobalEnv)
assign('p2', p2, envir=.GlobalEnv)
return()
}
Now, when you run myFun() it will write out your two plots to .jpg files, and also drop the plots into your global environment so that you can just run p1 or p2 on the console and they'll appear in RStudio's Plot pane.
ONCE AGAIN, THIS IS BAD PRACTICE
Good practice would be to not worry about the fact that they're not popping up in RStudio. They wrote out to files, and you know they did, so go look at them there.

r - Missing object when ggsave output as .svg

I'm attempting to step through a dataset and create a histogram and summary table for each factor and save the output as a .svg . The histogram is created using ggplot2 and the summary table using summary().
I have successfully used the code below to save the output to a single .pdf with each page containing the relevant histogram/table. However, when I attempt to save each histogram/table combo into a set of .svg images using ggsave only the ggplot histogram is showing up in the .svg. The table is just white space.
I've tried using dev.copy Cairo and svg but all end up with the same result: Histogram renders, but table does not. If I save the image as a .png the table shows up.
I'm using the iris data as a reproducible dataset. I'm not using R-Studio which I saw was causing some "empty plot" grief for others.
#packages used
library(ggplot2)
library(gridExtra)
library(gtable)
library(Cairo)
#Create iris histogram plot
iris.hp<-ggplot(data=iris, aes(x=Sepal.Length)) +
geom_histogram(binwidth =.25,origin=-0.125,
right = TRUE,col="white", fill="steelblue4",alpha=1) +
labs(title = "Iris Sepal Length")+
labs(x="Sepal Length", y="Count")
iris.list<-by(data = iris, INDICES = iris$Species, simplify = TRUE,FUN = function(x)
{iris.hp %+% x + ggtitle(unique(x$Species))})
#Generate list of data to create summary statistics table
sum.str<-aggregate(Sepal.Length~Species,iris,summary)
spec<-sum.str[,1]
spec.stats<-sum.str[,2]
sum.data<-data.frame(spec,spec.stats)
sum.table<-tableGrob(sum.data)
colnames(sum.data) <-c("species","sep.len.min","sep.len.1stQ","sep.len.med",
"sep.len.mean","sep. len.3rdQ","sep.len.max")
table.list<-by(data = sum.data, INDICES = sum.data$"species", simplify = TRUE,
FUN = function(x) {tableGrob(x)})
#Combined histogram and summary table across multiple plots
multi.plots<-marrangeGrob(grobs=(c(rbind(iris.list,table.list))),
nrow=2, ncol=1, top = quote(paste(iris$labels$Species,'\nPage', g, 'of',pages)))
#bypass the class check per #baptiste
ggsave <- ggplot2::ggsave; body(ggsave) <- body(ggplot2::ggsave)[-2]
#
for(i in 1:3){
multi.plots<-marrangeGrob(grobs=(c(rbind(iris.list[i],table.list[i]))),
nrow=2, ncol=1,heights=c(1.65,.35),
top = quote(paste(iris$labels$Species,'\nPage', g, 'of',pages)))
prefix<-unique(iris$Species)
prefix<-prefix[i]
filename<-paste(prefix,".svg",sep="")
ggsave(filename,multi.plots)
#dev.off()
}
Edit removed theme tt3 that #rawr referenced. It was accidentally left in example code. It was not causing the problem, just in case anyone was curious.
Edit: Removing previous answer regarding it working under 32bit install and not x64 install because that was not the problem. Still unsure what was causing the issue, but it is working now. Leaving the info about grid.export as it may be a useful alternative for someone else.
Below is the loop for saving the .svg's using grid.export(), although I was having some text formatting issues with this (different dataset).
for(i in 1:3){
multi.plots<-marrangeGrob(grobs=(c(rbind(iris.list[i],table.list[i]))),
nrow=2, ncol=1,heights=c(1.65,.35), top =quote(paste(iris$labels$Species,'\nPage', g,
'of',pages)))
prefix<-unique(iris$Species)
prefix<-prefix[i]
filename<-paste(prefix,".svg",sep="")
grid.draw(multi.plots)
grid.export(filename)
grid.newpage()
}
EDIT: As for using arrangeGrob per #baptiste's comment. Below is the updated code. I was incorrectly using the single brackets [] for the returned by list, so I switched to the correct double brackets [[]] and used grid.draw to on the ggsave call.
for(i in 1:3){
prefix<-unique(iris$Species)
prefix<-prefix[i]
multi.plots<-grid.arrange(arrangeGrob(iris.list[[i]],table.list[[i]],
nrow=2,ncol=1,top = quote(paste(iris$labels$Species))))
filename<-paste(prefix,".svg",sep="")
ggsave(filename,grid.draw(multi.plots))
}

Adding points after the fact with ggplot2; user defined function

I believe the answer to this is that I cannot, but rather than give in utterly to depraved desperation, I will turn to this lovely community.
How can I add points (or any additional layer) to a ggplot after already plotting it? Generally I would save the plot to a variable and then just tack on + geom_point(...), but I am trying to include this in a function I am writing. I would like the function to make a new plot if plot=T, and add points to the existing plot if plot=F. I can do this with the basic plotting package:
fun <- function(df,plot=TRUE,...) {
...
if (!plot) { points(dYdX~Time.Dec,data=df2,col=col) }
else { plot(dYdX~Time.Dec,data=df2,...) }}
I would like to run this function numerous times with different dataframes, resulting in a plot with multiple series plotted.
For example,
fun(df.a,plot=T)
fun(df.b,plot=F)
fun(df.c,plot=F)
fun(df.d,plot=F)
The problem is that because functions in R don't have side-effects, I cannot access the plot made in the first command. I cannot save the plot to -> p, and then recall p in the later functions. At least, I don't think I can.
have a ggplot plot object be returned from your function that you can feed to your next function call like this:
ggfun = function(df, oldplot, plot=T){
...
if(plot){
outplot = ggplot(df, ...) + geom_point(df, ...)
}else{
outplot = oldplot + geom_point(data=df, ...)
}
print(outplot)
return(outplot)
}
remember to assign the plot object returned to a variable:
cur.plot = ggfun(...)

Resources