Creating multiple plots within a loop and saving in R? - r

I am having trouble saving multiple plots from the output of a loop. To give some background:
I have multiple data frames, each holding single-chemical toxicity data for multiple species. I have named each data frame after the chemical it represents, i.e. "ChemicalX". The data is in this format because that is how the "SSDTools" package works: it creates a species sensitivity distribution (SSD) for a single chemical.
Because I have a lot of chemicals, I want to create a loop that iterates over each data frame, calculates the metrics required to create an SSD, plots the SSD, and then saves the plot.
The code below works for calculating all of the metrics and plotting the SSDs. It only breaks when I try to create a title within the loop, and when I try to save the plot within the loop.
For reference, I am using the packages:
SSDTools, ggplot2, tidyverse, fitdistrplus
My code is as follows:
# Create a list of data frames
list_dfs <- list(ChemicalX, ChemicalY, ChemicalZ)
# make the loop
for (i in list_dfs){ # for each chemical (ie data frame)
ssd.fits <- ssd_fit_dists(i, dists = c("llogis", "gamma", "lnorm", "gompertz", "lgumbel", "weibull", "burrIII3", "invpareto", "llogis_llogis", "lnorm_lnorm")) # Test the goodness of fit using all distributions available
ssd.gof_fits <- ssd_gof(ssd.fits) # Save the goodness of fit statistics
chosen_dist <- ssd.gof_fits %>% # Choose the best fit distribution by
filter(aicc==min(aicc)) # finding the minimum aicc
final.fit <- ssd_fit_dists(i, dists = chosen_dist$dist) # Use the chosen distribution only
final.predict <-predict(final.fit, ci = TRUE) # generate the final predictions
plotdata <- i # create a separate plot data frame
final.plot <- ssd_plot(plotdata, final.predict, # generate the final plot
color = "Taxa",
label = "Species",
xlab = "Concentration (ug/L)", ribbon = TRUE) +
expand_limits(x = 10e6) + # to ensure the species labels fit
ggtitle(paste("Species Sensitivity for",chem_names_df[i], sep = " ")) +
scale_colour_ssd()
ggsave(filename = paste("SSD for",chem_names_df[i], ".pdf", sep = ""),
plot = final.plot)
}
The code works great right up until the last part, where I want to create a title for each chemical in each iteration, and where I want to save the filename as the chemical name.
I have two issues:
I want the title of the plot to be "Species Sensitivity for ChemicalX", with ChemicalX being the name of the data frame. However, when I use the following code the title gets all messed up, and gives me a list of the species in that data frame (see image).
ggtitle(paste("Species Sensitivity. for",i, sep = " "))
[Image: graph title output using "i"]
To try and get around this, I created a vector of chemical names that matches the order of the data frame list, called "chem_names_df". When I use ggtitle(paste("Species Sensitivity for",chem_names_df[i], sep = " ")) however, it gives me the error of Error in chem_names_df[i] : invalid subscript type 'list'
A similar issue happens when I try to save the plot using ggsave(). I am trying to save the file for each chemical data frame as "SSD_ChemicalX", but, as above, it just outputs a list of the species in place of i.
I think it has something to do with how R is reading from my list of data frames. I am not sure why it returns the species list (i.e. c("Danio Rerio", "Lepomis macrochirus", ...)) instead of the chemical name.
Any help would be appreciated! Thank you!

Basically your problem here is that you are sometimes using i as if it is an index, and sometimes as if it is a data frame, but in fact it is a data frame.
Your example is not reproducible so let me provide one. You have done the equivalent of:
list_dfs2 <- list(mtcars, mtcars, cars)
for(i in list_dfs2){
print(i)
}
This is just going to print the whole mtcars dataset twice and then the cars dataset. You can then define a vector:
cars_names <- c("mtcars", "mtcars", "cars")
If you call cars_names[i], on the first iteration you're not calling cars_names[1], you're trying to subset a vector with an entire data frame. That won't work. Better to seq_along() your list of data frames and then subset it with list_dfs[[i]] when you want to refer to the actual data frame rather than the index, i. Something like:
# Create a list of data frames
list_dfs <- list(ChemicalX, ChemicalY, ChemicalZ)
# chem_names_df is the character vector of chemical names from the question,
# in the same order as list_dfs
# make the loop
for (i in seq_along(list_dfs)) { # for each chemical (i.e. data frame)
  # Test the goodness of fit using all distributions available
  ssd.fits <- ssd_fit_dists(list_dfs[[i]], dists = c("llogis", "gamma", "lnorm", "gompertz", "lgumbel", "weibull", "burrIII3", "invpareto", "llogis_llogis", "lnorm_lnorm"))
  ssd.gof_fits <- ssd_gof(ssd.fits) # Save the goodness of fit statistics
  chosen_dist <- ssd.gof_fits %>% # Choose the best fit distribution by
    filter(aicc == min(aicc)) # finding the minimum aicc
  final.fit <- ssd_fit_dists(list_dfs[[i]], dists = chosen_dist$dist) # Use the chosen distribution only
  final.predict <- predict(final.fit, ci = TRUE) # generate the final predictions
  plotdata <- list_dfs[[i]] # create a separate plot data frame
  final.plot <- ssd_plot(plotdata, final.predict, # generate the final plot
                         color = "Taxa",
                         label = "Species",
                         xlab = "Concentration (ug/L)", ribbon = TRUE) +
    expand_limits(x = 10e6) + # to ensure the species labels fit
    ggtitle(paste("Species Sensitivity for", chem_names_df[i], sep = " ")) +
    scale_colour_ssd()
  # paste0() with an explicit space; paste(..., sep = "") would give "SSD forChemicalX.pdf"
  ggsave(filename = paste0("SSD for ", chem_names_df[i], ".pdf"),
         plot = final.plot)
}
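The difference between an index and an element can be checked without ssdtools at all. The following is only a minimal sketch using built-in datasets: `dfs` and `df_names` are hypothetical stand-ins for `list_dfs` and `chem_names_df`, and the titles and filenames are built but nothing is plotted.

```r
# Hypothetical stand-ins for list_dfs and chem_names_df
dfs <- list(mtcars, iris, cars)
df_names <- c("mtcars", "iris", "cars")

titles <- character(length(dfs))
files <- character(length(dfs))

for (i in seq_along(dfs)) {
  # i is an integer index (1, 2, 3), so it can subset both objects
  current_df <- dfs[[i]]                       # the data frame itself
  titles[i] <- paste("Summary for", df_names[i])
  files[i] <- paste0("plot_", df_names[i], ".pdf")
  message(titles[i], ": ", nrow(current_df), " rows -> ", files[i])
}
```

Because `i` is always a number here, `df_names[i]` is a single string and never a whole data frame.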

Consider using a defined method that receives name and data frame as input parameters. Then, pass a named list into the method using Map to iterate through data frames and corresponding names elementwise:
Function
build_plot <- function(plotdata, plotname) {
  # Test the goodness of fit using all distributions available
  ssd.fits <- ssd_fit_dists(
    plotdata,
    dists = c(
      "llogis", "gamma", "lnorm", "gompertz", "lgumbel", "weibull",
      "burrIII3", "invpareto", "llogis_llogis", "lnorm_lnorm"
    )
  )
  # Save the goodness of fit statistics
  ssd.gof_fits <- ssd_gof(ssd.fits)
  # Choose the best fit distribution by finding the minimum aicc
  chosen_dist <- filter(ssd.gof_fits, aicc == min(aicc))
  # Use the chosen distribution only
  final.fit <- ssd_fit_dists(plotdata, dists = chosen_dist$dist)
  # generate the final predictions
  final.predict <- predict(final.fit, ci = TRUE)
  # generate the final plot
  final.plot <- ssd_plot(
    plotdata, final.predict, color = "Taxa", label = "Species",
    xlab = "Concentration (ug/L)", ribbon = TRUE) +
    expand_limits(x = 10e6) + # to ensure the species labels fit
    ggtitle(paste("Species Sensitivity for", plotname)) +
    scale_colour_ssd()
  # export plot to pdf
  ggsave(filename = paste0("SSD for ", plotname, ".pdf"), plot = final.plot)
  # return plot to environment
  return(final.plot)
}
Call
# create a named list of data frames
chem_dfs <- list(
"ChemicalX"=ChemicalX, "ChemicalY"=ChemicalY, "ChemicalZ"=ChemicalZ
)
chem_plots <- Map(build_plot, chem_dfs, names(chem_dfs))
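To see how Map() pairs each data frame with its name, here is a small sketch that runs without ssdtools. The list `dfs` and the function `describe()` are hypothetical stand-ins for `chem_dfs` and `build_plot()`.

```r
# Hypothetical named list standing in for chem_dfs
dfs <- list(mtcars = mtcars, cars = cars)

# Toy stand-in for build_plot(): receives one data frame and its name
describe <- function(df, df_name) {
  paste0(df_name, ": ", nrow(df), " rows")
}

# Map() walks both arguments in parallel; the result keeps the list names
result <- Map(describe, dfs, names(dfs))
result$cars
```

Since the result is itself a named list, an individual plot returned by the real function could be retrieved the same way, for example `chem_plots$ChemicalX`.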

Related

Looping and Saving Scatterplots in R

I have a data frame that consists of 2 columns and 3110 rows. The X column is constant, whereas the Y column changes in each row. I am looking to create a loop that will generate a scatter plot for each row, and ultimately save the scatter plots to my desktop.
The original code that I would use to create one scatter plot is:
X <- Abundances$s__Coprobacillus_cateniformis
Y <- Abundances$Gene1
plot(X, Y, main = "Species Vs Gene Expression",
xlab = "s__Coprobacillus_cateniformis", ylab = "Gene1",
pch = 19, frame = FALSE)
So, the X variable is a species name, and will stay constant. The Y variable is a gene name, and will change for each of the 3110 plots. I am using the percentage abundances for the gene expression and the species from another data frame called "Abundances".
A short snippet of my data looks like this. It has 2 columns, one called Predictor and one called Response:
Response <- c("ENSG00000000005.5", "ENSG00000001167.10", "ENSG00000001617.7", "ENSG00000003393.10", "ENSG00000004142.7")
Predictor <- c("s__Coprobacillus_cateniformis", "s__Coprobacillus_cateniformis", "s__Coprobacillus_cateniformis", "s__Coprobacillus_cateniformis", "s__Coprobacillus_cateniformis" )
If anyone could help me write a loop that creates a scatter plot for each individual gene (on the Y axis) against the species (on the X axis), and then immediately saves these plots to my desktop, that would be great!
Thanks.
It's impossible to test without a sample from Abundances, but I think this is on the right track. The key thing to note is that $ doesn't work with strings, but [[ does: Abundances$Gene1 is the same as Abundances[["Gene1"]] is the same as col = "Gene1"; Abundances[[col]].
for (i in seq_along(Response)) {
  png(filename = paste0("plot_", Response[i], ".png"))
  X <- Abundances[[Predictor[i]]]
  Y <- Abundances[[Response[i]]]
  # X holds the predictor (species) and Y the response (gene), so label the axes to match
  plot(X, Y, main = "Species Vs Gene Expression",
       xlab = Predictor[i], ylab = Response[i],
       pch = 19, frame = FALSE)
  dev.off()
}
If you want the plots on your desktop, set that as the working directory or put the path to your desktop as part of the filename.
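A short sketch of both points, column lookup by string and building the output path, follows. The data frame `abund` is made up for illustration, and `tempdir()` is used so the code runs anywhere; the desktop location mentioned in the comment is an assumption about your machine.

```r
# Small made-up data frame standing in for Abundances
abund <- data.frame(speciesA = c(0.1, 0.4, 0.3), gene1 = c(2.0, 3.5, 1.2))

col <- "gene1"
y <- abund[[col]]    # `[[` looks up the column named by the string in col
abund$col            # NULL: `$` looks for a column literally called "col"

# Build the output path with file.path() rather than pasting separators by hand.
# tempdir() is used here; on many systems the desktop would be
# file.path(path.expand("~"), "Desktop"), but that is an assumption.
out_dir <- tempdir()
out_file <- file.path(out_dir, paste0("plot_", col, ".png"))
```

The resulting `out_file` can be passed straight to `png(filename = out_file)` inside the loop.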

Saving output plot in R with grid.grab() doesn't work

I've been trying to save multiple plots generated with the meta package in R, which is used to conduct meta-analyses, but I am having some trouble. I need to save these plots so that I can arrange them in a multi-plot figure.
Example data:
s <- data.frame(Study = paste0("Study", 1:15),
event.e = sample(1:100, 15),
n.e = sample(100:300, 15))
meta1 <- meta::metaprop(event = event.e,
n= n.e,
data=s,
studlab = Study)
Here is the code:
meta::funnel(meta1)
funnelplot <- grid::grid.grab()
I can see the figure in the "Plots" tab in RStudio. However, when I look up the funnelplot object in the environment, it is listed as type "NULL", and obviously trying to recall it doesn't work.
How can I fix it?

Automatically plots with autoplot function from forecasting object

I am forecasting with a combination of data sets from the fpp2 package and forecasting functions from the forecast package. The output of this forecasting is a list object called SNAIVE_MODELS_ALL. This object contains the data separately for two series, where the first is Electricity and the second is Cement.
You can see code below :
# CODE
library(fpp2)
library(dplyr)
library(forecast)
library(gridExtra)
library(ggplot2)
#INPUT DATA
mydata_qauselec <- qauselec
mydata_qcement <- window(qcement, start = 1956, end = c(2010, 2))
# Merging data
mydata <- cbind(mydata_qauselec, mydata_qcement)
colnames(mydata) <- c("Electricity", "Cement")
# Test Extract Name
mydata1 <- data.frame(mydata)
COL_NAMES <- names(mydata1)
rm(mydata_qauselec, mydata_qcement)
# FORECASTING HORIZON
forecast_horizon <- 12
#FORECASTING
BuildForecast <- function(Z, hrz = forecast_horizon) {
timeseries <- msts(Z, start = 1956, seasonal.periods = 4)
forecast <- snaive(timeseries, biasadj = TRUE, h = hrz)
}
frc_list <- lapply(X = mydata1, BuildForecast)
#FINAL FORECASTING
SNAIVE_MODELS_ALL<-lapply(frc_list, forecast)
So my intention here is to pass this SNAIVE_MODELS_ALL object to the autoplot function in order to get two plots like the picture below.
With the code below I draw both plots separately, but my main intention is to do this with autoplot and a function like apply or something similar, which can automatically draw these two charts like the picture above. This is only a small example; in the real case I will have maybe 5 or 10 charts.
#PLOT 1
P_PLOT1<-autoplot(SNAIVE_Electricity,main = "Snaive Electricity forecast",xlab = "Year", ylab = "in billion kWh")+
autolayer(SNAIVE_Electricity,series="Data")+
autolayer(SNAIVE_Electricity$fitted,series="Forecasts")
# PLOT 2
P_PLOT2<-autoplot(SNAIVE_Cement,main = "Snaive Cement forecast",xlab = "Year", ylab = "in millions of tonnes")+
autolayer(SNAIVE_Cement,series="Data")+
autolayer(SNAIVE_Cement$fitted,series="Forecasts")
#UNION PLOTS (PLOT 1 AND PLOT 2)
SNAIVE_PLOT_ALL<-grid.arrange(P_PLOT1,P_PLOT2)
So can anybody help me with this code ?
If I understand correctly, one of the difficulties with this problem is that each plot should have a specific title and y label. One possible solution is to pass the plot titles and y-labels as function arguments:
PlotForecast <- function(df_pl, main_pl, ylab_plt) {
  autoplot(df_pl,
           main = main_pl,
           xlab = "Year", ylab = ylab_plt) +
    autolayer(df_pl, series = "Data") +
    autolayer(df_pl$fitted, series = "Forecasts")
}
Prepare lists of the plot labels to be used with PlotForecast():
main_lst <- list("Snaive Electricity forecast", "Snaive Cement forecast")
ylab_lst <- list("in billion kWh", "in millions of tonnes")
Construct a list of plot-objects using a base Map() function:
PL_list <- Map(PlotForecast, df_pl = SNAIVE_MODELS_ALL, main_pl = main_lst,
ylab_plt= ylab_lst)
Then all we have to do is to call grid.arrange() with the plot list:
do.call(grid.arrange, PL_list)
Please note that main_lst and ylab_lst are created manually for demonstration purposes, which is not the best approach if you work with a lot of charts. Ideally, the labels should be generated automatically from the original SNAIVE_MODELS_ALL list.
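One way the labels might be generated from the list names is sketched below. This is an illustration only: `series_names` stands in for `names(SNAIVE_MODELS_ALL)`, and the `units` lookup vector is a hypothetical helper, since the units cannot be derived from the series names themselves.

```r
# Stand-in for names(SNAIVE_MODELS_ALL)
series_names <- c("Electricity", "Cement")

# Titles follow a fixed pattern, so they can be generated from the names
main_lst <- as.list(paste("Snaive", series_names, "forecast"))

# Units are kept in one named lookup vector (hypothetical helper);
# indexing by name means the order of the lookup does not matter
units <- c(Cement = "in millions of tonnes", Electricity = "in billion kWh")
ylab_lst <- as.list(unname(units[series_names]))
```

Both lists then have the same length and order as the forecast list, so they can be passed to Map() in place of the hand-written ones.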

Plotting quantile regression by variables in a single page

I am running quantile regressions for several independent variables separately (same dependent). I want to plot only the slope estimates over several quantiles of each variable in a single plot.
Here's a toy data:
set.seed(1988)
y <- rnorm(50, 5, 3)
x1 <- rnorm(50, 3, 1)
x2 <- rnorm(50, 1, 0.5)
# Running Quantile Regression
require(quantreg)
fit1 <- summary(rq(y~x1, tau=1:9/10), se="boot")
fit2 <- summary(rq(y~x2, tau=1:9/10), se="boot")
I want to plot only the slope estimates over quantiles. Hence, I am giving parm=2 in plot.
plot(fit1, parm=2)
plot(fit2, parm=2)
Now, I want to combine both these plots in a single page.
What I have tried so far;
I tried setting par(mfrow=c(2,2)) and plotting them. But it's producing a blank page.
I have tried using gridExtra and gridGraphics without success. I tried to convert the base graphs into grob objects as stated here.
I tried using the layout function as in this document.
I am trying to look into the source code of plot.rqs, but I am unable to understand how it plots the confidence bands (I'm able to plot only the coefficients over quantiles) or how to change the mfrow parameter there.
Can anybody point out where I am going wrong? Should I look into the source code of plot.rqs and change any parameters there?
While quantreg::plot.summary.rqs has an mfrow parameter, it uses it to override par('mfrow') so as to facet over parm values, which is not what you want to do.
One alternative is to parse the objects and plot manually. You can pull the tau values and coefficient matrix out of fit1 and fit2, which are just lists of values for each tau, so in tidyverse grammar,
library(tidyverse)
c(fit1, fit2) %>% # concatenate lists, flattening to one level
  # iterate over list and rbind to data.frame
  map_dfr(~cbind(tau = .x[['tau']], # from each list element, cbind the tau...
                 coef(.x) %>% # ...and the coefficient matrix,
                   data.frame(check.names = TRUE) %>% # cleaned a little
                   rownames_to_column('term'))) %>%
  filter(term != '(Intercept)') %>% # drop intercept rows
  # initialize plot and map variables to aesthetics (positions)
  ggplot(aes(x = tau, y = Value,
             ymin = Value - Std..Error,
             ymax = Value + Std..Error)) +
  geom_ribbon(alpha = 0.5) +
  geom_line(color = 'blue') +
  facet_wrap(~term, nrow = 2) # make a plot for each value of `term`
Pull more out of the objects if you like, add the horizontal lines of the original, and otherwise go wild.
Another option is to use magick to capture the original images (or save them with any device and reread them) and manually combine them:
library(magick)
plots <- image_graph(height = 300) # graphics device to capture plots in image stack
plot(fit1, parm = 2)
plot(fit2, parm = 2)
dev.off()
im1 <- image_append(plots, stack = TRUE) # attach images in stack top to bottom
image_write(im1, 'rq.png')
The plot function used by the quantreg package has its own mfrow parameter. If you do not specify it, it enforces a layout that it chooses on its own (and thus overrides your par(mfrow = c(2,2))).
Using the mfrow parameter within plot.rqs:
# make one plot, change the layout
plot(fit1, parm = 2, mfrow = c(2,1))
# add a new plot
par(new = TRUE)
# create a second plot
plot(fit2, parm = 2, mfrow = c(2,1))

How to store results of a simulation and plot all the results in one plot using plot_KDE in Luminescence package for R

I am creating a simulation using random number simulations. This gives 100 sets of 45 values with error.
First I would like to store the results of these simulations.
I then need to plot the results of these simulations on one plot. The plot I need to produce uses the package Luminescence and is of the type KDE.
I have managed to produce the separate entities but am struggling to both store the results and to produce the plot with all the simulations.
So far I have created the simulation:
Simulation <- function() {
RNC <- rescale (SFMT(45, dim=1, mexp=216091,
usepset=T, withtorus= F, usetime=T),
c(0.01,130))
RNC_error <- RNC*0.15
df <-data.frame(RNC,RNC_error)
}
The plot I want to create uses the following:
library("Luminescence")
plot_KDE(data=df, na.rm = TRUE,
values.cumulative = TRUE, order = TRUE,
boxplot = F, rug = F,
summary.method = "MCM", bw = "nrd0",
output = TRUE)
For my final result I require the numerical results of all the simulations stored and a single KDE plot with the results of all the simulations.
Split your problem into two parts.
Storing results. You have a data frame, df, so just use write.csv() to store the results to a CSV file, i.e.
write.csv(df, file="some_file.csv")
Storing your plot. Obviously you can't use a csv file, so instead we'll use a pdf or png, e.g.
# Open the file
pdf("figure_file.pdf")
plot_KDE(data=df, na.rm = TRUE,
values.cumulative = TRUE, order = TRUE,
boxplot = F, rug = F,
summary.method = "MCM", bw = "nrd0",
output = TRUE)
# Close the file
dev.off()
To save as a png, use png() instead of pdf().
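The steps above store a single data frame, while the question asks for 100 runs. A minimal sketch of collecting all runs into one table before writing it out is shown here. It makes two assumptions: `simulate_once()` is a hypothetical stand-in for Simulation() that uses runif() instead of the SFMT generator, and the file is written to tempdir().

```r
set.seed(1)  # for a reproducible sketch

# Hypothetical stand-in for Simulation(): runif() replaces the SFMT generator
simulate_once <- function(n = 45) {
  RNC <- runif(n, min = 0.01, max = 130)
  data.frame(RNC = RNC, RNC_error = RNC * 0.15)
}

# Run 100 simulations and tag each row with its simulation number
runs <- lapply(seq_len(100), function(k) cbind(sim = k, simulate_once()))
all_runs <- do.call(rbind, runs)

# Store every simulated value in one CSV file (written to tempdir() here)
out_file <- file.path(tempdir(), "simulations.csv")
write.csv(all_runs, file = out_file, row.names = FALSE)

dim(all_runs)
```

Whether to pass the pooled RNC and RNC_error columns or a list of per-run data frames to plot_KDE() depends on the plot wanted, and should be checked against the Luminescence documentation.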
