How to save ggplots as separate R objects using a for loop

I want to produce a number of plots and then combine them freely using the multiplot function. Could you tell me how to save each plot as a separate R object instead of printing it to a PNG file?
Example data frame:
df1 <- data.frame(A = rnorm(50), B = rnorm(50), C = rnorm(50), group = rep(LETTERS[24:25], 25))
We use a for loop to produce the plots and save each one to a file. This is the loop to change:
for(i in names(df1)[1:3]) {
png(paste(i, "png", sep = "."), width = 800, height = 600)
df2 <- df1[, c(i, "group")]
print(ggplot(df2) + geom_boxplot(aes_string(x = "group", y = i, fill = "group")) + theme_bw())
dev.off()
}
Could you please help me change the code so that each plot is saved as an R object instead?
Big Thanks in advance!

I'm not sure what you are talking about with "merge them freely together using multiplot function", but you can save ggplot objects using the standard assignment operator. Like any R object, they can be stored in a list.
# empty list for storage
gg_list <- list()
# if you must use a loop, loop through an indexing vector
for(i in 1:3) {
# if you need the i'th name in df1 use:
names(df1)[i]
# assign your ggplot call to the i'th position in the list
gg_list[[i]] <- ggplot(...)
}
# Now you can recall the ggplots by reference to the list.
# E.g., display the 1st one:
print(gg_list[[1]])
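The schematic loop above can be filled in as follows. This is a minimal sketch rather than the answerer's exact code: it assumes ggplot2 3.0.0 or later (for `!!` inside `aes()`), uses `rlang::sym()` in place of the `aes_string()` call from the question, and uses `gridExtra::grid.arrange()` as a stand-in for the multiplot function mentioned in the question.

```r
library(ggplot2)

set.seed(42)  # only so the example is reproducible
df1 <- data.frame(A = rnorm(50), B = rnorm(50), C = rnorm(50),
                  group = rep(LETTERS[24:25], 25))

vars <- names(df1)[1:3]

# pre-allocate a named list so each plot can be recalled by column name
gg_list <- vector("list", length(vars))
names(gg_list) <- vars

for (v in vars) {
  # !!rlang::sym(v) injects the column name when the plot is created,
  # so each stored plot keeps its own y variable
  gg_list[[v]] <- ggplot(df1, aes(x = group, y = !!rlang::sym(v), fill = group)) +
    geom_boxplot() +
    theme_bw()
}

# recall one plot, or arrange several of them on one page
print(gg_list[["A"]])
gridExtra::grid.arrange(grobs = gg_list[c("A", "C")], ncol = 2)
```

Naming the list elements means the plots can be picked in any order later, without having to remember their positions.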

Instead of using a for loop, if you're gonna store in a list you could just use lapply:
df1 <- data.frame(A = rnorm(50),
B = rnorm(50),
C = rnorm(50),
group = rep(LETTERS[24:25], 25))
gg_list <- lapply(names(df1)[1:3], function(i) {
df2 <- df1[, c(i, "group")]
ggplot(df2) +
geom_boxplot(aes_string(x = "group", y = i, fill = "group")) +
theme_bw()
})
gg_list[[1]]
You can even save the list in an RDS object:
saveRDS(gg_list, file = "./gg_list.RDS")

Here's an alternative strategy that I find more straight-forward (no need to fiddle with names and aes_string): melt the data to long format, and plot subsets
df1 <- data.frame(A = rnorm(50), B = rnorm(50), C = rnorm(50),
group = rep(LETTERS[24:25], 25))
m = reshape2::melt(df1, id="group")
## base plot, all the data
p = ggplot(m) + geom_boxplot(aes(x = group, y = value)) + theme_bw()
## split-and-apply strategy, using the `%+%` operator to change datasets
pl = plyr::dlply(m, "variable", `%+%`, e1 = p)
do.call(gridExtra::grid.arrange, pl)

Related

Customize facet_wrap plot label in R?

I'm creating a facet_wrap plot in R, and I'm trying to automate the labeller. I can create a custom label manually, using this code:
library(ggplot2)
library(tidyverse)
df <- data.frame(a = rep(c(1/8,1/4,1/2), each = 100),
b = rep(c("A", "B", "C", "D"), each = 25),
x = rnorm(100))
names <- c(
`0.125` = "alpha~`=`~1/8",
`0.25` = "alpha~`=`~1/4",
`0.5` = "alpha~`=`~1/2"
)
df %>% ggplot() +
geom_density(aes(x = x, colour = b))+
facet_wrap(~a, labeller = labeller(a = as_labeller(names, label_parsed)))
The above code produces this plot:
[Image: facetplot]
As you can see I'm creating the custom names in the names variable and then passing that to the labeller argument. I want to come up with a way to automate this process. So I can use any vector of names. Any suggestions?
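One way this could be automated is to build the named vector from the facet values themselves. The sketch below is only an illustration: `make_alpha_labels()` is a hypothetical helper, not part of ggplot2, and it assumes every value can be pasted into a valid plotmath expression (plain numbers work; strings containing spaces would need quoting first). The labels show decimals such as 0.125 rather than fractions such as 1/8.

```r
library(ggplot2)

set.seed(1)  # only so the example is reproducible
df <- data.frame(a = rep(c(1/8, 1/4, 1/2), each = 100),
                 b = rep(c("A", "B", "C", "D"), each = 25),
                 x = rnorm(300))

# hypothetical helper: one plotmath label per distinct facet value,
# named by the value so that as_labeller() can look it up
make_alpha_labels <- function(values) {
  vals <- sort(unique(values))
  setNames(paste0("alpha~`=`~", vals), vals)
}

facet_labels <- make_alpha_labels(df$a)

p <- ggplot(df) +
  geom_density(aes(x = x, colour = b)) +
  facet_wrap(~a, labeller = labeller(a = as_labeller(facet_labels, label_parsed)))
print(p)
```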
I think I have a possible solution. If a user supplies a personalised vector of values, say, vec <- c(1/5, 'Hello', 3.14), then I paste the Unicode character for alpha in front of each value. After this I create a kind of dummy function that returns my new vector of label titles, and I pass that to the labeller function:
# EDIT: removed loop with help from comments!
#nam <- NULL
# for(i in 1:(length(vec))){
# nam[i] <- list(paste0('\u03b1',"=", vec[i]))
#}
nam <- paste0('\u03b1',"=", vec)
custLab <- function (x){
nam
}
p <- df %>% ggplot() +
geom_density(aes(x = x, colour = b))+
facet_wrap(~a, labeller = as_labeller((custLab)))
This produces the following:

problems with converting a string into a variable name and use it for plot in a loop in R

I want to make a series of plots with a for loop, then store all the plots in a list, and at last print all the plots to a .pdf file.
All my y variable names are stored in a char vector, and I need to loop through this vector to make a plot for each of them.
data <- data.frame(index = seq(1, 10, 1), var1 = (1:10)^2, var2 = (1:10)^3, var3 = (1:10) ^4)
varnames <- c("var1", "var2", "var3") # store all the names in a char vector.
plot.list <- list() # make a list to store the result
for (i in varnames) {
plot.list[[i]] =
ggplot(data, aes(x = index, y = as.name(i))) +
geom_point()
}
Here are my questions:
When I used y = as.name(i) to convert each element of varnames into a variable name, I got an error. The error is like this:
Don't know how to automatically pick scale for object of type name. Defaulting to continuous.
Error: Aesthetics must be valid data columns. Problematic aesthetic(s): y = as.name(i).
Did you mistype the name of a data column or forget to add after_stat()?
Why did this happen and how to resolve it?
I want to store each plot into the plot.list, is this the right way to do it?
plot.list = some_plot_funtion()
How to plot all the elements in the plot.list to a .pdf
Thanks.
You can use lapply to plot since it returns a list by default.
library(ggplot2)
plot.list <- lapply(varnames, function(x)
ggplot(data, aes(index, .data[[x]])) + geom_point())
ggsave(filename = "plots.pdf",
plot = gridExtra::marrangeGrob(plot.list, nrow=1, ncol=1),
width = 15, height = 9)
You can adjust width and height as per your choice.
Try this very minor modification of your code. The get function acts in the manner both you and I think that as.name should work (but doesn't):
data <- data.frame(index = seq(1, 10, 1), var1 = (1:10)^2, var2 = (1:10)^3, var3 = (1:10) ^4)
varnames <- c("var1", "var2", "var3") # store all the names in a char vector.
plot.list <- list() # make a list to store the result
for (i in varnames) {
p =
ggplot(data, aes(x = index, y = get(i) )) +
geom_point() ; print(p); plot.list[[i]] <- p
}
You also need to do something to force evaluation of the plot object. I tried using force but that failed, so I used print, which has the side effect of drawing each plot on your active graphics device.
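A note on why a plain for loop can misbehave here: as far as I understand it, `aes()` captures its arguments unevaluated, so an expression that refers to the loop variable `i` is only resolved when the plot is drawn, by which time `i` holds its last value. One possible way around this, assuming ggplot2 3.0.0 or later, is to inject the current value with `!!`. The sketch below also writes every stored plot to one PDF; the file name `plots_loop.pdf` is just a placeholder.

```r
library(ggplot2)

data <- data.frame(index = 1:10, var1 = (1:10)^2, var2 = (1:10)^3, var3 = (1:10)^4)
varnames <- c("var1", "var2", "var3")

plot.list <- list()
for (i in varnames) {
  # !!i substitutes the current string into the expression at creation time,
  # so each stored plot refers to its own column
  plot.list[[i]] <- ggplot(data, aes(x = index, y = .data[[!!i]])) +
    geom_point() +
    ylab(i)
}

# one page per plot in a single PDF (placeholder file name)
pdf("plots_loop.pdf", width = 8, height = 6)
for (p in plot.list) print(p)
dev.off()
```

The lapply version in the earlier answer avoids the issue in a different way, because each function call gets its own copy of `x`.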

Use plot_grid to arrange plots where plot information is stored in an R data frame

I generate a series of plots stored in a matrix as part of a for loop much like in the MWE below. This same matrix also stores two other columns of information (Colour and Animal in this example). I then want to be able to create a grid of plots, where I identify the plot based on the corresponding Colour and Animal.
I tried creating a data frame and then using row names to call out the plots I needed, but ran into the common error "Cannot convert object of class list into a grob." If I call from the matrix directly this works; however, I want a way to avoid this in case the order of the data changes in the input files. Is it possible to work directly from the data frame? I've seen similar examples, but couldn't apply them to my case. I want to stick with cowplot and change as little as possible in the data generation stage.
MWE
library(ggplot2) # cowplot does not attach ggplot2, and ggplot() is used below
library(cowplot)
p <- vector('list', 15)
p <-
matrix(
p,
nrow = 5,
ncol = 3
)
myColours = c("Yellow", "Red", "Blue", "Green", "Orange")
myAnimals = c("Kangaroo", "Emu", "Echidna", "Platypus", "Cassowary")
x = seq(1,10)
it = 1
for (i in seq(0,4)){ # generate example data and plots
y = x^i
t = runif(5)
df <- data.frame("X" = x, "Y" = y, "T" = t)
theanimal = myAnimals[i+1]
thecolour = myColours[i+1]
p[[it,1]] = thecolour
p[[it,2]] = theanimal
p[[it,3]] = ggplot(data = df, mapping = aes(x = X, y = Y)) +
geom_point(aes(color = T)) +
ggtitle(paste(thecolour, theanimal, sep = " "))
it = it+ 1
}
# turn into df
pltdf<- as.data.frame(p)
colnames(pltdf) <- c("Colour", "Animal", "plot")
rownames(pltdf) <- do.call(paste, c(pltdf[c("Colour", "Animal")], sep="-"))
pltdf[[1,3]] # this is what I expect for a single plot
plot1 = vector('list', 4)
plot1 <-
matrix(
plot1,
nrow = 2,
ncol = 2
)
plot1[[1,1]] = pltdf["Red-Emu", "plot"] # also tried with just plot[[1]] = etc.
plot1[[1,2]] = pltdf["Blue-Echidna", "plot"]
plot1[[2,1]] = pltdf["Orange-Cassowary", "plot"]
plot1[[2,2]] = pltdf["Green-Platypus", "plot"]
plot_grid(plotlist = t(plot1), ncol = 2)
plot_grid(plotlist = list(plot1), ncol = 2) # suggested solution on a dif problem
plot2 = vector('list', 4) # what I want plots to look like in the end
plot2[[1]] = p[[1,3]]
plot2[[2]] = p[[4,3]]
plot2[[3]] = p[[2, 3]]
plot2[[4]] = p[[5, 3]]
plot_grid(plotlist = t(plot2), ncol = 2)
You can specify the order that you want the plots to be in and subset the dataframe accordingly which can be used in plot_grid.
library(cowplot)
order <- c("Red-Emu", "Blue-Echidna", "Orange-Cassowary", "Green-Platypus")
plot_grid(plotlist = pltdf[order, 'plot'], ncol = 2)
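If changing the data generation stage slightly is acceptable, a named list may be simpler than a matrix or data frame of plots. The following is a sketch under that assumption, not a drop-in replacement for the MWE: the names `plots` and `wanted` are made up for illustration, and `runif(10)` is used so that the colour column has one value per point.

```r
library(ggplot2)
library(cowplot)

myColours <- c("Yellow", "Red", "Blue", "Green", "Orange")
myAnimals <- c("Kangaroo", "Emu", "Echidna", "Platypus", "Cassowary")
x <- 1:10

# build one plot per colour/animal pair
plots <- lapply(seq_along(myAnimals), function(k) {
  df <- data.frame(X = x, Y = x^(k - 1), T = runif(10))
  ggplot(df, aes(x = X, y = Y)) +
    geom_point(aes(color = T)) +
    ggtitle(paste(myColours[k], myAnimals[k]))
})

# key each plot by "Colour-Animal" so the lookup does not depend on input order
names(plots) <- paste(myColours, myAnimals, sep = "-")

wanted <- c("Red-Emu", "Blue-Echidna", "Orange-Cassowary", "Green-Platypus")
grid <- plot_grid(plotlist = plots[wanted], ncol = 2)
print(grid)
```

Subsetting a list with single brackets returns a list of plots, which is the form `plotlist` expects.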

geom_density plots with nested vectors

I have a data frame with a nested vector in one column. Any ideas on how to draw a geom_density plot using the values from the nested vector?
If I use pivot_longer on the entire data frame, I get 25 million rows, so I'd prefer to avoid that if possible.
library(ggplot2)
df = data.frame(a = rep(letters[1:5],length.out = 100), b = sample(LETTERS, 100, replace = T))
df[["c"]] = purrr::map(1:100, function(x) rnorm(100))
# works but too heavy for the actual implementation
ggplot(tidyr::unnest(df, c), aes(c, group = a)) + geom_density() + facet_wrap(vars(b))
# doesn't work
ggplot(df, aes(c, group = a)) + geom_density() + facet_wrap(vars(b))
Different solution: Prepare each plot separately and rearrange your plots afterwards using gridExtra package.
library(ggplot2)
df = data.frame(a = rep(letters[1:5],length.out = 100), b = sample(LETTERS, 100, replace = T))
df[["c"]] = purrr::map(1:100, function(x) rnorm(100))
lst_plot <- lapply(sort(unique(df$b)), function(x){
data <- df[df$b == x, ]
data <- purrr::map_dfr(seq(length(data$a)), ~ data.frame(a = data$a[.x], c = data$c[.x][[1]]))
gg <- ggplot(data) +
geom_density(aes(c, group = a)) +
ylab(NULL)
return(gg)
})
gridExtra::grid.arrange(grobs = lst_plot, ncol = 6, left = "density")
To be honest, I'm not sure how well this works with your massive dataset...

R: order top 10 of result "means" and output results as .csv

I have database: link
fun_mean <- function(x){return(round(data.frame(y=mean(x),label=mean(x,na.rm=T)),digit=2))}
foo <- qplot(Interest, Scored.Probabilities, data = dataset1, geom = "boxplot");
foo <- foo+stat_summary(fun.y = mean, geom="point",colour="darkred",size=3)+stat_summary(fun.data = fun_mean,geom="text", vjust=-0.7)
ggsave(foo, file="Interest.png", width=20, height=7)
There is too much information in the plot. I want only the top 10 by mean value (output as a new .png). Could I also output the full table of mean values as a .csv?
Thank you.
A dplyr solution
library(dplyr)
fun_mean <- function(x){return(round(data.frame(y=mean(x),label=mean(x,na.rm=T)),digit=2))}
m <- dataset1 %>% group_by(Interest) %>%
summarize(y=mean(Scored.Probabilities),
label=mean(Scored.Probabilities,na.rm=T)) %>%
arrange(desc(y))
idx <- as.character(m$Interest[1:10])
dataset2 <- filter(dataset1,Interest %in% idx)
foo <- qplot(Interest, Scored.Probabilities, data = dataset2, geom = "boxplot");
foo <- foo + stat_summary(fun.y = mean, geom="point",colour="darkred",size=3) +
stat_summary(fun.data = fun_mean,geom="text", vjust=-0.7)
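For the CSV part of the question, the summarised table can be written out with `write.csv()`. The sketch below is self-contained and therefore uses a made-up `dataset1`, since the original data is not available; only the column names `Interest` and `Scored.Probabilities` come from the question, and the output path is a placeholder.

```r
library(dplyr)

# made-up stand-in for the asker's data (assumption: 15 interests, numeric scores)
set.seed(1)
dataset1 <- data.frame(
  Interest = sample(paste0("topic", 1:15), 300, replace = TRUE),
  Scored.Probabilities = runif(300)
)

# mean per interest, highest first
means <- dataset1 %>%
  group_by(Interest) %>%
  summarize(mean_prob = mean(Scored.Probabilities, na.rm = TRUE)) %>%
  arrange(desc(mean_prob))

# write the full table of means; change the path as needed
csv_path <- file.path(tempdir(), "interest_means.csv")
write.csv(means, csv_path, row.names = FALSE)

# keep only the rows that belong to the ten interests with the highest mean
top10 <- head(means, 10)
dataset2 <- filter(dataset1, Interest %in% top10$Interest)
```

`dataset2` can then be passed to the plotting code in place of `dataset1`, and the resulting plot saved with `ggsave()`.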
I don't know the plotting function, but I guess you need to define which of them are the top 10. Something like this could work:
selectedMeans <- which(fun_mean %in% sort(fun_mean)[1:10])
Then I would use the parameter subset = selectedMeans, but I am not sure whether this plotting function supports it. Otherwise I would create a new subtable of the selected data.
