Create the same plot for various data.frames - r

I have three different data.frames (GRCYPT_flows, ESIEIT_flows, GRCYPT_flows) which contain the same variables (report_ctry, partner_ctry, indicator, year, value), but with different levels/observations. Now I want to create plots for each of those data.frames. Since the plots are supposed to look the same, it seems reasonable to use an iterative command. I tried the foreach loop:
foreach(i=GRCYPT_flows, ESIEIT_flows, GRCYPT_flows) %do% { ggplot(i, aes(year, value)) +
geom_line(aes(colour=partner_ctry, linetype=indicator)) + facet_wrap(~report_ctry) +
theme(axis.text.x=element_text(angle=90, vjust=0.5)) +
scale_x_continuous(breaks=seq(2002, 2012, 2), name="") +
scale_y_continuous(name="Billion Euros") +
scale_colour_discrete(breaks=c("EA17", "ROW_NON_EA17"), labels=c("EA17", "Extra-EA17")) +
scale_linetype_discrete(breaks=c("EA17", "ROW_NON_EA17"), labels=c("Trade", "Capital")) +
theme(legend.title=element_blank())}
The code, as it is, does not work. I face two problems here:
Assign a data.frame to an iteration variable.
Tell the foreach loop to save each iteration to a different list with a distinct name (plot1, plot2, plot3, etc.).
I'm relatively sure this is quite easy to solve if you have some experience with R. I'm a total greenhorn, however, so I really don't know where to start (I could easily do it in Stata, with which I have at least some experience).
What I want to do is tell R: "Make a plot for each of these data.frames and save each of them in an individual list."

I would suggest separating the plotting code from the loop, that way you can test it on one example and then run it for the batch easily. And you probably want to save the batch to files.
library(tidyverse)
myplot <- function(df, filename = NULL) {
df %>%
ggplot(aes(Sepal.Length, Petal.Length)) +
geom_point() ->
result
if(!is.null(filename)) ggsave(filename, plot = result, width = 6, height = 4)
else result
}
# test the plot
myplot(iris)
# do the batch
l <- list(one = iris, two = iris)
l %>% names %>% walk(function(n) myplot(l[[n]], paste0(n, ".pdf")))
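To connect this back to the question, here is a minimal sketch of the same "function plus batch" idea using the question's column names. The data is made up (make_flows() and plot_flows() are hypothetical helpers, not from the question), and the files go to a temporary directory, so treat it as an illustration rather than a drop-in solution.

```r
library(ggplot2)

# Hypothetical stand-in data with the question's column names.
make_flows <- function(seed) {
  set.seed(seed)
  d <- expand.grid(
    report_ctry  = c("GR", "CY"),
    partner_ctry = c("EA17", "ROW_NON_EA17"),
    indicator    = c("trade", "capital"),
    year         = 2002:2012,
    stringsAsFactors = FALSE
  )
  d$value <- runif(nrow(d), 0, 10)
  d
}

# A named list keeps each data frame together with its label.
flows <- list(GRCYPT_flows = make_flows(1), ESIEIT_flows = make_flows(2))

# The plotting code lives in one function, so it can be tested on one data frame.
plot_flows <- function(df) {
  ggplot(df, aes(year, value)) +
    geom_line(aes(colour = partner_ctry, linetype = indicator)) +
    facet_wrap(~report_ctry)
}

# Batch: one PDF per data frame, named after the list element.
files <- file.path(tempdir(), paste0(names(flows), ".pdf"))
for (k in seq_along(flows)) {
  ggsave(files[k], plot = plot_flows(flows[[k]]), width = 6, height = 4)
}
```

Keeping the data frames in a named list means the file names come for free from names(flows).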

Here's an example with three data.frames of iris, which I've named i1, i2 and i3 for simplicity's sake.
library(foreach)
library(ggplot2)
library(magrittr) # provides %>%
i2 <- i3 <- i1 <- iris
foreach(m = 1:3) %do% {
dat <- paste0("i" , m) %>% get
ggplot(dat, aes(Sepal.Length, Petal.Length)) + geom_line()
}
Basically the trick is to call for the specific data.frame with get. In your case, this should work:
data.names <- c("GRCYPT_flows", "ESIEIT_flows", "GRCYPT_flows")
foreach(i = seq_along(data.names)) %do% {
dat <- get(data.names[i])
ggplot(dat, aes(year, value)) +
geom_line(aes(colour=partner_ctry, linetype=indicator)) +
facet_wrap(~report_ctry) +
theme(axis.text.x=element_text(angle=90, vjust=0.5)) +
scale_x_continuous(breaks=seq(2002, 2012, 2), name="") +
scale_y_continuous(name="Billion Euros") +
scale_colour_discrete(breaks=c("EA17", "ROW_NON_EA17"),
labels=c("EA17", "Extra-EA17")) +
scale_linetype_discrete(breaks=c("EA17", "ROW_NON_EA17"),
labels=c("Trade", "Capital")) +
theme(legend.title=element_blank())
}
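If the data frames only exist as separate objects, base R's mget() can fetch several of them by name in one call and return a named list, which may be a little tidier than calling get() inside the loop. A small sketch with iris copies standing in for the real data (i1, i2, i3 are placeholders):

```r
library(ggplot2)

# Stand-in data frames (the question's real data is not shown).
i1 <- i2 <- i3 <- iris

data.names <- c("i1", "i2", "i3")

# mget() looks up several objects by name and returns them as a named list.
dat.list <- mget(data.names)

# One plot per data frame; the list names carry over to the plots.
plots <- lapply(dat.list, function(dat) {
  ggplot(dat, aes(Sepal.Length, Petal.Length)) + geom_line()
})

names(plots)
```

Each plot can then be retrieved by the name of its data frame, e.g. plots[["i2"]].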

I think the most "R"-y solution here would be lapply. Lapply takes a vector of things and does the same thing to all of them, then stores the outputs as a list. Since you're using ggplot, you may like a neatly organized list of all the similar plots.
First organize your data frames together in a list
my_data <- list(GRCYPT_flows, ESIEIT_flows)
Two of your "three" data frames have exactly the same name. I'm going to assume you actually meant two, but this would work with any number of data frames.
my_plots = lapply(my_data, function(i) {
ggplot(i, aes(year, value))
})
This takes each element of the list ("i") and applies the custom function to it, where the custom function is your elaborate plot code.
Since you're using ggplot, you can store these plots as outputs, so my_plots will be a neat list with all your plots.
So with your full plot function, try:
my_plots <- lapply(my_data, function(i) {
ggplot(i, aes(year, value)) +
geom_line(aes(colour=partner_ctry, linetype=indicator)) + facet_wrap(~report_ctry) +
theme(axis.text.x=element_text(angle=90, vjust=0.5)) +
scale_x_continuous(breaks=seq(2002, 2012, 2), name="") +
scale_y_continuous(name="Billion Euros") +
scale_colour_discrete(breaks=c("EA17", "ROW_NON_EA17"), labels=c("EA17", "Extra-EA17")) +
scale_linetype_discrete(breaks=c("EA17", "ROW_NON_EA17"), labels=c("Trade", "Capital")) +
theme(legend.title=element_blank())
})
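The question also asked for distinct names (plot1, plot2, ...). One way to get that, sketched here with iris standing in for the real data frames, is to name the elements of the resulting list:

```r
library(ggplot2)

# Hypothetical stand-ins for the question's data frames.
my_data <- list(iris, iris)

my_plots <- lapply(my_data, function(d) {
  ggplot(d, aes(Sepal.Length, Petal.Length)) + geom_point()
})

# Give the plots the distinct names the question asked for.
names(my_plots) <- paste0("plot", seq_along(my_plots))

names(my_plots)
```

After that, my_plots$plot1 or my_plots[["plot2"]] pulls out a single plot, and print(my_plots$plot1) draws it.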

Related

for-loop to create ggplots

I am trying to make boxplots with ggplot2.
The code I have to make the boxplots with the format that I want is as follows:
p <- ggplot(mg_data, aes(x=Treatment, y=CD68, color=Treatment)) +
geom_boxplot(mg_data, mapping=aes(x=Treatment, y=CD68))
p+ theme_classic() + geom_jitter(shape=16, position=position_jitter(0.2))
I was able to use the following code to make looped boxplots:
variables <- mg_data %>%
select(10:17)
for(i in variables) {
print(ggplot(mg_data, aes(x = Treatment, y = i, color=Treatment)) +
geom_boxplot())
}
With this code I get the boxplots; however, they do not have the name of the variable being selected as the y-axis label, unlike the original code without the for loop. I also do not know how to add the formatting code to the loop:
p + theme_classic() + geom_jitter(shape=16, position=position_jitter(0.2))
Here is a way. I have tested with built-in data set iris, just change the data name and selected columns and it will work.
suppressPackageStartupMessages({
library(dplyr)
library(ggplot2)
})
variables <- iris %>%
select(1:4) %>%
names()
for(i in variables) {
g <- ggplot(iris, aes(x = Species, y = get(i), color=Species)) +
geom_boxplot() +
ylab(i)
print(g)
}
Edit
Answering a comment by user TarJae, reproduced here because answers are deleted less often than comments:
Could you please expand with saving all four files. Many thanks.
The code above can be made to save the plots with a ggsave instruction at the loop end. The filename is the variable name and the plot is the default, the return value of last_plot().
for(i in variables) {
g <- ggplot(iris, aes(x = Species, y = get(i), color=Species)) +
geom_boxplot() +
ylab(i)
print(g)
ggsave(paste0(i, ".png"), device = "png")
}
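As a variation on the loop above, the sketch below uses the .data pronoun instead of get() to look up the column by name, and passes the plot to ggsave() explicitly rather than relying on last_plot(), so nothing has to be printed first. The output directory and the PDF extension are my own choices for the example; ggsave() picks the device from the file extension.

```r
library(ggplot2)

variables <- names(iris)[1:4]
out_dir <- tempdir()  # assumption: write to a temporary directory

for (i in variables) {
  # .data[[i]] refers to the column whose name is stored in i.
  g <- ggplot(iris, aes(x = Species, y = .data[[i]], color = Species)) +
    geom_boxplot() +
    ylab(i)
  # plot = g makes the saved plot explicit instead of using last_plot().
  ggsave(file.path(out_dir, paste0(i, ".pdf")), plot = g, width = 5, height = 4)
}
```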
Try this (the column names are strings, so they are looked up with the .data pronoun):
variables <- mg_data %>%
colnames() %>%
`[`(10:17)
for (i in variables) {
print(ggplot(mg_data, aes(
x = Treatment, y = .data[[i]], color = Treatment
)) +
geom_boxplot() +
ylab(i))
}
Another option is to use lapply. It's approximately the same as using a loop, but it hides the actual looping part and can make your code look a little cleaner.
variables = iris %>%
select(1:4) %>%
names()
lapply(variables, function(x) {
ggplot(iris, aes(x = Species, y = get(x), color=Species)) +
geom_boxplot() + ylab(x)
})

Appending list of values onto ggplots

I have a list of values and a list of ggplots. I would like to attach the values from the list on to the ggplots. Is there a good way to do that?
Here's what I have for the list of ggplots:
p.list <- lapply(sort(unique(ind_steps$AnimalID)), function(i){
ggplot(ind_steps[ind_steps$AnimalID == i,], aes(x = t2, y = NSD)) +
geom_line() + theme_bw() +
theme(axis.text.x = element_text(angle = 90)) +
scale_x_datetime(date_breaks = '10 days', date_labels = '%y%j') +
facet_grid( ~ AnimalID, scales = "free") +
scale_colour_manual(values=hcl(seq(15,365,length.out=4)[match(i, sort(unique(ind_steps$AnimalID)))], 100, 65))
})
Assume I have another list of the same length as this one, where each element holds a single value.
I want to pair the ggplots with the list of values, and have the values show up in each respective plot. My expected output would be to have each value from the list of values appear on the corresponding plot in the list of plots.
Since you don't provide any example data, here is an example with the built-in iris dataset. You can add values to plots with geom_text or geom_label (if I understood correctly what you want). For example, here we add the R^2 values to all the plots in a list:
library(ggplot2)
data(iris)
rsq <- lapply(1:length(unique(iris$Species)), function(i) {
cor(iris[iris$Species == unique(iris$Species)[i], "Sepal.Length"], iris[iris$Species == unique(iris$Species)[i], "Petal.Length"])^2
})
p.list <- lapply(1:length(unique(iris$Species)), function(i) {
ggplot(iris[iris$Species == unique(iris$Species)[i], ], aes(x = Sepal.Length, y = Petal.Length)) +
geom_point() + theme_bw()+
geom_text(aes(x=min(Sepal.Length),y=max(Petal.Length),label=paste0("R^2 = ",round(rsq[[i]],2))))
})
library(gridExtra)
grid.arrange(p.list[[1]],p.list[[2]],p.list[[3]],nrow=3)
which returns the three annotated plots stacked in a single column.
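If the list of plots and the list of values already exist, as in the question, the two can also be paired after the fact. The sketch below is one way to do that with Map(), which walks both lists in parallel, and annotate(), which draws a single label per plot. The values in val.list are made up for the example.

```r
library(ggplot2)

# Stand-in inputs: a list of plots and a list of single values.
species <- levels(iris$Species)
p.list <- lapply(species, function(s) {
  ggplot(iris[iris$Species == s, ], aes(Sepal.Length, Petal.Length)) +
    geom_point() + theme_bw()
})
val.list <- list(0.07, 0.57, 0.75)  # hypothetical values, one per plot

# Pair plot k with value k and add the value as a label in the top-left corner.
labelled <- Map(function(p, v) {
  p + annotate("text", x = -Inf, y = Inf, hjust = -0.1, vjust = 1.5,
               label = paste("value:", v))
}, p.list, val.list)
```

Unlike geom_text(aes(...)), which draws the label once per data row, annotate() adds it only once, so the text is not overplotted.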

R: How to join two different plots in the same pdf page over a loop

I am interested in visualizing two plots together in the same page. One plot shows Usage of Drugs over time and the other plot shows Occurrence of Resistance over time. The Y-axis scales are very different for both so I think displaying them in separate graphs is useful.
I use the following code to generate the two different graphs for 676 ids (or elements in my data) into two separate pdfs. This is not helpful when comparing how usage and resistance for one id vary with time. Instead I would like to generate one pdf and, on each page of the pdf, show the resistance and usage variation over time for the same id/element. So the goal is to have 676 pages for 676 ids in my pdf and on each page display the use and resistance for the same id.
I know this can be done using grid.arrange from gridExtra but not sure how to use it in a loop and with lapply.
###Resistance
Plot_list1 =list()
#this is the loop
for (i in J0$id){
temp1 <- J0%>%
filter(id==i)%>%
ggplot(aes(x = Year , y = Rest)) +
geom_line()+
geom_point()+
scale_x_continuous(breaks=c(2008, 2009,2010,2011,2012,2013,2014,2015,2016,2017,2018))+
theme(axis.text.x = element_text(angle = 90))+
theme(legend.position = "none") +
ggtitle(i)
Plot_list1[[i]] <- temp1
}
##saving the loop in pdf
pdf("Resistance.pdf")
invisible(lapply(Plot_list1, print))
dev.off()
###Usage
Plot_list2 =list()
#this is the loop
for (i in J0$id){
temp2 <- J0%>%
filter(id==i)%>%
ggplot(aes(x = Year , y = DUL0)) +
geom_line()+
geom_point()+
scale_x_continuous(breaks=c(2008, 2009,2010,2011,2012,2013,2014,2015,2016,2017,2018))+
theme(axis.text.x = element_text(angle = 90))+
theme(legend.position = "none") +
ggtitle(i)
Plot_list2[[i]] <- temp2
}
##saving the loop in pdf
pdf("UsageDUL0.pdf")
invisible(lapply(Plot_list2, print))
dev.off()
Here's a terse walk-through.
Step 1, generate fake data, plot it individually into gg1 and gg2, then combine them using patchwork. This could easily (and perhaps arguably should) be broken into multiple stages, but it's small enough to use just one lapply.
library(ggplot2)
library(patchwork)
set.seed(42)
allgg <- lapply(1:3, function(ind) {
dat <- mtcars[sample(NROW(mtcars), 10),]
gg1 <- ggplot(dat, aes(disp, mpg)) + geom_point(color = "red") + labs(title = paste("Page ", ind), subtitle = "mpg ~ disp")
gg2 <- ggplot(dat, aes(qsec, drat)) + geom_point(color = "blue") + labs(subtitle = "drat ~ qsec")
gg1 / gg2
})
Start the pdf file, plot them all, then close the device.
pdf("quux.pdf", onefile = TRUE, width = 6, height = 6)
for (pg in allgg) print(pg)
dev.off()
For me, this produces a 6x6 (inch) PDF with three pages, each showing the two plots stacked on top of each other.
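Adapted to the question's setup, the same pattern might look roughly like this. J0 below is fabricated with the question's column names (id, Year, Rest, DUL0) and only three ids, and the output file name is arbitrary, so this is a sketch of the approach rather than the poster's exact code.

```r
library(ggplot2)
library(patchwork)

# Hypothetical stand-in for J0 using the question's column names.
set.seed(1)
J0 <- expand.grid(id = c("a", "b", "c"), Year = 2008:2018,
                  stringsAsFactors = FALSE)
J0$Rest <- runif(nrow(J0))
J0$DUL0 <- runif(nrow(J0), 0, 500)

# One combined page per unique id: usage on top, resistance below.
pages <- lapply(unique(J0$id), function(i) {
  d <- J0[J0$id == i, ]
  use <- ggplot(d, aes(Year, DUL0)) + geom_line() + geom_point() + ggtitle(i)
  res <- ggplot(d, aes(Year, Rest)) + geom_line() + geom_point()
  use / res
})

# A single multi-page PDF, one page per id.
out_file <- file.path(tempdir(), "usage_resistance.pdf")
pdf(out_file, onefile = TRUE, width = 6, height = 6)
for (pg in pages) print(pg)
invisible(dev.off())
```

Looping over unique(J0$id) rather than J0$id avoids rebuilding the same page once per row.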

R - Reorder a bar plot in a function using ggplot2

I have the following plot function using ggplot2.
Function_Plot <- function(Fun_Data, Fun_Color)
{
MyPlot <- ggplot(data = na.omit(Fun_Data), aes_string(x = colnames(Fun_Data[2]), fill = colnames(Fun_Data[1]))) +
geom_bar(stat = "count") +
coord_flip() +
scale_fill_manual(values = Fun_Color)
return(MyPlot)
}
The result is :
I need to upgrade my function to reorder the bars according to the frequencies of the words (in descending order). Following an answer to another question about reordering, I tried to use the reorder function inside aes_string, but it doesn't work.
A reproducible example :
a <- c("G1","G1","G1","G1","G1","G1","G1","G1","G1","G1","G2","G2","G2","G2","G2","G2","G2","G2")
b <- c("happy","sad","happy","bravery","bravery","God","sad","happy","freedom","happy","freedom",
"God","sad","happy","freedom",NA,"money","sad")
MyData <- data.frame(Cluster = a, Word = b)
MyColor <- c("red","blue")
Function_Plot(Fun_Data = MyData, Fun_Color = MyColor)
Well, if reordering doesn't work inside aes_string, let's try it beforehand.
Function_Plot <- function(Fun_Data, Fun_Color)
{
Fun_Data[[2]] <- reorder(Fun_Data[[2]], Fun_Data[[2]], length)
MyPlot <- ggplot(data = na.omit(Fun_Data), aes_string(x = colnames(Fun_Data[2]), fill = colnames(Fun_Data[1]))) +
geom_bar(stat = "count") +
coord_flip() +
scale_fill_manual(values = Fun_Color)
return(MyPlot)
}
Function_Plot(Fun_Data = MyData, Fun_Color = MyColor)
A couple of other notes - I'd recommend you use a more consistent style; mixing whether or not you use _ to separate words in variable names is confusing and asking for bugs.
It won't matter much unless your data is really big, but extracting names from a data frame is very efficient, whereas subsetting a data frame is less efficient. Your code subsets a data frame and then extracts the column names remaining, e.g., colnames(Fun_Data[1]). It will be cleaner to extract the names and then subset that vector: colnames(Fun_Data)[1]
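To see what the reorder() call does before plotting, it can help to look at the factor levels on their own. This small base R check uses the Word column from the reproducible example:

```r
b <- c("happy","sad","happy","bravery","bravery","God","sad","happy","freedom",
       "happy","freedom","God","sad","happy","freedom",NA,"money","sad")

# reorder() turns the words into a factor whose levels are sorted by
# how often each word occurs (ascending); NA does not become a level.
word <- reorder(b, b, length)

levels(word)   # least frequent word first, most frequent last
table(word)    # counts per word, in level order
```

The levels are in ascending order of frequency, and since coord_flip() draws the first level at the bottom, the most frequent word ends up at the top of the flipped bar chart.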

Nested facet_wrap() in ggplot2

Say I have these data:
set.seed(100)
mydf<-
data.frame(
day = rep(1:5, each=20),
id = rep(LETTERS[1:4],25),
x = runif(100),
y = sample(1:2,100,T)
)
If I just want to plot all five days of id=="A" using facet_wrap(), we can do this:
ggplot(mydf[mydf$id=="A",], aes(x,y)) +
geom_tile() +
facet_wrap(~day,ncol=1)
Gives:
But, if I want to plot four of these next to each other automatically in a 2x2 grid (i.e. showing A,B,C,D), is that possible using a nested facet? I tried doing multiple variables in the function like this:
ggplot(mydf, aes(x,y)) +
geom_tile() +
facet_wrap(~ day+id)
but this gives this:
I'm looking for a nested approach. Five faceted rows by day in each panel with each plot in columns/rows by id. Obviously for a small number of plots I could save individually and arrange with grid.arrange etc., but in the real data I have many plots so want to automate if possible.
EDIT:
In response to comment - this is the sort of desired output:
try this,
p <- ggplot(mydf, aes(x,y)) +
geom_tile() +
facet_wrap(~ day, ncol=1)
library(plyr)
lp <- dlply(mydf, "id", function(d) p %+% d + ggtitle(unique(d$id)))
library(gridExtra)
grid.arrange(grobs=lp, ncol=2)
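Since plyr is retired, the same idea can also be written with base R's split() and lapply(). The following is a sketch under that assumption, reusing the question's example data; arrangeGrob() is used so the layout can be built first and drawn afterwards.

```r
library(ggplot2)
library(gridExtra)

set.seed(100)
mydf <- data.frame(day = rep(1:5, each = 20), id = rep(LETTERS[1:4], 25),
                   x = runif(100), y = sample(1:2, 100, TRUE))

# One data frame per id, then one day-faceted plot per piece.
pieces <- split(mydf, mydf$id)
lp <- lapply(names(pieces), function(nm) {
  ggplot(pieces[[nm]], aes(x, y)) +
    geom_tile() +
    facet_wrap(~day, ncol = 1) +
    ggtitle(nm)
})

# arrangeGrob() builds the two-column layout without drawing it yet.
layout <- arrangeGrob(grobs = lp, ncol = 2)
grid::grid.newpage()
grid::grid.draw(layout)
```

With many ids, only ncol needs to change; the list of plots is built the same way.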
Here is a quick attempt using the multiplot function from the Cookbook for R (it is not part of ggplot2, so it has to be defined first):
ids = levels(as.factor(mydf$id))
p = vector("list", length(ids))
names(p) = ids
for(i in 1:length(ids)){
p[[i]] = ggplot(mydf[mydf$id == ids[i],], aes(x,y)) + geom_tile() + ggtitle(paste(ids[i])) + facet_wrap(~day, ncol=1)
}
multiplot(p$A, p$B, p$C, p$D, cols = 2)

Resources