how to plot multiple plots on ggplots with lapply - r

library(ggplot2)
x<-c(1,2,3,4,5)
a<-c(3,8,4,7,6)
b<-c(2,9,4,8,5)
df1 <- data.frame(x, a, b)
x<-c(1,2,3,4,5)
a<-c(6,5,9,4,1)
b<-c(9,5,8,6,2)
df2 <- data.frame(x, a, b)
df.lst <- list(df1, df2)
plotdata <- function(x) {
ggplot(data = x, aes(x=x, y=a, color="blue")) +
geom_point() +
geom_line()
}
lapply(df.lst, plotdata)
I have a list of data frames and I am trying to plot the same columns on the same ggplot. I tried the code above, but it seems to return only one plot.
There should be two ggplots: one with the "a" column plotted and the other with the "b" column plotted, each using both data frames in the list.
I've looked at many examples, and it seems that this should work.

They are both plotted. If you are using RStudio, click the back arrow to toggle between the plots. If you want to see them together, do:
library(gridExtra)
do.call(grid.arrange,lapply(df.lst, plotdata))

If you want them on the same plot, it's as simple as:
# color is set per layer; passing it to ggplot() itself has no effect
ggplot(data = df1, aes(x=x, y=a)) +
geom_point(color="blue") +
geom_line(color="blue") +
geom_line(data = df2, aes(x=x, y=a), color="red") +
geom_point(data = df2, aes(x=x, y=a), color="red")
Edit: if you have several of these, you are probably better off combining them into a big data set while keeping the df of origin for use in the aesthetic. Example:
df.lst <- list(df1, df2)
# put an identifier so that you know which table the data came from after rbind
for(i in seq_along(df.lst)){
df.lst[[i]]$df_num <- i
}
big_df <- do.call(rbind,df.lst) # you could also use `rbindlist` from `data.table`
# now use the identifier for the coloring in your plot
ggplot(data = big_df, aes(x=x, y=a, color=as.factor(df_num))) +
geom_point() +
geom_line() + scale_color_discrete(name="which df did I come from?")
#if you wanted to specify the colors for each df, see ?scale_color_manual instead
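The question also asked for one plot per column ("a" and "b"), each showing both data frames. A minimal sketch of that, building on the combined data set above: `plot_column` is a hypothetical helper name, and the `.data[[col]]` pronoun assumes a reasonably recent ggplot2 (3.0.0 or later).

```r
library(ggplot2)

# Same toy data as in the question
df1 <- data.frame(x = 1:5, a = c(3, 8, 4, 7, 6), b = c(2, 9, 4, 8, 5))
df2 <- data.frame(x = 1:5, a = c(6, 5, 9, 4, 1), b = c(9, 5, 8, 6, 2))
df.lst <- list(df1, df2)

# Tag each data frame with its position in the list, then stack them
tagged <- lapply(seq_along(df.lst), function(i) {
  d <- df.lst[[i]]
  d$df_num <- i
  d
})
big_df <- do.call(rbind, tagged)

# plot_column() is a hypothetical helper: one plot for a given column name,
# with one coloured line per source data frame
plot_column <- function(col) {
  ggplot(big_df, aes(x = x, y = .data[[col]], colour = factor(df_num))) +
    geom_point() +
    geom_line() +
    labs(y = col, colour = "data frame")
}

# A list of two ggplot objects, one for "a" and one for "b"
plots <- lapply(c("a", "b"), plot_column)
```

Printing `plots` at the console draws both in turn; inside a script or function you would need an explicit `print(plots[[1]])`, or `do.call(grid.arrange, plots)` from gridExtra to see them side by side.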

Related

ggplot: adding a label to a geom_line aes_string

I have a for loop plotting 3 geom_lines, how do I add a label/legend so they won't all be 3 indiscernible black lines?
methods.list <- list(rwf,snaive,meanf)
cv.list <- lapply(methods.list, function(method) {
taylor%>% tsCV(forecastfunction = method, h=48)
})
gg <- ggplot(NULL, aes(x))
for (i in seq(1,3)){
gg <- gg + geom_line(aes_string( y=sqrt(colMeans(cv.list[[i]]^2, na.rm=TRUE))))
}
gg + guides(colour=guide_legend(title="Forecast"))
If I don't use a loop, I can use aes instead of that horrible aes_string and then everything works, but I have to write the same code 3 times and replace the loop with this:
gg <- gg + geom_line(aes(y=sqrt(colMeans(cv.list[[1]]^2, na.rm=TRUE)), colour=names(cv.list)[1]))
gg <- gg + geom_line(aes(y=sqrt(colMeans(cv.list[[2]]^2, na.rm=TRUE)), colour=names(cv.list)[2]))
gg <- gg + geom_line(aes(y=sqrt(colMeans(cv.list[[3]]^2, na.rm=TRUE)), colour=names(cv.list)[3]))
and then there are nice automatic colors and legend. What am I missing? Why is r being so noob-unfriendly?
The example is not reproducible (there is no data), but it seems you have a list, cv.list, containing multiple data frames, and you want to plot a summary statistic of each against a common variable stored in x.
The simplest approach is to build a single data.frame and plot from that.
#Create 3 data.frames with data (forecast?)
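df <- lapply(1:3, function(group){
# index the list with the function argument `group` (there is no `i` here)
summ_stat <- sqrt(colMeans(cv.list[[group]]^2, na.rm=TRUE))
# a factor gives discrete colours and one line per group
data.frame(summ_stat, group = factor(group), x = x)
})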
#bind the data.frames into a single data.frame
df <- do.call(rbind, df)
#Create the plot
ggplot(data = df, aes(x = x, y = summ_stat, colour = group)) +
geom_line() +
labs(colour = "Forecast")
Note the legend title set through labs(): it relabels the colour aesthetic that is mapped in aes().
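Since the question has no data, here is a self-contained sketch of the same idea with made-up numbers. The list `rmse.list`, its values, and the horizon vector `h` are all assumptions standing in for the per-horizon RMSE vectors, not the asker's results. Naming the list elements is what gives the legend readable entries.

```r
library(ggplot2)

# Made-up stand-ins for sqrt(colMeans(cv^2)) per forecast method
set.seed(1)
h <- 1:48
rmse.list <- list(
  rwf    = 10 + 0.30 * h + rnorm(48),
  snaive = 12 + 0.10 * h + rnorm(48),
  meanf  = 20 + rnorm(48)
)

# One data frame per method, labelled with the list name, then stacked
df <- do.call(rbind, lapply(names(rmse.list), function(nm) {
  data.frame(h = h, rmse = rmse.list[[nm]], method = nm)
}))

# The colour mapping produces both the distinct lines and the legend
ggplot(df, aes(x = h, y = rmse, colour = method)) +
  geom_line() +
  labs(colour = "Forecast")
```

With the data in long format there is no loop and no aes_string: one geom_line() call draws all three series.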

boxplot ggplot2::qplot() ungrouped and grouped data in same plot R

My data set features a factor(TypeOfCat) and a numeric (AgeOfCat).
I've made the below box plot. In addition to a box representing each type of cat, I've also tried to add a box representing the ungrouped data (ie the entire cohort of cats and their ages). What I've got is not quite what I'm after though, as sum() of course won't provide all the information needed to create such a plot. Any help would be much appreciated.
Data set and current code:
Df1 <- data.frame(TypeOfCat=c("A","B","B","C","C","A","B","C","A","B","A","C"),
AgeOfCat=c(14,2,5,8,4,5,2,6,3,6,12,7))
Df2 <- data.frame(TypeOfCat=c("AllCats"),
AgeOfCat=sum(Df1$AgeOfCat))
Df1 <- rbind(Df1, Df2)
qplot(Df1$TypeOfCat,Df1$AgeOfCat, geom = "boxplot") + coord_flip()
No need for sum. Just take all the values individually for AllCats:
# Your original code:
library(ggplot2)
Df1 <- data.frame(TypeOfCat=c("A","B","B","C","C","A","B","C","A","B","A","C"),
AgeOfCat=c(14,2,5,8,4,5,2,6,3,6,12,7))
# this is the different part:
Df2 <- data.frame(TypeOfCat=c("AllCats"),
AgeOfCat=Df1$AgeOfCat)
Df1 <- rbind(Df1, Df2)
qplot(Df1$TypeOfCat,Df1$AgeOfCat, geom = "boxplot") + coord_flip()
You can see you have all the observations if you add geom_point to the boxplot:
ggplot(Df1, aes(TypeOfCat, AgeOfCat)) +
geom_boxplot() +
geom_point(color='red') +
coord_flip()
Like this?
library(ggplot2)
# first double your data frame, but change "TypeOfCat", since it contains all:
df <- rbind(Df1, transform(Df1, TypeOfCat = "AllCats"))
# then plot it:
ggplot(data = df, mapping = aes(x = TypeOfCat, y = AgeOfCat)) +
geom_boxplot() + coord_flip()

plot selected columns using ggplot2

I would like to draw multiple separate plots, and so far I have the code below.
However, I don't want the final column from my dataset; it makes ggplot2 plot the x-variable against itself.
library(ggplot2)
require(reshape)
d <- read.table("C:/Users/trinh/Desktop/Book1.csv", header=F,sep=",",skip=24)
t<-c(0.25,1,2,3,4,6,8,10)
d2<-d[,3:13] #removing unwanted columns
d2<-cbind(d2,t) #adding x-variable
df <- melt(d2, id = 't')
ggplot(data=df, aes(y=value,x=t)) +geom_point(shape=1) +
geom_smooth(method='lm',se=F)+facet_grid(.~variable)
I tried adding
data=subset(df,df[,3:12])
but I don't think I am writing it correctly. Please advise. Thanks.
Here's how you could do it, using data(iris) as an example:
(i) plot with all variables
df <- reshape2::melt(iris, id="Species")
ggplot(df, aes(y=value, x=Species)) + geom_point() + facet_wrap(~ variable)
(ii) plot without "Petal.Width"
library(dplyr)
df2 <- df %>% filter(variable != "Petal.Width")
ggplot(df2, aes(y=value, x=Species)) + geom_point() + facet_wrap(~ variable)
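Another option is to leave the unwanted column out before reshaping, so it never reaches the plot. A small sketch on the same iris example, using the `measure.vars` argument of `reshape2::melt`; the vector name `keep` is just illustrative.

```r
library(ggplot2)

# Columns to keep as measured variables; Petal.Width is left out on purpose
keep <- c("Sepal.Length", "Sepal.Width", "Petal.Length")

# Only the columns named in measure.vars are melted into variable/value pairs
df3 <- reshape2::melt(iris, id.vars = "Species", measure.vars = keep)

ggplot(df3, aes(y = value, x = Species)) +
  geom_point() +
  facet_wrap(~ variable)
```

For the original data the same pattern would apply with `id.vars = "t"` and the wanted column names in `measure.vars`.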

Align multiple ggplot graphs with and without legends [duplicate]

I'm trying to use ggplot to draw a graph comparing the absolute values of two variables, and also show the ratio between them. Since the ratio is unitless and the values are not, I can't show them on the same y-axis, so I'd like to stack vertically as two separate graphs with aligned x-axes.
Here's what I've got so far:
library(ggplot2)
library(dplyr)
library(gridExtra)
# Prepare some sample data.
results <- data.frame(index=(1:20))
results$control <- 50 * results$index
results$value <- results$index * 50 + 2.5*results$index^2 - results$index^3 / 8
results$ratio <- results$value / results$control
# Plot absolute values
plot_values <- ggplot(results, aes(x=index)) +
geom_point(aes(y=value, color="value")) +
geom_point(aes(y=control, color="control"))
# Plot ratios between values
plot_ratios <- ggplot(results, aes(x=index, y=ratio)) +
geom_point()
# Arrange the two plots above each other
grid.arrange(plot_values, plot_ratios, ncol=1, nrow=2)
The big problem is that the legend on the right of the first plot makes it a different size. A minor problem is that I'd rather not show the x-axis name and tick marks on the top plot, to avoid clutter and make it clear that they share the same axis.
I've looked at this question and its answers:
Align plot areas in ggplot
Unfortunately, neither answer there works well for me. Faceting doesn't seem a good fit, since I want to have completely different y scales for my two graphs. Manipulating the dimensions returned by ggplot_gtable seems more promising, but I don't know how to get around the fact that the two graphs have a different number of cells. Naively copying that code doesn't seem to change the resulting graph dimensions for my case.
Here's another similar question:
The perils of aligning plots in ggplot
The question itself seems to suggest a good option, but rbind.gtable complains if the tables have different numbers of columns, which is the case here due to the legend. Perhaps there's a way to slot in an extra empty column in the second table? Or a way to suppress the legend in the first graph and then re-add it to the combined graph?
Here's a solution that doesn't require explicit use of grid graphics. It uses facets, and hides the legend entry for "ratio" (using a technique from https://stackoverflow.com/a/21802022).
library(reshape2)
results_long <- melt(results, id.vars="index")
results_long$facet <- ifelse(results_long$variable=="ratio", "ratio", "values")
results_long$facet <- factor(results_long$facet, levels=c("values", "ratio"))
ggplot(results_long, aes(x=index, y=value, colour=variable)) +
geom_point() +
facet_grid(facet ~ ., scales="free_y") +
scale_colour_manual(breaks=c("control","value"),
values=c("#1B9E77", "#D95F02", "#7570B3")) +
theme(legend.justification=c(0,1), legend.position=c(0,1)) +
guides(colour=guide_legend(title=NULL)) +
theme(axis.title.y = element_blank())
Try this:
library(ggplot2)
library(gtable)
library(gridExtra)
library(grid) # provides unit() and unit.pmax() used below
AlignPlots <- function(...) {
LegendWidth <- function(x) x$grobs[[8]]$grobs[[1]]$widths[[4]]
plots.grobs <- lapply(list(...), ggplotGrob)
max.widths <- do.call(unit.pmax, lapply(plots.grobs, "[[", "widths"))
plots.grobs.eq.widths <- lapply(plots.grobs, function(x) {
x$widths <- max.widths
x
})
legends.widths <- lapply(plots.grobs, LegendWidth)
max.legends.width <- do.call(max, legends.widths)
plots.grobs.eq.widths.aligned <- lapply(plots.grobs.eq.widths, function(x) {
if (is.gtable(x$grobs[[8]])) {
x$grobs[[8]] <- gtable_add_cols(x$grobs[[8]],
unit(abs(diff(c(LegendWidth(x),
max.legends.width))),
"mm"))
}
x
})
plots.grobs.eq.widths.aligned
}
df <- data.frame(x = c(1:5, 1:5),
y = c(1:5, seq.int(5,1)),
type = factor(c(rep_len("t1", 5), rep_len("t2", 5))))
p1.1 <- ggplot(diamonds, aes(clarity, fill = cut)) + geom_bar()
p1.2 <- ggplot(df, aes(x = x, y = y, colour = type)) + geom_line()
plots1 <- AlignPlots(p1.1, p1.2)
do.call(grid.arrange, plots1)
p2.1 <- ggplot(diamonds, aes(clarity, fill = cut)) + geom_bar()
p2.2 <- ggplot(df, aes(x = x, y = y)) + geom_line()
plots2 <- AlignPlots(p2.1, p2.2)
do.call(grid.arrange, plots2)
Based on several of baptiste's answers.
Encouraged by baptiste's comment, here's what I did in the end:
library(ggplot2)
library(dplyr)
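library(gridExtra)
library(gtable) # provides gtable_add_cols() and gtable_add_rows()
library(grid) # provides unit() and grid.draw()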
# Prepare some sample data.
results <- data.frame(index=(1:20))
results$control <- 50 * results$index
results$value <- results$index * 50 + 2.5*results$index^2 - results$index^3 / 8
results$ratio <- results$value / results$control
# Plot ratios between values
plot_ratios <- ggplot(results, aes(x=index, y=ratio)) +
geom_point()
# Plot absolute values
remove_x_axis =
theme(
axis.ticks.x = element_blank(),
axis.text.x = element_blank(),
axis.title.x = element_blank())
plot_values <- ggplot(results, aes(x=index)) +
geom_point(aes(y=value, color="value")) +
geom_point(aes(y=control, color="control")) +
remove_x_axis
# Arrange the two plots above each other
grob_ratios <- ggplotGrob(plot_ratios)
grob_values <- ggplotGrob(plot_values)
legend_column <- 5
legend_width <- grob_values$widths[legend_column]
grob_ratios <- gtable_add_cols(grob_ratios, legend_width, legend_column-1)
grob_combined <- gtable:::rbind_gtable(grob_values, grob_ratios, "first")
grob_combined <- gtable_add_rows(
grob_combined,unit(-1.2,"cm"), pos=nrow(grob_values))
grid.draw(grob_combined)
(I later realised I didn't even need to extract the legend width, since the size="first" argument to rbind tells it just to have that one override the other.)
It feels a bit messy, but it is exactly the layout I was hoping for.
An alternative & quite easy solution is as follows:
# loading needed packages
library(ggplot2)
library(dplyr)
library(tidyr)
# Prepare some sample data
results <- data.frame(index=(1:20))
results$control <- 50 * results$index
results$value <- results$index * 50 + 2.5*results$index^2 - results$index^3 / 8
results$ratio <- results$value / results$control
# reshape into long format
long <- results %>%
gather(variable, value, -index) %>%
mutate(facet = ifelse(variable=="ratio", "ratio", "values"))
long$facet <- factor(long$facet, levels=c("values", "ratio"))
# create the plot & remove facet labels with theme() elements
ggplot(long, aes(x=index, y=value, colour=variable)) +
geom_point() +
facet_grid(facet ~ ., scales="free_y") +
scale_colour_manual(breaks=c("control","value"), values=c("green", "red", "blue")) +
theme(axis.title.y=element_blank(), strip.text=element_blank(), strip.background=element_blank())

Adding an extra line with less points using ggplot2

I am trying to add an extra line that contains fewer points.
I have tried:
library(ggplot2)
df <- data.frame(x=c(1:50),y=c(1:50)*2+5)
df2 <- data.frame(x=c(20,30,40),y=c(40,60,80))
plot1 <- ggplot(df , aes(x=x, y=y)) + geom_line()
plot2 <- plot1+ geom_line( aes(x=df2$x, y=df2$y))
plot2
But this does not work.
You could pass df2 to the data argument of the second geom_line call and change the mapping slightly:
plot2 <- plot1+ geom_line( aes(x=x, y=y), data = df2)
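For completeness, a runnable sketch of the whole thing. Because df2 uses the same column names as df, the second layer can inherit the `aes(x, y)` mapping from `ggplot()` and only needs its own `data`. The red colour and the extra point layer are additions for visibility, not part of the original answer.

```r
library(ggplot2)

df  <- data.frame(x = 1:50, y = (1:50) * 2 + 5)
df2 <- data.frame(x = c(20, 30, 40), y = c(40, 60, 80))

plot2 <- ggplot(df, aes(x = x, y = y)) +
  geom_line() +                              # 50-point line from df
  geom_line(data = df2, colour = "red") +    # 3-point line from df2
  geom_point(data = df2, colour = "red")     # mark the three points
plot2
```

The original attempt failed because `aes(x=df2$x, y=df2$y)` is still evaluated against the layer's inherited data, df, which has 50 rows while the vectors have 3.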
