I am using ggplot2 for a multiplot. Even after a lot of tweaking, I still face
the following problems:
Some free space gets plotted on each side (left/right) of each plot. I have marked this on the right side of each plot.
The plots are not aligned on the left side. This problem is clearly visible in the bottom plot.
The y-axis label is too far away from the plots. Can I reduce this separation?
Multiplot is:
I used the following R code for the same:
p1 <- ggplot(data = dplots[[1]],aes(timestamp,power/1000))+ geom_line()+
ylab("")+theme(axis.text.x=element_blank(),axis.title.x=element_blank(),axis.ticks.x=element_blank(),plot.margin = unit(c(-0.3,1,-0.3,1), "cm"))+labs(title="room1")
p2 <- ggplot(data = dplots[[2]],aes(timestamp,power/1000))+ geom_line()+
ylab("")+theme(axis.text.x=element_blank(),axis.title.x=element_blank(),axis.ticks.x=element_blank(),plot.margin = unit(c(-0.3,1,-0.3,1), "cm"))+ labs(title="room2")
p3 <- ggplot(data = dplots[[6]],aes(timestamp,power/1000))+ geom_line()+
ylab("")+theme(axis.text.x=element_blank(),axis.title.x=element_blank(),axis.ticks.x=element_blank(),plot.margin = unit(c(-0.3,1,-0.3,1), "cm"))+ labs(title="room3")
p4 <- ggplot(data = dplots[[4]],aes(timestamp,power/1000))+ geom_line()+
ylab("")+theme(axis.text.x=element_blank(),axis.title.x=element_blank(),axis.ticks.x=element_blank(),plot.margin = unit(c(-0.3,1,-0.3,1), "cm"))+ labs(title="room4")
p5 <- ggplot(data = dplots[[5]],aes(timestamp,power/1000))+ geom_line()+
ylab("")+theme(axis.text.x=element_blank(),axis.title.x=element_blank(),axis.ticks.x=element_blank(),plot.margin = unit(c(-0.3,1,-0.3,1), "cm"))+ labs(title="room5")
p6 <- ggplot(data = dplots[[3]],aes(timestamp,power/1000))+ geom_line()+
ylab("")+theme(axis.title.x=element_blank(),axis.ticks.x=element_blank(),plot.margin = unit(c(-0.3,1,-0.3,1), "cm"))+ labs(title="Chiller") +
scale_x_datetime(labels= date_format("%d-%m-%y",tz ="UTC"),breaks = pretty_breaks(8))
grid.arrange(p1,p2,p3,p4,p5,p6,nrow=6,ncol=1,heights=c(0.15,0.15,0.15,0.15,0.15,0.15),left="Power (KW)")
The dataset (dplots) is stored at the link.
Probably the easiest solution is to combine the dataframes in the list in one dataset. With rbindlist from the data.table package you can also include id's for each dataframe:
library(data.table)
# bind the dataframes together into one datatable (which is an enhanced dataframe)
DT <- rbindlist(dplots, idcol = "id")
# give names to the id's
DT$id <- factor(DT$id, labels = c("room 1","room 2","room 3", "room 4","room 5","Chiller"))
library(ggplot2)
ggplot(DT, aes(x = timestamp, y = power)) +
geom_line() +
scale_x_datetime(expand = c(0,0)) +
facet_grid(id ~ ., scales="free_y") +
theme_bw()
This results in the following plot:
With your existing code, use the cowplot package:
library(cowplot)
plot_grid(p1,p2,p3,p4,p5,p6,ncol=1,align = "v")
I asked a question yesterday about annotating the x-axis with N in a faceted plot using a minimal example that turns out to be too simple, relative to my real problem. The answer given there works in the case of complete data, but if you have missing facets you would like to preserve, the combination of facet_wrap options drop=FALSE and scales="free_x" triggers an error: "Error in if (zero_range(from) || zero_range(to)) { : missing value where TRUE/FALSE needed"
Here is a new, less-minimal example. The goal here is to produce a large graph with two panels using grid.arrange; the first showing absolute values over time by treatment group; the second showing the change from baseline over time by treatment group. In the second panel, we need a blank facet when vis=1.
# setup
library(ggplot2)
library(plyr)
library(gridExtra)
trt <- factor(rep(LETTERS[1:2],150),ordered=TRUE)
vis <- factor(c(rep(1,150),rep(2,100),rep(3,50)),ordered=TRUE)
id <- c(c(1:150),c(1:100),c(1:50))
val <- rnorm(300)
data <- data.frame(id,trt,vis,val)
base <- with(subset(data,vis==1),data.frame(id,trt,baseval=val))
data <- merge(data,base,by="id")
data <- transform(data,chg=ifelse(vis==1,NA,val-baseval))
data.sum <- ddply(data, .(vis, trt), summarise, N=length(na.omit(val)))
data <- merge(data,data.sum)
data <- transform(data, trtN=paste(trt,N,sep="\n"))
mytheme <- theme_bw() + theme(panel.margin = unit(0, "lines"), strip.background = element_blank())
# no missing facets
plot.a <- ggplot(data) + geom_boxplot(aes(x=trtN,y=val,group=trt,colour=trt), show.legend=FALSE) +
facet_wrap(~ vis, drop=FALSE, switch="x", nrow=1, scales="free_x") +
labs(x="Visit") + mytheme
# first facet should be blank
plot.b <- ggplot(data) + geom_boxplot(aes(x=trtN,y=chg,group=trt,colour=trt), show.legend=FALSE) +
facet_wrap(~ vis, drop=FALSE, switch="x", nrow=1, scales="free_x") +
labs(x="Visit") + mytheme
grid.arrange(plot.a,plot.b,nrow=2)
You can add a blank layer to draw all the facets in your second plot. The key is that you need a variable that exists for every level of vis to use as your y variable. In your case you can simply use the variable you used in your first plot.
ggplot(data) +
geom_boxplot(aes(x = trtN, y = chg, group = trt, colour = trt), show.legend = FALSE) +
geom_blank(aes(x = trtN, y = val)) +
facet_wrap(~ vis, switch = "x", nrow = 1, scales = "free_x") +
labs(x="Visit") + mytheme
If your variables have different ranges, you can set the y limits using the overall min and max of your boxplot y variable.
+ scale_y_continuous(limits = c(min(data$chg, na.rm = TRUE), max(data$chg, na.rm = TRUE)))
I have a data.frame that looks something like this:
HSP90AA1 SSH2 ACTB TotalTranscripts
ESC_11_TTCGCCAAATCC 8.053308 12.038484 10.557234 33367.23
ESC_10_TTGAGCTGCACT 9.430003 10.687959 10.437068 30285.41
ESC_11_GCCGCGTTATAA 7.953726 9.918988 10.078192 30133.94
ESC_11_GCATTCTGGCTC 11.184402 11.056144 8.316846 24857.07
ESC_11_GTTACATTTCAC 11.943733 11.004500 9.240883 23629.00
ESC_11_CCGTTGCCCCTC 7.441695 9.774733 7.566619 22792.18
The TotalTranscripts column is sorted in descending order. What I'd like to do is generate three bar graphs using ggplot2, one for each column of the data.frame except TotalTranscripts. I'd like the bars to be ordered by TotalTranscripts, just as in the data.frame. It would be ideal to have these bar graphs on one plot using a facet wrap.
Any help would be greatly appreciated! Thank you!
EDIT: Here is my current code using barplot().
cells = "ESC"
genes = c("HSP90AA1", "SSH2", "ACTB")
g = data[genes,grep(cells, colnames(data))]
g = data.frame(t(g), colSums(data)[grep(cells, colnames(data))])
colnames(g)[ncol(g)] = "TotalTranscripts"
g = g[order(g$TotalTranscripts, decreasing=T), , drop=F]
barplot(as.matrix(g[1]), beside=TRUE, names.arg=paste(rownames(g)," (",g$TotalTranscripts,")",sep=""), las=2, col="light blue", cex.names=0.3, main=paste(colnames(g)[1], "\nCells sorted by total number of transcripts (colSums)", sep=""))
This will generate a plot that looks like this.
Again, the problem I seem to be having here is how to have multiple of these plots on the same image. I would like to add 20+ columns to this data.frame but I've cut this down to 3 for the sake of simplicity.
EDIT: Current code incorporating the answer below
cells = "ESC"
genes = rownames(data[x,])[1:8]
# genes = c("HSP90AA1", "SSH2", "ACTB")
g = data[genes,grep(cells, colnames(data))]
g = data.frame(t(g), colSums(data)[grep(cells, colnames(data))])
colnames(g)[ncol(g)] = "TotalTranscripts"
g = g[order(g$TotalTranscripts, decreasing=T), , drop=F]
g$rowz <- row.names(g)
g$Cells <- reorder(g$rowz, rev(g$TotalTranscripts))
df1 <- melt(g, id.vars = c("Cells", "TotalTranscripts"), measure.vars=genes)
ggplot(df1, aes(x = Cells, y = value)) + geom_bar(stat = "identity") +
theme(axis.title.x=element_blank(), axis.text.x = element_blank()) +
facet_wrap(~ variable, scales = "free") +
theme_bw() + theme(axis.text.x = element_text(angle = 90))
Here is the example data for anybody else:
df <- structure(list(HSP90AA1 = c(8.053308, 9.430003, 7.953726, 11.184402,
11.943733, 7.441695), SSH2 = c(12.038484, 10.687959, 9.918988,
11.056144, 11.0045, 9.774733), ACTB = c(10.557234, 10.437068,
10.078192, 8.316846, 9.240883, 7.566619), TotalTranscripts = c(33367.23,
30285.41, 30133.94, 24857.07, 23629, 22792.18)), .Names = c("HSP90AA1",
"SSH2", "ACTB", "TotalTranscripts"), class = "data.frame", row.names = c("ESC_11_TTCGCCAAATCC",
"ESC_10_TTGAGCTGCACT", "ESC_11_GCCGCGTTATAA", "ESC_11_GCATTCTGGCTC",
"ESC_11_GTTACATTTCAC", "ESC_11_CCGTTGCCCCTC"))
And here is a solution:
#New column for row names so they can be used as x-axis elements
df$rowz <- row.names(df)
#Explicitly order the rows (see the Kohske link)
df$rowz1 <- reorder(df$rowz, rev(df$TotalTranscripts))
library(reshape2)
#Melt the data from wide to long
df1 <- melt(df, id.vars = c("rowz1", "TotalTranscripts"),
measure.vars = c("HSP90AA1", "SSH2", "ACTB"))
library(ggplot2)
gp <- ggplot(df1, aes(x = rowz1, y = value)) + geom_bar(stat = "identity") +
facet_wrap(~ variable, scales = "free") +
theme_bw()
gp + theme(axis.text.x = element_text(angle = 90))
This example by Kohske is a constant reference for me on ordering elements in ggplot2.
If you have many columns, but the same six ESC complexes, you can switch the groupings, i.e. x = variable and facet_wrap(~ rowz1), but this fundamentally changes how you are visualizing/comparing your data. Also, consider facet_grid(row ~ column) if you can organize the columns by 2 components (Columns being the data that are melted into 'variable' and 'value').
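As a rough sketch of the switched grouping, assuming only the first three rows of the example data (rebuilt here so the snippet runs on its own) and using `-TotalTranscripts` in `reorder()` as a stand-in for the `rev()` trick above; the name `gp_switched` is made up for illustration:

```r
library(reshape2)
library(ggplot2)

# First three rows of the example data, rebuilt so this sketch is self-contained
df <- data.frame(
  HSP90AA1 = c(8.053308, 9.430003, 7.953726),
  SSH2 = c(12.038484, 10.687959, 9.918988),
  ACTB = c(10.557234, 10.437068, 10.078192),
  TotalTranscripts = c(33367.23, 30285.41, 30133.94),
  row.names = c("ESC_11_TTCGCCAAATCC", "ESC_10_TTGAGCTGCACT",
                "ESC_11_GCCGCGTTATAA")
)
df$rowz <- row.names(df)

# Order the cells by descending TotalTranscripts
# (assumption: negating the score replaces rev() from the answer)
df$rowz1 <- reorder(df$rowz, -df$TotalTranscripts)

# Melt from wide to long, as in the answer
df1 <- melt(df, id.vars = c("rowz1", "TotalTranscripts"),
            measure.vars = c("HSP90AA1", "SSH2", "ACTB"))

# Switched grouping: genes on the x-axis, one panel per cell
gp_switched <- ggplot(df1, aes(x = variable, y = value)) +
  geom_bar(stat = "identity") +
  facet_wrap(~ rowz1) +
  theme_bw()
gp_switched
```

With this layout each panel compares genes within one cell, so it only stays readable while the number of cells is small.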
And this additional SO solution isn't related to your question, but it is an elegant way to reorder elements in each facet by their values (for future reference).
Finally, the method that will give you the finest control is to plot each graph separately and combine the grobs. Baptiste's packages like gridExtra and gtable are useful for these tasks.
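A minimal sketch of the separate-plots approach with gridExtra, assuming two of the gene columns and the first three cells from the example data; the short cell labels and the names `small`, `p_hsp`, `p_ssh` and `combined` are invented for illustration:

```r
library(ggplot2)
library(gridExtra)

# Two gene columns for the first three cells of the example data
# (cell labels shortened for this sketch)
small <- data.frame(
  cell = factor(c("TTCG", "TTGA", "GCCG"), levels = c("TTCG", "TTGA", "GCCG")),
  HSP90AA1 = c(8.053308, 9.430003, 7.953726),
  SSH2 = c(12.038484, 10.687959, 9.918988)
)

# Build each plot on its own, so every panel can be styled independently
p_hsp <- ggplot(small, aes(x = cell, y = HSP90AA1)) +
  geom_bar(stat = "identity") + theme_bw() + labs(title = "HSP90AA1")
p_ssh <- ggplot(small, aes(x = cell, y = SSH2)) +
  geom_bar(stat = "identity") + theme_bw() + labs(title = "SSH2")

# arrangeGrob() assembles the layout without drawing it;
# grid.draw() then renders the combined object
combined <- arrangeGrob(p_hsp, p_ssh, ncol = 2)
grid::grid.newpage()
grid::grid.draw(combined)
```

`grid.arrange(p_hsp, p_ssh, ncol = 2)` draws the same layout in one call. Keeping the `arrangeGrob()` result is convenient when the combined object needs to be saved or modified later.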
**EDIT in response to new information from OP**
The OP has subsequently asked how to visualize the data, especially when there are more ESC categorical variables (up to 600+).
Here are some examples, with the big caveat that with many categorical variables, they should be grouped or converted to a continuous variable somehow.
#Plot colour to a few discrete, categorical variables
gp + aes(fill = rowz1) +
theme(axis.text.x = element_blank(), axis.ticks.x = element_blank()) +
labs(x = NULL, fill = "Cell", title = "Discrete categorical variables")
#Plot colour on a continuous scale.
#Ultimately, not appropriate for this example! (but shown for reference)
#More appropriate: fill = TotalTranscripts
gp + aes(fill = as.numeric(rowz1)) +
theme(axis.text.x = element_blank(), axis.ticks.x = element_blank()) +
labs(x = NULL, title = "Continuous variables (legend won't work for many values)") +
scale_fill_gradient2(name = "Cell",
breaks = as.numeric(df1$rowz1),
labels = df1$rowz1,
midpoint=median(as.numeric(df1$rowz1)))
#x is continuous, colour plotted to the categorical variable.
#Same caveats as earlier.
gp1 <- ggplot(df1, aes(x = TotalTranscripts/1000, y = value, colour = rowz1)) +
geom_point(size=3) + facet_wrap(~ variable, scales = "free") +
labs(title = "X is an actual continuous variable") +
theme_bw() + labs(x = bquote("Total Transcripts,"~10^3), colour = "Cell")
gp1
Does somebody know an alternative method for ordering the stacks of a ggplot2 bar graph?
I used to use for example
library(ggplot2)
library(plyr)
a <- cbind(rep("a",5),sample(1:100,5), rep_len(c("1","2","3"),5))
b <- cbind(rep("b",7),sample(1:100,7), rep_len(c("1","2","3"),7))
c <- cbind(rep("c",3),sample(1:100,3), rep_len(c("1","2","3"),3))
d <- cbind(rep("d",10),sample(1:100,10), rep_len(c("1","2","3"),10))
e <- cbind(rep("e",15),sample(1:100,15), rep_len(c("1","2","3"),15))
dat <- rbind(a,b,c,d,e)
colnames(dat) <- c("x","count","example")
dat <- as.data.frame(dat)
dat$x <- as.character(dat$x)
dat$count <- as.numeric(dat$count)
dat$example <- as.character(dat$example)
GP <- ggplot(dat, aes(x= reorder(x, count, sum), y=count, fill = example, order = desc(count)))+
geom_bar(stat="identity", fill= "grey", colour= "black", size = 1)+
coord_flip() +
scale_y_continuous()+
scale_x_discrete('')+
#scale_fill_brewer()+
labs(y="")+
theme_bw()+
theme(axis.text.y=element_text(size=8,face="bold"),
axis.text.x=element_text(size=10,face="bold"),
axis.title.x=element_text(size=16,face="bold"),
axis.title.y=element_text(size=16,face="bold"),
plot.title=element_text(size=16,face="bold"),
strip.text.x = element_text(size=10,face="bold"),
strip.background = element_blank())
print(GP)
to create graphs like
However, in version 2.0.0 of ggplot2 the order aesthetic is no longer supported, and now the graph looks like this:
Does anybody know an alternative?
Thanks
I've created a plot of categorical data using facet in ggplot.
Example script here:
#script to produce plot with dummy data
rm(list=ls(all=TRUE))
library(ggplot2)
require(gridExtra)
#put dummy data in df
dummy_data<-data.frame(experiment_number=c(rep("exp_1",15),rep("exp_2",15)),
group=rep(c("A","B","C"),5),yvalue=runif(30, 0.0, 0.05))
# make plot
plot1<-ggplot(data = dummy_data)+
geom_point(aes(x = group, y = yvalue,
colour=group,shape=group),size=3.5,position = position_jitter(w = 0.2)) +
facet_wrap( ~ experiment_number) +
ylab("yvalue") +
xlab("")
#plot
plot1
I now want to add text and bars below the plot to show the p values from a statistical test between the groups. An example that I drew by hand is attached (the p values are made up).
Note that the p values will be different in the two panels. I've played around with annotate and custom annotations but can't seem to get it to work. Any ideas?
thanks v much
Here's a totally ridiculous way of doing something similar to what you are asking for. I used geom_errorbar for the bars, so I had to flip the coordinate system. Anyway, you should be able to customize this to do what you need.
rm(list=ls(all=TRUE))
library(ggplot2)
#put dummy data in df
dummy_data<-data.frame(experiment_number=c(rep("exp_1",15),rep("exp_2",15)),
group=rep(c("A","B","C"),5),yvalue=runif(30, 0.0, 0.05))
# make plot
plot1<-ggplot(data = dummy_data)+
geom_point(aes(y = group, x = yvalue, #changed x and y
colour=group,shape=group),size=3.5,position = position_jitter(h = 0.2)) + # changed w=... to h=...
facet_wrap( ~ experiment_number) +
xlab("yvalue") +
ylab("") + coord_flip() # flipped coordinate system
#plot
rng <- range(dummy_data$yvalue) # range
df.lines <- data.frame(ymin=LETTERS[1:3], ymax=LETTERS[c(2,3,1)], x=rng[1]-diff(rng)*1:3/12) #data for geom_errorbar
# data for geom_text
df.txt <- data.frame(y=c("AB", "BC", "B"),
x=rng[1]-diff(rng)*(1:3+.5)/12,
label=c("p=0.003", "p=0.05", "p=0.6",
"p=0.2", "p=0.1", "p=0.05"),
experiment_number=rep(c("exp_1", "exp_2"), each=3))
# add some space and geom_errorbar and geom_text
plot2 <- plot1 + scale_x_continuous(limits=c(rng[1]-diff(rng)/3, rng[2]+diff(rng)/5)) +
geom_errorbar(data=df.lines, aes(x=x, ymin=ymin, ymax=ymax)) +
scale_y_discrete(breaks=LETTERS[1:3], limits=c("A", "AB", "B", "BC", "C")) +
geom_text(data=df.txt, aes(x=x, y=y, label=label), hjust=0.5)
plot2
I have data where I look at the difference in growth between a monoculture and a mixed culture for two different species. Additionally, I made a graph to make my data clear.
I want a barplot with error bars, the whole dataset is of course bigger, but for this graph this is the data.frame with the means for the barplot.
plant species means
Mixed culture Elytrigia 0.886625
Monoculture Elytrigia 1.022667
Monoculture Festuca 0.314375
Mixed culture Festuca 0.078125
With this data I made a graph in ggplot2, where plant is on the x-axis and means on the y-axis, and I used a facet to divide the species.
This is my code:
limits <- aes(ymax = meansS$means + eS$se, ymin=meansS$means - eS$se)
dodge <- position_dodge(width=0.9)
myplot <- ggplot(data=meansS, aes(x=plant, y=means, fill=plant)) + facet_grid(. ~ species)
myplot <- myplot + geom_bar(position=dodge) + geom_errorbar(limits, position=dodge, width=0.25)
myplot <- myplot + scale_fill_manual(values=c("#6495ED","#FF7F50"))
myplot <- myplot + labs(x = "Plant treatment", y = "Shoot biomass (gr)")
myplot <- myplot + opts(title="Plant competition")
myplot <- myplot + opts(legend.position = "none")
myplot <- myplot + opts(panel.grid.minor=theme_blank(), panel.grid.major=theme_blank())
So far it is fine. However, I want to add two different horizontal lines in the two facets. For that, I used this code:
hline.data <- data.frame(z = c(0.511,0.157), species = c("Elytrigia","Festuca"))
myplot <- myplot + geom_hline(aes(yintercept = z), hline.data)
However, if I do that, I get a plot where there are two extra facets, in which the two horizontal lines are plotted. Instead, I want the horizontal lines to be plotted in the facets with the bars, not in two new facets. Does anyone have an idea how to solve this?
I think it makes it clearer if I put the graph I create now:
Make sure that the variable species is identical in both datasets. If it is a factor in one of them, then it must be a factor in the other too.
library(ggplot2)
dummy1 <- expand.grid(X = factor(c("A", "B")), Y = rnorm(10))
dummy1$D <- rnorm(nrow(dummy1))
dummy2 <- data.frame(X = c("A", "B"), Z = c(1, 0))
ggplot(dummy1, aes(x = D, y = Y)) + geom_point() + facet_grid(~X) +
geom_hline(data = dummy2, aes(yintercept = Z))
dummy2$X <- factor(dummy2$X)
ggplot(dummy1, aes(x = D, y = Y)) + geom_point() + facet_grid(~X) +
geom_hline(data = dummy2, aes(yintercept = Z))
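Applied to the data from the question, the same idea might look like the sketch below. It assumes that species is a factor in meansS (the question does not say), rebuilds meansS from the printed means, and leaves out the error bars because the se values are not shown:

```r
library(ggplot2)

# Means from the question; species is stored as a factor here (assumption)
meansS <- data.frame(
  plant   = c("Mixed culture", "Monoculture", "Monoculture", "Mixed culture"),
  species = factor(c("Elytrigia", "Elytrigia", "Festuca", "Festuca")),
  means   = c(0.886625, 1.022667, 0.314375, 0.078125)
)

# Line positions from the question; species starts out as character
hline.data <- data.frame(z = c(0.511, 0.157),
                         species = c("Elytrigia", "Festuca"),
                         stringsAsFactors = FALSE)

# Give species the same type and the same levels as in the main data
hline.data$species <- factor(hline.data$species,
                             levels = levels(meansS$species))

# One horizontal line per facet, drawn in the facets that hold the bars
p <- ggplot(meansS, aes(x = plant, y = means, fill = plant)) +
  geom_bar(stat = "identity") +
  facet_grid(. ~ species) +
  geom_hline(data = hline.data, aes(yintercept = z))
p
```

Reusing `levels(meansS$species)` also guards against a level being misspelled in hline.data, since a mismatch would turn into NA rather than silently creating a new facet.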