Is it possible to reproduce this lattice plot with ggplot2?
library(latticeExtra)
data(mtcars)
x <- t(as.matrix(scale(mtcars)))
dd.row <- as.dendrogram(hclust(dist(x)))
row.ord <- order.dendrogram(dd.row)
dd.col <- as.dendrogram(hclust(dist(t(x))))
col.ord <- order.dendrogram(dd.col)
library(lattice)
levelplot(x[row.ord, col.ord],
aspect = "fill",
scales = list(x = list(rot = 90)),
colorkey = list(space = "left"),
legend =
list(right =
list(fun = dendrogramGrob,
args =
list(x = dd.col, ord = col.ord,
side = "right",
size = 10)),
top =
list(fun = dendrogramGrob,
args =
list(x = dd.row,
side = "top",
size = 10))))
EDIT
From 8 August 2011 the ggdendro package is available on CRAN
Note also that the dendrogram extraction function is now called dendro_data instead of cluster_data
Yes, it is. But for the time being you will have to jump through a few hoops:
Install the ggdendro package (available from CRAN). This package extracts the cluster information from several types of cluster objects (including hclust and dendrogram) with the express purpose of plotting them in ggplot2.
Use grid graphics to create viewports and align three different plots.
The code:
First load the libraries and set up the data for ggplot:
library(ggplot2)
library(reshape2)
library(ggdendro)
library(grid) # provides grid.newpage() and viewport(), used when drawing below
data(mtcars)
x <- as.matrix(scale(mtcars))
dd.col <- as.dendrogram(hclust(dist(x)))
col.ord <- order.dendrogram(dd.col)
dd.row <- as.dendrogram(hclust(dist(t(x))))
row.ord <- order.dendrogram(dd.row)
xx <- scale(mtcars)[col.ord, row.ord]
xx_names <- attr(xx, "dimnames")
df <- as.data.frame(xx)
colnames(df) <- xx_names[[2]]
df$car <- xx_names[[1]]
df$car <- with(df, factor(car, levels=car, ordered=TRUE))
mdf <- melt(df, id.vars="car")
Extract dendrogram data and create the plots
ddata_x <- dendro_data(dd.row)
ddata_y <- dendro_data(dd.col)
### Set up a blank theme
theme_none <- theme(
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.background = element_blank(),
axis.title.x = element_text(colour=NA),
axis.title.y = element_blank(),
axis.text.x = element_blank(),
axis.text.y = element_blank(),
axis.line = element_blank()
#axis.ticks.length = element_blank()
)
### Create plot components ###
# Heatmap
p1 <- ggplot(mdf, aes(x=variable, y=car)) +
geom_tile(aes(fill=value)) + scale_fill_gradient2()
# Dendrogram 1
p2 <- ggplot(segment(ddata_x)) +
geom_segment(aes(x=x, y=y, xend=xend, yend=yend)) +
theme_none + theme(axis.title.x=element_blank())
# Dendrogram 2
p3 <- ggplot(segment(ddata_y)) +
geom_segment(aes(x=x, y=y, xend=xend, yend=yend)) +
coord_flip() + theme_none
Use grid graphics and some manual alignment to position the three plots on the page
### Draw graphic ###
grid.newpage()
print(p1, vp=viewport(0.8, 0.8, x=0.4, y=0.4))
print(p2, vp=viewport(0.52, 0.2, x=0.45, y=0.9))
print(p3, vp=viewport(0.2, 0.8, x=0.9, y=0.4))
As Ben says, everything is possible. Some work to support dendrograms has been done. Andrie de Vries has made a fortify method for tree objects. However, the resulting graphic is not pretty, as you can see.
The tile would be easy to do. For the dendrogram I would inspect plot.dendrogram (using getAnywhere) to see how the coordinates for the segments are calculated. Extract those coordinates and use geom_segment to plot the dendrogram. Then use viewports to plot the tiles and the dendrogram together. Sorry I can't give an example; it's a lot of work and it's too late.
I hope this helps
Cheers
Doubtful. I do not see any functions in the Index for ggplot2 that would suggest support for dendrograms, and when this blogger put together a set of translations of the illustrations in Sarkar's Lattice book, he was unable to get a ggplot dendrogram legend:
http://learnr.wordpress.com/2009/08/10/ggplot2-version-of-figures-in-lattice-multivariate-data-visualization-with-r-part-9/
These links provide a solution for heatmaps with dendrograms in ggplot2:
https://gist.github.com/chr1swallace/4672065
https://github.com/chr1swallace/random-functions/blob/master/R/ggplot-heatmap.R
and also this one:
Align ggplot2 plots vertically
I looked for answers in other Qs, couldn't find this Q (or Answer).
I am using ggplot2 to generate the two plots individually,
then the plot_grid function from the cowplot package to combine them.
The two datasets cover exactly the same dates.
Since the x-axis is the same time range, I want the grey panels of the two graphs to start at the same horizontal position,
so that they are time-aligned. At present, because the y-axis labels differ in width, the panels do not start on the same vertical line. Here is a pictorial description:
This could be achieved via the patchwork package:
library(ggplot2)
library(patchwork)
p1 <- ggplot(mtcars, aes(hp, mpg)) +
geom_point()
p2 <- ggplot(mtcars, aes(hp, mpg * 1000)) +
geom_point()
p1 / p2
If you want a solution that only uses plot_grid, you could do the following (admittedly hackier than the patchwork package):
library(ggplot2)
library(cowplot)
# placeholders: substitute your own two plots here
myPlot1 <- ggplot()
myPlot2 <- ggplot()
#get a ggplot that is the axis only
myYAxis1 <- get_y_axis(myPlot1)
myYAxis2 <- get_y_axis(myPlot2)
#remove all y axis stuff from the plots themselves
myPlot1 <- myPlot1 + theme(axis.text.y = element_blank(), axis.title.y = element_blank(), axis.ticks.y = element_blank())
myPlot2 <- myPlot2 + theme(axis.text.y = element_blank(), axis.title.y = element_blank(), axis.ticks.y = element_blank())
#reassemble plots
ratioAxisToPlot = .1 # width given to the axis relative to the plot (passed to rel_widths)
plot1Reassembled <- plot_grid(myYAxis1, myPlot1, rel_widths = c(ratioAxisToPlot, 1), ncol=2)
plot2Reassembled <- plot_grid(myYAxis2, myPlot2, rel_widths = c(ratioAxisToPlot, 1), ncol=2)
#put it all together
finalPlot <- plot_grid(plot1Reassembled, plot2Reassembled, nrow=2)
I have 8 plots generated with the following code, and I want to put them on the same page with no titles. More specifically, I want each plot to have a letter (a, b, ...) in its upper-left corner, and below the plots something like a one-row legend (e.g. Plots: a. category one, b. category two, ...).
Code:
g1= ggplot(som, aes(x=value, y=variable))+geom_smooth(method=lm,alpha=0.25,col='green',lwd=0.1) +ylim(0,1000)+xlim(-2,2)+
geom_point(shape=23,fill="black",size=0.2)+theme_bw()+theme(plot.background = element_blank(),panel.grid.major = element_blank()
,panel.grid.minor = element_blank()) +labs(x="something here",y="something else")+
theme(axis.title.x = element_text(face="bold", size=7),axis.text.x = element_text(size=5))+
theme(axis.title.y = element_text(face="bold", size=7),axis.text.y = element_text(size=5))+
theme(plot.title = element_text(lineheight=.8, face="bold",size=8))
grid.arrange(g1,g2,g3,g4,g5,g6,g7,g8,ncol=2)
Is it possible to do that with ggplot? If so, how can I do this?
p.s I have no problem with the above code
Thank you.
This is how you could do it with library(cowplot).
First some plots:
set.seed(1)
plots <- list()
for (i in 1:8) {
my_cars <- mtcars[sample(1:nrow(mtcars), 10), ]
plots[[i]] <- ggplot(my_cars, aes(mpg, hp, color = as.factor(cyl))) +
geom_point() +
geom_smooth(method = "lm", color = "black")
}
Then to have a unifying title (or legend here) we use a combination of two plot_grid() calls.
lbls <- LETTERS[1:length(plots)]
# add a line break because it's long
lbls <- gsub("E", "\nE", lbls)
grid <- plot_grid(plotlist = plots, labels = lbls, ncol = 2)
legend <- ggdraw() +
draw_label(paste0(lbls, "= category",1:length(plots), collapse = " "))
plot_grid(grid, legend, rel_heights = c(1, .1), ncol = 1)
The documentation for cowplot is great and has a ton of examples. Check it out here and here. Let me know if you get stuck.
I am trying to write a script that produces four different plots in a single image. Specifically, I want to recreate this graphic as closely as possible:
My current script produces four plots similar to these but I cannot figure out how to allocate screen real-estate accordingly. I want to:
modify the height and width of the plots so that all four have uniform width, one is substantially taller than the others which have uniform height among them
define the position of the legends by coordinates so that I can use screen space effectively
modify the overall shape of my image explicitly as needed (maybe I will need it closer to square-shaped at some point)
# GENERATE SOME DATA TO PLOT
pt_id = c(1:279) # DEFINE PATIENT IDs
smoke = rbinom(279,1,0.5) # DEFINE SMOKING STATUS
hpv = rbinom(279,1,0.3) # DEFINE HPV STATUS
data = data.frame(pt_id, smoke, hpv) # PRODUCE DATA FRAME
# ADD ANATOMICAL SITE DATA
data$site = sample(1:4, 279, replace = T)
data$site[data$site == 1] = "Hypopharynx"
data$site[data$site == 2] = "Larynx"
data$site[data$site == 3] = "Oral Cavity"
data$site[data$site == 4] = "Oropharynx"
data$site_known = 1 # HACK TO FACILITATE PRODUCING BARPLOTS
# ADD MUTATION FREQUENCY DATA
data$freq = sample(1:1000, 279, replace = F)
# DEFINE BARPLOT
require(ggplot2)
require(gridExtra)
bar = ggplot(data, aes(x = pt_id, y = freq)) + geom_bar(stat = "identity") + theme(axis.title.x = element_blank(), axis.ticks.x = element_blank(), axis.text.x = element_blank()) + ylab("Number of Mutations")
# DEFINE BINARY PLOTS
smoke_status = ggplot(data, aes(x=pt_id, y=smoke, fill = "red")) + geom_bar(stat="identity") + theme(legend.position = "none", axis.title.x = element_blank(), axis.ticks.x = element_blank(), axis.text.x = element_blank()) + ylab("Smoking Status")
hpv_status = ggplot(data, aes(x=pt_id, y = hpv, fill = "red")) + geom_bar(stat="identity") + theme(legend.position = "none", axis.title.x = element_blank(), axis.ticks.x = element_blank(), axis.text.x = element_blank()) + ylab("HPV Status")
site_status = ggplot(data, aes(x=pt_id, y=site_known, fill = site)) + geom_bar(stat="identity")
# PRODUCE FOUR GRAPHS TOGETHER
grid.arrange(bar, smoke_status, hpv_status, site_status, nrow = 4)
I suspect that the functions needed to accomplish these tasks are already included in ggplot2 and gridExtra but I have not been able to figure out how. Also, if any of my code is excessively verbose or there is a simpler, more-elegant way to do what I have already done - please feel free to comment on that as well.
Here are the steps to get the layout you describe:
1) Extract the legend as a separate grob ("graphical object"). We can then lay out the legend separately from the plots.
2) Left-align the edges of the four plots so that the left edges and the x-scales line up properly. The code to do that comes from this SO answer. That answer has a function to align an arbitrary number of plots, but I wasn't able to get that to work when I also wanted to change the proportional space allotted to each plot, so I ended up doing it the "long way" by adjusting each plot separately.
3) Lay out the plots and the legend using grid.arrange and arrangeGrob. The heights argument allocates different proportions of the total vertical space to each plot. We also use the widths argument to allocate horizontal space to the plots in one wide column and the legend in another narrow column.
4) Plot to a device in whatever size you desire. This is how you get a particular shape or aspect ratio.
library(gridExtra)
library(grid)
# Function to extract the legend from a ggplot graph as a separate grob
# Source: https://stackoverflow.com/a/12539820/496488
get_leg = function(a.gplot){
tmp <- ggplot_gtable(ggplot_build(a.gplot))
leg <- which(sapply(tmp$grobs, function(x) x$name) == "guide-box")
legend <- tmp$grobs[[leg]]
legend
}
# Get legend as a separate grob
leg = get_leg(site_status)
# Add a theme element to change the plot margins to remove white space between the plots
thm = theme(plot.margin=unit(c(0,0,-0.5,0),"lines"))
# Left-align the four plots
# Adapted from: https://stackoverflow.com/a/13295880/496488
gA <- ggplotGrob(bar + thm)
gB <- ggplotGrob(smoke_status + thm)
gC <- ggplotGrob(hpv_status + thm)
gD <- ggplotGrob(site_status + theme(plot.margin=unit(c(0,0,0,0), "lines")) +
guides(fill="none")) # fill=FALSE is deprecated in current ggplot2
maxWidth = grid::unit.pmax(gA$widths[2:5], gB$widths[2:5], gC$widths[2:5], gD$widths[2:5])
gA$widths[2:5] <- as.list(maxWidth)
gB$widths[2:5] <- as.list(maxWidth)
gC$widths[2:5] <- as.list(maxWidth)
gD$widths[2:5] <- as.list(maxWidth)
# Lay out plots and legend
p = grid.arrange(arrangeGrob(gA,gB,gC,gD, heights=c(0.5,0.15,0.15,0.21)),
leg, ncol=2, widths=c(0.8,0.2))
You can then determine the shape or aspect ratio of the final plot by setting the parameters of the output device. (You may have to adjust font sizes when you create the underlying plots in order to get the final layout to look the way you want it.) The plot pasted in below is a png saved directly from the RStudio graph window. Here's how you would save the plot as a PDF file; many other devices (e.g., png, jpeg) are available for saving in other formats:
pdf("myPlot.pdf", width=10, height=5)
grid.draw(p) # p is a gtable; draw it explicitly on the open device
dev.off()
You also asked about more efficient code. One thing you can do is create a list of plot elements that you use multiple times and then just add the name of the list object to each plot. For example:
my_gg = list(geom_bar(stat="identity", fill="red"),
theme(legend.position = "none",
axis.title.x = element_blank(),
axis.ticks.x = element_blank(),
axis.text.x = element_blank(),
plot.margin = unit(c(0,0,-0.5,0), "lines")))
smoke_status = ggplot(data, aes(x=pt_id, y=smoke)) +
labs(y="Smoking Status") +
my_gg
I have a list of numbers (n = 9) and would like to draw them in a 3×3 square grid, with each cell filled with the corresponding number. How can I do this in R without installing an additional package such as plotrix? Many thanks!
Here is a ggplot solution that was a little harder than I expected:
# Setup the data
m <- matrix(c(8,3,4,1,5,9,6,7,2), nrow=3, ncol=3)
df <- expand.grid(x=1:ncol(m),y=1:nrow(m))
df$val <- m[as.matrix(df[c('y','x')])]
library(ggplot2)
library(scales)
ggplot(df, aes(x=x, y=y, label=val)) +
geom_tile(fill='transparent', colour = 'black') +
geom_text(size = 14) +
scale_y_reverse() +
theme_classic() +
theme(axis.text = element_blank(),
panel.grid = element_blank(),
axis.line = element_blank(),
axis.ticks = element_blank(),
axis.title = element_blank())
Here is a good solution using just base R, and outputting to a png. Note the default png device has equal width and height.
png("magic_square.png")
par(mar=c(.5,.5,.5,.5))
plot(x=df$x,y=df$y,pch=as.character(df$val),
asp=1, xlim=c(0.5,3.5),ylim=c(0.5,3.5),xaxt="n",yaxt="n",xlab="",ylab="",
xaxs="i", yaxs="i", axes=F)
abline(v=0.5+(0:3),h=0.5+(0:3))
dev.off()
You can use cex in the plot call to make the numbers appear larger.
And you can add circles as follows. Note the abline locations.
symbols(1.5,1.5,circles=1,add=TRUE)
And to annotate as shown in the comment, set the background of the circle and use points to draw additional text annotations.
symbols(1.5,1.5,circles=1,bg="white",add=TRUE)
text(x=1.5,y=1.5,labels="17",cex=3)
Of course the real key to doing this well will be mastering the data structures to make calls into plot, symbols, and text efficient.
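As a rough sketch of that idea (not part of the original answer), the cell coordinates and values can live in one data frame so that a single vectorized text() call labels every cell. The helper name draw_square and the default cex value are illustrative assumptions; the helper draws on whatever graphics device is currently open.

```r
# Same magic-square data as in the answers above
m <- matrix(c(8, 3, 4, 1, 5, 9, 6, 7, 2), nrow = 3, ncol = 3)
cells <- expand.grid(x = 1:ncol(m), y = 1:nrow(m))
cells$val <- m[as.matrix(cells[c("y", "x")])]

# Hypothetical helper: draws the grid lines and all labels on the current device
draw_square <- function(cells, cex = 3) {
  old <- par(mar = rep(0.5, 4))
  on.exit(par(old))
  nx <- max(cells$x)
  ny <- max(cells$y)
  plot.new()
  plot.window(xlim = c(0.5, nx + 0.5), ylim = c(0.5, ny + 0.5), asp = 1,
              xaxs = "i", yaxs = "i")
  abline(v = 0.5 + 0:nx, h = 0.5 + 0:ny)
  # one vectorized call labels every cell
  text(cells$x, cells$y, labels = cells$val, cex = cex)
  invisible(cells)
}
```

Calling draw_square(cells) between png("magic_square.png") and dev.off() should produce the same kind of image as the code above.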
Here's one using plotrix (sorry, but it's much easier if you use a package!) and @nograpes's df data.
library(plotrix)
xt <- xtabs(val ~ ., df[c(2,1,3)])
color2D.matplot(xt, vcex = 3, show.values = 1, axes = FALSE, xlab = "",
ylab = "", cellcolors = rep("white", length(xt)))
In case other answers ever change, df was constructed with
m <- matrix(c(8,3,4,1,5,9,6,7,2), nrow = 3, ncol = 3)
df <- expand.grid(x = 1:ncol(m),y = 1:nrow(m))
df$val <- m[as.matrix(df[c('y', 'x')])]
Context
I have some datasets/variables and I want to plot them, but I want to do this in a compact way. To do this I want them to share the same y-axis but have distinct x-axes and, because of the different distributions, I want one of the x-axes to be log-scaled and the other linear.
Example
Suppose I have a long tailed variable (that I want the x-axis to be log-scaled when plotted):
library(PtProcess)
library(ggplot2)
set.seed(1)
lambda <- 1.5
a <- 1
pareto <- rpareto(1000,lambda=lambda,a=a)
x_pareto <- seq(from=min(pareto),to=max(pareto),length=1000)
y_pareto <- 1-ppareto(x_pareto,lambda,a)
df1 <- data.frame(x=x_pareto,cdf=y_pareto)
ggplot(df1,aes(x=x,y=cdf)) + geom_line() + scale_x_log10()
And a normal variable:
set.seed(1)
mean <- 3
norm <- rnorm(1000,mean=mean)
x_norm <- seq(from=min(norm),to=max(norm),length=1000)
y_norm <- pnorm(x_norm,mean=mean)
df2 <- data.frame(x=x_norm,cdf=y_norm)
ggplot(df2,aes(x=x,y=cdf)) + geom_line()
I want to plot them side by side using the same y-axis.
Attempt #1
I can do this with facets, which looks great, but I don't know how to make each x-axis with a different scale (scale_x_log10() makes both of them log scaled):
df1 <- cbind(df1,"pareto")
colnames(df1)[3] <- 'var'
df2 <- cbind(df2,"norm")
colnames(df2)[3] <- 'var'
df <- rbind(df1,df2)
ggplot(df,aes(x=x,y=cdf)) + geom_line() +
facet_wrap(~var,scales="free_x") + scale_x_log10()
Attempt #2
Use grid.arrange, but I don't know how to keep both plot areas with the same aspect ratio:
library(gridExtra)
p1 <- ggplot(df1,aes(x=x,y=cdf)) + geom_line() + scale_x_log10() +
theme(plot.margin = unit(c(0,0,0,0), "lines"),
plot.background = element_blank()) +
ggtitle("pareto")
p2 <- ggplot(df2,aes(x=x,y=cdf)) + geom_line() +
theme(axis.text.y = element_blank(),
axis.ticks.y = element_blank(),
axis.title.y = element_blank(),
plot.margin = unit(c(0,0,0,0), "lines"),
plot.background = element_blank()) +
ggtitle("norm")
grid.arrange(p1,p2,ncol=2)
PS: The number of plots may vary so I'm not looking for an answer specifically for 2 plots
Extending your attempt #2, gtable might be able to help you out. If the margins are the same in the two charts, then the only widths that change in the two plots (I think) are the spaces taken by the y-axis tick mark labels and axis text, which in turn changes the widths of the panels. Using code from here, the spaces taken by the axis text should be the same, thus the widths of the two panel areas should be the same, and thus the aspect ratios should be the same. However, the result (no margin to the right) does not look pretty. So I've added a little margin to the right of p2, then taken away the same amount to the left of p2. Similarly for p1: I've added a little to the left but taken away the same amount to the right.
library(PtProcess)
library(ggplot2)
library(gtable)
library(grid)
library(gridExtra)
set.seed(1)
lambda <- 1.5
a <- 1
pareto <- rpareto(1000,lambda=lambda,a=a)
x_pareto <- seq(from=min(pareto),to=max(pareto),length=1000)
y_pareto <- 1-ppareto(x_pareto,lambda,a)
df1 <- data.frame(x=x_pareto,cdf=y_pareto)
set.seed(1)
mean <- 3
norm <- rnorm(1000,mean=mean)
x_norm <- seq(from=min(norm),to=max(norm),length=1000)
y_norm <- pnorm(x_norm,mean=mean)
df2 <- data.frame(x=x_norm,cdf=y_norm)
p1 <- ggplot(df1,aes(x=x,y=cdf)) + geom_line() + scale_x_log10() +
theme(plot.margin = unit(c(0,-.5,0,.5), "lines"),
plot.background = element_blank()) +
ggtitle("pareto")
p2 <- ggplot(df2,aes(x=x,y=cdf)) + geom_line() +
theme(axis.text.y = element_blank(),
axis.ticks.y = element_blank(),
axis.title.y = element_blank(),
plot.margin = unit(c(0,1,0,-1), "lines"),
plot.background = element_blank()) +
ggtitle("norm")
gt1 <- ggplotGrob(p1)
gt2 <- ggplotGrob(p2)
newWidth = unit.pmax(gt1$widths[2:3], gt2$widths[2:3])
gt1$widths[2:3] = as.list(newWidth)
gt2$widths[2:3] = as.list(newWidth)
grid.arrange(gt1, gt2, ncol=2)
EDIT
To add a third plot to the right, we need to take more control over the plotting canvas. One solution is to create a new gtable that contains space for the three plots and an additional space for a right margin. Here, I let the margins in the plots take care of the spacing between the plots.
p1 <- ggplot(df1,aes(x=x,y=cdf)) + geom_line() + scale_x_log10() +
theme(plot.margin = unit(c(0,-2,0,0), "lines"),
plot.background = element_blank()) +
ggtitle("pareto")
p2 <- ggplot(df2,aes(x=x,y=cdf)) + geom_line() +
theme(axis.text.y = element_blank(),
axis.ticks.y = element_blank(),
axis.title.y = element_blank(),
plot.margin = unit(c(0,-2,0,0), "lines"),
plot.background = element_blank()) +
ggtitle("norm")
gt1 <- ggplotGrob(p1)
gt2 <- ggplotGrob(p2)
newWidth = unit.pmax(gt1$widths[2:3], gt2$widths[2:3])
gt1$widths[2:3] = as.list(newWidth)
gt2$widths[2:3] = as.list(newWidth)
# New gtable with space for the three plots plus a right-hand margin
gt = gtable(widths = unit(c(1, 1, 1, .3), "null"), heights = unit(1, "null"))
# Insert gt1, gt2 and gt2 (again) into the new gtable
gt <- gtable_add_grob(gt, gt1, 1, 1)
gt <- gtable_add_grob(gt, gt2, 1, 2)
gt <- gtable_add_grob(gt, gt2, 1, 3)
grid.newpage()
grid.draw(gt)
The accepted answer is exactly what makes people run when it comes to plotting with R! This is my solution:
library('grid')
g1 <- ggplot(...) # however you draw your 1st plot
g2 <- ggplot(...) # however you draw your 2nd plot
grid.newpage()
grid.draw(cbind(ggplotGrob(g1), ggplotGrob(g2), size = "last"))
This aligns the y-axis guide lines (minor and major) across multiple plots, effortlessly.
Dropping some axis text, unifying the legends, ..., are other tasks that can be taken care of while creating the individual plots, or by using other means provided by grid or gridExtra packages.
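For a version that runs as-is, here is a minimal sketch of the same cbind() approach. The two plots on the built-in mtcars data are stand-ins chosen for illustration (one log-scaled x-axis, one linear), not the asker's data.

```r
library(ggplot2)
library(grid)

# Illustrative plots on a built-in dataset (assumption: any two plots
# without titles or legends, so both gtables have the same row layout)
g1 <- ggplot(mtcars, aes(hp, mpg)) + geom_point() + scale_x_log10()
g2 <- ggplot(mtcars, aes(wt, mpg)) + geom_point()

gt1 <- ggplotGrob(g1)
gt2 <- ggplotGrob(g2)

# cbind() on gtables joins them side by side; size = "last" takes the
# row heights from the last table
combined <- cbind(gt1, gt2, size = "last")

grid.newpage()
grid.draw(combined)
```

Because the plots are joined into one gtable, the panel rows are shared, which is what keeps the y-axis guide lines level across the two panels.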
The accepted answer looks a little too daunting to me, so I found two ways to get around it with less effort. Both are based on your Attempt #2 grid.arrange() method.
1. Remove the y-axis from plot 1 as well
theme(axis.text.y = element_blank(),
axis.ticks.y = element_blank(),
axis.title.y = element_blank())
Then all the plots will be laid out the same way, and you won't have problems with different aspect ratios. You will need to generate a separate y-axis with R or your favorite image editing app.
2. Fix and respect the aspect ratio
Add aspect.ratio = 1 (or whatever ratio you desire) to theme() of the individual plots. Then use respect=TRUE in your grid.arrange() call.
This way you can keep the y-axis in plot 1 and still maintain the aspect ratio in all plots. Inspired by this answer.
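A minimal sketch of option 2, assuming two stand-in plots on the built-in mtcars data (not the asker's data) and an aspect ratio of 1 chosen only for illustration:

```r
library(ggplot2)
library(grid)
library(gridExtra)

# Shared theme element that fixes the panel aspect ratio (value is an assumption)
fixed <- theme(aspect.ratio = 1)

p1 <- ggplot(mtcars, aes(hp, mpg)) + geom_point() + scale_x_log10() + fixed
p2 <- ggplot(mtcars, aes(wt, mpg)) + geom_point() + fixed +
  theme(axis.text.y = element_blank(),
        axis.ticks.y = element_blank(),
        axis.title.y = element_blank())

# arrangeGrob() builds the layout without drawing; respect = TRUE is
# passed through to the underlying layout
g <- arrangeGrob(p1, p2, ncol = 2, respect = TRUE)

grid.newpage()
grid.draw(g)
```

grid.arrange(p1, p2, ncol = 2, respect = TRUE) builds and draws the same layout in one call. Building it with arrangeGrob() first just keeps the object around for saving to a device later.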
Hope you find these helpful!