I am using ggplot to plot some data across 5 facets and I want to put some text that says "Delta = #" where Delta is the upper case math delta symbol and # is 1,2,3,4, or 5 based on which facet it is. Here is what I have:
annotate("text",x="baseline",y=75,label=paste(expression(Delta),"=",1:5))
My line of code works but it spells out Delta rather than giving me the Delta symbol. How can I get the math symbol?
Try this:
library(ggplot2)
df <- mtcars[2:6,]
ggplot(df, aes(mpg, disp))+
geom_point()+
annotate("text",df$mpg,df$disp,label=paste(("Delta * '=' *"), 1:5),
parse=TRUE, hjust = 1.1)
annotate() will give you the same annotation on each facet; you should use geom_text() instead, with a suitable data.frame to provide the mapping.
library(ggplot2)
ggplot(data.frame(f=1:2, lab = sprintf("Delta == %i", 1:2))) + facet_wrap(~f) +
geom_text(aes(label=lab), x=0, y=0, parse=TRUE)
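A minimal sketch of why the original call spelled out the word: paste() turns expression(Delta) into the plain string "Delta", so nothing is left for plotmath to render unless the label is written in plotmath syntax and parsed. The data frame facet_labels and its column names are assumptions for illustration, not part of the question's data.

```r
# One plotmath label string per facet (5 facets, as in the question).
facet_labels <- data.frame(
  facet = 1:5,
  lab   = sprintf("Delta == %d", 1:5),
  stringsAsFactors = FALSE
)

# paste() converts the expression to text, which is why "Delta" was spelled out.
spelled_out <- paste(expression(Delta), "=", 1)

# For parse = TRUE to work, every label must be a valid R expression.
parsed <- parse(text = facet_labels$lab)

# Build (but do not print) the faceted plot if ggplot2 is available.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(facet_labels) +
    facet_wrap(~facet) +
    geom_text(aes(label = lab), x = 0, y = 0, parse = TRUE)
}
```

In a real plot the facet column would need to match the faceting variable of the main data so that each label lands in its own panel.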
This question is motivated by a previous post illustrating various ways to change how axis scales are plotted in a ggplot figure, from the default exponential notation to the full integer value (when one's axis values are very large). While I am able to convert the axis scales from exponential notation to full values, I am unclear how one would achieve the same goal for the values appearing in the legend.
While I understand that one can manually change the length of the legend scale with "scale_color..." or "scale_fill..." followed by the "limits" argument, this does not appear to be a solution to getting my legend values to show "6000000000" rather than "6e+09" (or "0" rather than "0e+00" for that matter).
The following example should suffice. My hope is that someone can point out how to use the 'scales' package to format legend scales rather than axis scales.
Thanks very much.
library(ggplot2)
library(scales)
Data <- data.frame(
pi = c(2,71,828,1828,45904,523536,2874713,52662497,757247093,6999595749),
e = c(3,14,159,2653,58979,311599,7963468,54418516,1590576171, 99),
face = 1:10)
p <- ggplot(data = Data, aes(x=face, y=e, colour = pi))
myplot <- p + geom_point() +
scale_y_continuous(labels = comma) +
scale_color_gradientn(colours = rainbow(2), limits=c(0,7000000000))
myplot
Use the comma formatter in scale_color_gradientn() by setting labels = comma, e.g.:
p <- ggplot(data = Data, aes(x=face, y=e, colour = pi))
myplot <- p + geom_point() +
scale_y_continuous(labels = comma) +
scale_color_gradientn(colours = rainbow(2), limits=c(0,7000000000), labels = comma)
myplot
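If you would rather get the exact "6000000000" form from the question (no thousands separators), the labels argument also accepts any function that maps break values to strings. The sketch below uses only base R; the helper names full_number and with_commas are hypothetical.

```r
# Plain fixed notation, no separators: 6e9 becomes "6000000000".
full_number <- function(x) format(x, scientific = FALSE, trim = TRUE)

# Fixed notation with thousands separators, similar in spirit to scales::comma.
with_commas <- function(x) format(x, big.mark = ",", scientific = FALSE, trim = TRUE)

legend_breaks <- c(0, 6e9)
full_number(legend_breaks)
with_commas(legend_breaks)

# Usage, assuming the plot objects from the question:
# p + geom_point() +
#   scale_color_gradientn(colours = rainbow(2), limits = c(0, 7e9),
#                         labels = full_number)
```

trim = TRUE keeps format() from padding the shorter labels with leading spaces.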
I have a horizontal line in a ggplot and I would like to label its value (7.1) on the y axis.
library(ggplot2)
df <- data.frame(y=c(1:10),x=c(1:10))
h <- 7.1
plot1 <- ggplot(df, aes(x=x,y=y)) + geom_point()
plot2 <- plot1+ geom_hline(aes(yintercept=h))
Thank you for your help.
It's not clear if you want 7.1 to be part of the y-axis, or if you just want a way to label the line. Assuming the former, you can use scale_y_continuous() to define your own breaks. Something like this may do what you want (will need some fiddling most likely):
plot1+ geom_hline(aes(yintercept=h)) +
scale_y_continuous(breaks = sort(c(seq(min(df$y), max(df$y), length.out=5), h)))
Assuming the latter, this is probably more what you want:
plot1 + geom_hline(aes(yintercept=h)) +
geom_text(aes(0,h,label = h, vjust = -1))
Similar to Chase's solution, but reusing the existing labels.
ggplot_build(plot1)$layout$panel_ranges[[1]]$y.major_source can be used to extract the existing breaks, to which the new one, h, is added.
plot1 + geom_hline(aes(yintercept=h)) +
scale_y_continuous(breaks = sort(c(ggplot_build(plot1)$layout$panel_ranges[[1]]$y.major_source, h)))
How about something like this?
plot1 + geom_hline(aes(yintercept=h), colour="#BB0000", linetype="dashed") +
geom_text(aes( 0, h, label = h, vjust = -1), size = 3)
This is a follow-up to Prradep's answer.
I think Prradep's answer works for an older version of ggplot2. I'm using ggplot2 version 3.1.0 and in order to extract the existing labels of plot1 in that version you have to use:
ggplot_build(plot1)$layout$panel_params[[1]]$y.major
This only works for LINEAR AXES! If you have a non-linear y-axis (for example logarithmic), then ggplot2 stores where tick marks would be if the axis were linear in $y.major. The actual tick mark labels are stored as a character vector in $y.labels. Therefore, for a non-linear y-axis you need to use:
as.numeric(ggplot_build(cl.plot.log)$layout$panel_params[[1]]$y.labels)
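Since the internal slot names differ between ggplot2 versions, one version-independent alternative is to compute the breaks yourself and add h to them. This is only a sketch: it uses base R's pretty(), which is an assumption on my part and will not necessarily reproduce the breaks ggplot2 would have picked.

```r
# Same toy data and line position as in the question.
df <- data.frame(x = 1:10, y = 1:10)
h <- 7.1

# Compute "nice" breaks over the data range, then add the line's value.
base_breaks   <- pretty(range(df$y))
breaks_with_h <- sort(unique(c(base_breaks, h)))

breaks_with_h

# Usage, assuming plot1 from the question:
# plot1 + geom_hline(yintercept = h) +
#   scale_y_continuous(breaks = breaks_with_h)
```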
I have a set of code that produces multiple plots using facet_wrap:
ggplot(summ,aes(x=depth,y=expr,colour=bank,group=bank)) +
geom_errorbar(aes(ymin=expr-se,ymax=expr+se),lwd=0.4,width=0.3,position=pd) +
geom_line(aes(group=bank,linetype=bank),position=pd) +
geom_point(aes(group=bank,pch=bank),position=pd,size=2.5) +
scale_colour_manual(values=c("coral","cyan3", "blue")) +
facet_wrap(~gene,scales="free_y") +
theme_bw()
With the reference datasets, this code produces figures like this:
I am trying to accomplish two goals here:
Keep the auto scaling of the y axis, but make sure only 1 decimal place is displayed across all the plots. I have tried creating a new column of the rounded expr values, but it causes the error bars to not line up properly.
I would like to wrap the titles. I have tried changing the font size as in Change plot title sizes in a facet_wrap multiplot, but some of the gene names are too long and will end up being too small to read if I cram them on a single line. Is there a way to wrap the text, using code within the facet_wrap statement?
This probably cannot serve as a definitive answer, but here are some pointers regarding your questions:
Formatting the y-axis scale labels.
First, let's try the direct solution using the format function. Here we format all y-axis scale labels to have 1 decimal place, after rounding them with round.
formatter <- function(...){
function(x) format(round(x, 1), ...)
}
mtcars2 <- mtcars
sp <- ggplot(mtcars2, aes(x = mpg, y = qsec)) + geom_point() + facet_wrap(~cyl, scales = "free_y")
sp <- sp + scale_y_continuous(labels = formatter(nsmall = 1))
The issue is, sometimes this approach is not practical. Take the leftmost plot from your figure, for example. Using the same formatting, all y-axis scale labels would be rounded up to -0.3, which is not preferable.
The other solution is to modify the breaks for each plot into a set of rounded values. But again, taking the leftmost plot of your figure as an example, it'll end up with just one label point, -0.3
Yet another solution is to format the labels in scientific notation. For simplicity, you can modify the formatter function as follows:
formatter <- function(...){
function(x) format(x, ..., scientific = TRUE, digits = 2)
}
Now you can have a uniform format for all of the plots' y-axes. My suggestion, though, is to set the labels to 2 decimal places after rounding.
Wrap facet titles
This can be done using labeller argument in facet_wrap.
# Modify cyl into factors
mtcars2$cyl <- c("Four Cylinder", "Six Cylinder", "Eight Cylinder")[match(mtcars2$cyl, c(4,6,8))]
# Redraw the graph
sp <- ggplot(mtcars2, aes(x = mpg, y = qsec)) + geom_point() +
facet_wrap(~cyl, scales = "free_y", labeller = labeller(cyl = label_wrap_gen(width = 10)))
sp <- sp + scale_y_continuous(labels = formatter(nsmall = 2))
Note that the wrap function only breaks labels at spaces. So, in your case, you might need to modify your variables so that long names contain spaces.
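If the gene names use underscores rather than spaces, one option is a small custom labeller that swaps the underscores for spaces before wrapping. This is a sketch under assumptions: wrap_label is a hypothetical helper, and the gene names below are made up for illustration.

```r
# Replace underscores with spaces, then wrap each label at roughly `width` characters.
wrap_label <- function(x, width = 10) {
  x <- gsub("_", " ", as.character(x), fixed = TRUE)
  vapply(
    x,
    function(s) paste(strwrap(s, width = width), collapse = "\n"),
    character(1),
    USE.NAMES = FALSE
  )
}

wrap_label(c("Eight Cylinder", "heat_shock_protein_70"))

# Build (but do not print) a faceted plot using the helper as a labeller.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  genes <- data.frame(
    gene  = c("heat_shock_protein_70", "actin"),  # made-up names
    depth = c(1, 2),
    expr  = c(0.5, 1.5)
  )
  p <- ggplot(genes, aes(depth, expr)) +
    geom_point() +
    facet_wrap(~gene, scales = "free_y", labeller = as_labeller(wrap_label))
}
```

This leaves the underlying data untouched; only the strip text is changed.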
This only solves the first part of the question. You can create a function to format your axis labels and use scale_y_continuous to apply it.
df <- data.frame(x=rnorm(11), y1=seq(2, 3, 0.1) + 10, y2=rnorm(11))
library(ggplot2)
library(reshape2)
df <- melt(df, 'x')
# Before
ggplot(df, aes(x=x, y=value)) + geom_point() +
facet_wrap(~ variable, scales="free")
# label function
f <- function(x){
format(round(x, 1), nsmall=1)
}
# After
ggplot(df, aes(x=x, y=value)) + geom_point() +
facet_wrap(~ variable, scales="free") +
scale_y_continuous(labels=f)
scale_*_continuous(..., labels = function(x) sprintf("%0.0f", x)) worked in my case.
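For the one-decimal case asked about in the question, the two approaches mentioned in these answers can be compared side by side. A small sketch, with hypothetical helper names; the break values are arbitrary.

```r
# sprintf() formats each value on its own.
one_decimal_sprintf <- function(x) sprintf("%.1f", x)

# format() formats the whole vector to a common width, so shorter
# labels get leading spaces unless trim = TRUE is added.
one_decimal_format <- function(x) format(round(x, 1), nsmall = 1)

axis_breaks <- c(2, 12.34)
one_decimal_sprintf(axis_breaks)
one_decimal_format(axis_breaks)

# Usage: scale_y_continuous(labels = one_decimal_sprintf)
```

The padding is usually invisible on a right-aligned y axis, but it can matter if the labels are reused elsewhere.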
I am plotting a forest plot in ggplot2 and am having issues with the ordering of the labels in the legend matching the order of the labels in the data set. Here is my code below.
data code
d<-data.frame(x=c("Co-K(W) N=720", "IH-K(W) N=67", "IF-K(W) N=198", "CO-K(B)N=78", "IH-K(B) N=13", "CO=A(W) N=874","D-Sco Ad(W) N=346","DR-Ad (W) N=892","CE_A(W) N=274","CO-Ad(B) N=66","D-So Ad(B) N=215","DR-Ad(B) N=123","CE-Ad(B) N=79"),
y = rnorm(13, 0, 0.1))
d <- transform(d, ylo = y-1/13, yhi=y+1/13)
d$x <- factor(d$x, levels=rev(d$x)) # reverse ordering
forest plot code
credplot.gg <- function(d){
# d is a data frame with 4 columns
# d$x gives variable names
# d$y gives center point
# d$ylo gives lower limits
# d$yhi gives upper limits
require(ggplot2)
p <- ggplot(d, aes(x=x, y=y, ymin=ylo, ymax=yhi, group=x, colour=x)) +
geom_pointrange(size=1) +
theme_bw() +
scale_color_discrete(name="Sample") +
coord_flip() +
theme(legend.key=element_rect(fill='cornsilk2')) +
guides(colour = guide_legend(override.aes = list(size=0.5))) +
geom_hline(yintercept=0, colour = 'red', lty=2) +
xlab('Cohort') + ylab('CI') + ggtitle('Forest Plot')
return(p)
}
credplot.gg(d)
This is what I get. As you can see, the labels on the y axis match the order of the data. However, the legend is not in the same order. I'm not sure how to correct this. This is my first time creating a plot in ggplot2. Any feedback is much appreciated. Thanks in advance.
Nice plot, especially for a first ggplot! I've not tested it, but I think all you need is to add reverse=TRUE inside your colour's guide_legend() (found this in the Cookbook for R).
If I were to make one more comment, I'd say that ordering your vertical factor by numeric value often makes comparisons easier when alphabetical order isn't particularly meaningful. (Though maybe your alpha order is meaningful.)
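Both suggestions can be sketched together: reorder the factor levels by the numeric value, and reverse the colour legend so it reads in the same direction as the flipped axis. The three cohort names and values below are made up for illustration.

```r
# Toy data standing in for the question's data frame d.
d <- data.frame(
  x = c("Cohort A", "Cohort B", "Cohort C"),
  y = c(0.2, -0.1, 0.05),
  stringsAsFactors = FALSE
)

# Order the factor levels by the point estimate instead of by data order.
d$x <- factor(d$x, levels = d$x[order(d$y)])
levels(d$x)

# Build (but do not print) the plot with a reversed colour legend.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(d, aes(x = x, y = y, colour = x)) +
    geom_point() +
    coord_flip() +
    guides(colour = guide_legend(reverse = TRUE))
}
```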
I have data for 4 sectors (A,B,C,D) and 5 years. I would like to draw 4 lines, 1 for each sector, adding a point for every year and add a fifth line representing the mean line using the stat_summary statement and controlling the line colors by means of scale_color_manual and point shapes in aes() argument. The problem is that if I add the point geom the legend is split in two parts one for point shapes and one for line colors. I didn't understand how to obtain 1 legend combining colors and points.
Here is an example. First of all let's build the data frame dtfr as follows:
a <- 100; b <- 100; c <- 100; d <- 100
for(k in 2:5){
a[k] <- a[k-1]*(1+rnorm(1)/100)
b[k] <- b[k-1]*(1+rnorm(1)/100)
c[k] <- c[k-1]*(1+rnorm(1)/100)
d[k] <- d[k-1]*(1+rnorm(1)/100)
}
v <- numeric()
for(k in 1:5){ v <- c(v,a[k],b[k],c[k],d[k]) }
dtfr <- data.frame(Year=rep(2008:2012,1, each=4),
Sector=rep(c("A","B","C","D"),5),
Value=v,
stringsAsFactors=F)
Now let us start to draw our graph with ggplot2. In the first graph we draw the line and point geoms without the mean line:
library(ggplot2)
ggplot(dtfr, aes(x=Year, y=Value)) +
geom_line(aes(group=Sector, color=Sector)) +
geom_point(aes(color=Sector, shape=Sector)) +
# stat_summary(aes(colour="mean",group=1), fun.y=mean, geom="line", size=1.1) +
scale_color_manual(values=c("#004E00", "#33FF00", "#FF9966", "#3399FF", "#FF004C")) +
ggtitle("Test for ggplot2 graph")
In this graph we have the legend with line colors and point shapes all in one:
But if I use the stat_summary to draw the mean line using the following code:
ggplot(dtfr, aes(x=Year, y=Value)) +
geom_line(aes(group=Sector, color=Sector)) +
geom_point(aes(color=Sector, shape=Sector)) +
stat_summary(aes(colour="mean",group=1), fun.y=mean, geom="line", size=1.1) +
scale_color_manual(values=c("#004E00", "#33FF00", "#FF9966", "#3399FF", "#FF004C")) +
ggtitle("Test for ggplot2 graph")
I get the mean (red) line but the legend is split into two parts, one for line colors and one for point shapes. At this point my question is: How can I get the mean line graph with the legend like the one in the first graph? That is, how can I get a single legend combining lines and shapes in the second graph, where the mean line is drawn?
Try this:
ggplot(dtfr, aes(x=Year, y=Value)) +
geom_line(aes(group=Sector, color=Sector)) +
geom_point(aes(color=Sector, shape=Sector)) +
stat_summary(aes(colour="mean",shape="mean",group=1), fun.y=mean, geom="line", size=1.1) +
scale_color_manual(values=c("#004E00", "#33FF00", "#FF9966", "#3399FF", "#FF004C")) +
scale_shape_manual(values=c(1:4, 32)) +
ggtitle("Test for ggplot2 graph")
Maybe someone more knowledgeable can come in and correct my explanation (or provide a better solution), but here's how I understand it: You have 5 values in the color scale, but you only have 4 in the shape scale; you're missing a value for "mean". So the scales aren't really compatible in a way. You can fix this by assigning a blank shape (32) to your mean line.
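One way to make that compatibility explicit is to give both manual scales named value vectors that share the same keys. This is only a sketch of the idea, on the assumption that the legends merge when both scales cover the same set of values; the vector names are hypothetical.

```r
# The five legend entries: four sectors plus the summary line.
legend_levels <- c("A", "B", "C", "D", "mean")

# Colours and shapes keyed by the same names; 32 is the blank shape used above.
line_colours <- setNames(
  c("#004E00", "#33FF00", "#FF9966", "#3399FF", "#FF004C"),
  legend_levels
)
point_shapes <- setNames(c(1:4, 32), legend_levels)

# Both scales should describe exactly the same keys.
same_keys <- identical(names(line_colours), names(point_shapes))

# Create the scale objects if ggplot2 is available; add them to the plot with `+`.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  colour_scale <- scale_color_manual(values = line_colours)
  shape_scale  <- scale_shape_manual(values = point_shapes)
}
```

Naming the values also protects against the colours and shapes silently shifting if the level order changes.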
Here is a different approach that calculates the summary/mean beforehand and adds it as an additional level to the data frame before building the plot.
The approach can be used to easily add an additional line but with a specific color, which may be desired for a summary/mean for example.
First, I calculate the mean and add it to the dtfr of the OP.
library(dplyr)  # provides %>%
dtfr2 <- dtfr %>%
dplyr::group_by(Year) %>%
dplyr::summarise(Value = mean(Value)) %>%
dplyr::mutate(Sector = NA_character_) %>%
dplyr::bind_rows(dtfr)
dtfr2 now has additional rows with the mean values stored in Value and NAs in Sector.
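If you prefer not to depend on dplyr, roughly the same data frame can be built with base R's aggregate(). This is a sketch with two deliberate differences, both my own choices: the data are deterministic stand-ins for the OP's random dtfr, and the summary rows are labelled "mean" instead of NA.

```r
# Deterministic stand-in for the OP's dtfr (values are made up).
dtfr <- data.frame(
  Year   = rep(2008:2012, each = 4),
  Sector = rep(c("A", "B", "C", "D"), 5),
  Value  = 100 + seq_len(20),
  stringsAsFactors = FALSE
)

# One mean per year, labelled as its own "sector".
means <- aggregate(Value ~ Year, data = dtfr, FUN = mean)
means$Sector <- "mean"

# Stack the summary rows under the original rows, matching column order.
dtfr_base <- rbind(dtfr, means[, names(dtfr)])

tail(dtfr_base, 5)
```

With an explicit "mean" level, the legend gets a readable entry without relabelling NA afterwards.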
Then, building the plot is easy:
p1 <- ggplot(dtfr2, aes(x=Year, y=Value, color = Sector, shape = Sector)) +
geom_line() +
geom_point()
Finally, you may tweak the legend a little:
p1 +
scale_color_discrete(labels = c(letters[1:4], "M"), na.value = "black") +
scale_shape_discrete(labels = c(letters[1:4], "M"))