I am making a bar chart with long axis labels, which I need to wrap and right-align. The only complication is that I need to add an expression to get superscripts.
library(ggplot2)
library(scales)
df <- data.frame("levs" = c("a long label i want to wrap",
                            "another also long label"),
                 "vals" = c(1,2))
p <- ggplot(df, aes(x = levs, y = vals)) +
  geom_bar(stat = "identity") +
  coord_flip() +
  scale_x_discrete(labels = wrap_format(20))
This produces the desired result, with properly wrapped text and all labels fully right-aligned.
However, when I then try to add a superscript using the code below, the axis text alignment changes:
p <- ggplot(df, aes(x = levs, y = vals)) +
  geom_bar(stat = "identity") +
  coord_flip() +
  scale_x_discrete(labels = c(expression("exponent"^1),
                              wrap_format(20)("another also long label")))
(NB: I cannot use Unicode, as is recommended to others with the same question, because it does not work with the font I am required to use.)
How can I get the axis text to be aligned right even when one of the axis labels includes an expression?
It's a strange thing, but if a vector (e.g. a character vector of labels) includes an object created by expression(), the whole vector appears to be treated as an expression:
# create a simple vector with one expression & one character string
label.vector <- c(expression("exponent"^1),
wrap_format(20)("another also long label"))
> sapply(label.vector, class) # the items have different classes when considered separately
[1] "call" "character"
> class(label.vector) # but together, it's considered an expression
[1] "expression"
... and expressions are always left-aligned. This isn't a ggplot-specific phenomenon; we can observe it in the base plotting functions as well:
# even with default hjust = 0.5 / vjust = 0.5 (i.e. central alignment), an expression is
# anchored based on the midpoint of its last line, & left-aligned within its text block
ggplot() +
annotate("point", x = 1:2, y = 1) +
annotate("text", x = 1, y = 1,
label = expression("long string\nwith single line break"))+
annotate("text", x = 2, y = 1,
label = expression("long string\nwith multiple line\nbreaks here")) +
xlim(c(0.5, 2.5))
# same phenomenon observed in base plot
par(mfrow = c(1, 3))
plot(0, xlab=expression("short string"))
plot(0, xlab=expression("long string\nwith single line break"))
plot(0, xlab=expression("long string\nwith multiple line\nbreaks here"))
Workaround
If we can force each label to be considered on its own, without the effect of other labels in the label vector, the non-expression labels can be aligned like normal character strings. One way to do this is to convert the ggplot object into a grob and replace the single textGrob for the y-axis labels with multiple text grobs, one for each label.
Prep work:
# generate plot (leave the labels as default)
p <- ggplot(df, aes(x = levs, y = vals)) +
geom_bar(stat = "identity") +
coord_flip()
p
# define a list (don't use `c(...)` here) of desired y-axis labels, starting with the
# bottom-most label in your plot & work up from there
desired.labels <- list(expression("exponent"^1),
wrap_format(20)("another also long label"))
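The reason for using `list()` rather than `c()` can be checked in base R alone. A minimal sketch (the names `as_vector` and `as_list` are only for illustration):

```r
# c() coerces everything into one expression vector ...
as_vector <- c(expression("exponent"^1), "plain label")
class(as_vector)     # "expression"

# ... while list() lets each element keep its own type
as_list <- list(expression("exponent"^1), "plain label")
class(as_list[[1]])  # "expression"
class(as_list[[2]])  # "character"
```

Because the character label stays a plain string inside the list, a text grob built from it alone is justified like any other string.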
Grob hacking:
library(grid)
library(magrittr)
# convert to grob object
gp <- ggplotGrob(p)
# locate label grob in the left side y-axis
old.label <- gp$grobs[[grep("axis-l", gp$layout$name)]]$children[["axis"]]$grobs[[1]]$children[[1]]
# define each label as its own text grob, replacing the values with those from
# our list of desired y-axis labels
new.label <- lapply(seq_along(old.label$label),
                    function(i) textGrob(label = desired.labels[[i]],
                                         x = old.label$x[i], y = old.label$y[i],
                                         just = old.label$just, hjust = old.label$hjust,
                                         vjust = old.label$vjust, rot = old.label$rot,
                                         check.overlap = old.label$check.overlap,
                                         gp = old.label$gp))
# remove the old label
gp$grobs[[grep("axis-l", gp$layout$name)]]$children[["axis"]]$grobs[[1]] %<>%
  removeGrob(.$children[[1]]$name)
# add new labels
for(i in seq_along(new.label)) {
  gp$grobs[[grep("axis-l", gp$layout$name)]]$children[["axis"]]$grobs[[1]] %<>%
    addGrob(new.label[[i]])
}
# check result
grid.draw(gp)
Related
Context
I want to show a specific value on the x-axis in ggplot2. It is 2.84 in the reproducible code below.
I found the answer at How can I add specific value to x-axis in ggplot2?
It is very close to what I need.
Question
Is there a way to show a specific value on the x-axis without having to set breaks and labels in scale_x_continuous?
Because I need to draw a large number of similar images, setting the breaks and labels for each image will be very tedious.
Reproducible code
# make up some data
d <- data.frame(x = 6*runif(10) + 1,
y = runif(10))
# generate break positions
breaks = c(seq(1, 7, by=0.5), 2.84)
# and labels
labels = as.character(breaks)
# plot
ggplot(d, aes(x, y)) + geom_point() + theme_minimal() +
scale_x_continuous(limits = c(1, 7), breaks = breaks, labels = labels,
name = "Number of treatments")
You can automate this process by creating a wrapper round scale_x_continuous that inserts your break into a vector of pretty breaks:
scale_x_fancy <- function(xval, ...) {
  scale_x_continuous(breaks = ~ sort(c(pretty(.x, 10), xval)), ...)
}
So now you just add the x value(s) where you want the extra break to appear:
ggplot(d, aes(x, y)) +
geom_point() +
theme_minimal() +
scale_x_fancy(xval = 2.84, name = "Number of treatments")
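To see what the breaks function inside the wrapper returns, it can be run by hand on a pair of axis limits. This is a minimal sketch in base R; `fancy_breaks` is a hypothetical stand-in for the anonymous function above, and the limits `c(1, 7)` are assumed for illustration:

```r
# hypothetical helper mirroring the formula passed to `breaks`:
# take the axis limits, compute pretty breaks, and merge in the extra value(s)
fancy_breaks <- function(limits, xval, n = 10) {
  sort(c(pretty(limits, n), xval))
}

b <- fancy_breaks(c(1, 7), xval = 2.84)
b  # the regular pretty breaks with 2.84 slotted in at its sorted position
```

Since the function receives the limits of whatever data is plotted, the same call works for every plot without hand-written breaks or labels.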
I have a set of code that produces multiple plots using facet_wrap:
ggplot(summ,aes(x=depth,y=expr,colour=bank,group=bank)) +
geom_errorbar(aes(ymin=expr-se,ymax=expr+se),lwd=0.4,width=0.3,position=pd) +
geom_line(aes(group=bank,linetype=bank),position=pd) +
geom_point(aes(group=bank,pch=bank),position=pd,size=2.5) +
scale_colour_manual(values=c("coral","cyan3", "blue")) +
facet_wrap(~gene,scales="free_y") +
theme_bw()
With the reference datasets, this code produces figures like this:
I am trying to accomplish two goals here:
Keep the auto scaling of the y axis, but make sure only 1 decimal place is displayed across all the plots. I have tried creating a new column of the rounded expr values, but it causes the error bars to not line up properly.
I would like to wrap the titles. I have tried changing the font size as in Change plot title sizes in a facet_wrap multiplot, but some of the gene names are too long and will end up being too small to read if I cram them on a single line. Is there a way to wrap the text, using code within the facet_wrap statement?
This probably cannot serve as a definitive answer, but here are some pointers regarding your questions:
Formatting the y-axis scale labels.
First, let's try the direct solution using the format function. Here we round each y-axis label with round and then format it to show exactly 1 decimal place.
formatter <- function(...){
  function(x) format(round(x, 1), ...)
}
mtcars2 <- mtcars
sp <- ggplot(mtcars2, aes(x = mpg, y = qsec)) + geom_point() + facet_wrap(~cyl, scales = "free_y")
sp <- sp + scale_y_continuous(labels = formatter(nsmall = 1))
The issue is that this approach is sometimes not practical. Take the leftmost plot from your figure, for example. Using the same formatting, all of its y-axis scale labels would be rounded to -0.3, which is not preferable.
The other solution is to modify the breaks for each plot into a set of rounded values. But again, taking the leftmost plot of your figure as an example, it will end up with just one label point, -0.3.
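The collapse of labels described above is easy to reproduce outside ggplot2. A small sketch, assuming made-up tick values for a panel with a very narrow y range:

```r
# hypothetical tick positions in a panel whose y range is only 0.03 wide
ticks <- c(-0.32, -0.31, -0.30, -0.29)

one_decimal  <- format(round(ticks, 1), nsmall = 1)
two_decimals <- format(round(ticks, 2), nsmall = 2)

unique(one_decimal)   # every tick becomes "-0.3"
unique(two_decimals)  # four distinct labels remain
```

With one decimal place the four ticks are indistinguishable, while two decimal places keep them apart.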
Yet another solution is to format the labels in scientific notation. For simplicity, you can modify the formatter function as follows:
formatter <- function(...){
  function(x) format(x, ..., scientific = TRUE, digits = 2)
}
Now you can have a uniform format for the y-axes of all plots. My suggestion, though, is to round the labels to 2 decimal places instead.
Wrap facet titles
This can be done using the labeller argument in facet_wrap.
# Modify cyl into factors
mtcars2$cyl <- c("Four Cylinder", "Six Cylinder", "Eight Cylinder")[match(mtcars2$cyl, c(4,6,8))]
# Redraw the graph
sp <- ggplot(mtcars2, aes(x = mpg, y = qsec)) + geom_point() +
facet_wrap(~cyl, scales = "free_y", labeller = labeller(cyl = label_wrap_gen(width = 10)))
sp <- sp + scale_y_continuous(labels = formatter(nsmall = 2))
Note that the wrap function only breaks labels at spaces. So, in your case, you might need to modify your variable names so that they contain spaces.
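The effect of spaces on wrapping can be illustrated with base R's `strwrap()`. This is only a stand-in: it is an assumption here that `label_wrap_gen()` wraps the same way, and the gene name below is made up.

```r
# strwrap() only breaks at whitespace
with_space    <- strwrap("Eight Cylinder", width = 10)
without_space <- strwrap("EightCylinderEngine", width = 10)
with_space     # two pieces: "Eight" and "Cylinder"
without_space  # one piece, because there is no space to break at

# hypothetical gene name: swap underscores for spaces before wrapping
gene    <- "very_long_gene_name"
wrapped <- paste(strwrap(gsub("_", " ", gene), width = 10), collapse = "\n")
cat(wrapped, "\n")
```

Applying a substitution like the `gsub()` call to the faceting variable before plotting gives the wrapper somewhere to break.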
This only solves the first part of the question. You can create a function to format your axis labels and pass it to scale_y_continuous.
df <- data.frame(x=rnorm(11), y1=seq(2, 3, 0.1) + 10, y2=rnorm(11))
library(ggplot2)
library(reshape2)
df <- melt(df, 'x')
# Before
ggplot(df, aes(x=x, y=value)) + geom_point() +
facet_wrap(~ variable, scale="free")
# label function
f <- function(x){
format(round(x, 1), nsmall=1)
}
# After
ggplot(df, aes(x=x, y=value)) + geom_point() +
facet_wrap(~ variable, scale="free") +
scale_y_continuous(labels=f)
scale_*_continuous(..., labels = function(x) sprintf("%0.0f", x)) worked in my case.
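For the one-decimal-place case asked about in the question, the same idea might look like the sketch below. `one_decimal` is a hypothetical helper name and the values are made up:

```r
# label function that always prints exactly one decimal place
one_decimal <- function(x) sprintf("%.1f", x)

one_decimal(c(10, 12.5, 15.26))  # "10.0" "12.5" "15.3"
```

It can then be supplied as `labels = one_decimal` in the scale call.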
I am trying to build a horizontal bar chart.
library(ggplot2)
library(plyr)
salary <- read.csv('September 15 2015 Salary Information - Alphabetical.csv', na.strings = '')
head(salary)
salary$X <- NULL
salary$X.1 <- NULL
salary$Club <- as.factor(salary$Club)
levels(salary$Club)
salary$Base.Salary <- gsub(',', '', salary$Base.Salary)
salary$Base.Salary <- as.numeric(as.character(salary$Base.Salary))
salary$Base.Salary <- salary$Base.Salary / 1000000
salary <- ddply(salary, .(Club), transform, pos = cumsum(Base.Salary) - (0.5 * Base.Salary))
ggplot(salary, aes(x = Club, y = Base.Salary, fill = Base.Salary)) +
geom_bar(stat = 'identity') +
ylab('Base Salary in millions of dollars') +
theme(axis.title.y = element_blank()) +
coord_flip() +
geom_text(data = subset(salary, Base.Salary > 2), aes(label = Last.Name, y = pos))
(credits to this thread: Showing data values on stacked bar chart in ggplot2 for the text position calculation)
and the resulting plot is this:
I was thoroughly confused for a while, because I was using xlab to specify the label, and theme(axis.title.y = element_blank()) to hide the y label. However, this didn't work, and I got it to work by changing it to ylab. This seems rather confusing, is it intended?
This seems rather confusing, is it intended?
Yes.
Rather than using theme() to hide the y label, I think
labs(x = "My x label",
y = "")
is more straightforward.
When you flip x and y, they take their labels with them. If this weren't the case, a graph compared with and without coordinate flip would have incorrect axis labels in one of the two cases - which seems confusing and inconsistent. As-is, the labels will be correct always (with and without coord_flip).
Theming, on the other hand, is applied after-the-fact.
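A minimal sketch of this behaviour, using made-up data and hypothetical column names (`club`, `salary`): the labels are attached to the x and y aesthetics, so the same `labs()` call is correct with and without the flip.

```r
library(ggplot2)

# made-up data for illustration only
df <- data.frame(club = c("A", "B"), salary = c(1.2, 3.4))

# labels are set per aesthetic: x is the club, y is the salary
base_plot <- ggplot(df, aes(x = club, y = salary)) +
  geom_col() +
  labs(x = NULL, y = "Base salary in millions of dollars")

# after flipping, the salary label is drawn along the horizontal axis,
# but it is still set through `y`
flipped <- base_plot + coord_flip()
```

Printing `base_plot` and `flipped` side by side shows the salary title following the salary values in both cases.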
Is there any way to line up the points of a line plot with the bars of a bar graph using ggplot when they have the same x-axis? Here is the sample data I'm trying to do it with.
library(ggplot2)
library(gridExtra)
data=data.frame(x=rep(1:27, each=5), y = rep(1:5, times = 27))
yes <- ggplot(data, aes(x = x, y = y))
yes <- yes + geom_point() + geom_line()
other_data = data.frame(x = 1:27, y = 50:76 )
no <- ggplot(other_data, aes(x=x, y=y))
no <- no + geom_bar(stat = "identity")
grid.arrange(no, yes)
Here is the output:
The first point of the line plot is to the left of the first bar, and the last point of the line plot is to the right of the last bar.
Thank you for your time.
Extending @Stibu's post a little: to align the plots, use gtable (or see the answers to your earlier question).
library(ggplot2)
library(gtable)
data=data.frame(x=rep(1:27, each=5), y = rep(1:5, times = 27))
yes <- ggplot(data, aes(x = x, y = y))
yes <- yes + geom_point() + geom_line() +
scale_x_continuous(limits = c(0,28), expand = c(0,0))
other_data = data.frame(x = 1:27, y = 50:76 )
no <- ggplot(other_data, aes(x=x, y=y))
no <- no + geom_bar(stat = "identity") +
scale_x_continuous(limits = c(0,28), expand = c(0,0))
gYes = ggplotGrob(yes) # get the ggplot grobs
gNo = ggplotGrob(no)
plot(rbind(gNo, gYes, size = "first")) # Arrange and plot the grobs
Edit: To change the heights of the plots:
g = rbind(gNo, gYes, size = "first") # Combine the plots
panels <- g$layout$t[grepl("panel", g$layout$name)] # Get the positions for plot panels
g$heights[panels] <- unit(c(0.7, 0.3), "null") # Replace heights with your relative heights
plot(g)
I can think of (at least) two ways to align the x-axes in the two plots:
The two axes do not align because in the bar plot, the geoms cover the x-axis from about 0.5 to 27.5 (0.55 to 27.45 with the default bar width of 0.9), while in the other plot, the data only ranges from 1 to 27. The reason is that the bars have a width and the points don't. You can force the axes to align by explicitly specifying an x-axis range. Using the definitions from your plot, this can be achieved by
yes <- yes + scale_x_continuous(limits=c(0,28))
no <- no + scale_x_continuous(limits=c(0,28))
grid.arrange(no, yes)
limits sets the range of the x-axis. Note, though, that the alignment is still not quite perfect. The y-axis labels take up a little more space in the upper plot, because the numbers have two digits. The plot looks as follows:
The other solution is a bit more complicated but it has the advantage that the x-axis is drawn only once and that ggplot makes sure that the alignment is perfect. It makes use of faceting and the trick described in this answer. First, the data must be combined into a single data frame by
all <- rbind(data.frame(other_data,type="other"),data.frame(data,type="data"))
and then the plot can be created as follows:
ggplot(all,aes(x=x,y=y)) + facet_grid(type~.,scales = "free_y") +
geom_bar(data=subset(all,type=="other"),stat="identity") +
geom_point(data=subset(all,type=="data")) +
geom_line(data=subset(all,type=="data"))
The trick is to let the facets be constructed by the variable type which was used before to label the two data sets. But then each geom only gets the subset of the data that should be drawn with that specific geom. In facet_grid, I also used scales = "free_y" because the two y-axes should be independent. This plot looks as follows:
You can change the labels of the facets by giving other names when you define the data frame all. If you want to remove them altogether, then add the following to your plot:
+ theme(strip.background = element_blank(), strip.text = element_blank())
Like this previous poster, I am also using geom_text to annotate plots in ggplot2. And I want to position those annotations in relative coordinates (proportion of facet H & W) rather than data coordinates. Easy enough for most plots, but in my case I'm dealing with histograms. I'm sure the relevant information as to the y scale must be lurking in the plot object somewhere (after adding geom_histogram), but I don't see where.
My question: How do I read maximum bar height from a faceted ggplot2 object containing geom_histogram? Can anyone help?
Try this:
library(plyr)
library(scales)
p <- ggplot(mtcars, aes(mpg)) + geom_histogram(aes(y = ..density..)) + facet_wrap(~am)
r <- print(p)
# in data coordinate
(dc <- dlply(r$data[[1]], .(PANEL), function(x) max(x$density)))
(mx <- dlply(r$data[[1]], .(PANEL), function(x) x[which.max(x$density), ]$x))
# add annotation (see figure below)
p + geom_text(aes(x, y, label = text),
data = data.frame(x = unlist(mx), y = unlist(dc), text = LETTERS[1:2], am = 0:1),
colour = "red", vjust = 0)
# scale range
(yr <- llply(r$panel$ranges, "[[", "y.range"))
# in relative coordinates
(rc <- mapply(function(d, y) rescale(d, from = y), dc, yr))
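The code above relies on `print(p)` returning the built plot and on `r$panel$ranges`, which reflects an older ggplot2. A hedged sketch of the same idea for a recent ggplot2 follows. It assumes version 3.3.0 or later, that `ggplot_build()` exposes the computed layer data as `$data`, and that the per-panel ranges are available as `$layout$panel_params[[i]]$y.range`. The variable names are illustrative.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(mpg)) +
  geom_histogram(aes(y = after_stat(density)), bins = 30) +
  facet_wrap(~am)

# build the plot without drawing it; the first layer holds the histogram bars
built     <- ggplot_build(p)
hist_data <- built$data[[1]]

# maximum bar height per panel, in data coordinates
max_density <- tapply(hist_data$density, hist_data$PANEL, max)

# y range of each panel (assumed to live in panel_params, see lead-in)
y_ranges <- lapply(built$layout$panel_params, function(pp) pp$y.range)

# maximum bar height as a proportion of the panel height
relative_max <- mapply(function(d, r) (d - r[1]) / diff(r),
                       as.numeric(max_density), y_ranges)
relative_max
```

The relative values fall below 1 because the default scale expansion leaves a small gap above the tallest bar.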