I would like all of my value labels to sit at the same height above the bars in ggplot2. How can I do this?
p <- ggplot(figure_df, aes(x=event_type, y=fraction)) +
geom_bar(aes(fill=factor(method)), position=position_dodge(width=0.9), stat="identity") +
facet_grid(~event_time, switch="x") +
theme(axis.text.x = element_text(angle = 45,hjust = 1)) +
geom_text(aes(label=ifelse(method == 'THRIVE',round(pval, 2)," ")))
It's always best to make your questions reproducible; see the guidance on making reproducible R questions.
In lieu of the data I don't have, I made some to answer your question. To set a position at the same height for all labels, you can set y in geom_text to a static number.
Here's an example:
set.seed(3523)
df1 = data.frame(coloring = rep(LETTERS[1:2], 4),
x = rep(LETTERS[1:4], each = 2),
y = runif(8, 10, 29))
ggplot(df1, aes(x = x, y = y, fill = coloring)) +
geom_col(position = position_dodge(.9)) +
geom_text(aes(y = 10, label = round(y, 1)),
position = position_dodge(.9))
You said you wanted the values above the bars. You can set y to a known value, or you can use the data to compute a single static value, like this:
ggplot(df1, aes(x = x, y = y, fill = coloring)) +
geom_col(position = position_dodge(.9)) +
geom_text(aes(y = max(y) + 1, label = round(y, 1)),
position = position_dodge(.9))
If you don't want the labels at the same height, but at the same distance above each bar, there are several ways to do that. Here is one method.
ggplot(df1, aes(x = x, y = y, fill = coloring)) +
geom_col(position = position_dodge(.9)) +
geom_text(aes(y = y + 1, label = round(y, 1)),
position = position_dodge(.9))
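Applied to a faceted plot like the one in the question, the same idea can be sketched as follows. The data are made up (the values in figure_df and the helper variable label_y are assumptions for illustration only). The height is computed once from the data and reused for every label, and group = method is added so that the text dodges together with the bars.

```r
library(ggplot2)

# Made-up stand-in for the asker's figure_df (random values, illustration only)
set.seed(1)
figure_df <- data.frame(
  event_type = rep(c("A", "B"), each = 2, times = 2),
  method     = rep(c("THRIVE", "other"), times = 4),
  event_time = rep(c("early", "late"), each = 4),
  fraction   = runif(8),
  pval       = runif(8)
)

# One shared label height, slightly above the tallest bar in any facet
label_y <- max(figure_df$fraction) * 1.05

p <- ggplot(figure_df, aes(x = event_type, y = fraction)) +
  geom_col(aes(fill = method), position = position_dodge(width = 0.9)) +
  facet_grid(~event_time, switch = "x") +
  geom_text(
    # group = method makes the labels dodge the same way as the bars
    aes(y = label_y, group = method,
        label = ifelse(method == "THRIVE", round(pval, 2), "")),
    position = position_dodge(width = 0.9)
  )
p
```

Because label_y is mapped inside aes(), the y scale is trained on it, so the axis extends far enough to show the labels.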
I'm having a hard time dealing with this plot.
The height of the bars for ANI > 96 makes it hard to read the red and blue percentage text.
I failed to break the y-axis by looking at answers from other posts in StackOverflow.
Any suggestions?
Thanks.
library(data.table)
library(ggplot2)
dt <- data.table("ANI"= sort(c(seq(79,99),seq(79,99))), "n_pairs" = c(5, 55, 13, 4366, 6692, 59568, 382873, 397996, 1104955, 282915,
759579, 261170, 312989, 48423, 120574, 187685, 353819, 79468, 218039, 66314, 41826, 57668, 112960, 81652, 28613,
64656, 21939, 113656, 170578, 238967, 610234, 231853, 1412303, 5567, 4607268, 5, 14631942, 0, 17054678, 0, 3503846, 0),
"same/diff" = rep(c("yes","no"), 21))
for (i in 1:nrow(dt)) {
if (i%%2==0) {
next
}
total <- dt$n_pairs[i] + dt$n_pairs[i+1]
dt$total[i] <- total
dt$percent[i] <- paste0(round(dt$n_pairs[i]/total *100,2), "%")
dt$total[i+1] <- total
dt$percent[i+1] <- paste0(round(dt$n_pairs[i+1]/total *100,2), "%")
}
ggplot(data=dt, aes(x=ANI, y=n_pairs, fill=`same/diff`)) +
geom_text(aes(label=percent), position=position_dodge(width=0.9), hjust=0.75, vjust=-0.25) +
geom_bar(stat="identity") + scale_x_continuous(breaks = dt$ANI) +
labs(x ="ANI", y = "Number of pairs", fill = "Share one common species taxonomy?") +
theme_classic() + theme(legend.position="bottom")
Here is the list of major changes I made:
I reduced the visible y range by zooming into the chart with the ylim argument of coord_flip (coord_flip builds on coord_cartesian, so it zooms without dropping any data).
coord_flip should also improve the readability of the chart by switching x and y. I don't know if the switch is a desirable output for you.
Dodging now works as expected with position_dodge2: two bars next to each other, each with its label at the end of the bar (the axes are flipped in this case).
I set geom_bar before geom_text so that the text is always in front of the bars in the chart.
I set scale_y_continuous to change the labels of the y axis (in the chart the x axis because of the switch) to improve the readability of the zeros.
ggplot(data=dt, aes(x = ANI, y = n_pairs, fill = `same/diff`)) +
geom_bar(stat = "identity", position = position_dodge2(width = 1), width = 0.8) +
geom_text(aes(label = percent), position = position_dodge2(width = 1), hjust = 0, size = 3) +
scale_x_continuous(breaks = dt$ANI) +
scale_y_continuous(labels = scales::comma) +
labs(x ="ANI", y = "Number of pairs", fill = "Share one common species taxonomy?") +
theme_classic() +
theme(legend.position = "bottom") +
coord_flip(ylim = c(0, 2e6))
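To illustrate why the zoom is done through the coord rather than through scale limits, here is a small sketch on toy data (d, base, zoomed and limited are illustrative names, not from the post). Zooming with coord_cartesian(ylim = ...), or equally coord_flip(ylim = ...), only changes the visible window. Limits set on the scale instead treat out-of-range values as missing under the default out-of-bounds handling, so the tall bar would disappear rather than be cut off.

```r
library(ggplot2)

# Toy data: one short bar and one bar far taller than the range of interest
d <- data.frame(x = c("a", "b"), y = c(5, 50))
base <- ggplot(d, aes(x, y)) + geom_col()

# Zoom: the tall bar is still drawn, just cut off at the top of the panel
zoomed <- base + coord_cartesian(ylim = c(0, 10))

# Scale limits: y = 50 falls outside the limits and is dropped with a warning
limited <- base + scale_y_continuous(limits = c(0, 10))

# The built data of the zoomed plot still contains the full bar height
max(layer_data(zoomed)$ymax)
```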
EDIT
This way the columns and labels are stacked, but the labels never overlap.
ggplot(data=dt, aes(x = ANI, y = n_pairs, fill = `same/diff`)) +
geom_bar(stat = "identity", width = 0.8) +
geom_text(aes(label = percent,
hjust = ifelse(`same/diff` == "yes", 1, 0)),
position = "stack", size = 3) +
scale_x_continuous(breaks = dt$ANI) +
scale_y_continuous(labels = scales::comma) +
labs(x ="ANI", y = "Number of pairs", fill = "Share one common species taxonomy?") +
theme_classic() +
theme(legend.position = "bottom") +
coord_flip(ylim = c(0, 2e6))
Alternatively, you can avoid overlapping labels with check_overlap = TRUE, but then some labels will occasionally not be shown.
ggplot(data=dt, aes(x = ANI, y = n_pairs, fill = `same/diff`)) +
geom_bar(stat = "identity", width = 0.8) +
geom_text(aes(label = percent), hjust = 1, position = "stack", size = 3, check_overlap = TRUE) +
scale_x_continuous(breaks = dt$ANI) +
scale_y_continuous(labels = scales::comma) +
labs(x ="ANI", y = "Number of pairs", fill = "Share one common species taxonomy?") +
theme_classic() +
theme(legend.position = "bottom") +
coord_flip(ylim = c(0, 2e6))
I would like to add summary statistics on a box plot at the max of a dynamic y axis.
In the real data the y-axis variable is chosen from a dynamic dropdown; one variable ranges from 0 to 6 and the other from 0 to 100. In the example below I have hard-coded where I would like the labels to be, but I cannot hard-code them in the real data.
Is there a way to either:
Set labels outside the graph above the y axis? So that the labels will not move even if the axis changes?
Or is there a way to set it to max of Y + n?
Example:
# library
library(ggplot2)
library(ggpubr)
# create a data frame
variety=rep(LETTERS[1:7], each=40)
treatment=rep(c("high","low"),each=20)
note=seq(1:280)+sample(1:150, 280, replace=T)
data=data.frame(variety, treatment , note)
# grouped boxplot
ggplot(data, aes(x = variety, y = note, fill = treatment)) +
geom_boxplot() +
scale_fill_manual(values = c("#79AAB9", "#467786")) +
stat_compare_means(aes(group = treatment), label = "p.format") +
stat_summary(
fun.data = function(x)
data.frame(y = 460, label = paste(round(median(
x
), 1))),
geom = "text",
aes(group = treatment),
hjust = 0.5,
position = position_dodge(0.9)
) +
stat_summary(
fun.data = function(x)
data.frame(y = 445, label = paste("n", length(x))),
geom = "text",
aes(group = treatment),
hjust = 0.5,
position = position_dodge(0.9)
) +
expand_limits(y = 100)
Thanks so much for any help in advance.
I managed to get the following working with a suggestion from @MarkNeal: setting y = Inf pins the labels to the top of the panel, and vjust moves them down inside it.
# library
library(ggplot2)
library(ggpubr)
# create a data frame
variety=rep(LETTERS[1:7], each=40)
treatment=rep(c("high","low"),each=20)
note=seq(1:280)+sample(1:150, 280, replace=T)
data=data.frame(variety, treatment , note)
# grouped boxplot
ggplot(data, aes(x = variety, y = note, fill = treatment)) +
geom_boxplot() +
scale_fill_manual(values = c("#79AAB9", "#467786")) +
stat_compare_means(aes(group = treatment), label = "p.format", vjust = 3) +
stat_summary(
fun.data = function(x)
data.frame(y= Inf, label = paste(round(median(
x
), 1))),
geom = "text",
aes(group = treatment),
hjust = 0.5, vjust = 1,
position = position_dodge(0.9)
) +
stat_summary(
fun.data = function(x)
data.frame(y = Inf, label = paste("n", length(x))),
geom = "text",
aes(group = treatment),
hjust = 0.5, vjust = 2,
position = position_dodge(0.9)
)
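Stripped down to the essential part, the approach above can be sketched like this. The data frame d and the helper top_label are made up for illustration. The summary function returns y = Inf, which ggplot2 places at the upper edge of the panel whatever the axis range is, and vjust then shifts the text down so it sits inside the panel.

```r
library(ggplot2)

# Made-up data; in the real application the y variable would change dynamically
set.seed(42)
d <- data.frame(
  variety = rep(LETTERS[1:3], each = 20),
  note    = runif(60, 0, 100)
)

# Hypothetical helper: one label per box, anchored to the top edge of the panel
top_label <- function(x) {
  data.frame(y = Inf, label = paste("n", length(x)))
}

p <- ggplot(d, aes(x = variety, y = note)) +
  geom_boxplot() +
  # vjust > 1 moves the text below the top edge so it is not cut off
  stat_summary(fun.data = top_label, geom = "text", vjust = 1.5)
p
```

Rescaling note (for example to a 0 to 6 range) leaves the labels at the top of the panel, since Inf does not depend on the data range.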
How to show the point (x=0, y=1500) with a text label next to it on the following histogram?
ggplot(ds_visits, aes(x = patientsInService)) +
geom_histogram(stat = "count", col = "black", fill = "white") +
theme_bw() +
labs(x = "Patients in service", y = "Cases") +
scale_x_discrete(limits = seq(0, 5, 1))
You have to create a dummy data.frame for the point data:
pointData <- data.frame(X = 0, Y = 1500)
Plot it with two additional geoms (geom_point and geom_text):
ggplot(ds_visits, aes(patientsInService)) +
geom_histogram(stat = "count", col = "black", fill = "white") +
geom_point(data = pointData, aes(X , Y)) +
geom_text(data = pointData, aes(X + 1 , Y + 10, label = "My Text"))
In geom_text I shift the coordinates slightly so that the text does not overlap the point.
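As an alternative sketch, ggplot2's annotate() accepts constant positions directly, so no dummy data.frame is needed. The ds_visits data below are made up for illustration, and geom_bar is used here as the counting geom.

```r
library(ggplot2)

# Made-up stand-in for ds_visits (illustration only)
set.seed(7)
ds_visits <- data.frame(patientsInService = sample(0:5, 5000, replace = TRUE))

p <- ggplot(ds_visits, aes(x = patientsInService)) +
  geom_bar(col = "black", fill = "white") +
  # A single point at (0, 1500)
  annotate("point", x = 0, y = 1500) +
  # Text anchored at the same spot; a negative hjust pushes it to the right of the point
  annotate("text", x = 0, y = 1500, label = "My Text", hjust = -0.2) +
  theme_bw() +
  labs(x = "Patients in service", y = "Cases")
p
```

Using hjust instead of offsetting x and y keeps the gap between point and text independent of the axis units.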
The data I am working on are clustered, with multiple observations within each group. I generated a caterpillar plot and want one label for each group (zipid), not one for every line. My current code looks like this:
text = hosp_new[,c("zipid")]
ggplot(hosp_new, aes(x = id, y = oe, colour = zipid, shape = group)) +
# theme(panel.grid.major = element_blank()) +
geom_point(size=1) +
scale_shape_manual(values = c(1, 2, 4)) +
geom_errorbar(aes(ymin = low_ci, ymax = high_ci)) +
geom_smooth(method = lm, se = FALSE) +
scale_linetype_manual(values = linetype) +
geom_segment(aes(x = start_id, xend = end_id, y = region_oe, yend = region_oe, linetype = "4", size = 1.2)) +
geom_ribbon(aes(ymin = region_low_ci, ymax = region_high_ci), alpha=0.2, linetype = "blank") +
geom_hline(aes(yintercept = 1, alpha = 0.2, colour = "red", size = 1), show.legend = "FALSE") +
scale_size_identity() +
scale_x_continuous(name = "hospital id", breaks = seq(0,210, by = 10)) +
scale_y_continuous(name = "O:E ratio", breaks = seq(0,7, by = 1)) +
geom_text(aes(label = text), position = position_stack(vjust = 10.0), size = 2)
Caterpillar plot:
Each color represents a region. I want just one label per region, but I don't know how to remove the duplicated labels in this graph.
Any idea?
The key is to have geom_text return only one value for each zipid, rather than multiple values. If we want each zipid label located in the middle of its group, then we can use the average value of id as the x-coordinate for each label. In the code below, we use stat_summaryh (from the ggstance package) to calculate that average id value for the x-coordinate of the label and return a single label for each zipid.
library(ggplot2)
theme_set(theme_bw())
library(ggstance)
# Fake data
set.seed(300)
dat = data.frame(id=1:100, y=cumsum(rnorm(100)),
zipid=rep(LETTERS[1:10], c(10, 5, 20, 8, 7, 12, 7, 10, 13,8)))
ggplot(dat, aes(id, y, colour=zipid)) +
geom_segment(aes(xend=id, yend=0)) +
stat_summaryh(fun.x=mean, aes(label=zipid, y=1.02*max(y)), geom="text") +
guides(colour="none")
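If you would rather not depend on ggstance, the same "one label per zipid" idea can be sketched by aggregating the data first and passing the result to geom_text. This is an alternative sketch, not the method above; label_df is a name introduced here for illustration, and the fake data are the same as in the example above.

```r
library(ggplot2)

# Same fake data as above
set.seed(300)
dat <- data.frame(id = 1:100, y = cumsum(rnorm(100)),
                  zipid = rep(LETTERS[1:10], c(10, 5, 20, 8, 7, 12, 7, 10, 13, 8)))

# One row per zipid: the mean id gives the horizontal centre of each group
label_df <- aggregate(id ~ zipid, data = dat, FUN = mean)
# Shared label height, following the 1.02 * max(y) choice used above
label_df$y <- 1.02 * max(dat$y)

p <- ggplot(dat, aes(id, y, colour = zipid)) +
  geom_segment(aes(xend = id, yend = 0)) +
  # This layer only sees label_df, so exactly one label is drawn per group
  geom_text(data = label_df, aes(label = zipid)) +
  guides(colour = "none")
p
```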
You could also use faceting, as mentioned by @user20650. In the code below, panel.spacing.x=unit(0,'pt') removes the space between facet panels, while expand=c(0,0.5) adds 0.5 units of padding on each side of every panel. Together, these ensure constant spacing between tick marks, even across facets.
ggplot(dat, aes(id, y, colour=zipid)) +
geom_segment(aes(xend=id, yend=0)) +
facet_grid(. ~ zipid, scales="free_x", space="free_x") +
guides(colour="none") +
theme_classic() +
scale_x_continuous(breaks=0:nrow(dat),
labels=c(rbind(seq(0,100,5),'','','',''))[1:(nrow(dat)+1)],
expand=c(0,0.5)) +
theme(panel.spacing.x = unit(0,"pt"))
Consider this sample data.
df <- data.frame(
x = factor(c(1, 1, 2, 2)),
y = c(.1, .3, .2, .1),
grp = c("a", "b", "a", "b")
)
Now I create the graph using ggplot, and annotate it using geom_text()
ggplot(data = df, aes(x, y, fill = grp, label = y)) +
geom_bar(stat = "identity", position = "dodge") +
scale_y_continuous(limits=c(0,1)) +
geom_text(position = position_dodge(0.9))
How do I make all the text labels line up horizontally at the top of the graph window?
You can specify the aes(y=...) in geom_text. So, for the numbers at the top of the graph window you'll have
ggplot(data = df, aes(x, y, fill = grp, label = y)) +
geom_bar(stat = "identity", position = "dodge") +
geom_text(aes(y=Inf), position = position_dodge(0.9))
And you may want to chuck in a + ylim(0, 4) to expand the plot area.
To match the edited question:
ggplot(data = df, aes(x, y, fill = grp, label = y)) +
geom_bar(stat = "identity", position = "dodge") +
scale_y_continuous(limits=c(0,1)) +
geom_text(aes(y=0.9), position = position_dodge(0.9)) ## can specify any y=.. value
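One detail worth noting with y = Inf: by default the text is centred vertically on the top edge of the panel, so part of it may fall outside the plotting area. Here is a sketch, using the df from the question, that moves the labels down with vjust (the value 1.5 is an arbitrary choice):

```r
library(ggplot2)

# Sample data from the question
df <- data.frame(
  x = factor(c(1, 1, 2, 2)),
  y = c(.1, .3, .2, .1),
  grp = c("a", "b", "a", "b")
)

p <- ggplot(data = df, aes(x, y, fill = grp, label = y)) +
  geom_col(position = "dodge") +
  # y = Inf anchors the text at the top edge; vjust = 1.5 moves it just inside
  geom_text(aes(y = Inf), vjust = 1.5, position = position_dodge(0.9))
p
```

With this variant the labels stay at the top even if the y-axis limits are changed later.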