Breaking y-axis in ggplot2 with geom_bar - r

I'm having a hard time dealing with this plot.
The bars for ANI > 96 are so tall that the red and blue percentage labels are hard to read.
I tried to break the y-axis following answers from other Stack Overflow posts, but could not get it to work.
Any suggestions?
Thanks.
library(data.table)
library(ggplot2)
dt <- data.table("ANI"= sort(c(seq(79,99),seq(79,99))), "n_pairs" = c(5, 55, 13, 4366, 6692, 59568, 382873, 397996, 1104955, 282915,
759579, 261170, 312989, 48423, 120574, 187685, 353819, 79468, 218039, 66314, 41826, 57668, 112960, 81652, 28613,
64656, 21939, 113656, 170578, 238967, 610234, 231853, 1412303, 5567, 4607268, 5, 14631942, 0, 17054678, 0, 3503846, 0),
"same/diff" = rep(c("yes","no"), 21))
for (i in 1:nrow(dt)) {
if (i%%2==0) {
next
}
total <- dt$n_pairs[i] + dt$n_pairs[i+1]
dt$total[i] <- total
dt$percent[i] <- paste0(round(dt$n_pairs[i]/total *100,2), "%")
dt$total[i+1] <- total
dt$percent[i+1] <- paste0(round(dt$n_pairs[i+1]/total *100,2), "%")
}
ggplot(data=dt, aes(x=ANI, y=n_pairs, fill=`same/diff`)) +
geom_text(aes(label=percent), position=position_dodge(width=0.9), hjust=0.75, vjust=-0.25) +
geom_bar(stat="identity") + scale_x_continuous(breaks = dt$ANI) +
labs(x ="ANI", y = "Number of pairs", fill = "Share one common species taxonomy?") +
theme_classic() + theme(legend.position="bottom")

Here is the list of major changes I made:
I limited the y-axis range by zooming into the chart with the ylim argument of coord_flip, which zooms the same way coord_cartesian does, without dropping any data.
coord_flip should also improve the readability of the chart by switching x and y. I don't know whether that switch is acceptable for you.
Dodging now also works as expected: the two bars sit next to each other, each with its label at the end of the bar.
I put geom_bar before geom_text so that the text is always drawn in front of the bars.
I used scale_y_continuous to change the labels of the y-axis (the horizontal axis in the chart, because of the switch) so that the large numbers are easier to read.
ggplot(data=dt, aes(x = ANI, y = n_pairs, fill = `same/diff`)) +
geom_bar(stat = "identity", position = position_dodge2(width = 1), width = 0.8) +
geom_text(aes(label = percent), position = position_dodge2(width = 1), hjust = 0, size = 3) +
scale_x_continuous(breaks = dt$ANI) +
scale_y_continuous(labels = scales::comma) +
labs(x ="ANI", y = "Number of pairs", fill = "Share one common species taxonomy?") +
theme_classic() +
theme(legend.position = "bottom") +
coord_flip(ylim = c(0, 2e6))
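A small sketch of why zooming matters here, using made-up toy data rather than the data from the question: limits set on the coordinate system only crop the view, while limits set on the scale discard out-of-range values before the bars are drawn.

```r
library(ggplot2)

# Hypothetical toy data (not from the question): one bar is far taller than the rest.
toy <- data.frame(x = c("a", "b", "c"), y = c(10, 20, 1000))

p <- ggplot(toy, aes(x = x, y = y)) + geom_col()

# Zooming: all three bars are kept; the tall one is simply cut off at the panel edge.
p_zoom <- p + coord_cartesian(ylim = c(0, 50))

# Scale limits: values outside the range are turned into NA,
# so the tall bar is dropped (with a warning) instead of being cropped.
p_limit <- p + scale_y_continuous(limits = c(0, 50))
```

coord_flip(ylim = ...) behaves like the zooming variant, which is why the tall bars in the answer are cropped rather than removed.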
EDIT
This way the columns and labels are stacked, and the labels never overlap.
ggplot(data=dt, aes(x = ANI, y = n_pairs, fill = `same/diff`)) +
geom_bar(stat = "identity", width = 0.8) +
geom_text(aes(label = percent,
hjust = ifelse(`same/diff` == "yes", 1, 0)),
position = "stack", size = 3) +
scale_x_continuous(breaks = dt$ANI) +
scale_y_continuous(labels = scales::comma) +
labs(x ="ANI", y = "Number of pairs", fill = "Share one common species taxonomy?") +
theme_classic() +
theme(legend.position = "bottom") +
coord_flip(ylim = c(0, 2e6))
Alternatively, you can avoid overlapping labels with check_overlap = TRUE, but then some of the labels will not be shown.
ggplot(data=dt, aes(x = ANI, y = n_pairs, fill = `same/diff`)) +
geom_bar(stat = "identity", width = 0.8) +
geom_text(aes(label = percent), hjust = 1, position = "stack", size = 3, check_overlap = TRUE) +
scale_x_continuous(breaks = dt$ANI) +
scale_y_continuous(labels = scales::comma) +
labs(x ="ANI", y = "Number of pairs", fill = "Share one common species taxonomy?") +
theme_classic() +
theme(legend.position = "bottom") +
coord_flip(ylim = c(0, 2e6))

Related

geom_text with rounded percentage numbers on a bar graph?

I am trying to make a bar graph that shows percentages, but I cannot round them.
In this variant the percentages are shown, but I am not able to round them:
ggplot(data=true_dizzy, aes(x = reorder(factor(ICD_subgroup_new), ICD_subgroup_new, length),
y = prop.table(stat(count)),
label = scales::percent(prop.table(stat(count))))) +
geom_bar() +
coord_flip() +
geom_text(stat = "count", position = position_dodge(.9), hjust = 0, size = 3) +
scale_y_continuous(label = scales::percent) +
labs(y = c(""), x = c(""), title = c("Subgroups of Dizzy Patients (%")) +
theme_bw()
and in this variant there are no percent signs behind the numbers
ggplot(data=true_dizzy, aes(x = reorder(factor(ICD_subgroup_new), ICD_subgroup_new, length),
y = prop.table(stat(count)),
label = scales::percent(prop.table(stat(count))))) +
geom_bar() +
coord_flip() +
geom_text(stat = "count", aes(label = round(x = (..count../sum(..count..))*100, digits = 1), hjust = 0),
position = position_dodge(.9), hjust = 0, size = 3) +
scale_y_continuous(label = scales::percent) +
labs(y = c(""), x = c(""), title = c("Subgroups of Dizzy Patients (%")) +
theme_bw()
I know the problem is in the geom_text line, but I can't seem to figure it out.

Formatting GGplot stacked barplot

I am making a set of scorecards where I am generating a set of graphs that show the distribution of responses from a survey and also where the response for a specific company falls. I need to modify the formatting of a graph, a stacked barchart, and add a few features I’ve outlined below. I’ve already spent a few hours getting my chart to where it is now and would appreciate your help with the features I outline below.
Data is
Data<-data.frame(Reviewed = c("Annually", "Annually", "Hourly", "Monthly", "Weekly","Monthly","Weekly","Other","Other","Monthly","Weekly"),Company=c("a","b","c","d","e","f","g","h","i","j","k"),Question="Q1")
So far I’ve developed this
ggplot(Data, aes(x="Question", fill=Reviewed)) + geom_bar(position='fill' ) +
coord_flip()
I would like to do the following:
Order the variables so they are arranged on plot as follows: Annually,Monthly,Weekly,Hourly,Other
Express the y axis in terms of percent. I.e. 0.25 turns into 25%
Move y-axis directly underneath the bar.
Remove the legend but move the terms underneath the respective part of the graph on a diagonal slant.
Add a black line that cuts down the 50% mark
Add a dot in at the midpoint of the stack for the value of company “e”.
Remove gray background
This is what I'm hoping the finished graph will look like.
There's a lot to unpack here, so I'll break it down bit by bit:
Order the variables so they are arranged on plot as follows: Annually,Monthly,Weekly,Hourly,Other
Assign "Reviewed" as an ordered factor. I'm reversing the order here since it wants to plot the "lowest" factor first (to the left).
Data$Reviewed <- factor(Data$Reviewed,
levels = rev(c('Annually', 'Monthly', 'Weekly', 'Hourly', 'Other')),
ordered = T)
ggplot(Data, aes(x="Question", fill=Reviewed)) + geom_bar(position='fill' ) +
coord_flip()
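To see what the factor call changes, here is a minimal base R sketch on a standalone vector (the names reviewed and wanted are just for illustration): without explicit levels the categories are sorted alphabetically, while passing levels fixes the order ggplot2 will use.

```r
# The five response categories from the question, in arbitrary order.
reviewed <- c("Annually", "Hourly", "Monthly", "Weekly", "Other")

# Default: levels are sorted alphabetically.
default_levels <- levels(factor(reviewed))

# Desired display order, reversed as in the answer.
wanted <- c("Annually", "Monthly", "Weekly", "Hourly", "Other")
ordered_levels <- levels(factor(reviewed, levels = rev(wanted)))

print(default_levels)
print(ordered_levels)
```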
Express the y axis in terms of percent. I.e. 0.25 turns into 25%
Use scale_y_continuous(labels = scales::percent) to adjust the labels. The scales package is a dependency of ggplot2, so it should already be installed.
ggplot(Data, aes(x="Question", fill=Reviewed)) +
geom_bar(position = 'fill') +
scale_y_continuous(labels = scales::percent) +
coord_flip()
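For reference, a quick sketch of what scales::percent does to plain numbers; the accuracy argument controls rounding, and the input values here are only illustrative.

```r
library(scales)

# Proportions as they appear on the axis of a filled bar.
props <- c(0.25, 0.5, 1)

# accuracy = 1 rounds to whole percentages.
whole <- percent(props, accuracy = 1)
print(whole)    # "25%" "50%" "100%"

# accuracy = 0.1 keeps one decimal place.
one_dp <- percent(0.1234, accuracy = 0.1)
print(one_dp)   # "12.3%"
```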
Move y-axis directly underneath the bar.
Remove gray background
These are done all at once by adding expand = F to coord_flip.
ggplot(Data, aes(x="Question", fill=Reviewed)) +
geom_bar(position = 'fill') +
scale_y_continuous(labels = scales::percent) +
coord_flip(expand = F)
Remove the legend...
Add theme(legend.position = 'none').
ggplot(Data, aes(x="Question", fill=Reviewed)) +
geom_bar(position = 'fill') +
scale_y_continuous(labels = scales::percent) +
coord_flip(expand = F) +
theme(legend.position = 'none')
but move the terms underneath the respective part of the graph on a diagonal slant.
This is tougher and takes a good amount of fiddling.
Use geom_text to make the labels
Calculate the position along the bar using the 'count' stat
Move the labels to the bottom of the plot by providing a fake x coordinate
Align the labels in the center of the bars using position_stack, and make them abut the x axis using hjust.
Add angle.
Use clip = 'off' in coord_flip to make sure that these values are not cut out since they're outside the plotting area.
Fiddle with the x limits to crop out empty plotting area.
Adjust the plot margin in theme to make sure everything can be seen.
ggplot(Data, aes(x="Question", fill=Reviewed)) +
geom_bar(position = 'fill') +
geom_text(aes(label = Reviewed, x = 0.45,
y = stat(..count../sum(..count..))), stat = 'count',
position = position_stack(0.5),
hjust = 0,
angle = 45) +
scale_y_continuous(labels = scales::percent) +
coord_flip(xlim = c(0.555, 1.4), clip = 'off',expand = F) +
theme(plot.margin = margin(0, 0, 35, 10),
legend.position = 'none')
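The stat(..count..) spelling used above is the older notation. Assuming ggplot2 3.3.0 or later, the same proportion can be written with after_stat(); the sketch below only checks that the computed shares add up to 1 and is not a replacement for the full plot.

```r
library(ggplot2)

# Same responses as in the question (Company column omitted for brevity).
Data <- data.frame(
  Reviewed = c("Annually", "Annually", "Hourly", "Monthly", "Weekly", "Monthly",
               "Weekly", "Other", "Other", "Monthly", "Weekly"),
  Question = "Q1"
)

# after_stat(count / sum(count)) turns the per-category counts into shares.
p <- ggplot(Data, aes(x = "Question", fill = Reviewed)) +
  geom_bar(aes(y = after_stat(count / sum(count))))

# Each stacked segment spans ymin..ymax; its height is that category's share.
segments <- layer_data(p)
shares <- segments$ymax - segments$ymin
print(sum(shares))
```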
Add a black line that cuts down the 50% mark
Use geom_hline(yintercept = 0.5); remember that it's a "horizontal" line since the coordinates are flipped.
ggplot(Data, aes(x="Question", fill=Reviewed)) +
geom_bar(position = 'fill') +
geom_text(aes(label = Reviewed, x = 0.45,
y = stat(..count../sum(..count..))), stat = 'count',
position = position_stack(0.5),
hjust = 0,
angle = 45) +
geom_hline(yintercept = 0.5) +
scale_y_continuous(labels = scales::percent) +
coord_flip(xlim = c(0.555, 1.4), clip = 'off',expand = F) +
theme(plot.margin = margin(0, 0, 20, 10),
legend.position = 'none')
Add a dot in at the midpoint of the stack for the value of company “e”.
This is pretty hack-y. Using the same y values as in geom_text, use geom_point to plot a point for every value of Reviewed, then use position_stack(0.5) to nudge them to the center of the bar. Then use scale_color_manual to only color "Weekly" values (which is the corresponding value of Reviewed for Company "e"). I'm sure there's a way to do this more programmatically.
ggplot(Data, aes(x="Question", fill=Reviewed)) +
geom_bar(position = 'fill') +
geom_text(aes(label = Reviewed, x = 0.45,
y = stat(..count../sum(..count..))), stat = 'count',
position = position_stack(0.5),
hjust = 0,
angle = 45) +
geom_hline(yintercept = 0.5) +
geom_point(aes(y = stat(..count../sum(..count..)),
color = Reviewed), stat = 'count',
position = position_stack(0.5), size = 5) +
scale_color_manual(values = 'black', limits = 'Weekly') +
scale_y_continuous(labels = scales::percent) +
coord_flip(xlim = c(0.555, 1.4), clip = 'off',expand = F) +
theme(plot.margin = margin(0, 0, 20, 10),
legend.position = 'none')
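One possible way to make this less hard-coded, sketched under the assumption that each company appears exactly once in Data: look up the company's answer first and pass it to scale_color_manual instead of typing 'Weekly'. The names target and dot_scale are made up for this example.

```r
library(ggplot2)

Data <- data.frame(
  Reviewed = c("Annually", "Annually", "Hourly", "Monthly", "Weekly", "Monthly",
               "Weekly", "Other", "Other", "Monthly", "Weekly"),
  Company = c("a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k"),
  Question = "Q1"
)

# Look up the response of the company to highlight.
# as.character() keeps this working if Reviewed has been converted to a factor.
target <- as.character(Data$Reviewed[Data$Company == "e"])
print(target)

# Only this category receives a colour, so only its point is drawn.
dot_scale <- scale_color_manual(values = "black", limits = target)
```

Adding dot_scale to the plot in place of the scale_color_manual(...) line should then highlight whichever category the chosen company belongs to.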
Prettying things up:
ggplot(Data, aes(x="Question", fill = Reviewed)) +
geom_bar(position = 'fill') +
geom_text(aes(label = Reviewed, x = 0.45,
y = stat(..count../sum(..count..))), stat = 'count',
position = position_stack(0.5),
hjust = 0,
angle = 45) +
geom_hline(yintercept = 0.5) +
geom_point(aes(y = stat(..count../sum(..count..)),
color = Reviewed), stat = 'count',
position = position_stack(0.5), size = 5) +
scale_color_manual(values = 'black', limits = 'Weekly') +
scale_y_continuous(labels = scales::percent) +
coord_flip(xlim = c(0.555, 1.4), clip = 'off', expand = F) +
labs(x = NULL, y = NULL) +
theme_minimal() +
theme(plot.margin = margin(0, 0, 35, 10),
legend.position = 'none')

ggplot legend / label change with various guides

I am trying to change the labels of a legend on a ggplot where I have legends for 2 aes.
With scale_fill_manual, it works, but only for one of the two legends. I guess I should use guides, which I already use to remove the titles and to remove the legend for a third aesthetic, but I have not managed to do it.
Do you have a solution?
Here is my code :
p <- ggplot(data, aes(x = Nb))
p + geom_ribbon(aes(ymin = Sandwich.min, ymax = Sandwich.max, fill = 'grey70',alpha=0.8)) +
geom_ribbon(aes(ymin = Assiette.min, ymax = Assiette.max, fill = '#6495ED80',alpha=0.8)) +
geom_line(aes(y = Pizza, col = '#FF7F24')) +
geom_line(aes(y = Sushis, col = '#228B22')) +
labs(title = "Business lunch cost by number of participants",
x = "Number of participants",
y = "Price (Euros)") +
scale_x_continuous(breaks=seq(1,20,1)) +
scale_y_continuous(breaks = seq(0,300,50)) +
theme_light() +
theme(plot.title = element_text(size = 12, hjust = 0.5)) +
guides(alpha = FALSE, colour = guide_legend(" "), fill = guide_legend(" ")) +
scale_fill_manual(
values=c('#6495ED80','grey70'),
labels=c("Assiettes","Sandwiches"))

How to align text on clustered bar chart in ggplot2?

I'm trying to align the percent frequency of each bar in my clustered bar chart. Right now, my chart looks like this:
Here's the code as well:
ggplot(graph_data, aes(x, Freq)) +
geom_bar(aes(fill = Pref), position = 'dodge', stat = 'identity') +
geom_text(aes(label = sprintf("%.0f%%", round(Freq/sum(Freq) * 100))),
hjust = -0.25) +
labs(list(x = attr(graph_data, 'seg_label'),
y = 'Frequency',
title = paste('Q:', attr(graph_data, 'question')))) +
scale_y_continuous(limits = c(0, 1.2 * max(graph_data$Freq))) +
guides(fill = F) +
coord_flip() +
annotate("text", x = Inf, y = Inf,
label = paste0("N = ", sum(graph_data$Freq)),
hjust = 1.5, vjust = 1.5)
I think the issue can be solved on this snippet of code, but I'm not sure how:
geom_text(aes(label = sprintf("%.0f%%", round(Freq/sum(Freq) * 100))), hjust = -0.25)
Any help would be greatly appreciated!
Edit: Here's a sample of my data's structure as well:
df <- data.frame(x = rep(c('1824', '2534', '3544'), 3),
Pref = rep(c('low', 'neutral', 'high')),
Freq = 1:9 * 10)
As mentioned in the comments I think this is a duplicate of Position geom_text on dodged barplot.
But I did it now, so I'll include the code.
ggplot(df, aes(x, Freq, fill = Pref)) +
geom_bar(position = 'dodge', stat = 'identity') +
geom_text(aes(label = sprintf("%.0f%%", round(Freq/sum(Freq) * 100))),
position = position_dodge(width = 0.9), hjust = -0.25) +
labs(list(x = attr(df, 'seg_label'),
y = 'Frequency',
title = paste('Q:', attr(df, 'question')))) +
scale_y_continuous(limits = c(0, 1.2 * max(df$Freq))) +
guides(fill = F) +
coord_flip()
You need to put fill in the original aes so that geom_text knows which label to dodge and by how much.
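A rough sketch of the difference, using a made-up fully crossed data set (toy) because in the sample data above every x value has only one Pref: when fill is mapped in ggplot(), the text layer inherits the grouping and is dodged; when fill is mapped only in geom_bar, the text layer has a single group per x and stays centred.

```r
library(ggplot2)

# Hypothetical data: every age band crossed with every preference level.
toy <- expand.grid(x = c("1824", "2534", "3544"),
                   Pref = c("low", "neutral", "high"),
                   stringsAsFactors = FALSE)
toy$Freq <- 1:9 * 10

# fill in the global aes: both layers are grouped by Pref.
p_global <- ggplot(toy, aes(x, Freq, fill = Pref)) +
  geom_bar(position = "dodge", stat = "identity") +
  geom_text(aes(label = Freq), position = position_dodge(width = 0.9))

# fill only in geom_bar: the text layer does not know about Pref.
p_local <- ggplot(toy, aes(x, Freq)) +
  geom_bar(aes(fill = Pref), position = "dodge", stat = "identity") +
  geom_text(aes(label = Freq), position = position_dodge(width = 0.9))

# Number of distinct horizontal label positions in the text layer (layer 2).
n_global <- length(unique(layer_data(p_global, 2)$x))
n_local <- length(unique(layer_data(p_local, 2)$x))
print(c(n_global, n_local))
```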

How to label a barplot on the opposite side of a bar of -ve and +ve values with ggplot?

May I please seek your help to Label a barplot with ggplot2 like the following graph:
I am using the following code to obtain the attached plot:
library(ggplot2)
test <- data.frame(x = c("Moderately Poor", "Deeply Poor", "Deeply & Intensely Poor", "Intensely Poor", "Overall Poverty"), y = c(0.024, -0.046, -0.025, -0.037, -0.083))
test$colour <- ifelse(test$y < 0, "firebrick1", "steelblue")
test$hjust <- ifelse(test$y > 0, 1.03, -0.03)
ggplot(test, aes(x, y, label = x, hjust = hjust)) +
geom_text(aes(y = 0, colour = colour)) +
geom_bar(stat = "identity", aes(fill = colour))
last_plot() + coord_flip() + labs(x = "", y = "") +
scale_x_discrete(breaks = NA) + theme_bw() +
opts(legend.position = "none")
I was just wondering how can I get the second numeric label on each bar?
Thanks,
R graphics, including ggplot2, are pen-on-paper, i.e., the layers (each geom_...) are drawn in the order they are added.
So, if you want to have a geom_text on top of a geom_bar, the geom_text will need to come after the geom_bar.
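The drawing order can be inspected on the plot object itself. This is a minimal sketch with toy data; it assumes the layers element of a ggplot object, which holds the layers in the order they were added.

```r
library(ggplot2)

# Toy data, for illustration only.
toy <- data.frame(x = c("a", "b"), y = c(1, -2))

# Bars first, text second: the text is drawn on top of the bars.
p <- ggplot(toy, aes(x, y)) +
  geom_col() +
  geom_text(aes(label = y))

# The geoms in the order they will be drawn.
layer_order <- vapply(p$layers, function(l) class(l$geom)[1], character(1))
print(layer_order)
```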
Updating for ggplot2 0.9.3 (the current version)
ggplot(test, aes(x, y)) +
geom_text(aes(y = 0, colour = colour, hjust =hjust, label = x), size=4.5) +
geom_bar(stat = "identity", aes(fill = colour)) +
geom_text(colour ='black',
aes(label = paste( formatC( round(y*100, 2 ), format='f', digits=2 ),'%'),
hjust = ifelse(y>0,1,0))) +
coord_flip() + labs(x = "", y = "") +
scale_x_discrete(breaks = NULL) + theme_bw() +
theme(legend.position = "none")
