vjust inconsistent in stacked bar plot - r

I have a stacked bar plot, with highly unequal heights of bars. I would like to show the percentages on top of each bar.
What I have done so far is the following:
df = structure(list(Type = c("Bronchoscopy", "Bronchoscopy", "Endoscopy",
"Endoscopy"), Bacteremia = structure(c(1L, 2L, 1L, 2L), .Label = c("False",
"True"), class = "factor"), count = c(2710L, 64L, 13065L, 103L
), perc = c(97.6928622927181, 2.3071377072819, 99.2178007290401,
0.782199270959903)), class = c("grouped_df", "tbl_df", "tbl",
"data.frame"), row.names = c(NA, -4L), groups = structure(list(
Type = c("Bronchoscopy", "Endoscopy"), .rows = list(1:2,
3:4)), row.names = c(NA, -2L), class = c("tbl_df", "tbl",
"data.frame"), .drop = TRUE))
ggplot(df, aes(x = Type, y = perc, fill = Bacteremia)) +
geom_bar(stat = "identity") +
ylab("percent") +
geom_text(aes(label = paste0(round(perc, 2), "%")), position =
position_stack(vjust = -0.1), color = "black", fontface = "bold")
I can't seem to get the vjust right. It seems like it's not behaving in the same way for the bottom versus the top bar.
What I would like to achieve is to place the percentages slightly higher than the top edge of each bar.
Any ideas?

Here's a possible approach:
ggplot(df, aes(x = Type, y = perc, fill = Bacteremia)) +
geom_bar(stat = "identity") +
ylab("percent") +
geom_text(aes(label = paste0("", round(perc, 2), "%\n"), y = perc),
color = "black", fontface = "bold", nudge_y = 2)
I should elaborate that ggplot2 places geom_text() relative to the data. If you want the text labels aligned horizontally, you will need to either use annotate() or supply a labelling dataset with Type, percent and Bacteremia and pass that to geom_text() as below.
labdf <- cbind(df, ypos = c(103, 5, 103, 5))
ggplot(df, aes(x = Type, y = perc, fill = Bacteremia)) +
geom_bar(stat = "identity") +
ylab("percent") +
geom_text(data = labdf,
aes(label = paste0("", round(perc, 2), "%"), y = ypos, x = Type),
color = "black", fontface = "bold")
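As for why the original attempt looks inconsistent: as far as I understand it, the vjust in position_stack() is a fraction of each segment's height (0 is the segment's bottom, 1 its top), so -0.1 moves a label down by 10% of that segment's height, which is a very different distance for a 97.7-unit segment and a 2.3-unit one. Below is a minimal sketch rather than a definitive fix. stack_label_y() is a helper I made up to mirror that arithmetic, the data frame is a rounded copy of the question's, and the plot anchors each label at the top of its segment and then lifts the text with geom_text()'s own vjust.

```r
library(ggplot2)

# Rounded copy of the question's data (assumption: rounding is fine for illustration)
df <- data.frame(
  Type = c("Bronchoscopy", "Bronchoscopy", "Endoscopy", "Endoscopy"),
  Bacteremia = c("False", "True", "False", "True"),
  perc = c(97.69, 2.31, 99.22, 0.78)
)

# Hypothetical helper: where a label lands inside a stacked segment
# for a given position_stack(vjust = ...)
stack_label_y <- function(ymin, ymax, vjust) ymin + vjust * (ymax - ymin)

# Assuming the small segment spans 0 to 2.31 and the large one 2.31 to 100
small_y <- stack_label_y(0, 2.31, -0.1)    # shifted by 10% of 2.31
large_y <- stack_label_y(2.31, 100, -0.1)  # shifted by 10% of 97.69

p <- ggplot(df, aes(x = Type, y = perc, fill = Bacteremia)) +
  geom_col() +
  ylab("percent") +
  geom_text(aes(label = paste0(round(perc, 2), "%")),
            position = position_stack(vjust = 1),  # anchor at each segment's top edge
            vjust = -0.3,                          # text justification: lift text above the anchor
            color = "black", fontface = "bold")
```

Because geom_text()'s vjust is measured relative to the text height, the offset is the same for every label no matter how tall the bar is. The value -0.3 is just a starting point and may need tweaking.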

Here's one way to do it:
library(tidyverse)
df <-
tibble(
Type = c("Bronchoscopy", "Bronchoscopy", "Endoscopy", "Endoscopy"),
Bacteremia = c("False", "True", "False", "True"),
count = c(2710L, 64L, 13065L, 103L)
) %>%
group_by(Type) %>%
mutate(Percent = round((count / sum(count) * 100), 1))
df %>%
ggplot(aes(x = Type, y = Percent, fill = Bacteremia)) +
geom_col() +
geom_label(
data = . %>% filter(Bacteremia == "True"),
aes(y = Percent + 5, label = str_c(Percent, "%")),
show.legend = FALSE
) +
geom_label(
data = . %>% filter(Bacteremia == "False"),
aes(y = 105, label = str_c(Percent, "%")),
show.legend = FALSE
)
The choices of 5 and 105 work on my computer, but may need to be tweaked a bit based on your specific settings and aspect ratio. The first geom_label call sets the y position based on the precise percentage, while the second one sets it at a constant level above the bars.
You might also want to play around with using geom_text vs. geom_label to experiment with different color and label settings. The nice thing about geom_label is that it will make it very clear which group is being labeled.

Related

Setting a Background Color for One Facet of Pie Charts in ggplot2

I'm trying to create a plot with three facets, each containing a pie chart. One of these contains the overall statistics. Therefore, to emphasize the "overall" one, I'd like to put a background color behind it.
Here is how the data looks
cat action pct
<chr> <chr> <dbl>
1 All No 34
2 All Yes 66
3 Host No 24
4 Host Yes 76
5 Refugee No 38
6 Refugee Yes 62
Here is the dput deconstruction
> dput(a)
structure(list(cat = c("All", "All", "Host", "Host", "Refugee",
"Refugee"), action = c("No", "Yes", "No", "Yes", "No", "Yes"),
pct = c(34, 66, 24, 76, 38, 62)), class = c("grouped_df",
"tbl_df", "tbl", "data.frame"), row.names = c(NA, -6L), groups = structure(list(
cat = c("All", "Host", "Refugee"), .rows = structure(list(
1:2, 3:4, 5:6), ptype = integer(0), class = c("vctrs_list_of",
"vctrs_vctr", "list"))), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -3L), .drop = TRUE))
I've tried adding a geom_rect() layer before. Usually this method works with my other line and bar graphs where I haven't applied coord_polar() in the plot.
Here is the simplified code:
a %>%
ggplot(aes("", pct, fill= action))+
geom_rect(data = data.frame(cat="All"), aes(xmin = -Inf,xmax = Inf, ymin = -Inf,ymax = Inf,),
fill='red',alpha = 0.2, inherit.aes = FALSE)+
geom_bar(stat = "identity", position = "fill")+
coord_polar(theta = "y", start = 0)+
facet_wrap(~cat)+
theme_solid()+
guides(fill="none")
I don't think there's an easy way to do this directly within ggplot. The rectangular grobs and annotations don't seem to accept infinite limits with a polar transformation, and any finite limits will result in a circular highlight area being drawn. You cannot pass multiple element_rect in theme to style multiple panels either.
This leaves two broad options:
1. Generate the plots separately and draw them together on a single page.
2. Take the graphical output of your plot and change the appropriate grob to a rectGrob with the appropriate fill color.
One neat way to achieve the first option without repeating yourself is to use dplyr::group_map and patchwork::wrap_plots:
library(tidyverse)
a %>%
group_by(cat) %>%
group_map(.keep = TRUE,
~ ggplot(.x, aes("", pct, fill = action)) +
geom_bar(stat = "identity", position = "fill")+
coord_polar(theta = "y", start = 0) +
ggthemes::theme_solid() +
guides(fill = "none") +
theme(panel.background = element_rect(
fill = if(all(.x$cat == 'All')) '#FF000032' else NA))) %>%
patchwork::wrap_plots()
The other option, if for some reason you need to use facets, is some form of grob hacking like this:
p <- a %>%
ggplot(aes("", pct, fill = action)) +
geom_bar(stat = "identity", position = "fill") +
coord_polar(theta = "y", start = 0) +
facet_wrap(~cat) +
ggthemes::theme_solid() +
guides(fill = "none")
pg <- ggplotGrob(p)
new_background <- grid::rectGrob(gp = grid::gpar(fill = '#FF000032', col = NA))
panel1 <- pg$grobs[[which(pg$layout$name == 'panel-1-1')]]
panel1$children <- panel1$children
background <- grep('rect', sapply(panel1$children[[1]], names)$children)
panel1$children[[1]]$children[[background]] <- new_background
pg$grobs[[which(pg$layout$name == 'panel-1-1')]] <- panel1
grid::grid.newpage()
grid::grid.draw(pg)

How to add a condition statement in geom_text for ggplot?

I have a plot that looks at 2 quarters' worth of data. I also included a target value (dashed line) and a YTD section (which is the cumsum(count)).
I am having an issue trying to show the # in that section added for YTD but only for 1 of the quarters (since Q1 should already have a value inside the bar plot). Currently it is showing 0 and 2 in the plot below but I only want to show everything > Q1 values.
Current plot
I have tried with this current approach but does not seem to work:
geom_text(aes(label = ifelse((quarter_2022= "Q1"), total_attainment, ifelse(quarter_2022="Q2",total_attainment+2)),
position = position_stack(vjust = 1))) +
Plot Code
ggplot(df1, aes(x=quarter_2022, y=total_attainment)) +
geom_col(aes(y = YTD_TOTAL), fill = c("green1", "green2"), color = "black") +
geom_text(aes(y = YTD_TOTAL, label = scales::percent(YTD_PERCENT_ATTAINMENT)),
vjust = -0.5) +
geom_col(fill = "gray70", color = "gray20") +
geom_text(aes(label = YTD_TOTAL - total_attainment),
position = position_stack(vjust = 1.25))+
geom_text(aes(label = total_attainment),
position = position_stack(vjust = 0.5))+
geom_segment(aes(x = as.numeric(as.factor(quarter_2022)) - 0.4,
xend = as.numeric(as.factor(quarter_2022)) + 0.4,
y = attainment_target, yend = attainment_target),
linetype = "dashed") +
geom_text(aes(label = attainment_target),
position = position_stack(vjust = 4))
Here is the data:
structure(list(attainment_target = c(7.5, 15), quarter_2022 = c("Q1",
"Q2"), year = structure(c(1640995200, 1640995200), class = c("POSIXct",
"POSIXt"), tzone = ""), total_attainment = c(2, 4), percent_attainment_by_quarter = c(0.2666,
0.2666), ytd = c(2, 6), YTD_TOTAL = c(2, 6), YTD_PERCENT_ATTAINMENT = c(0.266666666666667,
0.4)), row.names = c(NA, -2L), class = c("tbl_df", "tbl", "data.frame"
))
Create a logical column in your dataset that indicates whether the label is non-zero. In the geom_text that creates the label, map the color aesthetic to that logical column. Then use scale_color_manual(values = c(NA, "black"), na.value = NA) so that FALSE (a zero label) gets no color and TRUE gets black, provided both values occur in the data.
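A minimal sketch of this approach, using only the columns of the question's data that matter here. The column names gap and show_label are made up for illustration, and I have named the entries of values so the mapping does not depend on the order of the levels. Treat it as a starting point rather than a drop-in replacement for the full plot code.

```r
library(ggplot2)

# Reduced copy of the question's data
df1 <- data.frame(
  quarter_2022 = c("Q1", "Q2"),
  total_attainment = c(2, 4),
  YTD_TOTAL = c(2, 6)
)

# gap and show_label are illustrative names, not from the original post
df1$gap <- df1$YTD_TOTAL - df1$total_attainment  # size of the YTD-only section
df1$show_label <- df1$gap != 0                   # flag: draw the label only when non-zero

p <- ggplot(df1, aes(x = quarter_2022)) +
  geom_col(aes(y = YTD_TOTAL), fill = "green2", color = "black") +
  geom_col(aes(y = total_attainment), fill = "gray70", color = "gray20") +
  # place the label in the middle of the YTD-only section
  geom_text(aes(y = total_attainment + gap / 2, label = gap, color = show_label),
            show.legend = FALSE) +
  # labels flagged FALSE get no color, so they are not visible
  scale_color_manual(values = c("TRUE" = "black", "FALSE" = NA), na.value = NA)
```

A simpler alternative, if you do not need the color scale for anything else, is to blank the text directly with label = ifelse(gap != 0, gap, "").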

How do I create a frequency stacked bar chart however have percentage labels on the bars and frequencies on the y axis, in R?

I started with the code below, however it is not showing the right output. I would just like a normal frequency stacked bar chart to show percentages on the bars but frequencies on the y axis. Could anyone offer any suggestions please?
ggplot(data = df, mapping = aes(x = Family_Size, y = Freq, fill = Survived)) + geom_bar(stat = "identity") +
geom_text(aes(label = paste0(df$Percentage),y=Percentage),size = 3) +
theme(plot.title = element_text(hjust = 0.5))
Survived Family_Size Frequency Percentage
Yes      1           20        20%
No       1           80        80%
Yes      2           40        40%
No       2           60        60%
Are you looking for something like this?
ggplot(df, aes(x = Family_Size, y = Frequency, fill = Survived))+
geom_col()+
scale_y_continuous(breaks = seq(0,100, by = 20))+
geom_text(aes(label = Percentage), position = position_stack(0.5))
EDIT: Formatting percentages with two decimals
ggplot(df, aes(x = Family_Size, y = Frequency, fill = Survived))+
geom_col()+
scale_y_continuous(breaks = seq(0,100, by = 20))+
geom_text(aes(label = paste(format(round(Frequency,2),nsmall = 2),"%")), position = position_stack(0.5))
Reproducible example
structure(list(Survived = c("Yes", "No", "Yes", "No"), Family_Size = c(1L,
1L, 2L, 2L), Frequency = c(20L, 80L, 40L, 60L), Percentage = c("20%",
"80%", "40%", "60%")), row.names = c(NA, -4L), class = c("data.table",
"data.frame"))
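Note that the EDIT above labels the bars with Frequency itself, which matches the percentage only because each family size happens to sum to 100 in this example. A hedged sketch that computes the share within each Family_Size instead is shown here. The column name pct is made up for illustration, and the data frame is a plain copy of the reproducible example without the Percentage column.

```r
library(ggplot2)

df <- data.frame(
  Survived = c("Yes", "No", "Yes", "No"),
  Family_Size = c(1L, 1L, 2L, 2L),
  Frequency = c(20L, 80L, 40L, 60L)
)

# pct (illustrative name): each row's share of its Family_Size total, in percent
group_total <- ave(as.numeric(df$Frequency), df$Family_Size, FUN = sum)
df$pct <- 100 * df$Frequency / group_total

p <- ggplot(df, aes(x = factor(Family_Size), y = Frequency, fill = Survived)) +
  geom_col() +                                        # y axis stays in frequencies
  geom_text(aes(label = sprintf("%.2f%%", pct)),      # labels show percentages
            position = position_stack(vjust = 0.5), size = 3) +
  labs(x = "Family_Size")
```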

Positioning labels and color coding in sunburst - R

This is the output. I have a data set which contains the unit, the weight of each unit, and the compliance score for each unit in the year 2016.
I was not able to add the table, but here is a screenshot of the data in the CSV.
I have named the columns in the data unit, weight and year (which is the compliance score).
I want to create a sunburst chart where the first ring shows the units divided by weight, and the second ring is the same but labelled with the compliance score.
The colour for each ring will be different.
I was able to write some code with the help of an online blog, and the output is similar to what I want, but I am having difficulty positioning the labels and colour coding each ring.
#using ggplot
library(ggplot2) # Visualisation
library(dplyr) # data wrangling
library(scales) # formatting
#read file
weight.eg = read.csv("Dummy Data.csv", header = FALSE, sep =
";",encoding = "UTF-8")
#change column names
colnames(weight.eg) <- c ("unit","weight","year")
#as weight column is factor change into integer
weight.eg$weight = as.numeric(levels(weight.eg$weight))[as.integer(weight.eg$weight)]
weight.eg$year = as.numeric(levels(weight.eg$year))[as.integer(weight.eg$year)]
#Nas are introduced, remove
weight.eg <- na.omit(weight.eg)
#Sum of the total weight
sum_total_weight = sum(weight.eg$weight)
#First layer
firstLevel = weight.eg %>% summarize(total_weight=sum(weight))
sunburst_0 = ggplot(firstLevel) # Just a foundation
#this will generate a bar chart
sunburst_1 =
sunburst_0 +
geom_bar(data=firstLevel, aes(x=1, y=total_weight),
fill='darkgrey', stat='identity') +
geom_text(aes(x=1, y=sum_total_weight/2, label=paste("Total Weight", comma(total_weight))), color='black')
#View
sunburst_1
#this argument is used to rotate the plot around the y-axis, which is the total weight
sunburst_1 + coord_polar(theta = "y")
sunburst_2=
sunburst_1 +
geom_bar(data=weight.eg,
aes(x=2, y=weight.eg$weight, fill=weight.eg$weight),
color='white', position='stack', stat='identity', size=0.6) +
geom_text(data=weight.eg, aes(label=paste(weight.eg$unit,
weight.eg$weight), x=2, y=weight.eg$weight), position='stack')
sunburst_2 + coord_polar(theta = "y")
sunburst_3 =
sunburst_2 +
geom_bar(data=weight.eg,
aes(x=3, y=weight.eg$weight,fill=weight.eg$weight),
color='white', position='stack', stat='identity',
size=0.6)+
geom_text(data = weight.eg,
aes(label=paste(weight.eg$year),x=3,y=weight.eg$weight),position =
'stack')
sunburst_3 + coord_polar(theta = "y")
sunburst_3 + scale_y_continuous(labels=comma) +
scale_fill_continuous(low='white', high='darkred') +
coord_polar('y') + theme_minimal()
Output for dput(weight.eg)
structure(list(unit = structure(2:7, .Label = c("", "A", "B",
"C", "D", "E", "F", "Unit"), class = "factor"), weight = c(30,
25, 10, 17, 5, 13), year = c(70, 80, 50, 30, 60, 40)), .Names =
c("unit",
"weight", "year"), row.names = 2:7, class = "data.frame", na.action
= structure(c(1L,
8L), .Names = c("1", "8"), class = "omit"))
output for dput(firstLevel)
structure(list(total_weight = 100), .Names = "total_weight", row.names
= c(NA,
-1L), na.action = structure(c(1L, 8L), .Names = c("1", "8"), class =
"omit"), class = "data.frame")
So I think I might have some sort of solution for you. I wasn't sure what you wanted to color-code on the outer ring; from your code it seems you wanted it to be the weight again, but it was not obvious to me. For different colour scales per ring, you could use the ggnewscale package:
library(ggnewscale)
For the centering of the labels you could write a function:
cs_fun <- function(x){(cumsum(x) + c(0, cumsum(head(x , -1))))/ 2}
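To see what this helper returns, here is a quick check on the weights from the question. This is only a sketch: the weights vector is copied from the dput output, and the midpoints assume the segments are stacked in the order given.

```r
# Same helper as above: midpoint of each stacked segment
cs_fun <- function(x) (cumsum(x) + c(0, cumsum(head(x, -1)))) / 2

weights <- c(30, 25, 10, 17, 5, 13)
mids <- cs_fun(weights)
print(mids)
```

Each value is the midpoint of one segment: the first segment runs from 0 to 30 (midpoint 15), the second from 30 to 55 (midpoint 42.5), and so on up to the last, which runs from 87 to 100 (midpoint 93.5).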
Now the plotting code could look something like this:
ggplot(weight.eg) +
# Note: geom_col is equivalent to geom_bar(stat = "identity")
geom_col(data = firstLevel,
aes(x = 1, y = total_weight)) +
geom_text(data = firstLevel,
aes(x = 1, y = total_weight / 2,
label = paste("Total Weight:", total_weight)),
colour = "black") +
geom_col(aes(x = 2,
y = weight, fill = weight),
colour = "white", size = 0.6) +
scale_fill_gradient(name = "Weight",
low = "white", high = "darkred") +
# Open up new fill scale for next ring
new_scale_fill() +
geom_text(aes(x = 2, y = cs_fun(weight),
label = paste(unit, weight))) +
geom_col(aes(x = 3, y = weight, fill = weight),
size = 0.6, colour = "white") +
scale_fill_gradient(name = "Another Weight?",
low = "forestgreen", high = "white") +
geom_text(aes(label = paste0(year), x = 3,
y = cs_fun(weight))) +
coord_polar(theta = "y")

Stacked Bar Graph Labels with ggplot2

I am trying to graph the following data:
to_graph <- structure(list(Teacher = c("BS", "BS", "FA"
), Level = structure(c(2L, 1L, 1L), .Label = c("BE", "AE", "ME",
"EE"), class = "factor"), Count = c(2L, 25L, 28L)), .Names = c("Teacher",
"Level", "Count"), row.names = c(NA, 3L), class = "data.frame")
and want to add labels in the middle of each piece of the bars that are the percentage for that piece. Based on this post, I came up with:
ggplot(data=to_graph, aes(x=Teacher, y=Count, fill=Level), ordered=TRUE) +
geom_bar(aes(fill = Level), position = 'fill') +
opts(axis.text.x=theme_text(angle=45)) +
scale_y_continuous("",formatter="percent") +
opts(title = "Score Distribution") +
scale_fill_manual(values = c("#FF0000", "#FFFF00","#00CC00", "#0000FF")) +
geom_text(aes(label = Count), size = 3, hjust = 0.5, vjust = 3, position = "stack")
But it:
1. Doesn't have any effect on the graph
2. Probably doesn't display the percentage if it did (although I'm not entirely sure of this point)
Any help is greatly appreciated. Thanks!
The y-coordinate of the text is the actual count (2, 25 or 28), whereas the y-coordinates in the plot panel range from 0 to 1, so the text is being printed off the top.
Calculate the fraction of counts using ddply (or tapply or whatever).
library(plyr)
graph_avgs <- ddply(
to_graph,
.(Teacher),
summarise,
Count.Fraction = Count / sum(Count)
)
to_graph <- cbind(to_graph, graph_avgs$Count.Fraction)
A simplified version of your plot. I haven't bothered to play about with factor orders so the numbers match up to the bars yet.
ggplot(to_graph, aes(Teacher), ordered = TRUE) +
geom_bar(aes(y = Count, fill = Level), stat = 'identity', position = 'fill') +
scale_fill_manual(values = c("#FF0000", "#FFFF00","#00CC00", "#0000FF")) +
geom_text(
aes(y = graph_avgs$Count.Fraction, label = graph_avgs$Count.Fraction),
size = 3
)
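With a current ggplot2 release, the same idea can be sketched without plyr. The following is an illustration under my own assumptions, not the original answer's method: the column name Fraction is made up, the fractions are computed with base R's ave(), and position_fill(vjust = 0.5) is used so that each label sits in the middle of its own segment.

```r
library(ggplot2)

to_graph <- data.frame(
  Teacher = c("BS", "BS", "FA"),
  Level = factor(c("AE", "BE", "BE"), levels = c("BE", "AE", "ME", "EE")),
  Count = c(2L, 25L, 28L)
)

# Fraction (illustrative name): each row's share of its teacher's total count
to_graph$Fraction <- ave(as.numeric(to_graph$Count), to_graph$Teacher,
                         FUN = function(x) x / sum(x))

p <- ggplot(to_graph, aes(x = Teacher, y = Count, fill = Level)) +
  geom_col(position = "fill") +                        # bars rescaled to 0..1
  geom_text(aes(label = sprintf("%.1f%%", 100 * Fraction)),
            position = position_fill(vjust = 0.5),     # centre labels in each segment
            size = 3) +
  scale_y_continuous("", labels = scales::percent) +
  scale_fill_manual(values = c("#FF0000", "#FFFF00", "#00CC00", "#0000FF")) +
  ggtitle("Score Distribution")
```

Since the labels share the same stacking position as the bars, they follow the segments even if the factor order changes.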
