stacked geom_bar issue with stacked bar and labels misplaced - r

I have this bar chart:
group = c("A","A","B","B")
value = c(25,-75,-40,-76)
day = c(1,2,1,2)
dat = data.frame(group = group , value = value, day = day)
ggplot(data = dat, aes(x = group, y = value, fill = factor(day))) +
geom_bar(stat = "identity", position = "identity")+
geom_text(aes(label = round(value,0)), color = "black", position = "stack")
and I'd like the bars stacked and the values to show up. When I run the code above, the -76 is not in the correct location (and neither is the -75, it seems).
Any idea how to get the numbers to appear in the correct location?

ggplot(data=dat, aes(x=group, y=value, fill=factor(day))) +
geom_bar(stat="identity", position="identity")+
geom_text(label =round(value,0),color = "black")+
scale_y_continuous(breaks=c(-80,-40,0))

Stacking a mix of negative and positive values is difficult for ggplot2. The easiest thing to do is to split the dataset into two, one for positives and one for negatives, and then add bar layers separately. A classic example is here.
You can do the same thing with the text, adding one text layer for the positive y values and one for the negatives.
dat1 = subset(dat, value >= 0)
dat2 = subset(dat, value < 0)
ggplot(mapping = aes(x = group, y = value, fill = factor(day))) +
geom_bar(data = dat1, stat = "identity", position = "stack")+
geom_bar(data = dat2, stat = "identity", position = "stack") +
geom_text(data = dat1, aes(label = round(value,0)), color = "black", position = "stack") +
geom_text(data = dat2, aes(label = round(value,0)), color = "black", position = "stack")
If you are using the current development version of ggplot2 (2.1.0.9000), stacking doesn't seem to work correctly in geom_text for negative values. You can always calculate the text positions "by hand" if you need to.
library(dplyr)
dat2 = dat2 %>%
group_by(group) %>%
mutate(pos = cumsum(value))
ggplot(mapping = aes(x = group, y = value, fill = factor(day))) +
geom_bar(data = dat1, stat = "identity", position = "stack")+
geom_bar(data = dat2, stat = "identity", position = "stack") +
geom_text(data = dat1, aes(label = round(value,0)), color = "black") +
geom_text(data = dat2, aes(label = round(value,0), y = pos), color = "black")
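The "by hand" idea can be taken one step further by computing the midpoint of every segment, stacking positive and negative values separately within each group. The following is only a sketch under assumptions that are not in the original answer: the column names top and mid are made up for illustration, rows are stacked in the order they appear in the data, and the bars are drawn with geom_tile() so that they sit exactly where the labels were computed.

```r
library(ggplot2)

dat <- data.frame(group = c("A", "A", "B", "B"),
                  value = c(25, -75, -40, -76),
                  day   = c(1, 2, 1, 2))

# Running total within each group, kept separate for positive and negative values.
# 'top' (hypothetical column name) is the edge of each segment farthest from zero.
dat$top <- ave(dat$value, dat$group, dat$value >= 0, FUN = cumsum)

# Midpoint of each segment: half a segment back from its far edge.
dat$mid <- dat$top - dat$value / 2

# Draw each segment as a tile centred on 'mid' and put the label at the same spot.
p <- ggplot(dat, aes(x = group, y = mid)) +
  geom_tile(aes(height = abs(value), fill = factor(day)), width = 0.9) +
  geom_text(aes(label = round(value, 0)), color = "black")
print(p)
```

Because bars and labels come from the same computed columns, they cannot drift apart, whatever stacking order position_stack() would have chosen.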

Related

Make a line separated by group in bar chart

I am trying to overlay two sets of data that will be used in bar charts. The first is the main set of data and I want that to be the main focus. For the second dataset I want just a line marking where on the chart it would be. I can get close to what I want by doing this:
Tbl = data.frame(Factor = c("a","b","c","d"),
Percent = c(43,77,37,55))
Tbl2 = data.frame(Factor = c("a","b","c","d"),
Percent = c(58,68,47,63))
ggplot(aes(x = Factor, y = Percent), data = Tbl) +
geom_bar(position = "stack", stat = "identity", fill = "blue") +
ylim(0,100) +
geom_bar(aes(x = Factor, y = Percent), data = Tbl2,
position = "stack", stat = "identity", fill = NA, color = "black") +
theme_bw()
What I have so far
I believe I can accomplish what I want by using geom_vline if there is a way to separate it by groups. Another option I came up with is if it is possible to change the colors of the "sides" of the bars in the overlay to white while keeping the "top" of each bar chart as black.
An idea of what I want (Edited in paint)
You could use geom_errorbar where the ymin and ymax are the same values like this:
library(ggplot2)
ggplot(aes(x = Factor, y = Percent), data = Tbl) +
geom_bar(position = "stack", stat = "identity", fill = "blue") +
ylim(0,100) +
geom_errorbar(aes(x = Factor, ymin = Percent, ymax = Percent), data = Tbl2,
stat = "identity", color = "black") +
theme_bw()
Created on 2022-12-28 with reprex v2.0.2
Another option is geom_point with shape = 95 (line) and the size adjusted to suit:
library(tidyverse)
Tbl = data.frame(Factor = c("a","b","c","d"),
Percent = c(43,77,37,55))
Tbl2 = data.frame(Factor = c("a","b","c","d"),
Percent = c(58,68,47,63))
ggplot(aes(x = Factor, y = Percent), data = Tbl) +
geom_bar(position = "stack", stat = "identity", fill = "blue") +
ylim(0,100) +
geom_point(aes(x = Factor, y = Percent), data = Tbl2,
position = "stack", stat = "identity", color = "black", shape = 95, size = 30) +
theme_bw()
Here is one more using geom_segment(). Some would say it is too fancy, but anyway:
For this we have to extend Tbl2 with the start and end x positions of each segment:
library(ggplot2)
library(dplyr)
ggplot(aes(x = Factor, y = Percent), data = Tbl) +
geom_bar(position = "stack", stat = "identity", fill = "blue") +
ylim(0,100) +
geom_segment(data = Tbl2 %>%
mutate(x = c(0.55, 1.55, 2.55, 3.55),
xend = x+0.9), aes(x=x,xend=xend, y = Percent, yend=Percent), size=2)+
theme_bw()
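The hard-coded x positions can also be derived from the data. A minimal sketch, assuming the default bar width of 0.9 and that Factor is plotted in alphabetical (factor level) order; bar_width and centre are names introduced here for illustration:

```r
library(ggplot2)

Tbl  <- data.frame(Factor = c("a", "b", "c", "d"),
                   Percent = c(43, 77, 37, 55))
Tbl2 <- data.frame(Factor = c("a", "b", "c", "d"),
                   Percent = c(58, 68, 47, 63))

bar_width <- 0.9                              # ggplot2's default bar width (assumed)
centre    <- as.numeric(factor(Tbl2$Factor))  # 1, 2, 3, 4 on the discrete axis
Tbl2$x    <- centre - bar_width / 2           # left edge of each bar
Tbl2$xend <- centre + bar_width / 2           # right edge of each bar

p <- ggplot(Tbl, aes(x = Factor, y = Percent)) +
  geom_col(fill = "blue") +
  ylim(0, 100) +
  geom_segment(data = Tbl2,
               aes(x = x, xend = xend, y = Percent, yend = Percent)) +
  theme_bw()
print(p)
```

This keeps the segments aligned with the bars if categories are added or removed, as long as the bar width stays at the assumed value.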

Mixed Variable Stacked Barplot

I would like to produce a GGPlot stacked barplot with variables on the horizontal axis, yet some variables have different responses.
Some variables are 'Y/N' responses. Some variables are 'Old/Young'. And some of the variables are a Likert scale of 0-5.
Therefore, I would like to plot these in a stacked barplot form, with each variable type encoded with a different colour palette, and with a legend reflecting the different palettes/variable types.
I wondered if someone could help with this, please? Would be greatly appreciated.
X1<-c("N","N","N","N","Y","N","Y","N","N","N","N","N","Y","N","N","Y","N","N","N","Y","N","Y","Y","N","N","Y","Y","Y","N","N","N","N","N","N","N","N","Y","N","Y","N","N","N","N","Y","N","N","Y","N","Y","Y","N","Y","N","N")
X2 <-c("N","N","N","N","Y","N","Y","N","N","N","N","N","Y","N","N","Y","N","N","N","Y","N","Y","Y","N","N","Y","Y","Y","N","N","N","N","N","N","N","N","Y","N","Y","N","N","N","N","Y","N","N","Y","N","Y","Y","N","Y","N","N")
X3<-c(1,1,0,1,2,0,0,0,0,0,1,1,1,2,0,1,2,1,1,0,0,0,4,1,0,0,0,0,1,0,2,0,0,2,1,1,0,0,0,1,1,0,1,0,1,0,1,1,0,1,0,1,1,1)
X4 <-c("YouNg","Old","Old","YouNg","Old","Old","Old","YouNg","YouNg","YouNg","Old","Old","Old",
"Old","Old","Old","Old","YouNg","Old","Old","Old","YouNg","YouNg","Old","Old","Old",
"Old","Old","Old","Old","Old","Old","Old","YouNg","Old","YouNg","Old","YouNg","Old",
"Old","YouNg","Old","YouNg","YouNg","Old","Old","Old","YouNg","Old","Old","Old","YouNg", "Old", "Old")
Y <- data.frame(X1, X2, X3, X4)
One option to achieve your desired result would be to use the ggnewscale package, which allows for multiple scales and legends for the same aesthetic:
library(ggplot2)
library(ggnewscale)
library(tidyr)
library(dplyr)
dat <- Y |>
mutate(across(everything(), as.character)) |>
pivot_longer(everything(), names_to = "var")
ggplot(dat, aes(y = var)) +
geom_bar(data = ~subset(.x, var %in% c("X1", "X2")), aes(fill = value), position = "fill") +
scale_fill_brewer(palette = "Accent", guide = guide_legend(order = 3))+
new_scale_fill() +
geom_bar(data = ~subset(.x, var %in% c("X3")), aes(fill = value), position = "fill") +
scale_fill_brewer(palette = "Dark2", guide = guide_legend(order = 2)) +
new_scale_fill() +
geom_bar(data = ~subset(.x, var %in% c("X4")), aes(fill = value), position = "fill") +
scale_fill_brewer(palette = "Paired", guide = guide_legend(order = 1))
EDIT: Depending on what you want to achieve, there are three arguments to consider:
Use position = position_fill(reverse = TRUE) in geom_bar to reverse the order of the stack.
Use direction = -1 in scale_fill_xxx to reverse the order of the fill colors.
Use reverse = TRUE in guide_legend to reverse the order in the legend.
The example code below uses all three for X3, i.e.
the bars run from 0 on the left to 4 on the right,
the fill colors are reversed so that 0 is now assigned "pink" and 4 the "green" color,
and finally the order in the legend is reversed.
ggplot(dat, aes(y = var)) +
geom_bar(data = ~subset(.x, var %in% c("X1", "X2")), aes(fill = value), position = "fill") +
scale_fill_brewer(palette = "Accent", guide = guide_legend(order = 3))+
new_scale_fill() +
geom_bar(data = ~subset(.x, var %in% c("X3")), aes(fill = value), position = position_fill(reverse = TRUE)) +
scale_fill_brewer(palette = "Dark2", guide = guide_legend(order = 2, reverse = TRUE), direction = -1) +
new_scale_fill() +
geom_bar(data = ~subset(.x, var %in% c("X4")), aes(fill = value), position = "fill") +
scale_fill_brewer(palette = "Paired", guide = guide_legend(order = 1))

add percentage to barplot with position = fill

I have created a stacked barplot with the corresponding labels inside the bars. However, I would like to change the plot so that position = "fill". This does something weird with my plot. Any ideas how to fix this?
library(dplyr)
library(ggplot2)
library(ggpubr) # provides theme_pubclean()
cat <- c(0,0,0,0,0,1,1,1,1,1)
value <- c(100,200,300,100,300,200,200,200,300,300)
N <- c(15,43,7,53,25,33,5,3,2,2)
year <- c(2014,2017,2018,2016,2015,2014,2016,2018,2017,2015)
exdata <- cbind(cat, value, N, year)
exdata <- as.data.frame(exdata)
exdata2 <- group_by(exdata, year) %>% mutate(pct = paste0(round(value/sum(value)*100, 2), " %"))
exdata2 <- as.data.frame(exdata2)
plot.exdata2 <- ggplot(exdata2, aes(year, value, fill = factor(cat))) +
geom_bar(stat = "identity", position = "stack") + # position = stack
geom_text(aes(label = pct), position = position_stack(vjust = 0.5), size = 3.2) + # refer to the column, not exdata2$pct
theme_pubclean()
desired format:
You can try geom_col instead, as it is the recommended geom when stat = "identity". In addition, you have to specify the "fill" position in geom_text as well:
ggplot(exdata2, aes(year, value, fill = factor(cat))) +
geom_col(position = "fill") +
geom_text(aes(label = pct), position = position_fill(vjust = 0.5))
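For reference, here is a self-contained sketch of how the labels and the filled bars fit together. It rebuilds the example data, computes each row's share within its year in base R, and lets position_fill() place the labels. The share column and the use of scales::percent() for formatting are choices made for this sketch, not part of the original answer:

```r
library(ggplot2)

exdata <- data.frame(
  cat   = c(0, 0, 0, 0, 0, 1, 1, 1, 1, 1),
  value = c(100, 200, 300, 100, 300, 200, 200, 200, 300, 300),
  year  = c(2014, 2017, 2018, 2016, 2015, 2014, 2016, 2018, 2017, 2015)
)

# Share of each row within its year; the shares of one year sum to 1.
exdata$share <- exdata$value / ave(exdata$value, exdata$year, FUN = sum)

p <- ggplot(exdata, aes(year, value, fill = factor(cat))) +
  geom_col(position = "fill") +
  # position_fill() rescales the label positions the same way as the bars
  geom_text(aes(label = scales::percent(share, accuracy = 0.1)),
            position = position_fill(vjust = 0.5))
print(p)
```

The key point is that the bar layer and the text layer use the same position adjustment; otherwise the labels are placed on the original value scale while the bars run from 0 to 1.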

ggplot2 weighted bar plot with position = "dodge"

I am trying to make a weighted dodged bar plot with ggplot2. With stacked bars the behavior is as expected:
df <- data.frame(group = rep(letters[1:3], each = 4),
sex = rep(c("m", "f"), times = 6),
weight = 1:12)
ggplot(df, aes(x = group, fill = sex, y = weight)) +
geom_bar(stat = "identity")
The bars have length equal to the total weight.
If I add position = "dodge", the female bar in group a has length 4 rather than the expected 6. Similarly, every other bar is only as long as the highest weight in its group and sex combination rather than representing the total weight.
ggplot(df, aes(x = group, fill = sex, y = weight)) +
geom_bar(stat = "identity", position = "dodge")
How do I make the bar lengths match the total weight?
You can first summarise the data in your desired way and then plot it:
library(dplyr)
library(ggplot2)
df %>%
group_by(group, sex) %>%
summarise(total_weight = sum(weight)) %>%
ggplot(aes(x = group, fill = sex, y = total_weight)) +
geom_bar(stat = "identity", position = "dodge")
The problem with your original approach is that you have several values of weight for each group and sex combination, so with stat = "identity" and position = "dodge" the corresponding bars are drawn on top of each other instead of being added up. This can be visualized:
ggplot(df, aes(x = group, fill = sex, y = weight)) +
geom_bar(stat = "identity", position = "dodge", color = "black", alpha = 0.5)
@kath's explanation is correct.
Another alternative, if you don't want to summarise the data frame before passing it to ggplot(): use the stat_summary() function instead of geom_bar():
ggplot(df, aes(x = group, fill = sex, y = weight)) +
stat_summary(geom = "bar", position = "dodge", fun = sum) # use fun.y = sum on ggplot2 versions before 3.3.0
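The totals can also be computed in base R if dplyr is not available. A small sketch, assuming the df from the question; totals is a name introduced here for illustration:

```r
library(ggplot2)

df <- data.frame(group  = rep(letters[1:3], each = 4),
                 sex    = rep(c("m", "f"), times = 6),
                 weight = 1:12)

# One row per group and sex combination, holding the summed weight.
totals <- aggregate(weight ~ group + sex, data = df, FUN = sum)

# With one value per bar, dodging shows the total rather than the maximum.
p <- ggplot(totals, aes(x = group, fill = sex, y = weight)) +
  geom_col(position = "dodge")
print(p)
```

For group a this gives 6 for f (2 + 4) and 4 for m (1 + 3), which matches the total the question expected for the female bar.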

ggplot2: show relative % in a stacked barplot per group

I'm trying to plot a basic bar chart per group.
As values are pretty big, I want to show for each bar (i.e. group) the % of each group within the bar.
I managed to show percentage of the total, but this is not what I'm expecting : in each bar, I would like that the sum of % equal 100%.
Is there an easy way to do it without changing the dataframe ?
(DF <- data.frame( year = rep(2015:2017, each = 4),
Grp = c("Grp1", "Grp2", "Grp3", "Grp4"),
Value = trunc(rnorm(12, 2000000, 100000))) )
ggplot(DF) +
geom_bar(aes(x = year, y = Value, fill = Grp),
stat = "identity",
position = position_stack()) +
geom_text(aes(x = year, y = Value, group = Grp,
label = percent(Value/sum(Value))) ,
position = position_stack(vjust = .5))
You can create a new variable holding the percentage within each year:
library(dplyr)
library(ggplot2)
library(scales)
DF <- DF %>% group_by(year) %>% mutate(ValuePer=(Value/sum(Value))) %>% ungroup()
ggplot(DF, aes(year, ValuePer, fill = Grp)) +
geom_bar(stat = "identity", position = "fill") +
geom_text(aes(label = percent(ValuePer)),
position = position_fill())+
scale_y_continuous(labels = percent_format())
Use position = "fill" to convert the y scale into proportions and scale_y_continuous(labels = percent_format()) to display that scale as percentages.
DF <- data.frame( year = rep(2015:2017, each = 4),
Grp = c("Grp1", "Grp2", "Grp3", "Grp4"),
Value = trunc(rnorm(12, 2000000, 100000)))
library(ggplot2)
library(scales)
ggplot(DF, aes(year, Value, fill = Grp)) +
geom_bar(stat = "identity", position = "fill") +
geom_text(aes(label = percent(Value / ave(Value, year, FUN = sum))), # share within each year, not of the grand total
position = position_fill()) +
scale_y_continuous(labels = percent_format())
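The denominator matters for the labels: sum(Value) over the whole data frame gives shares of the grand total, whereas dividing by the per-year sum gives shares that add up to 100% within each bar. A quick base R check, with a seed added here only to make the random data reproducible (the seed and the names overall and per_year are not in the original post):

```r
set.seed(42)  # arbitrary seed, assumed for reproducibility
DF <- data.frame(year  = rep(2015:2017, each = 4),
                 Grp   = c("Grp1", "Grp2", "Grp3", "Grp4"),
                 Value = trunc(rnorm(12, 2000000, 100000)))

# Share of the grand total: sums to 1 across all years together.
overall <- DF$Value / sum(DF$Value)

# Share within each year: ave() repeats the yearly sum on every row of that year.
per_year <- DF$Value / ave(DF$Value, DF$year, FUN = sum)

sum(overall)                    # one total over the whole data frame
tapply(per_year, DF$year, sum)  # one total per year
```

The same ave() expression can be used directly inside aes(), which avoids adding a column to the data frame.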
OK, gathering all your tricks, I finally got this:
I need to adjust my DF, which I wanted to avoid, but it remains simple, so it works.
library(dplyr)
library(ggplot2)
library(scales)
DF <- DF %>% group_by(year) %>% mutate(ValuePer=(Value/sum(Value))) %>% ungroup()
ggplot(DF, aes(year, Value, fill = Grp)) +
geom_bar(stat = "identity", position = "stack") +
geom_text(aes(label = percent(ValuePer)),
position = position_stack()) +
scale_y_continuous(labels = unit_format(unit = "M", scale = 1e-6)) # named arguments, since the positional order differs between scales versions
I would use a separate geom_text for each bar, filtering the data by year (bar) using dplyr. Check whether this is what you need:
(DF <- data.frame( year = rep(2015:2017, each = 4),
Grp = c("Grp1", "Grp2", "Grp3", "Grp4"),
Value = trunc(rnorm(12, 2000000, 100000))) )
library(dplyr)
ggplot(DF) +
geom_bar(aes(x = year, y = Value, fill = Grp),
stat = "identity",
position = position_stack()) +
geom_text(data = DF %>% filter(year == 2015),
aes(x = year, y = Value,
label = scales::percent(Value/sum(Value))) ,
position = position_stack(vjust = .5)) +
geom_text(data = DF %>% filter(year == 2016),
aes(x = year, y = Value,
label = scales::percent(Value/sum(Value))) ,
position = position_stack(vjust = .5)) +
geom_text(data = DF %>% filter(year == 2017),
aes(x = year, y = Value,
label = scales::percent(Value/sum(Value))) ,
position = position_stack(vjust = .5))
The group argument is not necessary here. There may be more elegant solutions, but this is the one I could think of. Tell me if this is the output you were expecting:
Another option may be to create a new column with the right computation. I could not figure out how to do the computation correctly inside aes(); the way you did it computes the overall %, whereas Value should be grouped by year instead.
At least this way you get the actual value on the y axis and the per-year % inside the bars. I would advise changing the axis labels by adding something like this:
scale_y_continuous(breaks = seq(0,8*10^6,10^6),
labels = c(0, paste(seq(1,8,1),'M')))
This results in:
You can adapt to your context.
