I'm trying to plot a basic stacked bar chart per group.
As the values are quite large, I want to show, for each bar, the percentage of each group within that bar.
I managed to show each value as a percentage of the grand total, but this is not what I want: within each bar, the percentages should sum to 100%.
Is there an easy way to do this without changing the data frame?
(DF <- data.frame( year = rep(2015:2017, each = 4),
Grp = c("Grp1", "Grp2", "Grp3", "Grp4"),
Value = trunc(rnorm(12, 2000000, 100000))) )
ggplot(DF) +
geom_bar(aes(x = year, y = Value, fill = Grp),
stat = "identity",
position = position_stack()) +
geom_text(aes(x = year, y = Value, group = Grp,
label = percent(Value/sum(Value))) ,
position = position_stack(vjust = .5))
You can create a new variable holding the percentage within each year:
library(dplyr)
library(ggplot2)
library(scales)
DF <- DF %>% group_by(year) %>% mutate(ValuePer=(Value/sum(Value))) %>% ungroup()
ggplot(DF, aes(year, ValuePer, fill = Grp)) +
geom_bar(stat = "identity", position = "fill") +
geom_text(aes(label = percent(ValuePer)),
position = position_fill())+
scale_y_continuous(labels = percent_format())
Use position = "fill" to rescale each bar to proportions, and scale_y_continuous(labels = percent_format()) to display that scale as percentages.
DF <- data.frame( year = rep(2015:2017, each = 4),
Grp = c("Grp1", "Grp2", "Grp3", "Grp4"),
Value = trunc(rnorm(12, 2000000, 100000)))
library(ggplot2)
library(scales)
ggplot(DF, aes(year, Value, fill = Grp)) +
geom_bar(stat = "identity", position = "fill") +
geom_text(aes(label = percent(Value / sum(Value))),
position = position_fill()) +
scale_y_continuous(labels = percent_format())
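If the data frame really has to stay untouched, one option is to compute the within-year share directly inside aes() with base R's ave(), since sum(Value) on its own is taken over the whole column. This is only a sketch under assumptions: the set.seed() call and the names share and p_share are my own additions, and the plot is drawn only when ggplot2 and scales are installed.

```r
# Same simulated data as in the question.
set.seed(1)  # assumption: seed added only to make the sketch reproducible
DF <- data.frame(year = rep(2015:2017, each = 4),
                 Grp = c("Grp1", "Grp2", "Grp3", "Grp4"),
                 Value = trunc(rnorm(12, 2000000, 100000)))

# ave() returns, for every row, the sum of Value within that row's year,
# so the ratio is the within-year share rather than the share of the grand total.
share <- with(DF, Value / ave(Value, year, FUN = sum))
tapply(share, DF$year, sum)  # each year should add up to 1

# The same expression can be used as the label aesthetic, leaving DF unchanged.
if (requireNamespace("ggplot2", quietly = TRUE) &&
    requireNamespace("scales", quietly = TRUE)) {
  library(ggplot2)
  p_share <- ggplot(DF, aes(year, Value, fill = Grp)) +
    geom_col(position = "fill") +
    geom_text(aes(label = scales::percent(Value / ave(Value, year, FUN = sum))),
              position = position_fill(vjust = 0.5)) +
    scale_y_continuous(labels = scales::percent_format())
  print(p_share)
}
```

Swapping position = "fill" / position_fill() for "stack" / position_stack() should keep the raw values on the y axis while the labels still show the per-year percentages.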
OK, putting all your suggestions together, I finally got this:
I had to adjust my DF, which I wanted to avoid, but the change is simple, so it works for me.
library(dplyr)
library(ggplot2)
library(scales)
DF <- DF %>% group_by(year) %>% mutate(ValuePer=(Value/sum(Value))) %>% ungroup()
ggplot(DF, aes(year, Value, fill = Grp)) +
geom_bar(stat = "identity", position = "stack") +
geom_text(aes(label = percent(ValuePer)),
position = position_stack()) +
scale_y_continuous(labels = unit_format(unit = "M", scale = 1e-6))
I would use a separate geom_text for each bar, filtering the data by year (bar) using dplyr. Check whether this is what you need:
(DF <- data.frame( year = rep(2015:2017, each = 4),
Grp = c("Grp1", "Grp2", "Grp3", "Grp4"),
Value = trunc(rnorm(12, 2000000, 100000))) )
library(dplyr)
library(ggplot2)
ggplot(DF) +
geom_bar(aes(x = year, y = Value, fill = Grp),
stat = "identity",
position = position_stack()) +
geom_text(data = DF %>% filter(year == 2015),
aes(x = year, y = Value,
label = scales::percent(Value/sum(Value))) ,
position = position_stack(vjust = .5)) +
geom_text(data = DF %>% filter(year == 2016),
aes(x = year, y = Value,
label = scales::percent(Value/sum(Value))) ,
position = position_stack(vjust = .5)) +
geom_text(data = DF %>% filter(year == 2017),
aes(x = year, y = Value,
label = scales::percent(Value/sum(Value))) ,
position = position_stack(vjust = .5))
The group argument is not necessary here. There may be more elegant solutions, but this is the one I could think of. Tell me if this is the output you were expecting:
Another option is to create a new column that holds the right computation. I could not figure out how to do the computation correctly inside aes(): the way you wrote it computes the percentage of the overall total, whereas Value should be grouped by year instead.
At least this way you get the actual values on the y axis and the per-year percentages inside the bars. I would also advise changing the axis labels to something like this:
scale_y_continuous(breaks = seq(0,8*10^6,10^6),
labels = c(0, paste(seq(1,8,1),'M')))
This results in:
You can adapt to your context.
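As an alternative to typing the labels out by hand, the axis labels can also come from a small function. The helper below is hypothetical (the name label_millions is mine, not from any package) and only sketches the idea of dividing by one million and appending "M".

```r
# Hypothetical helper (not from the answers above): format axis values in millions.
label_millions <- function(x) ifelse(x == 0, "0", paste(x / 1e6, "M"))

label_millions(c(0, 1e6, 2.5e6))

# Possible usage with the plot from the question:
# scale_y_continuous(breaks = seq(0, 8e6, 1e6), labels = label_millions)
```

Because the labels are derived from the breaks, the two cannot get out of sync if the break sequence is changed later.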
Related
How can I set the x-axis breaks so that every second factor level is dropped? Also, how can I make the ggplotly hover text show Year instead of factor(Year)?
data <- data.frame (Year = c("2017","2017","2017","2016","2016","2016","2015","2015","2015","2018" ,"2018" ,"2018"),
condition = c("normal","stress","Nitrogen" ,"normal","stress", "Nitrogen","normal","stress","Nitrogen","normal","stress","Nitrogen"),
value = c(22.221268, 1.598309 ,20.560815 ,17.337966,20.440174 , 9.074674, 11.739466, 1.905651, 32.270223, 14.271606 ,12.375446, 17.470793))
library(tidyverse)
data %>%
group_by(Year) %>%
mutate(value = value / sum(value)) %>%
ggplot(aes(fill=condition, y=value, x=factor(Year))) +
geom_col(position="fill", width = 1, color = "white") +
geom_text(aes(label = scales::percent(value, accuracy = 0.1)),
position = position_fill(vjust = 0.50),
color = "white") +
scale_y_continuous(labels = scales::percent) +
scale_fill_brewer(palette = "Set1")
How to show every other value on a discrete axis is a duplicate of this question. Using my answer from there, we can define the every_nth function. As for the factor(Year) tooltip label, the easiest way to avoid that is to convert the column to factor before plotting, so the aesthetic mapping is simply x = Year.
every_nth = function(n) {
return(function(x) {x[c(TRUE, rep(FALSE, n - 1))]})
}
data %>%
group_by(Year) %>%
mutate(
value = value / sum(value),
Year = factor(Year) ## put this in mutate() before plotting
) %>%
ggplot(aes(fill = condition, y = value, x = Year)) +
geom_col(position = "fill", width = 1, color = "white") +
geom_text(aes(label = scales::percent(value, accuracy = 0.1)),
position = position_fill(vjust = 0.50),
color = "white") +
scale_y_continuous(labels = scales::percent) +
scale_x_discrete(breaks = every_nth(2)) +
scale_fill_brewer(palette = "Set1") -> p
library(plotly)
ggplotly(p)
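To see what every_nth does on its own, here is a small base-R check. It relies on R recycling the logical index c(TRUE, FALSE, ...) across the whole vector of breaks; the example vectors are made up for illustration.

```r
# Same helper as above, repeated so the snippet runs on its own.
every_nth <- function(n) {
  function(x) x[c(TRUE, rep(FALSE, n - 1))]
}

years <- c("2015", "2016", "2017", "2018")
every_nth(2)(years)          # keeps the 1st and 3rd labels
every_nth(3)(letters[1:7])   # keeps positions 1, 4 and 7
```

scale_x_discrete(breaks = ...) accepts such a function and calls it with the axis limits, which is why the factor levels can be thinned without touching the data.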
I have a simple data frame which has the probabilities that an id is real and fake, respectively:
library(tidyverse)
dat <- data.frame(id = "999", real = 0.7, fake = 0.3)
I know that I can show this as a horizontal bar chart using the code below:
dat %>%
gather(key = grp, value = prob, -id) %>%
ggplot(aes(x = id, y = prob, fill = grp)) +
geom_bar(stat = "identity") +
coord_flip()
But I was wondering if there was a way to show this in the same way as shown below, with the class labels and probabilities on either end of the bar chart?
Many thanks
A straightforward, maybe somewhat cheeky, workaround is to redefine your 0.
I added a few calls that are not strictly necessary, but make it look closer to your example plot.
library(tidyverse)
dat <- data.frame(id = "999", real = -0.7, fake = 0.3) # note the minus sign!
dat %>%
gather(key = grp, value = prob, -id) %>%
ggplot(aes(x = id, y = prob, fill = grp)) +
geom_col(show.legend = FALSE) +
geom_text(aes(label = stringr::str_to_title(paste0(grp, " (", as.character(100*abs(prob)), "%)"))),
hjust = c(1,0))+
coord_flip(clip = "off") +
scale_fill_brewer(palette = "Greys") +
theme_void() +
theme(aspect.ratio = .1,
plot.margin = margin(r = 3, l = 3, unit = "lines"))
Created on 2021-02-06 by the reprex package (v0.3.0)
I'm not sure this fully answers the question but I think it will improve the plot, can you try it out?
dat %>%
gather(key = grp, value = prob, -id) %>%
ggplot(aes(x = id, y = prob, fill = grp)) +
geom_bar(stat = "identity", position = "fill") +
scale_y_continuous("Proportion") +
scale_x_discrete("", expand = c(0,0)) +
scale_fill_brewer(palette = "Greys") + # scale_fill_identity() needs colour names; "real"/"fake" are not
coord_flip()
I have created a stacked bar plot with the corresponding labels inside the bars. However, I would like to change the plot so that position = "fill". This does something weird to my plot. Any ideas how to fix this?
library(dplyr)
library(ggplot2)
library(ggpubr) # for theme_pubclean()
cat <- c(0,0,0,0,0,1,1,1,1,1)
value <- c(100,200,300,100,300,200,200,200,300,300)
N <- c(15,43,7,53,25,33,5,3,2,2)
year <- c(2014,2017,2018,2016,2015,2014,2016,2018,2017,2015)
exdata <- cbind(cat, value, N, year)
exdata <- as.data.frame(exdata)
exdata2 <- group_by(exdata, year) %>% mutate (pct = paste0((round(value/sum(value)*100, 2))," %"))
exdata2 <- as.data.frame(exdata2)
plot.exdata2 <- ggplot(exdata2, aes(year, value, fill = factor(cat))) +
geom_bar(stat = "identity", position = "stack") + # position = stack
geom_text(aes(label = pct), position = position_stack(vjust = 0.5), size = 3.2) +
theme_pubclean()
desired format:
You can try geom_col instead, as it is the recommended geom when stat = "identity". In addition, you have to specify the "fill" position in geom_text as well:
ggplot(exdata2, aes(year, value, fill = factor(cat))) +
geom_col(position = "fill") +
geom_text(aes(label = pct), position = position_fill(vjust = 0.5))
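A rough way to see why the text layer needs position_fill too: with position_stack the labels sit at heights in the hundreds, while the filled bars only span 0 to 1, so the y axis gets stretched. The sketch below uses a single bar with two made-up segment values and ignores the order in which ggplot2 actually stacks groups.

```r
# One bar (a single year) with two segments; values chosen for illustration.
value <- c(100, 200)

# Segment midpoints when stacking on the raw scale.
mid_stack <- cumsum(value) - value / 2

# Segment midpoints after rescaling the bar to a total height of 1.
mid_fill <- mid_stack / sum(value)

mid_stack  # positions in the original units
mid_fill   # positions between 0 and 1, matching the filled bars
```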
I'm trying to add labels to a grouped bar plot in R.
However, I'm using percentages on the y axis, and I want the labels to show counts.
I've tried to use the geom_text() function, but I don't know exactly which parameters I need to use.
newdf3 %>%
dplyr::count(key, value) %>%
dplyr::group_by(key) %>%
dplyr::mutate(p = n / sum(n)) %>%
ggplot() +
geom_bar(
mapping = aes(x = key, y = p, fill = value),
stat = "identity",
position = position_dodge()
) +
scale_y_continuous(labels = scales::percent_format(),limits=c(0,1))+
labs(x = "", y = "%",title="")+
scale_fill_manual(values = c('Before' = "deepskyblue", 'During' = "indianred1", 'After' = "green2", '?'= "mediumorchid3"),
drop = FALSE, name="")
Here is an example of what I need:
Here's a sample of the data I'm using:
key value
A Before
A After
A During
B Before
B Before
C After
D During
...
I also wanted to keep the bars with no value (label = 0).
Can someone help me with this?
Here is an MWE of how to add count labels to a simple bar chart. See below for the case when the bars are grouped.
library(datasets)
library(tidyverse)
data <- chickwts %>%
group_by(feed) %>%
count %>%
ungroup %>%
mutate(p = n / sum(n))
ggplot(data, aes(x = feed, y = p, fill = feed)) +
geom_bar(stat = "identity") +
geom_text(stat = "identity",
aes(label = n), vjust = -1)
You should be able to do the same thing on your data.
EDIT: StupidWolf points out in the comments that the original example has grouped data. Adding position = position_dodge(0.9) in geom_text deals with this.
Again, no access to the original data, but here's a different MWE using mtcars showing this:
library(datasets)
library(tidyverse)
data <- mtcars %>%
as_tibble %>%
transmute(gear = as_factor(gear),
carb = as_factor(carb),
cyl = cyl) %>%
group_by(gear, carb) %>%
count
ggplot(data, aes(x = gear, y = n, fill = carb)) +
geom_bar(stat = "identity",
position = "dodge") +
geom_text(aes(label = n),
stat = "identity",
vjust = -1,
position = position_dodge(0.9))
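The question also asked to keep combinations with no observations (label = 0). One possible approach, sketched here with base R's table(), is to count every key/value combination so that empty ones appear with n = 0. The data frame below is rebuilt from the sample rows in the question; the factor levels and the name counts are assumptions of mine, and the plot is drawn only if ggplot2 is installed.

```r
# Sample rows from the question; the factor levels are an assumption.
newdf3 <- data.frame(
  key   = c("A", "A", "A", "B", "B", "C", "D"),
  value = factor(c("Before", "After", "During", "Before", "Before", "After", "During"),
                 levels = c("Before", "During", "After"))
)

# table() counts every key/value combination, including those that never occur.
counts <- as.data.frame(table(key = newdf3$key, value = newdf3$value),
                        responseName = "n")

# Share of each value within its key, used for the bar heights.
counts$p <- counts$n / ave(counts$n, counts$key, FUN = sum)
counts

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  print(
    ggplot(counts, aes(x = key, y = p, fill = value)) +
      geom_col(position = position_dodge(0.9)) +
      geom_text(aes(label = n), vjust = -0.5, position = position_dodge(0.9)) +
      scale_y_continuous(labels = scales::percent_format(), limits = c(0, 1.1))
  )
}
```

Because the zero rows are present in the data, the dodged positions stay aligned and each empty slot gets a "0" label at the baseline.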
How do I draw the sum value of each class (in my case: a=450, b=150, c=290, d=90) above the stacked bar in ggplot2? Here is my code:
#Data
hp=read.csv(textConnection(
"class,year,amount
a,99,100
a,100,200
a,101,150
b,100,50
b,101,100
c,102,70
c,102,80
c,103,90
c,104,50
d,102,90"))
hp$year=as.factor(hp$year)
#Plotting
library(ggplot2)
p=ggplot(data=hp)
p+geom_bar(stat="identity")+
aes(x=reorder(class,-amount,sum),y=amount,label=amount,fill=year)+
theme()
You can do this by creating a dataset of per-class totals (this can be done multiple ways but I prefer dplyr):
library(dplyr)
totals <- hp %>%
group_by(class) %>%
summarize(total = sum(amount))
Then adding a geom_text layer to your plot, using totals as the dataset:
p + geom_bar(stat = "identity") +
aes(x = reorder(class, -amount, sum), y = amount, label = amount, fill = year) +
theme() +
geom_text(aes(class, total, label = total, fill = NULL), data = totals)
You can make the text higher or lower than the top of the bars using the vjust argument, or just by adding some value to total:
p + geom_bar(stat = "identity") +
aes(x = reorder(class, -amount, sum), y = amount, label = amount, fill = year) +
theme() +
geom_text(aes(class, total + 20, label = total, fill = NULL), data = totals)
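As noted, the per-class totals can be computed in several ways. For anyone who prefers not to load dplyr, here is a base-R sketch with aggregate(), using the amount column from the data in the question; it should give the a=450, b=150, c=290, d=90 totals mentioned there.

```r
# Data from the question, read from an inline string.
hp <- read.csv(text = "class,year,amount
a,99,100
a,100,200
a,101,150
b,100,50
b,101,100
c,102,70
c,102,80
c,103,90
c,104,50
d,102,90")

# Sum amount within each class, then rename the result column to `total`
# so it can be used by the geom_text layer in the same way.
totals <- aggregate(amount ~ class, data = hp, FUN = sum)
names(totals)[2] <- "total"
totals
```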
You can use the built-in summary functionality of ggplot2 directly:
ggplot(hp, aes(reorder(class, -amount, sum), amount, fill = year)) +
geom_col() +
geom_text(
aes(label = after_stat(y), group = class),
stat = 'summary', fun = sum, vjust = -1
)