R ggplot geom_bar count number of values by groups - r

This code works without faceting.
I want to add counts on each bar and use facets in my plot, but that breaks it. I managed to get close to what I want, like this:
mtcars %>% group_by(gear, am, vs) %>% summarize(hp_sum = sum(hp), hp = hp) %>%
ggplot(aes(gear, hp_sum, fill = factor(am))) + facet_grid(.~vs) +
geom_bar(stat = 'identity', position = 'dodge', alpha = 0.5, size = 0.25) +
geom_text(aes(label=..count.., y = ..count..), stat='count', position = position_dodge(width = 0.95), size=4)
But I want the number on top of each bar. If I use y = hp_sum, I get this error:
Error: stat_count() can only have an x or y aesthetic.
Run `rlang::last_error()` to see where the error occurred.
I might have formatted the dataset the wrong way. Any ideas? Thanks!

I learned from this post that geom_text does not do counts by groups.
A solution is to do the summary beforehand:
mtcars %>% group_by(gear, am, vs) %>%
summarize(hp_sum = sum(hp), count = length(hp)) %>%
ggplot(aes(gear, hp_sum, fill = factor(am))) + facet_grid(.~vs) +
geom_bar(stat = 'identity', position = 'dodge', alpha = 0.5, size = 0.25) +
geom_text(aes(gear, hp_sum, label = count),
position = position_dodge(width = 0.95), size=4)
Be sure to group data the same way in the plot. Here x=gear, facet_grid(.~vs), fill = factor(am) are three factors putting y=hp into groups. So you should group this way: group_by(gear, am, vs). Hope this helps anyone who is struggling with this issue.
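A quick way to confirm the grouping matches the plot is to check that the summary has exactly one row per bar. This is only a minimal sketch: the name bars is mine, and .groups = "drop" assumes dplyr >= 1.0.0.

```r
library(dplyr)

# One row per bar: the grouping columns match x (gear), fill (am) and facet (vs).
bars <- mtcars %>%
  group_by(gear, am, vs) %>%
  summarize(hp_sum = sum(hp), count = n(), .groups = "drop")

# Sanity checks: no combination appears twice, and the counts cover every car.
stopifnot(!any(duplicated(bars[c("gear", "am", "vs")])))
stopifnot(sum(bars$count) == nrow(mtcars))
```

If the labels sit on the top edge of the bars rather than above them, adding vjust = -0.5 to geom_text should lift them slightly.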

Related

r/ggplot: compute bar plot share within group

I am using ggplot2 to make a bar plot that is grouped by one variable and reported in shares.
I would like the percentages to instead be a percentage of the grouping variable rather than a percentage of the whole data set.
For example,
library(ggplot2)
library(tidyverse)
ggplot(mtcars, aes(x = as.factor(cyl),
y = (..count..) / sum(..count..),
fill = as.factor(gear))) +
geom_bar(position = position_dodge(preserve = "single")) +
geom_text(aes(label = scales::percent((..count..)/sum(..count..)),
y= ((..count..)/sum(..count..))), stat="count") +
theme(legend.position = "none")
Produces this output:
I'd like the percentages (and bar heights) to reflect the "within cyl" proportion rather than share across the entire sample. Is this possible? Would this involve a stat argument?
As an aside, if it's possible to similarly position the geom_text call over the relevant bars, that would be ideal. Any guidance would be appreciated.
Here is one way :
library(dplyr)
library(ggplot2)
mtcars %>%
count(cyl, gear) %>%
group_by(cyl) %>%
mutate(prop = prop.table(n) * 100) %>%
ggplot() + aes(cyl, prop, fill = factor(gear),
label = paste0(round(prop, 2), '%')) +
geom_col(position = "dodge") +
geom_text(position = position_dodge(width = 2), vjust = -0.5, hjust = 0.5)
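To cross-check the within-cyl shares independently of the plot, base R can compute the same proportions from a contingency table. This is just a sketch; the names tab and within_cyl are mine.

```r
# Cross-tabulate cyl by gear, then convert each row (one cyl value) to shares.
tab <- table(cyl = mtcars$cyl, gear = mtcars$gear)
within_cyl <- prop.table(tab, margin = 1) * 100  # margin = 1: rows sum to 100

round(within_cyl, 1)

# Each cyl group should add up to 100 percent.
stopifnot(all(abs(rowSums(within_cyl) - 100) < 1e-8))
```

The margin = 1 argument is what makes the shares "within cyl"; dropping it gives shares of the whole data set, which is what the question's original plot shows.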

How to create scaled and faceted clustered bargraphs of a summarized dataframe in ggplot2?

I am trying to create a grid of bargraphs that show the average for different species. I am using the iris dataset for this question.
I summarised the data, melted it into long form, and tried to use facet_wrap.
iris %>%
group_by(Species) %>%
summarise(M.Sepal.Length=mean(Sepal.Length),
M.Sepal.Width=mean(Sepal.Width),
M.Petal.Length= mean(Petal.Length),
M.Petal.Width=mean(Petal.Width)) %>%
gather(key = Part, value = Value, M.Sepal.Length:M.Petal.Width) %>%
ggplot(., aes(Part, Value, group = Species, fill=Species)) +
geom_col(position = "dodge") +
facet_grid(cols=vars(Part)) +
facet_grid(cols = vars(Part))
However, the graph I am getting has x.axis labels that are strung across each facet grid. Additionally the clustered graphs are not centered within each facet box. Instead they appear at the location of their respective x-axis label. I'd like to get rid of the x-axis labels, center the graphs, and scale the graphs within each facet.
Here is an image of the resulting graph marked up with my expected output:
Perhaps this is what you're looking for?
The key changes are:
Remove Part as the variable mapped to x, that way the data is plotted in the same location in every facet
Switch to facet_wrap so you can use scales = "free_y"
Use labs to manually add the x title
Add theme to get rid of the x-axis ticks and tick labels.
library(ggplot2)
library(dplyr) # Version >= 1.0.0
library(tidyr) # for gather()
iris %>%
group_by(Species) %>%
summarise(across(1:4, mean, .names = "M.{col}")) %>%
gather(key = Part, value = Value, M.Sepal.Length:M.Petal.Width) %>%
ggplot(., aes(x = 1, y = Value, group = Species, fill=Species)) +
geom_col(position = "dodge") +
facet_wrap(.~Part, nrow = 1, scales = "free_y") +
labs(x = "Part") +
theme(axis.ticks.x = element_blank(),
axis.text.x = element_blank())
I also took the liberty of switching out your manual call to summarise with the new across functionality.
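As an aside, gather() is superseded in current tidyr, so the reshaping step could probably be written with pivot_longer() instead. A minimal sketch, assuming tidyr >= 1.0.0 and dplyr >= 1.0.0; the name iris_long is mine, and where(is.numeric) stands in for the 1:4 column positions:

```r
library(dplyr)
library(tidyr)

iris_long <- iris %>%
  group_by(Species) %>%
  # mean of every numeric column, renamed to M.<column>
  summarise(across(where(is.numeric), mean, .names = "M.{.col}")) %>%
  # one row per Species/Part pair, as gather() produced above
  pivot_longer(cols = starts_with("M."), names_to = "Part", values_to = "Value")
```

The result should have the same Species, Part and Value columns, so the ggplot call can stay as it is.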
Here's how you might also calculate error bars:
library(tidyr)
iris %>%
group_by(Species) %>%
summarise(across(1:4, list(M = mean, SE = ~ sd(.)/sqrt(length(.))),
.names = "{fn}_{col}")) %>%
pivot_longer(-Species, names_to = c(".value","Part"),
names_pattern = "([SEM]+)_(.+)") %>%
ggplot(., aes(x = 1, y = M, group = Species, fill=Species)) +
geom_col(position = "dodge") +
geom_errorbar(aes(ymin = M - SE, ymax = M + SE), width = 0.5,
position = position_dodge(0.9)) +
facet_wrap(.~Part, nrow = 1, scales = "free_y") +
labs(x = "Part", y = "Value") +
theme(axis.ticks.x = element_blank(),
axis.text.x = element_blank())

geom_bar not displaying mean values

I'm currently trying to plot mean values of a variable pt for each combination of species/treatments in my experiments. This is the code I'm using:
ggplot(data = data, aes(x=treat, y=pt, fill=species)) +
geom_bar(position = "dodge", stat="identity") +
labs(x = "Treatment",
y = "Proportion of Beetles on Treated Side",
colour = "Species") +
theme(legend.position = "right")
As you can see, the plot seems to assume the mean of my 5N and 95E treatments are 1.00, which isn't correct. I have no idea where the problem could be here.
Took a stab at what you are asking using tidyverse and ggplot2 which is in tidyverse.
dat %>%
group_by(treat, species) %>%
summarise(mean_pt = mean(pt)) %>%
ungroup() %>%
ggplot(aes(x = treat, y = mean_pt, fill = species, group = species)) +
geom_bar(position = "dodge", stat = "identity")+
labs(x = "Treatment",
y = "Proportion of Beetles on Treated Side",
fill = "Species") +
theme(legend.position = "right") +
geom_text(aes(label = round(mean_pt, 3)), size = 3, hjust = 0.5, vjust = 3, position = position_dodge(width = 1))
Here dat is the actual dataset, and I calculated mean_pt since that is what you are trying to plot. I also added a geom_text layer so you can see the resulting values and compare them with what you expected.
From my understanding, this won't plot the means of your y variable by default. Have you calculated the means for each treatment? If not, I'd recommend adding a column to your dataframe that contains the mean. I'm sure there's an easier way to do this, but try:
data$means <- rep(NA, nrow(data))
for (x in 1:nrow(data)) {
# assuming the "treat" column is column #1 in your data frame:
# average pt over all rows that share this row's treatment
data[x, "means"] <- mean(data$pt[data[, 1] == data[x, 1]])
}
Then try replacing
geom_bar(position = "dodge", stat="identity")
with
geom_col(position = "dodge")
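If your y variable already contains means (one row per treatment/species combination), switching geom_bar to geom_col as shown should work. Note that geom_col is shorthand for geom_bar(stat = "identity"), so neither one computes a mean. With several rows per group and position = "dodge", the bars are drawn on top of one another, so the bar height shows the largest value in the group rather than the average.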
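For comparison, here is a loop-free way to get one mean per treatment/species combination in base R. The data frame toy is a made-up stand-in, since the real data were not posted; only the column names treat, species and pt come from the question.

```r
# Hypothetical toy data: two replicate rows per treatment/species combination.
toy <- data.frame(
  treat   = c("5N", "5N", "5N", "5N"),
  species = c("A", "A", "B", "B"),
  pt      = c(0.2, 1.0, 0.4, 0.6)
)

# One row per treat/species pair, with pt replaced by its group mean.
means <- aggregate(pt ~ treat + species, data = toy, FUN = mean)
means
```

Plotting means with y = pt then draws a single bar per combination. In this toy example, plotting the raw rows would show species A at 1.0 (its largest value) instead of its mean.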

Label grouped bar plot in R

I'm trying to add labels to a grouped bar plot in R.
However, I'm using percentages on the y axis, and I want the labels to be counts.
I've tried to use the geom_text() function, but I don't know exactly which parameters I need to use.
newdf3 %>%
dplyr::count(key, value) %>%
dplyr::group_by(key) %>%
dplyr::mutate(p = n / sum(n)) %>%
ggplot() +
geom_bar(
mapping = aes(x = key, y = p, fill = value),
stat = "identity",
position = position_dodge()
) +
scale_y_continuous(labels = scales::percent_format(),limits=c(0,1))+
labs(x = "", y = "%",title="")+
scale_fill_manual(values = c('Before' = "deepskyblue", 'During' = "indianred1", 'After' = "green2", '?'= "mediumorchid3"),
drop = FALSE, name="")
Here is an example of what I need:
Here's a sample of the data I'm using:
key value
A Before
A After
A During
B Before
B Before
C After
D During
...
I also wanted to keep the bars with no value (label = 0).
Can someone help me with this?
Here is an MWE of how to add count labels to a simple bar chart. See below for the case when these are grouped.
library(datasets)
library(tidyverse)
data <- chickwts %>%
group_by(feed) %>%
count %>%
ungroup %>%
mutate(p = n / sum(n))
ggplot(data, aes(x = feed, y = p, fill = feed)) +
geom_bar(stat = "identity") +
geom_text(stat = "identity",
aes(label = n), vjust = -1)
You should be able to do the same thing on your data.
EDIT: StupidWolf points out in the comments that the original example has grouped data. Adding position = position_dodge(0.9) in geom_text deals with this.
Again, no access to the original data, but here's a different MWE using mtcars showing this:
library(datasets)
library(tidyverse)
data <- mtcars %>%
as_tibble %>%
transmute(gear = as_factor(gear),
carb = as_factor(carb),
cyl = cyl) %>%
group_by(gear, carb) %>%
count
ggplot(data, aes(x = gear, y = n, fill = carb)) +
geom_bar(stat = "identity",
position = "dodge") +
geom_text(aes(label = n),
stat = "identity",
vjust = -1,
position = position_dodge(0.9))
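The question also asks to keep the bars with no values (label = 0). One possible approach, sketched here under assumptions, is to fill in the missing key/value combinations with tidyr::complete() before plotting. The newdf3 below is a hypothetical stand-in built from the seven sample rows in the question, and counts is my own name.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in for newdf3, built from the sample rows in the question.
newdf3 <- data.frame(
  key   = c("A", "A", "A", "B", "B", "C", "D"),
  value = c("Before", "After", "During", "Before", "Before", "After", "During")
)

counts <- newdf3 %>%
  count(key, value) %>%
  complete(key, value, fill = list(n = 0L)) %>%  # add missing combinations as n = 0
  group_by(key) %>%
  mutate(p = n / sum(n)) %>%                     # share within each key
  ungroup()
```

Passing counts to the original ggplot call and adding something like geom_text(aes(x = key, y = p, label = n, group = value), position = position_dodge(0.9), vjust = -0.5) should then label every slot, including the empty ones with 0.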

Charts using ggplot() to apply geom_text() in R

While learning to plot charts in R, I am using the Australian AIDS Survival Data.
To show survival by gender, I plot two charts with this code:
data <- read.csv("https://raw.githubusercontent.com/vincentarelbundock/Rdatasets/master/csv/MASS/Aids2.csv")
ggplot(data) +
geom_bar(aes(sex, fill = as.factor(status)), position = "fill") +
scale_y_continuous(labels = scales::percent)
ggplot(data) +
geom_bar(aes(as.factor(status), fill = sex))
Here are the charts.
Now I want to add the values (numbers and percentages) into the bars body.
geom_text() should do this. I googled some references and tried different combinations of the geom_text aesthetics (x, y, label), but the labels are not shown properly.
Wrong code:
geom_text(aes(as.factor(status), y = sex, label = sex))
How can I do this?
I found it easiest to summarise the data outside of ggplot and then it became relatively simple.
library(tidyverse)
data2 <- data %>%
group_by(sex, status) %>%
summarise (n = n()) %>%
mutate(percent = n / sum(n) * 100)
ggplot(data2, aes(sex, percent, group = status)) +
geom_col(aes(fill = status)) +
geom_text(aes(label = round(percent,1)), position = position_stack(vjust = 0.5))
ggplot(data2, aes(status, n, group = sex)) +
geom_col(aes(fill = sex)) +
geom_text(aes(label = n), position = position_stack(vjust = 0.5))
