ggplot barplot mean values on graph - r

I want to create a barplot with 2 factors and 1 continuous variable for y.
My code is (it is based on the built-in dataset mtcars):
data(mtcars)
x=mtcars
library(ggplot2)
ggplot(x,aes(x=factor(carb), y=mpg, fill=factor(carb))) +
geom_bar(stat="summary",fun.y="mean") +
labs(title="Barplot of Average MPG per Carbon category per # of Cylinders", y="Mean MPG",x="Carbon Category") +
facet_grid(.~factor(cyl)) +
geom_text(aes(label=mpg),vjust=3)
My goal is to have (and show) the average MPG value per carbon category, per cylinder category. Is my code correct?
The main problem is, I just want the mean value shown on each bar, not all values for this combination of factor values.
For example:
subset(x,c(x$carb==3 & x$cyl==8)) returns 3 different values for MPG, and the graph shows all these three!

You can try
library(tidyverse)
mtcars %>%
group_by(carb, cyl) %>%
summarise(AverageMpg = mean(mpg)) %>%
ggplot(aes(factor(carb), AverageMpg, label=AverageMpg, fill=factor(carb))) +
geom_col() +
geom_text(nudge_y = 0.5) +
facet_grid(~cyl, scales = "free_x", space = "free_x")

If I understand correctly, I suppose this is what you're trying to achieve.
data(mtcars)
library(tidyverse)
mtcars %>%
group_by(carb, cyl) %>%
summarise(AverageMpg = mean(mpg)) %>%
ungroup() %>%
mutate(carb = factor(carb)) %>%
ggplot(mapping = aes(x=carb, y=AverageMpg, fill=carb)) +
geom_col() +
scale_y_continuous(name = "Mean MPG") +
scale_x_discrete("Carbon Category") +
labs(title="Barplot of Average MPG per Carbon category per # of Cylinders") +
facet_grid(.~cyl)
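Both answers above summarise the data before plotting. If you would rather keep the raw data, a possible alternative is to let ggplot2 compute the mean for the bars and for the labels. This is a minimal sketch, assuming ggplot2 3.3.0 or later (for the fun argument and after_stat()); rounding to one decimal and the axis labels are my choices, not part of the original posts.

```r
library(ggplot2)

# Bars and labels are both computed by stat_summary(), so each bar
# gets exactly one label: the mean mpg for that carb/cyl combination.
p <- ggplot(mtcars, aes(x = factor(carb), y = mpg, fill = factor(carb))) +
  stat_summary(fun = mean, geom = "bar") +
  stat_summary(
    aes(label = round(after_stat(y), 1)),  # label with the computed mean
    fun = mean, geom = "text", vjust = -0.5
  ) +
  labs(x = "carb", y = "Mean MPG", fill = "carb") +
  facet_grid(. ~ cyl)

p
```

Because the label is taken from the computed statistic rather than from the mpg column, the individual observations are never drawn as text.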

Related

How to connect means per group in ggplot?

I can do a scatterplot of two continuous variables like this:
mtcars %>%
ggplot(aes(x=mpg, y = disp)) + geom_point() +
geom_smooth(method="auto", se=TRUE, fullrange=FALSE, level=0.95)
I use cut to create 5 groups of mpg intervals for cars (any better command would do as well). I would like to see the intervals in the graph, because they are easy to understand.
mtcars %>%
mutate(mpg_groups = cut(mpg, 5)) %>%
group_by(mpg_groups) %>%
mutate(mean_disp = mean(disp)) %>%
ggplot(aes(x=mpg_groups, y = mean_disp)) + geom_point()
mpg_groups is a factor variable and can no longer be connected via geom_smooth().
# not working
mtcars %>%
mutate(mpg_groups = cut(mpg, 5)) %>%
group_by(mpg_groups) %>%
mutate(mean_disp = mean(disp)) %>%
ggplot(aes(x=mpg_groups, y = mean_disp)) + geom_point() +
geom_smooth(method="auto", se=TRUE, fullrange=FALSE, level=0.95)
What can I do with easy (tidyverse) code in order to create the mean values per group and connect them with a line?
As a more or less general rule, when drawing a line with ggplot2 you have to set the group aesthetic explicitly whenever the variable mapped to x isn't numeric. For example, use group = 1 to assign all observations to a single group (called 1 here for simplicity):
library(ggplot2)
library(dplyr, warn.conflicts = FALSE)
mtcars %>%
mutate(mpg_groups = cut(mpg, 5)) %>%
group_by(mpg_groups) %>%
mutate(mean_disp = mean(disp)) %>%
ggplot(aes(x = mpg_groups, y = mean_disp, group = 1)) +
geom_point() +
geom_smooth(method = "auto", se = TRUE, fullrange = FALSE, level = 0.95)
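If a plain line through the group means is enough, one possible variant is to reduce the data to one row per interval with summarise() and connect the points with geom_line(). This is only a sketch: the name means is hypothetical, and geom_line() is swapped in for geom_smooth(), so no smoothing or confidence band is drawn.

```r
library(dplyr, warn.conflicts = FALSE)
library(ggplot2)

# One row per mpg interval, holding the mean displacement of that interval
means <- mtcars %>%
  mutate(mpg_groups = cut(mpg, 5)) %>%
  group_by(mpg_groups) %>%
  summarise(mean_disp = mean(disp), .groups = "drop")

# group = 1 puts all intervals into a single group so the line can connect them
ggplot(means, aes(x = mpg_groups, y = mean_disp, group = 1)) +
  geom_point() +
  geom_line()
```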

How to exclude facet wrap/grid with less than n number of observations in R

I am using ggplot to create numerous dot charts (using geom_point and facet_wrap) for two variables with multiple categories. Several categories have only a few points, which I cannot use.
Is there a way to set a minimum number of observations for each facet (i.e. only show plots with 10 or more observations)?
by_AB <- group_by(df, A, B)
by_AB%>%
ggplot(aes(X,Y)) +
geom_point() +
facet_wrap(A~B, scales="free") +
geom_smooth(se = FALSE) +
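theme_bw()
It is best to remove the small groups from your data before plotting. This is easy with group_by() and filter(). The code below keeps groups with more than one observation; use n() >= 10 for the threshold in the question: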
df %>%
group_by(A, B) %>%
filter(n() > 1) %>%
ggplot(aes(X,Y)) +
geom_point() +
facet_wrap(A~B, scales="free") +
geom_smooth(se = FALSE) +
theme_bw()
Obviously, we don't have your data, so here is an example using the built-in mtcars data set. Suppose we want to plot mpg against wt, but facet by carb:
library(tidyverse)
mtcars %>%
ggplot(aes(wt, mpg)) +
geom_point() +
facet_wrap(.~carb)
Two of our facets look out of place because they only have a single point. We can simply filter these groups out en route to ggplot:
mtcars %>%
group_by(carb) %>%
filter(n() > 1) %>%
ggplot(aes(wt, mpg)) +
geom_point() +
facet_wrap(.~carb)
Created on 2022-08-25 with reprex v2.0.2
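The same idea can be sketched with an explicit threshold, closer to the "10 or more observations" asked for in the question. The names min_n and big_groups are hypothetical, and mtcars stands in for the asker's data.

```r
library(dplyr, warn.conflicts = FALSE)
library(ggplot2)

min_n <- 10  # hypothetical threshold: smallest group size worth a facet

# Keep only the carb groups that have at least min_n rows
big_groups <- mtcars %>%
  group_by(carb) %>%
  filter(n() >= min_n) %>%
  ungroup()

ggplot(big_groups, aes(wt, mpg)) +
  geom_point() +
  facet_wrap(. ~ carb)
```

In mtcars only carb 2 and carb 4 have ten cars each, so two facets remain.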

Create bar graph-find the average mpg by the number of gears

mtcars %>%
group_by(gear, mpg) %>%
summarise(m = mean(mpg)) %>%
ggplot(aes(x = mpg, y = gear)) +
geom_bar(stat = "count")
I cannot figure out how to create a bar graph with the average mpg by the number of gears.
Is that what you need?
packages
library(dplyr)
library(ggplot2)
Average mpg (m) by the number of gears
mtcars %>%
group_by(gear) %>%
summarise(m = mean(mpg)) %>%
ungroup() %>%
ggplot(aes(y = m, x = gear)) +
geom_bar(stat = "identity")
First, we get the mean of mpg by gear. To do that, group by gear only; there is no need to group by mpg as well.
Then ungroup, so you have a single ungrouped dataset.
Now plot the mean you created (m) against gear. You can choose which of them goes where; in this case, I put gear on the x-axis and the mean of mpg on the y-axis.
Because you already have the value to plot, there is nothing to count. Just plot the value itself: use stat = "identity" instead of stat = "count".
From here you can play with colors using the fill argument in aes() and change the title and axis labels.
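The steps above can also be sketched with geom_col(), which is ggplot2's shorthand for geom_bar(stat = "identity"). The name by_gear, the fill mapping and the axis labels are my additions for illustration; converting gear to a factor is a choice that makes the x-axis show only the three gear values.

```r
library(dplyr, warn.conflicts = FALSE)
library(ggplot2)

# One row per gear, with the mean mpg in column m
by_gear <- mtcars %>%
  group_by(gear) %>%
  summarise(m = mean(mpg), .groups = "drop")

# geom_col() draws each bar at the height given by y
ggplot(by_gear, aes(x = factor(gear), y = m, fill = factor(gear))) +
  geom_col() +
  labs(x = "Number of gears", y = "Mean mpg", fill = "Gears")
```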
In base R (i.e. without additional libraries) you might do
with(mtcars, tapply(mpg, gear, mean)) |>
barplot(xlab='gear', ylab='mean mpg', col=4, main='My plot')

How do I annotate a Mosaic Plot in R ggplot?

The code I have is as follows:
mtcars_tab <- mtcars %>% count(cyl, gear) # counts the number of gear / cylinder combinations
mtcars_tab %>%
ggplot() +
geom_mosaic(aes(x = product(gear), fill = cyl, weight = n), divider = mosaic("v")) +
xlab("Gear") +
ylab("cyl") +
ggtitle("Distribution of Gears and Cylinders in the mtcars data")
I would like to annotate the n here on the rectangles of the mosaic plot (preferably centered).

How to order facets by variable in ggplot2?

Suppose I have a graph like this:
library(tidyverse)
df <- mtcars %>%
group_by(cyl, gear) %>%
summarise(hp_mean = mean(hp))
ggplot(df, aes(x = gear, y = hp_mean)) +
geom_point(size = 2.12, colour = "black") +
theme_bw() +
facet_wrap(vars(cyl))
and would like to arrange the order of facets, according to the hp_mean value for gear=3. E.g. the facet with cyl=8 should be first as hp_mean for gear=3 is 194 which is the highest.
Any ideas?
All help is much appreciated!
Might not be the tidiest answer out there but you could:
extract the level of hp when gear == 3 to create a variable to order by (hp_gear3)
use forcats::fct_reorder() to reorder by the mean of this value across gear (from group_by() command)
use .desc = TRUE to put in descending order
plot using stat_summary to do the mean calculation for you
mtcars %>%
group_by(gear) %>%
mutate(hp_gear3 = ifelse(gear == 3, hp, NA),
cyl = fct_reorder(factor(cyl),
hp_gear3,
mean,
na.rm = TRUE,
.desc = TRUE)) %>%
ggplot(aes(gear, hp)) +
stat_summary(fun = mean) +
facet_wrap(~cyl)
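Another way to get the same ordering, sketched here under the assumption that every cyl value has a gear == 3 row (true for mtcars), is to read the facet order off the summarised data and set the factor levels explicitly. The name cyl_order is hypothetical.

```r
library(dplyr, warn.conflicts = FALSE)
library(ggplot2)

df <- mtcars %>%
  group_by(cyl, gear) %>%
  summarise(hp_mean = mean(hp), .groups = "drop")

# cyl values sorted by their hp_mean at gear == 3, highest first
cyl_order <- df %>%
  filter(gear == 3) %>%
  arrange(desc(hp_mean)) %>%
  pull(cyl)

# facet_wrap() lays out facets in factor-level order
df <- df %>% mutate(cyl = factor(cyl, levels = cyl_order))

ggplot(df, aes(x = gear, y = hp_mean)) +
  geom_point(size = 2.12, colour = "black") +
  theme_bw() +
  facet_wrap(vars(cyl))
```

If some cyl value had no gear == 3 row, it would be missing from cyl_order and become NA in the factor, so that case would need separate handling.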
