If I use ggplot2's stat_summary() to make a barplot of the average number of miles per gallon for 3-, 4-, and 5-geared cars, for example, how can I label each of the bars with the average value for mpg?
library(ggplot2)
CarPlot <- ggplot() +
  stat_summary(
    data = mtcars,
    aes(x = factor(gear), y = mpg, fill = factor(gear)),
    fun = "mean",  # this argument was named fun.y before ggplot2 3.3.0
    geom = "bar"
  )
CarPlot
I know that you can normally use geom_text(), but I'm having trouble figuring out what to do in order to get the average value from stat_summary().
You should use after_stat(y) inside aes() to get the mean computed by stat_summary(). (Before ggplot2 3.3.0 this was written ..y.., and the fun argument was called fun.y.)
library(ggplot2)
CarPlot <- ggplot(data = mtcars) +
  aes(x = factor(gear), y = mpg) +
  stat_summary(aes(fill = factor(gear)), fun = mean, geom = "bar") +
  stat_summary(aes(label = round(after_stat(y), 2)), fun = mean, geom = "text",
               size = 6, vjust = -0.5)
CarPlot
However, it is probably better to aggregate beforehand.
I'd simply precompute the statistics, and build the plot afterwards:
library(plyr)
library(ggplot2)
# Mean mpg per gear
dat = ddply(mtcars, .(gear), summarise, mean_mpg = mean(mpg))
dat = within(dat, {
  gear = factor(gear)
  mean_mpg_string = sprintf('%0.1f', mean_mpg)  # label text, one decimal place
})
ggplot(dat, aes(x = gear, y = mean_mpg)) +
  geom_bar(aes(fill = gear), stat = "identity") +
  geom_text(aes(label = mean_mpg_string), vjust = -0.5)
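To check that the two approaches agree, one can compare the values that stat_summary() computes with means aggregated beforehand. This is only a minimal sketch: it assumes ggplot2 >= 3.3.0 (for the fun argument) and uses base aggregate() rather than plyr, which is my substitution and not part of the answers above.

```r
library(ggplot2)

# Bars whose heights are computed inside the plot by stat_summary()
p <- ggplot(mtcars, aes(x = factor(gear), y = mpg)) +
  stat_summary(fun = mean, geom = "bar")

# layer_data() returns the data frame the layer actually draws;
# its y column holds the per-group means
computed <- layer_data(p, 1)

# The same means, aggregated beforehand with base R
precomputed <- aggregate(mpg ~ gear, data = mtcars, FUN = mean)

# Both are ordered by gear (3, 4, 5), so they can be compared directly
all.equal(computed$y, precomputed$mpg)
```

If the comparison holds, the choice between the two approaches is mostly about readability and about how much control you want over the label text.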
ggplot(data = results, aes(x = inst, y = value, group = inst)) +
geom_boxplot() +
facet_wrap(~color) +
#geom_line(data = mean,
#mapping = aes(x = inst, y = average, group = 1))
theme_bw()
When I run the code above with the geom_line() call commented out, it runs and gives the graph below, but I want a line joining the means of the boxplots within each facet, colored by that facet's color category. Any ideas how I can do that?
Your code is generally correct (though you'll want to add color = color to the aes() specification in geom_line()), so I suspect your mean dataset isn't set up correctly. Do you have means grouped by both your x axis and faceting variable? Using ggplot2::mpg as an example:
library(dplyr) # >= v1.1.0
library(ggplot2)
mean_dat <- summarize(mpg, average = mean(hwy), .by = c(cyl, drv))
ggplot(mpg, aes(factor(cyl), hwy)) +
geom_boxplot() +
geom_line(
data = mean_dat,
aes(y = average, group = 1, color = drv),
linewidth = 1.5,
show.legend = FALSE
) +
facet_wrap(~drv) +
theme_bw()
Alternatively, you could use stat = "summary" and not have to create a means dataframe at all:
ggplot(mpg, aes(factor(cyl), hwy)) +
  geom_boxplot() +
  geom_line(
    aes(group = 1, color = drv),
    stat = "summary",
    fun = mean,  # explicit; without it ggplot2 falls back to mean_se() with a message
    linewidth = 1.5,
    show.legend = FALSE
  ) +
  facet_wrap(~drv) +
  theme_bw()
# same result as above
Following this question: How to add a number of observations per group and use group mean in ggplot2 boxplot?, I also want to add the number of observations per group to a ggplot2 boxplot, but I have added a colour to the aes() mapping.
The existing answer shows how to adjust the text position along the y axis. How can I adjust the text position along the x axis?
This is a minimal example that reproduces my problem:
library(ggplot2)
give.n <- function(x) {
  # experiment with the multiplier to find the perfect position
  return(c(y = median(x) * 1.05, label = length(x)))
}
p <- ggplot(mtcars, aes(factor(vs), mpg, colour = factor(am))) +
  geom_boxplot() +
  # fun.y = median was redundant here: it is ignored when fun.data is supplied
  stat_summary(fun.data = give.n, geom = "text")
p
Thanks for any suggestions.
You can just use position:
p <- ggplot(mtcars, aes(factor(vs), mpg, colour = factor(am))) +
  geom_boxplot() +
  stat_summary(fun.data = give.n, geom = "text",
               position = position_dodge(width = 0.75))
p
The width argument of position_dodge() controls the positioning on the horizontal axis. 0.75 is the sweet spot; see how it works for a different number of groups:
p2 <- ggplot(mtcars, aes(factor(vs), mpg, colour = factor(cyl))) +
  geom_boxplot() +
  stat_summary(fun.data = give.n, geom = "text",
               position = position_dodge(width = 0.75))
p2
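To see how the width argument moves the labels, one can inspect the x positions that position_dodge() assigns to the text layer. This is a minimal sketch, assuming ggplot2 >= 3.3.0. The helper label_x() is hypothetical and introduced only for illustration, and give.n is rewritten here to return a data frame.

```r
library(ggplot2)

# Same idea as give.n above, returned as a data frame
give.n <- function(x) {
  data.frame(y = median(x) * 1.05, label = length(x))
}

base_plot <- ggplot(mtcars, aes(factor(vs), mpg, colour = factor(am))) +
  geom_boxplot()

# Hypothetical helper: x positions of the count labels for a given dodge width
label_x <- function(width) {
  p <- base_plot +
    stat_summary(fun.data = give.n, geom = "text",
                 position = position_dodge(width = width))
  as.numeric(layer_data(p, 2)$x)  # layer 2 is the text layer
}

label_x(0.75)  # labels spread further from the group centres
label_x(0.25)  # labels pulled in towards the group centres
```

A larger width pushes the labels of the same x group further apart, so the value that lines up with the boxes is the one matching the dodge width used by the boxplot layer.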
Instead of stat_summary, you can use geom_text. Please refer to the following question: ggplot2 add text on top of boxplots.
This is an example of how you may do it with the number of observations:
library(ggplot2)
# Create an aggregate of median & count
cts <- merge(aggregate(mpg ~ cyl + am, mtcars, length),
             aggregate(mpg ~ cyl + am, mtcars, median),
             by = c("cyl", "am"))
# Rename the col names to fit with the original dataset..
names(cts) <- c("cyl", "am", "count", "mpg")
# As alexwhan suggested, position_dodge helps with positioning
# along the x-axis..
ggplot(mtcars, aes(factor(cyl), mpg, colour = factor(am))) +
  geom_boxplot(position = position_dodge(width = 1.0)) +
  geom_text(data = cts, aes(label = count),
            position = position_dodge(width = 1.0))
I have a data set like this:
Station;Species;
CamA;SpeciesA
CamA;SpeciesB
CamB;SpeciesA
etc...
I would like to create a stacked barplot with the camera stations on the x axis and the percentage of each species stacked within each bar. I have tried the following code:
ggplot(data = data, aes(x = Station, y = Species, fill = Species)) +
  geom_col(position = "stack") +
  theme(axis.text.x = element_text(angle = 90)) +
  labs(x = "Cameras", y = NULL, fill = "Species")
And I end up with the following graph:
But clearly I don't have a percentage on the y axis, just the species name, which is, after all, what I coded for.
How could I have the percentages on the y axis, the cameras on the x axis and the species as a fill?
Thanks !
Using mtcars as an example dataset, one approach to get a barplot of percentages is to use geom_bar() with position = "fill".
library(ggplot2)
library(dplyr)
mtcars2 <- mtcars
mtcars2$cyl = factor(mtcars2$cyl)
mtcars2$gear = factor(mtcars2$gear)
# Use geom_bar with position = "fill"
ggplot(data = mtcars2, aes(x = cyl, fill = gear)) +
geom_bar(position = "fill") +
scale_y_continuous(labels = scales::percent_format()) +
theme(axis.text.x = element_text(angle = 90)) +
labs(x = "Cameras", y = NULL, fill = "Species")
A second approach would be to manually pre-compute the percentages and make use of geom_col with position="stack".
# Pre-compute percentages
mtcars2_sum <- mtcars2 %>%
count(cyl, gear) %>%
group_by(cyl) %>%
mutate(pct = n / sum(n))
ggplot(data = mtcars2_sum, aes(x = cyl, y = pct, fill = gear)) +
geom_col(position = "stack") +
scale_y_continuous(labels = scales::percent_format()) +
theme(axis.text.x = element_text(angle = 90)) +
labs(x = "Cameras", y = NULL, fill = "Species")
I am having some trouble with ggplot and stat_summary.
Please consider following data:
head(mtcars)
data<-mtcars
data$hp2<-mtcars$hp+50
Please consider following code:
ggplot(data, aes(x = cyl, y = hp)) +
  stat_summary(aes(group = 1), fun = mean, colour = "red", geom = "line") +
  stat_summary(aes(label = round(after_stat(y), digits = 0)), fun = mean,
               colour = "red", geom = "text", show.legend = FALSE, vjust = -0.7)
This code produces a line plot of the mean hp for each value of cyl, with text labels for the means as well. To add another line, we simply add another stat_summary() layer:
# hp2 only exists in data, not in mtcars, so the plot must be built from data
ggplot(data, aes(x = cyl, y = hp)) +
  stat_summary(aes(group = 1), fun = mean, colour = "red", geom = "line") +
  stat_summary(aes(label = round(after_stat(y), digits = 0)), fun = mean,
               colour = "red", geom = "text", show.legend = FALSE, vjust = -0.7) +
  stat_summary(aes(y = hp2, group = 1), fun = mean, colour = "blue", geom = "line")
Now comes the tricky part:
How can I use stat_summary() with geom = "text" for hp2, i.e. how do I make stat_summary() calculate the means of hp2 and print them as text labels? It seems that I can only use it for the "main" y.
This type of problem, which asks for graphs of several related columns, is almost always a wide-to-long reshaping problem.
library(ggplot2)
# data is the data frame from the question (mtcars plus the hp2 column)
data_long <- reshape2::melt(data[c('cyl', 'hp', 'hp2')], id.vars = 'cyl')
head(data_long)
ggplot(data_long, aes(x = cyl, y = value, colour = variable)) +
  stat_summary(fun = mean, geom = "line", show.legend = FALSE) +
  stat_summary(aes(label = round(after_stat(y), digits = 0)), fun = mean,
               geom = "text", show.legend = FALSE, vjust = -0.7) +
  scale_color_manual(values = c("red", "blue"))
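The same reshaping can also be sketched with tidyr::pivot_longer() in place of reshape2::melt(). This assumes tidyr is installed and ggplot2 is at least 3.3.0. The column names variable and value are chosen here only to match the melt() output above.

```r
library(ggplot2)
library(tidyr)

# Data from the question: mtcars plus a shifted copy of hp
data <- mtcars
data$hp2 <- mtcars$hp + 50

# Stack hp and hp2 into one 'value' column, labelled by 'variable'
data_long <- pivot_longer(
  data[c("cyl", "hp", "hp2")],
  cols = c(hp, hp2),
  names_to = "variable",
  values_to = "value"
)

# One line and one set of mean labels per variable
p_long <- ggplot(data_long, aes(x = cyl, y = value, colour = variable)) +
  stat_summary(fun = mean, geom = "line", show.legend = FALSE) +
  stat_summary(aes(label = round(after_stat(y), digits = 0)), fun = mean,
               geom = "text", show.legend = FALSE, vjust = -0.7) +
  scale_colour_manual(values = c(hp = "red", hp2 = "blue"))
p_long
```

Naming the colours by variable in scale_colour_manual() keeps the red/blue assignment independent of the order in which the levels appear.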