Following the nice example here:
Create stacked barplot where each stack is scaled to sum to 100%
I generated a barplot for my data in which the results are presented as percent.
My data frame is:
small_df = data.frame(type = c('A','A','B','B'),
result = c('good','bad','good','bad'),
num_cases = c(21,72,87,2))
And my attempt to draw looks like so:
library(scales)
library(ggplot2)
ggplot(small_df,aes(x = type, y = num_cases, fill = result)) +
geom_bar(position = "fill",stat = "identity") +
scale_y_continuous(labels = percent, breaks = seq(0,1.1,by=0.1))
This all works fine and produces a figure like so:
However, I want the limits of the y axis to be 0-110% (I will later want to add a label on top, so I need the space). Changing the line to:
scale_y_continuous(labels = percent, breaks = seq(0,1.1,by=0.1), limits = c(0,1.1))
fails with the following error:
Error: missing value where TRUE/FALSE needed
Any idea how to solve this?
Many thanks!!!
You can change the out-of-bounds option via oob, or use coord_cartesian to set the limits. See info here
ggplot(small_df, aes(x = type, y = num_cases, fill = result)) +
geom_bar(position = "fill", stat = "identity") +
scale_y_continuous(labels = percent, breaks = seq(0, 1.1, by=0.1),
oob = rescale_none, limits = c(0, 1.1))
ggplot(small_df, aes(x = type, y = num_cases, fill = result)) +
geom_bar(position = "fill", stat = "identity") +
scale_y_continuous(labels = percent, breaks = seq(0, 1.1, by=0.1)) +
coord_cartesian(ylim = c(0, 1.1))
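A likely explanation for the original error, sketched with the scales helpers directly. This is my reading and not something stated above, so treat it as an assumption: the scale limits are applied to the raw num_cases values before position = "fill" rescales them to proportions, and the default out-of-bounds handler (scales::censor) turns everything outside the limits into NA.

```r
library(scales)

num_cases <- c(21, 72, 87, 2)   # raw counts from small_df
limits <- c(0, 1.1)             # limits requested in the question

# Default oob handler: values outside the limits are replaced by NA
censored <- censor(num_cases, range = limits)

# rescale_none hands the values back untouched
kept <- rescale_none(num_cases)

print(censored)
print(kept)
```

coord_cartesian avoids the problem in a different way: it zooms the view without touching the data that the scale sees.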
I am trying to create a bar graph that plots rank, where lower values are better. I want larger bars to correspond to smaller values, so the "best" groups in the data receive more visual weight.
Reprex:
dat = data.frame("Group" = c(rep("Best",50),
rep("Middle",50),
rep("Worst",50)
),
"Rank" = c(rnorm(n = 50, mean = 1.5, sd = 0.5),
rnorm(n = 50, mean = 2.5, sd = 0.5),
rnorm(n = 50, mean = 3.5, sd = 0.5)
)
)
library(dplyr)    # as_tibble(), group_by(), summarise() and %>%
library(ggplot2)
tibdat = as_tibble(dat) %>%
group_by(Group) %>%
summarise(Mean_Rank = mean(Rank, na.rm = TRUE))
# creates simple rightside up bar graph
ggplot(data = tibdat, mapping = aes(Group, Mean_Rank, fill = Group)) +
geom_col() +
scale_y_continuous(breaks = c(1:4), limits = c(1,4), oob = scales::squish)
# my attempt below, simply reversing the breaks and limits
ggplot(data = tibdat, mapping = aes(Group, Mean_Rank, fill = Group)) +
geom_col() +
scale_y_continuous(breaks = c(4:1), limits = c(4,1), oob = scales::squish)
The graphing code at the end does succeed in flipping the axis, but the data disappears (the bars are not plotted).
Note that I do not want the graphs to originate from the top, which scale_y_reverse can achieve. I want the bars to originate from the bottom, at the y = 4 line (or below).
How is this achieved?
Edit: Added image below to show the original bar graph that works but is wrong.
I just transformed the labels. I don't know if that's what you were looking for.
ggplot(data = tibdat, mapping = aes(Group, Mean_Rank, fill = Group)) +
geom_col() +
scale_y_continuous(breaks = c(1:4), limits = c(1,4), oob = scales::squish, labels = function(x) 5 - x)
With another trick in the aes argument, I think you can arrive at the wanted result. Maybe someone more experienced knows a cleaner way to do it.
ggplot(data = tibdat, mapping = aes(Group, 5 - Mean_Rank, fill = Group)) +
geom_col() +
scale_y_continuous(breaks = c(1:4), limits = c(1,4), oob = scales::squish, labels = function(x) 5 - x)
And here is the result:
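The arithmetic behind the trick, as a small sketch: the bars are drawn at 5 - Mean_Rank, and the label function applies 5 - x again, so each tick shows the original rank. The mean ranks below are made-up round numbers close to the reprex means, not the actual summarised values.

```r
# Hypothetical mean ranks, roughly what the reprex produces
mean_rank <- c(Best = 1.5, Middle = 2.5, Worst = 3.5)

flip <- function(x) 5 - x

bar_height <- flip(mean_rank)    # what aes(Group, 5 - Mean_Rank) plots
axis_label <- flip(bar_height)   # what labels = function(x) 5 - x prints

print(bar_height)
print(axis_label)
```

The constant 5 works because the ranks run from 1 to 4; for another range, max + min would play the same role.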
I use n.breaks to get a labeled x-axis mark for each cluster. This works well for 4, 5, or 6 clusters. Now I tried it with two clusters and it does not work anymore.
I build the graphs like this:
country_plot <- ggplot(Data) + aes(x = Cluster) +
theme(legend.title = element_blank(), axis.title.y = element_blank()) +
geom_bar(aes(fill = country), stat = "count", position = "fill", width = 0.85) +
scale_fill_manual(values = color_map_3, drop = F) +
scale_x_continuous(n.breaks = max(unique(Data$Cluster))) + scale_y_continuous(labels = percent) +
ggtitle("Country")
and export it like this:
ggsave("country_plot.png", plot = country_plot, device = "png", width = 16, height = 8, units = "cm")
When it works it looks something like this:
But with two clusters I get something like this, with only one mark, labeled 2.5, beyond the actual bars:
I manually checked the return value of
max(unique(Data$Cluster))
and it returns 2, which in my understanding should lead to two x-axis marks, 1 and 2, just as it works with more clusters.
edit:
mutate(country = factor(country, levels = 1:3)) %>%
mutate(country =fct_recode(country,!!!country_factor_naming))%>%
mutate(Gender = factor(Gender, levels = 1:2)) %>%
mutate(Gender = fct_recode(Gender, !!!gender_factor_naming))%>%
If I understand correctly, the issue is caused by Cluster being treated as a continuous variable. It needs to be turned into a factor.
Here is a minimal, reproducible example using the mtcars dataset that reproduces the unwanted behaviour:
First attempt (continuous x-axis)
library(ggplot2)
library(scales)
ggplot(mtcars) +
aes(x = gear, fill = factor(vs)) +
geom_bar(stat = "count", position = "fill", width = 0.85) +
scale_y_continuous(labels = percent)
In this example, gear takes over the role of Cluster and is assigned to the x-axis.
There are unwanted labeled tick marks at x = 2.5, 3.5, 4.5, 5.5 which are due to the continuous scale.
Second attempt (continuous x-axis with n.breaks given)
ggplot(mtcars) +
aes(x = gear, fill = factor(vs)) +
geom_bar(stat = "count", position = "fill", width = 0.85) +
scale_x_continuous(n.breaks = length(unique(mtcars$gear))) +
scale_y_continuous(labels = percent)
Specifying n.breaks in scale_x_continuous() does not change the x-axis to discrete.
Third attempt (discrete x-axis, gear as factor)
When gear is turned into a factor, we get a labeled tick mark for each factor value:
ggplot(mtcars) +
aes(x = factor(gear), fill = factor(vs)) +
geom_bar(stat = "count", position = "fill", width = 0.85) +
scale_y_continuous(labels = percent)
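For the two-cluster case from the question, here is a minimal sketch of what the factor conversion gives you, next to an alternative that keeps the axis continuous. The cluster vector is a hypothetical stand-in for Data$Cluster, and the explicit-breaks variant is my suggestion, not part of the attempts above.

```r
# Hypothetical stand-in for Data$Cluster with only two clusters
cluster <- c(1, 2, 2, 1, 2, 1, 1)

# One discrete level per cluster, which is what a factor x axis labels
cluster_levels <- levels(factor(cluster))

# Explicit break positions, one per observed cluster value
cluster_breaks <- sort(unique(cluster))

print(cluster_levels)
print(cluster_breaks)
```

If the x axis has to stay continuous, passing these values as scale_x_continuous(breaks = cluster_breaks) should give one tick per cluster, since n.breaks is only a hint to the break algorithm and not an exact count.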
I've been trying to plot manually labelled significance bars for a subset of groups on a ggplot2 barplot using ggsignif or ggpubr without much luck. The data is something like the following MWE:
set.seed(3)
## create data
df <- data.frame(activity = rep(c("Flying", "Jumping"), 3),
mean = rep(rnorm(6, 50, 25)),
group = c(rep("Ecuador", 2),
rep("Peru", 2),
rep("Brazil", 2)))
## plot it
ggplot(df, aes(x = activity, y = mean, fill = group)) +
geom_bar(position = position_dodge(0.9), stat = "identity",
width = 0.9, colour = "black", size = 0.1) +
xlab("Activity") + ylab("Mean")
I'd like to manually specify significance labels, say between Brazil/Ecuador on "Flying", and Ecuador/Peru on "Jumping". Does anyone know how to properly deal with this kind of data, for example with ggsignif? And is there a way to refer to each bar by name, rather than try to work out its x-axis position?
If you know on which barchart you want to add your significance labels, you can do:
library(ggsignif)
library(ggplot2)
ggplot(df, aes(x = activity, y = mean, fill = group)) +
geom_bar(position = position_dodge(0.9), stat = "identity",
width = 0.9, colour = "black", size = 0.1) +
xlab("Activity") + ylab("Mean")+
geom_signif(y_position = c(60,50), xmin = c(0.7,2), xmax = c(1,2.3),
annotation=c("**", "***"), tip_length=0)
Does this answer your question?
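On referring to bars by name: I am not aware of a built-in way, but the dodged positions can be computed. Below is a rough sketch with a hypothetical helper, dodge_x(), that is not part of ggsignif or ggplot2. It assumes the default alphabetical ordering of both variables, one bar per group in every activity, and the same dodge width as position_dodge(0.9).

```r
# Hypothetical helper: x position of a dodged bar, looked up by name.
# Each of the n groups gets a slot of dodge_width / n, centred on the
# integer position of its activity.
dodge_x <- function(activity, group, data, dodge_width = 0.9) {
  activities <- sort(unique(data$activity))
  groups <- sort(unique(data$group))
  n <- length(groups)
  i <- match(group, groups)
  match(activity, activities) + (i - (n + 1) / 2) * dodge_width / n
}

# Same layout as the MWE, without the random means
df <- data.frame(activity = rep(c("Flying", "Jumping"), 3),
                 group = rep(c("Ecuador", "Peru", "Brazil"), each = 2))

xmin <- c(dodge_x("Flying", "Brazil", df), dodge_x("Jumping", "Ecuador", df))
xmax <- c(dodge_x("Flying", "Ecuador", df), dodge_x("Jumping", "Peru", df))

print(xmin)
print(xmax)
```

These reproduce the hand-picked xmin and xmax values passed to geom_signif above, so the vectors can be supplied in their place.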
I'm struggling with the following issue:
I want to plot two histograms, but since one of the two classes has far fewer observations than the other, I need to add a second y-axis to allow a direct comparison of the values.
Below are the code I am using at the moment and the result.
Thank you in advance!
ggplot(data,aes(x= x ,group=class,fill=class)) + geom_histogram(position="identity",
alpha=0.5, bins = 20)+ theme_bw()
Consider the following situation where you have 800 versus 200 observations:
library(ggplot2)
df <- data.frame(
x = rnorm(1000, rep(c(1, 2), c(800, 200))),
class = rep(c("A", "B"), c(800, 200))
)
ggplot(df, aes(x, fill = class)) +
geom_histogram(bins = 20, position = "identity", alpha = 0.5,
# Note that y = stat(count) is the default behaviour
mapping = aes(y = stat(count)))
You could scale the counts for each group to a maximum of 1 by using y = stat(ncount):
ggplot(df, aes(x, fill = class)) +
geom_histogram(bins = 20, position = "identity", alpha = 0.5,
mapping = aes(y = stat(ncount)))
Alternatively, you can set y = stat(density) to have the total area integrate to 1.
ggplot(df, aes(x, fill = class)) +
geom_histogram(bins = 20, position = "identity", alpha = 0.5,
mapping = aes(y = stat(density)))
Note that since ggplot2 3.3.0, after_stat() is the preferred replacement for stat(), e.g. aes(y = after_stat(density)); stat() was deprecated in a later release.
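To make the two rescalings concrete, here is a base R sketch of how ncount and density relate to the raw counts. It uses hist() on a made-up sample, so the bins will not match those of geom_histogram; only the arithmetic is the point.

```r
set.seed(1)
x <- rnorm(200, mean = 2)   # hypothetical sample for one class

h <- hist(x, breaks = 20, plot = FALSE)
bin_width <- diff(h$breaks)[1]   # hist() uses equal-width bins here

# ncount: counts scaled so that the tallest bin equals 1
ncount <- h$counts / max(h$counts)

# density: counts scaled so that the bar areas sum to 1
density <- h$counts / (sum(h$counts) * bin_width)

print(max(ncount))
print(sum(density * bin_width))
```

Because each class is rescaled on its own, a class with 200 observations ends up on the same vertical scale as one with 800.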
How about comparing them side by side with facets?
ggplot(data,aes(x= x ,group=class,fill=class)) +
geom_histogram(position="identity",
alpha=0.5,
bins = 20) +
theme_bw() +
facet_wrap(~class, scales = "free_y")
I am designing a bar plot with ggplot2 package.
The only problem is that I cannot get rid of the space between the bars and the x-axis.
I know that this function call should resolve the problem:
scale_y_continuous(expand = c(0, 0))
But it seems that the error bar layer is overriding it, and the space always remains.
Here is my code:
p<-ggplot(data=tableaumergectrlmut, aes(x=ID, y=meanNSAFbait, fill=Condition)) +
geom_bar(stat="identity", position=position_dodge())+
scale_y_continuous(expand = c(0,0))+
geom_errorbar(aes(ymin=meanNSAFbait-SDNSAFbait,
ymax=meanNSAFbait+SDNSAFbait, width=0.25), position=position_dodge(.9))
Using some example data to generate a plot that (I think) shows the problem you're having.
library(ggplot2)
df <- data.frame(val = c(10, 20, 100, 5), name = LETTERS[1:4])
ggplot(df, aes(x = name, y = val, fill = name)) +
geom_bar(stat = "identity")
There is a gap between the zero point on the y axis (the bottom of the bars) and the x axis labels.
You can remove this by setting expand to c(0,0) on the y scale. Use scale_y_continuous or scale_y_discrete depending on the nature of your data; val is numeric here, so scale_y_continuous applies:
ggplot(df, aes(x = name, y = val, fill = name)) +
geom_bar(stat = "identity") +
scale_y_continuous(expand = c(0,0)) +
scale_x_discrete(expand = c(0,0))
This gives the plot:
Note that I've also removed the gap between the outer bars and the left and right edges of the plot; simply remove the scale_x_discrete line to add this gap back in.
Since error bars are an issue, here are a few examples:
ggplot(df, aes(x = name, y = val, fill = name)) +
geom_bar(stat = "identity") +
geom_errorbar(aes(ymin = val - 10,
ymax = val + 10))
You can use scale_y_continuous to remove the padding down to the lowest error bar:
ggplot(df, aes(x = name, y = val, fill = name)) +
geom_bar(stat = "identity") +
geom_errorbar(aes(ymin = val - 10,
ymax = val + 10)) +
scale_y_continuous(expand = c(0,0))
Or you can use coord_cartesian to give a hard cutoff:
ggplot(df, aes(x = name, y = val, fill = name)) +
geom_bar(stat = "identity") +
geom_errorbar(aes(ymin = val - 10,
ymax = val + 10)) +
scale_y_continuous(expand = c(0,0)) +
coord_cartesian(ylim = c(0, max(df$val) + 10))
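Why expand = c(0,0) alone can still leave a gap under the bars: a rough sketch, assuming the y scale is trained on everything the layers draw, including the error bar ends. With the example data, one error bar dips below zero, so the axis starts there and not at the bar bases.

```r
df <- data.frame(val = c(10, 20, 100, 5), name = LETTERS[1:4])

# Everything drawn on the y axis: bar bases at 0 plus both error bar ends
y_range <- range(c(0, df$val - 10, df$val + 10))
print(y_range)

# Limits that start at the bar bases and still show the tallest error bar
ylim_zoom <- c(0, max(df$val + 10))
print(ylim_zoom)
```

Passing ylim_zoom to coord_cartesian(ylim = ...) cuts the view at zero, at the cost of clipping the part of the error bar that falls below it.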