How to print Frequencies on top of Histogram bars in ggplot - r

I would like to know if it is possible to display the frequencies at the top of each counting bar in a ggplot histogram.
This is the code I have got so far:
br <- seq(0, 178, 10)
ggplot(dfAllCounts, aes(x=months)) + geom_histogram(aes(months), bins = 30, fill="#d2aa47", color = '#163B8B', size = .8, alpha = 0.3) +
scale_x_continuous(breaks = br)
I would like to display the count for each bin on top of its bar. Thanks.

Instead of the geom_histogram wrapper, use the underlying stat_bin function, which accepts geom="text". Combine it with after_stat(count) to label each bar of the histogram with its count.
library(ggplot2)
ggplot(mpg,aes(x=displ)) +
stat_bin(binwidth=1) +  # bars: stat_bin draws with geom="bar" by default
stat_bin(binwidth=1, geom="text", aes(label=after_stat(count)), vjust=0)  # counts as text, same bins
Modified from https://stackoverflow.com/a/24199013/10276092
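Applied to the plot in the question, the same idea might look like the sketch below. The data frame here is a hypothetical stand-in, since the real dfAllCounts is not shown, and the bin edges are extended to 180 (an assumption) so that every value falls inside a bin. Passing the same breaks to both layers keeps the labels aligned with the bars.

```r
library(ggplot2)

# Hypothetical stand-in for the asker's dfAllCounts (the real data is not shown)
set.seed(1)
dfAllCounts <- data.frame(months = sample(0:177, 200, replace = TRUE))

# Bin edges, reused as axis breaks (assumption: extended to 180 to cover all values)
br <- seq(0, 180, 10)

p <- ggplot(dfAllCounts, aes(x = months)) +
  # Bars, binned on the explicit edges
  geom_histogram(breaks = br, fill = "#d2aa47", color = "#163B8B", alpha = 0.3) +
  # Text layer using the same bins; the label is the computed count
  stat_bin(breaks = br, geom = "text",
           aes(label = after_stat(count)), vjust = -0.5) +
  scale_x_continuous(breaks = br)
```

Printing p draws the histogram with one count above each bar. A negative vjust lifts the text slightly above the bar top.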

Related

Legend title gets cut off

I am plotting the cumulative distribution functions of seven distributions, but the legend is not shown in full. The plot looks like this:
The legend title "Distribution" is cut off. Can anyone tell me how to solve this?
The code looks like this:
ggplot(df, aes(x, colour = Distribution)) + stat_ecdf() + scale_color_discrete(labels = c("N(0, 2)", 'Laplace(0, 1)',
TeX("$0.85N(0, 0.2) + 0.15N(0, 20)$"),
'Gamma(0.05, 0.1)', 'Exp(0.9)', TeX("$F_{3, 8}$"), TeX("$P(mu_i=0) = 0.95, P(mu_i=8) = 0.05$"))) +
xlim(-3, 3) +
xlab("Value") +
ylab("Cumulative Density Function") +
theme(legend.text.align = 0, legend.position="bottom", legend.text=element_text(size=10))
Also, I do not want to change the position of the legend, because the plot becomes too narrow if the legend is placed on the left or right side.

Case dependent scaling of plot size in ggplot loop

I am running several ggplot barplots in a loop, with text added on top of each bar. I have defined the plot scale via coord_fixed and expand_limits. Unfortunately, the y-axis differs from plot to plot, so one set of scale settings does not fit all cases: the text gets cut off and/or the axes get compressed. Let me illustrate:
period <- c(rep("A",4),rep("B",4))
group <- rep(c("C","C","D","D"),2)
size <- rep(c("E","F"),4)
value <- c(23,29,77,62,18,30,54,81)
df <- data.frame(period,group,size,value)
library(ggplot2)
for (i in unique(df$group))  # unique() works whether group is a factor or a character column
{
p <- ggplot(subset(df, group==i), aes(x=size, y=value, fill = period)) +
geom_bar(position="dodge", stat="identity", show.legend=F) +
geom_text(data=subset(df, group==i), aes(x=size, y=value,label=value),
size=10, fontface="bold", position = position_dodge(width=1),vjust = -0.5) +
expand_limits(y = max(df$value)*0.6) +
coord_fixed(ratio = 0.01)
ggsave(paste0("yourfilepath",i,".png"), width=7.72, height=4.5, units="in", p)
}
I would like the coord_fixed and expand_limits settings to depend on the values in each plot. I have experimented with using e.g. expand_limits(y = max(df$value * ifelse(df$value <= 50, 0.6, 1))), but that doesn't work in the way I had hoped. Any suggestions will be greatly appreciated!
Based on @Z.Lin's comment, I restricted the calculation to the current group by using df$value[df$group==i] in my ifelse call: expand_limits(y = max(df$value[df$group==i] * ifelse(df$value[df$group==i] <= 50, 5, 8))).
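The per-group calculation can be pulled out of the loop and checked on its own. This is a minimal sketch: group_limit is a hypothetical helper name, and the multipliers 5 and 8 are the ones quoted above.

```r
period <- c(rep("A", 4), rep("B", 4))
group  <- rep(c("C", "C", "D", "D"), 2)
size   <- rep(c("E", "F"), 4)
value  <- c(23, 29, 77, 62, 18, 30, 54, 81)
df <- data.frame(period, group, size, value)

# Hypothetical helper: y limit computed from the values of one group only
group_limit <- function(data, g) {
  v <- data$value[data$group == g]
  max(v * ifelse(v <= 50, 5, 8))
}

# One limit per group, named by group
limits <- sapply(unique(df$group), function(g) group_limit(df, g))
```

Inside the loop, the matching entry could then be passed as expand_limits(y = limits[[i]]), so each plot gets a limit based on its own subset of the data.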

How to set automatic label position based on box height

In a previous question, I asked about moving the label of a barplot outside the bar if the bar was too small. I was given the following example:
library(ggplot2)
options(scipen=2)
dataset <- data.frame(Riserva_Riv_Fine_Periodo = 1:10 * 10^6 + 1,
Anno = 1:10)
ggplot(data = dataset,
aes(x = Anno,
y = Riserva_Riv_Fine_Periodo)) +
geom_bar(stat = "identity",
width=0.8,
position="dodge") +
geom_text(aes( y = Riserva_Riv_Fine_Periodo,
label = round(Riserva_Riv_Fine_Periodo, 0),
angle=90,
hjust= ifelse(Riserva_Riv_Fine_Periodo < 3000000, -0.1, 1.2)),
col="red",
size=4,
position = position_dodge(0.9))
And I obtain this graph:
The problem with the example is that the value at which the label is moved must be hard-coded into the plot, and an ifelse statement is used to reposition the label. Is there a way to automatically extract the value to cut?
A slightly better option might be to base the test and the positioning of the labels on the height of the bar relative to the height of the highest bar. That way, the cutoff value and label-shift are scaled to the actual vertical range of the plot. For example:
ydiff = max(dataset$Riserva_Riv_Fine_Periodo)
ggplot(dataset, aes(x = Anno, y = Riserva_Riv_Fine_Periodo)) +
geom_bar(stat = "identity", width=0.8) +
geom_text(aes(label = round(Riserva_Riv_Fine_Periodo, 0), angle=90,
y = ifelse(Riserva_Riv_Fine_Periodo < 0.3*ydiff,
Riserva_Riv_Fine_Periodo + 0.1*ydiff,
Riserva_Riv_Fine_Periodo - 0.1*ydiff)),
col="red", size=4)
You would still need to tweak the fractional cutoff in the test condition (I've used 0.3 in this case), depending on the physical size at which you render the plot. But you could package the code into a function to make any manual adjustments a bit easier.
It's probably possible to automate this by determining the actual sizes of the various grobs that make up the plot and setting the condition and the positioning based on those sizes, but I'm not sure how to do that.
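Packaging the relative-height approach into a function, as suggested above, might look like the following sketch. The function name label_bar_plot and its arguments cutoff and shift are hypothetical, and both fractions are measured against the tallest bar.

```r
library(ggplot2)

# Hypothetical helper: cutoff and shift are fractions of the tallest bar
label_bar_plot <- function(data, x, y, cutoff = 0.3, shift = 0.1) {
  ymax <- max(data[[y]])
  # Short bars get the label above the bar top, tall bars get it inside
  data$label_y <- ifelse(data[[y]] < cutoff * ymax,
                         data[[y]] + shift * ymax,
                         data[[y]] - shift * ymax)
  ggplot(data, aes(x = .data[[x]], y = .data[[y]])) +
    geom_col(width = 0.8) +
    geom_text(aes(y = label_y, label = round(.data[[y]], 0)),
              angle = 90, colour = "red", size = 4)
}

dataset <- data.frame(Riserva_Riv_Fine_Periodo = 1:10 * 10^6 + 1,
                      Anno = 1:10)
p <- label_bar_plot(dataset, "Anno", "Riserva_Riv_Fine_Periodo")
```

Adjusting the plot for a different output size then only means changing cutoff or shift in the call, for example label_bar_plot(dataset, "Anno", "Riserva_Riv_Fine_Periodo", cutoff = 0.2).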
Just as an editorial comment, a plot with labels inside some bars and above others risks confusing the visual mapping of magnitudes to bar heights. I think it would be better to find a way to shrink, abbreviate, recode, or otherwise tweak the labels so that they contain the information you want to convey while being able to have all the labels inside the bars. Maybe something like this:
library(scales)
ggplot(dataset, aes(x = Anno, y = Riserva_Riv_Fine_Periodo/1000)) +
geom_col(width=0.8, fill="grey30") +
geom_text(aes(label = format(Riserva_Riv_Fine_Periodo/1000, big.mark=",", digits=0),
y = 0.5*Riserva_Riv_Fine_Periodo/1000),
col="white", size=3) +
scale_y_continuous(label=dollar, expand=c(0,1e2)) +
theme_classic() +
labs(y="Riserva (thousands)")
Or maybe go with a line plot instead of bars:
ggplot(dataset, aes(Anno, Riserva_Riv_Fine_Periodo/1e3)) +
geom_line(linetype="11", size=0.3, colour="grey50") +
geom_text(aes(label=format(Riserva_Riv_Fine_Periodo/1e3, big.mark=",", digits=0)),
size=3) +
theme_classic() +
scale_y_continuous(label=dollar, expand=c(0,1e2)) +
expand_limits(y=0) +
labs(y="Riserva (thousands)")

geom_histogram() versus hist() : controlling ranges in geom

Consider the following two plots
library(ggplot2)
set.seed(666)
bigx <- data.frame(x=sample(1:12,50,replace=TRUE))
ggplot(bigx, aes(x=x)) +
geom_histogram(fill = "red", colour =
"black",stat="bin",binwidth=2) +
ylab("Frequency") +
xlab("things") +
ylim(c(0,30))
hist(bigx$x)
Why do I get the overhang above 12 in ggplot? When I play with right = TRUE, this just shifts the overhang to below zero. I want the simple, cleanly bounded result from hist(), but using ggplot2.
How can I do this?
If your goal is to reproduce the output of hist(...) using ggplot, this will work:
ggplot(bigx, aes(x=x)) +
geom_histogram(fill = "red", colour = "black",stat="bin",
binwidth=2, right=TRUE) +
scale_x_continuous(limits=c(0,12),breaks=seq(0,12,2))
Or, more generally, this:
brks <- hist(bigx$x, plot=F)$breaks
ggplot(bigx, aes(x=x)) +
geom_histogram(fill = "red", colour = "black",stat="bin",
breaks=brks, right=TRUE) +
scale_x_continuous(limits=range(brks),breaks=brks)
Evidently, the ggplot default for histograms at the time of writing was left-closed intervals, whereas hist(...) defaults to right-closed intervals (right = TRUE), which is why right=TRUE is set above. Also, ggplot uses a different algorithm for calculating the x-axis breaks and limits.
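For more recent ggplot2 versions, the same idea can be sketched with the closed argument of stat_bin, which as far as I know has superseded right. This is a sketch under that assumption, not a tested replacement for the code above. Reusing the breaks from hist() means both functions bin on identical edges.

```r
library(ggplot2)

set.seed(666)
bigx <- data.frame(x = sample(1:12, 50, replace = TRUE))

# Base R binning: hist() uses right-closed intervals by default
h <- hist(bigx$x, plot = FALSE)
brks <- h$breaks

p <- ggplot(bigx, aes(x = x)) +
  # Same edges as hist(), with right-closed bins to match its default
  geom_histogram(breaks = brks, closed = "right",
                 fill = "red", colour = "black") +
  scale_x_continuous(breaks = brks) +
  ylab("Frequency") +
  xlab("things")
```

Because the bars are drawn between the supplied breaks, they start and end at the first and last break, so nothing overhangs the range that hist() shows.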

Modify axes in ggplot

I used the following code based on a previous post How to create odds ratio and 95 % CI plot in R to produce the figure posted below. I would like to:
1) Make x and y axes as well as the legends bold
2) Increase the thickness of the lines
How can I do that in ggplot?
ggplot(alln, aes(x = apoll2, y = increase, ymin = l95, ymax = u95)) + geom_pointrange(aes(col = factor(marker)), position=position_dodge(width=0.50)) +
ylab("Percent increase & 95% CI") + geom_hline(aes(yintercept = 0)) + scale_color_discrete(name = "Marker") + xlab("")
To change the appearance of the axes and the legend, add theme() to your plot:
+ theme(axis.text=element_text(face="bold"),
legend.text=element_text(face="bold"))
To make the lines wider, add size=1.5 inside the geom_pointrange() call.
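Put together, the two changes might look like the sketch below. The alln data frame is a hypothetical stand-in with invented values, since the real data is not shown. The sketch also assumes ggplot2 3.4.0 or later, where, as far as I know, the line thickness of geom_pointrange() is set with linewidth while size affects only the point; on older versions, size=1.5 as described above does the job.

```r
library(ggplot2)

# Hypothetical stand-in for the asker's alln (values invented for illustration)
alln <- data.frame(
  apoll2   = rep(c("NO2", "PM10"), each = 2),
  marker   = rep(c("m1", "m2"), times = 2),
  increase = c(2.1, -0.5, 3.4, 1.2),
  l95      = c(0.4, -2.0, 1.1, -0.3),
  u95      = c(3.8, 1.0, 5.7, 2.7)
)

p <- ggplot(alln, aes(x = apoll2, y = increase, ymin = l95, ymax = u95)) +
  geom_pointrange(aes(col = factor(marker)),
                  position = position_dodge(width = 0.50),
                  linewidth = 1.5) +  # assumes ggplot2 >= 3.4.0; older versions use size
  geom_hline(aes(yintercept = 0)) +
  scale_color_discrete(name = "Marker") +
  ylab("Percent increase & 95% CI") +
  xlab("") +
  # Bold tick labels on both axes and bold legend entries
  theme(axis.text = element_text(face = "bold"),
        legend.text = element_text(face = "bold"))
```

If the axis titles and the legend title should be bold as well, axis.title and legend.title accept the same element_text(face="bold") setting.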
