Rounding % Labels on bar chart in ggplot2 - r

q1 <- qplot(factor(Q1), data=survey, geom="histogram", fill=factor(Q1), ylim=c(0,300))
options(digits=2)
q1 + geom_bar(colour="black") +
stat_bin(aes(label=..count..), vjust=-2, geom="text", position="identity") +
stat_bin(geom="text", aes(label=paste(..count../sum(..count..)*100,"%"), vjust=-0.75)) +
labs(x="Question # 1:\n 0 = Didn't respond, 1 = Not at all familiar, 5 = Very familiar") +
opts(title="Histogram of Question # 1:\nHow familiar are you with the term 'Biobased Products'?",
legend.position = "none",
plot.title = theme_text(size = 16, vjust = 1, face = "bold"),
axis.title.x =theme_text(size=14), axis.text.x=theme_text(size=12),
axis.title.y=theme_text(size=14, angle=90), axis.text.y=theme_text(size=12))
As you can see, I'm getting far more digits than needed. I was hoping options(digits=2) would take care of it, but apparently not. Any ideas?

Actually, you are very close.
Here is a minimal example:
df <- data.frame(x = factor(sample(5, 99, T)))
ggplot(df, aes(x)) +
stat_bin(aes(label = paste(sprintf("%.02f", ..count../sum(..count..)*100), "%")),
geom="text")
format, round, prettyNum, etc. are also available.
UPDATED:
Thanks to @Tommy's comment, here is a simpler form:
ggplot(df, aes(x)) +
stat_bin(aes(label = sprintf("%.02f %%", ..count../sum(..count..)*100)),
geom="text")
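As for why options(digits=2) has no effect here: as far as I can tell, paste() converts numbers with as.character(), which does not consult the digits option, so the rounding has to be done explicitly. Below is a minimal base-R sketch of the label formatting. The counts are made up and only stand in for what ggplot2 computes as ..count..:

```r
# Hypothetical bin counts standing in for ggplot2's computed ..count..
counts <- c(12, 30, 57)
pct <- counts / sum(counts) * 100

# paste() converts with as.character(), so all available digits are kept
raw_labels <- paste(pct, "%")

# sprintf() fixes the number of decimals; "%%" prints a literal percent sign
labels <- sprintf("%.1f%%", pct)

# round() followed by paste0() is an alternative, though it drops trailing zeros
rounded_labels <- paste0(round(pct, 1), "%")

print(raw_labels)
print(labels)
```

Note that in current ggplot2 releases the ..count.. notation has been superseded by after_stat(count), and counting a factor is done with stat_count() rather than stat_bin(), so the plotting calls in this thread may need adjusting on a recent installation.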

Related

Why are colours appearing in the labels of my gganimate sketch?

I have a gganimate sketch in R and I would like to have the percentages of my bar chart appear as labels.
But for some bizarre reason, I am getting seemingly random colours in place of the labels that I'm requesting.
If I run the ggplot part without animating then it's a mess (as it should be), but it's obvious that the percentages are appearing correctly.
Any ideas? The colour codes don't correspond to the colours of the bars, which I chose separately. The displayed codes cycle through about half a dozen different values, at a rate different from the frame rate that I selected. While the bars are at the same height (they grow until they reach their final height in the animation), they display the same code, until they stop and the label freezes.
Code snippet:
df_new <- data.frame(index, rate, year, colour)
df_new$rate_label <- ifelse(round(df_new$rate, 1) %% 1 == 0,
paste0(round(df_new$rate, 1), ".0%"), paste0(round(df_new$rate, 1), "%"))
p <- ggplot(df_new, aes(x = year, y = rate, fill = year)) +
geom_bar(stat = "identity", position = "dodge") +
scale_fill_manual(values = colour) +
#geom_text(aes(y = rate, label = paste0(rate, "%")), vjust = -0.7) +
geom_shadowtext(aes(y = rate, label = rate_label),
bg.colour='white',
colour = 'black',
size = 9,
fontface = "bold",
vjust = -0.7,
alpha = 1
) +
coord_cartesian(clip = 'off') +
ggtitle("% population belonging to 'No religion', England and Wales census") +
theme_minimal() +
xlab("") + ylab("") +
theme(legend.position = "none") +
theme(plot.title = element_text(size = 18, face = "bold")) +
theme(axis.text = element_text(size = 14)) +
scale_y_continuous(limits = c(0, 45), breaks = 10*(0:4))
p
p <- p + transition_reveal(index) + view_follow(fixed_y = T)
animate(p, renderer = gifski_renderer(), nframes = 300, fps = frame_rate, height = 500, width = 800,
end_pause = 0)
anim_save("atheism.gif")
I think you have missed some subtle points about ggplot2, which I will try to describe. First of all, discrete values need to be supplied as a factor or integer, so you can call as.factor() before plotting or just wrap the variable in factor() inside the aesthetic. You should also round the percentages as you wish. Here is an example:
set.seed(2023)
df_new <- data.frame(index=1:10, rate=runif(10), year=2001:2010, colour=1:10)
df_new$rate_label <- ifelse(round(df_new$rate, 1) %% 1 == 0,
paste0(round(df_new$rate, 1), ".0%"),
paste0(round(df_new$rate, 1), "%"))
The ggplot for this data is:
library(ggplot2)
p <- ggplot(df_new, aes(x = factor(year), y = rate, fill = factor(colour))) +
geom_bar(stat = "identity", position = "dodge") +
geom_text(aes(y = rate, label = paste0(round(rate,2), "%")), vjust = -0.7) +
coord_cartesian(clip = 'off') +
ggtitle("% population belonging to 'No religion', England and Wales census") +
theme_minimal() +
xlab("") + ylab("") +
theme(legend.position = "none",
plot.title = element_text(size = 18, face = "bold"),
axis.text = element_text(size = 14))
p
You can also combine all theme elements in a single theme() call, as I did. The output is:
And you can easily animate the plot using the following code:
library(gganimate)
p + transition_reveal(index)
And the output is as below:
Hope it helps.
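A side note on the rate_label column: the construction from the question, which appends ".0" by hand when the rounded value is whole, can probably be replaced by a single sprintf() call. A small sketch with made-up rates (not the asker's data):

```r
# Made-up rates for illustration only
rate <- c(25, 37.24, 14.96)

# Approach from the question: round, then patch whole numbers with ".0"
manual <- ifelse(round(rate, 1) %% 1 == 0,
                 paste0(round(rate, 1), ".0%"),
                 paste0(round(rate, 1), "%"))

# sprintf() always prints one decimal, so no special case is needed
formatted <- sprintf("%.1f%%", rate)

print(manual)
print(formatted)
```

For these values both approaches give the same labels. They may differ for numbers that sit exactly on a rounding boundary, so it is worth checking against your own data.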
This was answered here, although I don't know why the fix works.
For some reason, labels need to go into gganimate as factors (as.factor()).
I just had to add the line:
df_new$rate_label <- as.factor(df_new$rate_label)
and it works fine.

How to stop ggplot from getting smaller due to an added index in the title?

My problem is that I have a grid.arrange object made of 6 scatterplots with 3 different titles. The second title contains a subscript index, which makes that plot slightly smaller. Is there a way to prevent that from happening?
My code is:
a<- ggplot(data=Bad_Lauchstaedt, mapping=aes(x= `one_h_gap_ET0`, y= `BL 2-1`))+
geom_smooth(method = "lm",se=FALSE, color="red")+
geom_point(color="darkblue", shape=1)+
geom_abline(intercept = 0, slope = 1, color="black", size=1.2, linetype="twodash")+
labs(y="measured data", x="gap filled data by lysimeters", title = expression("ET"[0]))+
theme_bw()+
theme(plot.title = element_text(hjust = 0.5, size = 20))+
theme(axis.title.x = element_blank(),axis.text.x = element_blank(),axis.ticks.x = element_blank())+
theme(axis.title.y = element_blank(),axis.text.y = element_blank(), axis.ticks.y = element_blank())+
xlim(0,0.9)+
ylim(0,0.9)+
theme(plot.margin = unit(c(0,0,0,0),"pt"))+
geom_richtext(
data = b_label,
aes(posx, posy, label = label),
hjust = 0, vjust = 0,
size = 6,
fill = "white", label.color = "black")
scatterplots<- list(a,b,c,d,e,f)
grobs= lapply(scatterplots, ggplotGrob)
grid.arrange(arrangeGrob(grobs=scatterplots, widths= c(1,1,1),
layout_matrix = rbind(c(1,2,3),
c(4,5,6))),
left=grid::textGrob('hourly ET observed [mm]', gp=grid::gpar(fontsize=18), rot= 90),
bottom=grid::textGrob('hourly ET gap filled [mm]', gp=grid::gpar(fontsize=18)),
top = grid::textGrob('gap filled by', gp=grid::gpar(fontsize=24)))
Maybe consider not using grid.arrange...
patchwork or cowplot are much easier (although this may be subjective) and maybe safer. Way less code as well.
library(ggplot2)
library(patchwork)
p_norm <- ggplot(iris, aes(Sepal.Length, Sepal.Width)) +
geom_point() +
ggtitle("normal")
p_abnorm <- ggplot(iris, aes(Sepal.Length, Sepal.Width)) +
geom_point() +
ggtitle(expression("ET"[0]))
wrap_plots(c(list(p_norm, p_abnorm), rep(list(p_norm), 4)))
Created on 2021-01-20 by the reprex package (v0.3.0)
So the subscript in ET0, created by
title = expression("ET"[0]))
throws off the alignment of your plots.
My suggested (somewhat hacky) quick fix is to add an empty subscript:
title = expression("normal title"[]))
to all the other titles.

How to increase the hexbin legend in ggplot

I have the following hexbin plot:
I would like the legend to start from the lowest count that occurs, for example 10, and to show that count with a different colour. Note that the lowest count differs between datasets, so it is difficult to set it to a specific number. The script that I have written to generate the plot is:
d <- ggplot(selectedDF, aes(BEC_Agg, AC)) + geom_hex(bins = 30) + theme_bw() +
theme(text = element_text(face = "bold", size = 16)) + xlab("\nNormalized BEC") + ylab("AC\n") + scale_fill_gradientn(colors = brewer.pal(3,"Dark2"))
I tried the solution here:
d <- ggplot(selectedDF, aes(BEC_Agg, AC)) + geom_hex(aes(fill=cut(..value..,breaks=pretty(..value..,n=5))),bins = 30) + theme_bw() +
theme(text = element_text(face = "bold", size = 16)) + xlab("\nNormalized BEC") + ylab("AC\n") + scale_fill_gradientn(colors = brewer.pal(3,"Dark2"))
But I got the following error:
Error in cut(value, breaks = pretty(value, n = 5)) :
object 'value' not found
How can I fix that?
You should define the variable value before running ggplot. Since lowest count differs among datasets, you might want to try something like value <- min(count(yourDF)).
Since your focus is tweaking the legend, here is one method. Sample data is generated below, since you didn't provide any.
library(ggplot2)
# sample data frame
set.seed(77)
x=rnorm(1000, mean = 4, sd = 1)
y=rnorm(1000, mean = 2, sd = 0.5)
df <- data.frame(x,y)
# -------------------------------------------------------------------------
# The following is from your script
base <- ggplot(df, aes(x, y)) + geom_hex(bins = 30) + theme_bw() +
theme(text = element_text(face = "bold", size = 16)) + xlab("\nNormalized BEC") + ylab("AC\n")
# -------------------------------------------------------------------------
base_limit_break <- base + scale_fill_continuous(limits = c(1,20), breaks = c(1:20))
# -------------------------------------------------------------------------
# This is the part relevant to your question
base_limit_break + guides(fill = guide_colorbar(barheight = unit(10, "cm"), reverse = TRUE))
Output
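On the error in the question: as far as I know, geom_hex() exposes the per-hexagon tally as count rather than value, which would explain the "object 'value' not found" message. What cut(..., breaks = pretty(...)) does to such counts can be tried out in base R first. The counts below are invented for illustration:

```r
# Invented hexagon counts; the smallest is 10, as in the question's example
counts <- c(10, 14, 22, 35, 41, 58)

# pretty() picks round break points that cover the observed range,
# so the lowest break adapts to each dataset
breaks <- pretty(counts, n = 5)

# include.lowest = TRUE keeps a count equal to the first break from becoming NA
bins <- cut(counts, breaks = breaks, include.lowest = TRUE)

print(breaks)
print(table(bins))
```

Because cut() returns a factor, a fill built this way would need a discrete scale such as scale_fill_manual() or scale_fill_brewer() rather than scale_fill_gradientn().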

Error when assembling plots with cowplot::plot_grid

I am using the following code to produce the scatter diagram of a Redundancy Analysis (RDA). The plot is for only one species and I am conducting this analysis for two other species (I am not showing the code for the other two species as it is basically the same).
rda.plot.sap <- ggplot(df1, aes(x=RDA1, y=RDA2)) +
geom_point(aes(shape = df1[,"Enclos"], color = df1[,"Type_enclos"]), size = 2) +
geom_hline(yintercept=0) +
geom_vline(xintercept=0) +
coord_fixed() +
scale_shape_manual(values = c(1, 19)) +
scale_color_manual(values=c('#999999','#E69F00'))
rda.plot.sap <- rda.plot.sap +
geom_segment(data=df2,
aes(x=0, xend=RDA1, y=0, yend=RDA2),
color="red", arrow=arrow(length=unit(0.01,"npc")), size = 0.8) +
geom_text(data=df2,
aes(x=RDA1, y=RDA2, label=rownames(df2),
hjust=0.5*(1-sign(RDA1)) + hjust_sap_x,
vjust=0.5*(1-sign(RDA2) + vjust_sap_x)),
color="red", size=5)
rda.plot.sap <- rda.plot.sap +
geom_segment(data=df3,
aes(x=0, xend=RDA1, y=0, yend=RDA2),
color="blue", arrow=arrow(length=unit(0.01,"npc")), size = 0.8)+
geom_text(data=df3,
aes(x=RDA1, y=RDA2, label=rownames(df3),
hjust=0.5*(1-sign(RDA1)),
vjust=0.5*(1-sign(RDA2))),
color="blue", size=5)
rda.plot.sap <- rda.plot.sap +
theme(panel.background = element_blank(),
axis.title = element_text(size = 20),
axis.line.x = element_line(color="black", size = 1),
axis.line.y = element_line(color="black", size = 1),
axis.text = element_text(size = 15),
legend.title = element_blank(),
legend.text = element_text(size = 15),
legend.key=element_blank(),
legend.position = c(0.15, 0.9)) +
xlim(c(-0.6, 0.4))
rda.plot.sap <- rda.plot.sap +
xlab(paste("RDA1 (", var.rda1, " % - p = ", p.rda1, ")", sep = "")) +
ylab(paste("RDA2 (", var.rda2, " % - p = ", p.rda2, ")", sep = ""))
The code works perfectly fine, and I obtain three separate plots without any error or warnings. The problem is that when I try to assemble these three plots using the function plot_grid of the cowplot package:
final_plot <- plot_grid(rda.plot.sap, rda.plot.epi, rda.plot.het,
nrow = 1, ncol = 3, labels = c("A", "B", "C"))
I always get the same error:
"Error: Aesthetics must be either length 1 or the same as the data
(27): shape, colour".
Even stranger, after getting this error, if I want to run again the code of one of the individual plots (of one species only), I get the same error.
This is my first post so I hope I described the problem accurately enough. I am at a loss to understand what is going on here, so thanks in advance to whoever can help.
I'm not sure why, but removing the labels argument from plot_grid() usually fixes this. (You just need to add the labels to each plot individually with geom_text() or ggtitle().)
According to the comment found for this gist, the issue has to do with custom theme not setting the required aesthetics for plot_grid to use for the labels. See the following fix:
final_plot <- plot_grid(rda.plot.sap, rda.plot.epi, rda.plot.het,
nrow = 1, ncol = 3, labels = c("A", "B", "C"),
label_fontfamily = "Times", label_colour = "black")

Position dodge does not work with geom_point and geom_errorbar

I have this overplotting issue going on. Even after reading a lot of posts on dodge, jitter and jitter dodge in all kinds of implementations I can't figure it out.
Here you can get my data: http://pastebin.com/embed_js.php?i=uPXN7nPt
library(dplyr)
library(gdata)
library(ggplot2)
library(directlabels)
all<-read.xls('all_auto_bio_adjusted_c.xls')
all$size.new<-sqrt(all$size.new)
all$station<-as.factor(all$station)
all$group.new<-factor(all$group, levels=c('C. hyperboreus','C. glacialis','Special Calanus','M. longa','Pseudocalanus sp.','Copepoda'))
pd <- position_dodge(w = 50)
allp <- ggplot(data = all, aes(y = averagebiol, x = automatic, colour = group.new, group=group.new)) +
geom_abline(intercept = 0, slope = 1) +
geom_point(aes(size = size.new), show_guide=TRUE, position=pd) +
scale_size_identity()+
geom_errorbar(aes(ymin = averagebiol - stdevbiol, ymax = averagebiol + stdevbiol),colour = "grey", width = 0.1, position=pd) +
facet_grid(group.new~station, scales="free") +
xlab("Automatic identification") + ylab("Manual identification") +
ggtitle("Comparison of automatic vs manual identification") +
theme_bw() +
theme(plot.title = element_text(lineheight=.8, face="bold", size=20,vjust=1), axis.text.x = element_text(colour="grey20",size=15,angle=0,hjust=.5,vjust=.5,face="bold"), axis.text.y = element_text(colour="grey20",size=15,angle=0,hjust=1,vjust=0,face="bold"), axis.title.x = element_text(colour="grey20",size=20,angle=0,hjust=.5,vjust=0,face="bold"), axis.title.y = element_text(colour="grey20",size=20,angle=90,hjust=.5,vjust=1,face="bold"), legend.position="none", strip.text.x = element_text(size = 12, face="bold", colour = "black", angle = 0), strip.text.y = element_text(size = 12, face="bold", colour = "black"))
allp
Which produces this nice plot
But as you can see a lot of the points and error bars are cramped together. Shouldn't my implementation of position dodge work?
If I understood correctly, position dodge uses the scale of the axes, so with a dodge of 50 I should see some effect. I also tried putting the dodge argument directly into the geom, but that had no effect either.
Any ideas?
If you leave out position = pd in both geom_errorbar() and geom_point() you get the same plot. The reason the data look 'cramped' is because of the spread of the x-values. As far as I know, dodging will only happen if two points 'overlap', which I interpret as having the same x-value, e.g. data on a categorical x-axis like in the case of a bar plot. Your x-axis is continuous so the points will not be dodged.
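To illustrate the case where dodging does take effect, here is a small sketch with toy data (not the asker's) in which two groups share each value of a categorical x-axis. It assumes ggplot2 is installed and uses layer_data() to inspect the positions after the adjustment:

```r
library(ggplot2)

# Toy data: two groups share each categorical x value
df <- data.frame(
  x = factor(c("a", "a", "b", "b")),
  y = c(1, 2, 3, 4),
  g = factor(c("g1", "g2", "g1", "g2"))
)

p <- ggplot(df, aes(x, y, colour = g)) +
  geom_point(position = position_dodge(width = 0.5))

# layer_data() returns the layer's data after the position adjustment
dodged_x <- as.numeric(layer_data(p)$x)
print(dodged_x)
```

Without the dodge, both points of a category would sit at the same x position (1 for "a", 2 for "b"). Here each point is shifted sideways so the two groups no longer overlap.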
To deal with the overplotting you could try logarithmic scales:
library(ggplot2)
tmp <- tempfile()
download.file("http://pastebin.com/raw.php?i=uPXN7nPt", tmp)
all <- read.csv(tmp)
all$size.new <- sqrt(all$size.new)
all$station <- as.factor(all$station)
all$group.new <- factor(all$group, levels = c("C. hyperboreus", "C. glacialis",
"Special Calanus", "M. longa",
"Pseudocalanus sp.", "Copepoda"))
# explicitly remove missing data
all <- all[complete.cases(all), ]
allp <- ggplot(data = all, aes(y = averagebiol, x = automatic, colour = group.new,
group = group.new, ymin = averagebiol - stdevbiol,
ymax = averagebiol + stdevbiol)) +
theme_bw() +
geom_abline(intercept = 0, slope = 1) +
geom_errorbar(colour = "grey", width = 0.1) +
geom_point(aes(size = size.new)) +
scale_size_area() + # Just so I could see all the points on my monitor :)
xlab("Automatic identification") +
ylab("Manual identification") +
ggtitle("Comparison of automatic vs manual identification")
allp + scale_x_log10() +
scale_y_log10() +
facet_grid(group.new ~ station, scales = "fixed")
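To see why a logarithmic axis can help with points that look cramped, consider a rough base-R calculation with invented values: four small values and one large one. It compares how much of the axis the small values occupy on a linear and on a log10 scale:

```r
# Invented x values: most are small, one is very large
x <- c(1, 2, 5, 10, 1000)

# Share of the axis range taken up by the values from 1 to 10, linear scale
linear_share <- (10 - 1) / (max(x) - min(x))

# The same share after a log10 transformation
log_share <- (log10(10) - log10(1)) / (log10(max(x)) - log10(min(x)))

print(linear_share)
print(log_share)
```

One caveat: log scales cannot show zero or negative values, so error bars whose lower end (averagebiol - stdevbiol) drops to zero or below will likely be dropped or clipped with a warning.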
