GGplot circular: Data disappears when trying to format x-axis - r

I am trying to create a circular plot showing the number of movements fish have during each hour. It works fine if the code is like this:
ggplot(aragx, aes(x = eventhour, y = Changes, group=eventhour, col=Family)) +
geom_boxplot(position=position_dodge()) +
scale_x_continuous(breaks = seq(0, 23), labels = seq(0, 23)) +
coord_polar(start = 0) + theme_minimal() +
scale_fill_brewer() + ylab("Count") +
ggtitle("Daily Section Changes per Hour") +ylim(0,20) +facet_wrap(.~Family) +
theme(legend.position = "none") + theme(axis.title.x = element_blank())
However, the 0-12 line doesn't quite run straight through the middle, the angle between 23 and 0 isn't straight, and it just doesn't look nice. So I modified scale_x_continuous as follows:
ggplot(aragx, aes(x = eventhour, y = Changes, group=eventhour, col=Family)) +
geom_boxplot(position=position_dodge()) +
scale_x_continuous(limits = c(0,24), breaks = seq(0, 23), labels = seq(0, 23)) +
coord_polar(start = 0) + theme_minimal() +
scale_fill_brewer() + ylab("Count") +
ggtitle("Daily Section Changes per Hour") +ylim(0,20) +facet_wrap(.~Family) +
theme(legend.position = "none") + theme(axis.title.x = element_blank())
This fixes the cosmetic issue, but the data from eventhour=0 is all screwed up, like so:
Does anyone know how to help me? It would be much appreciated, I've been banging my head against the wall over this small thing.

The issue is that setting limits = c(0, 24) clips off the parts of your boxplots that extend to the left of 0 (or to the right of 24). Hence, for the boxplot at the zero position, only the whiskers and the right segment of the box are drawn.
To prevent that, you have to adjust the limits to account for the width of the boxplot, which by default is 0.75. To get a full boxplot at the zero position, set limits = c(-width_bp / 2, 24 - width_bp / 2). However, doing so rotates your circular plot slightly, which we can compensate for by setting start in coord_polar equal to -width_bp / 8. (Note: I found that value by trial and error, but there is surely a reason why it has to be one eighth. Sigh, I was always better at algebra than at geometry. (; )
Using some random fake example data:
library(ggplot2)
aragx <- data.frame(
eventhour = rep(0:23, 100),
Changes = runif(24 * 100)
)
width_bp <- .75
ggplot(aragx, aes(x = eventhour, y = Changes, group = eventhour)) +
geom_boxplot(position = position_dodge()) +
scale_x_continuous(
limits = c(-width_bp / 2, 24 - width_bp / 2),
breaks = seq(0, 23), labels = seq(0, 23)
) +
coord_polar(start = -width_bp / 8) +
theme_minimal() +
scale_fill_brewer() +
ylab("Count") +
ggtitle("Daily Section Changes per Hour") +
theme(legend.position = "none") +
theme(axis.title.x = element_blank())
Created on 2022-02-03 by the reprex package (v2.0.1)
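The "one eighth" found by trial and error can be checked with a little arithmetic. The sketch below assumes that coord_polar() maps the x limits linearly onto one full turn with no extra expansion, and that start is given in radians. theta_of() and start_exact are illustrative names, not part of ggplot2. Under that assumption, the rotation that puts hour 0 at the top is half a box width expressed as a fraction of the 24-unit range, which is close to, but not exactly, width_bp / 8.

```r
width_bp <- 0.75
x_limits <- c(-width_bp / 2, 24 - width_bp / 2)

# Angle (radians, clockwise from the top) at which x is drawn, assuming a
# linear mapping of the limits onto a full turn (illustrative helper).
theta_of <- function(x, limits, start = 0) {
  start + 2 * pi * (x - limits[1]) / diff(limits)
}

# Rotation that brings x = 0 back to the top of the circle.
start_exact <- -(width_bp / 2) / diff(x_limits) * 2 * pi  # = -width_bp * pi / 24
start_rule  <- -width_bp / 8                              # trial-and-error value

c(exact = start_exact, one_eighth = start_rule)
theta_of(c(0, 6, 12), x_limits, start_exact)  # angles for hours 0, 6 and 12
```

Since pi / 24 is about 0.131 and 1 / 8 is 0.125, the two start values differ by less than half a degree here, which would explain why the trial-and-error value looks right.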

Related

Why are colours appearing in the labels of my gganimate sketch?

I have a gganimate sketch in R and I would like to have the percentages of my bar chart appear as labels.
But for some bizarre reason, I am getting seemingly random colours in place of the labels that I'm requesting.
If I run the ggplot part without animating then it's a mess (as it should be), but it's obvious that the percentages are appearing correctly.
Any ideas? The colour codes don't correspond to the colours of the bars which I have chosen separately. The codes displayed also cycle through about half a dozen different codes, at a rate different to the frame rate that I selected. And while the bars are the same height (they grow until they reach the chosen height displayed in the animation) then they display the same code until they stop and it gets frozen.
Code snippet:
df_new <- data.frame(index, rate, year, colour)
df_new$rate_label <- ifelse(round(df_new$rate, 1) %% 1 == 0,
paste0(round(df_new$rate, 1), ".0%"), paste0(round(df_new$rate, 1), "%"))
p <- ggplot(df_new, aes(x = year, y = rate, fill = year)) +
geom_bar(stat = "identity", position = "dodge") +
scale_fill_manual(values = colour) +
#geom_text(aes(y = rate, label = paste0(rate, "%")), vjust = -0.7) +
geom_shadowtext(aes(y = rate, label = rate_label),
bg.colour='white',
colour = 'black',
size = 9,
fontface = "bold",
vjust = -0.7,
alpha = 1
) +
coord_cartesian(clip = 'off') +
ggtitle("% population belonging to 'No religion', England and Wales census") +
theme_minimal() +
xlab("") + ylab("") +
theme(legend.position = "none") +
theme(plot.title = element_text(size = 18, face = "bold")) +
theme(axis.text = element_text(size = 14)) +
scale_y_continuous(limits = c(0, 45), breaks = 10*(0:4))
p
p <- p + transition_reveal(index) + view_follow(fixed_y = T)
animate(p, renderer = gifski_renderer(), nframes = 300, fps = frame_rate, height = 500, width = 800,
end_pause = 0)
anim_save("atheism.gif")
I think you have missed some subtle points about ggplot2, so I will try my best to describe them. First of all, discrete values need to be supplied as a factor or an integer. You can either call as.factor() before plotting or use factor() directly in the aesthetic. You should also round the percentages as you see fit. Here is an example:
set.seed(2023)
df_new <- data.frame(index=1:10, rate=runif(10), year=2001:2010, colour=1:10)
df_new$rate_label <- ifelse(round(df_new$rate, 1) %% 1 == 0,
paste0(round(df_new$rate, 1), ".0%"),
paste0(round(df_new$rate, 1), "%"))
The ggplot for this data is:
library(ggplot2)
p <- ggplot(df_new, aes(x = factor(year), y = rate, fill = factor(colour))) +
geom_bar(stat = "identity", position = "dodge") +
geom_text(aes(y = rate, label = paste0(round(rate,2), "%")), vjust = -0.7) +
coord_cartesian(clip = 'off') +
ggtitle("% population belonging to 'No religion', England and Wales census") +
theme_minimal() +
xlab("") + ylab("") +
theme(legend.position = "none",
plot.title = element_text(size = 18, face = "bold"),
axis.text = element_text(size = 14))
p
You can also combine all theme elements in a single theme() call (as I did). The output is:
And you can easily animate the plot using the following code:
library(gganimate)
p + transition_reveal(index)
And the output is as below:
Hope it helps.
This was answered here, although I don't know why the fix works.
For some reason, labels need to go into gganimate as factors, via as.factor().
I just had to add the line:
df_new$rate_label <- as.factor(df_new$rate_label)
and it works fine.
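For reference, here is a base-R sketch of the label step with that fix applied. The rate values are made up for illustration, and sprintf() is swapped in for the ifelse()/paste0() construction from the question because it pads to one decimal place directly. The conversion to a factor is the fix reported above; why gganimate needs it is not explained here.

```r
# Hypothetical rates, only for illustration.
df_new <- data.frame(index = 1:4, rate = c(15, 25.14, 37.2, 40.04))

# One decimal place plus a percent sign ("%%" is a literal % in sprintf).
df_new$rate_label <- sprintf("%.1f%%", df_new$rate)

# Reported fix: hand the labels to gganimate as a factor, not as character.
df_new$rate_label <- as.factor(df_new$rate_label)

str(df_new$rate_label)
```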

How to automatically make an axis with a gradation from 1 to max(n) when using facet_wrap ggplot2

I have a data.frame(tt) with 3 columns:
nazvReki - grouping variable (18 values);
rang - the X axis (443 entries with values from 1 to 70; the maximum differs between grouping variables: 8 for some, 10 or 16 for others...);
procent - axis Y (0-100 %).
I'm drawing a picture:
ggplot(tt) +
geom_line(aes(rang, procent)) +
scale_x_continuous(
name = c("Rang"),
breaks = scales::pretty_breaks(n = 10)
) +
scale_y_continuous(name = c("Procent")) +
facet_wrap( ~ nazvReki, nrow = 4, scales = "free_x") +
theme_bw()
The result is not very good:
I need the axis values to start at 1, with no vertical grid lines between the numbers.
I do it a little differently:
tt$rang <- factor(tt$rang)
ggplot(tt, aes(rang, procent)) +
geom_line(aes(group = nazvReki)) +
scale_x_discrete(name = c("Rang")) +
scale_y_continuous(name = c("Procent")) +
facet_wrap( ~ nazvReki, nrow = 4, scales = "free_x") +
theme_bw()
It turns out better:
But not all the numbers fit. I play with the parameters, make several versions, and combine them in a graphics editor. The result is almost perfect.
What is missing for perfection? On the X axis, the scale should always start at 1 and end at the maximum value, with 5-8 values inserted between them. And all of this should happen automatically.
And the question is: can this be done in ggplot2?
Now, this will do half of the trick. It won't necessarily start at 1 instead of 0, but I would ask myself if it really has to do so. I don't know why or how to force a start at 1...
ggplot(tt) +
geom_line(aes(rang, procent)) +
scale_x_continuous(
name = c("Rang"),
breaks = scales::pretty_breaks(n = 10),
limits = c(1,NA) #start will be at 1, but not the start of the scale
) +
scale_y_continuous(name = c("Procent")) +
facet_wrap( ~ nazvReki, nrow = 4, scales = "free_x") +
theme_bw() +
theme(panel.grid.minor = element_line(color = NA)) #this will remove the secondary lines
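The remaining half, breaks that always start at 1 and end at each group's maximum, can be computed separately. The helper below is only a sketch: rank_breaks() and its n_inner argument are hypothetical names, and the example maxima (8, 10, 16, 70) are the values mentioned in the question, not the real data.

```r
# Integer breaks from 1 to max_rank with roughly n_inner values in between
# (hypothetical helper, not part of ggplot2 or scales).
rank_breaks <- function(max_rank, n_inner = 6) {
  unique(round(seq(1, max_rank, length.out = n_inner + 2)))
}

# Maxima mentioned in the question for different grouping variables.
lapply(c(8, 10, 16, 70), rank_breaks)
```

scale_x_continuous() accepts a function for its breaks argument, so something like breaks = function(lims) rank_breaks(floor(lims[2])) might work per facet. That is an untested assumption, because the limits ggplot2 passes to the function can include scale expansion.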

Ggplot2: coord_polar() with geom_col()

I have an issue when using coord_polar() together with geom_col(). I have degree values ranging from 0 to <360. Let's say they are in steps of 20, so 0, 20, 40... 340. If I plot them with coord_polar() I have two issues:
values 0 and 340 touch each other and don't have the same gap as the other columns
the "x-axis" is offset slightly, so that 0 does not point "north"
See this minimal example.
suppressWarnings(library(ggplot2))
df <- data.frame(x = seq(0,359,20),y = 1)
ninety = c(0,90,180,270)
p <- ggplot(df, aes(x,y)) +
geom_col(colour = "black",fill = "grey") +
geom_label(aes(label = x)) +
scale_x_continuous(breaks = ninety) +
geom_vline(xintercept = ninety, colour = "red") +
coord_polar()
p
If I set the x-axis limits, the rotation of the coordinate system is correct, but the column at 0 disappears due to lack of space.
p+scale_x_continuous(breaks = c(0,90,180,270),limits = c(0,360))
#> Scale for 'x' is already present. Adding another scale for 'x', which
#> will replace the existing scale.
#> Warning: Removed 1 rows containing missing values (geom_col).
Created on 2019-05-15 by the reprex package (v0.2.1)
Since the space occupied by each bar is 20 degrees, you can shift things by half of that in both scales and coordinates:
ggplot(df, aes(x,y)) +
geom_col(colour = "black",fill = "grey") +
geom_label(aes(label = x)) +
scale_x_continuous(breaks = ninety,
limits = c(-10, 350)) + # shift limits by 10 degrees
geom_vline(xintercept = ninety, colour = "red") +
coord_polar(start = -10/360*2*pi) # offset by 10 degrees (converted to radians)
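To see why limits = c(0, 360) drops the first column while the shifted limits keep every column, the bar edges can be checked by hand. This sketch assumes geom_col() uses its default width of 90% of the spacing between x values (18 degrees here) and that a bar is removed when either edge falls outside the limits. dropped_bars() is an illustrative helper, not a ggplot2 function.

```r
df <- data.frame(x = seq(0, 359, 20), y = 1)
step <- 20
bar_width <- 0.9 * step  # assumed default bar width

# x positions of bars with an edge outside the given limits
# (illustrative helper mimicking the out-of-bounds check).
dropped_bars <- function(limits) {
  xmin <- df$x - bar_width / 2
  xmax <- df$x + bar_width / 2
  df$x[xmin < limits[1] | xmax > limits[2]]
}

dropped_bars(c(0, 360))    # the bar at 0 sticks out to the left
dropped_bars(c(-10, 350))  # nothing is dropped

# Rotation that undoes the 10-degree shift, in radians.
start <- -(step / 2) / 360 * 2 * pi
```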
I got it closer to what you want, but it's a bit of a hack so I don't know if it's a great solution.
Code:
df <- data.frame(x = seq(0,359,20),y = 1)
ggplot(df, aes(x+10,y, hjust=1)) +
geom_col(colour = "black",fill = "grey") +
geom_label(aes(x=x+5,label = x)) +
scale_x_continuous(breaks = c(0,90,180,270),limits = c(0,360)) +
coord_polar()
Instead of plotting the geom_col bars at c(0,20,40,...) I'm now plotting them at c(10,30,50,...). I'm plotting the geom_labels at c(5, 25, 45,...).
The label positioning at the bottom of the chart is still not perfect since 180deg is not South.
I get this graph:

Circular histogram in ggplot2 with even spacing of bars and no extra lines

I'm working on making a circular histogram in ggplot2 that shows how the number of calls varies over 24 hours. My dataset starts at 0 and goes to 23, with the number of calls per hour:
df = data.frame(xvar = 0:23,
y = c(468,520,459,256,397,241,117,120,45,100,231,398,340,276,151,134,157,203,308,493,537,462,448,383))
I'm using the following code to create the circular histogram:
ggplot(df, aes(xvar, y)) +
coord_polar(theta = "x", start = -.13, direction = 1) +
geom_bar(stat = "identity", fill = "maroon4", width = .9) +
geom_hline(yintercept = seq(0, 500, by = 100), color = "grey80", size = 0.3) +
scale_x_continuous(breaks = seq(0, 24), labels = seq(0, 24)) +
xlab("Hour") +
ylab("Number of Calls") +
ggtitle("Number of Calls per Hour") +
theme_bw()
I really like the resulting plot:
but I can't figure out how to get the same spacing between the 23 and 0 bars as is present for the other bars. Right now, those two bars are flush against one another and nothing I've tried so far will separate them. I'm also interested in removing the lines between the different hours (ex. the line between 21 and 22) since it's somewhat distracting and doesn't convey any information. Any advice would be much appreciated, particularly on spacing the 23 and 0 bars!
You can use the expand parameter of scale_x_continuous to adjust. Simplified a little,
ggplot(df, aes(x = xvar, y = y)) +
coord_polar(theta = "x", start = -.13) +
geom_bar(stat = "identity", fill = "maroon4", width = .9) +
geom_hline(yintercept = seq(0, 500, by = 100), color = "grey80", size = 0.3) +
scale_x_continuous(breaks = 0:24, expand = c(.002,0)) +
labs(x = "Hour", y = "Number of Calls", title = "Number of Calls per Hour") +
theme_bw()
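A rough check of why expand = c(.002, 0) gives an even gap. It assumes the first element of expand is a multiplicative expansion applied on each side of the range covered by the bars, and that coord_polar() respects it. The bars span -0.45 to 23.45, so the expansion adds roughly the same 0.1-unit gap between hours 23 and 0 that already separates the other bars. The variable names are illustrative.

```r
bar_width <- 0.9
x_range <- c(0 - bar_width / 2, 23 + bar_width / 2)  # extent of the bars
inner_gap <- 1 - bar_width                           # gap between neighbours

mult <- 0.002
# Gap between the bar at 23 and the bar at 0: one expansion on each side.
wrap_gap <- 2 * mult * diff(x_range)

# Multiplier that would make the wrap-around gap match the inner gap exactly.
mult_exact <- inner_gap / (2 * diff(x_range))

c(inner_gap = inner_gap, wrap_gap = wrap_gap, mult_exact = mult_exact)
```

If the bar width changes, the multiplier would need to change with it, so computing it from the width as above may be more robust than hard-coding .002.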

ggplot dot covering error bar

I have a huge file and I don't really know what small test dataset I can give here to produce the same problem in the plot, so I will not give any test dataset, I will only attach the plot image here to show the problem.
My code:
ggplot(tgc, aes(x=Week, y=MuFreq)) +
theme_gray(base_size=18) +
theme(plot.title=element_text(hjust=.5),
axis.title.x = element_text(face="bold"),
axis.title.y = element_text(face="bold")) +
geom_errorbar(aes(ymin=MuFreq-(1.96*se), ymax=MuFreq+(1.96*se)), width=3) +
geom_line() +
geom_point(aes(size= N), color="blue")+
scale_x_continuous(breaks=c(68,98,188), labels=c("Wk68", "Wk98", "Wk188")) +
scale_y_continuous(limits=c(0,0.15)) +
scale_size( breaks = unique(tgc$N))
So the problem is that I'm sizing the dots based on the sample size for each week; the middle dot actually has an error bar associated with it, but the dot is covering it. I tried to use a horizontal error bar, but it didn't work because my x-axis is customized to be non-numerical.
What can I do to show the error bar that's being covered?
Also is there any way to make the background vertical grid lines spaced evenly?
The question asks for two improvements to the ggplot2 chart:
Show error bars that are being covered
Make the background vertical grid lines spaced evenly
Data
As the OP didn't supply any data, we need a dummy data set. This is easily done by reading values from the plot:
tgc <- data.frame(Week = c(68, 98, 188),
MuFreq = c(0.08, 0.09, 0.091),
se = c(0.003, 0.001, 0.019)/1.96,
N = c(91, 835, 7))
This reproduces the original plot quite nicely:
Variant 1
This one is picking up Nick Criswell's comments:
Change order in which layers are plotted, so that error bars are plotted on top
Change colour and alpha
plus
Remove all vertical grid lines except those which are explicitly specified as breaks. The distances between major grid lines are still uneven, but they reflect the difference in time
With this code
library(ggplot2)
ggplot(tgc, aes(x = Week, y = MuFreq)) +
theme_gray(base_size = 18) +
theme(plot.title = element_text(hjust = .5),
axis.title = element_text(face = "bold")) +
geom_line() +
geom_point(aes(size = N), color = "dodgerblue1", alpha = 0.5) +
geom_errorbar(aes(ymin = MuFreq - (1.96 * se),
ymax = MuFreq + (1.96 * se)), width = 3) +
scale_x_continuous(
breaks = c(68, 98, 188),
labels = c("Wk68", "Wk98", "Wk188"),
minor_breaks = NULL
) +
scale_y_continuous(limits = c(0, 0.15)) +
scale_size(breaks = unique(tgc$N))
we do get:
Variant 2
To get evenly spaced data points on the x-axis, we can turn the weeks into a factor. This requires telling ggplot2 that the data belong to one group so that the lines are still plotted, and adding a custom x-axis label.
In addition, theme_bw is used instead of theme_gray:
library(ggplot2)
ggplot(tgc, aes(x = factor(Week, labels = c("Wk68", "Wk98", "Wk188")),
y = MuFreq, group = 1)) +
theme_bw(base_size = 18) +
theme(plot.title = element_text(hjust = .5),
axis.title = element_text(face = "bold")) +
geom_line() +
geom_point(aes(size = N), color = "dodgerblue1", alpha = 0.5) +
geom_errorbar(aes(ymin = MuFreq - (1.96 * se),
ymax = MuFreq + (1.96 * se)), width = 0.05 ) +
scale_y_continuous(limits = c(0, 0.15)) +
scale_size(breaks = unique(tgc$N)) +
xlab("Week")

Resources