I want to create an animated barplot with the gganimate package. The barplot should contain 4 bars, but only three of the bars should be shown at the same time. When a bar drops out and a new bar comes in, the animation should be smooth (as it is when two bars switch position within the plot).
Consider the following example:
# Set seed
set.seed(642)
# Create example data
df <- data.frame(ordering = c(rep(1:3, 2), 3:1, rep(1:3, 2)),
year = factor(sort(rep(2001:2005, 3))),
value = round(runif(15, 0, 100)),
group = c(letters[sample(1:4, 3)],
letters[sample(1:4, 3)],
letters[sample(1:4, 3)],
letters[sample(1:4, 3)],
letters[sample(1:4, 3)]))
# Load packages
library("gganimate")
library("ggplot2")
# Create animated ggplot
ggp <- ggplot(df, aes(x = ordering, y = value)) +
geom_bar(stat = "identity", aes(fill = group)) +
transition_states(year, transition_length = 2, state_length = 0)
ggp
If a bar is exchanged, the color of the bar just changes without any smooth animation (i.e. the new bar should fly in from the side and the replaced bar should fly out).
Question: How could I smoothen the replacement of bars?
I'm getting a little glitch at 2003 (b and c seem to swap upon transition), but hopefully this helps you get closer. I think enter_drift and exit_drift are what you're looking for.
library("gganimate")
library("ggplot2")
ggp <- ggplot(df, aes(x = ordering, y = value, group = group)) +
geom_bar(stat = "identity", aes(fill = group)) +
transition_states(year, transition_length = 2, state_length = 0) +
ease_aes('quadratic-in-out') + # Optional, I used to see settled states clearer
enter_drift(x_mod = -1) + exit_drift(x_mod = 1) +
labs(title = "Year {closest_state}")
animate(ggp, width = 600, height = 300, fps = 20)
Related
Is there any method to set scale = 'free_y' on the left hand (first) axis in ggplot2 and use a fixed axis on the right hand (second) axis?
I have a dataset where I need to use free scales for one variable and fixed for another but represent both on the same plot. To do so I'm trying to add a second, fixed, y-axis to my data. The problem is I cannot find any method to set a fixed scale for the 2nd axis and have that reflected in the facet grid.
This is the code I have so far to create the graph -
#plot weekly seizure date
p <- ggplot(dfspw_all, aes(x=WkYr, y=Seizures, group = 1)) + geom_line() +
xlab("Week Under Observation") + ggtitle("Average Seizures per Week - To Date") +
geom_line(data = dfsl_all, aes(x =WkYr, y = Sleep), color = 'green') +
scale_y_continuous(
# Features of the first axis
name = "Seizures",
# Add a second axis and specify its features
sec.axis = sec_axis(~.[0:20], name="Sleep")
)
p + facet_grid(vars(Name), scales = "free_y") +
theme(axis.ticks.x=element_blank(),axis.text.x = element_blank())
This is what it is producing (some details omitted from code for simplicity) -
What I need is for the scale on the left to remain "free" and the scale on the right to range from 0-24.
Secondary axes are implemented in ggplot2 as a decoration that is a transformation of the primary axis, so I don't know an elegant way to do this, since it would require the secondary axis formula to be aware of different scaling factors for each facet.
Here's a hacky approach where I scale each secondary series to its respective primary series, and then add some manual annotations for the secondary series. Another way might be to make the plots separately for each facet like here and use patchwork to combine them.
Given some fake data where the facets have different ranges for the primary series but the same range for the secondary series:
library(tidyverse)
fake <- tibble(facet = rep(1:3, each = 10),
x = rep(1:10, times = 3),
y_prim = (1+sin(x))*facet/2,
y_sec = (1 + sin(x*3))/2)
ggplot(fake, aes(x, y_prim)) +
geom_line() +
geom_line(aes(y= y_sec), color = "green") +
facet_wrap(~facet, ncol = 1)
...we could scale each secondary series to its primary series, and add custom annotations for that secondary series:
fake2 <- fake %>%
group_by(facet) %>%
mutate(y_sec_scaled = y_sec/max(y_sec) * (max(y_prim))) %>%
ungroup()
fake2_labels <- fake %>%
group_by(facet) %>%
summarize(max_prim = max(y_prim), baseline = 0, x_val = 10.5)
ggplot(fake2, aes(x, y_prim)) +
geom_line() +
geom_line(aes(y= y_sec_scaled), color = "green") +
facet_wrap(~facet, ncol = 1, scales = "free_y") +
geom_text(data = fake2_labels, aes(x = x_val, y = max_prim, label = "100%"),
hjust = 0, color = "green") +
geom_text(data = fake2_labels, aes(x = x_val, y = baseline, label = "0%"),
hjust = 0, color = "green") +
coord_cartesian(xlim = c(0, 10), clip = "off") +
theme(plot.margin = unit(c(1,3,1,1), "lines"))
I'm new to gganimate and was having difficulty figuring out how to do this.
I'd like to show the spread in two different levels of a variable by animating colour transitions. I want to show this by having the narrow level transition through a smaller range of colours than the wider level in the same amount of time. Is this possible?
Here's the reproducible example I have up-to now.
library(ggplot2)
df <- data.frame(x = rep(c("narrow", "wide"), each = 1000),
y = c(sort(runif(n = 1000, min = 6, max = 7)),
sort(runif(n = 1000, min = 3, 10))),
animate.time = c(1:1000))
ggplot(data = df, aes(x = x, fill = y)) +
geom_bar()
Created on 2021-06-12 by the reprex package (v2.0.0)
As part of this, the color scale that each bar uses should be the same, but the narrow x should occupy a narrower range of the color range than the wide x.
I used geom_bar because I don't want the shape of the visualisation to change, however I'm not sure how to proceed. I tried adding animate.time so that gganimate would have something to step through, but then I don't see how I'm supposed to make it change colors based on y.
I'd appreciate any pointers.
There is an easier way to do this based on this.
library(colorspace)
library(gganimate)
library(ggplot2)
# Narrow Visualization
narrow<-data.frame(x = rep(10,10),
state_cont=rep(1:2, each = 5))
ggplot(data = narrow, aes(x = x, fill = state_cont)) +
geom_bar() +
colorspace::scale_fill_continuous_sequential(palette = "Reds 3", begin = 0.4, end = .6) +
transition_states(states = state_cont) + theme_void() +
theme(legend.position = "none")
# Wide Visualization
wide <- data.frame(x=rep(10,10),
state_cont=rep(1:2, each = 5))
ggplot(data = wide, aes(x = x, fill = state_cont)) +
geom_bar() +
colorspace::scale_fill_continuous_sequential(palette = "Reds 3", begin = 0, end = 1) +
transition_states(states = state_cont) + theme_void() + theme(legend.position = "none")
The explanation is that for this particular question, you don't want to set every single value that it should oscillate through, it'd be much easier to simply set the boundaries in the scale_fill_continuous_sequential() via the begin and end arguments. Then, gganimate will automatically cycle through based on state_cont
I'm trying to use ggplot to create a bar plot (or histogram) to mirror the freq function from the descr package (where each discrete value in the variable gets its own column in the frequency plot, with the x-ais ticks centered around each value), but I'm having some trouble getting this to work.
Here is what I'm trying to create (but using ggplot so I can use its nice graphics):
library(ggplot2)
library(descr)
variable <- c(0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 3, 3, 4, 7)
df <- data.frame(variable)
freq(df$variable)
And here is my trying (and failing) to do the same in ggplot:
histo.variable <- ggplot(df, aes(x = variable)) + # create histogram
geom_bar(stat = "bin") +
xlab("Variable Value") +
ylab("Count") +
scale_x_continuous(breaks = scales::pretty_breaks(n = 10))
histo.variable
As you can see, the bars are not centered on the tick marks. (Additionally, it'd be great to get rid of the little half-lines in between the bars.)
Thanks to anyone who can help!
Maybe like this:
ggplot(df, aes(x = variable)) +
geom_histogram(aes(y = ..density..),
binwidth = 1,
colour = "blue", fill = "lightblue")
I'm trying to produce a facetted pie-chart with ggplot and facing problems with placing text in the middle of each slice:
dat = read.table(text = "Channel Volume Cnt
AGENT high 8344
AGENT medium 5448
AGENT low 23823
KIOSK high 19275
KIOSK medium 13554
KIOSK low 38293", header=TRUE)
vis = ggplot(data=dat, aes(x=factor(1), y=Cnt, fill=Volume)) +
geom_bar(stat="identity", position="fill") +
coord_polar(theta="y") +
facet_grid(Channel~.) +
geom_text(aes(x=factor(1), y=Cnt, label=Cnt, ymax=Cnt),
position=position_fill(width=1))
The output:
What parameters of geom_text should be adjusted in order to place numerical labels in the middle of piechart slices?
Related question is Pie plot getting its text on top of each other but it doesn't handle case with facet.
UPDATE: following Paul Hiemstra advice and approach in the question above I changed code as follows:
---> pie_text = dat$Cnt/2 + c(0,cumsum(dat$Cnt)[-length(dat$Cnt)])
vis = ggplot(data=dat, aes(x=factor(1), y=Cnt, fill=Volume)) +
geom_bar(stat="identity", position="fill") +
coord_polar(theta="y") +
facet_grid(Channel~.) +
geom_text(aes(x=factor(1),
---> y=pie_text,
label=Cnt, ymax=Cnt), position=position_fill(width=1))
As I expected tweaking text coordiantes is absolute but it needs be within facet data:
NEW ANSWER: With the introduction of ggplot2 v2.2.0, position_stack() can be used to position the labels without the need to calculate a position variable first. The following code will give you the same result as the old answer:
ggplot(data = dat, aes(x = "", y = Cnt, fill = Volume)) +
geom_bar(stat = "identity") +
geom_text(aes(label = Cnt), position = position_stack(vjust = 0.5)) +
coord_polar(theta = "y") +
facet_grid(Channel ~ ., scales = "free")
To remove "hollow" center, adapt the code to:
ggplot(data = dat, aes(x = 0, y = Cnt, fill = Volume)) +
geom_bar(stat = "identity") +
geom_text(aes(label = Cnt), position = position_stack(vjust = 0.5)) +
scale_x_continuous(expand = c(0,0)) +
coord_polar(theta = "y") +
facet_grid(Channel ~ ., scales = "free")
OLD ANSWER: The solution to this problem is creating a position variable, which can be done quite easily with base R or with the data.table, plyr or dplyr packages:
Step 1: Creating the position variable for each Channel
# with base R
dat$pos <- with(dat, ave(Cnt, Channel, FUN = function(x) cumsum(x) - 0.5*x))
# with the data.table package
library(data.table)
setDT(dat)
dat <- dat[, pos:=cumsum(Cnt)-0.5*Cnt, by="Channel"]
# with the plyr package
library(plyr)
dat <- ddply(dat, .(Channel), transform, pos=cumsum(Cnt)-0.5*Cnt)
# with the dplyr package
library(dplyr)
dat <- dat %>% group_by(Channel) %>% mutate(pos=cumsum(Cnt)-0.5*Cnt)
Step 2: Creating the facetted plot
library(ggplot2)
ggplot(data = dat) +
geom_bar(aes(x = "", y = Cnt, fill = Volume), stat = "identity") +
geom_text(aes(x = "", y = pos, label = Cnt)) +
coord_polar(theta = "y") +
facet_grid(Channel ~ ., scales = "free")
The result:
I would like to speak out against the conventional way of making pies in ggplot2, which is to draw a stacked barplot in polar coordinates. While I appreciate the mathematical elegance of that approach, it does cause all sorts of headaches when the plot doesn't look quite the way it's supposed to. In particular, precisely adjusting the size of the pie can be difficult. (If you don't know what I mean, try to make a pie chart that extends all the way to the edge of the plot panel.)
I prefer drawing pies in a normal cartesian coordinate system, using geom_arc_bar() from ggforce. It requires a little bit of extra work on the front end, because we have to calculate angles ourselves, but that's easy and the level of control we get as a result is more than worth it.
I've used this approach in previous answers here and here.
The data (from the question):
dat = read.table(text = "Channel Volume Cnt
AGENT high 8344
AGENT medium 5448
AGENT low 23823
KIOSK high 19275
KIOSK medium 13554
KIOSK low 38293", header=TRUE)
The pie-drawing code:
library(ggplot2)
library(ggforce)
library(dplyr)
# calculate the start and end angles for each pie
dat_pies <- left_join(dat,
dat %>%
group_by(Channel) %>%
summarize(Cnt_total = sum(Cnt))) %>%
group_by(Channel) %>%
mutate(end_angle = 2*pi*cumsum(Cnt)/Cnt_total, # ending angle for each pie slice
start_angle = lag(end_angle, default = 0), # starting angle for each pie slice
mid_angle = 0.5*(start_angle + end_angle)) # middle of each pie slice, for the text label
rpie = 1 # pie radius
rlabel = 0.6 * rpie # radius of the labels; a number slightly larger than 0.5 seems to work better,
# but 0.5 would place it exactly in the middle as the question asks for.
# draw the pies
ggplot(dat_pies) +
geom_arc_bar(aes(x0 = 0, y0 = 0, r0 = 0, r = rpie,
start = start_angle, end = end_angle, fill = Volume)) +
geom_text(aes(x = rlabel*sin(mid_angle), y = rlabel*cos(mid_angle), label = Cnt),
hjust = 0.5, vjust = 0.5) +
coord_fixed() +
scale_x_continuous(limits = c(-1, 1), name = "", breaks = NULL, labels = NULL) +
scale_y_continuous(limits = c(-1, 1), name = "", breaks = NULL, labels = NULL) +
facet_grid(Channel~.)
To show why I think this this approach is so much more powerful than the conventional (coord_polar()) approach, let's say we want the labels on the outside of the pie rather than inside. This creates a couple of problems, such as we will have to adjust hjust and vjust depending on the side of the pie a label falls, and also we will have to make the
plot panel wider than high to make space for the labels on the side without generating excessive space above and below. Solving these problems in the polar coordinate approach is not fun, but it's trivial in the cartesian coordinates:
# generate hjust and vjust settings depending on the quadrant into which each
# label falls
dat_pies <- mutate(dat_pies,
hjust = ifelse(mid_angle>pi, 1, 0),
vjust = ifelse(mid_angle<pi/2 | mid_angle>3*pi/2, 0, 1))
rlabel = 1.05 * rpie # now we place labels outside of the pies
ggplot(dat_pies) +
geom_arc_bar(aes(x0 = 0, y0 = 0, r0 = 0, r = rpie,
start = start_angle, end = end_angle, fill = Volume)) +
geom_text(aes(x = rlabel*sin(mid_angle), y = rlabel*cos(mid_angle), label = Cnt,
hjust = hjust, vjust = vjust)) +
coord_fixed() +
scale_x_continuous(limits = c(-1.5, 1.4), name = "", breaks = NULL, labels = NULL) +
scale_y_continuous(limits = c(-1, 1), name = "", breaks = NULL, labels = NULL) +
facet_grid(Channel~.)
To tweak the position of the label text relative to the coordinate, you can use the vjust and hjust arguments of geom_text. This will determine the position of all labels simultaneously, so this might not be what you need.
Alternatively, you could tweak the coordinate of the label. Define a new data.frame where you average the Cnt coordinate (label_x[i] = Cnt[i+1] + Cnt[i]) to position the label in the center of that particular pie. Just pass this new data.frame to geom_text in replacement of the original data.frame.
In addition, piecharts have some visual interpretation flaws. In general I would not use them, especially where good alternatives exist, e.g. a dotplot:
ggplot(dat, aes(x = Cnt, y = Volume)) +
geom_point() +
facet_wrap(~ Channel, ncol = 1)
For example, from this plot it is obvious that Cnt is higher for Kiosk than for Agent, this information is lost in the piechart.
Following answer is partial, clunky and I won't accept it.
The hope is that it will solicit better solution.
text_KIOSK = dat$Cnt
text_AGENT = dat$Cnt
text_KIOSK[dat$Channel=='AGENT'] = 0
text_AGENT[dat$Channel=='KIOSK'] = 0
text_KIOSK = text_KIOSK/1.7 + c(0,cumsum(text_KIOSK)[-length(dat$Cnt)])
text_AGENT = text_AGENT/1.7 + c(0,cumsum(text_AGENT)[-length(dat$Cnt)])
text_KIOSK[dat$Channel=='AGENT'] = 0
text_AGENT[dat$Channel=='KIOSK'] = 0
pie_text = text_KIOSK + text_AGENT
vis = ggplot(data=dat, aes(x=factor(1), y=Cnt, fill=Volume)) +
geom_bar(stat="identity", position=position_fill(width=1)) +
coord_polar(theta="y") +
facet_grid(Channel~.) +
geom_text(aes(y=pie_text, label=format(Cnt,format="d",big.mark=','), ymax=Inf), position=position_fill(width=1))
It produces following chart:
As you noticed I can't move labels for green (low).
I have a dataframe in R like this:
dat = data.frame(Sample = c(1,1,2,2,3), Start = c(100,300,150,200,160), Stop = c(180,320,190,220,170))
And I would like to plot it such that the x-axis is the position and the y-axis is the number of samples at that position, with each sample in a different colour. So in the above example you would have some positions with height 1, some with height 2 and one area with height 3. The aim being to find regions where there are a large number of samples and what samples are in that region.
i.e. something like:
&
---
********- -- **
where * = Sample 1, - = Sample 2 and & = Sample 3
My first try:
dat$Sample = factor(dat$Sample)
ggplot(aes(x = Start, y = Sample, xend = Stop, yend = Sample, color = Sample), data = dat) +
geom_segment(size = 2) +
geom_segment(aes(x = Start, y = 0, xend = Stop, yend = 0), size = 2, alpha = 0.2, color = "black")
I combine two segment geometries here. One draws the colored vertical bars. These show where Samples have been measured. The second geometry draws the grey bar below where the density of the samples is shown. Any comments to improve on this quick hack?
This hack may be what you're looking for, however I've greatly increased the size of the dataframe in order to take advantage of stacking by geom_histogram.
library(ggplot2)
dat = data.frame(Sample = c(1,1,2,2,3),
Start = c(100,300,150,200,160),
Stop = c(180,320,190,220,170))
# Reformat the data for plotting with geom_histogram.
dat2 = matrix(ncol=2, nrow=0, dimnames=list(NULL, c("Sample", "Position")))
for (i in seq(nrow(dat))) {
Position = seq(dat[i, "Start"], dat[i, "Stop"])
Sample = rep(dat[i, "Sample"], length(Position))
dat2 = rbind(dat2, cbind(Sample, Position))
}
dat2 = as.data.frame(dat2)
dat2$Sample = factor(dat2$Sample)
plot_1 = ggplot(dat2, aes(x=Position, fill=Sample)) +
theme_bw() +
opts(panel.grid.minor=theme_blank(), panel.grid.major=theme_blank()) +
geom_hline(yintercept=seq(0, 20), colour="grey80", size=0.15) +
geom_hline(yintercept=3, linetype=2) +
geom_histogram(binwidth=1) +
ylim(c(0, 20)) +
ylab("Count") +
opts(axis.title.x=theme_text(size=11, vjust=0.5)) +
opts(axis.title.y=theme_text(size=11, angle=90)) +
opts(title="Segment Plot")
png("plot_1.png", height=200, width=650)
print(plot_1)
dev.off()
Note that the way I've reformatted the dataframe is a bit ugly, and will not scale well (e.g. if you have millions of segments and/or large start and stop positions).