Gradually change colour of polygon with gganimate - r

I'm trying to create an animation where an object is drawn whilst gradually changing its colour. For illustrative purposes assume that the object I want to draw is a circle.
What I have so far
I create a dataset that contains the coordinates of my circle, a time variable and an alpha variable for gradually changing the colour:
t = seq(-pi, pi, length.out = 30)
df <- data.frame(
x = sin(t),
y = cos(t),
time = 1:30,
alpha = seq(0.1, 1, length.out = 30)
)
Then I use transition_reveal() from {gganimate} to make the animation:
library(ggplot2)
library(gganimate)
ggplot(df, aes(x = x, y = y)) +
geom_path() +
geom_polygon(aes(alpha = alpha), fill = 'grey30') +
coord_fixed() +
scale_alpha_identity() +
transition_reveal(time)
Which yields the following animation:
The circle is drawn sequentially (which is what I want), but the colour of the polygon does not change. It appears that only the first value of df$alpha is used for the whole animation.
My question is thus how can I change this animation such that the circle is still drawn sequentially AND the colour of the polygon gradually becomes greyer?

There may be an easier way to do this, but with a bit of data manipulation you can do it via transition_states:
df$frame <- 1
df <- do.call(rbind, lapply(seq(nrow(df)), function(i) {
df$frame <- i
df$alpha <- seq(0.1, 1, length = 30)[i]
df[1:i,]
}
))
df <- df[df$frame > 2,]
ggplot(df, aes(x = x, y = y)) +
geom_path() +
geom_polygon(aes(alpha = alpha), fill = 'grey30') +
coord_fixed() +
scale_alpha_identity() +
transition_states(frame)
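The reshaping step above can also be wrapped in a small function. This is only a sketch in base R: `expand_frames` and its arguments are hypothetical names, not part of {gganimate}. It builds one cumulative slice of the data per frame, gives every row in a frame the same alpha, and drops frames with fewer than three points, since a polygon needs at least three vertices.

```r
# Hypothetical helper: one cumulative slice of `df` per animation frame.
expand_frames <- function(df, alpha_from = 0.1, alpha_to = 1, min_points = 3) {
  n <- nrow(df)
  alphas <- seq(alpha_from, alpha_to, length.out = n)
  frames <- lapply(seq_len(n), function(i) {
    slice <- df[seq_len(i), , drop = FALSE]  # first i points of the outline
    slice$frame <- i                         # frame index for transition_states()
    slice$alpha <- alphas[i]                 # one alpha shared by the whole frame
    slice
  })
  out <- do.call(rbind, frames)
  # A polygon needs at least three vertices, so drop the shortest frames
  out[out$frame >= min_points, ]
}

# Small illustration with 5 points instead of 30
t <- seq(-pi, pi, length.out = 5)
small <- data.frame(x = sin(t), y = cos(t))
expanded <- expand_frames(small)
table(expanded$frame)
```

The result can be passed to the same ggplot() call as above, with transition_states(frame).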

Related

r ggplot when two colors overlap

I have some code to generate a plot; the only problem is that there are many overlapping colors.
When two colors overlap, how do I specify the dominant color?
For example, there are 4 black points where indicator = threshold, one at each of the 4 x-axis categories. However, the black points at the "Wire" and "ACH" categories do not show up because they overlap with blue points, and the black point at "RDFI" barely shows up. How can I make black the dominant color when two colors overlap? Thanks in advance!
ggplot(df, aes(a-axis, y-axis), color=indicator)) +
geom_quasirandom(groupOnX=TRUE, na.rm = TRUE) +
labs(title= 'chart', x='x-axis', y= 'y-axis') +
scale_color_manual(name = 'indicator', values=c("#99ccff","#000000" ))
To specify the dominant color, you can use the function new_scale() and its aliases new_scale_color() and new_scale_fill() from {ggnewscale}; layers added later are drawn on top.
As an example, let's overlay some measurements on a contour map of topography using the beloved volcano dataset:
library(ggplot2)
library(ggnewscale)
# Equivalent to melt(volcano)
topography <- expand.grid(x = 1:nrow(volcano),
y = 1:ncol(volcano))
topography$z <- c(volcano)
# point measurements of something at a few locations
set.seed(42)
measurements <- data.frame(x = runif(30, 1, 80),
y = runif(30, 1, 60),
thing = rnorm(30))
dominant point:
ggplot(mapping = aes(x, y)) +
geom_contour(data = topography, aes(z = z, color = stat(level))) +
# Color scale for topography
scale_color_viridis_c(option = "D") +
# geoms below will use another color scale
new_scale_color() +
geom_point(data = measurements, size = 3, aes(color = thing)) +
# Color scale applied to geoms added after new_scale_color()
scale_color_viridis_c(option = "A")
dominant contour:
ggplot(mapping = aes(x, y)) +
geom_point(data = measurements, size = 3, aes(color = thing)) +
scale_color_viridis_c(option = "A")+
new_scale_color() +
geom_contour(data = topography, aes(z = z, color = stat(level))) +
scale_color_viridis_c(option = "D")
Your problem may not lie with which color is dominant. You have selected colors that will show up often. You may be losing the bottom of your y-axis. The code in your example cannot possibly have produced that plot; it has errors.
Here is a simple example that shows one way to overcome your problem: overplot the threshold points after you have plotted the beeswarm.
library(dplyr)
library(ggbeeswarm)
distro <- data.frame(
'variable'=rep(c('runif','rnorm'),each=1000),
'value'=c(runif(2000, min=-3, max=3))
)
distro$indicator <- "NA"
distro[3,3] <- "Threshhold"
distro[163,3] <- "Threshhold"
ggplot2::ggplot(distro,aes(variable, value, color=indicator)) +
geom_quasirandom(groupOnX=TRUE, na.rm = TRUE, width=0.1) +
scale_color_manual(name = 'indicator', values=c("#99ccff","#000000")) +
geom_point(data = distro %>% filter(indicator == "Threshhold"))
You sort your data based on the color variable (your indicator).
Basically you want your black dots to be plotted last = on top of the other ones.
# Reorder whole rows so each point keeps its own indicator value
df <- df[order(df$indicator, decreasing = TRUE), ]
#Tidyverse solution
df <- df %>% arrange(desc(indicator))
Depending on your levels, you may or may not have to reverse the sort order.
Then you just plot.
pd <- tibble(x=rnorm(1000), y=1, indicator=sample(c("A","B"), replace=T, size = 1000))
ggplot(pd, aes(x=x,y=y,color=indicator)) + geom_point()
pd <- pd %>% arrange(indicator)
ggplot(pd, aes(x=x,y=y,color=indicator)) + geom_point()
pd <- pd %>% arrange(desc(indicator))
ggplot(pd, aes(x=x,y=y,color=indicator)) + geom_point()
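If sorting alphabetically does not happen to put the important level last, you can order on a logical test instead. A minimal base R sketch, assuming the level you want on top is called "threshold" (the data frame `pts` is made up for illustration):

```r
set.seed(1)
pts <- data.frame(
  x = rnorm(10),
  indicator = c("threshold", rep("other", 8), "threshold")
)

# order() on a logical puts FALSE before TRUE, so the threshold rows
# move to the end of the data frame and are therefore drawn last
pts_sorted <- pts[order(pts$indicator == "threshold"), ]
tail(pts_sorted$indicator, 2)
```

Because whole rows are reordered, every x value stays paired with its own indicator.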

Transforming the y-axis without changing raw data in ggplot2

I have a question about how to transform the y-axis in ggplot2. My plot now has two lines and a scatter plot. For the scatter plot, I am very interested in the area around zero. Is there a possible way to enlarge the space between 0% and 5% and narrow the space between 20% and 30%?
I have tried to use coord_trans(y = "log10") to transform into a log form. But in this case, I have a lot of negative values, so if I want to use sqrt or log, the negative values will be removed. Do you have any suggestions?
Example of data points:
df1 = data.frame(y = runif(200,min = -1, max = 1))
df1 = data.frame( x= seq(1:200), y = df1[order(abs(df1$y)),])
ggplot(df1) +
geom_point(colour = "black",aes(x,y) ,size = 0.1)
I want to have more space between 0% and 5% and less space between 5% and 30%.
I have tried to use trans_new() to transform the axes.
eps <- 1e-8
tn <- trans_new("logpeps",
function(x) (x+eps)^(3),
function(y) ((y)^(1/3) ),
domain=c(- Inf, Inf)
)
ggplot(df1)+ geom_point(colour = "black",aes(x,y) ,size = 0.1) +
# xlab("Observations sorted by PD in v3.1") + ylab("Absolute PD difference ") +
# ggtitle("Absolute PD for RiskCalc v4.0 relative to v3.1") +
scale_x_continuous(breaks = seq(0, round(rownum/1000)*1000, by = round(rownum/100)*10)) +
scale_y_continuous(limits = c(-yrange,yrange),breaks = c(-breaksY,breaksY),
sec.axis = sec_axis(~.,breaks = c(-breaksY[2:length(breaksY)],breaksY), labels = scales:: percent
)) +
# geom_line(data = df, aes(x,y[,3], colour = "blue"),size = 1) +
# geom_line(data = ds,aes(xval, yval,colour = "red"),size = 1) +
coord_trans(y = tn) +
scale_color_discrete(name = element_blank())
But it compresses the plot towards the center, which is the opposite of what I want. I then tried y = y^3, but it shows an
ERROR: zero_range(range)
Try a cube root transform on the y values:
aes(y=yVariable^(1/3))
or use trans_new() to define a new transformation (such as cube root, with pleasing breaks and labels).
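One caveat with a plain ^(1/3): in R a negative number raised to a fractional power gives NaN, so with negative y values a signed cube root is safer. Below is a rough sketch, not a definitive implementation. The names `signed_cbrt`, `signed_cube` and `cbrt_trans` and the break positions are my own choices, and the plotting part assumes {scales} and {ggplot2} are installed.

```r
# Signed cube root: keeps the sign, so negative values do not become NaN
signed_cbrt <- function(x) sign(x) * abs(x)^(1/3)
# Inverse of the signed cube root
signed_cube <- function(x) x^3

signed_cbrt(c(-0.027, 0, 0.001, 0.125))

# Optional: wrap the pair in a transformation object for a y scale.
# Only runs if scales and ggplot2 are available.
if (requireNamespace("scales", quietly = TRUE) &&
    requireNamespace("ggplot2", quietly = TRUE)) {
  cbrt_trans <- scales::trans_new("signed_cbrt", signed_cbrt, signed_cube,
                                  domain = c(-Inf, Inf))
  df1 <- data.frame(x = 1:200, y = sort(runif(200, min = -1, max = 1)))
  p <- ggplot2::ggplot(df1, ggplot2::aes(x, y)) +
    ggplot2::geom_point(size = 0.1) +
    ggplot2::scale_y_continuous(trans = cbrt_trans,
                                breaks = c(-1, -0.3, -0.05, 0, 0.05, 0.3, 1))
}
```

The cube root is steep near zero, so the region around 0% gets stretched and the outer region gets squeezed, which is the direction asked for.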
A couple thoughts:
You can remove the empty edges of the plot like so:
scale_y_continuous(expand = c(0,0))
If you want to try the log transformation, just do:
scale_y_log10()
If you want to focus the window:
scale_y_continuous(limits=c(-.15,.15), expand=c(0,0))
Also consider adding theme_bw() for a cleaner look.

Know proportions of ggplot2 plot

I usually save plots from ggplot2 using the png device. The width and height of the output are set by the arguments of the function. Blank zones are drawn when the "natural proportions" of the graph don't suit the proportions of the device. To avoid this and use the whole canvas, the proportions of the plot must be known. Is there a way to find this value without trial and error?
This code can be used as an example:
x <- seq(from = 0, to = 1, by = 0.1)
y <- seq(from = 1, to = 2, by = 0.1)
df <- expand.grid(x = x, y = y)
df <- cbind(df, z = rnorm(nrow(df), 0, 1))
p <- ggplot(df, aes(x,y, fill = z)) + geom_raster() + coord_fixed()
ppi <- 300
# 0.4 approximates 1/2.54 and converts centimetres into inches
png("plot.png", width = 16*0.4*ppi, height = 20*0.4*ppi, res = ppi)
print(p)
dev.off()
It can be seen that some blank space is added at the top and at the bottom to fill the png file. This could be easily corrected by using a proportion different from 20/16, which is not optimal.
You can modify the ratio arg inside coord_fixed():
p <- ggplot(df, aes(x,y, fill = z)) +
geom_raster() +
coord_fixed(ratio = 20/16)
Alternatively, you can specify the aspect.ratio inside theme():
p <- ggplot(df, aes(x,y, fill = z)) +
geom_raster() +
theme(aspect.ratio = 20/16)
The result is the same:
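If you would rather size the device to fit the plot, you can estimate the panel's proportions from the data ranges. A rough base R sketch, assuming coord_fixed() with its default ratio of 1; `panel_aspect` is a hypothetical helper name, and it only describes the panel itself, so axes, titles and the legend still need some extra room:

```r
# Height / width of the plotting panel under coord_fixed(ratio = ratio)
panel_aspect <- function(x, y, ratio = 1) {
  ratio * diff(range(y)) / diff(range(x))
}

x <- seq(from = 0, to = 1, by = 0.1)
y <- seq(from = 1, to = 2, by = 0.1)
asp <- panel_aspect(x, y)

# Starting point for the device size: fix the width, derive the height
width_in  <- 16 / 2.54          # 16 cm expressed in inches
height_in <- width_in * asp
c(width = width_in, height = height_in)
```

For the example data both ranges have the same length, so the panel is square and a 16 x 20 device will always leave blank bands.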

Create a colour blind test with ggplot

I would like to create a colour blind test, similar to that below, using ggplot.
The basic idea is to use geom_hex (or perhaps a voronoi diagram, or possibly even circles as in the figure above) as the starting point, and define a dataframe that, when plotted in ggplot, produces the image.
We would start by creating a dataset, such as:
df <- data.frame(x = rnorm(10000), y = rnorm(10000))
then plot this:
ggplot(df, aes(x, y)) +
geom_hex() +
coord_equal() +
scale_fill_gradient(low = "red", high = "green", guide = FALSE) +
theme_void()
which gives the image below:
The main missing step is to create a dataset that actually plots a meaningful symbol (letter or number), and I'm not sure how best to go about this without painstakingly mapping the coordinates. Ideally one would be able to read in the coordinates perhaps from an image file.
Finally, a bit of tidying up could round the plot edges by removing the outlying points.
All suggestions are very welcome!
EDIT
Getting a little closer to what I'm after, we can use the image below of the letter 'e':
Using the imager package, we can read this in and convert it to a dataframe:
img <- imager::load.image("e.png")
df <- as.data.frame(img)
then plot that dataframe using geom_raster:
ggplot(df, aes(x, y)) +
geom_raster(aes(fill = value)) +
coord_equal() +
scale_y_continuous(trans = scales::reverse_trans()) +
scale_fill_gradient(low = "red", high = "green", guide = FALSE) +
theme_void()
If we use geom_hex instead of geom_raster, we can get the following plot:
ggplot(df %>% filter(value %in% 1), aes(x, y)) +
geom_hex() +
coord_equal() +
scale_y_continuous(trans = scales::reverse_trans()) +
scale_fill_gradient(low = "red", high = "green", guide = FALSE) +
theme_void()
so, getting there but clearly still a long way off...
Here's an approach for creating this plot:
Packages you need:
library(tidyverse)
library(packcircles)
Get image into a 2D matrix (x and y coordinates) of values. To do this, I downloaded the .png file of the e as "e.png" and saved in my working directory. Then some processing:
img <- png::readPNG("e.png")
# From http://stackoverflow.com/questions/16496210/rotate-a-matrix-in-r
rotate <- function(x) t(apply(x, 2, rev))
# Convert to one colour layer and rotate it to be in right direction
img <- rotate(img[,,1])
# Check that matrix makes sense:
image(img)
Next, create a whole lot of circles! I did this based on this post.
# Create random "circles"
# *** THESE VALUES MAY NEED ADJUSTING
ncircles <- 1200
offset <- 100
rmax <- 80
x_limits <- c(-offset, ncol(img) + offset)
y_limits <- c(-offset, nrow(img) + offset)
xyr <- data.frame(
x = runif(ncircles, min(x_limits), max(x_limits)),
y = runif(ncircles, min(y_limits), max(y_limits)),
r = rbeta(ncircles, 1, 10) * rmax)
# Find non-overlapping arrangement
res <- circleLayout(xyr, x_limits, y_limits, maxiter = 1000)
cat(res$niter, "iterations performed")
#> 1000 iterations performed
# Convert to data for plotting (just circles for now)
plot_d <- circlePlotData(res$layout)
# Check circle arrangement
ggplot(plot_d) +
geom_polygon(aes(x, y, group=id), colour = "white", fill = "skyblue") +
coord_fixed() +
theme_minimal()
Finally, interpolate the image pixel values for the centre of each circle. This will indicate whether a circle is centered over the shape or not. Add some noise to get variance in colour and plot.
# Get x,y positions of centre of each circle
circle_positions <- plot_d %>%
group_by(id) %>%
summarise(x = min(x) + (diff(range(x)) / 2),
y = min(y) + (diff(range(y)) / 2))
# Interpolate on original image to get z value for each circle
circle_positions <- circle_positions %>%
mutate(
z = fields::interp.surface(
list(x = seq(nrow(img)), y = seq(ncol(img)), z = img),
as.matrix(.[, c("x", "y")])),
z = ifelse(is.na(z), 1, round(z)) # 1 is the "empty" area shown earlier
)
# Add a little noise to the z values
set.seed(070516)
circle_positions <- circle_positions %>%
mutate(z = z + rnorm(n(), sd = .1))
# Bind z value to data for plotting and use as fill
plot_d %>%
left_join(select(circle_positions, id, z)) %>%
ggplot(aes(x, y, group = id, fill = z)) +
geom_polygon(colour = "white", show.legend = FALSE) +
scale_fill_gradient(low = "#008000", high = "#ff4040") +
coord_fixed() +
theme_void()
#> Joining, by = "id"
To get the colours right, tweak them in scale_fill_gradient.

Plotting overlapping positions in R

I have a dataframe in R like this:
dat = data.frame(Sample = c(1,1,2,2,3), Start = c(100,300,150,200,160), Stop = c(180,320,190,220,170))
And I would like to plot it such that the x-axis is the position and the y-axis is the number of samples at that position, with each sample in a different colour. So in the above example you would have some positions with height 1, some with height 2 and one area with height 3. The aim being to find regions where there are a large number of samples and what samples are in that region.
i.e. something like:
&
---
********- -- **
where * = Sample 1, - = Sample 2 and & = Sample 3
My first try:
dat$Sample = factor(dat$Sample)
ggplot(aes(x = Start, y = Sample, xend = Stop, yend = Sample, color = Sample), data = dat) +
geom_segment(size = 2) +
geom_segment(aes(x = Start, y = 0, xend = Stop, yend = 0), size = 2, alpha = 0.2, color = "black")
I combine two segment geometries here. One draws the colored bars, one row per sample. These show where Samples have been measured. The second geometry draws the grey bar below, where the density of the samples is shown. Any comments to improve on this quick hack?
This hack may be what you're looking for, however I've greatly increased the size of the dataframe in order to take advantage of stacking by geom_histogram.
library(ggplot2)
dat = data.frame(Sample = c(1,1,2,2,3),
Start = c(100,300,150,200,160),
Stop = c(180,320,190,220,170))
# Reformat the data for plotting with geom_histogram.
dat2 = matrix(ncol=2, nrow=0, dimnames=list(NULL, c("Sample", "Position")))
for (i in seq(nrow(dat))) {
Position = seq(dat[i, "Start"], dat[i, "Stop"])
Sample = rep(dat[i, "Sample"], length(Position))
dat2 = rbind(dat2, cbind(Sample, Position))
}
dat2 = as.data.frame(dat2)
dat2$Sample = factor(dat2$Sample)
plot_1 = ggplot(dat2, aes(x=Position, fill=Sample)) +
theme_bw() +
# Hide the default grid lines; the hlines below act as a custom grid
theme(panel.grid.minor=element_blank(), panel.grid.major=element_blank()) +
geom_hline(yintercept=seq(0, 20), colour="grey80", size=0.15) +
geom_hline(yintercept=3, linetype=2) +
geom_histogram(binwidth=1) +
ylim(c(0, 20)) +
ylab("Count") +
theme(axis.title.x=element_text(size=11, vjust=0.5)) +
theme(axis.title.y=element_text(size=11, angle=90)) +
ggtitle("Segment Plot")
png("plot_1.png", height=200, width=650)
print(plot_1)
dev.off()
Note that the way I've reformatted the dataframe is a bit ugly, and will not scale well (e.g. if you have millions of segments and/or large start and stop positions).
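If you only need the total depth at each position (not which samples contribute), a sweep over start/stop events avoids expanding every segment into single positions. This is a sketch in base R under that assumption; `events`, `steps` and `depth_at` are names I made up for illustration.

```r
dat <- data.frame(Sample = c(1, 1, 2, 2, 3),
                  Start  = c(100, 300, 150, 200, 160),
                  Stop   = c(180, 320, 190, 220, 170))

# +1 where a segment starts, -1 just past where it ends (Stop is inclusive)
events <- data.frame(pos   = c(dat$Start, dat$Stop + 1),
                     delta = c(rep(1, nrow(dat)), rep(-1, nrow(dat))))

# Sum the deltas at identical positions (aggregate() sorts by pos),
# then accumulate along the axis to get the depth after each change point
steps <- aggregate(delta ~ pos, data = events, FUN = sum)
steps$depth <- cumsum(steps$delta)

# Depth at arbitrary positions: the last change point at or before each one
depth_at <- function(p) {
  i <- findInterval(p, steps$pos)
  ifelse(i == 0, 0, steps$depth[pmax(i, 1)])
}
depth_at(c(120, 165, 250))
```

The `steps` table has one row per change point rather than one row per position, so its size depends on the number of segments and not on how long they are. It could be drawn with geom_step() if a depth profile is all you need.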
