I have the following data frame:
# Seed RNG
set.seed(33550336)
# Create data frame
df <- data.frame(x = runif(100),
y = runif(100),
t = runif(100, min = 0, max = 10))
I'd like to plot points (i.e., at x and y coordinates) appearing and disappearing as a function of t. gganimate is awesome, so I used that.
# Load libraries
library(gganimate)
library(ggplot2)
# Create animation
g <- ggplot(df, aes(x = x, y = y))
g <- g + geom_point(colour = "#FF3300", shape = 19, size = 5, alpha = 0.25)
g <- g + labs(title = 'Time: {frame_time}')
g <- g + transition_time(t)
g <- g + enter_fade() + exit_fade()
animate(g, fps = 1)
This code produced the following:
There are a couple of things that I don't like about this.
The transitions are very abrupt. My hope using enter_fade and exit_fade was that the points would fade into view, then back out. Clearly this isn't the case, but how would I achieve this result?
I would like to round {frame_time}, so that while the points fade in and out at fractions of t, the actual time t that would be shown would be an integer. If frame_time was a regular variable, this would be simple enough using something like bquote and round, but this doesn't seem to be the case. How can I round frame_time in my title?
Here's a relatively manual approach that relies on doing more of the prep beforehand and feeding that into gganimate. I'd like to see if there's a simpler way to do this inside gganimate more automatically.
First I make a copy of the data frame for each frame I want to show. Then I calculate the difference between the time I'm presently viewing (time) and the t at which I want to show each data point. I use cos to handle the easing in and out, so that each dot's appearance at a given time is described by display. In the ggplot call, I then map alpha and size to display, and use transition_time(time) to move through the frames.
# Load libraries for map_df(), mutate(), if_else() and %>%
library(purrr)
library(dplyr)
# Create prep table
fade_time = 1
frame_count = 100
frames_per_time = 10
df2 <- map_df(seq_len(frame_count), ~df, .id = "time") %>%
mutate(time = as.numeric(time)/frames_per_time,
delta_norm = (t - time) / fade_time,
display = if_else(abs(delta_norm) > 1, 0, cos(pi / 2 * delta_norm)))
# Create animation
g <- ggplot(df2, aes(x = x, y = y, alpha = display, size = display))
g <- g + geom_point(colour = "#FF3300", shape = 19)
g <- g + scale_alpha(range = c(0, 1)) + scale_size_area(max_size = 5)
g <- g + labs(title = "{round(frame_time, 1)}")
g <- g + transition_time(time)
animate(g)
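The cosine easing can be checked on its own before it goes into the prep table. This is a minimal base R sketch; the helper name fade_weight is made up for illustration, and it only mirrors the display formula used above.

```r
# Cosine fade weight (hypothetical helper mirroring the `display` column):
# 1 when the frame time equals the point's time t, falling to 0 once the
# two are fade_time or more apart.
fade_weight <- function(t, time, fade_time = 1) {
  delta_norm <- (t - time) / fade_time
  ifelse(abs(delta_norm) > 1, 0, cos(pi / 2 * delta_norm))
}

# A point scheduled for t = 5, viewed at several frame times
frame_times <- c(3, 4, 4.5, 5, 5.5, 6, 7)
round(fade_weight(5, frame_times), 3)
```

A larger fade_time stretches the fade over more frames, which is what makes the points appear less abruptly.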
I would like to use gganimate to:
Graph two separate curved lines with geom_path
Call a function that performs a calculation with the data from those lines and returns a single coordinate (x, y)
Plot that coordinate as a geom_point
Move the lines around, with the geom_point updating as the lines move
This is simple if the movement is such that the single (x, y) coordinate moves linearly (just calculate it at each stage ahead of time and then animate it, it will move linearly from each stage to the next), but if it's not I'm not sure what to do. If I call a function within aes(), which seems like the natural solution, it seems to calculate it once at the beginning and then not update it as the lines move.
Here is an example.
library(tidyverse)
library(gganimate)
# A function to find the x and y coordinate of the minimum y value of either set
min_of_both <- function(x1, y1, x2, y2) {
cm <- bind_rows(tibble(x = x1, y = y1),
tibble(x = x2, y = y2))
return(list(x = cm[which(cm$y == min(cm$y)),]$x,
y = min(cm$y)))
}
# Create two parabola paths, curve A which moves downwards from t = 1 to t = 2
curveA <- tibble(xA = -50:50/10, yA = 5+(-50:50/10)^2, t = 1) %>%
bind_rows(tibble(xA = -50:50/10, yA = -10 + (-50:50/10)^2, t = 2))
# And curve B which is static in both time 1 and 2
curveB <- tibble(xB = -50:50/10, yB = 1 + (-30:70/10)^2)
data <- curveB %>%
bind_rows(curveB) %>%
bind_cols(curveA)
# Plot Curve A
p <- ggplot(data, aes(x = xA, y = yA)) +
geom_path(color = 'red') +
# And Curve B
geom_path(aes(x=xB,y=yB), color = 'blue')+
# Then plot a single point that uses both curves as input
# Note I also get problems if trying to run the function through data= instead of mapping=
# or if I define two separate functions, one for x and one for y, so as to avoid $
geom_point(aes(
x = min_of_both(xA,yA,xB,yB)$x,
y = min_of_both(xA,yA,xB,yB)$y),
size = 3,
color = 'black') +
theme_minimal()+
transition_states(t)+
ease_aes('sine-in-out')
animate(p)
This results in (not sure if the animation will play on StackOverflow but the parabola does indeed move):
The black dot is intended to mark the lowest y-coordinate on either parabola at each moment, but instead it marks the lowest y-coordinate on either parabola at any point in the animation (at the end).
Any tips appreciated.
After a lot of head-scratching I think I've understood your point and have found one solution. The best way forward might be to manually tween the paths and calculate the min values using your function whilst grouping by .frame before plotting:
# tween_along() comes from the tweenr package
library(tweenr)
# Same curve setup, but labelling points for grouping later
curveA <- tibble(xA = -50:50/10,
yA = 5+(-50:50/10)^2,
point = 1:101,
t = 1) %>%
bind_rows(tibble(xA = -50:50/10,
yA = -10 + (-50:50/10)^2,
point = 1:101,
t = 2))
curveB <- tibble(xB = -50:50/10,
yB = 1 + (-30:70/10)^2,
point = 1:101,
t = 1)
A_frames <- curveA %>%
tween_along(ease = 'sine-in-out', 100, along = t, id = point) %>%
filter(.phase == "transition") %>%
select(xA, yA, point, .frame) %>%
arrange(.frame, point) # arrange by point needed to keep in order
B_frames <- curveB %>%
bind_rows(curveB %>% mutate(t = 2)) %>%
tween_along(ease = 'sine-in-out', 100, along = t, id = point) %>%
filter(.phase == "transition") %>%
select(xB, yB, point, .frame) %>%
arrange(.frame, point)
data <- A_frames %>%
left_join(B_frames, by = c(".frame", "point")) %>%
group_by(.frame) %>%
mutate(xmin = min_of_both(xA,yA,xB,yB)$x,
ymin = min_of_both(xA,yA,xB,yB)$y)
# Plot Curve A
p <- ggplot(data, aes(x = xA, y = yA)) +
geom_path(color = 'red') +
# And Curve B
geom_path(aes(x=xB,y=yB), color = 'blue')+
# Then plot the per-frame minimum that was precomputed in xmin and ymin
geom_point(aes(xmin, ymin),
size = 3,
color = 'black') +
theme_minimal()+
transition_states(.frame)+
ease_aes('sine-in-out')
animate(p, fps = 24)
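The idea of recomputing the minimum for every frame can be seen without gganimate or tweenr. The following base R sketch assumes a simple linear interpolation of curve A's offset (the answer above uses sine easing), and the names offsets and lowest are made up for illustration.

```r
# Shared x grid, as in the question
x <- (-50:50) / 10
# Curve B is static: same values as 1 + (-30:70/10)^2
curve_b <- 1 + (x + 2)^2

# Assumption: curve A's vertical offset moves linearly from 5 to -10
n_frames <- 5
offsets <- seq(5, -10, length.out = n_frames)

# For each frame, rebuild curve A and find the lowest point on either curve
lowest <- do.call(rbind, lapply(seq_len(n_frames), function(i) {
  curve_a <- offsets[i] + x^2
  xs <- c(x, x)
  ys <- c(curve_a, curve_b)
  k <- which.min(ys)  # first index of the overall minimum
  data.frame(frame = i, xmin = xs[k], ymin = ys[k])
}))
lowest
```

In the early frames the minimum sits on curve B at x = -2; once curve A has dropped below y = 1 it jumps to x = 0, which is why the point cannot be tweened linearly between two precomputed states.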
Problem:
1.) I have a shapefile that looks like this:
Extreme values for coordinates are: xmin = 300,000, xmax = 620,000, ymin = 31,000 and ymax = 190,000.
2.) I have a dataset of approx. 2 million points (every point is inside the given polygon); each one is in one of 5 different categories.
Now, for every point inside the border (the distance between points has to be 10, so the bounding rectangle alone would give us about 508,800,000 points, i.e. 32,000 x 15,900) I want to determine a color, depending on the category of the nearest point in the dataset.
In the end I would like to draw a ggplot, where the color of every point is dependent on its category (so I'll use 5 different colors).
What I have so far:
My ideas for a solution are not optimized, and it takes R forever to determine categories for every point inside the polygon.
1.) I created a new dataset of points in the shape of a rectangle spanning the extreme coordinate values, with 10 units between points. From this new dataset I selected the points that fall inside the border of the polygon (with the function pnt.in.poly from the package SDMTools). Then I wanted to find the nearest point (from the dataset) for every point in the polygon and determine its category, but I never managed to get a subset from the roughly 508,800,000 points (obviously).
2.) I tried to take the 2 million points and color an area around them, depending on their category, but that did not work right.
I know that it is not possible to plot so many points and see the difference between a plot with 200,000,000 points and a plot with 1,000,000 points, but I would like to have accurate coloring when zooming in on (drawing) only one little spot in the polygon (of size 100 x 100, for example).
Question: Is there a better way of coloring so many points in a polygon (by creating a new shapefile or grouping points)?
Thank you for your ideas!
It’s really helpful if you include some data with your question, even (especially) if it’s a toy data set. As you don’t, I’ve made a toy example. First, I define a simple shape data frame and a data frame of synthetic data that includes x, y, and grp (i.e., a categorical variable with 5 levels). I crop the latter to the former and plot the results:
# Dummy shape function
df_shape <- data.frame(x = c(0, 0.5, 1, 0.5, 0),
y = c(0, 0.2, 1, 0.8, 0))
# Load library
library(ggplot2)
library(sgeostat) # For in.polygon function
# Data frame of synthetic data: random [x, y] and category (grp)
df_synth <- data.frame(x = runif(500),
y = runif(500),
grp = factor(sample(1:5, 500, replace = TRUE)))
# Remove points outside polygon
df_synth <- df_synth[in.polygon(df_synth$x, df_synth$y, df_shape$x, df_shape$y), ]
# Plot shape and synthetic data
g <- ggplot(df_shape, aes(x = x, y = y)) + geom_path(colour = "#FF3300", size = 1.5)
g <- g + ggthemes::theme_clean()
g <- g + geom_point(data = df_synth, aes(x = x, y = y, colour = grp))
g
Next, I create a regular grid and crop that using the polygon.
# Create a grid
df_grid <- expand.grid(x = seq(0, 1, length.out = 50),
y = seq(0, 1, length.out = 50))
# Check if grid points are in polygon
df_grid <- df_grid[in.polygon(df_grid$x, df_grid$y, df_shape$x, df_shape$y), ]
# Plot shape and show points are inside
g <- ggplot(df_shape, aes(x = x, y = y)) + geom_path(colour = "#FF3300", size = 1.5)
g <- g + ggthemes::theme_clean()
g <- g + geom_point(data = df_grid, aes(x = x, y = y))
g
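For readers curious about what the cropping step does, here is a minimal base R sketch of a ray-casting point-in-polygon test. It is not the code behind in.polygon; the helper name point_in_polygon is made up, and points lying exactly on an edge are not handled specially.

```r
# Ray-casting point-in-polygon test (hypothetical helper, not sgeostat's code).
# px, py: one query point; vx, vy: polygon vertices in order.
point_in_polygon <- function(px, py, vx, vy) {
  n <- length(vx)
  inside <- FALSE
  j <- n  # index of the previous vertex, starting with the last one
  for (i in seq_len(n)) {
    # Does the edge between vertex j and vertex i straddle the height py?
    crosses <- (vy[i] > py) != (vy[j] > py)
    if (crosses) {
      # x coordinate at which that edge reaches height py
      x_at_py <- vx[i] + (py - vy[i]) * (vx[j] - vx[i]) / (vy[j] - vy[i])
      # Count the crossing only if it lies to the right of the point
      if (px < x_at_py) inside <- !inside
    }
    j <- i
  }
  inside
}

# Same toy shape as df_shape above
shape_x <- c(0, 0.5, 1, 0.5, 0)
shape_y <- c(0, 0.2, 1, 0.8, 0)
point_in_polygon(0.5, 0.5, shape_x, shape_y)  # a point in the middle of the shape
point_in_polygon(0.9, 0.1, shape_x, shape_y)  # a point in the lower right corner
```

An odd number of edge crossings to the right of the point means the point is inside.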
To classify each point on this grid by the nearest point in the synthetic data set, I use knn or k-nearest-neighbours with k = 1. That gives something like this.
# Classify grid points according to synthetic data set using k-nearest neighbour
df_grid$grp <- class::knn(df_synth[, 1:2], df_grid, df_synth[, 3])
# Show categorised points
g <- ggplot()
g <- g + ggthemes::theme_clean()
g <- g + geom_point(data = df_grid, aes(x = x, y = y, colour = grp))
g
So, that's how I'd address that part of your question about classifying points on a grid.
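To make the k = 1 case concrete, here is a brute-force base R sketch of nearest-point labelling. The helper name nearest_label and the tiny data are made up for illustration, and it compares every query point with every training point, so it is only meant to show the logic, not to scale to millions of points.

```r
# Brute-force 1-nearest-neighbour labelling (hypothetical helper).
# train_xy, query_xy: two-column numeric matrices; labels: character vector.
nearest_label <- function(train_xy, labels, query_xy) {
  vapply(seq_len(nrow(query_xy)), function(i) {
    # Squared Euclidean distance from query point i to every training point
    d2 <- (train_xy[, 1] - query_xy[i, 1])^2 +
      (train_xy[, 2] - query_xy[i, 2])^2
    labels[which.min(d2)]  # label of the closest training point
  }, character(1))
}

train <- rbind(c(0, 0), c(1, 1), c(0, 1))
labels <- c("a", "b", "c")
query <- rbind(c(0.1, 0.2), c(0.9, 0.7), c(0.2, 0.9))
nearest_label(train, labels, query)
```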
The other part of your question seems to be about resolution. If I understand correctly, you want the same resolution even if you're zoomed in. Also, you don't want to plot so many points when zoomed out, as you can't even see them. Here, I create a plotting function that lets you specify the resolution. First, I plot all the points in the shape with 50 points in each direction. Then, I plot a subregion (i.e., zoom), but keep the number of points in each direction the same, so that it looks pretty much the same as the previous plot in terms of the number of dots.
res_plot <- function(xlim, xn, ylim, yn, df_data, df_sh){
# Create a grid
df_gr <- expand.grid(x = seq(xlim[1], xlim[2], length.out = xn),
y = seq(ylim[1], ylim[2], length.out = yn))
# Check if grid points are in polygon
df_gr <- df_gr[in.polygon(df_gr$x, df_gr$y, df_sh$x, df_sh$y), ]
# Classify grid points according to synthetic data set using k-nearest neighbour
df_gr$grp <- class::knn(df_data[, 1:2], df_gr, df_data[, 3])
g <- ggplot()
g <- g + ggthemes::theme_clean()
g <- g + geom_point(data = df_gr, aes(x = x, y = y, colour = grp))
g <- g + xlim(xlim) + ylim(ylim)
g
}
# Example plot
res_plot(c(0, 1), 50, c(0, 1), 50, df_synth, df_shape)
# Same resolution, but different limits
res_plot(c(0.25, 0.75), 50, c(0, 1), 50, df_synth, df_shape)
Created on 2019-05-31 by the reprex package (v0.3.0)
Hopefully, that addresses your question.
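As a small check on what "same resolution" means here, the spacing between neighbouring grid points follows directly from the limits and the number of points. This base R sketch uses a made-up helper name, grid_spacing.

```r
# Spacing between neighbouring grid points for seq(lim[1], lim[2], length.out = n)
# (hypothetical helper)
grid_spacing <- function(lim, n) diff(lim) / (n - 1)

grid_spacing(c(0, 1), 50)        # full x range, 50 points
grid_spacing(c(0.25, 0.75), 50)  # zoomed x range, same number of points
```

Halving the width of the window while keeping 50 points halves the spacing, so the zoomed plot is classified on a finer grid without any more points being drawn.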
In short, I am looking for help with creating a plot in which a specific area of the plot is designated to a certain randomly assigned observation (who possesses a certain factor designated by color). I'd prefer ggplot2 or d3.
I was in a discussion today and was trying to demonstrate the point that even if we randomly chose an observation, we would probably pick an observation of interest. In other words, if every observation was randomly assigned to a certain part of a dartboard and I threw a dart, chances are that my dart would hit an observation that would be of interest.
Now, I'm literally trying to create this dartboard.
I've been messing around with ggplot2 and I've come close to visualizations that I want but not quite.
First, I tried using coord_polar. I randomly generated coordinates for each observation
df$Coord1 <- sample(50, size = nrow(df), replace = TRUE)
df$Coord2 <- sample(50, size = nrow(df), replace = TRUE)
and then plotted them. The problem here, though, is that not all of the area of the plot was taken. (There's also the issue that some points actually overlapped... so if someone also knows how to generate coordinates that wouldn't ever overlap that would be nice.) If I were to throw a dart, I might not hit an observation. Here's the code I used:
dartboard <- ggplot(df, aes(Coord1, Coord2, fill = Classification)) +
geom_tile()+coord_polar()
So, then I tried my hand with a pie chart.
pie <- ggplot(df, aes(x = factor(1), fill = factor(Classification))) +
geom_bar(width = 1) + coord_polar()
which was nice because it was a whole circle, but it grouped the classifications together when I want them randomly scattered across the plot.
I also tried replicating this heat map creation (http://www.r-bloggers.com/controlling-heatmap-colors-with-ggplot2/) but I wasn't quite able to figure out how to make it fit correctly with my data.
In short, I am looking for help with creating a plot in which a specific area of the plot is designated to a certain observation who possesses a certain factor.
Any ideas?
Update 1:
This code is what I'm looking for visually at the conceptual level (every part of the chart is covered by an observation):
df <- expand.grid(x = 1:20, y = 1:20)
samples <- c("one", "two", "three", "four", "five")
df$series <- sample(samples, nrow(df), replace = TRUE) # indexing with runif() truncates and never picks the last level
g <- ggplot(df, aes(fill=series, xmin = x, ymin = y, xmax = x+1, ymax = y+1))
g <- g + geom_rect()
g <- g + coord_polar(theta="y")
g <- g + theme(panel.grid=element_blank())
g <- g + theme(axis.text=element_blank())
g <- g + theme(axis.ticks=element_blank())
g
but now I'm trying to figure out how to apply this to my own data set which has about 1,500 observations. The coordinates in that picture are used with expand.grid, so now I'm trying to figure out how to assign coordinates to my own 1,500 observations.
Update 2:
I have relative success with the code below.
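library(dplyr) # For mutate(), row_number(), full_join() and %>%
random <- expand.grid(x = 1:40, y = 1:39)
# Number the grid cells so they can be joined to the observations by ID
random <- random %>%
mutate(ID = row_number())
plot.data <- full_join(data, random, by = "ID")
samples <- c("UI", "IN", "OW", "BE" , "Five")
# Draw classifications with sample(); indexing with runif() truncates and never picks the last level
plot.data$Classification <- sample(samples, nrow(plot.data), replace = TRUE)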
g <- ggplot(plot.data, aes(fill=Classification, xmin = x, ymin = y, xmax = x+1, ymax = y+1))
g <- g + geom_rect()
g <- g + coord_polar(theta="y")
g <- g + theme(panel.grid=element_blank())
g <- g + theme(axis.text=element_blank())
g <- g + theme(axis.ticks=element_blank())
g <- g + scale_fill_manual(values=c("dark green", "orange" , "yellow" , "red" , "green"))
g
I used the expand.grid function to assign coordinates to kids. Some observations got left out because they couldn't be given coordinates, but that's okay.
My only concern/complaint now is that some observations are larger (and thus easier to hit) than others, so I might examine that heat map route. Time will tell. Thank you very much for the help so far.
Update 3:
Another version (and probably final form):
This way, each observation is roughly the same size.
The base of this heatmap version can be found here: http://www.r-bloggers.com/controlling-heatmap-colors-with-ggplot2/ but here's my own code
ggplot(plot.data, aes(x = x, y = y, fill = factor(Classification))) +
geom_tile(color = "black") +
scale_fill_manual(values=c("dark green", "orange" , "yellow" , "red" , "green")) +
theme(legend.position="none") +
theme_change
How's this?
df <- expand.grid(x = 1:20, y = 1:20)
samples <- c("one", "two", "three", "four", "five")
df$series <- sample(samples, nrow(df), replace = TRUE) # indexing with runif() truncates and never picks the last level
g <- ggplot(df, aes(fill=series, xmin = x, ymin = y, xmax = x+1, ymax = y+1))
g <- g + geom_rect()
g <- g + coord_polar(theta="y")
g <- g + theme(panel.grid=element_blank())
g <- g + theme(axis.text=element_blank())
g <- g + theme(axis.ticks=element_blank())
g
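For the part of the question about coordinates that never overlap, one possible approach is to shuffle the cells of a grid and hand one cell to each observation. This is a base R sketch under assumptions: the count of 1,500 comes from the question, while the object names, the seed, and the randomly drawn classifications are placeholders.

```r
set.seed(1)  # arbitrary seed, only for reproducibility

n_obs <- 1500                  # number of observations, as in the question
side <- ceiling(sqrt(n_obs))   # smallest square grid that fits them all
grid <- expand.grid(x = seq_len(side), y = seq_len(side))

# Sampling row indices without replacement gives every observation its own cell
cells <- grid[sample(nrow(grid), n_obs), ]

# Placeholder classifications; real data would supply this column
cells$Classification <- sample(c("one", "two", "three", "four", "five"),
                               n_obs, replace = TRUE)
head(cells)
```

Because the grid has a few more cells than there are observations, a handful of cells stay empty, so a dart could still land on a blank spot.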
I'm trying to draw an area plot with a series of % values, one for each day during a set period. I would like to add a segment to the top of the plot to show more clearly the areas where the % is decreasing.
I tried to use this code (the example has just a few data points for simplicity):
library(ggplot2)
library(scales)
limit = c(0.85,0.87,0.88,0.90,0.72,0.74)
day <- as.Date(strptime((seq(20150201,20150206,1)),format = "%Y%m%d"))
dati = data.frame("Day" = day, "Limit" = limit)
g <- ggplot(data = dati, aes(Day, Limit))
g <- g + geom_area(fill = "dark red")
g <- g + coord_cartesian(ylim = c(0,1))
g <- g + scale_y_continuous(labels=percent)
g <- g + annotate("segment", y= 1, yend = 1, x = dati[3, "Day"], xend = dati[4, "Day"])
print(g)
But I get this error: Error: / not defined for "Date" objects
Any ideas on how to solve this?
I already checked How to use ggplot2's annotate with dates in x-axis?, but it appears the bug is back. Plus I'd like to do this without using the lubridate package.
Wrapping my earlier comment into an answer: use geom_segment instead.
g <- g + geom_segment(y = 1, yend = 1,
x = as.numeric(dati[3, "Day"]), xend = as.numeric(dati[4, "Day"]))
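The conversion works because R stores a Date as the number of days since 1970-01-01, so as.numeric() simply exposes that count. A short base R sketch follows; it does not check how any particular ggplot2 version treats numeric positions on a date axis.

```r
# Dates are stored as days since 1970-01-01
day <- as.Date(c("2015-02-03", "2015-02-04"))

as.numeric(day)                      # day counts for the two segment ends
diff(as.numeric(day))                # consecutive days are one unit apart
as.numeric(as.Date("1970-01-02"))    # one day after the origin
```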
Following up on a recent question of mine, this one is a bit different and illustrates the problem more fully using simpler examples. Below are two data sets and three functions. The first one draws some points and a circle as expected:
library("ggplot2")
library("grid")
td1 <- data.frame(x = rnorm(10), y = rnorm(10))
tf1 <- function(df) { # works as expected
p <- ggplot(aes(x = x, y = y), data = df)
p <- p + geom_point(color = "red")
p <- p + annotation_custom(circleGrob())
print(p)
}
tf1(td1)
This next one seems to ask for the exact same plot but the code is slightly different. It does not give an error but does not draw the circle:
tf2 <- function(df) { # circle isn't drawn, but no error either
p <- ggplot()
p <- p + geom_point(data = df, aes(x = x, y = y), color = "red")
p <- p + annotation_custom(circleGrob())
print(p)
}
tf2(td1)
Finally, this one involves a more complex aesthetic and gives an empty layer when you try to create the circle:
td3 <- data.frame(r = c(rnorm(5, 5, 1.5), rnorm(5, 8, 2)),
f1 = c(rep("L", 5), rep("H", 5)), f2 = rep(c("A", "B"), 5))
tf3 <- function(df) {
p <- ggplot()
p <- p + geom_point(data = df,
aes(x = f1, y = r, color = f2, group = f2))
# p <- p + annotation_custom(circleGrob()) # comment out and it works
print(p)
}
tf3(td3)
Now, I suspect the problem here is not the code but my failure to grasp the inner workings of ggplot2. I could sure use an explanation of why the circle is not drawn in the 2nd case and why the layer is empty in the third case. I looked at the code for annotation_custom and it has a hard-wired inherit.aes = TRUE which I think is the problem. I don't see why this function needs any aesthetic at all (see the docs on it). I did try several ways to override it and set inherit.aes = FALSE but I was unable to fully penetrate the namespace and make it stick. I tried to examine the objects created by ggplot2 but these proto objects are nested very deeply and hard to decipher.
To answer this:
"I don't see why this function needs any aesthetic at all".
In fact, annotation_custom needs the x and y aesthetics to scale its grob, so that it can then work in native units.
Basically, it does this:
x_rng <- range(df$x, na.rm = TRUE) ## ranges of x :aes x
y_rng <- range(df$y, na.rm = TRUE) ## ranges of y :aes y
vp <- viewport(x = mean(x_rng), y = mean(y_rng), ## create a viewport
width = diff(x_rng), height = diff(y_rng),
just = c("center","center"))
dd <- editGrob(grob = circleGrob(), vp = vp) ## plot the grob in this vp
To illustrate this I add a grob to a dummy plot used as a scale for my grob. The first is a big scale and the second is a small one.
base.big <- ggplot(aes(x = x1, y = y1), data = data.frame(x1=1:100,y1=1:100))
base.small <- ggplot(aes(x = x1, y = y1), data = data.frame(x1=1:20,y1=1:1))
I define my grob; note that I use the native scales for xmin, xmax, ymin, ymax:
annot <- annotation_custom(grob = circleGrob(), xmin = 0,
xmax = 20,
ymin = 0,
ymax = 1)
Now see the difference in scale (small point vs. big circle) between (base.big + annot) and (base.small + annot).
library(gridExtra)
grid.arrange(base.big+annot,
base.small+annot)