I have a bunch of movement vectors with a direction (compass angle from 0-360), and velocity (km/hour, for example), and I want to summarize them into one average vector, and I can't seem to find a good way. Basically I want to be able to say "the average movement for this group is ___ degrees at ___ km/hour."
I've found some piecemeal ways to do it, but there has to be an easier way. I'm trying to scale this up, and it will eventually be applied to vectors not starting from the origin, and eventually vectors in geographic space (with CRS and stuff), so I'm trying to simplify as much as possible right now.
example code:
library(dplyr)
library(ggplot2)
# data frame of vectors all starting at one origin with a compass angle and velocity
df <- data.frame(
x=0,
y=0,
# compass degrees where 0 is north and moves clockwise
compassdegree = c(270,275,277,280,285,330, 40),
velocity = c(2,2,2,2,1,1,1)
) %>%
# we will plot with geom_spoke, which takes radians,
# so some transformations are necessary
mutate(
# convert to radian degrees:
# (0 degrees points right from origin and goes counterclockwise)
radiandegree = ((-compassdegree) + 450) %% 360 ,
# convert from radiandegrees to radians
radian = DescTools::DegToRad(radiandegree))
# plot the vectors
plot <- df %>%
ggplot(aes(x=x,y=y)) +
geom_spoke(aes(angle = radian,
radius = velocity)) +
coord_equal(xlim = c(-2,2),
ylim = c(-2,2))
plot
# find mean angle with circular::weighted.mean.circular
weighted.mean <- circular::weighted.mean.circular(
x = circular::circular(df$radian, unit = "radians"),
w = df$velocity
) %>% as.data.frame() %>%
rename(radians = x) %>%
# convert back to radian degrees, then to compass degrees
mutate(radianDegree = circular::deg(radians),
compassDegree = ((-radianDegree) + 450) %% 360 )
# alternatively, we could have used the radiandegrees
# directly in the weighted.mean.circular function, but I'll still need it
# in radians to plot, so this isn't really saving much
weighted.mean2 <- circular::weighted.mean.circular(
x = circular::circular(df$radiandegree, unit = "degrees"),
w = df$velocity
) %>% as.data.frame() %>%
rename(radianDegree = x) %>%
# convert back to radian degrees, then to compass degrees
mutate(compassDegree = ((-radianDegree) + 450) %% 360 )
# great, so now we have the average compassDegree: 286.85°,
# but we still don't know the mean velocity
# try by finding endpoints of all original vectors, and finding the
# centroid of them by summing all X and all Y values
# find endpoints of all vectors
df_endpoints <- df %>%
mutate(x_end = x + (velocity * cos(radian)),
y_end = y + (velocity * sin(radian)))
# find mean endpoint
mean_endpoint <- df_endpoints %>%
summarize(x = mean(x_end),
y = mean(y_end))
# find mean velocity by finding distance from averaged endpoint to origin
# using the pythagorean theorem
mean_vel <- sqrt(mean_endpoint$x^2 + mean_endpoint$y^2)
# now we have the mean velocity
plot +
geom_point(data = mean_endpoint,
aes(x=x,y=y),
color = "red", size = 2) +
geom_spoke(aes(x = 0, y = 0,
angle = weighted.mean$radians,
radius = mean_vel),
color = "red", linetype = "longdash", size = .75)
Now we have the trajectories in black, and the average trajectory in red.
Is there not just a function that I can feed in a df of angles and velocities and receive this information (mean angle and mean velocity) directly out? Will the above method still apply when I move to spatial data (assuming I keep in mind distortion from projections and CRS)?
Here's an approach using the average x-y delta and converting that back to radians and velocity.
df_avg <- df %>%
mutate(x_delta = velocity * cos(radian),
y_delta = velocity * sin(radian)) %>%
summarize(across(c(x, y, x_delta:y_delta), mean)) %>%
mutate(velocity = sqrt(x_delta^2 + y_delta^2),
# atan2() handles all four quadrants; atan(y/x) + pi is only right when x_delta < 0
radian = atan2(y_delta, x_delta),
src = "avg")
ggplot(bind_rows(df, df_avg),
aes(x=x,y=y, color = src)) +
geom_spoke(aes(angle = radian,
radius = velocity)) +
coord_equal(xlim = c(-2,2),
ylim = c(-2,2))
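If it helps to have this as a single reusable step, the same calculation can be wrapped in a small base R function that takes compass degrees and velocities and returns the mean direction and speed. This is only a minimal sketch: `mean_vector()` is a hypothetical helper name, not a function from any package, and it assumes compass bearings (0 = north, increasing clockwise).

```r
# Hypothetical helper (not from any package): average a set of movement
# vectors given as compass bearings (0 = north, clockwise) and speeds.
mean_vector <- function(compass_deg, velocity) {
  theta <- compass_deg * pi / 180
  # For compass bearings, the east component is sin(theta), north is cos(theta)
  east  <- mean(velocity * sin(theta))
  north <- mean(velocity * cos(theta))
  data.frame(
    # atan2(east, north) gives the bearing in (-180, 180]; wrap it to [0, 360)
    compass_deg = (atan2(east, north) * 180 / pi) %% 360,
    # length of the averaged vector
    velocity    = sqrt(east^2 + north^2)
  )
}

res <- mean_vector(
  compass_deg = c(270, 275, 277, 280, 285, 330, 40),
  velocity    = c(2, 2, 2, 2, 1, 1, 1)
)
res
```

For the example data this gives a bearing of about 286.85 degrees, the same as the velocity-weighted circular mean in the question, since weighting unit vectors by velocity and summing them is the same as summing the velocity vectors. Note that the returned velocity is the length of the averaged vector (about 1.31 here), which is smaller than the plain mean of the speeds whenever the directions disagree.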
I often find myself working with data with long-tail distributions, so that a huge amount of the range in values happens in the top 1-2% of the data. When I plot the data, the upper outliers cause variation in the rest of the data to wash out, but I want to show those differences.
I know there are other ways of handling this, but I found that capping the values towards the end of the distribution and then applying a continuous color palette (i.e., in ggplot) is one way that works for me to represent the data. However, I want to ensure the legend stays accurate by adding a >= sign to the last legend label.
The picture below shows the kind of legend I'd like to achieve programmatically, with the >= sign drawn in messily in red.
I also know I can manually set breaks and labels, but I'd really like to just do something like: if (it's the last label) paste0(">=", label) else label (to show with pseudocode).
Reproducible example:
(I want to alter the plot legend to prefix just the last label)
library(dplyr)
library(ggplot2)
set.seed(123)
x <- rnorm(1e3)
y <- rnorm(1e3)
z <- rnorm(1e3, mean = 50, sd = 15)
d <- tibble(x = x
,y = y
,z = z)
d %>%
ggplot(aes(x = x
,y = y
,fill = z
,color = z)) +
geom_point() +
scale_color_viridis_c()
One option would be to pass a function to the labels argument which replaces the last element or label with your desired label like so:
library(ggplot2)
set.seed(123)
x <- rnorm(1e3)
y <- rnorm(1e3)
z <- rnorm(1e3, mean = 50, sd = 15)
d <- data.frame(
x = x,
y = y,
z = z
)
ggplot(d, aes(
x = x,
y = y,
fill = z,
color = z
)) +
geom_point() +
scale_fill_continuous(labels = function(x) {
x[length(x)] <- paste0(">=", x[length(x)])
x
}, aesthetics = c("color", "fill"))
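The labelling function above can be combined with the capping step described in the question. The sketch below is one possible way to do that, under stated assumptions: `label_last_ge()` is a hypothetical helper name, the cap of 80 is picked only for illustration, and the breaks are set explicitly so that the last break sits exactly on the cap. It uses `oob = scales::squish` so that values above the upper limit are drawn with the cap's color rather than dropped. The helper also skips missing values, in case the scale passes `NA` for breaks that fall outside the limits.

```r
library(ggplot2)

# Hypothetical helper: prefix the last non-missing label with ">="
label_last_ge <- function(x) {
  labels <- as.character(x)
  if (all(is.na(x))) return(labels)
  last <- max(which(!is.na(x)))
  labels[last] <- paste0(">=", labels[last])
  labels
}

set.seed(123)
d <- data.frame(
  x = rnorm(1e3),
  y = rnorm(1e3),
  z = rnorm(1e3, mean = 50, sd = 15)
)

cap <- 80  # assumed cap, chosen only for illustration

p <- ggplot(d, aes(x = x, y = y, color = z)) +
  geom_point() +
  scale_color_viridis_c(
    limits = c(NA, cap),             # upper end of the color scale is the cap
    breaks = seq(20, cap, by = 20),  # last break sits exactly on the cap
    oob    = scales::squish,         # values above the cap get the cap's color
    labels = label_last_ge
  )
p
```

Because the data themselves are left untouched and only the color mapping is squished, the point positions stay accurate while the legend reads 20, 40, 60, >=80.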
I have the following data frame:
# Seed RNG
set.seed(33550336)
# Create data frame
df <- data.frame(x = runif(100),
y = runif(100),
t = runif(100, min = 0, max = 10))
I'd like to plot points (i.e., at x and y coordinates) appearing and disappearing as a function of t. gganimate is awesome, so I used that.
# Load libraries
library(gganimate)
library(ggplot2)
# Create animation
g <- ggplot(df, aes(x = x, y = y))
g <- g + geom_point(colour = "#FF3300", shape = 19, size = 5, alpha = 0.25)
g <- g + labs(title = 'Time: {frame_time}')
g <- g + transition_time(t)
g <- g + enter_fade() + exit_fade()
animate(g, fps = 1)
This code produced the following:
There are a couple of things that I don't like about this.
The transitions are very abrupt. My hope using enter_fade and exit_fade was that the points would fade into view, then back out. Clearly this isn't the case, but how would I achieve this result?
I would like to round {frame_time}, so that while the points fade in and out at fractions of t, the actual time t that would be shown would be an integer. If frame_time was a regular variable, this would be simple enough using something like bquote and round, but this doesn't seem to be the case. How can I round frame_time in my title?
Here's a relatively manual approach that relies on doing more of the prep beforehand and feeding that into gganimate. I'd like to see if there's a simpler way to do this inside gganimate more automatically.
First I make a copy of the data frame for each frame I want to show. Then I calculate the difference between the time I'm presently viewing (time) and the t at which each data point should appear. I use cos to handle the easing in and out, so that each dot's appearance at a given time is described by display. In the ggplot call, I then map alpha and size to display, and use transition_time(time) to move through the frames.
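# Load libraries for the data prep (map_df is from purrr; mutate and if_else from dplyr)
library(dplyr)
library(purrr)
# Create prep table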
fade_time = 1
frame_count = 100
frames_per_time = 10
df2 <- map_df(seq_len(frame_count), ~df, .id = "time") %>%
mutate(time = as.numeric(time)/frames_per_time,
delta_norm = (t - time) / fade_time,
display = if_else(abs(delta_norm) > 1, 0, cos(pi / 2 * delta_norm)))
# Create animation
g <- ggplot(df2, aes(x = x, y = y, alpha = display, size = display))
g <- g + geom_point(colour = "#FF3300", shape = 19)
g <- g + scale_alpha(range = c(0, 1)) + scale_size_area(max_size = 5)
g <- g + labs(title = "{round(frame_time, 1)}")
g <- g + transition_time(time)
animate(g)
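To see what the cosine easing behind the display column does, it can be checked in isolation. This is a small base R sketch; `fade_weight()` is a hypothetical name that simply mirrors the expression used for display above, with base `ifelse()` standing in for `if_else()`.

```r
# Hypothetical helper mirroring the `display` column above:
# 1 when the frame time equals t, easing to 0 at fade_time away, 0 beyond that.
fade_weight <- function(t, time, fade_time = 1) {
  delta_norm <- (t - time) / fade_time
  ifelse(abs(delta_norm) > 1, 0, cos(pi / 2 * delta_norm))
}

# A point scheduled at t = 5, viewed at several frame times
frame_times <- c(3.5, 4, 4.5, 5, 5.5, 6, 6.5)
weights <- fade_weight(t = 5, time = frame_times)
round(weights, 3)
```

The weights rise from 0 at time 4 to 0.707 at 4.5 and 1 at 5, then fall back symmetrically, so a larger fade_time gives a longer and gentler fade.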
I am trying to create a graph that plots points, labels, and lines connecting the points given a start and end position, and then transform it into a polar chart. I can plot the points, labels, and lines, but my issue is when I transform my chart into polar. I have used both geom_curve and geom_segment.
When I use geom_curve I get an error because geom_curve is not implemented for non-linear coordinates. Therefore the furthest I can get is this:
With geom_segment I get closer to my desired effect, but it draws the lines along the circle's circumference, which makes sense given how I pass in the coordinates. Here is a photo:
I essentially need a geom_curve for polar coordinates, but I have been unable to find one. I would like the lines to be curved and on the inside of the circle. There will be some overlap, but any suggestions to make it look nice with spacing or something similar would be welcome.
Data:
k<-18
ct<-12
q<-6
x_vector1<-seq(1,k,1)
x_vector2<-seq(1,3,1)
x_vector3<-seq(k-2,k,1)
x_vector<-c(x_vector1,x_vector2,x_vector3)
n<-9 ## sets first level radius
radius1<-rep(n,k)
b<-13 ## sets second level radius
radius2<-rep(b,q)
radius<-c(radius1,radius2)
name<-c('Alice','Bob','Charlie','D','E','F','G','H','I','J','K','L',
'M','N','O','Peter','Quin','Roger','Alice2','Bob2','Charlie2',
'Peter2','Quin2','Roger2')
dframe<-data.frame(x_vector,radius,name)
dframe$label_radius<-dframe$radius+1
from<-c('Alice2','Bob','Charlie','D','E','Alice2','Charlie2','Charlie',
'I','J','K','L','M','N','O','Peter','Quin','Alice')
to<-c('Alice','Alice','Alice','Alice','Alice','Bob',
'Bob','Bob','Bob','Charlie','Charlie','Peter',
'Peter','Quin','Quin','Quin','Roger','Roger')
amt<-c(3,8,8,8,6,2,2,4,2,4,8,1,10,5,9,5,2,1)
linethick<-c(0.34,0.91,0.91,0.91,0.68,0.23,0.23,0.45,0.23,0.45,
0.91,0.11,1.14,0.57,1.02,0.57,0.23,0.11)
to_x<-c(1,1,1,1,1,2,2,2,2,3,3,16,16,17,17,17,18,18)
to_rad<-c(9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9)
from_x<-c(1,2,3,4,5,1,3,3,9,10,11,12,13,14,15,16,17,1)
from_rad<-c(13,9,9,9,9,13,13,9,9,9,9,9,9,9,9,9,9,9)
stats<-data.frame(from,to,amt,linethick,to_x,to_rad,from_x,from_rad)
library(ggplot2)
p<-ggplot()+
  geom_point(data=dframe,aes(x=x_vector,y=radius),size=3,shape=19)+
  geom_text(data=dframe,aes(x=x_vector,y=label_radius,label=name))+
  ## I need arrows starting at TO and going to FROM.
  geom_segment(data=stats,aes(x=from_x,y=from_rad,xend=to_x,yend=to_rad, color=to),
               arrow=arrow(angle=15,ends='first',length=unit(0.03,'npc'), type='closed'))+
  ## transform into polar coordinates
  coord_polar(theta='x',start=0,direction=-1)+
  ## sets up the scale to display from 0 to 14
  scale_y_continuous(limits=c(0,14))+
  ## Used to 'push' the points so all 'k' show up.
  expand_limits(x=0)
p
As others have commented, you can mimic the desired positions produced by coord_polar() by calculating them yourself, in Cartesian coordinates. I.e.:
x = radius * cos(theta)
y = radius * sin(theta)
# where theta is the angle in radians
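As a quick check of the conversion above, here is a minimal base R sketch. `polar_to_xy()` is a hypothetical helper name, and it assumes the points sit on evenly spaced slots around the circle with an offset of pi / 2 so that the last slot lands at the top.

```r
# Hypothetical helper: place slot `i` of `n` evenly spaced slots on a
# circle of the given radius, offset by pi / 2 so slot `n` is at the top.
polar_to_xy <- function(i, n, radius) {
  theta <- i / n * 2 * pi + pi / 2
  data.frame(x = radius * cos(theta), y = radius * sin(theta))
}

# Slot 18 of 18 on the inner ring (radius 9) lands at the top of the circle
top <- polar_to_xy(i = 18, n = 18, radius = 9)
round(top, 6)
```

The function is vectorized over `i`, so passing `i = 1:18` returns the coordinates for a whole ring at once.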
Manipulate the 2 data frames:
library(dplyr)
dframe2 <- dframe %>%
mutate(x_vector = as.integer(factor(x_vector))) %>%
mutate(theta = x_vector / n_distinct(x_vector) * 2 * pi + pi / 2) %>%
mutate(x = radius * cos(theta),
y = radius * sin(theta),
y.label = label_radius * sin(theta),
name = as.character(name))
stats2 <- stats %>%
select(from, to, amt, linethick) %>%
mutate_at(vars(from, to), as.character) %>%
left_join(dframe2 %>% select(name, x, y),
by = c("from" = "name")) %>%
rename(x.start = x, y.start = y) %>%
left_join(dframe2 %>% select(name, x, y),
by = c("to" = "name")) %>%
rename(x.end = x, y.end = y)
Plot using geom_curve():
# standardize plot range in all directions
plot.range <- max(abs(c(dframe2$x, dframe2$y, dframe2$y.label))) * 1.1
p <- dframe2 %>%
ggplot(aes(x = x, y = y)) +
geom_point() +
geom_text(aes(y = y.label, label = name)) +
# use 2 geom_curve() layers with different curvatures, such that all segments align
# inwards inside the circle
geom_curve(data = stats2 %>% filter(x.start > 0),
aes(x = x.start, y = y.start,
xend = x.end, yend = y.end,
color = to),
curvature = -0.3,
arrow = arrow(angle=15, ends='first',
length=unit(0.03,'npc'),
type='closed')) +
geom_curve(data = stats2 %>% filter(x.start <= 0),
aes(x = x.start, y = y.start,
xend = x.end, yend = y.end,
color = to),
curvature = 0.3,
arrow = arrow(angle=15, ends='first',
length=unit(0.03,'npc'),
type='closed')) +
expand_limits(x = c(-plot.range, plot.range),
y = c(-plot.range, plot.range)) +
coord_equal() +
theme_void()
p
If you want polar grid lines, these can be mimicked as well using geom_spoke() and the ggforce package's geom_circle():
library(ggforce)
p +
geom_spoke(data = data.frame(x = 0,
y = 0,
angle = pi * seq(from = 0,
to = 2,
length.out = 9), # number of spokes + 1
radius = plot.range),
aes(x = x, y = y, angle = angle, radius = radius),
inherit.aes = FALSE,
color = "grey") +
geom_circle(data = data.frame(x0 = 0,
y0 = 0,
r = seq(from = 0,
to = plot.range,
length.out = 4)), # number of concentric circles + 1
aes(x0 = x0, y0 = y0, r = r),
inherit.aes = FALSE,
color = "grey", fill = NA)
(Note: If you really want these pseudo-grid lines, plot them before the other geom layers.)
Do you have to do everything in ggplot2?
If not, then one option would be to create the plot with the points (potentially using ggplot2, or just straight grid graphics, maybe even base graphics), then push to the appropriate viewport and use xsplines to add curves between the points (see this answer: Is there a way to make nice "flow maps" or "line area" graphs in R? for a basic example of using xspline).
If you insist on doing everything using ggplot2 then you will probably need to create your own geom function that plots the curves in the polar coordinate plot.