ggplot: Mask Circles inside a non Geographic Shape - r

Is there a way within ggplot, to plot circles within a defined, non geographic shape, defined through a series of points, or alternatively an imported SVG?
The circles would be placed in rows and columns, similar to the simple example below. But then any circles, either with their circumference, or centre if that is more achievable, outside the shape would be excluded from the plot. So a kind of mask.
I know I could do this by comparing the coordinates, but I'm interested to know if there is a more sophisticated masking function.
library(tidyverse)
maxX <- 12
maxY <- 9
circles <- data.frame(circleNo = seq(1, maxX * maxY, 1) - 1) %>%
mutate(x = circleNo %% maxX, y = floor(circleNo / maxX))
# Set line end to coordinates for next point
shape <- data.frame(x = c(1, 1, 7, 7, 11, 11, 6, 5, 3, 2, 1), y = c(1, 8, 7, 5, 5, 1, 3, 3, 3, 1, 1)) %>%
mutate(xend = lead(x), yend = lead(y))
# Set line end for last point to the first
shape[nrow(shape),3] = shape[1,1]
shape[nrow(shape),4] = shape[1,2]
ggplot(circles, aes(x = x, y = y)) +
geom_point(shape = 1, size = 9, fill = NA) +
geom_segment(data = shape, aes(x = x, xend = xend, y = y, yend = yend)) +
theme_void() +
coord_fixed(ratio = 1)

Here's one approach that is based on manipulating the pixels as a last step. It is not sophisticated enough to identify which circles are entirely within the polygon, though. For that, the sf package and this approach sound like what you want:
How to mark points by whether or not they are within a polygon
library(ggfx)
ggplot(circles, aes(x = x, y = y)) +
as_reference(
geom_polygon(data = shape),
id = "mask_layer"
) +
with_mask(
geom_point(shape = 1, size = 9, fill = NA),
mask = "mask_layer"
) +
theme_void() +
coord_fixed(ratio = 1)
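If whole circles (not just their centres) need to be tested against the shape, one possible sketch with sf is to buffer each centre into a disc and keep only the discs that lie within the polygon. The radius of 0.4 data units and the names `ring`, `poly`, `discs`, `centre_inside` and `disc_inside` are assumptions for illustration; they are not part of the original question.

```r
library(sf)
library(ggplot2)

# Polygon from the question as a closed ring (first point repeated last)
ring <- cbind(x = c(1, 1, 7, 7, 11, 11, 6, 5, 3, 2, 1),
              y = c(1, 8, 7, 5, 5, 1, 3, 3, 3, 1, 1))
poly <- st_sfc(st_polygon(list(ring)))

# Same 12 x 9 grid of circle centres as in the question
grid <- expand.grid(x = 0:11, y = 0:8)
centres <- st_as_sf(grid, coords = c("x", "y"))

# Assumed circle radius in data units (hypothetical value)
radius <- 0.4
discs <- st_buffer(centres, dist = radius)

# TRUE where the centre, or the whole disc, lies within the polygon
grid$centre_inside <- lengths(st_within(centres, poly)) > 0
grid$disc_inside <- lengths(st_within(discs, poly)) > 0

# Draw only the discs that fit entirely inside the shape
p <- ggplot() +
  geom_sf(data = poly, fill = NA) +
  geom_sf(data = discs[grid$disc_inside, ], fill = NA) +
  theme_void()
```

Because the discs are real geometries here, their size is in data units rather than in the point-size units used by geom_point().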

My thanks to @Jon above for the pointers. This is what I came up with. Note that I added a hole in the middle of the polygon for good measure.
library(tidyverse)
library(ggplot2)
# Create grid of circles
maxX <- 24
maxY <- 18
circles <- data.frame(circleNo = seq(1, maxX * maxY, 1) - 1)
circles <- circles %>%
mutate(x = circleNo %% maxX, y = floor(circleNo / maxX))
# Create polygon
shape <- data.frame(x = c(2, 2, 14, 14, 22, 22, 12, 10, 6, 4, 2), y = c(2, 16, 14, 10, 10, 2, 6, 6, 6, 2, 2)) %>%
# With line ends equal to the next point
mutate(xend = lead(x), yend = lead(y))
# Except for the last, where it needs to equal the first
shape[nrow(shape),3] = shape[1,1]
shape[nrow(shape),4] = shape[1,2]
# Plot the circles and polygon without any masking
ggplot(circles, aes(x = x, y = y)) +
geom_point(shape = 1, size = 5, fill = NA) +
geom_segment(data = shape, aes(x = x, xend = xend, y = y, yend = yend)) +
theme_void() +
coord_fixed(ratio = 1)
# Now do similar with SF which allows masking using the helpful posts below
# Create simple feature from a numeric vector, matrix or list
# https://r-spatial.github.io/sf/reference/st.html
# How to mark points by whether or not they are within a polygon
# https://stackoverflow.com/questions/50144222/how-to-mark-points-by-whether-or-not-they-are-within-a-polygon
library(sf)
# Create outer polygon
outer = matrix(c(2,2, 2,16, 14,14, 14,10, 22,10, 22,2, 12,6, 10,6, 6,6, 4,2, 2,2), ncol=2, byrow=TRUE)
# And for good measure, let's put a hole in it
hole1 = matrix(c(10,10, 10,12, 12,12, 12,10, 10,10),ncol=2, byrow=TRUE)
polygonList= list(outer, hole1)
# Convert to simple feature
combinedPoints = lapply(polygonList, function(x) cbind(x, 0))
polygons = st_polygon(combinedPoints)
# Plot these new polygons
ggplot(polygons) +
geom_sf(aes())
# Not entirely sure why we need these two lines
polygonCast <- polygons %>% st_cast("POLYGON")
circlesSF <- st_as_sf(circles, coords = c("x", "y"))
# Detect which ones are inside the outer polygon and outside the inner one
# st_within() returns one (possibly empty) match per point, so lengths() > 0 means "inside"
circlesSF <- circlesSF %>% mutate(inside = lengths(st_within(circlesSF, polygonCast)))
# Convert to a data frame, extract out the coordinates and filter out the ones outside
circleCoords <- as.data.frame(st_coordinates(circlesSF))
circles2 <- circlesSF %>%
as.data.frame() %>%
cbind(circleCoords) %>%
select(-geometry) %>%
filter(inside > 0)
ggplot(circles2, aes(x = X, y = Y)) +
geom_point(shape = 1, size = 5, fill = NA) +
geom_segment(data = shape, aes(x = x, xend = xend, y = y, yend = yend)) +
theme_void() +
coord_fixed(ratio = 1)
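A more compact variant of the same idea, as a sketch only: it assumes plain two-column (XY) matrices are enough for st_polygon() and uses st_filter() with st_within as the predicate, so the points outside the polygon (or inside the hole) are dropped in one step. The object names `outer_ring`, `hole_ring`, `shape_sf`, `pts` and `kept` are made up for this example.

```r
library(sf)

# Outer ring and hole from the answer above, as closed XY rings
outer_ring <- matrix(c(2,2, 2,16, 14,14, 14,10, 22,10, 22,2,
                       12,6, 10,6, 6,6, 4,2, 2,2), ncol = 2, byrow = TRUE)
hole_ring <- matrix(c(10,10, 10,12, 12,12, 12,10, 10,10), ncol = 2, byrow = TRUE)

# The first ring is the exterior, any further rings are holes
shape_sf <- st_sfc(st_polygon(list(outer_ring, hole_ring)))

# 24 x 18 grid of circle centres
pts <- st_as_sf(expand.grid(x = 0:23, y = 0:17), coords = c("x", "y"))

# Keep only the points strictly within the polygon (and not in the hole)
kept <- st_filter(pts, shape_sf, .predicate = st_within)

# Back to a plain data frame of coordinates for geom_point()
kept_xy <- as.data.frame(st_coordinates(kept))
```

The resulting `kept_xy` has columns X and Y, so it can be passed to the final ggplot() call in place of circles2.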

Related

Avoid overlap of points on a timeline (1-D repeling)

I want to create a timeline plot that roughly resembles the example below: lots of overlap at some points, not a lot of overlap at others.
What I need: overlapping images should repel each other where necessary, eliminating or reducing overlap. Ideally I'd be able to implement either a vertical or horizontal repel.
library(tidyverse)
library(ggimage)
test_img <- list.files(system.file("extdata", package="ggimage"), pattern="png", full.names=TRUE)
set.seed(123)
df <-
tibble(date = as.Date(paste0("2020-", round(runif(45, 1, 2)), "-", round(runif(45, 1, 10)))),
group = paste0("Timeline ", rep(1:9, each = 5)),
img = sample(test_img, size = 45, replace = T) )
df %>%
ggplot() +
geom_line(aes(x = date, y = group, group = group), size = 5, alpha = 0.2) +
geom_image(aes(x = date, y = group, image = img, group = group), asp = 1)
Something similar to the repelling in ggbeeswarm::geom_beeswarm or ggrepel::geom_text_repel would be nice, but those don't support images. So I think I need to pre-apply some kind of 1-dimensional packing algorithm, implementing iterative pair-wise repulsion on my vector of dates within each group, to try to find a non-overlapping arrangement.
Any ideas? Thank you so much!
Created on 2021-10-30 by the reprex package (v2.0.1)
Here is the solution I've been able to come up with, repurposing the circleRepelLayout function from the awesome packcircles package into the repel_vector function, which takes your overlapping vector and a "repel_radius" and returns, if possible, a non-overlapping version.
I demonstrate the solution with the richtext geom since this is a geom I’ve always wished had repel functionality.
library(packcircles)
library(tidyverse)
library(ggtext)
library(ggimage)
repel_vector <- function(vector, repel_radius = 1, repel_bounds = range(vector)){
stopifnot(is.numeric(vector))
repelled_vector <-
packcircles::circleRepelLayout(x = data.frame(vector, ypos = 1, repel_radius),
xysizecols = c("vector", "ypos", "repel_radius"),
xlim = repel_bounds, ylim = c(0,1),
wrap = FALSE) %>%
as.data.frame() %>%
.$layout.x
return(repelled_vector)
}
overlapping_vec <- c(1, 1.1, 1.2, 10, 10.1, 10.2)
repelled_vec_default <- repel_vector(overlapping_vec)
repelled_vec_tighter <- repel_vector(overlapping_vec, repel_radius = 0.35)
ggplot() +
annotate("richtext", x = overlapping_vec, y = 3, label = "**test**", alpha = 0.5) +
annotate("richtext", x = repelled_vec_default, y = 2, label = "**test**", alpha = 0.5) +
annotate("richtext", x = repelled_vec_tighter, y = 1, label = "**test**", alpha = 0.5) +
scale_y_continuous(breaks = 1:3, labels = c("Tighter repel", "Default repel", "Overlapping points"))
In theory you could apply this to 2D repelling as well.
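As a rough sketch of the 2D case, circleRepelLayout can be given both coordinates directly instead of a constant y position. The sample points, the bounds and the radius of 0.5 are invented for illustration. Note that sizetype = "radius" is set explicitly here, since the function's default interprets the size column as an area.

```r
library(packcircles)

# Hypothetical overlapping 2D points, each with the same repel radius
pts <- data.frame(
  x = c(1, 1.1, 1.2, 5, 5.1),
  y = c(1, 1.1, 0.9, 5, 5.1),
  radius = 0.5
)

# Repel in both dimensions inside an assumed bounding box
res <- circleRepelLayout(
  pts,
  xlim = c(-2, 8), ylim = c(-2, 8),
  xysizecols = c("x", "y", "radius"),
  sizetype = "radius",
  wrap = FALSE
)

# Adjusted positions, one row per input point
repelled <- res$layout
```

The algorithm is iterative, so a fully non-overlapping arrangement is not guaranteed if the bounds are too tight for the chosen radius.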
To solve the problem in my question, this can be applied like so:
test_img <- list.files(system.file("extdata", package="ggimage"), pattern="png", full.names=TRUE)
set.seed(123)
df <-
tibble(date = as.Date(paste0("2020-", round(runif(45, 1, 2)), "-", round(runif(45, 1, 10)))),
group = paste0("Timeline ", rep(1:9, each = 5)),
img = sample(test_img, size = 45, replace = T) ) %>%
group_by(group) %>%
mutate(repelled_date = repel_vector(as.numeric(date),
repel_radius = 4,
repel_bounds = range(as.numeric(date)) + c(-3,3)),
repelled_date = as.Date(repelled_date, origin = "1970-01-01"))
df %>%
ggplot() +
geom_line(aes(x = date, y = group, group = group), size = 5, alpha = 0.2) +
geom_image(aes(x = repelled_date, y = group, image = img, group = group), asp = 1)

Coloring each quadrant on a ggplot xy scatter according to a color scale specified by a third variable. geom_rect R

Objective:
Create the XY scatterplot of variables (xx,yy). Color the corresponding Cartesian quadrants according to a third variable's (return) median.
I've created the color vector using colorRampPalette. The issue is that it is being read as continuous (though the vector is discrete).
Have the scatter points be blue (not labeled "blue")
Include a label on each quadrant according to dt.data[, quadrant] so that it is easy to identify what the area corresponds to. So mark A on the top right, B on the bottom right, etc.
This is the code I've written.
library(data.table)
set.seed(42)
dt <- data.table(
xx = rnorm(40, 0, 2),
yy = rnorm(40, 0, 2),
return = rnorm(40, 1, 3))
## compute the range we're going to want to plot over
## in this case 50% more than the max value
RANGE <- 1.5 * dt[, max(abs(c(xx, yy)))]
## compute the medians per quadrant
dtMedians <- dt[,
.(med = median(return)),
.(sign_x = sign(xx), sign_y = sign(yy))]
## set up some fake labels
dtMedians[, quadrant := letters[1:4]]
## compute a color scale for the medians and assign it
fcol <- colorRampPalette(c("#FC4445", "#3FEEE6", "#5CDB95"))
dtMedians[, col := fcol(4)[rank(med)]]
Mycol <- dtMedians[, .(col)]
dt.rects2<- data.table(
quadrant = letters[1:4],
xmin= c(0,0,-RANGE, -RANGE),
xmax= c(RANGE,RANGE,0,0),
ymin= c(0,-RANGE,-RANGE,0),
ymax= c(RANGE,0,0,RANGE))
dt.data <- merge(dtMedians, dt.rects2, by ="quadrant")
gg<- ggplot() +
geom_rect(data = dt.data,
aes(xmin = xmin, xmax = xmax, ymin = ymin, ymax = ymax, fill = med ))
gg+
scale_fill_manual(values = Mycol ) +
labs(x="xx", y="yy", title='US. Growth Quadrant') +
geom_point(data = dt,
aes(x = xx,
y = yy,
color = 'blue'))
While I think the code could be much cleaner, I left it unchanged to the extent possible - there were a few mistakes (e.g., with the variables x and y) that I had to correct to be able to run the code. Now as to your questions:
You can tell R to treat a variable as a factor with fill = as.factor(med). In addition, I had to adjust scale_fill_manual(values = Mycol$col) to select the colors defined in variable col of df Mycol.
To make the scatters blue, I took the color = 'blue' outside of the aes() in the geom_point().
I used annotate() to label the corners of the plot, which relies on manually defining the x and y coordinates. I am sure there are other, potentially better (and automated) solutions out there.
Full code for the plot (taking your data):
ggplot() +
geom_rect(data = dt.data,
aes(xmin = xmin, xmax = xmax, ymin = ymin, ymax = ymax, fill = as.factor(med))) +
scale_fill_manual(values = Mycol$col) +
labs(x="xx", y="yy", title='US. Growth Quadrant') +
geom_point(data = dt,
aes(x = xx,
y = yy),
color = 'blue') +
annotate(geom = 'text', label = 'A', x = 5, y = 5, size = 8) +
annotate(geom = 'text', label = 'B', x = 5, y = -5, size = 8) +
annotate(geom = 'text', label = 'C', x = -5, y = -5, size = 8) +
annotate(geom = 'text', label = 'D', x = -5, y = 5, size = 8)
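One way the labels could be automated, sketched under a few assumptions: the rectangle corners and the label letters are derived from the sign of each quadrant, and the label is placed at the centre of each rectangle with geom_text(). The column names `ret`, `label_x` and `label_y` and the object `quads` are hypothetical, and the letters follow the A = top right, B = bottom right convention from the question.

```r
library(data.table)
library(ggplot2)

set.seed(42)
dt <- data.table(xx = rnorm(40, 0, 2),
                 yy = rnorm(40, 0, 2),
                 ret = rnorm(40, 1, 3))   # "ret" stands in for the return column
RANGE <- 1.5 * dt[, max(abs(c(xx, yy)))]

# Median per quadrant, keyed by the signs of xx and yy
quads <- dt[, .(med = median(ret)), by = .(sign_x = sign(xx), sign_y = sign(yy))]

# Rectangle corners follow from the signs, so medians and rectangles cannot get mixed up
quads[, `:=`(xmin = pmin(0, sign_x * RANGE), xmax = pmax(0, sign_x * RANGE),
             ymin = pmin(0, sign_y * RANGE), ymax = pmax(0, sign_y * RANGE))]

# Label positions at the rectangle centres; letters assigned by position
quads[, `:=`(label_x = (xmin + xmax) / 2, label_y = (ymin + ymax) / 2)]
quads[, quadrant := ifelse(sign_x > 0,
                           ifelse(sign_y > 0, "A", "B"),
                           ifelse(sign_y > 0, "D", "C"))]

p <- ggplot() +
  geom_rect(data = quads,
            aes(xmin = xmin, xmax = xmax, ymin = ymin, ymax = ymax,
                fill = factor(round(med, 2)))) +
  geom_point(data = dt, aes(x = xx, y = yy), color = "blue") +
  geom_text(data = quads, aes(x = label_x, y = label_y, label = quadrant), size = 8) +
  labs(x = "xx", y = "yy", fill = "Median")
```

Deriving the rectangles from sign_x and sign_y, instead of merging on letters assigned in row order, keeps each median attached to the quadrant it was computed from.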

ggplot from two tibbles; scatterplot with contours in background

I have two tibbles -
tbl1 contains real data : X, Y and choice.
tbl2 is synthetic tibble to calculate contours of predicted choice P.
library(tidyverse)
# tibble1
X <- c(1, 3, 5)
Y <- c(1, 5, 3)
choice <- c(0, 1, 1)
tbl1 <- tibble(X,Y,choice)
# tibble2
X <- seq(0, 5, 0.1)
Y <- seq(0, 5, 0.1)
tbl2 <- crossing(X,Y)
tbl2 <- tbl2 %>%
mutate(V = (X + Y - 4)/2,
P = 1/(1+exp(-V)))
I wish to create a single ggplot with
scatterplot X vs Y from tbl1 (with color = choice)
filled contours of P from tbl2 in the background
Thanks
Perhaps this?
library(ggplot2)
ggplot(tbl2, aes(X, Y)) +
geom_contour_filled(aes(z = P), alpha = 0.3) +
geom_point(aes(color = factor(choice)), size = 5, data = tbl1) +
guides(fill = guide_none()) +
labs(color = "Choice")

R - Overlay multiple least squares plots with colour coding

I'm trying to visualize some data that looks like this
line1 <- data.frame(x = c(4, 24), y = c(0, -0.42864), group = "group1")
line2 <- data.frame(x = c(4, 12 ,24), y = c(0, 2.04538, 3.4135), group = "group2")
line3 <- data.frame(x = c(4, 12, 24), y = c(0, 3.14633, 3.93718), group = "group3")
line4 <- data.frame(x = c(0, 3, 7, 12, 18), y = c(0, -0.50249, 0.11994, -0.68694, -0.98949), group = "group4")
line5 <- data.frame(x = c(0, 3, 7, 12, 18, 24), y = c(0, -0.55753, -0.66006, 0.43796, 1.38723, 3.17906), group = "group5")
df <- do.call(rbind, list(line1, line2, line3, line4, line5))
What I'm trying to do is plot the least squares line (and points) for each group on the same plot. And I'd like the colour of the lines and points to correspond to the group.
All I've been able to do is plot the points according to their group
ggplot(data = df, aes(x, y, colour = group)) + geom_point(aes(size = 10))
But I have no idea how to add in the lines as well and make their colours correspond to the points that they are fitting.
I'd really appreciate any help with this. It's turning out to be so much harder than I thought it would be.
You can simply add a geom_smooth layer to your plot
ggplot(data = df, aes(x, y, colour = group)) + geom_point(aes(size = 10)) +
geom_smooth(method="lm",se=FALSE)
method="lm" specifies that you want a linear model
se=FALSE to avoid plotting confidence intervals
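To see which lines geom_smooth(method = "lm") draws when colour is mapped to group, the per-group least squares coefficients can be computed directly. This is only a sketch: the names `fits` and `coef_table` are made up, and it fits y ~ x separately for each group, which is what a grouped geom_smooth does.

```r
line1 <- data.frame(x = c(4, 24), y = c(0, -0.42864), group = "group1")
line2 <- data.frame(x = c(4, 12, 24), y = c(0, 2.04538, 3.4135), group = "group2")
line3 <- data.frame(x = c(4, 12, 24), y = c(0, 3.14633, 3.93718), group = "group3")
line4 <- data.frame(x = c(0, 3, 7, 12, 18),
                    y = c(0, -0.50249, 0.11994, -0.68694, -0.98949), group = "group4")
line5 <- data.frame(x = c(0, 3, 7, 12, 18, 24),
                    y = c(0, -0.55753, -0.66006, 0.43796, 1.38723, 3.17906), group = "group5")
df <- do.call(rbind, list(line1, line2, line3, line4, line5))

# One least squares fit per group, as geom_smooth(method = "lm") does per colour
fits <- lapply(split(df, df$group), function(d) coef(lm(y ~ x, data = d)))

# One row per group with columns "(Intercept)" and "x" (the slope)
coef_table <- do.call(rbind, fits)
```

For group1, which has only two points, the fitted line passes exactly through both of them.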

Can ggplot's faceting be used here?

Welcome to Tidyville.
Below is a small df showing the populations of cities in Tidyville. Some cities belong to the A state and some the B state.
I wish to highlight the cities that decreased in population in red. Mission accomplished so far.
But there are many states in Tidyville. Is there a way to use ggplot's faceting to show a plot for each state? I'm uncertain because I'm new and I do a little calculation outside the ggplot call to identify the cities that decreased in population.
library(ggplot2)
library(tibble)
t1 <- tibble (
y2001 = c(3, 4, 5, 6, 7, 8, 9, 10),
y2016 = c(6, 3, 9, 2, 8, 2, 11, 15),
type = c("A", "A", "B", "B", "A", "A", "B", "B")
)
years <- 15
y2001 <- t1$y2001
y2016 <- t1$y2016
# Places where 2016 pop'n < 2001 pop'n
yd <- y2016 < y2001
decrease <- tibble (
y2001 = t1$y2001[yd],
y2016 = t1$y2016[yd]
)
# Places where 2016 pop'n >= 2001 pop'n
yi <- !yd
increase <- tibble (
y2001 = t1$y2001[yi],
y2016 = t1$y2016[yi]
)
ggplot() +
# Decreasing
geom_segment(data = decrease, aes(x = 0, xend = years, y = y2001, yend = y2016),
color = "red") +
# Increasing or equal
geom_segment(data = increase, aes(x = 0, xend = years, y = y2001, yend = y2016),
color = "black")
I think this would be much easier if you just put your data in a tidy format like ggplot2 expects. Here's a possible solution using tidyverse functions
library(tidyverse)
t1 %>%
rowid_to_column("city") %>%
mutate(change=if_else(y2016 < y2001, "decrease", "increase")) %>%
gather(year, pop, y2001:y2016) %>%
ggplot() +
geom_line(aes(year, pop, color=change, group=city)) +
facet_wrap(~type) +
scale_color_manual(values=c("red","black"))
Your intermediary steps are unnecessary and lose some of your data. We'll keep what you created first:
t1 <- tibble (
y2001 = c(3, 4, 5, 6, 7, 8, 9, 10),
y2016 = c(6, 3, 9, 2, 8, 2, 11, 15),
type = c("A", "A", "B", "B", "A", "A", "B", "B")
)
years <- 15
But instead of doing all the separating and subsetting, we'll just create a dummy variable for whether or not y2016 >= y2001.
t1$incr <- as.factor(ifelse(t1$y2016 >= t1$y2001, 1, 0))
Then we can move the data argument into the ggplot() call to make it more concise. We'll use only one geom_segment() layer and map the color aesthetic to the dummy variable we created before. We then need to pass a vector of colors to scale_color_manual()'s values argument. Finally, add facet_grid(). If you're only faceting on one variable, you put a period on the opposite side of the tilde. Period first means the panels will be placed side by side; period last means they'll be stacked on top of each other.
ggplot(data = t1) +
geom_segment(aes(x = 0, xend = years, y = y2001, yend = y2016, color = incr)) +
# incr is "0" for a decrease (red) and "1" for an increase or no change (black)
scale_color_manual(values = c("0" = "red", "1" = "black")) +
facet_grid(type ~ .)
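The two placements of the period can be compared with a small sketch. The `change` column and the object names `base_plot`, `side_by_side` and `stacked` are assumptions for this example; the data is the t1 table from the question, rebuilt as a plain data frame.

```r
library(ggplot2)

t1 <- data.frame(
  y2001 = c(3, 4, 5, 6, 7, 8, 9, 10),
  y2016 = c(6, 3, 9, 2, 8, 2, 11, 15),
  type  = c("A", "A", "B", "B", "A", "A", "B", "B")
)
years <- 15

# Flag each city by the direction of its population change
t1$change <- ifelse(t1$y2016 < t1$y2001, "decrease", "increase")

base_plot <- ggplot(t1) +
  geom_segment(aes(x = 0, xend = years, y = y2001, yend = y2016, color = change)) +
  scale_color_manual(values = c(decrease = "red", increase = "black"))

# Period first: one column per state, panels side by side
side_by_side <- base_plot + facet_grid(. ~ type)

# Period last: one row per state, panels stacked vertically
stacked <- base_plot + facet_grid(type ~ .)
```

Naming the colour values (decrease, increase) avoids relying on the order of the factor levels.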
I believe you don't need to create two new datasets, you can add a column to t1.
t2 <- t1
t2$decr <- factor(yd + 0L, labels = c("increase", "decrease"))
I have left the original t1 intact and altered a copy, t2.
Now in order to apply ggplot facets, maybe this is what you are looking for.
ggplot() +
geom_segment(data = t2, aes(x = 0, xend = years, y = y2001, yend = y2016), color = "red") +
facet_wrap(~ decr)
If you want to change the colors, map the new column decr to color. Note that this argument changes its position: it now goes inside aes(), as aes(..., color = decr).
ggplot() +
geom_segment(data = t2, aes(x = 0, xend = years, y = y2001, yend = y2016, color = decr)) +
facet_wrap(~ decr)
require(dplyr)
t1<-mutate(t1,decrease=y2016<y2001)
ggplot(t1)+facet_wrap(~type)+geom_segment(aes(x = 0, xend = years, y = y2001, yend = y2016, colour=decrease))
