Overlay multiple geom_raster plots with different gradients - r

I would like to use ggplot's geom_raster to draw a 2D plot with 2 different gradients, but I do not know if there is a fast and elegant solution for this, and I am stuck.
The effect that I would like to see is the overlay of multiple geom_raster, essentially. Also, I would need a solution that scales to N different gradients; let me give an example with N=2 gradients which is easier to follow.
I first create a 101 x 101 grid of positions x and y
# the domain is 101 points on each axis (0 to 100)
domain = seq(0, 100, 1)
# the grid with the data
grid = expand.grid(domain, domain, stringsAsFactors = FALSE)
colnames(grid) = c('x', 'y')
Then I compute one value per grid point; imagine something stupid like this
grid$val = apply(grid, 1, function(w) { w['x'] * w['y'] })
I know how to plot this with a custom white to red gradient
ggplot(grid, aes(x = x, y = y)) +
geom_raster(aes(fill = val), interpolate = TRUE) +
scale_fill_gradient(
low = "white",
high = "red", aesthetics = 'fill')
But now imagine I have another value per grid point
grid$second_val = apply(grid, 1, function(w) { w['x'] * w['y'] + runif(1) })
Now, how do I plot a grid where each position "(x,y)" is coloured with an overlay of:
1 "white to red" gradient with value given by val
1 "white to blue" gradient with value given by second_val
Essentially, in most applications val and second_val will be two 2D density functions and I would like each gradient to represent the density value. I need two different colours to see the different distribution of the values.
I have seen this similar question but don't know how to use that answer in my case.

@Axeman's answer to my question, which you linked to, applies directly to your question as well.
Note that scales::colour_ramp() expects values between 0 and 1, so normalize val and second_val to the range [0, 1] before plotting:
grid$val_norm <- (grid$val-min(grid$val))/diff(range(grid$val))
grid$second_val_norm <- (grid$second_val-min(grid$second_val))/diff(range(grid$second_val))
Now plot using @Axeman's answer. You can plot one layer as a raster and overlay the second with annotate(). I have added transparency (alpha = .5); otherwise you would only be able to see the second layer:
ggplot(grid, aes(x = x, y = y)) +
geom_raster(aes(fill=val)) +
scale_fill_gradient(low = "white", high = "red", aesthetics = 'fill') +
annotate(geom="raster", x=grid$x, y=grid$y, alpha=.5,
fill = scales::colour_ramp(c("transparent","blue"))(grid$second_val_norm))
Or, you can plot both layers using annotate().
# plot using annotate()
ggplot(grid, aes(x = x, y = y)) +
annotate(geom="raster", x=grid$x, y=grid$y, alpha=.5,
fill = scales::colour_ramp(c("transparent","red"))(grid$val_norm)) +
annotate(geom="raster", x=grid$x, y=grid$y, alpha=.5,
fill = scales::colour_ramp(c("transparent","blue"))(grid$second_val_norm))
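The question also asks for something that scales to N gradients. The annotate() approach above can be wrapped in a loop; the following is only a minimal sketch under that assumption. The helper names normalise01() and overlay_layers() are made up for illustration and are not part of ggplot2 or scales, and the values are computed in vectorised form rather than with apply().

```r
library(ggplot2)

# Rescale a numeric vector to the range [0, 1]
normalise01 <- function(v) (v - min(v)) / diff(range(v))

# Hypothetical helper: build one semi-transparent raster layer per value column
overlay_layers <- function(data, value_cols, colours, alpha = 0.5) {
  stopifnot(length(value_cols) == length(colours))
  unname(Map(function(col, colour) {
    # Ramp from fully transparent to the target colour
    ramp <- scales::colour_ramp(c("transparent", colour))
    annotate(geom = "raster", x = data$x, y = data$y, alpha = alpha,
             fill = ramp(normalise01(data[[col]])))
  }, value_cols, colours))
}

# Same kind of grid as in the question
grid <- expand.grid(x = 0:100, y = 0:100)
grid$val <- grid$x * grid$y
grid$second_val <- grid$x * grid$y + runif(nrow(grid))

layers <- overlay_layers(grid, c("val", "second_val"), c("red", "blue"))

# Adding a list to a ggplot adds each element as its own layer
p <- ggplot(grid, aes(x = x, y = y)) + layers
```

Adding a third gradient then only requires another column name and another colour. Note that layers drawn later sit on top, so with many gradients the alpha value probably needs to be lowered.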

Related

How to draw a radar plot in ggplot using polar coordinates? [duplicate]

This question already has an answer here:
ggplot2: connecting points in polar coordinates with a straight line 2
(1 answer)
Closed 2 years ago.
I am trying to use ggplot to draw a radar-chart following the guidelines from the Grammar of Graphics. I am aware of the ggradar package but based on the grammar it looks like coord_polar should be enough here. This is the pseudo-code from the grammar:
So I thought something like this may work, however, the contour of the area chart is curved as if I used geom_line:
library(tidyverse)
dd <- tibble(category = c('A', 'B', 'C'), value = c(2, 7, 4))
ggplot(dd, aes(x = category, y = value, group=1)) +
coord_polar(theta = 'x') +
geom_area(color = 'blue', alpha = .00001) +
geom_point()
While I understand why geom_line draws arcs once in coord_polar, my understanding of the explanation from the Grammar of Graphics is that there may be an element/geom area that could plot straight lines:
There is one technical detail concerning the shape of Figure 9.29. Why
is the outer edge of the area graphic a set of straight lines instead
of arcs? The answer has to do with what is being measured. Since
region is a categorical variable, the line segments linking regions
are not in a metric region of the graph. That is, the segments of the
domain between regions are not measurable and thus the straight lines
or edges linking them are arbitrary and perhaps not subject to
geometric transformation. There is one other problem with the
grammatical specification of this figure. Can you spot it? Undo the
polar transformation and think about the domain of the plot. We
cheated.
For completeness, this question derives from this other question I asked about plotting in polar system.
tl;dr we can write a function to solve this problem.
Indeed, ggplot uses a process called data munching for non-linear coordinate systems to draw lines. It basically breaks up a straight line in many pieces, and applies the coordinate transformation on the individual pieces instead of merely the start- and endpoints of lines.
If we look at the panel drawing code of for example GeomArea$draw_group:
function (data, panel_params, coord, na.rm = FALSE)
{
...other_code...
positions <- new_data_frame(list(x = c(data$x, rev(data$x)),
y = c(data$ymax, rev(data$ymin)), id = c(ids, rev(ids))))
munched <- coord_munch(coord, positions, panel_params)
ggname("geom_ribbon", polygonGrob(munched$x, munched$y, id = munched$id,
default.units = "native", gp = gpar(fill = alpha(aes$fill,
aes$alpha), col = aes$colour, lwd = aes$size * .pt,
lty = aes$linetype)))
}
We can see that a coord_munch is applied to the data before it is passed to polygonGrob, which is the grid package function that matters for drawing the data. This happens in almost any line-based geom for which I've checked this.
Subsequently, we would like to know what is going on in coord_munch:
function (coord, data, range, segment_length = 0.01)
{
if (coord$is_linear())
return(coord$transform(data, range))
...other_code...
munched <- munch_data(data, dist, segment_length)
coord$transform(munched, range)
}
We find the logic I mentioned earlier that non-linear coordinate systems break up lines in many pieces, which is handled by ggplot2:::munch_data.
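To make the effect of munching concrete, here is a small base R sketch. It does not call ggplot2 internals: to_cartesian() and munch_segment() are simplified stand-ins written for this illustration, not the actual ggplot2:::munch_data implementation. It compares transforming only the two endpoints of a segment with transforming many intermediate pieces.

```r
# Polar -> Cartesian transformation (theta measured from the vertical axis)
to_cartesian <- function(theta, r) {
  data.frame(x = r * sin(theta), y = r * cos(theta))
}

# Simplified stand-in for munching: split one segment into n interpolated points
munch_segment <- function(theta, r, n = 50) {
  data.frame(theta = seq(theta[1], theta[2], length.out = n),
             r     = seq(r[1], r[2], length.out = n))
}

# A segment between two points at radius 1, a quarter turn apart
theta <- c(0, pi / 2)
r <- c(1, 1)

# "Linear" behaviour: transform the endpoints only, giving a straight chord
straight <- to_cartesian(theta, r)

# "Munched" behaviour: transform every piece, giving points along an arc
pieces <- munch_segment(theta, r)
curved <- to_cartesian(pieces$theta, pieces$r)

# Every munched point keeps radius 1, whereas the chord's midpoint is closer
# to the origin
curved_radius <- sqrt(curved$x^2 + curved$y^2)
chord_mid_radius <- sqrt(mean(straight$x)^2 + mean(straight$y)^2)
```

Plotting straight and curved with lines() shows the chord versus the arc, which is the difference between a coord that reports is_linear() as TRUE and one that does not.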
It would seem to me that we can trick ggplot into transforming straight lines, by somehow setting the output of coord$is_linear() to always be true.
Lucky for us, we wouldn't have to get our hands dirty by doing some deep ggproto based stuff if we just override the is_linear() function to return TRUE:
# Almost identical to coord_polar()
coord_straightpolar <- function(theta = 'x', start = 0, direction = 1, clip = "on") {
theta <- match.arg(theta, c("x", "y"))
r <- if (theta == "x")
"y"
else "x"
ggproto(NULL, CoordPolar, theta = theta, r = r, start = start,
direction = sign(direction), clip = clip,
# This is the different bit
is_linear = function(){TRUE})
}
So now we can plot away with straight lines in polar coordinates:
ggplot(dd, aes(x = category, y = value, group=1)) +
coord_straightpolar(theta = 'x') +
geom_area(color = 'blue', alpha = .00001) +
geom_point()
Now to be fair, I don't know what the unintended consequences are for this change. At least now we know why ggplot behaves this way, and what we can do to avoid it.
EDIT: Unfortunately, I don't know of an easy/elegant way to connect the points across the axis limits but you could try code like this:
# Refactoring the data
dd <- data.frame(category = c(1,2,3,4), value = c(2, 7, 4, 2))
ggplot(dd, aes(x = category, y = value, group=1)) +
coord_straightpolar(theta = 'x') +
geom_path(color = 'blue') +
scale_x_continuous(limits = c(1,4), breaks = 1:3, labels = LETTERS[1:3]) +
scale_y_continuous(limits = c(0, NA)) +
geom_point()
Some discussion about polar coordinates and crossing the boundary, including my own attempt at solving that problem, can be found in the question "geom_path() refuses to cross over the 0/360 line in coord_polar()".
EDIT2:
I'm mistaken, it seems quite trivial anyway. Assume dd is your original tibble:
ggplot(dd, aes(x = category, y = value, group=1)) +
coord_straightpolar(theta = 'x') +
geom_polygon(color = 'blue', alpha = 0.0001) +
scale_y_continuous(limits = c(0, NA)) +
geom_point()

Draw a geom_segment line with a colour gradient? (or is there another way to emphasize start vs end?)

I have two sets of latitude & longitude variables for a large number of rows in my data frame (~100,000). I am trying to make a plot that connects those two sets of coordinates (i.e, ~100,000 lines that go from latitude1,longitude1 to latitude2,longitude2), using geom_segment, with a very low alpha to make the lines transparent because there are so many lines.
I would like to emphasize the starting points and end points of those lines, and I reckoned the best way to do that would be to have a colour gradient from start to end (let's say green to red).
Is it possible to draw a geom_segment line with a colour gradient? If not, do you know another way to emphasize start vs end with so many lines?
(I realize that it could end up looking messy because there are so many lines, but I suspect that many of them go in the same direction..)
Here is some example data of 5 rows (but in reality I have ~100,000, so it should be somewhat computationally efficient):
example.df <- as.data.frame(matrix(c(1,435500,387500,320000,197000,
2,510500,197500,513000,164000,
3,164500,40500,431000,385000,
4,318500,176500,316000,172000,
5,331500,188500,472000,168000),
nrow=5, ncol=5, byrow = TRUE))
colnames(example.df) <- c("ID","longitude.1","latitude.1",
"longitude.2","latitude.2")
library(ggforce)
ggplot(example.df, aes(longitude.1, latitude.1))+
geom_link(aes(x=longitude.1, y=latitude.1,
xend=longitude.2, yend=latitude.2,
alpha=0.5), col="black")+
coord_equal()
This produces these five lines:
I would like these lines to start as blue at their first longitude-latitude coordinate point and end as red at the second longitude-latitude coordinate point.
The ggforce approach seems like the best approach to what you are asking. Your code was almost what you were looking for, but I think you may have overlooked the colour = stat(index) statement inside the mapping. I assume that the index is a statistic that geom_link() calculates under the hood to interpolate the colours.
ggplot(example.df, aes(longitude.1, latitude.1))+
geom_link(aes(x = longitude.1, y = latitude.1,
xend = longitude.2, yend = latitude.2,
colour = stat(index)), lineend = "round") +
scale_colour_gradient(low = "red", high = "green") +
coord_equal()
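As a variation, the sketch below maps the colours the way the question asks (blue at the start, red at the end) and uses after_stat(index) in place of stat(index). This assumes ggplot2 3.3.0 or later, where after_stat() is available, and assumes that index increases from the start point to the end point of each link. The example data are rebuilt with data.frame() so the block is self-contained.

```r
library(ggplot2)
library(ggforce)

# Same five rows as in the question
example.df <- data.frame(
  ID          = 1:5,
  longitude.1 = c(435500, 510500, 164500, 318500, 331500),
  latitude.1  = c(387500, 197500,  40500, 176500, 188500),
  longitude.2 = c(320000, 513000, 431000, 316000, 472000),
  latitude.2  = c(197000, 164000, 385000, 172000, 168000)
)

p <- ggplot(example.df) +
  geom_link(aes(x = longitude.1, y = latitude.1,
                xend = longitude.2, yend = latitude.2,
                colour = after_stat(index)),  # position along each link
            lineend = "round") +
  # low = start of the link, high = end of the link (assumed direction)
  scale_colour_gradient(low = "blue", high = "red") +
  coord_equal()
```

Printing p draws the plot. If the gradient comes out reversed, swapping low and high flips it.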
A word of warning though, seeing as you intend to plot many lines; if you use an alpha = for geom_link() you can clearly see segmentation of the lines:
ggplot(example.df, aes(longitude.1, latitude.1))+
geom_link(aes(x = longitude.1, y = latitude.1,
xend = longitude.2, yend = latitude.2,
colour = stat(index)), lineend = "round", size = 10, alpha = 0.1) +
scale_colour_gradient(low = "red", high = "green") +
coord_equal()
Alternatively, you can use arrowheads to indicate end positions as follows:
ggplot(example.df, aes(longitude.1, latitude.1)) +
geom_segment(aes(xend = longitude.2, yend = latitude.2),
arrow = arrow()) +
coord_equal()
A potential downside is that very short segments may be overemphasized with the arrows.
Hope this helps!

How to highlight lines on ggplot without losing color coding

This question is part R plotting and part graphic design. I am using ggplot2 to make a scatter plot to compare the 3 different populations. I want to show the individual points as well as a linear regression line, and they all have to be on the same plot for comparison purposes.
The problem is that the points are very densely plotted and when the regression line is the same color as the points they blend in together and it's very hard to see it. But since some of the populations overlap, if I make the lines all black it is impossible to see which goes with which.
What I would like is a way to highlight the regression line, possibly by outlining it with a black border, or making it a darker shade of the same color, so that it stands out against the background of points. Here's an example below. I've made the points large here to exaggerate the over-plotting, but shrinking them or reducing the alpha isn't going to help (I've tried).
library(ggplot2)
df <- data.frame('x' = c(rnorm(1000, 1),
rnorm(1000, 2),
rnorm(1000, 1)),
'y' = c(rnorm(1000, 1),
rnorm(1000, 2),
rnorm(1000, 1)),
'z' = factor(c(rep_len(1, 1000),
rep_len(2, 1000),
rep_len(3, 1000))))
# To make the angle of this line sharp
df$y[2000:3000] <- df$y[2000:3000] + df$x[2000:3000]
ggplot(data = df) +
geom_point(aes(x = x, y = y, color = z), size = 3) +
geom_smooth(aes(x = x, y = y, color = z), method = 'lm', size = 2, fill = NA) +
scale_color_brewer(palette = 'Set1')
EDIT: As per @Gregor's suggestion, plotting a black line underneath the colored ones does what I want but generates an ugly aliasing effect (which is particularly clear with sharply angled lines) that persists no matter the size of the image (see image below). Any suggestions to deal with this, or is it just a specific problem with my system?
Two recommendations you can adjust as you like:
Make the points somewhat transparent
Highlight the line by plotting a slightly bigger black (or gray) line underneath
Here's the result of a little bit of both. I also reduced the size of the points and lines.
ggplot(data = df) +
geom_point(aes(x = x, y = y, color = z),
size = 1, alpha = 0.4) +
geom_smooth(aes(x = x, y = y, group = z),
color = "black", method = 'lm', size = 1.3, fill = NA) +
geom_smooth(aes(x = x, y = y, color = z), method = 'lm', size = 1.1, fill = NA) +
scale_color_brewer(palette = 'Set1')
To deal with aliasing with the overlapping lines, try using the cairoDevice package.
I believe the links will answer better than I can.
You should use scale_color_manual() to color the points manually, or within geom_point() set alpha = 0.4 or something to make the points transparent.
http://www.sthda.com/english/wiki/ggplot2-colors-how-to-change-colors-automatically-and-manually
You can also change the shape and size too to distinguish between populations or key individuals within populations as well. http://www.sthda.com/english/wiki/ggplot2-point-shapes
As for the line, you can adjust it manually, and use scale_color_manual(), scale_size_manual(), or scale_linetype_manual().
http://www.sthda.com/english/wiki/ggplot2-line-types-how-to-change-line-types-of-a-graph-in-r-software#change-manually-the-appearance-of-lines

Gradient fill columns using ggplot2 doesn't seem to work

I would like to create a gradient within each bar of this graph going from red (low) to green (high).
At present I am using specific colours within geom_col but want to have each individual bar scale from red to green depending on the value it depicts.
Here is a simplified version of my graph (there is also a geom_line on the graph (among several other things such as titles, axis adjustments, etc.), but it isn't relevant to this question so I have excluded it from the example):
I have removed the hard-coded colours from the columns and tried using things such as scale_fill_gradient (and numerous other similar functions) to apply a gradient to said columns, but nothing seems to work.
Here is what the output is when I use scale_fill_gradient(low = "red", high = "green"):
What I want is for each bar to have its own gradient and not for each bar to represent a step in said gradient.
How can I achieve this using ggplot2?
My code for the above (green) example:
ggplot(data = all_chats_hn,
aes(x = as.Date(date))) +
geom_col(aes(y = total_chats),
colour = "black",
fill = "forestgreen")
I'm not sure if that is possible with geom_col. It is possible by using geom_line and a little data augmentation. We have to use the y value to create a sequence of y values (y_seq), so that the gradient coloring works. We also create y_seq_scaled in case you want each line to have an "independent" gradient.
library(tidyverse)
set.seed(123) # reproducibility
dat <- tibble(x = 1:10, y = abs(rnorm(10))) %>% # tibble() replaces the deprecated data_frame()
group_by(x) %>%
mutate(y_seq = list(seq(0, y, length.out = 100))) %>% # create sequence
unnest(y_seq) %>%
mutate(y_seq_scaled = (y_seq - mean(y_seq)) / sd(y_seq)) # scale sequence
# gradient for all together
ggplot(dat, aes(x = factor(x), y = y_seq, colour = y_seq))+
geom_line(size = 2)+
scale_colour_gradient(low = 'red', high = 'green')
# independent gradients per x
ggplot(dat, aes(x = factor(x), y = y_seq, colour = y_seq_scaled))+
geom_line(size = 2)+
scale_colour_gradient(low = 'red', high = 'green')
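The data augmentation step can also be written without the tidyverse. The following is a rough sketch of the same idea with made-up example data: bars, n, and y_rel are illustrative names, and y_rel rescales each bar to run from 0 to 1 instead of using a z-score, which is one possible way to give every bar its own full gradient. The plotting call assumes ggplot2 3.4.0 or later for the linewidth argument (older versions use size).

```r
library(ggplot2)

# Illustrative data: one row per bar
bars <- data.frame(x = 1:3, y = c(0.5, 1.2, 2))
n <- 100  # number of points used to draw each bar

# For every bar, build a sequence from 0 up to its height
aug <- do.call(rbind, lapply(seq_len(nrow(bars)), function(i) {
  y_seq <- seq(0, bars$y[i], length.out = n)
  data.frame(x = bars$x[i],
             y_seq = y_seq,
             y_rel = y_seq / bars$y[i])  # 0 at the base, 1 at the top of each bar
}))

# Each bar is a thick vertical line coloured by its own relative height
p <- ggplot(aug, aes(x = factor(x), y = y_seq, colour = y_rel)) +
  geom_line(linewidth = 2) +
  scale_colour_gradient(low = "red", high = "green")
```

Mapping colour to y_seq instead of y_rel gives the shared gradient across all bars.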

Adding points to GGPLOT2 Histogram

I'm trying to produce a histogram that illustrates observed points (a subset) on a histogram of all observations. To make it meaningful, I need to color each point differently and place a legend on the plot. My problem is that I can't seem to get a scale to show up on the plot. Below is an example of what I've tried.
subset <-1:8
results = data.frame(x_data = rnorm(5000),TestID=1:5000)
m <- ggplot(results,aes(x=x_data))
m+stat_bin(aes(y=..density..))+
stat_density(colour="blue", fill=NA)+
geom_point(data = results[results$TestID %in% subset,],
aes(x = x_data, y = 0),
colour = as.factor(results$TestID[results$TestID %in% subset]),
size = 5)+
scale_colour_brewer(type="seq", palette=3)
Ideally, I'd like the points to be positioned on the density line (but I'm really unsure of how to make that work, so I'll settle for positioning them at y = 0). What I need most urgently is a legend which indicates the TestID that corresponds to each of the points in subset.
Thanks a lot to anyone who can help.
This addresses your second point - if you want a legend, you need to include that variable as an aesthetic and map it to a variable (colour in this case). So all you really need to do is move colour = as.factor(results$TestID[results$TestID %in% subset]) inside the call to aes() like so:
ggplot(results,aes(x=x_data)) +
stat_bin(aes(y=..density..))+
stat_density(colour="blue", fill=NA)+
geom_point(data = results[results$TestID %in% subset,],
aes(x = x_data,
y = 0,
colour = as.factor(results$TestID[results$TestID %in% subset])
),
size = 5) +
scale_colour_brewer("Fancy title", type="seq", palette=3)
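For the first point (placing the points on the density line), one possible approach is to estimate the density once with base R and interpolate it at the selected x values. This is only a sketch under a few assumptions: picked and dens are illustrative names, density() is used with its default bandwidth so it should roughly match the curve ggplot draws, and after_stat() requires ggplot2 3.3.0 or later.

```r
library(ggplot2)

set.seed(1)  # reproducibility
results <- data.frame(x_data = rnorm(5000), TestID = 1:5000)

# Keep the selected observations in their own data frame
picked <- results[results$TestID %in% 1:8, ]

# Estimate the density of all observations, then interpolate it at the
# x positions of the selected points
dens <- density(results$x_data)
picked$dens <- approx(dens$x, dens$y, xout = picked$x_data)$y

p <- ggplot(results, aes(x = x_data)) +
  geom_histogram(aes(y = after_stat(density)), bins = 30) +
  geom_density(colour = "blue") +
  # Mapping colour inside aes() is what produces the legend
  geom_point(data = picked,
             aes(y = dens, colour = factor(TestID)),
             size = 5) +
  scale_colour_brewer("TestID", palette = "Set1")
```

Because picked carries its own TestID column, the colour mapping can refer to it directly instead of subsetting results inside aes().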
