geom_raster interpolation with log scale - r

I'm a bit stuck plotting a raster with a log scale. Consider this plot for example:
ggplot(faithfuld, aes(waiting, eruptions)) +
geom_raster(aes(fill = density))
But how can I use a log scale with this geom? None of the usual methods is very satisfying:
ggplot(faithfuld, aes(waiting, log10(eruptions))) +
geom_raster(aes(fill = density))
ggplot(faithfuld, aes(waiting, (eruptions))) +
geom_raster(aes(fill = density)) +
scale_y_log10()
and this doesn't work at all:
ggplot(faithfuld, aes(waiting, (eruptions))) +
geom_raster(aes(fill = density)) +
coord_trans(x="log10")
Error: geom_raster only works with Cartesian coordinates
Are there any options for using a log scale with a raster?
To be precise, I have three columns of data. The z value is the one I want to use to colour the raster, and it is not computed from the x and y values. So I need to supply all three columns to the ggplot function. For example:
dat <- data.frame(x = rep(1:10, 10),
y = unlist(lapply(1:10, function(i) rep(i, 10))),
z = faithfuld$density[1:100])
ggplot(dat, aes(x = log(x), y = y, fill = z)) +
geom_raster()
What can I do to get rid of those gaps in the raster?
Note that this question is related to these two:
geom_raster interpolation with log scale
Use R to recreate contour plot made in Igor
I have been keeping an updated gist of R code that combines details from the answers to these questions (example output included in the gist). That gist is here: https://gist.github.com/benmarwick/9a54cbd325149a8ff405

The faithfuld dataset already has a density column, which holds the 2D density estimates for waiting and eruptions. The waiting and eruptions values in that dataset are points on a regular grid. geom_raster does not compute the density for you; it simply plots the supplied density at each x, y coordinate, which here is the grid. So if you apply a log transformation to y, the spacing between y values (originally equal) becomes unequal, and that is why you see gaps in your plot. I used points to visualize the effect:
library(ggplot2)
library(gridExtra)
# Use point to visualize the effect of log on the dataset
g1 <- ggplot(faithfuld, aes(x=waiting, y=eruptions)) +
geom_point(size=0.5)
g2 <- ggplot(faithfuld, aes(x=waiting, y=log(eruptions))) +
geom_point(size=0.5)
grid.arrange(g1, g2, ncol=2)
If you really want to transform y to log scale and produce the density plot, you have to use the faithful dataset with geom_density_2d.
# Use geom_density_2d
ggplot(faithful, aes(x=waiting, y=log(eruptions))) +
geom_density_2d() +
stat_density_2d(geom="raster", aes(fill=..density..),
contour=FALSE)
Update: Use geom_rect and supply custom xmin, xmax, ymin, ymax values to fit the spaces of the log scale.
Since geom_raster draws tiles of equal size, you probably have to use geom_tile or geom_rect to create the plot. My idea is to calculate how wide each tile should be and adjust the xmin and xmax of each tile to fill the gaps.
dat <- data.frame(x = rep(1:10, 10),
y = unlist(lapply(1:10, function(i) rep(i, 10))),
z = faithfuld$density[1:100])
library(ggplot2)
library(gridExtra)
g <- ggplot(dat, aes(x = log(x), y = y, fill = z)) +
geom_raster()
# Replace the ymin and ymax
distance <- diff((unique(dat$x)))/2
upper <- (unique(dat$x)) + c(distance, distance[length(distance)])
lower <- (unique(dat$x)) - c(distance[1], distance)
# Create xmin, xmax, ymin, ymax
dat$xmin <- dat$x - 0.5 # default of geom_raster is 0.5
dat$xmax <- dat$x + 0.5
dat$ymin <- unlist(lapply(lower, function(i) rep(i, rle(dat$y)$lengths[1])))
dat$ymax <- unlist(lapply(upper, function(i) rep(i, rle(dat$y)$lengths[1])))
# You can also use geom_tile with the width argument
g2 <- ggplot(dat, aes(x=log(x), y=y, xmin=xmin, xmax=xmax, ymin=ymin, ymax=ymax, fill=z)) +
geom_rect()
# show the plots
grid.arrange(g, g2, ncol=2)
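As a complement to the code above, here is a minimal sketch of the same idea with the tile edges computed directly on the log scale, so that neighbouring rectangles share an edge. It is only one way to do it: the names `lx`, `half_gap`, `edges` and `rects` are made up for this example, and the outer edges are simply assumed to extend by the nearest half-gap.

```r
library(ggplot2)

# Same toy data as in the question
dat <- data.frame(x = rep(1:10, 10),
                  y = rep(1:10, each = 10),
                  z = faithfuld$density[1:100])

# Tile centres on the log scale, one per distinct x value
ux <- sort(unique(dat$x))
lx <- log(ux)
half_gap <- diff(lx) / 2

# Each edge sits halfway between neighbouring centres; the two outer
# edges reuse the nearest half-gap (an assumption of this sketch)
edges <- data.frame(x = ux,
                    xmin = lx - c(half_gap[1], half_gap),
                    xmax = lx + c(half_gap, half_gap[length(half_gap)]))

# Attach the edges to every row; y tiles keep the default half-unit extent
rects <- merge(dat, edges, by = "x")
rects$ymin <- rects$y - 0.5
rects$ymax <- rects$y + 0.5

p_rect <- ggplot(rects, aes(xmin = xmin, xmax = xmax,
                            ymin = ymin, ymax = ymax, fill = z)) +
  geom_rect() +
  labs(x = "log(x)", y = "y")
```

Printing `p_rect` should show a raster-like plot against log(x) without the white gaps, because the right edge of each tile equals the left edge of the next one.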

Related

How can I selectively colour the entire area of a plot based on specified points on the x-axis?

I would like to colour the background of a line plot based on manually inserted coordinates on the x-axis.
As a simple example:
x <- 1:10
y <- c(2.0,2.0,2.1,2.2,2.4,2.7,2.7,2.8,2.8,2.9)
df <- data.frame(x,y)
library(ggplot2)
plot <- ggplot(df, aes(x=x, y=y)) + geom_line()
print(plot)
Now I would like to colour the entire area of the plot between two specified points on the x-axis, say, between 1.5 and 3. Would anyone happen to know an easy manual solution to this?
Maybe this is what you are looking for. You could use a geom_rect to achieve this:
x <- 1:10
y <- c(2.0,2.0,2.1,2.2,2.4,2.7,2.7,2.8,2.8,2.9)
df <- data.frame(x,y)
library(ggplot2)
ggplot(df, aes(x=x, y=y)) +
geom_rect(xmin = 1.5, xmax = 3, ymin = -Inf, ymax = Inf, fill = "red", inherit.aes = FALSE) +
geom_line()
Created on 2020-12-21 by the reprex package (v0.3.0)
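If several intervals need shading, the same geom_rect idea can be driven from a small data frame. This is just a sketch: the `bands` data frame and its second interval (6 to 8) are hypothetical values added for illustration, and the alpha value is an arbitrary choice.

```r
library(ggplot2)

x <- 1:10
y <- c(2.0, 2.0, 2.1, 2.2, 2.4, 2.7, 2.7, 2.8, 2.8, 2.9)
df <- data.frame(x, y)

# Hypothetical intervals on the x-axis to shade (one row per band)
bands <- data.frame(xmin = c(1.5, 6), xmax = c(3, 8))

p <- ggplot(df, aes(x = x, y = y)) +
  # -Inf / Inf stretch each band over the full height of the panel
  geom_rect(data = bands,
            aes(xmin = xmin, xmax = xmax, ymin = -Inf, ymax = Inf),
            fill = "red", alpha = 0.2, inherit.aes = FALSE) +
  geom_line()
```

Drawing the rectangles before geom_line keeps the line on top of the shading.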

How to add contour line to tile plot

I have a set of data predicted by a model. I'm plotting it with geom_tile().
df1 <- data.frame(x=rep(seq(0,10,by=.1),each=101),
y=rep(seq(10,20,by=.1),times=101))
df1$z <- ((.1*df1$x^2+df1$y)-10)/20
library("ggplot2")
ggplot(mapping=aes(x=x,y=y,size=5,color=z),data=df1)+
geom_point(size = 16, shape = 15)
ggplot(df1, aes(x, y, fill="blue",alpha = z)) + geom_tile()
How can I add some contour lines to it at specific values (e.g. z=0.9, 0.95, 0.99)? Alternatively, geom_tile can be changed to any suitable continuous / contour / raster plot function.
ggplot(df1, aes(x, y, z = z, fill = z))+
geom_tile()+
geom_contour()
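To draw contours only at the specific values asked about, geom_contour accepts a breaks argument. A minimal sketch, assuming a ggplot2 version in which geom_contour has `breaks` (3.3.0 or later, as far as I know); the contour colour is an arbitrary choice:

```r
library(ggplot2)

df1 <- data.frame(x = rep(seq(0, 10, by = 0.1), each = 101),
                  y = rep(seq(10, 20, by = 0.1), times = 101))
df1$z <- ((0.1 * df1$x^2 + df1$y) - 10) / 20

# Contour levels requested in the question
levels_wanted <- c(0.9, 0.95, 0.99)

p <- ggplot(df1, aes(x, y, z = z)) +
  geom_tile(aes(fill = z)) +
  geom_contour(breaks = levels_wanted, colour = "black")
```

Since z runs from 0 to 1 on this grid, all three levels fall inside the data range and should each produce a line.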

When using `scale_x_log10`, how can I map `geom_text` accurately to `geom_bin2d`?

A great answer on how to label the count on geom_bin2d, can be found here:
Getting counts on bins in a heat map using R
However, when modifying this to have a logarithmic X axis:
library(ggplot2)
set.seed(1)
dat <- data.frame(x = rnorm(1000), y = rnorm(1000))
# plot MODIFIED HERE TO BECOME log10
p <- ggplot(dat, aes(x = x, y = y)) + geom_bin2d() + scale_x_log10()
# Get data - this includes counts and x,y coordinates
newdat <- ggplot_build(p)$data[[1]]
# add in text labels
p + geom_text(data=newdat, aes((xmin + xmax)/2, (ymin + ymax)/2,
label=count), col="white")
This produces labels that are very poorly mapped to their respective points.
How can I correct the geom_text based labels so that they map correctly to their respective points?
Apply the logarithmic transformation directly to the x values, not to the scale. Change only one line of your code:
p <- ggplot(dat, aes(x = log10(x), y = y)) + geom_bin2d()
That allows you to keep negative values and produces the following plot:
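If you would rather keep scale_x_log10(), another option is to back-transform the bin centres before labelling. This is a sketch that rests on one assumption: that ggplot_build() reports xmin and xmax on the transformed (log10) scale, so the midpoint has to be raised to the power of 10 before the scale transforms it again. The names `dat_pos`, `x_mid` and `y_mid` are introduced here, and rows with x <= 0 are dropped because log10 is undefined for them.

```r
library(ggplot2)

set.seed(1)
dat <- data.frame(x = rnorm(1000), y = rnorm(1000))

# log10 is undefined for x <= 0, so keep only positive x for this sketch
dat_pos <- subset(dat, x > 0)

p <- ggplot(dat_pos, aes(x = x, y = y)) + geom_bin2d() + scale_x_log10()

# Bin edges from the built plot (assumed to be in log10 units on x)
newdat <- ggplot_build(p)$data[[1]]

# Midpoint in log10 units, converted back to data units for the x position
newdat$x_mid <- 10^((newdat$xmin + newdat$xmax) / 2)
newdat$y_mid <- (newdat$ymin + newdat$ymax) / 2

p_labelled <- p +
  geom_text(data = newdat, aes(x = x_mid, y = y_mid, label = count),
            colour = "white")
```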

geom_tile with unequally spaced y values (e.g. 2^X)

I want to recreate an "image" plot in ggplot (because of some other aspects of the package). However, I'm facing a problem caused by my y-scale, which is defined by unequally but logically spaced values, e.g. I would have z values for y = 2,4,8,16,32. This causes the tiles to not be equally large, so I have these white bands in my figure. I can solve this by transforming the y values in a factor, but I don't want to do this because I'm also trying to plot other geom objects on the figure which require a numeric scale.
This clarifies my problem a bit:
# random data, with y scale numeric
d <- data.frame(Var1=rep(1901:2000,10),Var2=rep(c(2,4,8,16,32),each=100),value=rnorm(500,50,5))
line=data.frame(Var1=1901:2000,Var2=rnorm(50,1.5,0.5))
ggplot(d, aes(x=Var1, y=Var2)) +
geom_tile(aes(fill=value)) +
geom_line(data=line)
# y as factor
d2 = d
d2$Var2=as.factor(d2$Var2)
ggplot(d2, aes(x=Var1, y=Var2)) +
geom_tile(aes(fill=value)) +
geom_line(data=line)
I tried attributing the line values to the value of the nearest factor level, but this introduces a big error. Also, I tried the size option in geom_tile, but this didn't work out either.
In the example the y data is log transformed, but this is just for the ease of making a fake dataset.
Thank you.
Something like this??
ggplot(d, aes(x=Var1, y=Var2)) +
geom_tile(aes(fill=value)) +
geom_line(data=line)+
scale_y_continuous(trans="log2")
Note the addition of scale_y_continuous(trans="log2")
EDIT Based on OP's comment below.
There is no built-in "reverse log2 transform", but it is possible to create new transformations using the trans_new(...) function in package scales. And, naturally, someone has already thought of this: ggplot2 reverse log coordinate transform. The code below is based on the link.
library(scales)
reverselog2_trans <- function(base = 2) {
trans <- function(x) -log(x, base)
inv <- function(x) base^(-x)
trans_new(paste0("reverselog-", format(base)), trans, inv, log_breaks(base = base), domain = c(1e-100, Inf))
}
ggplot(d, aes(x=Var1, y=Var2)) +
geom_tile(aes(fill=value)) +
geom_line(data=line)+
scale_y_continuous(trans="reverselog2")
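To see what the custom transformation does to the y values, it can be called on its own. A small sketch, assuming scales still exports trans_new (newer releases also offer new_transform) and that the returned object exposes `transform` and `inverse` functions; the vector `y` just repeats the y levels from the question.

```r
library(scales)

# Same definition as above, with the arguments named for readability
reverselog2_trans <- function(base = 2) {
  trans_new(paste0("reverselog-", format(base)),
            transform = function(x) -log(x, base),
            inverse = function(x) base^(-x),
            breaks = log_breaks(base = base),
            domain = c(1e-100, Inf))
}

tr <- reverselog2_trans()

y <- c(2, 4, 8, 16, 32)
y_t <- tr$transform(y)     # equally spaced after the transform, in reverse order
y_back <- tr$inverse(y_t)  # the inverse recovers the original values
```

The transformed values are equally spaced, which is why the tiles end up the same height, and the sign flip is what reverses the axis.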
Perhaps another approach using a discrete scale and facets might be a possibility:
d <- data.frame(Var1=rep(1901:2000,10),Var2=rep(c(2,4,8,16,32),each=100),value=rnorm(500,50,5), chart="tile" )
d$Var2 <- factor(d$Var2, levels=rev(unique(d$Var2)))
line <- data.frame(Var1=1901:2000,Var2=rnorm(50,1.5,0.5), chart="line")
ggplot(d, aes(x=Var1, y=Var2)) +
geom_tile(aes(y = Var2, fill=value) ) +
geom_line( data=line ) +
scale_y_discrete() +
facet_grid( chart ~ ., scale = "free_y", space="free_y")
which gives a chart like:

Possible to combine position_jitter with position_dodge?

I've become quite fond of boxplots in which jittered points are overlain over the boxplots to represent the actual data, as below:
set.seed(7)
l1 <- gl(3, 1, length=102, labels=letters[1:3])
l2 <- gl(2, 51, length=102, labels=LETTERS[1:2]) # Will use this later
y <- runif(102)
d <- data.frame(l1, l2, y)
ggplot(d, aes(x=l1, y=y)) +
geom_point(position=position_jitter(width=0.2), alpha=0.5) +
geom_boxplot(fill=NA)
(These are particularly helpful when there are very different numbers of data points in each box.)
I'd like to use this technique when I am also (implicitly) using position_dodge to separate boxplots by a second variable, e.g.
ggplot(d, aes(x=l1, y=y, colour=l2)) +
geom_point(position=position_jitter(width=0.2), alpha=0.5) +
geom_boxplot(fill=NA)
However, I can't figure out how to dodge the points by the colour variable (here, l2) and also jitter them.
Here is an approach that manually performs the jittering and dodging.
# a plot with no dodging or jittering of the points
dp <- ggplot(d, aes(x=l1, y=y, colour=l2)) +
geom_point(alpha=0.5) +
geom_boxplot(fill=NA)
# build the plot for rendering
foo <- ggplot_build(dp)
# now replace the 'x' values in the data for layer 1 (unjittered and un-dodged points)
# with the appropriately dodged and jittered points
foo$data[[1]][['x']] <- jitter(foo$data[[2]][['x']][foo$data[[1]][['group']]],amount = 0.2)
# now draw the plot (need to explicitly load grid package)
library(grid)
grid.draw(ggplot_gtable(foo))
# note the following works without explicitly loading grid
plot(ggplot_gtable(foo))
I don't think you'll like it, but I've never found a way around this except to produce your own x values for the points. In this case:
d$l1.num <- as.numeric(d$l1)
d$l2.num <- (as.numeric(d$l2)/3)-(1/3 + 1/6)
d$x <- d$l1.num + d$l2.num
ggplot(d, aes(l1, y, colour = l2)) + geom_boxplot(fill = NA) +
geom_point(aes(x = x), position = position_jitter(width = 0.15), alpha = 0.5) + theme_bw()
It's certainly a long way from ideal, but becomes routine pretty quickly. If anyone has an alternative solution, I'd be very happy!
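For anyone puzzling over the offset arithmetic in that snippet, here is what it works out to for the two levels of l2. The `offset` name is only for this illustration.

```r
# Two-level grouping factor, as in the example data
l2 <- factor(c("A", "B"))

# Same arithmetic as d$l2.num above: level 1 shifts left, level 2 shifts right
offset <- as.numeric(l2) / 3 - (1/3 + 1/6)

data.frame(l2, offset)
```

Level A is shifted by -1/6 and level B by +1/6, so the points sit symmetrically on either side of each l1 position.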
The new position_jitterdodge() works for this. However, it requires the fill aesthetic to tell it how to group points, so you have to specify a manual fill to get uncolored boxes:
ggplot(d, aes(x=l1, y=y, colour=l2, fill=l2)) +
geom_point(position=position_jitterdodge(width=0.2), alpha=0.5) +
geom_boxplot() + scale_fill_manual(values=rep('white', length(unique(l2))))
I'm using a newer version of ggplot2 (ggplot2_2.2.1.9000) and I was struggling to find an answer that worked for a similar plot of my own. #John Didon's answer produced an error for me; Error in position_jitterdodge(width = 0.2) : unused argument (width = 0.2). I had previous code that worked with geom_jitter that stopped working after downloading the newer version of ggplot2. This is how I solved it below - minimal-fuss code....
ggplot(d, aes(x=l1, y=y, colour=l2, fill=l2)) +
geom_point(position = position_jitterdodge(dodge.width = 1,
jitter.width = 0.5), alpha=0.5) +
geom_boxplot(position = position_dodge(width = 1), fill = NA)
Another option would be to use facets:
set.seed(7)
l1 <- gl(3, 1, length=102, labels=letters[1:3])
l2 <- gl(2, 51, length=102, labels=LETTERS[1:2]) # Will use this later
y <- runif(102)
d <- data.frame(l1, l2, y)
ggplot(d, aes(x=l1, y=y, colour=l2)) +
geom_point(position=position_jitter(width=0.2), alpha=0.5) +
geom_boxplot(fill=NA) +
facet_grid(.~l2) +
theme_bw()
Sorry, don't have enough points to post the resulting graph.
