How to scale data when creating a new geom in ggplot2? - r

I am trying to create a new geom in ggplot2 that draws many lines. However, the lines it draws are not positioned accurately. Here is a simple illustration of the problem. Consider this example:
GeomLine1 <- ggproto("GeomLine1", Geom,
required_aes = c('x','y'),
default_aes = aes(colour = "black"),
draw_key = draw_key_abline,
draw_panel = function(data, panel_scales, coord) {
grid::linesGrob(x=data$x,y=data$y,default.units = 'native')}
)
geom_line1 <- function(mapping = NULL, data = NULL, stat = "identity",
position = "identity", na.rm = FALSE, show.legend = NA,
inherit.aes = TRUE, ...) {
layer(
geom = GeomLine1, mapping = mapping, data = data, stat = stat,
position = position, show.legend = show.legend, inherit.aes = inherit.aes,
params = list(na.rm = na.rm, ...)
)
}
This is a new geom called geom_line1, which should draw a line.
dat <- data.frame(x = c(0,10), y = c(0,10))
ggplot(dat,aes(x,y))+geom_line1() + geom_point()
In the output of this code, the line extends beyond the points. If I use the default geom_line, the result is correct. What is wrong with my code? Also, it would be great if anyone could suggest a good tutorial on writing new geoms.
ggplot(dat,aes(x,y))+geom_line() + geom_point()

I managed to figure out the problem. The data first has to be transformed into the panel's coordinate system, and the transformed data is then used for drawing:
coords <- coord$transform(data, panel_scales)
grid::linesGrob(x=coords$x,y=coords$y,default.units = 'native')
That is, use coords$x and coords$y instead of data$x and data$y.
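Putting the fix back into the geom from the question gives roughly the following. This is a minimal sketch, assuming a ggplot2 version in which draw_panel receives (data, panel_params, coord); the second argument is the one called panel_scales in the question. Passing the colour on through gpar is an addition for the sketch, not part of the original answer.

```r
library(ggplot2)

GeomLine1 <- ggproto("GeomLine1", Geom,
  required_aes = c("x", "y"),
  default_aes = aes(colour = "black"),
  draw_key = draw_key_abline,
  draw_panel = function(data, panel_params, coord) {
    # Rescale the data values into the panel's coordinate space
    coords <- coord$transform(data, panel_params)
    grid::linesGrob(
      x = coords$x, y = coords$y,
      default.units = "native",
      # Assumption for this sketch: use the first colour value for the line
      gp = grid::gpar(col = coords$colour[1])
    )
  }
)

geom_line1 <- function(mapping = NULL, data = NULL, stat = "identity",
                       position = "identity", na.rm = FALSE,
                       show.legend = NA, inherit.aes = TRUE, ...) {
  layer(
    geom = GeomLine1, mapping = mapping, data = data, stat = stat,
    position = position, show.legend = show.legend,
    inherit.aes = inherit.aes, params = list(na.rm = na.rm, ...)
  )
}

dat <- data.frame(x = c(0, 10), y = c(0, 10))
p <- ggplot(dat, aes(x, y)) + geom_line1() + geom_point()
```

Printing p should now draw the line from point to point, as geom_line does.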

Related

Develop a modified version of stat_contour

I'm ultimately trying to plot contour plots, or "raster plots", of irregular datasets, which is of course a rather common question. Many solutions propose interpolating the data first and then plotting it, for instance "Plotting contours on an irregular grid" among others, or in fact the manual page at https://ggplot2.tidyverse.org/reference/geom_contour.html
However, for convenience I'm trying to wrap it into a new stat.
I managed to get something that works for geom_raster, simply lifting the interpolation code from the example in the manual:
require(akima)
StatInterpRaster <- ggproto("StatInterpRaster", Stat,
compute_group = function(data, scales) {
ii<-akima::interp(x = data$x,
y = data$y,
z = data$fill)
data.out <- tibble(x = rep(ii$x, nrow(ii$z)),
y = rep(ii$y, each = ncol(ii$z)),
fill = as.numeric(ii$z) )
return(data.out)
},
required_aes = c("x", "y", "fill")
)
stat_interp_raster<- function(mapping = NULL, data = NULL, geom = "contour",
position = "identity", na.rm = FALSE, show.legend = NA,
inherit.aes = TRUE, ...) {
layer(
stat = StatInterpRaster, data = data, mapping = mapping, geom = geom,
position = position, show.legend = show.legend, inherit.aes = inherit.aes,
params = list(na.rm = na.rm, ...)
)
}
which works as expected:
ee <- tibble (x=rnorm(50),y=rnorm(50),z=x*y)
ee %>% ggplot() + geom_raster(aes(x=x,y=y,fill=z),stat=StatInterpRaster)
I am now trying to achieve the same thing with contours. Naively, I tried:
StatInterpContour <- ggproto("StatInterpContour", Stat,
compute_group = function(data, scales) {
ii<-akima::interp(x = data$x,
y = data$y,
z = data$z)
data.out <- tibble(x = rep(ii$x, nrow(ii$z)),
y = rep(ii$y, each = ncol(ii$z)),
z = as.numeric(ii$z) )
#StatContour(data.out)
return(data.out)
},
required_aes = c("x", "y", "z")
)
stat_interp_contour<- function(mapping = NULL, data = NULL, geom = "contour",
position = "identity", na.rm = FALSE, show.legend = NA,
inherit.aes = TRUE, ...) {
layer(
stat = StatInterpContour, data = data, mapping = mapping, geom = geom,
position = position, show.legend = show.legend, inherit.aes = inherit.aes,
params = list(na.rm = na.rm, ...)
)
}
which is essentially the same as above. However it does not produce the expected result :
ee %>% ggplot() + geom_contour(aes(x=x,y=y,z=z),stat=StatInterpContour)
In retrospect, this is not surprising. My stat generates a regular data array, with neatly ordered values in x and y, but nowhere do I generate the actual lines. The contour lines are more complicated and seem to be generated by xyz_to_isolines in stat_contour (cf. https://github.com/tidyverse/ggplot2/blob/main/R/stat-contour.r , line 97 as of today).
I could copy the relevant code in stat-contour.r, but it seems to me that it is a waste of effort and it would be better to simply pass my result to stat_contour, that already does the job: it generates contour lines from an object of that shape. So apparently I "just" have to call StatContour (or friends) somewhere in my StatInterpContour function compute_group -- but how ?
Thanks !
You are right that you shouldn't need to copy code over from StatContour. Instead, make your ggproto class inherit from StatContour. Prepare the data then pass it, along with all necessary parameters, to the compute_group function from StatContour
StatInterpContour <- ggproto("StatInterpContour", StatContour,
compute_group = function(data, scales, z.range, bins = NULL, binwidth = NULL,
breaks = NULL, na.rm = FALSE) {
ii<-akima::interp(x = data$x,
y = data$y,
z = data$z)
data <- tibble(x = rep(ii$x, nrow(ii$z)),
y = rep(ii$y, each = ncol(ii$z)),
z = as.numeric(ii$z), group = 1)
StatContour$compute_group(data, scales, z.range,
bins, binwidth, breaks, na.rm)
},
required_aes = c("x", "y", "z")
)
This requires a little modification of your user-facing function:
stat_interp_contour<- function(mapping = NULL, data = NULL, geom = "contour",
position = "identity", na.rm = FALSE,
show.legend = NA,
inherit.aes = TRUE, bins = NULL, binwidth = NULL,
breaks = NULL, ...) {
layer(
stat = StatInterpContour, data = data, mapping = mapping, geom = geom,
position = position, show.legend = show.legend, inherit.aes = inherit.aes,
params = list(na.rm = na.rm, bins = bins, binwidth = binwidth,
breaks = breaks, ...)
)
}
It should now work as expected. Here, I've plotted it along with the original points, coloured according to their z value, to show that the contours approximate the level of the points:
ee %>%
ggplot(aes(x, y)) +
geom_point(aes(color = z), size = 3) +
stat_interp_contour(aes(z = z, color = after_stat(level))) +
scale_color_viridis_c()
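One detail in the reshaping step is worth checking. Assuming akima::interp returns z as a matrix whose rows follow x and whose columns follow y (which is how its documentation describes the result), as.numeric(ii$z) walks down the columns, so x varies fastest. The rep() calls in the stat only line up when the grid is square, which is the case with interp's defaults. A small base-R sketch with a made-up, non-square grid shows the pairing that holds in general; xs, ys and zmat are hypothetical stand-ins for ii$x, ii$y and ii$z.

```r
# Hypothetical stand-ins for ii$x, ii$y and ii$z on a 3 x 2 grid
xs <- c(1, 2, 3)
ys <- c(10, 20)
zmat <- outer(xs, ys, "+")   # zmat[i, j] = xs[i] + ys[j]

# as.numeric() flattens column by column, so x cycles fastest
long <- data.frame(
  x = rep(xs, times = length(ys)),
  y = rep(ys, each = length(xs)),
  z = as.numeric(zmat)
)

# Each row's z matches the x and y it is paired with
all(long$z == long$x + long$y)
```

If a non-square output grid is ever requested from interp, the rep() arguments would need this form rather than nrow()/ncol() as written in the stat.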

Add text to different horizontal mean lines in a facet_grid of grouped data ggplot2

I have used MrFlick's solution to add different horizontal mean lines to plots shown with facet_grid.
It works great, but I was wondering if it would be possible to add some individual text next to the different lines?
My question is: Is it possible to incorporate something like this in the code? And how would you do it?
geom_text(aes(.7,mean(variable),label = round(mean(variable),digits = 2), vjust = -1))
With some adjustments to the solution of @MrFlick this can be achieved like so:
Instead of only computing yintercept, I adjusted MrFlick's function to replace y with mean(y), which ensures that the labels are placed at the y-position of the mean lines.
Instead of returning the whole dataset, the adjusted function returns only one row, in which x is set to mean(x). This ensures that we get only one label.
With these adjustments you can add labels to the mean lines via
geom_text(aes(x = 10, label = round(..yintercept.., digits = 2)), stat = "mean_line", vjust = -1, hjust = 0)
Try this:
library(ggplot2)
StatMeanLine <- ggproto("StatMeanLine", Stat,
compute_group = function(data, scales) {
transform(data, x = mean(x), y = mean(y), yintercept=mean(y))[1,]
},
required_aes = c("x", "y")
)
stat_mean_line <- function(mapping = NULL, data = NULL, geom = "hline",
position = "identity", na.rm = FALSE, show.legend = NA,
inherit.aes = TRUE, ...) {
layer(
stat = StatMeanLine, data = data, mapping = mapping, geom = geom,
position = position, show.legend = show.legend, inherit.aes = inherit.aes,
params = list(na.rm = na.rm, ...)
)
}
ggplot(mtcars, aes(mpg, cyl)) +
stat_mean_line(color="red") +
geom_text(aes(x = 10, label = round(..yintercept.., digits = 2)), stat = "mean_line", vjust = -1, hjust = 0) +
geom_point() +
facet_wrap(~ gear)
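To see what the adjusted compute_group hands to the geoms, the transform() call can be run on its own. This is a small sketch with a made-up data frame standing in for the data of one group; toy is a hypothetical name.

```r
toy <- data.frame(x = c(1, 2, 6), y = c(10, 20, 60))

# Same expression as in compute_group: every column is replaced by a
# summary of the original values, then only the first row is kept
one_row <- transform(toy, x = mean(x), y = mean(y), yintercept = mean(y))[1, ]

one_row
```

The result is a single row per group, which is why each panel gets exactly one line and one label.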

In ggproto, coord$transform did not transform some columns to [0, 1]

I want to create a new Geom type: geom_ohlc(), which is something like Candlestick Charts, to plot the stock open-high-low-close data.
After reading Hadley's article, I tried this:
GeomOHLC <- ggproto(`_class` = "GeomOHLC", `_inherit` = Geom,
required_aes = c("x", "op", "hi", "lo", "cl"),
draw_panel = function(data, panel_scales, coord){
coords <- coord$transform(data, panel_scales)
browser() # <<-- here is where I found the problem
grid::gList(
grid::rectGrob(
x = coords$x,
y = pmin(coords$op, coords$cl),
vjust = 0,
width = 0.01,
height = abs(coords$op - coords$cl),
gp = grid::gpar(col = coords$color, fill = "yellow")
),
grid::segmentsGrob(
x0 = coords$x,
y0 = coords$lo,
x1 = coords$x,
y1 = coords$hi
)
)
})
geom_ohlc <- function(data = NULL, mapping = NULL, stat = "identity", position = "identity", na.rm = FALSE, show.legend = NA, inherit.aes = TRUE, ...)
{
layer(
geom = GeomOHLC, mapping = mapping, data = data,
stat = stat, position = position, show.legend = show.legend,
inherit.aes = inherit.aes, params = list(na.rm = na.rm, ...)
)
}
dt <- data.table(x = 1:10, open = 1:10, high = 3:12, low = 0:9, close = 2:11)
p <- ggplot(dt, aes(x = x, op = open, hi = high, lo = low, cl = close)) +
geom_ohlc()
p
For simplicity, I do not consider the color of the bars here.
The resulting plot looks like this:
I added a browser() call inside the ggproto function and found that coord$transform did not transform the op, hi, lo and cl aesthetics into the interval [0, 1]. How can I fix this problem?
Moreover, is there any other documentation on how to create your own Geom type besides Hadley's article?
As mentioned in the comments under the OP's question, the problem is the aes_to_scale() function inside transform_position(), which in turn is called by coord$transform. Transformations are limited to variables named x, xmin, xmax, xend, xintercept and the equivalents for the y axis. This is mentioned in the help for transform_position:
Description
Convenience function to transform all position variables.
Usage
transform_position(df, trans_x = NULL, trans_y = NULL, ...)
Arguments
trans_x, trans_y: Transformation functions for x and y aesthetics. (will transform x, xmin, xmax, xend etc)
...: Additional arguments passed to trans_x and trans_y.
A workaround would be to use those variable names instead of the variable names used by the OP. The following code succeeds in transforming the variables, but it fails somewhere else (see the end). I do not know the details of the intended plot, so I didn't try to fix this error.
GeomOHLC <- ggproto(
`_class` = "GeomOHLC",
`_inherit` = Geom,
required_aes = c("x", "yintercept", "ymin", "ymax", "yend"),
draw_panel = function(data, panel_scales, coord) {
coords <- coord$transform(data, panel_scales)
#browser() # <<-- here is where I found the problem
grid::gList(
grid::rectGrob(
x = coords$x,
y = pmin(coords$yintercept, coords$yend),
vjust = 0,
width = 0.01,
height = abs(coords$op - coords$cl),
gp = grid::gpar(col = coords$color, fill = "yellow")
),
grid::segmentsGrob(
x0 = coords$x,
y0 = coords$ymin,
x1 = coords$x,
y1 = coords$ymax
)
)
}
)
geom_ohlc <-
function(data = NULL,
mapping = NULL,
stat = "identity",
position = "identity",
na.rm = FALSE,
show.legend = NA,
inherit.aes = TRUE,
...)
{
layer(
geom = GeomOHLC,
mapping = mapping,
data = data,
stat = stat,
position = position,
show.legend = show.legend,
inherit.aes = inherit.aes,
params = list(na.rm = na.rm, ...)
)
}
dt <-
data.table(
x = 1:10,
open = 1:10,
high = 3:12,
low = 0:9,
close = 2:11
)
p <-
ggplot(dt, aes(
x = x,
yintercept = open,
ymin = high,
ymax = low,
yend = close
)) +
geom_ohlc()
p
This transforms the variables but produces the following error:
Error in unit(height, default.units) :
'x' and 'units' must have length > 0
But hopefully from here it can be made to work.
NOTE: I chose the mapping between the original variable names (op, hi, lo, cl) and the new ones rather arbitrarily. In particular, yintercept does not seem to fit well. Maybe there is a need to support arbitrary scale variable names in ggplot2?
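The remaining error most likely comes from the height argument: after the renaming, coords$op and coords$cl no longer exist, so height ends up with length zero. A sketch of one way to finish the workaround, under that assumption, keeps the renamed aesthetics and computes the height from them. Here data.frame replaces data.table, low is mapped to ymin and high to ymax, the border colour is fixed, and the grobs are wrapped in grid::grobTree; all of these are choices made for the sketch, not part of the original answer.

```r
library(ggplot2)

GeomOHLC <- ggproto("GeomOHLC", Geom,
  required_aes = c("x", "yintercept", "ymin", "ymax", "yend"),
  draw_panel = function(data, panel_params, coord) {
    coords <- coord$transform(data, panel_params)
    grid::grobTree(
      # Wick: from low (ymin) to high (ymax)
      grid::segmentsGrob(
        x0 = coords$x, y0 = coords$ymin,
        x1 = coords$x, y1 = coords$ymax
      ),
      # Body: from open (yintercept) to close (yend)
      grid::rectGrob(
        x = coords$x,
        y = pmin(coords$yintercept, coords$yend),
        vjust = 0,
        width = 0.01,
        height = abs(coords$yintercept - coords$yend),
        gp = grid::gpar(col = "black", fill = "yellow")
      )
    )
  }
)

geom_ohlc <- function(mapping = NULL, data = NULL, stat = "identity",
                      position = "identity", na.rm = FALSE,
                      show.legend = NA, inherit.aes = TRUE, ...) {
  layer(
    geom = GeomOHLC, mapping = mapping, data = data, stat = stat,
    position = position, show.legend = show.legend,
    inherit.aes = inherit.aes, params = list(na.rm = na.rm, ...)
  )
}

dt <- data.frame(x = 1:10, open = 1:10, high = 3:12, low = 0:9, close = 2:11)
p <- ggplot(dt, aes(x = x, yintercept = open, ymin = low,
                    ymax = high, yend = close)) +
  geom_ohlc()
```

The width of 0.01 is kept from the question and is in panel units, so the bars do not scale with the spacing of x.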

geom_density - customize KDE

I would like to use a different KDE method than stats::density, which is used by stat_density/geom_density, to plot a KDE for a distribution. How should I go about this?
I realized that this can be done by extending ggplot2 with ggproto. The ggproto vignette has an example that can be adapted pretty easily:
StatDensityCommon <- ggproto("StatDensityCommon", Stat,
required_aes = "x",
setup_params = function(data, params) {
if (!is.null(params$bandwidth))
return(params)
xs <- split(data$x, data$group)
bws <- vapply(xs, bw.nrd0, numeric(1))
bw <- mean(bws)
message("Picking bandwidth of ", signif(bw, 3))
params$bandwidth <- bw
params
},
compute_group = function(data, scales, bandwidth = 1) {
### CUSTOM FUNCTION HERE ###
d <- locfit::density.lf(data$x) #FOR EXAMPLE
data.frame(x = d$x, y = d$y)
}
)
stat_density_common <- function(mapping = NULL, data = NULL, geom = "line",
position = "identity", na.rm = FALSE, show.legend = NA,
inherit.aes = TRUE, bandwidth = NULL,
...) {
layer(
stat = StatDensityCommon, data = data, mapping = mapping, geom = geom,
position = position, show.legend = show.legend, inherit.aes = inherit.aes,
params = list(bandwidth = bandwidth, na.rm = na.rm, ...)
)
}
ggplot(mpg, aes(displ, colour = drv)) + stat_density_common()
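Whatever goes in the marked spot only has to return a data frame with x and y columns. As an illustration, here is a hand-rolled Gaussian kernel estimator in base R that could be called there. The function custom_kde and its arguments are hypothetical names, not part of ggplot2 or locfit.

```r
# Hypothetical helper: Gaussian KDE evaluated on an evenly spaced grid
custom_kde <- function(x, bw, n = 512) {
  grid <- seq(min(x) - 3 * bw, max(x) + 3 * bw, length.out = n)
  # Average of normal densities centred on each observation
  dens <- vapply(grid, function(g) mean(dnorm(g, mean = x, sd = bw)),
                 numeric(1))
  data.frame(x = grid, y = dens)
}

set.seed(1)
d <- custom_kde(rnorm(100), bw = 0.3)

# Riemann sum of the estimate; should be close to 1
sum(d$y) * diff(d$x[1:2])
```

Inside compute_group this would be called as custom_kde(data$x, bw = bandwidth), which also puts the bandwidth chosen in setup_params to use.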

Manipulate ggproto to get multiple layers

I'm trying to get multiple area layers out of a ggproto object. I don't know if this is even possible but in case it is, I'm unable to figure out how.
For instance, how can I get the code below to produce two area layers where one has y coordinates as half of the other -
StatDensityHalf <- ggproto("StatDensity2", Stat,
required_aes = "x",
default_aes = aes(y = ..density..),
compute_group = function(data, scales, bandwidth = 1) {
d <- density(data$x, bw = bandwidth)
rbind(
data.frame(x = d$x, density = d$y, fill = 1),
data.frame(x = d$x, density = d$y/2, fill =2)
)
}
)
stat_density_half <- function(mapping = NULL, data = NULL, geom = "line",
position = "identity", na.rm = FALSE, show.legend = NA,
inherit.aes = TRUE, bandwidth = NULL,
...) {
layer(
stat = StatDensityHalf, data = data, mapping = mapping, geom = geom,
position = position, show.legend = show.legend, inherit.aes = inherit.aes,
params = list(bandwidth = bandwidth, na.rm = na.rm, ...)
)
}
ggplot(mpg, aes(displ)) +
stat_density_half(bandwidth = 1, geom = "area", position = "stack")
Please note, I'm NOT looking for a workaround to produce the same plot as the example suggests. I'm looking for a generic solution to this problem.
Okay, finally got around to finishing this up. This creates two layers:
library(ggplot2)
StatDensityHalf <-
ggproto("StatDensity2", Stat,
required_aes = "x",
default_aes = aes(y = ..density..),
compute_group = function(data, scales, bandwidth = 1,fak=1,fillgrp="1"){
d <- density(data$x, bw = bandwidth)
data.frame(x = d$x, density = d$y / fak, fill = fillgrp)
}
)
stat_density_half <- function(mapping = NULL, data = NULL, geom = "line",
position = "identity", na.rm = FALSE, show.legend = NA,
inherit.aes = TRUE, bandwidth = NULL, ...) {
list(
layer(
stat = StatDensityHalf, data = data, mapping = mapping, geom = geom,
position = position, show.legend = show.legend, inherit.aes = inherit.aes,
params = list(bandwidth = bandwidth, na.rm = na.rm, fak = 1, fillgrp = "1", ...)),
layer(
stat = StatDensityHalf, data = data, mapping = mapping, geom = geom,
position = position, show.legend = show.legend, inherit.aes = inherit.aes,
params = list(bandwidth = bandwidth, na.rm = na.rm, fak = 2, fillgrp = "2", ...))
)
}
ggplot(mpg, aes(cty)) +
stat_density_half(bandwidth = 2, geom = "area", position = "stack") +
scale_fill_manual(values = c("2" = "red", "1" = "blue"))
Yields:
Update:
In the first iteration I had two ggproto objects because I didn't really see how to add parameters to a ggproto (here fak and fillgrp). The solution was to add them explicitly as arguments of the compute_group function, in addition to adding them to the params list; otherwise the ggproto wrapper complains and fails.
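The part that makes this work is that ggplot2's + accepts a list of layers and adds each element in turn. A minimal sketch of just that mechanism, using built-in geoms (geom_points_and_line is a hypothetical name):

```r
library(ggplot2)

# Hypothetical wrapper that returns two layers at once
geom_points_and_line <- function(...) {
  list(
    geom_line(...),
    geom_point(...)
  )
}

p <- ggplot(mtcars, aes(wt, mpg)) +
  geom_points_and_line(colour = "steelblue")

# Number of layers that ended up in the plot
length(p$layers)
```

The same pattern applies to any user-facing stat or geom function: return list(layer(...), layer(...)) instead of a single layer.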
