Related
I'm ultimately trying to plot contour plots, or "raster plots", of irregular datasets - a rather common question of course. Many solutions propose to interpolate the data first, and then plot it, for instance here : Plotting contours on an irregular grid amongst other - or in fact, the man page at https://ggplot2.tidyverse.org/reference/geom_contour.html
However, for convenience I'm trying to wrap it into a new stat.
I managed to get something that works for geom_raster, simply lifting the interpolation code from the example in the manual:
require(akima)
StatInterpRaster <- ggproto("StatInterpRaster", Stat,
compute_group = function(data, scales) {
ii<-akima::interp(x = data$x,
y = data$y,
z = data$fill)
data.out <- tibble(x = rep(ii$x, nrow(ii$z)),
y = rep(ii$y, each = ncol(ii$z)),
fill = as.numeric(ii$z) )
return(data.out)
},
required_aes = c("x", "y", "fill")
)
stat_interp_raster<- function(mapping = NULL, data = NULL, geom = "contour",
position = "identity", na.rm = FALSE, show.legend = NA,
inherit.aes = TRUE, ...) {
layer(
stat = StatInterpRaster, data = data, mapping = mapping, geom = geom,
position = position, show.legend = show.legend, inherit.aes = inherit.aes,
params = list(na.rm = na.rm, ...)
)
}
which works as expected:
ee <- tibble (x=rnorm(50),y=rnorm(50),z=x*y)
ee %>% ggplot() + geom_raster(aes(x=x,y=y,fill=z),stat=StatInterpRaster)
I would now trying to achieve the same thing with contours. Naively I tried
StatInterpContour <- ggproto("StatInterpContour", Stat,
compute_group = function(data, scales) {
ii<-akima::interp(x = data$x,
y = data$y,
z = data$z)
data.out <- tibble(x = rep(ii$x, nrow(ii$z)),
y = rep(ii$y, each = ncol(ii$z)),
z = as.numeric(ii$z) )
#StatContour(data.out)
return(data.out)
},
required_aes = c("x", "y", "z")
)
stat_interp_contour<- function(mapping = NULL, data = NULL, geom = "contour",
position = "identity", na.rm = FALSE, show.legend = NA,
inherit.aes = TRUE, ...) {
layer(
stat = StatInterpContour, data = data, mapping = mapping, geom = geom,
position = position, show.legend = show.legend, inherit.aes = inherit.aes,
params = list(na.rm = na.rm, ...)
)
}
which is essentially the same as above. However it does not produce the expected result :
ee %>% ggplot() + geom_contour(aes(x=x,y=y,z=z),stat=StatInterpContour)
In retrospect, this is not surprising. My stat is generating a regular data array, with neatly ordered values in x and y, but nowhere am I generating the actual lines. The contour lines are more complicated, seem to be generated by xyz_to_isolines in stat_contour (cf. https://github.com/tidyverse/ggplot2/blob/main/R/stat-contour.r , line 97 as of today).
I could copy the relevant code in stat-contour.r, but it seems to me that it is a waste of effort and it would be better to simply pass my result to stat_contour, that already does the job: it generates contour lines from an object of that shape. So apparently I "just" have to call StatContour (or friends) somewhere in my StatInterpContour function compute_group -- but how ?
Thanks !
You are right that you shouldn't need to copy code over from StatContour. Instead, make your ggproto class inherit from StatContour. Prepare the data then pass it, along with all necessary parameters, to the compute_group function from StatContour
StatInterpContour <- ggproto("StatInterpRaster", StatContour,
compute_group = function(data, scales, z.range, bins = NULL, binwidth = NULL,
breaks = NULL, na.rm = FALSE) {
ii<-akima::interp(x = data$x,
y = data$y,
z = data$z)
data <- tibble(x = rep(ii$x, nrow(ii$z)),
y = rep(ii$y, each = ncol(ii$z)),
z = as.numeric(ii$z), group = 1)
StatContour$compute_group(data, scales, z.range,
bins, binwidth, breaks, na.rm)
},
required_aes = c("x", "y", "z")
)
This requires a little modification of your user-facing function:
stat_interp_contour<- function(mapping = NULL, data = NULL, geom = "contour",
position = "identity", na.rm = FALSE,
show.legend = NA,
inherit.aes = TRUE, bins = NULL, binwidth = NULL,
breaks = NULL, ...) {
layer(
stat = StatInterpContour, data = data, mapping = mapping, geom = geom,
position = position, show.legend = show.legend, inherit.aes = inherit.aes,
params = list(na.rm = na.rm, bins = bins, binwidth = binwidth,
breaks = breaks, ...)
)
}
But should now work without as expected. Here, I've plotted it along with the original points coloured according to their z value to show that the contours try to approximate the level of the points:
ee %>%
ggplot(aes(x, y)) +
geom_point(aes(color = z), size = 3) +
stat_interp_contour(aes(z = z, color = after_stat(level))) +
scale_color_viridis_c()
My goal is to write a custom geom_ method that calculates and plots, e.g., confidence intervals and these should be plotted either as polygons or as lines. The question now is, where to check which "style" should be plotted?
So far I have tried out three different approaches,
(i) write two different geom_/stat_ for line and polygon style plots,
(ii) write a single geom_/stat_ which uses a custom GeomMethod,
(iii) write a single geom_/stat_ which uses either GeomPolygon or GeomLine.
In my opinion, to sum up
(i) is more or less straightforward but only bypasses the problem,
(ii) works when you use either GeomPath$draw_panel() or GeomPolygon$draw_panel() depending on an extra parameter style. But here I can't work it out to set default_aes depending also on the extra argument style. Compare also the answer here.
(iii) works when calling geom_ but fails for calling stat_ as the name matching within ggplot2 fails. See minimal example below.
Setting up the methods of approach (iii):
geom_my_confint <- function(mapping = NULL, data = NULL, stat = "my_confint",
position = "identity", na.rm = FALSE,
show.legend = NA, inherit.aes = TRUE,
style = c("polygon", "line"), ...) {
style <- match.arg(style)
ggplot2::layer(
geom = if (style == "line") GeomPath else GeomPolygon,
mapping = mapping,
data = data,
stat = stat,
position = position,
show.legend = show.legend,
inherit.aes = inherit.aes,
params = list(
na.rm = na.rm,
style = style,
...
)
)
}
stat_my_confint <- function(mapping = NULL, data = NULL, geom = "my_confint",
position = "identity", na.rm = FALSE,
show.legend = NA, inherit.aes = TRUE,
style = c("polygon", "line"), ...) {
style <- match.arg(style)
ggplot2::layer(
geom = geom,
stat = StatMyConfint,
data = data,
mapping = mapping,
position = position,
show.legend = show.legend,
inherit.aes = inherit.aes,
params = list(
na.rm = na.rm,
style = style,
...
)
)
}
StatMyConfint <- ggplot2::ggproto("StatMyConfint", ggplot2::Stat,
compute_group = function(data, scales, style) {
if (style == "polygon") {
nd <- data.frame(
x = c(data$x, rev(data$x)),
y = c(data$y - 1, rev(data$y) + 1)
)
nd
} else {
nd <- data.frame(
x = rep(data$x, 2),
y = c(data$y - 1, data$y + 1),
group = c(rep(1, 5), rep(2, 5))
)
nd
}
},
required_aes = c("x", "y")
)
Trying out the methods of approach (iii):
library("ggplot2")
d <- data.frame(
x = seq(1, 5),
y = seq(1, 5)
)
ggplot(d, aes(x = x, y = y)) + geom_line() + geom_my_confint(style = "polygon", alpha = 0.2)
ggplot(d, aes(x = x, y = y)) + geom_line() + geom_my_confint(style = "line", linetype = 2)
This works well so far. However when calling the stat_ there is an error in ggplot2:::check_subclass because there is no GeomMyConfint method.
ggplot(d, aes(x = x, y = y)) + geom_line() + stat_my_confint()
# Error: Can't find `geom` called 'my_confint'
Any solutions or suggestions for alternative approaches?
The following isn't very elegant but seems to work. Let's define the following constructor, wherein the geom is set to GeomMyConfint, which we'll define further down.
geom_my_confint <- function(mapping = NULL, data = NULL, stat = "my_confint",
position = "identity", na.rm = FALSE,
show.legend = NA, inherit.aes = TRUE,
style = c("polygon", "line"), ...) {
style <- match.arg(style)
ggplot2::layer(
geom = GeomMyConfint,
mapping = mapping,
data = data,
stat = stat,
position = position,
show.legend = show.legend,
inherit.aes = inherit.aes,
params = list(
na.rm = na.rm,
style = style,
...
)
)
}
Below is the paired ggproto class. I've amended the use_defaults method to replace a defaulted colour by some text. Then later, the draw_panel() method chooses the actual default to replace the text we've inserted earlier, depending on the style argument.
GeomMyConfint <- ggproto(
"GeomMyConfint", GeomPolygon,
# Tag colour if it has been defaulted
use_defaults = function(self, data, params = list(), modifiers = aes()) {
has_colour <- "colour" %in% names(data) || "colour" %in% names(params)
data <- ggproto_parent(GeomPolygon, self)$use_defaults(
data, params, modifiers
)
if (!has_colour) {
data$colour <- "default_colour"
}
data
},
# Resolve colour defaults here
draw_panel = function(
data, panel_params, coord,
# Polygon arguments
rule = "evenodd",
# Line arguments
lineend = "butt", linejoin = "round", linemitre = 10,
na.rm = FALSE, arrow = NULL,
# Switch argument
style = "polygon")
{
if (style == "polygon") {
data$colour[data$colour == "default_colour"] <- NA
GeomPolygon$draw_panel(data, panel_params, coord, rule)
} else {
data$colour[data$colour == "default_colour"] <- "black"
GeomPath$draw_panel(data, panel_params, coord,
arrow, lineend, linejoin, linemitre, na.rm)
}
}
)
Then then works with the rest of the functions from your example.
A more elegant method might be to use the vctrs package to define a custom S3 class for defaulted values that is easy to recognise, but I haven't seen people trying to use aes(colour = I("default_colour")) before, so you're probably safe aside from this one single edge case.
Based on #teunbrand's answer and how geom_sf() is implemented, I came up with the following solution supporting approach (ii):
geom_my_confint <- function(mapping = NULL, data = NULL, stat = "my_confint",
position = "identity", na.rm = FALSE,
show.legend = NA, inherit.aes = TRUE,
type = c("polygon", "line"), ...) {
type <- match.arg(type)
ggplot2::layer(
geom = GeomMyConfint,
mapping = mapping,
data = data,
stat = stat,
position = position,
show.legend = show.legend,
inherit.aes = inherit.aes,
params = list(
na.rm = na.rm,
type = type,
...
)
)
}
GeomMyConfint <- ggplot2::ggproto("GeomMyConfint", ggplot2::Geom,
## Setting up all defaults needed for `GeomPolygon` and `GeomPath`
default_aes = ggplot2::aes(
colour = NA,
fill = NA,
size = NA,
linetype = NA,
alpha = NA,
subgroup = NULL
),
draw_panel = function(data, panel_params, coord,
rule = "evenodd", # polygon arguments
lineend = "butt", linejoin = "round", # line arguments
linemitre = 10, na.rm = FALSE, arrow = NULL, # line arguments
type = c("polygon", "line")) {
type <- match.arg(type)
## Swap NAs in `default_aes` with own defaults
data <- my_modify_list(data, my_default_aesthetics(type), force = FALSE)
if (type == "polygon") {
GeomPolygon$draw_panel(data, panel_params, coord, rule)
} else {
GeomPath$draw_panel(data, panel_params, coord,
arrow, lineend, linejoin, linemitre, na.rm)
}
},
draw_key = function(data, params, size) {
## Swap NAs in `default_aes` with own defaults
data <- my_modify_list(data, my_default_aesthetics(params$type), force = FALSE)
if (params$type == "polygon") {
draw_key_polygon(data, params, size)
} else {
draw_key_path(data, params, size)
}
}
)
## Helper function inspired by internal from `ggplot2` defined in `performance.R`
my_modify_list <- function(old, new, force = FALSE) {
if (force) {
for (i in names(new)) old[[i]] <- new[[i]]
} else {
for (i in names(new)) old[[i]] <- if (all(is.na(old[[i]]))) new[[i]] else old[[i]]
}
old
}
## Helper function inspired by internal from `ggplot2` defined in `geom-sf.R`
my_default_aesthetics <- function(type) {
if (type == "line") {
my_modify_list(GeomPath$default_aes, list(colour = "red", linetype = 2), force = TRUE)
} else {
my_modify_list(GeomPolygon$default_aes, list(fill = "red", alpha = 0.2), force = TRUE)
}
}
I've kept the stat_my_confint() and StatMyConfint() from above unchanged (only the argument style is now called type according to the naming w/i geom_sf()):
stat_my_confint <- function(mapping = NULL, data = NULL, geom = "my_confint",
position = "identity", na.rm = FALSE,
show.legend = NA, inherit.aes = TRUE,
type = c("polygon", "line"), ...) {
type <- match.arg(type)
ggplot2::layer(
geom = geom,
stat = StatMyConfint,
data = data,
mapping = mapping,
position = position,
show.legend = show.legend,
inherit.aes = inherit.aes,
params = list(
na.rm = na.rm,
type = type,
...
)
)
}
StatMyConfint <- ggplot2::ggproto("StatMyConfint", ggplot2::Stat,
compute_group = function(data, scales, type) {
if (type == "polygon") {
nd <- data.frame(
x = c(data$x, rev(data$x)),
y = c(data$y - 1, rev(data$y) + 1)
)
nd
} else {
nd <- data.frame(
x = rep(data$x, 2),
y = c(data$y - 1, data$y + 1),
group = c(rep(1, 5), rep(2, 5))
)
nd
}
},
required_aes = c("x", "y")
)
Now the examples from above work fine:
library("ggplot2")
d1 <- data.frame(
x = seq(1, 5),
y = seq(1, 5)
)
ggplot(d1, aes(x = x, y = y)) + geom_line() + geom_my_confint()
ggplot(d1, aes(x = x, y = y)) + geom_line() + geom_my_confint(type = "line")
ggplot(d1, aes(x = x, y = y)) + geom_line() + geom_my_confint(type = "polygon", alpha = 0.8)
ggplot(d1, aes(x = x, y = y)) + geom_line() + geom_my_confint(type = "line", linetype = 4, colour = "red")
ggplot(d1, aes(x = x, y = y)) + geom_line() + stat_my_confint()
ggplot(d1, aes(x = x, y = y)) + geom_line() + stat_my_confint(type = "line")
ggplot(d1, aes(x = x, y = y)) + geom_line() + stat_my_confint(type = "polygon", alpha = 0.8)
ggplot(d1, aes(x = x, y = y)) + geom_line() + stat_my_confint(type = "line", linetype = 4, colour = "red")
However, the solution still fails if you want additionally, e.g., set the fill colour of the polygon by an external grouping variable:
d2 <- data.frame(
x = rep(seq(1, 5), 2),
y = rep(seq(1, 5), 2),
z = factor(c(rep(1, 5), rep(2, 5)))
)
ggplot(d2, aes(x = x, y = y)) + geom_line() + geom_my_confint() + facet_wrap(.~z)
# no error
ggplot(d2, aes(x = x, y = y, fill = z)) + geom_line() + geom_my_confint() + facet_wrap(.~z)
# Error in grid.Call.graphics(C_setviewport, vp, TRUE) :
# non-finite location and/or size for viewport
So still no perfect answer. Help/extensions appreciated!
EDIT:
The error no longer occurs if the size argument is set to 0.5 within GeomMyConfint$default_aes():
Not clear to me why - anyone?!
Here, this works as I don't change the default size for GeomPolygon or GeomPath, but would be problematic otherwise.
I do not find any more errors (for now).
The adapted code:
GeomMyConfint <- ggplot2::ggproto("GeomMyConfint", ggplot2::Geom,
## Setting up all defaults needed for `GeomPolygon` and `GeomPath`
default_aes = ggplot2::aes(
colour = NA,
fill = NA,
size = 0.5,
linetype = NA,
alpha = NA,
subgroup = NULL
),
draw_panel = function(data, panel_params, coord,
rule = "evenodd", # polygon arguments
lineend = "butt", linejoin = "round", # line arguments
linemitre = 10, na.rm = FALSE, arrow = NULL, # line arguments
type = c("polygon", "line")) {
type <- match.arg(type)
## Swap NAs in `default_aes` with own defaults
data <- my_modify_list(data, my_default_aesthetics(type), force = FALSE)
if (type == "polygon") {
GeomPolygon$draw_panel(data, panel_params, coord, rule)
} else {
GeomPath$draw_panel(data, panel_params, coord,
arrow, lineend, linejoin, linemitre, na.rm)
}
},
draw_key = function(data, params, size) {
## Swap NAs in `default_aes` with own defaults
data <- my_modify_list(data, my_default_aesthetics(params$type), force = FALSE)
if (params$type == "polygon") {
draw_key_polygon(data, params, size)
} else {
draw_key_path(data, params, size)
}
}
)
The plot:
ggplot(d2, aes(x = x, y = y, fill = z)) + geom_line() + geom_my_confint() + facet_wrap(.~z)
I have used MrFlicks solution to add different horizontal mean lines to plots shown with facet_grid.
It works great, but I was wondering if it would be possible to add some individual text next to the different lines?
My question is: Is it possible to incorporate something like this in the code? And how would you do it?
geom_text(aes(.7,mean(variable),label = round(mean(variable),digits = 2), vjust = -1))
With some adjustments to the solution of #MrFlick this can be achieved like so:
Instead of only computing yintercept I adjusted MrFlick's function to replace y with the mean(y) which ensures that the labels are put on the y-position of the mean lines.
Instead of returning the whole dataset the adjusted function returns only one row, whereby I set x to mean(x). This ensures that we only get one label.
With these adjustments you can can add labels to the mean lines via
geom_text(aes(x = 10, label = round(..yintercept.., digits = 2)), stat = "mean_line", vjust = -1, hjust = 0)
Try this:
library(ggplot2)
StatMeanLine <- ggproto("StatMeanLine", Stat,
compute_group = function(data, scales) {
transform(data, x = mean(x), y = mean(y), yintercept=mean(y))[1,]
},
required_aes = c("x", "y")
)
stat_mean_line <- function(mapping = NULL, data = NULL, geom = "hline",
position = "identity", na.rm = FALSE, show.legend = NA,
inherit.aes = TRUE, ...) {
layer(
stat = StatMeanLine, data = data, mapping = mapping, geom = geom,
position = position, show.legend = show.legend, inherit.aes = inherit.aes,
params = list(na.rm = na.rm, ...)
)
}
ggplot(mtcars, aes(mpg, cyl)) +
stat_mean_line(color="red") +
geom_text(aes(x = 10, label = round(..yintercept.., digits = 2)), stat = "mean_line", vjust = -1, hjust = 0) +
geom_point() +
facet_wrap(~ gear)
A common layout in many sites is to draw the grid as shaded bars:
I'm doing this with this function:
grid_bars <- function(data, y, n = 5, fill = "gray90") {
breaks <- pretty(data[[y]], n)
len <- length(breaks)-1
all_bars <- data.frame(
b.id = rep(1:len, 4),
b.x = c(rep(-Inf, len), rep(Inf, len*2), rep(-Inf, len)),
b.y = c(rep(breaks[-length(breaks)], 2), rep(breaks[-1], 2))
)
bars <- all_bars[all_bars$b.id %in% (1:len)[c(FALSE, TRUE)], ]
grid <- list(
geom_polygon(data = bars, aes(b.x, b.y, group = b.id),
fill = fill, colour = fill),
scale_y_continuous(breaks = breaks),
theme(panel.grid = element_blank())
)
return(grid)
}
#-------------------------------------------------
dat <- data.frame(year = 1875:1972,
level = as.vector(LakeHuron))
ggplot(dat, aes(year, level)) +
grid_bars(dat, "level", 10) +
geom_line(colour = "steelblue", size = 1.2) +
theme_classic()
But it needs to specify data and y again. How to take those directly from the ggplot?
After having a look at the options for extending ggplot2 in Hadley Wickham's book on ggplot2 you probably have to set up your own Geom or Stat layer to achieve the desired result. This way you can access the data and aesthetics specified in ggplot() or even pass different data and aesthetics to your fun. Still a newbie in writing extensions for ggplot2 but a first approach may look like so:
library(ggplot2)
# Make bars dataframe
make_bars_df <- function(y, n) {
breaks <- pretty(y, n)
len <- length(breaks) - 1
all_bars <- data.frame(
group = rep(1:len, 4),
x = c(rep(-Inf, len), rep(Inf, len * 2), rep(-Inf, len)),
y = c(rep(breaks[-length(breaks)], 2), rep(breaks[-1], 2))
)
all_bars[all_bars$group %in% (1:len)[c(FALSE, TRUE)], ]
}
# Setup Geom
geom_grid_bars_y <- function(mapping = NULL, data = NULL, stat = "identity",
position = "identity", na.rm = FALSE, show.legend = NA,
inherit.aes = TRUE, n = 5, ...) {
layer(
geom = GeomGridBarsY, mapping = mapping, data = data, stat = stat,
position = position, show.legend = show.legend, inherit.aes = inherit.aes,
params = list(n = n, ...)
)
}
GeomGridBarsY <- ggproto("GeomGridBarsY", Geom,
required_aes = c("y"),
default_aes = aes(alpha = NA, colour = NA, fill = "gray90", group = NA,
linetype = "solid", size = 0.5, subgroup = NA),
non_missing_aes = aes("n"),
setup_data = function(data, params) {
transform(data)
},
draw_group = function(data, panel_scales, coord, n = n) {
bars <- make_bars_df(data[["y"]], n)
# setup data for GeomPolygon
## If you want this to work with facets you have to take care of the PANEL
bars$PANEL <- factor(1)
# Drop x, y, group from data
d <- data[ , setdiff(names(data), c("x", "y", "group"))]
d <- d[!duplicated(d), ]
# Merge information in data to bars
bars <- merge(bars, d, by = "PANEL")
# Set color = fill
bars[["colour"]] <- bars[["fill"]]
# Draw
grid::gList(
ggplot2::GeomPolygon$draw_panel(bars, panel_scales, coord)
)
},
draw_key = draw_key_rect
)
grid_bars <- function(n = 5, fill = "gray90") {
list(
geom_grid_bars_y(n = n, fill = fill),
scale_y_continuous(breaks = scales::pretty_breaks(n = n)),
theme(panel.grid = element_blank())
)
}
dat <- data.frame(year = 1875:1972,
level = as.vector(LakeHuron))
ggplot(dat, aes(year, level)) +
grid_bars(n = 10, fill = "gray95") +
geom_line(colour = "steelblue", size = 1.2) +
theme_classic()
Just for reference:
A first and simple approach to get grid bars one could simply adjust the size of the grid lines via theme() like so:
# Simple approach via theme
ggplot(dat, aes(year, level)) +
geom_line(colour = "steelblue", size = 1.2) +
scale_y_continuous(breaks = scales::pretty_breaks(n = 10)) +
theme_classic() +
theme(panel.grid.major.y = element_line(size = 8))
Created on 2020-06-14 by the reprex package (v0.3.0)
I want to draw a graph which is familiar to the enterotype plot in the research. But my new multiple-ggproto seems terrible as showed in p1, owing to the missing backgroup color of the label. I've tried multiple variations of this, for example modify GeomLabel$draw_panel in order to reset the default arguments of geom in ggplot2::ggproto. However, I could not find the labelGrob() function which is removed in ggplot2 and grid package. Thus, the solution of modification didn't work. How to modify the backgroup color of label in the multiple-ggproto. Any ideas? Thanks in advance. Here is my code and two pictures.
p1: the background color of label should be white or the text color should be black.
P2:displays the wrong point color, line color and legend.
geom_enterotype <- function(mapping = NULL, data = NULL, stat = "identity", position = "identity",
alpha = 0.3, prop = 0.5, ..., lineend = "butt", linejoin = "round",
linemitre = 1, arrow = NULL, na.rm = FALSE, parse = FALSE,
nudge_x = 0, nudge_y = 0, label.padding = unit(0.15, "lines"),
label.r = unit(0.15, "lines"), label.size = 0.1,
show.legend = TRUE, inherit.aes = TRUE) {
library(ggplot2)
# create new stat and geom for PCA scatterplot with ellipses
StatEllipse <- ggproto("StatEllipse", Stat,
required_aes = c("x", "y"),
compute_group = function(., data, scales, level = 0.75, segments = 51, ...) {
library(MASS)
dfn <- 2
dfd <- length(data$x) - 1
if (dfd < 3) {
ellipse <- rbind(c(NA, NA))
} else {
v <- cov.trob(cbind(data$x, data$y))
shape <- v$cov
center <- v$center
radius <- sqrt(dfn * qf(level, dfn, dfd))
angles <- (0:segments) * 2 * pi/segments
unit.circle <- cbind(cos(angles), sin(angles))
ellipse <- t(center + radius * t(unit.circle %*% chol(shape)))
}
ellipse <- as.data.frame(ellipse)
colnames(ellipse) <- c("x", "y")
return(ellipse)
})
# write new ggproto
GeomEllipse <- ggproto("GeomEllipse", Geom,
draw_group = function(data, panel_scales, coord) {
n <- nrow(data)
if (n == 1)
return(zeroGrob())
munched <- coord_munch(coord, data, panel_scales)
munched <- munched[order(munched$group), ]
first_idx <- !duplicated(munched$group)
first_rows <- munched[first_idx, ]
grid::pathGrob(munched$x, munched$y, default.units = "native",
id = munched$group,
gp = grid::gpar(col = first_rows$colour,
fill = alpha(first_rows$fill, first_rows$alpha), lwd = first_rows$size * .pt, lty = first_rows$linetype))
},
default_aes = aes(colour = "NA", fill = "grey20", size = 0.5, linetype = 1, alpha = NA, prop = 0.5),
handle_na = function(data, params) {
data
},
required_aes = c("x", "y"),
draw_key = draw_key_path
)
# create a new stat for PCA scatterplot with lines which totally directs to the center
StatConline <- ggproto("StatConline", Stat,
compute_group = function(data, scales) {
library(miscTools)
library(MASS)
df <- data.frame(data$x,data$y)
mat <- as.matrix(df)
center <- cov.trob(df)$center
names(center)<- NULL
mat_insert <- insertRow(mat, 2, center )
for(i in 1:nrow(mat)) {
mat_insert <- insertRow( mat_insert, 2*i, center )
next
}
mat_insert <- mat_insert[-c(2:3),]
rownames(mat_insert) <- NULL
mat_insert <- as.data.frame(mat_insert,center)
colnames(mat_insert) =c("x","y")
return(mat_insert)
},
required_aes = c("x", "y")
)
# create a new stat for PCA scatterplot with center labels
StatLabel <- ggproto("StatLabel" ,Stat,
compute_group = function(data, scales) {
library(MASS)
df <- data.frame(data$x,data$y)
center <- cov.trob(df)$center
names(center)<- NULL
center <- t(as.data.frame(center))
center <- as.data.frame(cbind(center))
colnames(center) <- c("x","y")
rownames(center) <- NULL
return(center)
},
required_aes = c("x", "y")
)
layer1 <- layer(data = data, mapping = mapping, stat = stat, geom = GeomPoint,
position = position, show.legend = show.legend, inherit.aes = inherit.aes,
params = list(na.rm = na.rm, ...))
layer2 <- layer(stat = StatEllipse, data = data, mapping = mapping, geom = GeomEllipse, position = position, show.legend = FALSE,
inherit.aes = inherit.aes, params = list(na.rm = na.rm, prop = prop, alpha = alpha, ...))
layer3 <- layer(data = data, mapping = mapping, stat = StatConline, geom = GeomPath,
position = position, show.legend = show.legend, inherit.aes = inherit.aes,
params = list(lineend = lineend, linejoin = linejoin,
linemitre = linemitre, arrow = arrow, na.rm = na.rm, ...))
if (!missing(nudge_x) || !missing(nudge_y)) {
if (!missing(position)) {
stop("Specify either `position` or `nudge_x`/`nudge_y`",
call. = FALSE)
}
position <- position_nudge(nudge_x, nudge_y)
}
layer4 <- layer(data = data, mapping = mapping, stat = StatLabel, geom = GeomLabel,
position = position, show.legend = FALSE, inherit.aes = inherit.aes,
params = list(parse = parse, label.padding = label.padding,
label.r = label.r, label.size = label.size, na.rm = na.rm, ...))
return(list(layer1,layer2,layer3,layer4))
}
# data
data(Cars93, package = "MASS")
car_df <- Cars93[, c(3, 5, 13:15, 17, 19:25)]
car_df <- subset(car_df, Type == "Large" | Type == "Midsize" | Type == "Small")
x1 <- mean(car_df$Price) + 2 * sd(car_df$Price)
x2 <- mean(car_df$Price) - 2 * sd(car_df$Price)
car_df <- subset(car_df, Price > x2 | Price < x1)
car_df <- na.omit(car_df)
# Principal Component Analysis
car.pca <- prcomp(car_df[, -1], scale = T)
car.pca_pre <- cbind(as.data.frame(predict(car.pca)[, 1:2]), car_df[, 1])
colnames(car.pca_pre) <- c("PC1", "PC2", "Type")
xlab <- paste("PC1(", round(((car.pca$sdev[1])^2/sum((car.pca$sdev)^2)), 2) * 100, "%)", sep = "")
ylab <- paste("PC2(", round(((car.pca$sdev[2])^2/sum((car.pca$sdev)^2)), 2) * 100, "%)", sep = "")
head(car.pca_pre)
#plot
library(ggplot2)
p1 <- ggplot(car.pca_pre, aes(PC1, PC2, fill = Type , color= Type ,label = Type)) +
geom_enterotype()
p2 <- ggplot(car.pca_pre, aes(PC1, PC2, fill = Type , label = Type)) +
geom_enterotype()
You can manually change the colour scale to give it more emphasis against the background fill colour:
p3 <- ggplot(car.pca_pre, aes(PC1, PC2, fill = Type , color = Type, label = Type)) +
geom_enterotype() +
scale_colour_manual(values = c("red4", "green4", "blue4"))
p3
You can additionally adjust your fill colours by changing the alpha values, or assigning different colour values to give better contrast to your labels.
p4 <- ggplot(car.pca_pre, aes(PC1, PC2, label = Type, shape = Type, fill = Type, colour = Type)) +
geom_enterotype() +
scale_fill_manual(values = alpha(c("pink", "lightgreen", "skyblue"), 1)) +
scale_colour_manual(values = c("red4", "green4", "blue4"))
p4
Finally, if you want a background white colour to your labels, you have to remove the fill option. You can also additionally assign a shape value.
As you can observe, the background text colour is associated with the shape fill colour, while the text label colour is associated with the line colour, the the shape border colour.
p5 <- ggplot(car.pca_pre, aes(PC1, PC2, label = Type, shape = Type, colour = Type)) +
geom_enterotype() + scale_colour_manual(values = c("red4", "green4", "blue4"))
p5