Add unique letters per facet with scales=free


Similar to this question, I'm trying to add a, b, c, ... to a grid of facets so they can be referenced individually elsewhere. With scales = 'fixed', this is relatively easy, as you can even hardcode the x,y coordinates for a geom_text label. However, with scales = 'free', it's a pain to compute all the x,y coordinates for each facet so that the labels end up in the same location, visually. Can this be done automatically?

A pure vanilla ggplot2 solution would be to use Inf, -Inf to snap text to the limits of each panel. To automatically get the labels you can use after_stat() to grab the PANEL internal variable. If you plot it as a label, you can have control over the offset from the panel edges by hiding the label itself and setting label.padding.
library(ggplot2)
x = data.frame(a=c('a','b','a','b'),b=c(1,1,2,2),v=runif(8))
ggplot(x,aes(x=v,y=v)) +
  geom_point() +
  facet_grid('a~b',scales='free') +
  geom_label(
    aes(x = -Inf, y = Inf, label = after_stat(ifelse(
      duplicated(PANEL), "", letters[as.numeric(PANEL)]
    ))),
    vjust = 1, hjust = 0,
    fill = NA, label.size = 0, # Don't show box
    label.padding = unit(5, "mm") # Control margins to panel bounds
  )
If that spacing mechanism seems too finicky to you, you can use the ggpp::geom_text_npc() function to directly set relative coordinates for the text labels.
x = data.frame(a=c('a','b','a','b'),b=c(1,1,2,2),v=runif(8))
ggplot(x,aes(x=v,y=v)) +
  geom_point() +
  facet_grid('a~b',scales='free') +
  ggpp::geom_text_npc(
    aes(npcx = 0.05, npcy = 0.95,
        label = after_stat(
          ifelse(duplicated(PANEL), "", letters[as.numeric(PANEL)])
        ))
  )
Created on 2022-10-18 by the reprex package (v2.0.1)
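A variation on the Inf / -Inf idea, as a minimal sketch: instead of reading PANEL inside after_stat(), build a small data frame with one row per facet and hand it to geom_text(). The name panel_labels and the hjust / vjust offsets are my own choices for illustration, and the row ordering assumes facet_grid(a ~ b) numbers its panels row by row.

```r
library(ggplot2)

set.seed(1)
x <- data.frame(a = c("a", "b", "a", "b"), b = c(1, 1, 2, 2), v = runif(8))

# One row per panel, ordered by `a` then `b` (assumed to match the
# row-by-row panel order of facet_grid(a ~ b))
panel_labels <- unique(x[c("a", "b")])
panel_labels <- panel_labels[order(panel_labels$a, panel_labels$b), ]
panel_labels$lab <- letters[seq_len(nrow(panel_labels))]

p <- ggplot(x, aes(x = v, y = v)) +
  geom_point() +
  facet_grid(a ~ b, scales = "free") +
  geom_text(
    data = panel_labels,
    aes(x = -Inf, y = Inf, label = lab),
    hjust = -0.5, vjust = 1.5,  # push the text inward from the top-left corner
    inherit.aes = FALSE
  )
print(p)
```

Because the label data carries the facet variables a and b, each letter lands in its own panel, and the infinite coordinates snap to that panel's limits regardless of the free scales.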

Here is a solution with letters as default, but allowing other labels too, and ... is also passed to geom_text.
library(ggplot2)
# main solution
add.letters = function(g,px,py,lab=NULL,...){
  # one row per panel, including PANEL, ROW, COL and the facet variables
  data = ggplot_build(g)$layout$layout
  if (is.null(lab)){ data$lab = letters[1:nrow(data)] } else { data$lab = lab }
  # map a proportion p in [0,1] onto the range r = c(min,max)
  p.range = function(r,p){ r[1] + (r[2]-r[1])*p }
  data = do.call(rbind,lapply(1:nrow(data),function(i){
    ls = layer_scales(g,data[i,]$ROW,data[i,]$COL)
    cbind(data[i,],
      x = p.range(ls$x$range$range,px),
      y = p.range(ls$y$range$range,py))
  }))
  # return the plot with the label layer added
  g + geom_text(data=data,aes(label=lab,x=x,y=y),...)
}
# mwe
x = data.frame(a=c('a','b','a','b'),b=c(1,1,2,2),v=runif(8))
g = ggplot(x,aes(x=v,y=v)) +
geom_point() +
facet_grid('a~b',scales='free')
g = add.letters(g,.05,.95,lab=c('a','bb','ccc','dddd'),hjust='left')
print(g)
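To make the role of the p.range helper concrete, here is a small sketch using two made-up panel ranges (range_panel_1 and range_panel_2 are hypothetical values, not taken from the plot above). The same proportion gives a different data coordinate in each panel, which is what keeps the labels visually aligned under free scales.

```r
# Same helper as in the answer: r is c(min, max), p is a proportion in [0, 1]
p.range <- function(r, p) { r[1] + (r[2] - r[1]) * p }

# Two hypothetical panels with very different x ranges
range_panel_1 <- c(0, 10)
range_panel_2 <- c(200, 1000)

x1 <- p.range(range_panel_1, 0.05)  # 5% of the way along panel 1
x2 <- p.range(range_panel_2, 0.05)  # 5% of the way along panel 2
print(c(x1, x2))
```

Both positions sit 5% in from the left edge of their own panel, even though the data values (about 0.5 and about 240) are far apart.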

Related

no visible binding for global variable 'density'

Let's consider my ggplot function for histogram:
library(ggplot2)
get_histogram <- function(vec, width) {
df <- data.frame(vec)
temp <- ggplot2::ggplot(df, aes(x = vec)) +
# Delete x axis name and add plot title
labs(
x = NULL,
title = "Empirical histogram vs standard normal density"
) +
# Center plot title
theme(plot.title = element_text(hjust = 0.5)) +
# Add histogram with respect to given bin width
geom_histogram(
binwidth = width,
aes(y = stat(density)),
fill = I("blue"),
col = I("red"),
alpha = I(.2)
) +
# Adding probability density function of standard normal distribution.
stat_function(fun = function(x) {
stats::dnorm(x, mean = 0, sd = 1)
})
temp
}
Let's see how it works:
get_histogram(rnorm(100), width = 0.4)
However, I get this error:
no visible binding for global variable 'density'
when running package checks. Do you know where the problem is? I tried to find it, but it seems that most of those errors are connected with the dplyr package rather than ggplot.
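No answer is recorded here for this question, so the following is only a hedged sketch of one commonly used workaround, not a confirmed fix for this particular package. The check message typically arises because density is a bare symbol that the code analysis cannot find a definition for; declaring it with utils::globalVariables() is one way to tell the checker it is intentional. The sketch also swaps stat() for after_stat(), which newer ggplot2 versions recommend. The file location mentioned in the comment is an assumption.

```r
library(ggplot2)

# Assumption: in a package, this call would sit at the top level of an R file
# (for example R/globals.R). It declares `density` as a known global name.
utils::globalVariables("density")

get_histogram <- function(vec, width) {
  df <- data.frame(vec = vec)
  ggplot2::ggplot(df, ggplot2::aes(x = vec)) +
    ggplot2::labs(
      x = NULL,
      title = "Empirical histogram vs standard normal density"
    ) +
    ggplot2::theme(plot.title = ggplot2::element_text(hjust = 0.5)) +
    ggplot2::geom_histogram(
      binwidth = width,
      # after_stat() replaces the older stat() helper
      ggplot2::aes(y = ggplot2::after_stat(density)),
      fill = "blue", colour = "red", alpha = 0.2
    ) +
    # Overlay the standard normal density
    ggplot2::stat_function(fun = stats::dnorm)
}

p <- get_histogram(rnorm(100), width = 0.4)
print(p)
```

Prefixing every ggplot2 call with ggplot2:: is a separate choice that avoids relying on library(ggplot2) inside package code.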

divide the y axis to make part with a score <25 occupies the majority in ggplot

I want to split the y axis of the attached figure so that the part with a score <25 occupies the majority of the figure, while the remainder takes up a minor upper part.
I searched for this and gather that I should set limits via scale_y_discrete(). I tried p<- p+scale_y_continuous(breaks = 1:20, labels = c(1:20,"//",40:100)) but it doesn't work yet.
I used the attached data and this is my code:
p<-ggscatter(data, x = "Year" , y = "Score" ,
color = "grey", shape = 21, size = 3, # Points color, shape and size
add.params = list(color = "blue", fill = "lightgray"), # Customize reg. line
add = "loess", #reg.line
conf.int = T,
cor.coef = F, cor.method = "pearson",
xlab = "Year" , ylab= "Score")
p<-p+ coord_cartesian(xlim = c(1980, 2020));p

Here is as close as I could get getting a fake axis break and resizing the upper area of the plot. I still think it's a bad idea and if this were my plot I'd much prefer a more straightforward axis transform.
First, we'd need a function that generates a transform that squeezes all values above some threshold:
library(ggplot2)
library(scales)
# Define new transform
my_transform <- function(threshold = 25, squeeze_factor = 10) {
  force(threshold)
  force(squeeze_factor)
  my_transform <- trans_new(
    name = "trans_squeeze",
    transform = function(x) {
      ifelse(x > threshold,
             ((x - threshold) * (1 / squeeze_factor)) + threshold,
             x)
    },
    inverse = function(x) {
      ifelse(x > threshold,
             ((x - threshold) * squeeze_factor) + threshold,
             x)
    }
  )
  return(my_transform)
}
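To see what the squeeze does to actual numbers, here is a base R sketch of the same transform and inverse pair, outside of scales. The function names squeeze and unsqueeze and the example scores are mine, chosen for illustration only.

```r
threshold <- 25
squeeze_factor <- 10

# Values above the threshold are compressed by squeeze_factor; others are unchanged
squeeze <- function(x) {
  ifelse(x > threshold, (x - threshold) / squeeze_factor + threshold, x)
}
# Inverse: stretch the compressed region back out
unsqueeze <- function(x) {
  ifelse(x > threshold, (x - threshold) * squeeze_factor + threshold, x)
}

scores <- c(10, 25, 35, 125)
squeezed <- squeeze(scores)
print(squeezed)             # 10 25 26 35
print(unsqueeze(squeezed))  # 10 25 35 125
```

With these settings, 100 units of score above the threshold take up the same axis length as 10 units below it, which is why the region under 25 dominates the plot.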
Next we apply that transformation to the y-axis and add a fake axis break. I've used vanilla ggplot2 code as I find the ggscatter() approach confusing.
ggplot(data, aes(Year, Score)) +
  geom_point(color = "grey", shape = 21, size = 3) +
  geom_smooth(method = "loess", fill = "lightgray") +
  # Add fake axis lines
  annotate("segment", x = -Inf, xend = -Inf,
           y = c(-Inf, Inf), yend = c(24.5, 25.5)) +
  # Apply transform to y-axis
  scale_y_continuous(trans = my_transform(25, 10),
                     breaks = seq(0, 80, by = 10)) +
  scale_x_continuous(limits = c(1980, 2020), oob = oob_keep) +
  theme_classic() +
  # Turn real y-axis line off
  theme(axis.line.y = element_blank())
You might find it informative to read Hadley Wickham's view on discontinuous axes. People sometimes mock weird y-axes.

position_dodge and nudge_y together

I am trying to add labels (letters) above a barplot using ggplot2 function geom_text. My bars are separated using position=position_dodge() and so I need to apply the same for the new labels. However I would like to use also nudge_y to separate the labels from the bar. If I try to use both together R complains that I can use only one of either options. I'd like to do something like this:
Tukey.labels <- geom_text(data=stats,
                          aes(x=factor(Treatment2), y=x.mean,
                              label=Tukey.dif),
                          size=4, nudge_y=3, # move letters in Y
                          position=position_dodge(0.5)) # move letters in X
To create something like this image. Does anybody know a way to shift all my labels the same distance in Y while using position_dodge at the same time? I could not find an answer to this in other posts.

Hard to troubleshoot without a reproducible example. Hopefully this helps:
# rownames_to_column() comes from tibble, so it has to be loaded as well
library(dplyr); library(tibble); library(ggplot2)
ggplot(mtcars %>% rownames_to_column("car"),
       aes(as.factor(cyl), mpg, group = car)) +
  geom_col(position = position_dodge(0.9)) +
  geom_errorbar(aes(ymin = mpg - wt,
                    ymax = mpg + wt),
                position = position_dodge(0.9)) +
  geom_text(aes(label = gear, y = mpg + wt), vjust = -0.5,
            position = position_dodge(0.9))

In the spirit of the original question, one can easily combine ggplot's position_nudge and position_dodge like this:
position_nudgedodge <- function(x = 0, y = 0, width = 0.75) {
  ggproto(NULL, PositionNudgedodge,
    x = x,
    y = y,
    width = width
  )
}
PositionNudgedodge <- ggproto("PositionNudgedodge", PositionDodge,
  x = 0,
  y = 0,
  width = 0.3,
  setup_params = function(self, data) {
    l <- ggproto_parent(PositionDodge,self)$setup_params(data)
    append(l, list(x = self$x, y = self$y))
  },
  compute_layer = function(self, data, params, layout) {
    d <- ggproto_parent(PositionNudge,self)$compute_layer(data,params,layout)
    d <- ggproto_parent(PositionDodge,self)$compute_layer(d,params,layout)
    d
  }
)
Then you can use it like this:
Tukey.labels <- geom_text(data=stats,
  aes(x=factor(Treatment2), y=x.mean, label=Tukey.dif),
  size=4,
  position=position_nudgedodge(y=3,width=0.5)
)
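If a custom position object feels heavy, the first answer's idea can be reduced to a few lines: keep position_dodge() and put the vertical shift into the y aesthetic itself. This is a minimal sketch on made-up data; stats_demo, its columns, and the 0.5 offset are placeholders I chose, not the asker's data.

```r
library(ggplot2)

# Hypothetical summary data standing in for the asker's `stats`
stats_demo <- data.frame(
  Treatment = rep(c("A", "B"), each = 2),
  Group     = rep(c("g1", "g2"), times = 2),
  x.mean    = c(10, 12, 8, 15),
  Tukey.dif = c("a", "ab", "b", "c")
)

p <- ggplot(stats_demo, aes(Treatment, x.mean, fill = Group)) +
  geom_col(position = position_dodge(0.9)) +
  # Shift labels up by a constant inside aes(); dodge with the same width as the bars
  geom_text(aes(y = x.mean + 0.5, label = Tukey.dif),
            position = position_dodge(0.9))
print(p)
```

The offset is in data units, so unlike vjust it scales with the y axis; the fill mapping supplies the grouping that position_dodge() needs.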

Constrict ggplot ellips to realistic/possible values

When plotting an ellipse with ggplot, is it possible to constrain the ellipse to values that are actually possible?
For example, the following reproducible code and data plots Ele vs. Var for two species. Var is a positive variable and cannot be negative. Nonetheless, negative values are included in the resulting ellipse. Is it possible to bound the ellipse by 0 on the x-axis (using ggplot)?
More specifically, I am picturing a flat edge with the ellipses truncated at 0 on the x-axis.
library(ggplot2)
set.seed(123)
df <- data.frame(Species = rep(c("BHS", "MTG"), each = 100),
Ele = c(sample(1500:3000, 100), sample(2500:3500, 100)),
Var = abs(rnorm(200)))
ggplot(df, aes(Var, Ele, color = Species)) +
geom_point() +
stat_ellipse(aes(fill = Species), geom="polygon",level=0.95,alpha=0.2)

You could edit the default stat to clip points to a particular value. Here we change the basic stat so that x values below 0 are set to 0.
StatClipEllipse <- ggproto("StatClipEllipse", Stat,
  required_aes = c("x", "y"),
  compute_group = function(data, scales, type = "t", level = 0.95,
                           segments = 51, na.rm = FALSE) {
    xx <- ggplot2:::calculate_ellipse(data = data, vars = c("x", "y"), type = type,
                                      level = level, segments = segments)
    # Clamp negative x values to 0 (base R, so dplyr does not need to be loaded)
    xx$x <- pmax(xx$x, 0)
    xx
  }
)
Then we have to wrap it in a ggplot stat that is identical to stat_ellipse except that it uses our custom Stat object:
stat_clip_ellipse <- function(mapping = NULL, data = NULL,
                              geom = "path", position = "identity",
                              ...,
                              type = "t",
                              level = 0.95,
                              segments = 51,
                              na.rm = FALSE,
                              show.legend = NA,
                              inherit.aes = TRUE) {
  layer(
    data = data,
    mapping = mapping,
    stat = StatClipEllipse,
    geom = geom,
    position = position,
    show.legend = show.legend,
    inherit.aes = inherit.aes,
    params = list(
      type = type,
      level = level,
      segments = segments,
      na.rm = na.rm,
      ...
    )
  )
}
Then you can use it to make your plot:
ggplot(df, aes(Var, Ele, color = Species)) +
  geom_point() +
  stat_clip_ellipse(aes(fill = Species), geom="polygon",level=0.95,alpha=0.2)
This was inspired by the source code for stat_ellipse.
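The clipping step itself is just pmax() applied to the outline's x coordinates. As a small illustration that does not touch ggplot2 internals, the sketch below builds a hypothetical elliptical outline by hand (the centre and radii are invented) and clamps it the same way.

```r
# Hypothetical ellipse outline: centre (0.5, 2500), radii 1.5 and 800
theta <- seq(0, 2 * pi, length.out = 51)
ellipse <- data.frame(
  x = 0.5 + 1.5 * cos(theta),
  y = 2500 + 800 * sin(theta)
)

# Clamp x at 0; y is left untouched, so the outline gets a flat edge at x = 0
clipped <- ellipse
clipped$x <- pmax(clipped$x, 0)

print(range(ellipse$x))
print(range(clipped$x))
```

Every outline point that fell left of zero is moved onto the line x = 0, which produces the flat edge asked for in the question. Note that this changes only the drawn shape, not the underlying ellipse fit.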

Based on my comment above, I created a less-misleading option for visualization. This is ignoring the problem with y being uniformly distributed, since that's a somewhat less egregious problem than the heavily skewed x variable.
Both these options use the ggforce package, which is an extension of ggplot2, but just in case, I've also included the source for the particular function I used.
library(ggforce)
library(scales)
# power_trans <- function (n)
# {
# scales::trans_new(name = paste0("power of ", fractions(n)), transform = function(x) {
# x^n
# }, inverse = function(x) {
# x^(1/n)
# }, breaks = scales::extended_breaks(), format = scales::format_format(),
# domain = c(0, Inf))
# }
Option 1:
ggplot(df, aes(Var, Ele, color = Species)) +
  geom_point() +
  stat_ellipse(aes(fill = Species), geom="polygon",level=0.95,alpha=0.2) +
  scale_x_sqrt(limits = c(-0.1,3.5),
               breaks = c(0.0001,1:4),
               labels = 0:4,
               expand = c(0.00,0))
This option stretches the x-axis along a square-root transform, spreading out the points clustered near zero. Then it computes an ellipse over this new space.
Advantage: looks like an ellipse still.
Disadvantage: in order to get it to play nice and label the Var=0 point on the x axis, you have to use expand = c(0,0), which clips the limits exactly, and so requires a bit more fiddling with manual limits/breaks/labels, including choosing a very small value (0.0001) to be represented as 0.
Disadvantage: the x values aren't linearly distributed along the axis, which requires a bit more cognitive load when reading the figure.
Option 2:
ggplot(df, aes(sqrt(Var), Ele, color = Species)) +
  geom_point() +
  stat_ellipse() +
  coord_trans(x = ggforce::power_trans(2)) +
  scale_x_continuous(breaks = sqrt(0:4), labels = 0:4,
                     name = "Var")
This option plots the pre-transformed sqrt(Var) (notice the aes(...)). It then calculates the ellipses based on this new approximately normal value. Then it stretches out the x-axis so that the values of Var are once again linearly spaced, which distorts the ellipse in the same transformation.
Advantage: looks cool.
Advantage: values of Var are easy to interpret on the x-axis.
Advantage: you can see the density near Var=0 with the points and the wide flat end of the "egg" easily.
Advantage: the pointy end shows you how low the density is at those values.
Disadvantage: looks unfamiliar and requires explanation and additional cognitive load to interpret.

r: ggplot2 and shiny: how to make text more readable without using geom_label?

I use shiny to create some reactive plots. When I use geom_text to put the intercepts of geom_vlines next to the lines, I can hardly read the text because of the colors of the plot. I have tried with various colours, none work well.
When I use geom_label instead of geom_text from the {ggplot2} package, my plots take much longer to load. The time basically triples. I have read the article on geom_label and it says that it takes longer to create the plot.
So my question is, how could I make text more readable on the plot without using geom_label and thus slowing down the time to create the plot? Does anybody have any ideas? I know there are solutions, but which one is the ideal one in terms of the time it takes to create the plot. Thank you!
EDIT
Here is an example. I can not change the colors of the plot or text. I could change the position along the y axis of the text.
set.seed(1)
df <- data.frame(numbers = rnorm(1000, 1000, 500))
p123 <- ggplot(data = df, aes(x = numbers))+
geom_histogram(bins = 15, fill = "#000D62")+
geom_vline(xintercept = mean(df$numbers)*2.5)+
geom_text(label = paste0("value = ", round(mean(df$numbers)*2.5, 0),
"€"), x = mean(df$numbers)*2.5, y = 4,
size = 4, colour = "#FFBA18")+
labs(x = "Numbers", y = "number of observations")
plot(p123)

Option 1
One option is to replicate the geom_text() layer and put a copy of it below in white and bold to serve as a makeshift dropshadow. I don't know if that would actually improve your performance, but it does technically avoid using geom_label(). Also I've found that it can be used with plotly::ggplotly() which is not true of geom_label().
library(tidyverse)
# sim data
set.seed(1)
df <- data.frame(numbers = rnorm(1000, 1000, 500))
# base plot
p <- ggplot(data = df, aes(x = numbers)) +
  geom_histogram(bins = 15, fill = "#000D62") +
  geom_vline(xintercept = mean(df$numbers) * 2.5) +
  labs(x = "Numbers", y = "number of observations")
## with plain ggplot2 using two geom_text layers
p +
  geom_text(label = paste0("value = ", round(mean(df$numbers) * 2.5, 0), "€"),
            x = mean(df$numbers) * 2.5, y = 4, size = 4,
            colour = "white", fontface = "bold") +
  geom_text(label = paste0("value = ", round(mean(df$numbers) * 2.5, 0), "€"),
            x = mean(df$numbers) * 2.5, y = 4, size = 4,
            colour = "#FFBA18")
Option 2
Another option is to use the {shadowtext} package which directly addresses this issue.
## with shadowtext library
library(shadowtext)
p +
  geom_shadowtext(
    label = paste0("value = ", round(mean(df$numbers) * 2.5, 0), "€"),
    x = mean(df$numbers) * 2.5, y = 4, size = 4, colour = "#FFBA18")
Created on 2022-05-18 by the reprex package (v2.0.1)
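A further point that may matter for speed, offered as a hedged sketch rather than a measured result: when geom_text() inherits the plot data and receives a constant label outside aes(), it draws, as far as I understand, one copy of the label per data row (1000 here). annotate() creates a layer with its own one-row data, so the label is drawn once. Whether this accounts for the slowdown in the Shiny app has not been tested here; x_pos and p_single are names I introduced.

```r
library(ggplot2)

set.seed(1)
df <- data.frame(numbers = rnorm(1000, 1000, 500))
x_pos <- mean(df$numbers) * 2.5

p_single <- ggplot(df, aes(x = numbers)) +
  geom_histogram(bins = 15, fill = "#000D62") +
  geom_vline(xintercept = x_pos) +
  # annotate() does not inherit `df`, so only a single label is drawn
  annotate("text", x = x_pos, y = 4, size = 4, colour = "#FFBA18",
           label = paste0("value = ", round(x_pos, 0), "€")) +
  labs(x = "Numbers", y = "number of observations")
print(p_single)
```

The same substitution should work for the white "shadow" copy in Option 1, since annotate() accepts the same fixed aesthetics such as colour and fontface.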
