automagically using labels (haven semantics) in ggplot2 plots - r

I'm plotting data marked up using haven semantics, i.e. variables and values have labels defined via attributes.
Often, these labels are also what I want in my axis titles and ticks.
library(ggplot2)
mtcars$mpg = haven::labelled(mtcars$mpg, labels = c("low" = 10, "high" = 30))
attributes(mtcars$mpg)$label = "miles per gallon"
ggplot(mtcars, aes(mpg, cyl)) + geom_point() +
  scale_x_continuous(attributes(mtcars$mpg)$label,
                     breaks = attributes(mtcars$mpg)$labels,
                     labels = names(attributes(mtcars$mpg)$labels))
Could I write a helper that replaces that laborious scale_x_continuous statement with something that can more easily be iterated? E.g. something like
scale_x_continuous(label_from_attr, breaks = breaks_from_attr, labels = value_labels_from_attr). Or maybe even + add_labels_from_attributes() to replace the whole thing?
I'm aware that I can write/use helpers like Hmisc::label to slightly shorten the attribute-code above, but that's not what I want here.

I don't have a good scale, but you can use a function like this:
label_x <- function(p) {
  b <- ggplot_build(p)
  x <- b$plot$data[[b$plot$labels$x]]
  p + scale_x_continuous(
    attributes(x)$label,
    breaks = attributes(x)$labels,
    labels = names(attributes(x)$labels)
  )
}
Then use it like this (adding it with + won't work):
p <- ggplot(mtcars, aes(mpg, cyl)) + geom_point()
label_x(p)
Alternatively, use a pipe:
mtcars %>% { ggplot(., aes(mpg, cyl)) + geom_point() } %>% label_x()
Old solution
use_labelled <- function(l, axis = "x") {
  # if/else so that the matching scale is the function's return value
  if (axis == "x") {
    scale_x_continuous(attributes(l)$label,
                       breaks = attributes(l)$labels,
                       labels = names(attributes(l)$labels))
  } else if (axis == "y") {
    scale_y_continuous(attributes(l)$label,
                       breaks = attributes(l)$labels,
                       labels = names(attributes(l)$labels))
  }
}
Then you just give:
ggplot(mtcars, aes(mpg, cyl)) + geom_point() + use_labelled(mtcars$mpg)
Or for the y-axis:
ggplot(mtcars, aes(cyl, mpg)) + geom_point() + use_labelled(mtcars$mpg, "y")

Another approach is to write a wrapper for ggplot() that has its own class. Then attributes have full visibility when the corresponding print method is called. See ?ag.print from package 'yamlet' (0.2.1).
library(ggplot2)
library(yamlet)
library(magrittr)
mtcars$disp %<>% structure(label = 'displacement', unit = 'cu. in.')
mtcars$mpg %<>% structure(label = 'mileage', unit = 'miles/gallon')
mtcars$am %<>% factor(levels = c(0,1), labels = c('automatic','manual'))
mtcars$am %<>% structure(label = 'transmission')
agplot(mtcars, aes(disp, mpg, color = am)) + geom_point()

Related

Optional facets as a function parameter

How can you specify a facet parameter that is optional for facet_wrap(), without a weird additional label showing up?
For facet_wrap(), it works as expected when facets are specified. But if it's NULL, there is a weird (all) facet. Is it possible to get rid of that facet label without adding another parameter to the function?
foo_wrap <- function(x) {
  ggplot(mtcars) +
    aes(x = mpg, y = disp) +
    geom_point() +
    facet_wrap(vars({{ x }}))
}
foo_wrap(cyl) # cylinder facets
foo_wrap(NULL) # how to get rid of "(all)"?
How can you get rid of "(all)"?
Adding these examples as references in case people find this by searching
Below is an example function with optional facets for facet_grid(), where it works as expected:
foo_grid <- function(x) {
  ggplot(mtcars) +
    aes(x = mpg, y = disp) +
    geom_point() +
    facet_grid(rows = NULL, cols = vars({{ x }}))
}
foo_grid(cyl) # cylinder facets
foo_grid(NULL) # no facets, as expected
Here is an example with hard-coded row faceting. Note that you need to call vars():
foo_grid_am <- function(x) {
  ggplot(mtcars) +
    aes(x = mpg, y = disp) +
    geom_point() +
    facet_grid(rows = vars(am), cols = vars({{ x }}))
}
foo_grid_am(cyl) # automatic-manual x cylinder facets
One option would be to add a conditional facet layer where I use rlang::quo_is_null(rlang::enquo(x)) to check whether a faceting variable was provided or not:
Note: I made NULL the default.
library(ggplot2)
library(rlang)
foo_wrap <- function(x = NULL) {
  facet_layer <- if (!rlang::quo_is_null(rlang::enquo(x))) facet_wrap(vars({{ x }}))
  ggplot(mtcars) +
    aes(x = mpg, y = disp) +
    geom_point() +
    facet_layer
}
foo_wrap(cyl)
foo_wrap(NULL)

Add additional labels from a DataFrame to a facet_grid with existing label

I have a set of data to which I need to add two levels of labels: one on a single chart within the facet grid, and one from a small data frame with an entry for each chart.
In the example below you'll see that I can add to a single chart no problem but when I try to add from the df I get the error -
Error in FUN(X[[i]], ...) : object 'wt' not found
Preparation:
library(ggplot2)
p <- ggplot(mtcars, aes(mpg, wt)) + geom_line()
p <- p + facet_grid(. ~ cyl)
ann_text <- data.frame(mpg = 30, wt = 5, lab = "Text",
                       cyl = factor(8, levels = c("4", "6", "8")))
dfl <- data.frame(name = c("Jim", "Bob", "Sue"), r = c(-0.2, 0.5, -0.4))
Single Label:
p + geom_text(data = ann_text, label = "Text")
Multiple Labels:
p + geom_text(data = ann_text, label = "Text") +
  geom_text(data = dfl, mpg = 30, wt = 5, aes(label = r))
The method I'm using is trying to recreate other examples I've found here on SO and elsewhere but I seem to be missing something.
It's not working in your second code because, in the second geom_text, mpg and wt are not inside aes(). Also, these two variables are absent from your dfl.
If you wish to have better control of the labelling of your r variable, you can create extra columns in dfl specifying the x and y location of the label, and use these variables in geom_text(aes()).
Note that I have modified the y position in the second geom_text to avoid overlapping "-0.2" with "Text".
library(ggplot2)
p <- ggplot(mtcars, aes(mpg, wt)) + geom_line()
p <- p + facet_grid(. ~ cyl)
ann_text <- data.frame(mpg = 30, wt = 5, lab = "Text",
                       cyl = factor(8, levels = c("4", "6", "8")))
dfl <- data.frame(name = c("Jim", "Bob", "Sue"), r = c(-0.2, 0.5, -0.4))
p + geom_text(data = ann_text, label = "Text") +
  geom_text(data = dfl, aes(30, 4, label = r), check_overlap = TRUE)
Created on 2022-05-06 by the reprex package (v2.0.1)
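The suggestion above to give dfl its own position columns can be sketched as follows. This is only an illustration under stated assumptions: the assignment of Jim, Bob and Sue to the 4-, 6- and 8-cylinder panels and the column names x and y are hypothetical, since the question does not say which row belongs to which facet. Because dfl now carries the faceting variable cyl, each panel should receive only its own label.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(mpg, wt)) +
  geom_line() +
  facet_grid(. ~ cyl)

# Hypothetical mapping of rows to panels; x and y are label positions
# in data coordinates (assumed values, adjust as needed).
dfl <- data.frame(
  name = c("Jim", "Bob", "Sue"),
  r    = c(-0.2, 0.5, -0.4),
  cyl  = c(4, 6, 8),
  x    = 30,
  y    = 4
)

# x, y and label are all mapped inside aes(), so they are read from dfl
p_labelled <- p + geom_text(data = dfl, aes(x = x, y = y, label = r))
```

Printing p_labelled draws the plot; with one row per cyl value there is nothing to overlap, so check_overlap is not needed in this variant.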

Shorthand functions for multiple geoms in ggplot2

I would like to create shorthand notations or functions that combine multiple geoms for ggplot.
For example, instead of
mtcars %>%
  ggplot(aes(x = cyl, y = mpg)) +
  geom_point() +
  geom_smooth(method = "lm") +
  ggpubr::stat_cor()
I would like to be able to create a function to combine the geoms like so
lm_and_cor <- function() {
  geom_smooth(method = "lm", se = FALSE) +
    stat_cor()
}
mtcars %>%
  ggplot(aes(x = cyl, y = mpg)) +
  geom_point() +
  lm_and_cor()
I am aware that I can create a function that does all of the plotting, basically
plot_data <- function(x) {
  x %>%
    ggplot(aes(x = cyl, y = mpg)) +
    geom_point() +
    geom_smooth(method = "lm") +
    ggpubr::stat_cor()
}
which, to be fair, does what I want to some degree. However, I would instead like to combine multiple geoms in a single function, as the underlying geom (e.g. points, lines, etc.) will not always be the same. Is this doable, and is it feasible?
With ggplot2 you can use a list of elements:
lm_and_cor <- function() {
  list(
    geom_smooth(method = "lm", se = FALSE),
    ggpubr::stat_cor()
  )
}
mtcars %>%
  ggplot(aes(x = cyl, y = mpg)) +
  geom_point() +
  lm_and_cor()
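Because a list added to a plot is applied element by element, the same idea extends to helpers with arguments. The sketch below is an illustration only: lm_layers and its se and rug arguments are hypothetical names, and geom_rug() is just a stand-in for any optional layer. It relies on NULL entries in the list being skipped when the list is added to a plot.

```r
library(ggplot2)

# Hypothetical helper: returns a list of layers. A NULL entry (when
# rug = FALSE) adds nothing to the plot.
lm_layers <- function(se = FALSE, rug = FALSE) {
  list(
    geom_smooth(method = "lm", formula = y ~ x, se = se),
    if (rug) geom_rug()
  )
}

base <- ggplot(mtcars, aes(x = cyl, y = mpg)) + geom_point()

p_basic <- base + lm_layers()            # points + regression line
p_rug   <- base + lm_layers(rug = TRUE)  # points + regression line + rug
```

Keeping the base geom outside the helper means the same shorthand can follow geom_point(), geom_line(), or any other layer.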
Do you mean something like this?
You can store multiple geoms in a list object.
Edit: I misunderstood the question. This should meet the expectation.
data(iris)
library(ggplot2)
x <- list(geom_point(), geom_line())
ggplot(iris, aes(Sepal.Length, Sepal.Width)) + x
Or, if you want to make a function that plots by column, use the {{ variable }} syntax.
library(dplyr)
plotting <- function(data, x, y) {
  data %>%
    ggplot(aes({{ x }}, {{ y }})) +
    geom_point() +
    geom_smooth(method = "lm")
}
plotting(iris, Sepal.Length, Sepal.Width)

Display Greek symbols and change facet titles at the same time with ggplot

I know how to modify titles in ggplot without altering the original data. Suppose I have the following data frame and I want to change the labels. Then, I would do so in the following way
df <- data.frame(x = 1:4, y = 1:4,
                 label = c("params[1]", "params[2]", "params[3]", "params[4]"))
params_names <- list(
  'params[1]' = "beta[11]",
  'params[2]' = "beta[22]",
  'params[3]' = "beta[33]",
  'params[4]' = "beta[44]"
)
param_labeller <- function(variable, value) {
  params_names[value]
}
ggplot(df, aes(x = x, y = y)) +
  geom_point() +
  facet_grid(~label, labeller = param_labeller)
If I wanted to display the subscripts, I would just do this
ggplot(df, aes(x = x, y = y)) +
  geom_point() +
  facet_grid(~label, labeller = label_parsed)
How do I apply both operations at the same time?
I don't know exactly whether this conflicts with you not wanting to "alter" the original data, but you can add the labelling information to the factor itself:
df$label2 <- factor(df$label,
                    labels = c("beta[4]", "beta[24]", "beta[42]", "beta[43]"))
ggplot(df, aes(x = x, y = y)) +
  geom_point() +
  facet_grid(~label2, labeller = label_parsed)
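If the data really should stay untouched, one possible alternative is to combine the lookup and the parsing in the labeller itself. The sketch below assumes that ggplot2's as_labeller() is given a named character vector as lookup table and label_parsed as its default argument, so the looked-up strings are then parsed as plotmath; the beta[..] values are the ones from the question.

```r
library(ggplot2)

df <- data.frame(
  x = 1:4, y = 1:4,
  label = c("params[1]", "params[2]", "params[3]", "params[4]")
)

# Lookup table: facet values in the data -> plotmath strings
params_names <- c(
  "params[1]" = "beta[11]",
  "params[2]" = "beta[22]",
  "params[3]" = "beta[33]",
  "params[4]" = "beta[44]"
)

# as_labeller() looks each value up in params_names, then hands the
# result to label_parsed, which turns "beta[11]" into a Greek letter
# with a subscript.
p <- ggplot(df, aes(x = x, y = y)) +
  geom_point() +
  facet_grid(~label, labeller = as_labeller(params_names, default = label_parsed))
```

Here df$label is left as it is; only the strip text changes.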

creating custom annotations in only one facet of a ggplot

Is there a way to specify that an annotation_custom only applies to one facet of a ggplot?
For example, if I run the following code
library(tidyverse)
library(grid)
text_grob <- grobTree(textGrob("text", x = 0.5, y = 0.6, rot = 90,
                               gp = gpar(col = "red")))
ggplot(mtcars, aes(x = mpg, y = drat)) +
  geom_point() +
  facet_wrap(~cyl) +
  annotation_custom(text_grob)
I get this
How can I only keep the rightmost red "text" annotation and not add the "text" annotation to the first two facets? Note I can't use geom_text or annotate because I need to make use of textGrob's relative text positioning
The egg package has geom_custom():
library(ggplot2)
library(grid)
library(egg)
d <- data.frame(cyl = 6, drat = 4, mpg = 15)
d$grob <- list(textGrob("text", rot = 90, hjust = 0, gp = gpar(col = "red")))
ggplot(mtcars, aes(x = mpg, y = drat)) +
  geom_point() +
  facet_wrap(~cyl) +
  geom_custom(data = d, aes(data = grob), grob_fun = identity)
You could also do this using geom_text by calculating the relative position of the text needed. Note that here, the relative position is slightly different than the one you use above because here I define the relative position as some proportion of the dynamic range. You can choose a different value for rel to get the position you need. I find that this way makes the positioning less arbitrary.
library(tidyverse)
rel_pos <- function(.data, var, rel) {
  var <- enquo(var)
  # (max + min) * rel: a position expressed as a proportion of the data range
  .data %>%
    summarise(x = sum(max(!!var), min(!!var)) * rel) %>%
    .[1, "x"]
}
# tibble() replaces the deprecated data_frame()
my_text <- tibble(mpg = rel_pos(mtcars, mpg, 0.5),
                  drat = rel_pos(mtcars, drat, 0.6),
                  cyl = 8, lab = "text")
ggplot(mtcars, aes(x = mpg, y = drat)) +
  geom_point() +
  facet_wrap(~cyl) +
  geom_text(data = my_text, aes(label = lab), color = "red", angle = 90)
Created on 2018-08-15 by the reprex package (v0.2.0).
