Specify ggplot2 labeller options without passing label values - r

I have a facetted point plot, and the facets are based on multiple factors:
p = p + facet_wrap(~ model + run, ncol = 1, scales = 'fixed')
I'm happy with ggplot2 using the existing unique values of model and run to build the facet labels, but I'd like to have them across one line, not multiple. However,
p = p + facet_wrap(~ model + run, ncol = 1, scales = 'fixed',
labeller = label_value(multi_line = FALSE))
results in a missing argument error, because label_value() expects the argument labels first:
Error in lapply(labels, as.character) :
argument "labels" is missing, with no default
I'm not sure how to supply this given there are multiple facetting variables specified. If I leave the labeller out entirely, ggplot2 seems happy to work it out itself.

You want to pass the function without calling it (it will be called with the labels when constructing the plot), so:
p = p + facet_wrap(~ model + run, ncol = 1, scales = 'fixed',
labeller = function(labs) {label_value(labs, multi_line = FALSE)})
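A minimal sketch of two spellings of this, using the built-in mtcars data as a stand-in for the asker's data (cyl and gear play the role of model and run, and one_line is a name made up for this example). The second spelling relies on the .multi_line argument of labeller(), which should give the same one-line strips:

```r
library(ggplot2)

# Wrap label_value so that ggplot2 can call it later with the facet labels.
one_line <- function(labs) label_value(labs, multi_line = FALSE)

# mtcars stands in for the asker's data; cyl and gear replace model and run.
p <- ggplot(mtcars, aes(wt, mpg)) + geom_point()

# Spelling 1: pass the wrapper function itself, without calling it.
p1 <- p + facet_wrap(~ cyl + gear, ncol = 1, labeller = one_line)

# Spelling 2: let labeller() build the function from its .multi_line option.
p2 <- p + facet_wrap(~ cyl + gear, ncol = 1,
                     labeller = labeller(.multi_line = FALSE))
```

Printing p1 or p2 draws the plot. The wrapper can also be called by hand on a small data frame of labels to see the collapsed strings it returns.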

Related

geom_histogram() to hist() in R

I have the following code:
mapping <- aes(
x = values
, color = factor(par_a)
)
plot <- (ggplot(data=data, mapping=mapping)
+ geom_histogram(binwidth = 5, na.rm = TRUE)
+ facet_grid(par_b ~ par_c ~ par_d, scales = "free")
)
I have been asked to use hist() instead, because it offers plot=FALSE, so I now want to adapt the code:
mapping <- aes(
x = values
, color = factor(par_a)
)
plot2 <- hist(values, breaks = seq(min(values), max(values)+5, by = 5))
+ facet_grid(par_b ~ par_c ~ par_d, scales = "free")
However, I have no idea how to implement the 'color = factor(par_a)' or the whole line 'facet_grid(par_b ~ par_c ~ par_d, scales = "free")'. I guess these functions are not explicitly supported for 'hist()', but I would really appreciate it if someone could tell me what the alternatives for them would be?
Base plotting can use par(mfrow=c(num_rows,num_cols)) to build subplots. The next plot, hist, etc. calls that you make will fill the subplots in order. To colour the bars within the plots, you can build a colour variable as described here and pass it to the col parameter of hist.
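A rough sketch of that approach, assuming made-up data with the column names from the question (values, par_b, par_c). It uses one colour for all panels and leaves out par_a and par_d for brevity:

```r
# Hypothetical stand-in data; only the column names come from the question.
set.seed(1)
data <- data.frame(
  values = runif(200, 0, 50),
  par_b  = rep(c("b1", "b2"), each = 100),
  par_c  = rep(c("c1", "c2"), times = 100)
)

# Shared breaks so that all panels use the same 5-unit bins.
breaks <- seq(0, 50, by = 5)

# One vector of values per par_b/par_c combination (the "facets").
groups <- split(data$values, list(data$par_b, data$par_c))

# plot = FALSE returns the histogram objects (counts, breaks, ...) without drawing.
hists <- lapply(groups, hist, breaks = breaks, plot = FALSE)

# Draw the stored histograms into a 2 x 2 grid, then restore the old settings.
old_par <- par(mfrow = c(2, 2))
for (name in names(hists)) {
  plot(hists[[name]], main = name, col = "steelblue", xlab = "values")
}
par(old_par)
```

Computing the histograms first and plotting them afterwards keeps the counts available for other uses, which is the point of plot = FALSE.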

Plotting two stat_function()'s in a grid using ggplot

I want to output two plots in a grid using the same function but with different input for x. I am using ggplot2 with stat_function as per this post and I have combined the two plots as per this post and this post.
f01 <- function(x) {1 - abs(x)}
ggplot() +
stat_function(data = data.frame(x=c(-1, 1)), aes(x = x, color = "red"), fun = f01) +
stat_function(data = data.frame(x=c(-2, 2)), aes(x = x, color = "black"), fun = f01)
This produces a plot (image not included) and the following messages:
`mapping` is not used by stat_function()
`data` is not used by stat_function()
`mapping` is not used by stat_function()
`data` is not used by stat_function()
I don't understand why stat_function() uses neither of the arguments. I expected two graphs, one with x between -1 and 1 and the second with x between -2 and 2. Furthermore, it treats the colours as labels, and I don't understand why. I must be missing something obvious.
The issue is that according to the docs the data argument is
Ignored by stat_function(), do not use.
Hence, at least in the second call to stat_function the data is ignored.
Second, the docs state:
The function is called with a grid of evenly spaced values along the x axis, and the results are drawn (by default) with a line.
Therefore both functions are plotted over the same range of x values.
If you simply want to draw functions, this can be achieved without data and mappings like so:
library(ggplot2)
f01 <- function(x) {1 - abs(x)}
ggplot() +
stat_function(color = "black", fun = f01, xlim = c(-2, 2)) +
stat_function(color = "red", fun = f01, xlim = c(-1, 1))
To be honest, I'm not really sure what ggplot does internally here. It seems that the functions are always evaluated over the complete x range, here -2 to 2. There is also an issue on GitHub regarding a wrong error message for stat_function.
However, you can use the xlim argument for your stat_function to limit the range on which a function is drawn. Also, if you don't specify the colour argument by a variable, but by a manual label, you need to tell which colours should be used for which label with scale_colour_manual (easiest with a named vector). I also adjusted the line width to show the function better:
library(ggplot2)
f01 <- function(x) {1 - abs(x)}
cols <- c("red" = "red", "black" = "black")
ggplot() +
stat_function(data = data.frame(x=c(-1, 1)), aes(x = x, colour = "red"), fun = f01, size = 1.5, xlim = c(-1, 1)) +
stat_function(data = data.frame(x=c(-2, 2)), aes(x = x, colour = "black"), fun = f01) +
scale_colour_manual(values = cols)
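To check that xlim really restricts each curve, the computed layer data can be inspected. This is a small sketch, assuming a ggplot2 version in which stat_function() works without data. The names range_black and range_red are made up for the example:

```r
library(ggplot2)

f01 <- function(x) 1 - abs(x)

p <- ggplot() +
  stat_function(fun = f01, xlim = c(-2, 2), colour = "black") +
  stat_function(fun = f01, xlim = c(-1, 1), colour = "red")

# layer_data() returns the data computed for one layer of the plot.
range_black <- range(layer_data(p, 1)$x)  # x values used for the first layer
range_red   <- range(layer_data(p, 2)$x)  # x values used for the second layer
```

Each range should match the xlim given to the corresponding layer, even though both curves share one x axis.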

Best way to calculate number of facets in geom_hline/_vline

When I combine geom_vline() with facet_grid() like so:
DATA <- data.frame(x = 1:6,y = 1:6, f = rep(letters[1:2],3))
ggplot(DATA,aes(x = x,y = y)) +
geom_point() +
facet_grid(f~.) +
geom_vline(xintercept = 2:3,
colour =c("goldenrod3","dodgerblue3"))
I get an error message stating Error: Aesthetics must be either length 1 or the same as the data (4): colour because there are two lines in each facet and there are two facets. One way to get around this is to use rep(c("goldenrod3","dodgerblue3"),2), but this requires that every time I change the faceting variables, I also have to calculate the number of facets and replace the magic number (2) in the call to rep(), which makes re-using ggplot code so much less nimble.
Is there a way to get the number of facets directly from ggplot for use in this situation?
You could put the xintercept and colour info into a data.frame to pass to geom_vline and then use scale_color_identity.
ggplot(DATA, aes(x = x, y = y)) +
geom_point() +
facet_grid(f~.) +
geom_vline(data = data.frame(xintercept = 2:3,
colour = c("goldenrod3","dodgerblue3") ),
aes(xintercept = xintercept, color = colour) ) +
scale_color_identity()
This side-steps the issue of figuring out the number of facets, although that could be done by pulling out the number of unique values in the faceting variable with something like length(unique(DATA$f)).
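If the number of facets is still needed, one option is to read it from the built plot. This is only a sketch: it assumes that the object returned by ggplot_build() keeps one row per panel in layout$layout, which is an internal detail that may differ between ggplot2 versions. The names n_panels and n_levels are made up for the example:

```r
library(ggplot2)

DATA <- data.frame(x = 1:6, y = 1:6, f = rep(letters[1:2], 3))

p <- ggplot(DATA, aes(x = x, y = y)) +
  geom_point() +
  facet_grid(f ~ .)

# Assumption: the built plot stores one row per panel in layout$layout.
n_panels <- nrow(ggplot_build(p)$layout$layout)

# The same count derived from the data, as suggested above.
n_levels <- length(unique(DATA$f))
```

Reading the count from the built plot follows the faceting spec automatically, whereas length(unique(...)) has to be updated by hand when the faceting variables change.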

multi_line does not work with label_parsed?

I'm trying to make a graph with facet labels containing an expression and a regular value, but I can't get label_parsed to work with 'multi_line = FALSE'. Is there another way to put them on one line (besides combining the two factors into one)?
example:
library(ggplot2)
# f1 must be a factor so that its levels can be replaced below
# (since R 4.0, data.frame() no longer converts strings to factors by default)
df <- data.frame(x = 1:3, y = 1:3, f1 = factor(rep("TCRb", 3)), f2 = 1:3)
# make the label to be parsed
df$f1. <- df$f1
levels(df$f1.) <- "paste('TCR',beta)^'-/-'"
# plot with two factor labels on one line
ggplot(df, aes(x, y)) + geom_point() +
facet_wrap(~f1 + f2, labeller = labeller(.multi_line = FALSE))
# now with two lines and the parsed label
ggplot(df, aes(x, y)) + geom_point() +
facet_wrap(~f1. + f2, labeller = labeller(f1. = label_parsed, .multi_line = TRUE))
# it doesn't work on one line
ggplot(df, aes(x, y)) + geom_point() +
facet_wrap(~f1. + f2, labeller = labeller(f1. = label_parsed, .multi_line = FALSE))
If you use label_parsed for the label for the whole margin (.cols in your example) you can parse and keep everything on the same line at the same time.
ggplot(df, aes(x, y)) +
geom_point() +
facet_wrap(~f1. + f2, labeller = labeller(.cols = label_parsed, .multi_line = FALSE))
I don't see how to pass an argument directly to a labeller function like label_parsed, but another option is to make a new parsing function with multi_line set to FALSE.
label_parsed2 = function(labels) {
label_parsed(labels = labels, multi_line = FALSE)
}
ggplot(df, aes(x, y)) +
geom_point() +
facet_wrap(~f1. + f2, labeller = label_parsed2)

recreate scale_fill_brewer with scale_fill_manual

I am trying to understand the connection between scale_fill_brewer and scale_fill_manual of package ggplot2.
First, generate a ggplot with filled colors:
library(ggplot2)
p <- ggplot(data = mtcars, aes(x = mpg, y = wt,
group = cyl, fill = factor(cyl))) +
geom_area(position = 'stack')
# apply ready-made palette with scale_fill_brewer from ggplot2
p + scale_fill_brewer(palette = "Blues")
Now, replicate with scale_fill_manual
library(RColorBrewer)
p + scale_fill_manual(values = brewer.pal(3, "Blues"))
where 3 is the number of fill-colors in the data. For convenience, I have used the brewer.pal function of package RColorBrewer.
As far as I understand, the convenience of scale_fill_brewer is that it automatically computes the number of unique levels in the data (3 in this example). Here is my attempt at replicating:
p + scale_fill_manual(values = brewer.pal(length(levels(factor(mtcars$cyl))), "Blues"))
My question is: how does scale_fill_brewer compute the number of levels in the data?
I'm interested in understanding what else scale_fill_brewer might be doing under the hood. Might I run into any difficulty if I replace the more user-friendly scale_fill_brewer with a more contorted implementation of scale_fill_manual like the one above?
Perusing the source code:
scale_fill_brewer
function (..., type = "seq", palette = 1) {
discrete_scale("fill", "brewer", brewer_pal(type, palette), ...)
}
I couldn't see through this how scale_fill_brewer computes the number of unique levels in the data. Perhaps hidden in the ... ?
Edit: Where does the function scale_fill_brewer receive instructions to compute the number of levels in the data? Is it in "seq" or in ... or elsewhere?
The discrete_scale function is intricate and I'm lost. Here are its arguments:
discrete_scale <- function(aesthetics, scale_name, palette, name = NULL,
breaks = waiver(), labels = waiver(), legend = NULL, limits = NULL,
expand = waiver(), na.value = NA, drop = TRUE, guide="legend") {
Does any of this compute the number of levels?
The easiest way to trace it is to think in terms of (1) setting up the plot data structure and (2) resolving the aesthetics. It uses S3, so the branching is implicit.
The setup call sequence
[scale-brewer.R] scale_fill_brewer(type="seq", palette="Blues")
[scale-.R] discrete_scale(...) - return an object representing the scale
structure(list(
call = match.call(),
aesthetics = aesthetics,
scale_name = scale_name,
palette = palette,
range = DiscreteRange$new(), ## this is scales::DiscreteRange
...), class = c(scale_name, "discrete", "scale"))
The resolve call sequence
[plot-build.R] ggplot_build(plot) - for non-position scales, apply scales_train_df
# Train and map non-position scales
npscales <- scales$non_position_scales() ## scales is plot$scales, S4 type Scales
if (npscales$n() > 0) {
lapply(data, scales_train_df, scales = npscales)
data <- lapply(data, scales_map_df, scales = npscales)
}
[scales-.r] scales_train_df(...) - iterate again over scales$scales (list)
[scale-.r] scale_train_df(...) - iterate again
[scale-.r] scale_train(...) - S3 generic function
[scale-.r] scale_train.discrete(...) - almost there...
scale$range$train(x, drop = scale$drop)
but scale$range is a DiscreteRange instance, so it calls (scales::DiscreteRange$new())$train, which overwrites scale$range!
range <<- train_discrete(x, range, drop)
scales:::train_discrete(...) - again, almost there...
scales:::discrete_range(...) - still not there..
scales:::clevels(...) - there it is!
As of this point, scale$range has been overwritten by the levels of the factor. Unwinding the call stack to #1, we now call scales_map_df
[plot-build.R] ggplot_build(plot) - for non-position scales, now apply scales_map_df
# Train and map non-position scales
npscales <- scales$non_position_scales() ## scales is plot$scales, S4 type Scales
if (npscales$n() > 0) {
lapply(data, scales_train_df, scales = npscales)
data <- lapply(data, scales_map_df, scales = npscales)
}
[scales-.r] scales_map_df(...) - iterate
[scale-.r] scale_map_df(...) - iterate
[scale-.r] scale_map.discrete - fill up the palette (non-position scale!)
scale_map.discrete <- function(scale, x, limits = scale_limits(scale)) {
n <- sum(!is.na(limits))
pal <- scale$palette(n)
...
}
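The last step can be reproduced by hand. This is a sketch, assuming the scales and RColorBrewer packages are installed. Here limits is a hand-built stand-in for the trained limits of the fill scale, not something read from ggplot2, and the variable names are made up for the example:

```r
library(scales)
library(RColorBrewer)

# brewer_pal() returns a function of n, not a vector of colours.
pal <- brewer_pal(type = "seq", palette = "Blues")

# Stand-in for the trained limits of the fill scale: the levels of factor(cyl).
limits <- levels(factor(mtcars$cyl))

# Same counting rule as in scale_map.discrete above.
n <- sum(!is.na(limits))

# The palette function is only now asked for n colours.
colours_from_scale <- pal(n)

# What the manual version in the question computes.
colours_manual <- brewer.pal(n, "Blues")
```

So the number of levels is never passed to scale_fill_brewer itself: the scale stores a palette function, and that function is called with n only after the scale has been trained on the data.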
