(x-posted to community.rstudio.com)
I'm wondering if it's possible to change the axis text in ggplot2 programmatically, or if there is some native way to do this in ggplot2. In this reprex, I want to bold the axis text for each value of y whose x has an absolute value over 1.5. I can add it manually via theme(), and that works fine:
library(ggplot2)
library(dplyr)
library(forcats)
set.seed(2939)
df <- data.frame(x = rnorm(15), y = paste0("y", 1:15), group = rep(1:3, 5))
df <- mutate(df, big_number = abs(x) > 1.5, face = ifelse(big_number, "bold",
"plain"))
p <- ggplot(df, aes(x = x, y = fct_inorder(y), col = big_number)) + geom_point() +
theme(axis.text.y = element_text(face = df$face))
p
Plot 1 with no facets
But if I facet it by group, y gets reordered and ggplot2 has no idea how face is connected to df and thus y, so it just bolds in the same order as the first plot.
p + facet_grid(group ~ .)
Plot 2 with facets
And it's worse if I use a different scale for each.
p + facet_grid(group ~ ., scales = "free")
Plot 3 with facets and different scales
What do you think? Is there a general way to handle this that would work consistently here?
Idea: Don't change theme, change y-axis labels. Create a call for every y with if/else condition and parse it with parse.
Not the most elegant solution (using for loop), but works (need loop as bquote doesn't work with ifelse). I always get confused when trying to work with multiple expressions (more on that here).
Code:
# Create data
library(tidyverse)
set.seed(2939)
df <- data.frame(x = rnorm(15), y = paste0("y", 1:15), group = rep(1:3, 5)) %>%
mutate(yF = fct_inorder(y),
big_number = abs(x) > 1.5)
# Expressions for y-axis
# ifelse doesn't work
# ifelse(df$big_number, bquote(bold(1)), bquote(plain(2)))
yExp <- c() # Ignore terrible way of concatenating
for(i in 1:nrow(df)) {
if (df$big_number[i]) {
yExp <- c(yExp, bquote(bold(.(as.character(df$yF[i])))))
} else {
yExp <- c(yExp, bquote(plain(.(as.character(df$yF[i])))))
}
}
# Plot with facets
ggplot(df, aes(x, yF, col = big_number)) +
geom_point() +
scale_y_discrete(breaks = levels(df$yF),
labels = parse(text = yExp)) +
facet_grid(group ~ ., scales = "free")
Result:
Inspired by @PoGibas, I also used a function in scale_y_discrete(), which works, too.
bold_labels <- function(breaks) {
big_nums <- filter(df, y %in% breaks) %>%
pull(big_number)
labels <- purrr::map2(
breaks, big_nums,
~ if (.y) bquote(bold(.(.x))) else bquote(plain(.(.x)))
)
parse(text = labels)
}
ggplot(df, aes(x, fct_inorder(y), col = big_number)) +
geom_point() +
scale_y_discrete(labels = bold_labels) +
facet_grid(group ~ ., scales = "free")
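The for loop in the first answer can probably be avoided by building the plotmath labels as plain strings with a vectorised ifelse() and parsing them only at the end. A minimal sketch, assuming the same simulated df as above; label_text and bold_big are names made up for this illustration:

```r
library(ggplot2)

set.seed(2939)
df <- data.frame(x = rnorm(15), y = paste0("y", 1:15), group = rep(1:3, 5))
df$big_number <- abs(df$x) > 1.5
y_chr <- df$y                                  # plain character labels
df$y <- factor(df$y, levels = y_chr)           # keep y1..y15 in data order

# Named character vector of plotmath strings, e.g. c(y1 = "plain('y1')", ...)
label_text <- setNames(
  ifelse(df$big_number,
         sprintf("bold('%s')", y_chr),
         sprintf("plain('%s')", y_chr)),
  y_chr
)

# Labelling function: look labels up by break value, then parse to expressions
bold_big <- function(breaks) parse(text = label_text[breaks])

p <- ggplot(df, aes(x, y, colour = big_number)) +
  geom_point() +
  scale_y_discrete(labels = bold_big) +
  facet_grid(group ~ ., scales = "free")
```

Because the lookup is by name rather than by position, each facet should get the right face even when the panels show different subsets of y.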
I'm trying to create a utility function that combines several geom_, like in this example (which doesn't work):
my_geom_y <- function(yy, colour){
geom_line(aes(y=yy), col=colour) + geom_point(aes(y=yy), col=colour)
}
so that then I can do this:
myX <- 0:90
ggplot(mapping = aes(x=myX)) + my_geom_y(dlnorm(myX), "red") + my_geom_y(dexp(myX), "blue")
Is that possible?
I tried using get(), eval(), substitute(), and as.name(), to no avail.
The related posts I found (passing parameters to ggplot; Use of ggplot() within another function in R) didn't help.
I like MSM's approach, but if you want to be able to add my_geom_y to a ggplot you've already made, this is an alternative that might suit what you're after:
library(ggplot2)
x <- 1:100
my_geom_y <- function(yy, colour = "black"){
list(
geom_line(mapping = aes(y= yy),
col = colour,
data = data.frame(x, yy)),
geom_point(mapping = aes(y = yy),
col = colour,
data = data.frame(x, yy))
)
}
ggplot(mapping = aes(x)) +
my_geom_y(x, "red") +
my_geom_y(dlnorm(x), "blue") +
my_geom_y((x^1.1), "black") +
my_geom_y(x/2, "yellow")
I don't have enough reputation to comment, so here is a suggestion:
my_geom_y <- function(xx, yy, colour){
ggplot() +
geom_line(aes(x=xx, y=yy), col=colour) +
geom_point(aes(x=xx, y=yy), col=colour)
}
This will create one plot. To create multiple ones, you need to pass your inputs to the function as a list and loop through it inside the function for each geom (since we can't add two or more ggplot objects) - if that makes sense.
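The list-and-loop idea described above could look roughly like this. It is only a sketch: plot_curves, curves, and colours are hypothetical names, and the function assumes every y-vector has the same length as xx.

```r
library(ggplot2)

# Hypothetical helper: `curves` is a list of y-vectors, `colours` a matching vector
plot_curves <- function(xx, curves, colours) {
  p <- ggplot()
  for (i in seq_along(curves)) {
    # Each curve gets its own small data frame, so layers do not share state
    d <- data.frame(x = xx, y = curves[[i]])
    p <- p +
      geom_line(aes(x, y), data = d, colour = colours[i]) +
      geom_point(aes(x, y), data = d, colour = colours[i])
  }
  p
}

myX <- 1:10
p <- plot_curves(myX,
                 list(lnorm = dlnorm(myX), exp = dexp(myX)),
                 c("red", "blue"))
```

Each curve contributes one line layer and one point layer to a single ggplot object, so the result can be printed or extended with further layers as usual.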
Based on @luke-c's idea, this makes the function standalone and cut-n-paste ready. We can now also add a label to each curve.
my_geom_y <- function(.xx, .yy, yLabel = 1, .colour=NA ){
if (is.na(.colour))
.colour <- palette()[(yLabel - 1) %% length(palette()) + 1] # wrap around without ever hitting index 0
list( geom_line(mapping=aes(.xx,.yy), col=.colour, data=data.frame(.xx, .yy)),
geom_point(mapping=aes(.xx,.yy), col=.colour, data=data.frame(.xx, .yy)),
annotate(geom="text" , col = .colour, label=deparse(substitute(.yy)),
x=mean(.xx),y=max(.yy)-(max(.yy)-min(.yy))/20*yLabel)
)
}
myX <- 1:10
ggplot() + my_geom_y(myX, dlnorm(myX), 1) +
my_geom_y(myX, dexp(myX), 2) + my_geom_y(myX, dexp(myX,0.7), 3)
This function becomes handy when you need to visually compare multiple distributions.
I'm an R/ggplot newbie switching over from MatLab.
I would like to write a function that uses ggplot to print a linear regression equation on the graph (as discussed in Adding Regression Line Equation and R2 on graph). However, my attempt to wrap this in a function wasn't successful.
I kept getting an error -
"Error in eval(expr, envir, enclos) : object 'label' not found".
One workaround is to define "label" variable outside of the function but I just don't understand why this doesn't work.
Can anyone explain why?
df <- data.frame(x = c(1:100))
df$y <- 2 + 3 * df$x + rnorm(100, sd = 40)
f <- function(DS, x, y, z) {
label <- z
print(label)
ggplot(DS, aes(x=x, y=y)) +
geom_point() +
labs(y=y) +
labs(title=y) +
xlim(0,5)+
ylim(0,5)+
geom_smooth(method="lm", se=FALSE)+
geom_text (aes(x=1, y=4, label=label))
}
f(df, x, y, "aaa") #execution line
See the following code:
library(ggplot2)
df <- data.frame(x = c(1:100))
df$y <- 2 + 3 * df$x + rnorm(100, sd = 40)
f <- function(DS, x, y, z) {
label.df = data.frame(x=1, y=4, label=z)
ggplot(DS, aes_string(x=x, y=y)) +
geom_point() +
labs(y=y) +
labs(title=y) +
geom_smooth(method="lm", se=FALSE)+
geom_text (aes(x=x, y=y, label=label), label.df)
}
f(df, "x", "y", "aaa")
I made a few fixes to your code:
geom_text() uses the same data you defined in ggplot() unless you override it. Here I have created a temporary data.frame for this purpose called label.df.
The xlim() and ylim() functions were filtering out most of your data points, since the ranges of x and y are much larger than the limits you defined in the original code.
When you want to pass the columns of your data.frame to a plotting function, it is easier to pass their names as strings (i.e. "x"). To support this, aes() is replaced by aes_string().
Here is the result:
Edit
Thanks to @Gregor, a simpler version would be:
f <- function(DS, x, y, z) {
ggplot(DS, aes_string(x=x, y=y)) +
geom_point() +
labs(y=y) +
labs(title=y) +
geom_smooth(method="lm", se=FALSE)+
annotate(geom="text", x=1, y=4, label=z)
}
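Note that aes_string() has been soft-deprecated since ggplot2 3.0.0. In current versions, the same string-based interface can be written with the .data pronoun inside aes(). A minimal sketch, assuming the same simulated df and the same function signature as above (set.seed() is added here only to make the example reproducible):

```r
library(ggplot2)

set.seed(1)
df <- data.frame(x = 1:100)
df$y <- 2 + 3 * df$x + rnorm(100, sd = 40)

f <- function(DS, x, y, z) {
  # x and y are column names given as strings; .data[[...]] looks them up in DS
  ggplot(DS, aes(x = .data[[x]], y = .data[[y]])) +
    geom_point() +
    labs(x = x, y = y, title = y) +
    geom_smooth(method = "lm", se = FALSE) +
    annotate(geom = "text", x = 1, y = 4, label = z)
}

p <- f(df, "x", "y", "aaa")
```

The explicit labs(x = x, y = y) keeps the axis titles as the plain column names rather than the .data[[...]] expressions.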
I have used the following ggplot command:
ggplot(survey, aes(x = age)) + stat_bin(aes(n = nrow(h3), y = ..count.. / n), binwidth = 10)
+ scale_y_continuous(formatter = "percent", breaks = c(0, 0.1, 0.2))
+ facet_grid(hospital ~ .)
+ theme(panel.background = theme_blank())
to produce
I'd like to change the facet labels, however, to something shorter (like Hosp 1, Hosp 2...) because they are too long now and look cramped (increasing the height of the graph is not an option, it would take too much space in the document). I looked at the facet_grid help page but cannot figure out how.
Here is a solution that avoids editing your data:
Say your plot is faceted by the hospital column of your data frame, which has levels Hospital#1, Hospital#2, Hospital#3 and Hospital#4; then create a list named by those values:
hospital_names <- list(
'Hospital#1'="Some Hospital",
'Hospital#2'="Another Hospital",
'Hospital#3'="Hospital Number 3",
'Hospital#4'="The Other Hospital"
)
Then create a 'labeller' function, and push it into your facet_grid call:
hospital_labeller <- function(variable,value){
return(hospital_names[value])
}
ggplot(survey,aes(x=age)) + stat_bin(aes(n=nrow(h3),y=..count../n), binwidth=10)
+ facet_grid(hospital ~ ., labeller=hospital_labeller)
...
This uses the levels of the data frame to index the hospital_names list, returning the list values (the correct names).
Please note that this only works if you only have one faceting variable. If you have two facets, then your labeller function needs to return a different name vector for each facet. You can do this with something like :
plot_labeller <- function(variable,value){
if (variable=='facet1') {
return(facet1_names[value])
} else {
return(facet2_names[value])
}
}
Where facet1_names and facet2_names are pre-defined lists of names indexed by the facet index names ('Hospital#1', etc.).
Edit: The above method fails if you pass a variable/value combination that the labeller doesn't know. You can add a fail-safe for unknown variables like this:
plot_labeller <- function(variable,value){
if (variable=='facet1') {
return(facet1_names[value])
} else if (variable=='facet2') {
return(facet2_names[value])
} else {
return(as.character(value))
}
}
Answer adapted from how to change strip.text labels in ggplot with facet and margin=TRUE
Edit: WARNING: if you're using this method to facet by a character column, you may get incorrect labels. See this bug report. This has been fixed in recent versions of ggplot2.
Here's another solution that's in the spirit of the one given by @naught101, but simpler, and it does not throw a warning on the latest version of ggplot2.
Basically, you first create a named character vector
hospital_names <- c(
`Hospital#1` = "Some Hospital",
`Hospital#2` = "Another Hospital",
`Hospital#3` = "Hospital Number 3",
`Hospital#4` = "The Other Hospital"
)
And then you use it as a labeller, just by modifying the last line of the code given by @naught101 to
... + facet_grid(hospital ~ ., labeller = as_labeller(hospital_names))
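For readers without the original survey data, here is a self-contained sketch of the same approach. The survey data frame below is simulated purely for illustration, and the short labels are the ones requested in the question:

```r
library(ggplot2)

# Simulated stand-in for the asker's data (an assumption, not the real survey)
set.seed(42)
survey <- data.frame(
  hospital = rep(paste0("Hospital#", 1:4), each = 25),
  age = round(runif(100, min = 18, max = 90))
)

# Named character vector: names are the facet values, values are the new labels
hospital_names <- c(
  `Hospital#1` = "Hosp 1",
  `Hospital#2` = "Hosp 2",
  `Hospital#3` = "Hosp 3",
  `Hospital#4` = "Hosp 4"
)

p <- ggplot(survey, aes(x = age)) +
  geom_histogram(binwidth = 10) +
  facet_grid(hospital ~ ., labeller = as_labeller(hospital_names))
```

The lookup is done by name, so the order of the entries in hospital_names does not matter and the underlying data stay untouched.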
Change the underlying factor level names with something like:
# Using the Iris data
> i <- iris
> levels(i$Species)
[1] "setosa" "versicolor" "virginica"
> levels(i$Species) <- c("S", "Ve", "Vi")
> ggplot(i, aes(Petal.Length)) + stat_bin() + facet_grid(Species ~ .)
The EASIEST way to change WITHOUT modifying the underlying data is:
Create an object using as_labeller(). If the column names begin with a number or contain spaces or special characters, don't forget to use back tick marks:
# Necessary to put RH% into the facet labels
hum_names <- as_labeller(
c(`50` = "RH% 50", `60` = "RH% 60",`70` = "RH% 70",
`80` = "RH% 80",`90` = "RH% 90", `100` = "RH% 100"))
Add to the ggplot:
ggplot(dataframe, aes(x = Temperature.C, y = fit)) +
geom_line() +
facet_wrap(~Humidity.RH., nrow = 2, labeller = hum_names)
Here's how I did it with facet_grid(yfacet~xfacet) using ggplot2, version 2.2.1:
facet_grid(
yfacet~xfacet,
labeller = labeller(
yfacet = c(`0` = "an y label", `1` = "another y label"),
xfacet = c(`10` = "an x label", `20` = "another x label")
)
)
Note that this does not contain a call to as_labeller() -- something that I struggled with for a while.
This approach is inspired by the last example on the help page Coerce to labeller function.
If you have two facets hospital and room but want to rename just one, you can use:
facet_grid( hospital ~ room, labeller = labeller(hospital = as_labeller(hospital_names)))
For renaming two facets using the vector-based approach (as in naught101's answer), you can do:
facet_grid( hospital ~ room, labeller = labeller(hospital = as_labeller(hospital_names),
room = as_labeller(room_names)))
Adding another solution, similar to @domi's, that parses mathematical symbols, superscripts, subscripts, parentheses/brackets, etc.
library(tidyverse)
theme_set(theme_bw(base_size = 18))
### create separate name vectors
# run `demo(plotmath)` for more examples of mathematical annotation in R
am_names <- c(
`0` = "delta^{15}*N-NO[3]^-{}",
`1` = "sqrt(x,y)"
)
# use `scriptstyle` to reduce the size of the parentheses &
# `bgroup` to make adding `)` possible
cyl_names <- c(
`4` = 'scriptstyle(bgroup("", a, ")"))~T~-~5*"%"',
`6` = 'scriptstyle(bgroup("", b, ")"))~T~+~10~degree*C',
`8` = 'scriptstyle(bgroup("", c, ")"))~T~+~30*"%"'
)
ggplot(mtcars, aes(wt, mpg)) +
geom_jitter() +
facet_grid(am ~ cyl,
labeller = labeller(am = as_labeller(am_names, label_parsed),
cyl = as_labeller(cyl_names, label_parsed))
) +
geom_text(x = 4, y = 25, size = 4, nudge_y = 1,
parse = TRUE, check_overlap = TRUE,
label = as.character(expression(paste("Log"["10"], bgroup("(", frac("x", "y"), ")")))))
### OR create new variables then assign labels directly
# reverse facet orders just for fun
mtcars <- mtcars %>%
mutate(am2 = factor(am, labels = am_names),
cyl2 = factor(cyl, labels = rev(cyl_names), levels = rev(attr(cyl_names, "names")))
)
ggplot(mtcars, aes(wt, mpg)) +
geom_jitter() +
facet_grid(am2 ~ cyl2,
labeller = label_parsed) +
annotate("text", x = 4, y = 30, size = 5,
parse = TRUE,
label = as.character(expression(paste("speed [", m * s^{-1}, "]"))))
Created on 2019-03-30 by the reprex package (v0.2.1.9000)
Simple solution (from here):
p <- ggplot(mtcars, aes(disp, drat)) + geom_point()
# Example (old labels)
p + facet_wrap(~am)
to_string <- as_labeller(c(`0` = "Zero", `1` = "One"))
# Example (New labels)
p + facet_wrap(~am, labeller = to_string)
This solution is very close to what @domi has, but is designed to shorten the name by fetching the first 4 letters and the last number.
library(ggplot2)
# simulate some data
xy <- data.frame(hospital = rep(paste("Hospital #", 1:3, sep = ""), each = 30),
value = rnorm(90))
shortener <- function(string) {
abb <- substr(string, start = 1, stop = 4) # fetch only the first 4 characters
num <- gsub("^.*(\\d{1})$", "\\1", string) # using regular expression, fetch last number
out <- paste(abb, num) # put everything together
out
}
ggplot(xy, aes(x = value)) +
theme_bw() +
geom_histogram() +
facet_grid(hospital ~ ., labeller = labeller(hospital = shortener))
Note that this solution will not work nicely if ggplot shows fewer factor levels than your variable actually contains (which can happen if, for example, you have subset the data):
library(ggplot2)
labeli <- function(variable, value){
names_li <- list("versicolor"="versi", "virginica"="virg")
return(names_li[value])
}
dat <- subset(iris,Species!="setosa")
ggplot(dat, aes(Petal.Length)) + stat_bin() + facet_grid(Species ~ ., labeller=labeli)
A simple solution (besides adding all unused factor levels to names_li, which can be tedious) is to drop the unused levels with droplevels(), either in the original dataset or in the labeller function:
labeli2 <- function(variable, value){
value <- droplevels(value)
names_li <- list("versicolor"="versi", "virginica"="virg")
return(names_li[value])
}
dat <- subset(iris,Species!="setosa")
ggplot(dat, aes(Petal.Length)) + stat_bin() + facet_grid(Species ~ ., labeller=labeli2)
Both facet_wrap and facet_grid also accept input from ifelse as an argument. So if the variable used for faceting is logical, the solution is very simple:
facet_wrap(~ifelse(variable, "Label if true", "Label if false"))
If the variable has more categories, the ifelse statement needs to be nested.
As a side effect, this also allows the creation of the groups to be faceted within the ggplot call.
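For a variable with more than two categories, the nested ifelse() might look like the following sketch. It uses the built-in mtcars data; cyl_label is a hypothetical helper name introduced here so the mapping can be checked on its own:

```r
library(ggplot2)

# Hypothetical helper wrapping the nested ifelse()
cyl_label <- function(cyl) {
  ifelse(cyl == 4, "Four cylinders",
         ifelse(cyl == 6, "Six cylinders", "Eight cylinders"))
}

p <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point() +
  facet_wrap(~ cyl_label(cyl))
```

One thing to keep in mind is that the resulting labels are character strings, so the panels are ordered alphabetically by label rather than by the original numeric value.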
This is working for me.
Define a factor:
hospitals.factor<- factor( c("H0","H1","H2") )
and use, in ggplot():
facet_grid( hospitals.factor[hospital] ~ . )
The labeller function definition with variable, value as arguments would not work for me. Also, if you want to use expressions you need to use lapply and cannot simply use arr[val], as the argument to the function is a data.frame.
This code did work:
library(latex2exp)
library(ggplot2)
arr <- list('virginica'=TeX("x_1"), "versicolor"=TeX("x_2"), "setosa"=TeX("x_3"))
mylabel <- function(val) { return(lapply(val, function(x) arr[x])) }
ggplot(iris, aes(x=Sepal.Length, y=Sepal.Width)) + geom_line() + facet_wrap(~Species, labeller=mylabel)
Since I'm not yet allowed to comment on posts, I'm posting this separately as an addendum to Vince's answer and son520804's answer . Credit goes to them.
Son520804:
using Iris data:
I assume:
You have installed the dplyr package, which has the convenient mutate command, and
your dataset is named survey.
survey %>% mutate(Hosp1 = Hospital1, Hosp2 = Hospital2,........)
This command helps you to rename columns, yet all other columns are kept.
Then do the same facet_wrap, you are fine now.
Using the iris example of Vince and the partial code of son520804, I did this with the mutate function and achieved an easy solution without touching the original dataset.
The trick is to create a stand-in name vector and use mutate() inside the pipe to correct the facet names temporarily:
i <- iris
levels(i$Species)
[1] "setosa" "versicolor" "virginica"
new_names <- c(
rep("Bristle-pointed iris", 50),
rep("Poison flag iris",50),
rep("Virginia iris", 50))
i %>% mutate(Species=new_names) %>%
ggplot(aes(Petal.Length))+
stat_bin()+
facet_grid(Species ~ .)
In this example you can see that the levels of i$Species are temporarily changed to the corresponding common names contained in the new_names vector. The line containing
mutate(Species=new_names) %>%
can easily be removed to reveal the original naming.
Word of caution: This may easily introduce errors in names if the new_names vector is not correctly set up. It would probably be much cleaner to use a separate function to replace the variable strings. Keep in mind that the new_names vector may need to be repeated in different ways to match the order of your original dataset. Please double- and triple-check that this is done correctly.
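One way to sidestep the ordering risk is to look each value up by name instead of relying on row position. This is a sketch of that idea using the same common names as above; common_names is a name made up for this illustration:

```r
library(ggplot2)

# Map each original level to its replacement by name, not by row order
common_names <- c(
  setosa     = "Bristle-pointed iris",
  versicolor = "Poison flag iris",
  virginica  = "Virginia iris"
)

# Work on a copy; iris itself is left unchanged
i <- transform(iris, Species = unname(common_names[as.character(Species)]))

p <- ggplot(i, aes(Petal.Length)) +
  stat_bin(bins = 30) +
  facet_grid(Species ~ .)
```

With a name-based lookup the result stays correct even if the rows of the data are shuffled or filtered beforehand.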
Just extending naught101's answer -- credit goes to him
plot_labeller <- function(variable,value, facetVar1='<name-of-1st-facetting-var>', var1NamesMapping=<pass-list-of-name-mappings-here>, facetVar2='', var2NamesMapping=list() )
{
#print (variable)
#print (value)
if (variable==facetVar1)
{
value <- as.character(value)
return(var1NamesMapping[value])
}
else if (variable==facetVar2)
{
value <- as.character(value)
return(var2NamesMapping[value])
}
else
{
return(as.character(value))
}
}
What you have to do is create a list with name-to-name mapping
clusteringDistance_names <- list(
'100'="100",
'200'="200",
'300'="300",
'400'="400",
'600'="500"
)
and redefine plot_labeller() with new default arguments:
plot_labeller <- function(variable,value, facetVar1='clusteringDistance', var1NamesMapping=clusteringDistance_names, facetVar2='', var2NamesMapping=list() )
And then:
ggplot() +
facet_grid(clusteringDistance ~ . , labeller=plot_labeller)
Alternatively you can create a dedicated function for each of the label changes you want to have.
I have another way to achieve the same goal without changing the underlying data:
ggplot(transform(survey, hospital = factor(hospital,
labels = c("Hosp 1", "Hosp 2", "Hosp 3", "Hosp 4"))), aes(x = age)) +
stat_bin(aes(n = nrow(h3),y=..count../n), binwidth = 10) +
scale_y_continuous(formatter = "percent", breaks = c(0, 0.1, 0.2)) +
facet_grid(hospital ~ .) +
opts(panel.background = theme_blank())
What I did above is changing the labels of the factor in the original data frame, and that is the only difference compared with your original code.
I think all other solutions are really helpful to do this, but there is yet another way.
I assume:
you have installed the dplyr package, which has the convenient mutate command, and
your dataset is named survey.
survey %>%
mutate(Hosp1 = Hospital1, Hosp2 = Hospital2,........)
This command helps you to rename columns, yet all other columns are kept.
Then do the same facet_wrap, you are fine now.
I feel like I should add my answer to this because it took me quite long to make this work:
This answer is for you if:
you do not want to edit your original data
if you need expressions (bquote) in your labels and
if you want the flexibility of a separate labelling name-vector
I basically put the labels in a named vector so labels would not get confused or switched. The labeller expression could probably be simpler, but this at least works (improvements are very welcome). Note the ` (back quotes) to protect the facet-factor.
n <- 10
x <- seq(0, 300, length.out = n)
# I have my data in a "long" format
my_data <- data.frame(
Type = as.factor(c(rep('dl/l', n), rep('alpha', n))),
T = c(x, x),
Value = c(x*0.1, sqrt(x))
)
# the label names as a named vector
type_names <- c(
`nonsense` = "this is just here because it looks good",
`dl/l` = Linear~Expansion~~Delta*L/L[Ref]~"="~"[%]", # bquote expression
`alpha` = Linear~Expansion~Coefficient~~alpha~"="~"[1/K]"
)
ggplot() +
geom_point(data = my_data, mapping = aes(T, Value)) +
facet_wrap(. ~ Type, scales="free_y",
labeller = label_bquote(.(as.expression(
eval(parse(text = paste0('type_names', '$`', Type, '`')))
)))) +
labs(x="Temperature [K]", y="", colour = "") +
theme(legend.position = 'none')
A one-liner from mishabalyasin:
facet_grid(.~vs, labeller = purrr::partial(label_both, sep = " #"))
See it in action
library(reprex)
library(tidyverse)
mtcars %>%
ggplot(aes(x="", y=gear,fill=factor(gear), group=am)) +
geom_bar(stat="identity", width=1) +
coord_polar("y", start=0) +
facet_grid(.~vs, labeller = purrr::partial(label_both, sep = " #"))
Created on 2021-07-09 by the reprex package (v2.0.0)
Have you tried changing the specific levels of your Hospital vector?
levels(survey$hospital)[levels(survey$hospital) == "Hospital #1"] <- "Hosp 1"
levels(survey$hospital)[levels(survey$hospital) == "Hospital #2"] <- "Hosp 2"
levels(survey$hospital)[levels(survey$hospital) == "Hospital #3"] <- "Hosp 3"
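When every level follows the same pattern, the three assignments above could be collapsed into a single pattern replacement on the levels. A small sketch, using a simulated survey data frame as a stand-in for the real data:

```r
# Simulated stand-in for the asker's data (an assumption for illustration)
survey <- data.frame(
  hospital = factor(rep(paste0("Hospital #", 1:3), each = 10)),
  age = rep(c(25, 40, 55, 70, 85), times = 6)
)

# Rewrite all levels at once: "Hospital #1" becomes "Hosp 1", and so on
levels(survey$hospital) <- sub("^Hospital #", "Hosp ", levels(survey$hospital))
```

Like the version above, this changes the factor in the data frame itself, so apply it to a copy if the original labels are still needed.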
After struggling for a while, I found that we can use fct_relevel() and fct_recode() from forcats together to change the order of the facets as well as fix the facet labels. I am not sure if it's supported by design, but it works! Check out the plots below:
library(tidyverse)
before <- mpg %>%
ggplot(aes(displ, hwy)) +
geom_point() +
facet_wrap(~class)
before
after <- mpg %>%
ggplot(aes(displ, hwy)) +
geom_point() +
facet_wrap(
vars(
# Change factor level name
fct_recode(class, "motorbike" = "2seater") %>%
# Change factor level order
fct_relevel("compact")
)
)
after
Created on 2020-02-16 by the reprex package (v0.3.0)
My approach to this issue these days is to use dplyr::case_when to produce the labels within the facet_grid or facet_wrap call. This is an extension of the solution proposed by @lillemets.
# `+` must end a line rather than start one; formatter = and theme_blank()
# are replaced by their current equivalents
ggplot(survey, aes(x = age)) + stat_bin(aes(n = nrow(h3), y = ..count.. / n), binwidth = 10) +
scale_y_continuous(labels = scales::percent, breaks = c(0, 0.1, 0.2)) +
facet_grid(case_when(hospital == "Hospital #1" ~ "Hosp1",
hospital == "Hospital #2" ~ "Hosp2") ~ .) +
theme(panel.background = element_blank())
What's nice is that if you have a second facet label to change you just use the same approach on the other side of the ~ within facet_grid
I found a simple way to use the values of an existing vector as labels, without the need for c(`1`="first name",`2`="second name") etc.
First create a labeller function
title_labeller_function <- function(nuisance_parameter) {
return(vector_of_labels)
}
where vector_of_labels holds your labels, e.g. c("first label", "second label").
Then just assign this in your facet_grid/facet_wrap
facet_grid(.~ hospital,labeller=as_labeller(title_labeller_function))
For more info, see as_labeller-documentation.