linebreak and superscript in bquote axis label ggplot2 - r

After looking at many examples and a lot of trial and error, I still cannot combine text strings and an expression into a ggplot2 axis label that looks exactly the way I want.
What I am trying to get here is the x-axis label to be:
The ingredients:
parname <- 'FL.Red.Total'
xmean <- 123.34
xsigma <- 2580.23
To change the numbers to 10^n notation I use this function:
sci_form10 <- function(x) {
paste(gsub("e\\+", " \xB7 10^", scientific_format()(x)))
}
The name would then be built by:
labs( x = bquote(.(gsub('\\.', '\\ ', parname)) ~ " (a.u.) (" ~ mu ~ "=" ~ .(sci_form10(xmean)) ~ ", " ~ sigma ~ " =" ~ .(sci_form10(xsigma)) ~ ")" ))
I'm hoping to replace 10^04 with 10 followed by a superscript 4, and to add a line break to the label, as the first image shows.
The test code:
library(ggplot2)
library(scales)
sci_form10 <- function(x) {
paste(gsub("e\\+", " * 10^", scientific_format()(x)))
}
parname <- 'FL.Red.Total'
xmean <- 123.34
xsigma <- 2580.23
ggplot(mtcars, aes(x=mpg,y=cyl)) +
geom_point() +
labs( x = bquote(.(gsub('\\.', '\\ ', parname)) ~ " (a.u.) (" ~ mu ~ "=" ~ .(sci_form10(xmean)) ~ ", " ~ sigma ~ " =" ~ .(sci_form10(xsigma)) ~ ")" ))
gives:
P.S. I also tried
sci_form10 <- function(x) {
paste(gsub(".*e\\+", "10^", scientific_format()(x)))
}
which only gives the 10^03 part, to see if that would change the outcome of my label, but it did not.

One option is to wrap the label in atop() to create the line break:
sci_form10 <- function(x) {
paste(gsub("e\\+", " \u00B7 10^", scientific_format()(x)))
}
x1 <- sci_form10(xmean)
x2 <- sci_form10(xsigma)
lst1 <- strsplit(c(x1,x2), "\\s(?=10)", perl = TRUE)
pre <- sapply(lst1, `[`, 1)
post <- sapply(lst1, `[`, 2)
xmean1 <- parse(text = paste0("'", pre[1], "'"))[[1]]
xsigma1 <- parse(text = paste0("'", pre[2], "'"))[[1]]
post1 <- parse(text = post[1])[[1]]
post2 <- parse(text = post[2])[[1]]
ggplot(mtcars, aes(x=mpg,y=cyl)) +
geom_point() +
labs( x = bquote(atop(.(gsub("\\.", "\\ ",
parname))~"(a.u.)"~phantom(), "(" ~ mu~ " = "~ .(xmean1) ~ .(post1) ~ ", " ~ sigma ~ " = " ~ .(xsigma1) ~ .(post2)~ ")")))
-output

I have something that does most of what you wanted.
changeSciNot <- function(n) {
output <- format(n, digits=3, scientific = TRUE) # Transforms the number into scientific notation even if small
output <- sub("e", "*10^", output) # Replace e with 10^
output <- sub("\\+0?", "", output) # Remove + symbol and leading zeros on exponent, if > 1
output <- sub("-0?", "-", output) # Leaves - symbol but removes leading zeros on exponent, if < 1
output
}
# example data
parname <- "FL.Red.Total"
xmean <- 123.34
xsigma <- 2580.23
label <- bquote(atop(.(gsub("\\.", "\\ ", parname)) ~ "(a.u.)",
mu*"="*.(changeSciNot(xmean))*"," ~ sigma*"="*.(changeSciNot(xsigma))))
ggplot(mtcars, aes(x=mpg,y=cyl)) +
geom_point() +
labs(x = label)
The changeSciNot function came from this thread. I had some problems using \xB7 for the multiplication, so I left *. I also hard coded the number of digits for the format, but you can also make it into an argument. Hopefully, this will get you closer to the exact desired output.
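As a possible alternative to the string substitutions above, here is a minimal sketch that builds the 10^n part as a plotmath call, so that the exponent is drawn as a real superscript and the plotmath operator %.% supplies the centered dot. The helper name sci_expr and its digits argument are hypothetical and not taken from the thread.

```r
# Hypothetical helper (not from the thread): turn a number into the
# plotmath call  "mantissa" %.% 10^exponent
sci_expr <- function(x, digits = 3) {
  s <- format(x, digits = digits, scientific = TRUE)  # e.g. "1.23e+02"
  parts <- strsplit(s, "e", fixed = TRUE)[[1]]
  mantissa <- parts[1]               # kept as a string to preserve its formatting
  exponent <- as.numeric(parts[2])   # "+02" becomes 2
  bquote(.(mantissa) %.% 10^.(exponent))
}

parname <- "FL.Red.Total"
xmean <- 123.34
xsigma <- 2580.23

# atop() stacks two lines; * juxtaposes without a space, ~ adds a space
axis_label <- bquote(atop(
  .(gsub(".", " ", parname, fixed = TRUE)) ~ "(a.u.)",
  "(" * mu == .(sci_expr(xmean)) * "," ~ sigma == .(sci_expr(xsigma)) * ")"
))
```

The resulting axis_label can then be passed to labs(x = axis_label) in the same way as the label object in the answer above.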

Related

Adding Greek letters and variables to a legend in plot R

I want to print the variables z in the plot.
I have added
sprintf("%1.1f", z1)
etc in various combinations with paste (and paste0) and expression, but none of them are working.
In the dummy code below I have hardcoded the values I want to see.
x <- c(1,2,3)
y <- c(1,2,3)
plot(x,y)
z <- c(0.1,0.2,0.3)
labels = c( expression( paste( sigma," = ","0.1" )),
expression( paste( sigma," = ","0.2" )),
expression( paste( sigma," = ","0.3" ))
)
legend("topright", inset=.05, title="title",
labels, lwd=2, lty=c(1,1,1), col=colors)
Create the string and parse it.
labels <- parse(text = sprintf("sigma == %f", z))
Words can be separated with ~ symbols or combined into a single literal using quotes. * can be used for juxtaposition.
labels <- parse(text = sprintf("Case ~ (%d) ~ sigma == %f", 1:3, z))
labels <- parse(text = sprintf("Case ~ (%d) * ':' ~ sigma == %f", 1:3, z))
labels <- parse(text = sprintf("'Case (%d)' ~ sigma == %f", 1:3, z))
labels <- parse(text = sprintf("'Case (%d):' ~ sigma == %f", 1:3, z))
Try demo("plotmath") for more info.
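If the exact number formatting matters (for example one decimal place), another route is to build each label with bquote() and insert the already formatted string. This is only a sketch; the name legend_labels is made up for the illustration.

```r
z <- c(0.1, 0.2, 0.3)

# One plotmath call per value; the number is inserted as a pre-formatted
# string, so it is displayed exactly as sprintf() produced it.
legend_labels <- as.expression(lapply(z, function(v) {
  bquote(sigma == .(sprintf("%.1f", v)))
}))

# Possible use with base graphics:
# plot(1:3, 1:3)
# legend("topright", legend = legend_labels, lwd = 2, title = "title")
```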

Convert a string vector into a formula style, removing ""

I want to convert the following string vector:
variables <- c("temperature", "rain", "sun_days", "season")
into the following formula:
formula <- pred ~ treatment*(temperature + rain + sun_days + season)
The way I converted the variables vector into a formula style is the following:
predictors <- paste0(variables, collapse = "+")
However, it does not do the trick when I write the formula in the following way:
formula <- pred ~ treatment*(variables)
It doesn't work because of the "" that characterises the string vector.
Any idea?
formula <- as.formula(
paste("pred ~ treatment * (", paste(variables, collapse = "+"), ")")
)
Result:
> formula
pred ~ treatment * (temperature + rain + sun_days + season)
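A variant of the same idea, sketched with stats::reformulate(), which assembles the formula from a term string and a response name. The object names rhs and f are assumptions made for this illustration.

```r
variables <- c("temperature", "rain", "sun_days", "season")

# Build the right-hand side as one term string, then let reformulate()
# attach the response and return a formula object.
rhs <- sprintf("treatment * (%s)", paste(variables, collapse = " + "))
f <- reformulate(rhs, response = "pred")
```

Either way the result is a regular formula object, so it can be passed to lm() or similar modelling functions.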

R: Dynamically update formula

How can I dynamically update a formula?
Example:
myvar <- "x"
update(y ~ 1 + x, ~ . -x)
# y ~ 1 (works as intended)
update(y ~ 1 + x, ~ . -myvar)
# y ~ x (doesn't work as intended)
update(y ~ 1 + x, ~ . -eval(myvar))
# y ~ x (doesn't work as intended)
You can use paste() within the update() call.
myvar <- "x"
update(y ~ 1 + x, paste(" ~ . -", myvar))
# y ~ 1
Edit
As @A.Fischer noted in the comments, this won't work if myvar is a vector of length > 1:
myvar <- c("k", "l")
update(y ~ 1 + k + l + m, paste(" ~ . -", myvar))
# y ~ l + m
# Warning message:
# Using formula(x) is deprecated when x is a character vector of length > 1.
# Consider formula(paste(x, collapse = " ")) instead.
Only "k" gets removed; "l" remains in the formula.
In this case we could convert the formula into strings, add or remove the terms we want to change, and rebuild the formula using reformulate, something like:
FUN <- function(fo, x, negate=FALSE) {
foc <- as.character(fo)
s <- el(strsplit(foc[3], " + ", fixed=TRUE))
if (negate) {
reformulate(s[!s %in% x], foc[2], env=.GlobalEnv)
} else {
reformulate(c(s, x), foc[2], env=.GlobalEnv)
}
}
fo <- y ~ 1 + k + l + m
FUN(fo, c("n", "o")) ## add variables
# y ~ 1 + k + l + m + n + o
FUN(fo, c("k", "l"), negate=TRUE) ## remove variables
# y ~ 1 + m
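For the removal case there may be a shorter route: collapse the vector into a single string first, so that update() receives one formula string instead of several. This is a sketch, and the names rhs and res are made up for the illustration.

```r
myvar <- c("k", "l")

# Collapse all names into one "~ . - k - l" string before calling update()
rhs <- paste("~ . -", paste(myvar, collapse = " - "))
res <- update(y ~ 1 + k + l + m, rhs)
```

Here res should keep only m on the right-hand side, and no deprecation warning is triggered because the string has length one.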

Getting a variable to pass into function in R (ggplot2)

I'm trying to plot a graph between two columns of data from the data frame called "final". I want the p value and r^2 value to show up on the graph.
I'm using this function and code, but it gives me the error "cannot find y value"
library(ggplot2)
lm_eqn <- function(final, x, y){
m <- lm(final[,y] ~ final[,x])
output <- paste("r.squared = ", round(summary(m)$adj.r.squared, digits = 4), " | p.value = ", formatC(summary(m)$coefficients[8], format = "e", digits = 4))
return(output)
}
output_plot <- lm_eqn(final, x, y)
p1 <- ggplot(final, aes(x=ENSG00000153563, y= ENSG00000163599)) + geom_point() + geom_smooth(method=lm, se=FALSE) + labs(x = "CD8A", y = "CTLA-4") + ggtitle("CD8 v/s CTLA-4", subtitle = paste("Linear Regression of Expression |", output_plot))
How do I get both columns of data x and y to flow through the function and for the graph to plot with the p value and residual value printed on the graph?
Thanks in advance.
When you call the function to generate output_plot, you have to use the same ENS... variables as in your plot. After simplifying the function slightly, it should work now:
library(stats)
library(ggplot2)
lm_eqn <- function(x, y){
m <- lm(y ~ x)
output <- paste("r.squared = ", round(summary(m)$adj.r.squared, digits = 4), " | p.value = ", formatC(summary(m)$coefficients[8], format = "e", digits = 4))
return(output)
}
x <-c(1,2,5,2,3,6,7,0)
y <-c(2,3,5,9,8,3,3,1)
final <- data.frame(x, y) # base R; data_frame() would need an extra package
output_plot <- lm_eqn(x, y)
p1 <- ggplot(final, aes(x=x, y= y)) + geom_point() + geom_smooth(method=lm, se=FALSE) + labs(x = "x", y = "y") + ggtitle("CD8 v/s CTLA-4", subtitle = paste("Linear Regression of Expression |", output_plot))
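If you would rather keep the data frame argument and pass the column names as strings (as the original lm_eqn(final, x, y) call seemed to intend), a sketch along these lines might work. The signature lm_eqn(data, x, y) with character x and y is an assumption of this example, and the toy data simply reuses the vectors above.

```r
# Sketch: x and y are column NAMES (strings), data is the data frame
lm_eqn <- function(data, x, y) {
  m <- lm(reformulate(x, response = y), data = data)
  s <- summary(m)
  paste0("adj.r.squared = ", round(s$adj.r.squared, digits = 4),
         " | p.value = ",
         formatC(s$coefficients[2, 4], format = "e", digits = 4))  # slope p-value
}

final <- data.frame(x = c(1, 2, 5, 2, 3, 6, 7, 0),
                    y = c(2, 3, 5, 9, 8, 3, 3, 1))
output_plot <- lm_eqn(final, "x", "y")
```

With the real data the call would presumably be lm_eqn(final, "ENSG00000153563", "ENSG00000163599"), matching the columns used in aes().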

ggplot2: How to parse a character variable (e.g. x <- ".35") as character, not number, in geom_text label

I am working on a figure for publication and wish to annotate it with some beta and p values; the style guidelines of my area dictate that these numbers be formatted without leading zeros (e.g., ".003", not "0.003"). I have run into what seems like a Catch-22; I have extracted beta and p values from my models and done some preprocessing to correctly format them so that they are now characters rather than numeric:
fake.beta.vals <- c(".53", ".29", ".14")
fake.p.vals <- c(".034", ".001", ".050")
But, when I try to use these values in my figure, parse = TRUE turns them back into numeric values, losing the formatting I need.
fake.beta.vals <- c(".53", ".29", ".14")
fake.p.vals <- c(".034", ".001", ".050")
p <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width))
p <- p +
geom_smooth(method = "lm") +
geom_point() +
facet_wrap( ~ Species)
p
len <-length(levels(iris$Species))
vars <- data.frame(expand.grid(levels(iris$Species)))
colnames(vars) <- c("Species")
betalabs <- as.data.frame(fake.beta.vals)
plabs <- as.data.frame(fake.p.vals)
dat <- data.frame(
x = rep(7, len),
y = rep(4, len),
vars,
betalabs,
plabs)
dat$fake.beta.vals <- as.factor(dat$fake.beta.vals)
dat$fake.p.vals <- as.factor(dat$fake.p.vals)
p <- p +
geom_text(
aes(x = x,
y = y,
label = paste("list(beta ==",
fake.beta.vals,
", italic(p) ==",
fake.p.vals,
")"),
group = NULL),
size = 5,
data = dat,
parse = TRUE)
p
I have been banging my head against this problem for a while now but adding as.character():
label = paste("list(beta ==",
as.character(fake.beta.vals),
", italic(p) ==",
as.character(fake.p.vals),
")"),
is obviously also cancelled out by parse = TRUE.
And adding the function I had previously used to format my values:
statformat <- function(val,z){
sub("^(-?)0.", "\\1.", sprintf(paste("%.",z,"f", sep = ""), val))
}
Is even worse:
label = paste("list(beta ==",
statformat(fake.beta.vals, 2),
", italic(p) ==",
statformat(fake.p.vals, 3),
")"),
And just ends up with a mess.
Help?
Use bquote to create the labels, then coerce them to a character representation using deparse.
For example:
# create a list of labels using bquote
labs <- Map(.beta = fake.beta.vals,
.p = fake.p.vals,
f = function(.beta,.p) bquote(list(beta == .(.beta), italic(p) == .(.p))))
# coerce to a character representation for parse=TRUE to work within
# geom_text
dat <- data.frame(
x = rep(7, len),
y = rep(4, len),
vars,
labels = sapply(labs,deparse))
p <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width)) +
geom_smooth(method = "lm") +
geom_point() +
facet_wrap( ~ Species) +
geom_text(data = dat, aes(x=x,y=y,label=labels), parse=TRUE)
p
After getting back to my computer and re-reading your question, I found that I misinterpreted the question. Trying out the I function, I found that it doesn't seem to work with parse.
I found a way to get it to work: wrap fake.beta.vals and fake.p.vals in backtick (`) or single-quote (') characters inside the string that gets parsed.
p <- p +
geom_text(
aes(x = x,
y = y,
label = paste("list(beta ==",
"`", fake.beta.vals, "`",
", italic(p) ==",
"`", fake.p.vals, "`",
")",
sep=""),
group = NULL),
size = 5,
data = dat,
parse = TRUE)
That should work.
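To see why the quoting helps, here is a small sketch that compares what parse() does with a bare number and with a quoted one. The object names are made up for the illustration, and no ggplot2 is needed to run it.

```r
fake.beta.vals <- c(".53", ".29", ".14")
fake.p.vals <- c(".034", ".001", ".050")

# Unquoted: the parser reads .050 as the number 0.05, so the formatting is lost
unquoted <- parse(text = "italic(p) == .050")[[1]]

# Quoted: the value stays a string and is drawn exactly as written
label_text <- sprintf("list(beta == '%s', italic(p) == '%s')",
                      fake.beta.vals, fake.p.vals)
quoted <- parse(text = label_text[3])[[1]]
```

The character vector label_text is what would go into the label aesthetic together with parse = TRUE.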
