Alternatives to combining the paste and parse functions in ggplot2 - r

I'm trying to construct a general function to generate some graphics. So I have to create an axis label that combines an expression and a string. I've tried:
i <- 2
measure <- c(expression((kg/m^2)), "(%)")
variable <- c("BMI", "Cintilografia 1h")
data <- data.frame(x = 0, y = 0)
gp <- ggplot(data, aes(x = x, y = y)) +
geom_point() +
labs(y = parse(text = paste(variable[i], measure[i])))
It works for i = 1, but not for i = 2.
I discovered that the parse function has problems with special characters. I would prefer that users of the function not have to worry about such details, so I'm looking for a more general solution.
Does anyone have any idea?
Thanks in advance

Trying to paste and parse isn't a great idea when working with labels where you want to use ?plotmath markup. It's better to work with expressions. The bquote() function makes it somewhat easier to insert values into expressions. This should work for you:
i<-1
g1<-ggplot(data, aes(x = x, y = y)) +
geom_point() +
labs(y = bquote(.(variable[i])~.(measure[[i]])))
i<-2
g2<-ggplot(data, aes(x = x, y = y)) +
geom_point() +
labs(y = bquote(.(variable[i])~.(measure[[i]])))
gridExtra::grid.arrange(g1,g2, ncol=2)
Also note that when using bquote() on the measure object, since it's an expression, you want to extract the sub-expressions with [[ ]] rather than [ ] because the latter always wraps the result in an expression() and plotmath doesn't like nested expressions -- you want proper call objects.
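To make the [[ ]] versus [ ] distinction concrete, here is a minimal sketch in base R (no plotting involved). It reuses the measure and variable objects from the question; make_label is a hypothetical helper name introduced only for illustration.

```r
measure  <- c(expression((kg/m^2)), "(%)")
variable <- c("BMI", "Cintilografia 1h")

# Hypothetical helper: builds the label for the i-th variable with bquote()
make_label <- function(i) bquote(.(variable[i]) ~ .(measure[[i]]))

is.expression(measure[1])   # TRUE: single brackets keep the expression() wrapper
is.call(measure[[1]])       # TRUE: double brackets return the bare call
is.character(measure[[2]])  # TRUE: the second element is a plain string

lab1 <- make_label(1)  # a call object that can be passed to labs(y = ...)
lab2 <- make_label(2)  # also a call, even though "(%)" is not valid plotmath
```

Because the string "(%)" is spliced in as a value and never parsed, special characters in it should no longer matter.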

Related

What are the differences in use of labeller as function and as argument in ggplot

I have solved my particular use case issue regarding this, but I hope that by posing this question I may assist others with similar issues and provide a bookmark to come back to in the future if I need to make similar graphics. When using library(ggplot2) the capability to facet plots into multi-panel figures is paramount to many analyses, however, getting the labels right using the labeller argument and/or function can be rather finicky when the underlying data is not formatted correctly. I present to you a simple data set to demonstrate my particular use case and some variations that caused me problems, as well as the solutions that I used. I also pose some questions about how the labeller function is used within the labeller argument to facet_wrap and/or facet_grid. The starting data:
dat <- tibble(label1 = c('one','two','three','four','five','six',
'six','five','four','three','two','one'),
label2 = rep('T[alpha]'),
fit = c(25,50,75,40,60,90,40,50,70,30,90,100),
upr = c(35,60,85,50,70,100,50,60,80,40,100,110),
lwr = c(15,40,65,30,50,80,30,40,60,20,80,90),
var1 = rep(c(seq(1990,1995,1)),2))
Notice that label2 here is a character expression that can be evaluated as an expression. This appears to be the same as using base-R to plot:
plot(dat$var1, dat$fit)
mtext(expression(paste('T'[alpha])))
Essentially, as long as the expression can be evaluated within the expression function, it will work with label_parsed as an argument to facet_grid or facet_wrap. Here is what the code looks like for the actual plot I want:
p1 <- ggplot(dat, aes(x = var1, y = fit)) +
geom_point() +
geom_segment(aes(xend = var1, y = lwr, yend = upr)) +
facet_grid(label1~label2, scales = "fixed",
labeller = label_parsed)
p1
Now if I want to use label1 with a space, I can use this solution as well, but need to include a tilde ~ anywhere the space should exist so that it is still able to be evaluated as an expression.
dat1 <- dat %>%
mutate(label1 = paste(label1, '~x', sep = ''))
dat1
p2 <- ggplot(dat1, aes(x = var1, y = fit)) +
geom_point() +
geom_segment(aes(xend = var1, y = lwr, yend = upr)) +
facet_grid(label1~label2, scales = "fixed",
labeller = label_parsed)
p2
This is as far as I had to go to solve my use case. However, to better understand the functionality, I would like to ask the community to clarify another way of using labeller as a function. The question posed here:
label_parsed of facet_grid in ggplot2 mixed with spaces and expressions
eventually helped me arrive at my solution. However, if we use the code as written in that question, where the labeller argument is labeller = labeller(type=label_parsed), the result is not what is desired (we get the character expression as written instead of it being evaluated as an expression).
p3 <- ggplot(dat1, aes(x = var1, y = fit)) +
geom_point() +
geom_segment(aes(xend = var1, y = lwr, yend = upr)) +
facet_grid(label1~label2, scales = "fixed",
labeller = labeller(type=label_parsed)) # this is the change
p3
Can anyone explain when it would be appropriate to use labeller as a function within the call to labeller from facet_grid? My hope is that a solution exists that doesn't require re-formatting of the entire dataset to reflect expressions and perhaps that utility lies within the labeller function itself.
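One possible reading of the behaviour above, offered as a sketch rather than a definitive answer: the named arguments of labeller() are matched against facet variable names, so labeller(type = label_parsed) only affects a facet variable called type (as in the linked question), and dat1 has no such column. Assuming ggplot2 is installed, the following uses the actual column names, or the .default argument, instead. The data frame here is a cut-down stand-in for dat1, and by_name and by_default are hypothetical names.

```r
library(ggplot2)

# Cut-down stand-in for dat1 from the question
dat1 <- data.frame(
  label1 = paste0(c("one", "two", "three"), "~x"),
  label2 = "T[alpha]",
  var1   = 1990:1992,
  fit    = c(25, 50, 75)
)

# Name each facet variable explicitly ...
by_name <- labeller(label1 = label_parsed, label2 = label_parsed)
# ... or set one labeller for every facet variable
by_default <- labeller(.default = label_parsed)

p <- ggplot(dat1, aes(x = var1, y = fit)) +
  geom_point() +
  facet_grid(label1 ~ label2, labeller = by_name)
```

The function form is mainly useful when different facet variables need different labellers, for example labeller(label1 = label_value, label2 = label_parsed).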

ggplot2: display every nth value on discrete axis

How I can automate displaying only 1 in every n values on a discrete axis?
I can get every other value like this:
library(ggplot2)
my_breaks <- function(x, n = 2) {
return(x[c(TRUE, rep(FALSE, n - 1))])
}
ggplot(mpg, aes(x = class, y = cyl)) +
geom_point() +
scale_x_discrete(breaks = my_breaks)
But I don't think it's possible to specify the n parameter to my_breaks, is it?
Is this possible another way? I'm looking for a solution that works for both character and factor columns.
Not quite like that, but scale_x_discrete can take a function as the breaks argument, so we just need to adapt your code into a function factory (a function that returns a function) and things will work:
every_nth = function(n) {
return(function(x) {x[c(TRUE, rep(FALSE, n - 1))]})
}
ggplot(mpg, aes(x = class, y = cyl)) +
geom_point() +
scale_x_discrete(breaks = every_nth(n = 3))
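The returned closure can be checked without drawing anything. This small base R sketch repeats the every_nth definition so it runs on its own, and applies it to a character vector and to factor levels, which is roughly what a discrete scale would hand to the breaks function.

```r
every_nth <- function(n) {
  # The logical index c(TRUE, FALSE, ..., FALSE) is recycled along x
  function(x) x[c(TRUE, rep(FALSE, n - 1))]
}

every_nth(3)(letters[1:7])
# → "a" "d" "g"

every_nth(2)(levels(factor(c("x", "y", "z"))))
# → "x" "z"
```

With n = 1 the index is a single TRUE, so every value is kept.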
Since ggplot2 3.3.0 it is also possible to solve the problem of dense labels on a discrete axis by using scale_x_discrete(guide = guide_axis(n.dodge = 2)), which staggers the labels across two rows.
See the rewrite of axis code section of the release notes for more details.

R preserve symbol names across indirection

I made a plot like so...
ggplot(my_data, aes(x = ttd, y = aval)) +
theme_bw() +
geom_point(alpha = 0.25)
That gave me a nice plot with ttd and aval as my axes labels. I like how it used the names of the arguments as the default labels.
However, I have a bunch of plots like this, and I wanted to abstract it into my own function. But I can't seem to make the plot from inside the function. Here's what I tried:
bw_plot <- function(data, x_, y_) {
ggplot(data, aes(x = substitute(x_), y = substitute(y_))) +
theme_bw() +
geom_point(alpha = 0.25)
}
bw_plot(my_data, ttd, aval)
But I get this error:
Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE, : object 'x_' not found
I simply want to pass the symbols down from my bw_plot function into ggplot. How can I get it to see my actual column names?
(I also tried passing the column names in as strings and calling as.name on them, but I get the same result.)
substitute will correctly "quote" your arguments x_ and y_. However, aes will apply a second round of "quoting" internally, which is what gives you the error. You need to unquote the result of your substitute calls as you pass them to aes. This can be done using the !! operator from rlang.
library( ggplot2 )
library( rlang )
bw_plot <- function( .data, x_, y_ )
{
xsym <- ensym(x_)
ysym <- ensym(y_)
ggplot( .data, aes(x = !!xsym, y = !!ysym) ) +
theme_bw() +
geom_point(alpha=0.25)
}
Note that the correct function to use is rlang::ensym, rather than substitute, because you are aiming to capture individual symbols (column names). Also, I suggest not naming your argument data to avoid name collisions with a built-in function.
Here's example usage: bw_plot( mtcars, mpg, wt )
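As a possible alternative, and assuming a reasonably recent setup (ggplot2 3.0.0 or later with rlang 0.4.0 or later), the embrace operator {{ }} captures and unquotes an argument in one step. This is a sketch, not part of the answer above; bw_plot2 is a hypothetical name chosen to avoid clashing with bw_plot.

```r
library(ggplot2)

bw_plot2 <- function(.data, x_, y_) {
  # {{ x_ }} forwards the caller's column name straight into aes()
  ggplot(.data, aes(x = {{ x_ }}, y = {{ y_ }})) +
    theme_bw() +
    geom_point(alpha = 0.25)
}

p <- bw_plot2(mtcars, mpg, wt)
```

Unlike ensym, the embrace operator also accepts expressions such as log(mpg), which may or may not be what you want for a strict "column name only" interface.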
The accepted answer works for ggplot2 version 2.2.1.9000 and later. For versions 2.2.1 and prior, it looks like the non-standard evaluation used by aes makes this impossible. Instead, I have to use aes_ which provides an "escape hatch" allowing me to provide the symbols to the function as expected. Here is the solution:
bw_plot <- function(data, x_, y_) {
ggplot(data, aes_(x = substitute(x_), y = substitute(y_))) +
theme_bw() +
geom_point(alpha = 0.25)
}
bw_plot(my_data, ttd, aval)

Suppress message from geom_line with only one point

I'm iterating through multiple data sets to produce line plots for each set. How can I prevent ggplot from complaining when I use geom_line over one point?
Take, for example, the following data:
mydata = data.frame(
x = c(1, 2),
y = c(2, 2),
group = as.factor(c("foo", "foo"))
)
Creating line graph looks and works just fine because there are two points in the line:
ggplot(mydata, aes(x = x, y = y)) +
geom_point() +
geom_line(aes(group = group))
However, plotting only the first row gives the message:
geom_path: Each group consists of only one observation. Do you need to adjust the group aesthetic?
ggplot(mydata[1,], aes(x = x, y = y)) +
geom_point() +
geom_line(aes(group = group))
Some of my figures will only have one point and the messages cause hangups in the greater script that produces these figures. I know the plots still work, so my concern is avoiding the message. I'd also like to avoid using suppressWarnings() if possible in case another legitimate and unexpected issue arises.
Per an answer to this question: suppressMessages(ggplot()) fails because you have to wrap it around a print() call of the ggplot object--not the ggplot object itself. This is because the warning/message only occurs when the object is drawn.
So, to view your plot without a warning message run:
p <- ggplot(mydata[1,], aes(x = x, y = y)) +
geom_point() +
geom_line(aes(group = group))
suppressMessages(print(p))
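If silencing every message feels too broad, one option is to muffle only messages whose text matches a pattern and let everything else through. The sketch below uses base R's withCallingHandlers; muffle_matching is a hypothetical helper, and the pattern is an assumption about the message wording, which may differ between ggplot2 versions.

```r
# Hypothetical helper: evaluates expr, muffling only messages containing `pattern`
muffle_matching <- function(expr, pattern) {
  withCallingHandlers(
    expr,
    message = function(m) {
      if (grepl(pattern, conditionMessage(m), fixed = TRUE)) {
        invokeRestart("muffleMessage")
      }
    }
  )
}

# Stand-in for drawing a plot: emits the known message plus an unrelated one
noisy <- function() {
  message("geom_path: Each group consists of only one observation.")
  message("something unexpected")
  invisible("done")
}

result <- muffle_matching(noisy(), "Each group consists of only one observation")
```

With a real plot the call would look like muffle_matching(print(p), "Each group consists of only one observation"). Warnings and unrelated messages are still reported.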
I think the following if-else solution should resolve the problem:
if (nrow(mydata) > 1) {
ggplot(mydata, aes(x = x, y = y)) +
geom_point() +
geom_line(aes(group = group))
} else {
ggplot(mydata, aes(x = x, y = y)) +
geom_point()
}
On the community.RStudio.com, John Mackintosh suggests a solution which worked for me:
Freely quoting:
Rather than suppress warnings, change the plot layers slightly.
1. Facet wrap to create empty plot
2. Add geom_point for entire data frame
3. Subset the dataframe by creating a vector of groups with more than one data point, and filtering the original data for those groups. Only plot lines for this subset.
Details and example code in the followup of the link above.
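The subsetting step of that recipe could look roughly like the following. This is a sketch under assumptions, not the code from the linked post: multi_point_rows and plot_with_lines are hypothetical names, the faceting step is left out, and the column names x, y and group follow the example data above.

```r
# Keep only rows whose group has more than one observation
multi_point_rows <- function(df, group_col = "group") {
  counts <- table(df[[group_col]])
  keep <- names(counts)[counts > 1]
  df[df[[group_col]] %in% keep, , drop = FALSE]
}

# Points for all rows, lines only for groups with at least two points
plot_with_lines <- function(df) {
  ggplot2::ggplot(df, ggplot2::aes(x = x, y = y)) +
    ggplot2::geom_point() +
    ggplot2::geom_line(ggplot2::aes(group = group),
                       data = multi_point_rows(df))
}

mydata <- data.frame(
  x = c(1, 2, 3),
  y = c(2, 2, 5),
  group = factor(c("foo", "foo", "bar"))
)
lines_only <- multi_point_rows(mydata)  # only the two "foo" rows remain
```

Because the single-point group never reaches geom_line, there should be nothing for it to complain about.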

Writing ggplot functions in R with optional arguments

I have a series of ggplot graphs that I'm repeating with a few small variations. I would like to wrap these qplots with their options into a function to avoid a lot of repetition in the code.
My problem is that for some of the graphs I am using the + facet_wrap() option, but for others I am not. I.e. I need the facet wrap to be an optional argument. When it is included the code needs to call the +facet_wrap() with the variable supplied in the facets argument.
So ideally my function would look like this, with facets being an optional argument:
qhist(variable, df, heading, facets)
I have tried googling how to add optional arguments, and the suggestions are either to pass a default value or to use an if statement with the missing() function. I haven't been able to get either to work.
Here is the function that I have written, with the desired functionality of the optional facets argument included too.
qhist <- function(variable, df, heading, facets) {
qplot(variable, data = df, geom = "histogram", binwidth = 2000,
xlab = "Salary", ylab = "Noms") +
theme_bw() +
scale_x_continuous(limits=c(40000,250000),
breaks=c(50000,100000,150000,200000,250000),
labels=c("50k","100k","150k","200k","250k")) +
opts(title = heading, plot.title = theme_text(face = "bold",
size = 14), strip.text.x = theme_text(size = 10, face = 'bold'))
# If facets argument supplied add the following, else do not add this code
+ facet_wrap(~ facets)
The way to set up a default is like this:
testFunction <- function( requiredParam, optionalParam=TRUE, alsoOptional=123 ) {
print(requiredParam)
if (optionalParam==TRUE) print("you kept the default for optionalParam")
paste("for alsoOptional you entered", alsoOptional)
}
*EDIT*
Oh, ok... so I think I have a better idea of what you are asking. It looks like you're not sure how to bring the optional facet into the ggplot object. How about this:
qhist <- function(variable, df, heading, facets = NULL) {
d <- qplot(variable, data = df, geom = "histogram", binwidth = 2000,
xlab = "Salary", ylab = "Noms") +
theme_bw() +
scale_x_continuous(limits = c(40000, 250000),
breaks = c(50000, 100000, 150000, 200000, 250000),
labels = c("50k", "100k", "150k", "200k", "250k")) +
# ggtitle(), theme() and element_text() stand in for the defunct opts() and theme_text()
ggtitle(heading) +
theme(plot.title = element_text(face = "bold", size = 14),
strip.text.x = element_text(size = 10, face = "bold"))
# If the facets argument is supplied, add the facet layer; otherwise leave the plot as is
if (!is.null(facets)) d <- d + facet_wrap(as.formula(paste("~", facets)))
return(d)
}
I have not tested this code at all. But the general idea is that the facet_wrap expects a formula, so if the facets are passed as a character string you can build a formula with as.formula() and then add it to the plot object.
If I were doing it, I would have the function accept an optional facet formula and then pass that facet formula directly into the facet_wrap. That would negate the need for the as.formula() call to convert the text into a formula.
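Both ideas can be combined in a small helper that accepts either a formula or a column name. This is a sketch under assumptions: facet_formula and add_facets are hypothetical names, and wrapping the name in backticks is my own addition so that non-syntactic names (with spaces or commas) still parse as a formula.

```r
# Hypothetical helper: turn NULL, a formula, or a column name into a facet formula
facet_formula <- function(facets) {
  if (is.null(facets)) return(NULL)
  if (inherits(facets, "formula")) return(facets)
  as.formula(paste0("~ `", facets, "`"))
}

# Add the facet layer only when a facet specification was supplied
add_facets <- function(p, facets = NULL) {
  f <- facet_formula(facets)
  if (is.null(f)) p else p + ggplot2::facet_wrap(f)
}

facet_formula("cyl")   # a one-sided formula on cyl
facet_formula(~ gear)  # formulas pass through unchanged
facet_formula(NULL)    # NULL means "no faceting"
```

Inside a plotting function the last line would then be something like add_facets(d, facets).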
Probably, the best way is to stop using such unusual variable names including commas or spaces.
As a workaround, here is an extension of @JDLong's answer. The trick is to rename the facet variable.
f <- function(dat, facet = NULL) {
if(!missing(facet)) {
names(dat)[which(names(dat) == facet)] <- ".facet."
ff <- facet_wrap(~.facet.)
} else {
ff <- list()
}
qplot(x, y, data = dat) + ff
}
d <- data.frame(x = 1:10, y = 1:10, "o,o" = gl(2,5), check.names = FALSE)
f(d, "o,o")
f(d)
Note that you can also use missing(facets) to check if the facets argument was specified or not. If you use @JD Long's solution, it would look something like this:
qhist <- function(variable, df, heading, facets) {
# ... insert @JD Long's solution ...
if (!missing(facets)) d <- d + facet_wrap(as.formula(paste("~", facets)))
return(d)
}
...Note that I also changed the default argument from facets=NULL to just facets.
Many R functions use missing arguments like this, but in general I tend to prefer @JD Long's variant of using a default argument value (like NULL or NA) when possible. But sometimes there is no good default value...

Resources