Adding bar labels with italics and superscript to a barplot - r

I am using barplot for my data.
I need to add x-axis bar labels (sample names) that are italicized and contain superscripts. For instance, one of the sample names (bar labels) is lab(delta21). Besides putting the whole name in italics, I want the delta in (delta21) rendered as the Greek symbol and (delta21) set as a superscript of lab. (This is nothing fancy, just how biological gene mutant names are written.)
I have tried fiddling around with names.arg=expression() but could not get it to work.
Any suggestions/ideas are most welcome.

Please try this minimal example:
x <- rnorm(2)
barplot(x, names.arg = c(expression(paste(italic("1")^"st")), expression(paste(italic("2")^"nd"))))
italic() does the italic part, ^ does the superscript part.
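For the label described in the question, a minimal base-graphics sketch might look like the following. It assumes the names are lab with a superscript delta21, delta22; the second name and the bar heights are made up for illustration. Because plotmath's italic() does not apply to symbols or numeric constants, the digits are passed as a quoted string, and the delta is drawn in the symbol font regardless.

```r
# Labels: italic "lab" with a superscript made of a Greek delta and italic digits.
# "22" and the bar heights are hypothetical, added only for illustration.
bar_labels <- expression(
  italic(lab)^{delta * italic("21")},
  italic(lab)^{delta * italic("22")}
)
heights <- c(3, 5)

pdf(NULL)                       # null device, so nothing is written to disk
mids <- barplot(heights, names.arg = bar_labels)
invisible(dev.off())
```

In an interactive session the pdf(NULL) and dev.off() lines can be dropped so the plot appears on the default device.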

You may need to use ggplot2 to create your barplot because "bold, italic and bolditalic do not apply to symbols, and hence not to the Greek symbols such as mu" quoted from this help page. I am also assuming that different numbers are assigned to different samples (e.g., Lab_delta21, Lab_delta22, etc).
library(ggplot2)
library(reshape)
## make up data
data_table <- cast(mtcars, gear ~., value="mpg", mean)
data_table <- rename(data_table, c("(all)"="mean_mpg"))
lab_number <- 21:23
fancy_labels <- sapply(lab_number, function(x) paste0("italic(Lab[delta]", "[", x, "])"))
ggplot(data_table, aes(gear, mean_mpg)) + geom_bar(stat = "identity") +
scale_y_continuous(limits=c(0, 30))+
geom_text(aes(label=fancy_labels), parse=TRUE, hjust=0.5, vjust=-0.5, size=7)
The second "[]" is necessary, as in [delta][21], because without it geom_text treats [delta21] as a single word and does not render delta as a Greek letter. Note that [] is plotmath's subscript operator; ^ is the superscript operator.

Related

boldface of labels containing an expression with lower or equal symbol

I need to render in boldface the labels of the legend of a graph. One of the labels is an expression containing a "lower or equal" sign.
This is where I started from:
library(ggplot2)
df <- data.frame(x=factor(rep(0:1, 10)), y=rnorm(10), z=factor(rep(0:1, 10)))
ggplot(df, aes(x, y, shape=z)) +
geom_point() +
scale_shape_discrete(labels=c("Age > 65", expression(Age <= 65))) +
theme(legend.text=element_text(face="bold"))
In this way, the first label is bold, but the second is not. Following the suggestion here I tried to use plotmath bold():
library(ggplot2)
df <- data.frame(x=factor(rep(0:1, 10)), y=rnorm(10), z=factor(rep(0:1, 10)))
ggplot(df, aes(x, y, shape=z)) +
geom_point() +
scale_shape_discrete(labels=c("Age > 65", expression(bold(Age <= 65)))) +
theme(legend.text=element_text(face="bold"))
The label is rendered in bold only up to the "<=" sign. I have also tried to put the second part of the string within bold():
expression(bold(Age bold(<= 65)))
but to no avail. Any help is appreciated.
Tucked away in the plotmath documentation is the following:
Note that bold, italic and bolditalic do not apply to symbols, and hence not to the Greek symbols such as mu which are displayed in the symbol font. They also do not apply to numeric constants.
Instead, a suggested approach is to use unicode (assuming font and device support) which, in this case, means we can dispense with plotmath altogether.
ggplot(df, aes(x, y, shape=z)) +
geom_point() +
scale_shape_discrete(labels=c("Age > 65", "Age \U2264 65")) +
theme(legend.text=element_text(face="bold"))
While it is a little overkill for this particular issue, package ggtext makes complicated labels in ggplot2 a lot easier. It allows the use of Markdown and HTML syntax for text rendering.
Here's one way to write your legend text labels, using Markdown's ** for bolding and HTML's &le; for the symbol.
library(ggtext)
ggplot(df, aes(x, y, shape=z)) +
geom_point() +
scale_shape_discrete(labels=c("**Age > 65**", "**Age &le; 65**")) +
theme(legend.text=element_markdown())
(I'm on a Windows machine, and the default windows graphics device can have problems with adding extra spaces to symbols. Using ragg::agg_png() avoids the issue when saving plots, but also the next version of RStudio will allow you to change the graphics backend to bypass these problems.)

Modify legend and labels of stacked-area plot in R/ggplot2

EDIT: Solved by Haboryme in comments; the problem was my use of xlab and ylab instead of x and y as the names of keyword arguments to labs() (explaining the graph labels), and a redundant use of colour= in the second call to aes() (explaining the persistence of the original legend).
I'd like to make a stacked-area chart from some CSV data with R and ggplot2. For example:
In file "test.csv":
Year,Column with long name 1,Column with long name 2
2000,1,1
2001,1,1.5
2002,1.5,2
I run this code (imitating the answer to this GIS.SE question):
library(ggplot2)
library(reshape)
df <- read.csv('test.csv')
df <- melt(df, id="Year")
png(filename="test.png")
gg <- ggplot(df,aes(x=as.numeric(Year),y=value)) +
# Add a new legend
scale_fill_discrete(name="Series", labels=c("Foo bar", "Baz quux")) +
geom_area(aes(colour=variable,fill=variable)) +
# Change the axis labels and add a title
labs(title="Test",xlab="Year",ylab="Values")
print(gg)
dev.off()
The result, in file "test.png":
Problems: my attempt to change the axis labels was ignored, and my new legend (with code borrowed from the R Cookbook's suggestions) was added to, not substituted for, the (strangely recolored) default one. (Other solutions offered by the R Cookbook, such as calling guides(fill=FALSE), do more or less the same thing.) I'd rather not use the workaround of editing my dataframe (e.g. stripping the periods that read.csv() substitutes for spaces in column headers) so that the default labels turn out correct. What should I do?
ggplot(df,aes(x=as.numeric(Year),y=value)) +
scale_fill_discrete(name="Series", labels=c("Foo bar", "Baz quux")) +
geom_area(aes(fill=variable)) +
labs(title="Test",x="Year",y="Values")
The argument colour in the aes() of geom_area() only colours the contour and hence doesn't add anything to the plot here.

ggplot facet_grid label superscript

I am having trouble putting a subscript in a facet_grid label. Here is an example of what I have been trying to do.
df <- data.frame(species=gl(2,10,labels=c('sp1','sp2')),
age=sample(3:12,40,replace=T),
variable=gl(2,20,labels=c('N1P1 var','N2P1 var')),
value=rnorm(40))
test.plot <- ggplot(data=df,aes(x=age,y=value)) +
geom_point() +
facet_grid(variable~species)
Now I want to make my vertical facet labels 'N[1]P[1] var' and so on, where the numbers in square brackets denote subscripts.
I have consulted several answers on this site, but none helped. I have tried expression and bquote as suggested, but nothing worked!
You need to do two things.
First, write your labels as plotmath expressions:
variable_labels <-
c(expression(paste(N[1],P[1]~var)), expression(paste(N[2],P[1]~var)))
df <- data.frame(species=gl(2,10,labels=c('sp1','sp2')),
age=sample(3:12,40,replace=T),
variable=gl(2,20,labels=variable_labels),
value=rnorm(40))
Second, change the default labeller function in facet_grid to "label_parsed":
test.plot <- ggplot(data=df,aes(x=age,y=value)) +
geom_point() +
facet_grid(variable~species, labeller = "label_parsed")
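As a rough illustration of why this works (a sketch, not part of the original answer): gl() stores its labels as character strings, so the expressions are deparsed into text, and a parsing labeller can then turn that text back into plotmath. The round trip can be checked in base R without drawing anything:

```r
variable_labels <- c(expression(paste(N[1], P[1] ~ var)),
                     expression(paste(N[2], P[1] ~ var)))

# gl() converts the labels to character, so the levels hold deparsed text
f <- gl(2, 20, labels = variable_labels)
lv <- levels(f)

# Parsing a level yields a call that deparses to the same text as the
# original expression, which is what a parsed labeller has to work with
reparsed <- parse(text = lv[1])[[1]]
same_text <- identical(deparse(reparsed), deparse(variable_labels[[1]]))
```

If the levels are not valid R syntax (for example a bare 'N1P1 var' with a space), parsing fails, which is why the labels need to be written as plotmath first.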

Label or annotation with subscript and variable source

I have an R routine which creates a number of plots from a large set of data. Each plot is labeled with a title describing the details of the set of points plotted. Unfortunately, I have not been able to use subscripts in the text when I use paste to build a complex label, and the result is ugly. This is a simplified version of the code using data from R. The title shows the technique I am currently using, without subscripts. The attempt at an improved version is placed either on the x axis or on the plot.
library(ggplot2)
x1 = 1
x2 = 2
list <- c(1:4)
tle <- paste("VGM = ", as.character(list[1]),
"V, VDM = ", as.character(list[2]),
"V, VGF = ", as.character(list[3]),
"V, VDF = ", as.character(list[4]),
"V", sep="")
p <- ggplot(mtcars, aes(x=wt, y=mpg)) +
labs(title=tle) +
geom_point()
p
p + xlab(expression(V[DM])) #works fine
p + xlab(expression(paste(V[DM], "= 3"))) # works fine
# now we would like to use a variable to provide the number
p + xlab(expression(paste(V[DM], "=", x1))) # Just displays "x1", not value of x1
p + xlab(expression(paste(V[DM], "=",
as.character(x1)))) # NO
p + xlab(expression(paste(V[DM], "=",
as.character(as.number(x1))))) # NO
my.xlab1 <- bquote(V[DM] == .(x1))
p + xlab(my.xlab1) # We can see the success here
# A single variable at the end of the expression works
# What if you wanted to display two variables?
my.xlab2 <- bquote(V[GM] == .(x2))
my.xlab3 <- paste(my.xlab1, my.xlab2)
p + xlab(my.xlab3) # doesn't work
# Apparently the expressions cannot be pasted together. Try another approach.
# Place the two expressions separately on the plot. They no longer need to be
# pasted together. It would look better, anyway. Would it work?
p + annotate("text", x=4, y=30, label="Annotate_text", parse=TRUE)
# This is the idea
# p + annotate("text", x=4, y=30, label=bquote(V[DM] == .(x1)), parse=TRUE)
# This is a disaster
# RStudio stops the process with a question mark placed on the console. Appears that
# more input is being requested?
p + geom_text(x=4, y=30, label="Geom_text") # works
p + geom_text(x=4, y=30, label=my.xlab1) # does not accept variables.
I have included comments which describe the problems raised by each attempt. Ideally, the information should probably be placed as an annotation on the plot rather than as a title, but I cannot find a way to do this. Using a subscript turns a character into an expression, and it seems that there is a long list of functions which handle characters but not expressions.
If you want to "paste" two expressions together, you need to have some "operator" join them. There really isn't a paste method for expressions, but there are ways to put them together. First, obviously you could use one bquote() to put both variables together. Either
my.xlab3 <- bquote(V[DM] == .(x1)~ V[GM] == .(x2))
my.xlab3 <- bquote(list(V[DM] == .(x1), V[GM] == .(x2)))
would work. The first puts a space between them, the second puts a comma between them. But if you want to build them separately, you can combine them with another round of bquote. So the equivalent building method for the two above expressions is
my.xlab3 <- bquote(.(my.xlab1) ~ .(my.xlab2))
my.xlab3 <- bquote(list(.(my.xlab1), .(my.xlab2)))
All of those should work to set your xlab() value.
Now, if you also want to get annotate to work, you can "un-parse" your expression and then have R "re-parse" it for you and you should be all set. Observe
p + annotate("text", x=4, y=30, label=deparse(my.xlab3), parse=TRUE)
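The combine-then-deparse steps can be checked without ggplot2. The following is a small sketch in base R, reusing x1 and x2 from the question; the other variable names are only illustrative.

```r
x1 <- 1
x2 <- 2

# Build each label separately, then join them with a second bquote()
lab1 <- bquote(V[DM] == .(x1))
lab2 <- bquote(V[GM] == .(x2))
combined <- bquote(list(.(lab1), .(lab2)))

# "Un-parse" to a single string; deparse() can split long calls across
# several strings, so collapse them before handing the text to a parser
label_text <- paste(deparse(combined), collapse = " ")

# Re-parsing the text recovers the original call
reparsed <- parse(text = label_text)[[1]]
```

Here label_text is a plain character string, which is the form annotate(..., parse=TRUE) expects for its label. For long labels, collapsing the deparse() output as above avoids passing a multi-element vector by accident.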

How can I make a legend in ggplot2 with one point entry and one line entry?

I am making a graph in ggplot2 consisting of a set of datapoints plotted as points, with the lines predicted by a fitted model overlaid. The general idea of the graph looks something like this:
names <- c(1,1,1,2,2,2,3,3,3)
xvals <- c(1:9)
yvals <- c(1,2,3,10,11,12,15,16,17)
pvals <- c(1.1,2.1,3.1,11,12,13,14,15,16)
ex_data <- data.frame(names,xvals,yvals,pvals)
ex_data$names <- factor(ex_data$names)
graph <- ggplot(data=ex_data, aes(x=xvals, y=yvals, color=names))
print(graph + geom_point() + geom_line(aes(x=xvals, y=pvals)))
As you can see, both the lines and the points are colored by a categorical variable ('names' in this case). I would like the legend to contain 2 entries: a dot labeled 'Data', and a line labeled 'Fitted' (to denote that the dots are real data and the lines are fits). However, I cannot seem to get this to work. The (awesome) guide here is great for formatting, but doesn't deal with the actual entries, while I have tried the technique here to no avail, i.e.
print(graph + scale_colour_manual("", values=c("green", "blue", "red"))
+ scale_shape_manual("", values=c(19,NA,NA))
+ scale_linetype_manual("",values=c(0,1,1)))
The main trouble is that, in my actual data, there are >200 different categories for 'names,' while I only want the 2 entries I mentioned above in the legend. Doing this with my actual data just produces a meaningless legend that runs off the page, because the legend is trying to be a key for the colors (of which I have way too many).
I'd appreciate any help!
I think this is close to what you want:
ggplot(ex_data, aes(x=xvals, group=names)) +
geom_point(aes(y=yvals, shape='data', linetype='data')) +
geom_line(aes(y=pvals, shape='fitted', linetype='fitted')) +
scale_shape_manual('', values=c(19, NA)) +
scale_linetype_manual('', values=c(0, 1))
The idea is that you specify two aesthetics (linetype and shape) for both lines and points, even though it makes no sense, say, for a point to have a linetype aesthetic. Then you manually map these "nonsense" aesthetics to "null" values (NA and 0 in this case), using a manual scale.
This has been answered already, but based on feedback I got to another question (How can I fix this strange behavior of legend in ggplot2?) this tweak may be helpful to others and may save you headaches (sorry couldn't put as a comment to the previous answer):
ggplot(ex_data, aes(x=xvals, group=names)) +
geom_point(aes(y=yvals, shape='data', linetype='data')) +
geom_line(aes(y=pvals, shape='fitted', linetype='fitted')) +
scale_shape_manual('', values=c('data'=19, 'fitted'=NA)) +
scale_linetype_manual('', values=c('data'=0, 'fitted'=1))
