Minus as an exponent in plotmath (in ggplot2 legend) - r

I'm trying to make a legend in a ggplot2 plot that contains a minus sign as an exponent (with no other characters in the exponent). However, I can't figure out the plotmath syntax.
It seems like the following would work:
expr1 <- expression(paste("text", main[sub]^{-}))
ggplot(mpg, aes(x=cty, y=hwy, colour=drv)) + geom_point() +
scale_colour_discrete(labels=c(expr1, "b", "c"))
(And it does work if we say expr1 <- expression(paste("text", main[sub]^{super})).) Is there an escape character or something similar for minus signs in plotmath?

You are almost certainly going to need to put quotes around that minus sign, because it would otherwise be expected to be an infix operator and as such require arguments before and after it. Add a small test case if that does not solve the problem.
Escaping does not work in plotmath. In particular, you cannot use "\n" as an end-of-line/newline marker (as is documented on the help(plotmath) page).
This also succeeds:
expr1 <- expression(paste("text", main[sub]^{phantom()-phantom()}))
I had never before tried using phantom() both before and after an infix operator, but it seems acceptable to the interpreter. Plotmath expressions do get parsed and need to conform to R's parsing rules; see ?Syntax. As noted in the comments, using "-" as a prefix operator to a single phantom() also succeeds, because the minus sign can act as either a unary or a binary minus:
expr1 <- expression(paste("text", main[sub]^{-phantom()}))
We could also have used an empty character value as the item after the prefix minus: {-""}
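A minimal sketch of why the original attempt fails while the workarounds succeed, using only base R's parser (no plotting is needed to see the difference). The variable names below are illustrative and do not come from the original post.

```r
# The bare minus is rejected by the R parser before plotmath ever sees it.
bare <- try(parse(text = 'paste("text", main[sub]^{-})'), silent = TRUE)
bare_fails <- inherits(bare, "try-error")

# Each workaround is syntactically valid R, so expression() accepts it.
workarounds <- list(
  quoted       = expression(paste("text", main[sub]^{"-"})),
  both_phantom = expression(paste("text", main[sub]^{phantom()-phantom()})),
  unary        = expression(paste("text", main[sub]^{-phantom()})),
  empty_string = expression(paste("text", main[sub]^{-""}))
)
all_parse <- all(vapply(workarounds, is.expression, logical(1)))

print(bare_fails)
print(all_parse)
```

Any element of workarounds should be usable in place of expr1 in the scale_colour_discrete() call from the question.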

Related

Functions to format text for base R plotting

Specifying text in a base R plot() with formatting such as italics / bold font / newline usually involves one or more of the following functions:
paste()
expression()
atop()
substitute()
italic()
Is there an intuitive explanation for the differences between these functions and when best to apply them?
What you're referring to is the plotmath syntax.
To start off, let's make it clear that for a plotmath expression to be interpreted as such, you tell R it's an "expression" and that is why you need expression().
So any time you want to use special symbols or formatting, like italic() and atop(), it's actually a part of plotmath and so you need to wrap it in an expression. eg:
plot(0, main = expression(atop(over,italic(under))))
If you've tried out ?italic or ?atop, you've probably noticed it takes you straight to the plotmath manual page, where a bunch of other functions are listed.
What about substitute()? Well, in my previous example, you'll notice I wrote 'over' and 'under' directly, without putting them within quotes. This works because expression() does not evaluate its contents: bare words are kept as symbols and drawn by name.
So if you need to put the contents of a variable in your text (rather than the variable name), you put your expression inside substitute() and give it a list of replacements, e.g.:
plot(0, main = substitute(atop(oo, italic(under)), list(oo = 'over2')))
Note that substitute() is not wrapped around the expression() block but replaces it entirely.
Finally, where does paste() come into all this? Well, paste is the glue (pun intended) for any text not dealt with by plotmath.
So if you need text before or after math symbols (or formatted text), you paste() things together within the expression (or substitute) call, e.g.:
plot(0, main = substitute(paste("b4", atop(oo,italic(under)), aft),
list(oo='over', aft = 'after3')))
As before, if you want to paste the content of a variable, you need substitute.
And voilà, that's most of the plotmath you'll ever need!
For any other symbols or functions, have a look at ?plotmath
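The difference between expression() and substitute() described above can be checked without drawing anything. This is a small sketch; label_top is a hypothetical variable introduced only for illustration.

```r
label_top <- "over2"  # hypothetical variable holding the text to display

# expression() does not evaluate anything: label_top stays a symbol,
# so a plot title would show the literal word "label_top".
lit <- expression(atop(label_top, italic(under)))

# substitute() swaps the symbol oo for the value supplied in the list.
sub <- substitute(atop(oo, italic(under)), list(oo = label_top))

lit_has_symbol <- "label_top" %in% all.vars(lit)
sub_has_value  <- identical(sub[[2]], "over2")

print(lit_has_symbol)
print(sub_has_value)
```

Either object can be passed as main = in plot(); only the substitute() version displays the contents of the variable.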

What do backticks do in R?

I'm trying to understand what backticks do in R.
From what I can tell, this is not explained in the ?Quotes documentation page for R.
For example, at the R console:
"[["
# [1] "[["
`[[`
# .Primitive("[[")
It seems to return the equivalent of:
get("[[")
A pair of backticks is a way to refer to names or combinations of symbols that are otherwise reserved or illegal. Reserved words, like if, are part of the language, while illegal names include non-syntactic combinations like c a t. These two categories, reserved and illegal, are referred to in the R documentation as non-syntactic names.
Thus,
`c a t` <- 1 # is valid R
and
> `+` # is equivalent to typing in a syntactic function name
function (e1, e2) .Primitive("+")
As a commenter mentioned, ?Quotes does contain some information on the backtick, under Names and Identifiers:
Identifiers consist of a sequence of letters, digits, the period (.) and the underscore. They must not start with a digit nor underscore, nor with a period followed by a digit. Reserved words are not valid identifiers.
The definition of a letter depends on the current locale, but only ASCII digits are considered to be digits.
Such identifiers are also known as syntactic names and may be used directly in R code. Almost always, other names can be used provided they are quoted. The preferred quote is the backtick (`), and deparse will normally use it, but under many circumstances single or double quotes can be used (as a character constant will often be converted to a name). One place where backticks may be essential is to delimit variable names in formulae: see formula
This prose is a little hard to parse. What it means is that for R to parse a token as a name, it must be 1) a sequence of letters, digits, periods and underscores that does not start with a digit or an underscore (or with a period followed by a digit), and 2) not a reserved word in the language. Otherwise, to be parsed as a name, it must be wrapped in backticks.
Also check out ?Reserved:
Reserved words outside quotes are always parsed to be references to the objects linked to in the 'Description', and hence they are not allowed as syntactic names (see make.names). They are allowed as non-syntactic names, e.g. inside backtick quotes.
In addition, Advanced R has some examples of how backticks are used in expressions, environments, and functions.
Backticks make R take the enclosed name verbatim. For example, try this:
df <- data.frame(20a=c(1,2),b=c(3,4))
which gives an error, whereas
df <- data.frame(`20a`=c(1,2),b=c(3,4))
doesn't give an error.
Here is an incomplete answer using improper vocabulary: backticks can indicate to R that you are using a function in a non-standard way. For instance, here is a use of [[, the list subsetting function:
temp <- list("a"=1:10, "b"=rnorm(5))
extract element one, the usual way
temp[[1]]
extract element one using the [[ function
`[[`(temp,1)
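The points made in these answers can be tried out in a few lines of base R. This is a sketch for illustration; the object names are arbitrary. Note that data.frame() repairs non-syntactic column names by default, so the backticked `20a` column is accepted but renamed unless check.names = FALSE is given.

```r
# make.names() shows what R considers syntactic: invalid names get repaired.
fixed <- make.names(c("20a", "c a t", "if", "ok_name"))

# Backticks let a non-syntactic name be used as is.
`c a t` <- 1
cat_exists <- exists("c a t")

# data.frame() repairs column names by default; check.names = FALSE keeps them.
repaired <- names(data.frame(`20a` = c(1, 2), b = c(3, 4)))
kept     <- names(data.frame(`20a` = c(1, 2), b = c(3, 4), check.names = FALSE))

# Operators are ordinary functions once their names are backticked.
temp <- list(a = 1:10, b = c(2.5, 3.5))
same <- identical(`[[`(temp, 1), temp[[1]])

print(fixed)
print(repaired)
print(kept)
```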

How to display greater than or equal to sign using unicode \u2265

This is a follow-up question to "Displaying a greater than or equal sign"
This is the text I wish to display as the y axis label:
Pr(Number of Invasions, X ≥ x)
This is the code:
expression(paste("Pr(Number of Invasions, ", italic('X'), "\u2265", italic('x'), ")"))
What I get is:
Pr(Number of Invasions, X = x)
This is the same result as in the thread mentioned above. "\u2265" is supposed to overcome the issue, as suggested in the answers to that thread, but it doesn't in my case.
When I run "\u2265" the result is:
"\u2265"
[1] "≥"
When I assign this to an object I get the same result:
symbol<-"\u2265"
symbol
[1] "≥"
However, in the Global Environment the object "symbol" contains "=".
Can anyone suggest how to display the symbol in the plot?
The answer isn't obvious to me.
I'm using RStudio, and the operating system is Windows 7.
By placing quotation marks around >= or \u2265 within paste within expression, I was not able to produce the right symbol.
Even though I was formatting the Xs in italics, I should have just treated the code as if it were X>=x, which is what expression really wants to see, as MrFlick suggested... which makes sense now.
So:
expression(paste("Pr(Number of Invasions, ", italic('X')>=italic('x'), ")"))
Thanks MrFlick!
You don't need paste. It's often clearer to use ~ and * as separators:
plot(1,1, xlab=expression(Pr*'('*Number~of~Invasions~~ italic(X)*'\u2265'*italic(x)*")") )
That way it's easier to transition to the "full" plotmath version which gets a different spacing and looks better:
plot(1,1,
xlab=expression( Pr*'('*Number~of~Invasions~~ italic(X) >= italic(x)*")" )
)
If you had really wanted to have a named token hold the "≥" character, you can use the bquote and .( )-functions. The names inside the .( ) get evaluated (when the dot-function is within bquote):
symbol<-"\u2265"
plot(1,1,xlab=bquote(Pr*'('*Number~of~Invasions~~ italic(X) * .(symbol) * italic(x)*")") )
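To see what bquote() changes compared with expression(), the two results can be inspected directly. A short sketch, with shortened labels and illustrative object names:

```r
symbol <- "\u2265"  # the "greater than or equal to" character

# expression() keeps the name: the label would show the word "symbol".
lab_expr <- expression(italic(X) * symbol * italic(x))

# bquote() evaluates .(symbol) and splices the character itself into the call.
lab_bquote <- bquote(italic(X) * .(symbol) * italic(x))

# plotmath's own operator, for comparison: >= is part of the parsed call.
lab_plotmath <- expression(italic(X) >= italic(x))

expr_keeps_name   <- "symbol" %in% all.vars(lab_expr)
bquote_drops_name <- !("symbol" %in% all.vars(lab_bquote))
has_operator      <- ">=" %in% all.names(lab_plotmath)

print(c(expr_keeps_name, bquote_drops_name, has_operator))
```

Whether the Unicode character then renders correctly still depends on the graphics device and font, which is why the plotmath >= form is usually the safer choice.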

Using expression and paste in R to format an ion name with units in parentheses

I want to create a clean label for a graph that has the species abbreviation of an ion (in this case chloride) followed by the concentration units (microequivalents per liter) enclosed in parentheses. As written, the code mostly produces this, but it superscripts the parentheses/units section. I'm probably missing something small. I'm using this code snippet with the ylab() command in ggplot2 as a label. Thanks.
My code so far:
cl.label = expression(paste(Cl^- ~(mu~eq ~L^-1)), parse=TRUE)
In the expression, - is an operator so it needs something to "negate." You can give it a phantom object like
cl.label = expression(Cl^-phantom() ~(mu~eq ~L^-1))
or you can treat the - as a literal dash value with
cl.label = expression(Cl^"-" ~(mu~eq ~L^-1))
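Why the units ended up in the superscript can be seen from how R groups the label. The sketch below uses quote() to look at the parsed call; the object names are illustrative, and the paste()/parse=TRUE wrapper from the question is left out to keep the comparison small.

```r
# quote() returns the parsed call without evaluating it.
original <- quote(Cl^- ~(mu ~ eq ~ L^-1))
fixed    <- quote(Cl^-phantom() ~ (mu ~ eq ~ L^-1))

# The outermost operator shows how R grouped each label.
top_original <- as.character(original[[1]])  # "^": the minus takes ~(...) as its operand
top_fixed    <- as.character(fixed[[1]])     # "~": Cl^- and the units are siblings

print(top_original)
print(top_fixed)
```

In the original, the dangling minus grabs everything to its right as its operand, so the whole units group becomes part of the exponent. Giving the minus its own operand (phantom() or a quoted "-") ends the exponent there.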
