Use of mathematical annotation as factor in R

I refer to my previous question, and want to know more about characteristics of factor in R.
Let's say I have a dataset like this:
temp <- data.frame(x=letters[1:5],
y=1:5,
stringsAsFactors=TRUE)  # R >= 4.0 no longer converts strings to factors by default
plot(temp)
I can easily change the levels of x to other characters:
levels(temp[,"x"]) <- letters[6:10]
But if I want to change them into expressions:
levels(temp[,"x"]) <- c(expression(x>=1),
expression(x>=2),
expression(x>=3),
expression(x>=4),
expression(x>=5))
The >= sign is not rendered as a mathematical symbol in the plot. I found that class(levels(temp[,"x"])) is character, whereas class(expression(x>=1)) is expression.
If I want to use mathematical annotation as factor levels, what can I do?

I do not see any levels arguments in ggplot and assigning levels to a character vector should not work. If you are trying to assign expression vectors you should just use one expression call and separate the arguments by commas and you should use the labels argument in a scale function:
p <- qplot(1:10, 10:1)+ scale_y_continuous( breaks= 1:10,
labels=expression( x>= 1, x>=2, x>=3, x>= 4,x>=5,
x>= 6, x>=7, x>= 8,x>=9, x>= 10) )
p

I would just leave them as character strings
levels(temp[,"x"]) <- paste("x>=", 1:5, sep="")
If you then want to include them as axis labels, you could do something like the following to convert them to expressions:
lev.as.expr <- parse(text=levels(temp[,"x"]))
For your plot, you could then do:
plot(temp, xaxt="n")
axis(side=1, at=1:5, labels=lev.as.expr)

Expressions are used to generate text for plots and output, but they can't be the variable names per se. You'd have to use the axis() command to generate your own labels. Because axis() accepts expressions as labels, you could try:
plot(temp, xaxt = 'n')
s <- paste('x>', 1:5, sep = '=')
axis(1, 1:5, parse(text = s))
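
To see why the answers above route the labels through parse() and axis(), here is a minimal sketch (assuming R 4.0 or later, hence the explicit stringsAsFactors = TRUE; lev.as.expr is the name used in the earlier answer). It shows that factor levels are stored as character strings, while parse(text = ...) turns those strings into an expression vector:

```r
# Factor levels are character; parse() converts them to plotmath expressions.
temp <- data.frame(x = letters[1:5], y = 1:5, stringsAsFactors = TRUE)
levels(temp[, "x"]) <- paste("x>=", 1:5, sep = "")

lev.as.expr <- parse(text = levels(temp[, "x"]))

class(levels(temp[, "x"]))  # the levels themselves stay character
class(lev.as.expr)          # an expression vector, one element per level
```

Passing lev.as.expr as the labels argument of axis(), as in the answers, is what lets the >= be drawn as a mathematical symbol.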

Related

In R, why is there awkward output in the legend when I am using paste() instead of c() in addition to pretty10exp()?

I'm trying to make the legend of this plot pretty, so I need there to be an actual superscript, which is why I am using the pretty10exp() function from the sfsmisc library. It works when I use the c() function.
However, I am also trying to keep the string and the scientific notation number on the same line. The legend() is broken into two lines, which I think is due to c(). I thought I could use paste(), but for some reason the output is now incorrect.
plot(1:12)
pVal <- 4
legend("topright", legend = c("P value:", sfsmisc::pretty10exp(pVal)), cex = 1.5)
legend("topright", legend = paste("P value:", sfsmisc::pretty10exp(pVal)), cex = 1.5)
Here pVal is an arbitrary number represented in scientific notation. The second legend() call results in output like this: "P value: (significand) %*% 10^-4". The first call also doesn't give me what I want. How can I fix this problem?
pretty10exp returns an expression, which allows it to use the ?plotmath features for making nice-looking numbers. When working with expressions, you can't just paste values in like strings. You need to manipulate them with a special set of functions. One such function is substitute. You can do:
plot(1:12)
pVal <- 4
legend("topright", cex = 1.5,
legend = substitute("P value: "*x, list(x=sfsmisc::pretty10exp(pVal)[[1]])) )
We use substitute() to take the value contained in the expression from pretty10exp and prefix it with the label you want. (We use * to concatenate rather than paste() since plotmath allows it)
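
A small sketch of the difference, using a hand-written expression as a hypothetical stand-in for the output of pretty10exp() so that sfsmisc is not required (the exact value pretty10exp() returns may differ). The names e, as_text and lab are illustrative only:

```r
# Hypothetical stand-in for the expression vector returned by pretty10exp()
e <- expression(4 %*% 10^-4)

# paste() coerces the expression to text, so the plotmath markup is lost
as_text <- paste("P value:", e)
is.character(as_text)

# substitute() splices the call into a larger plotmath call instead
lab <- substitute("P value: " * x, list(x = e[[1]]))
is.call(lab)
```

Because lab is still a language object, legend() hands it to plotmath rather than printing the %*% literally.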
This is what I would do:
fun <- function(text, pVal) {
y <- floor(log10(pVal))
x <- pVal / 10^y
bquote(.(text)*":" ~ .(x) %.% 10 ^ .(y))
}
plot.new()
text(0.5,0.7,fun("P value", 0.4))
text(0.5, 0.3, fun("P value", signif(1/pi, 1)))
No package is needed.

superpose a histogram and an xyplot

I'd like to superpose a histogram and an xyplot representing the cumulative distribution function using r's lattice package.
I've tried to accomplish this with custom panel functions, but can't seem to get it right--I'm getting hung up on one plot being univariate and one being bivariate I think.
Here's an example with the two plots I want stacked vertically:
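library(lattice)  # provides histogram(), xyplot(), and the panel functions
set.seed(1)
x <- rnorm(100, 0, 1)
discrete.cdf <- function(x, decreasing=FALSE){
x <- x[order(x,decreasing=decreasing)]  # honor the decreasing argument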
result <- data.frame(rank=1:length(x),x=x)
result$cdf <- result$rank/nrow(result)
return(result)
}
my.df <- discrete.cdf(x)
chart.hist <- histogram(~x, data=my.df, xlab="")
chart.cdf <- xyplot(100*cdf~x, data=my.df, type="s",
ylab="Cumulative Percent of Total")
graphics.off()
trellis.device(width = 6, height = 8)
print(chart.hist, split = c(1,1,1,2), more = TRUE)
print(chart.cdf, split = c(1,2,1,2))
I'd like these superposed in the same frame, rather than stacked.
The following code doesn't work, nor do any of the simple variations of it that I have tried:
xyplot(cdf~x,data=cdf,
panel=function(...){
panel.xyplot(...)
panel.histogram(~x)
})
You were on the right track with your custom panel function. The trick is passing the correct arguments to the panel.* functions. For panel.histogram, this means not passing a formula and supplying an appropriate value to the breaks argument:
EDIT: Proper percent values on the y-axis and the intended plot types.
xyplot(100*cdf~x,data=my.df,
panel=function(...){
panel.histogram(..., breaks = do.breaks(range(x), nint = 8),
type = "percent")
panel.xyplot(..., type = "s")
})
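
The panel function above picks up x from the global environment when computing the breaks. As a hedged variation (a sketch, not the answerer's code), the panel function can name x and y explicitly so that it only uses the data xyplot() passes in. The data frame is rebuilt here to keep the example self-contained, and p is an illustrative name:

```r
library(lattice)

set.seed(1)
x <- sort(rnorm(100))
my.df <- data.frame(x = x, cdf = seq_along(x) / length(x))

# The panel function receives x and y from xyplot() as explicit arguments
p <- xyplot(100 * cdf ~ x, data = my.df,
            ylab = "Percent of Total",
            panel = function(x, y, ...) {
              # histogram of x, scaled to percent so it shares the y-axis
              panel.histogram(x, breaks = do.breaks(range(x), nint = 8),
                              type = "percent")
              # cumulative percent drawn as a step function on top
              panel.xyplot(x, y, type = "s", ...)
            })
```

Printing p (or typing it at the console) draws both layers in one panel.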
This answer is just a placeholder until a better answer comes.
The hist() function from the graphics package has an option called add. The following does what you want in the "classical" way:
plot( my.df$x, my.df$cdf * 100, type= "l" )
hist( my.df$x, add= TRUE )

How to include variable values in histogram titles in R - using by()

I want to produce histograms using by(). How can I access the values of the factors to include them in the histogram headings? For example:
a <- runif(500, 0, 10)
b <- LETTERS[1:5]
c <- c("Condition1", "Condition2")
x <- data.frame("Variable1" = b, "Variable2"= c, "Value"=a)
head(x)
by(x$Value, x$Variable2, hist)
or using two variables
by(x$Value, list(x$Variable2, x$Variable1), hist)
Is there a way of passing the variable value (eg Condition1) to the title of the histogram using the options within hist(), eg putting function(x) hist(x, main=...) into by()?
Pass the split-up data frame rather than just the values. Then you will have more to work with:
by(x, x$Variable2, function(x) hist(x$Value, main=unique(x$Variable2) ) )
This produces two plots labeled Condition1 and Condition2.
This doesn't really answer your question, since you're specifying the use of by(), but I usually use split() and lapply() for these types of problems. My approach is usually along the lines of:
temp <- split(x$Value, list(x$Variable2, x$Variable1))
lapply(names(temp), function(x) hist(temp[[x]], main = x, xlab = "Value"))
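
To make it clearer where the titles in the split() approach come from, here is a minimal sketch using the same data layout as the question. It only inspects the pieces and draws nothing:

```r
a <- runif(500, 0, 10)
x <- data.frame(Variable1 = LETTERS[1:5],
                Variable2 = c("Condition1", "Condition2"),
                Value = a)

# split() on a list of factors names each piece "<Variable2>.<Variable1>"
temp <- split(x$Value, list(x$Variable2, x$Variable1))
head(names(temp), 3)
lengths(temp)  # number of values in each combination
```

These names are the strings that the lapply() call above passes to main, so each histogram is titled with its own factor combination.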

Generate a list of expression literals from an integer sequence

I would like to map a sequence of integers to a sequence of expression literals in order to use the latter as tick mark labels in a plot, e.g.
lbls <- lapply(-2:2, function(i) expression(i * pi))
plot(...)
axis(1, at=seq(-2,2)*pi, labels=lbls)
So far I've tried all variations of bquote, substitute, expression etc. that I could think of, but apparently I must have missed something.
Also, the FAQ and related SO questions & answers didn't fully solve this for me.
How would I do it correctly (I want axis to render pi as the greek letter and have -2 ... 2 substituted for i in the above example)?
Try this:
lbls <- do.call("expression", lapply(-2:2, function(i) substitute(X * pi, list(X = i))))
plot(-10:10, -10:10, xaxt="n")
axis(1, at=seq(-2,2)*pi, labels=lbls)
Try this:
lbls <- parse(text = paste(seq(-2, 2), "pi", sep = "*"))
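
The original attempt fails because expression() does not evaluate its arguments, so every label stays the literal i * pi. As a sketch of a third route, alongside substitute() and parse() above, bquote() evaluates only the parts wrapped in .() (the name unevaluated is illustrative):

```r
# expression() quotes its argument, so i is never replaced by its value
unevaluated <- lapply(-2:2, function(i) expression(i * pi))
deparse(unevaluated[[1]][[1]])

# bquote() substitutes the value of i wherever .(i) appears
lbls <- as.expression(lapply(-2:2, function(i) bquote(.(i) * pi)))
length(lbls)
```

The resulting lbls can be passed to the labels argument of axis() in the same way as in the answers above.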

Add values and superscript to pie-labels

I'm struggling with adding a superscript to the labels of a plot.
I would like to have the '3' in the labels (..m^3) as superscript. I tried expression(), substitute() etc. but didn't find the correct solution.
values <- c(2, 4, 5)
pie(values, labels = paste(values, "m^3") )
Thanks for any hint!
A bit cumbersome workaround:
foo <- sapply(as.list(values), function(x) bquote(.(x) ~ m^3))
pie(values, labels = as.expression(foo))
