R plot: Bringing formatted values into expression() in base graphics legend - r

The problem: I am trying to make a legend that reads something like:
ɛ = 5 L / (mol cm). The number, however, is calculated. Here is a minimal example:
plot(rnorm(10,3),rnorm(10,3))
epsilon.calc <- mean(rnorm(10,3))
legend("topleft",bty="n",legend=paste("epsilon=",format(epsilon.calc,digits=5),"L/(molcm)"))
legend("bottom",bty="n",legend=expression(epsilon,paste(format(epsilon.calc,digits=5)),"L/(molcm)"))
With the first legend I can paste in the number (epsilon.calc); with the second I can render epsilon as a Greek symbol.
Does anyone have an idea how to combine expression() and paste() in one legend?

This is what I think you want:
legend("bottom",bty="n",legend=c(bquote(epsilon ==.(format(epsilon.calc, digits=5))),
expression( L/(mol %.% cm) )
) )
It's better to learn to use plotmath with minimal use of paste(). bquote is the simplest mechanism for getting evaluations done but it can also be done with substitute. paste inside an expression call is actually a different function than paste "outside" one.
This is the one line version:
legend("bottom",bty="n",
legend=bquote(epsilon ==.(format(epsilon.calc, digits=5))~(L/(mol %.% cm))
) )
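To see what bquote() builds before it reaches legend(), the label can be inspected on its own. This is only a sketch: it assumes a fixed stand-in value for epsilon.calc (the original is random) and draws to a null PDF device so that it runs without opening a window.

```r
# Stand-in value; in the question epsilon.calc comes from mean(rnorm(10, 3))
epsilon.calc <- 3.14159265

# .() is evaluated immediately; everything else is kept as plotmath
lab <- bquote(epsilon == .(format(epsilon.calc, digits = 5)) ~ L / (mol %.% cm))
print(lab)

pdf(NULL)                      # null device: no file is created
plot(1:10, 1:10)
legend("bottom", bty = "n", legend = lab)
invisible(dev.off())
```

Printing lab shows the formatted number already substituted into the unevaluated call, which is the difference from expression(), where format(epsilon.calc, ...) would stay as literal code.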

Related

R Legend Variable Substitution

I always desire to have my R code as flexible as possible; at present I have three (potentially more) curves to compare based on a parameter delta, but I don't want to hardcode the values of delta anywhere (or even how many values if I can avoid it).
I am trying to make a legend that involves both Greek and a variable substitution for the delta values, so each legend entry is of the form like 'delta = 0.01', where delta is Greek and 0.01 is determined by variable. Many different combinations of paste, substitute, bquote and expression have been tried, but always end up with some verbatim code leftover in the finished legend, OR fail to put 'delta' into symbolic form.
delta <- c(0.01,0.05,0.1)
plot(type="n", x=1:5, y=1:5) #the curves themselves are irrelevant
legend_text <- vector(length=length(delta)) #I don't think lists work either
for(i in 1:length(delta)){
legend_text[i] <- substitute(paste(delta,"=",D),list(D=delta[i]) )
}
legend(x="topleft", fill=rainbow(length(delta)), legend=legend_text)
Since legend=substitute(paste(delta,"=",D),list(D=delta[1])) works for a single entry, I've also tried doing a 'semi-hardcoded' version, fixing the length of delta:
legend(x="topleft", fill=rainbow(length(delta)),
legend=c(substitute(paste(delta,"=",A), list(A=delta[1])),
substitute(paste(delta,"=",B), list(B=delta[2])),
substitute(paste(delta,"=",C), list(C=delta[3])) )
)
but this has the same issues as before.
Is there a way I can do this, or do I need to change the code by hand with each update of delta?
Try using lapply() with as.expression() to generate your legend labels, and use bquote() to create the individual expressions:
legend_text <- as.expression(lapply(delta, function(d) {
bquote(delta==.(d))
} ))
Note that with plotmath you need == to get an equals sign. Also no need for paste() since nothing is really a string here.
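The answer's fragment can be run end to end as below. This is a sketch that reuses the question's delta vector and empty plot; the null PDF device is an assumption added here so the code runs without a display.

```r
delta <- c(0.01, 0.05, 0.1)

# One plotmath label per value; .(d) is replaced by the number,
# while the bare symbol delta is left for plotmath to draw as Greek
legend_text <- as.expression(lapply(delta, function(d) bquote(delta == .(d))))

pdf(NULL)                      # null device: no file is created
plot(1:5, 1:5, type = "n")
legend("topleft", fill = rainbow(length(delta)), legend = legend_text)
invisible(dev.off())
```

Because the labels are generated from delta itself, adding or removing values requires no change to the legend code.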

How to display subscripts and array elements simultaneously on r plot

I have a coefficient array bees created in the following way:
gfit = lm(y_data ~ x_data);
bees = coef(gfit);, where bees[1]=0.123, bees[2]=4.56
A plot plot(x_data,y_data) is created. I'd like to add some text on this plot. The text should look like $b_0=0.123, b_1=4.56$ (how to add Latex symbols on StackOverflow?).
I tried the following command: text(3,15,expression(paste("b"[0],"=",bees[1])));, which turns out to be $b_0=bees_1$, i.e. the variable bees[1] is not interpreted properly.
How can I display the value of a variable by typing its name?
R doesn't have a LaTeX interpreter. You need to use ?plotmath. Try using bquote to pull in the values of R objects; the example below assumes that (1,1) is in the range of your (undescribed) data. The .() function puts values taken from the working environment into the expression:
text(1,1, bquote( list( b[0] == .(bees[1]) , b[1] == .(bees[2]) ) ) )
See the examples in ?bquote.
Writing formulas is a horrible mess in R. Only regexp is more write-only.
bees=c(0.12, 4.56)
plot(rnorm(100))
text(30,0,bquote(bees[1]== .(bees[1])))
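Putting the two answers together, a fuller sketch might look like the following. The data are synthetic (the question does not show any), and the use of unname() and signif() to tidy the coefficients is an addition here, not part of either answer.

```r
set.seed(1)                          # synthetic data, for illustration only
x_data <- 1:20
y_data <- 0.1 + 4.5 * x_data + rnorm(20)
bees <- coef(lm(y_data ~ x_data))

# unname() drops the coefficient names; signif() keeps the label short
b <- signif(unname(bees), 3)
lab <- bquote(list(b[0] == .(b[1]), b[1] == .(b[2])))

pdf(NULL)                            # null device: no file is created
plot(x_data, y_data)
text(5, 80, lab)
invisible(dev.off())
```

Inside bquote(), b[0] and b[1] outside .() are plotmath subscripts, while .(b[1]) and .(b[2]) are ordinary R indexing evaluated on the spot.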

r producing multiple violin plots with one graphic device

I have a data frame that looks like that:
bin_with_regard_to_strand CLONE3
31 0.14750872
33 0.52735917
28 0.48559060
. .
. .
I want to use this data frame to generate violin plots in such a way that all of the values in CLONE3 corresponding to a given value of bin_with_regard_to_strand will generate one plot.
Further, I want all of the plots to appear in the same graphic device (I'm using R-studio, and I want all of the plots to appear in one plot window).
Theoretically I could do this with:
vioplot(df$CLONE3[which(df$bin_with_regard_to_strand==1)],
df$CLONE3[which(df$bin_with_regard_to_strand==2)]...)
but since bin_with_regard_to_strand has 60 different values, this seems a bit ridiculous.
I tried using tapply:
tapply(df$CLONE3, df$bin_with_regard_to_strand,vioplot)
But that would open 60 different windows (one for each plot).
Or, if I used the add parameter:
tapply(df$CLONE3, df$bin_with_regard_to_strand,vioplot(add=TRUE))
That generated a single plot with the data from all values of bin_with_regard_to_strand (separated by lines).
Is there a way to do this?
You could use par(mfrow=c(rows, columns)) (see ?par for details).
(see also ?layout for more complex arrangements)
d <- lapply(1:6, function(x)runif(100)) # generate some example data
library("vioplot")
par(mfrow=c(3, 2)) # use a 3x2 (rows x columns) layout
lapply(d, vioplot) # call plot for each list element
par(mfrow=c(1, 1)) # reset layout
Another alternative to mfrow is layout, which is very handy for organizing plots: you just create a matrix of plot indices. Note that 60 plots is a lot for one page; you may want to spread them over two pages.
The code below is written in terms of N (the number of plots):
library(vioplot)
N <- 60
par(mar=rep(2,4))
layout(matrix(c(1:N),
nrow=10,byrow=T))
dat <- data.frame(bin_with_regard_to_strand=gl(N,10),CLONE3=rnorm(10*N))
with(dat ,
tapply(CLONE3,bin_with_regard_to_strand ,vioplot))
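The same grouping can be written with split() and an explicit loop, which makes it easy to title each panel with its bin. This is a sketch on synthetic data with six bins; base boxplot() is used as a stand-in so that it runs without the vioplot package, and the null PDF device is likewise an assumption for running without a display.

```r
# Synthetic stand-in for the question's data frame: 6 bins, 10 values each
df <- data.frame(bin_with_regard_to_strand = gl(6, 10),
                 CLONE3 = rnorm(60))

# split() returns a named list with one numeric vector per bin
by_bin <- split(df$CLONE3, df$bin_with_regard_to_strand)

pdf(NULL)                                   # null device: no file is created
old <- par(mfrow = c(3, 2), mar = rep(2, 4))
for (bin in names(by_bin)) {
  boxplot(by_bin[[bin]], main = bin)        # stand-in for the violin plot call
}
par(old)                                    # restore the previous settings
invisible(dev.off())
```

Saving the result of par() and restoring it afterwards avoids having to remember the original mfrow and mar values.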
This is an old question, but I thought I would put out a different solution for getting vioplot to make multiple violin plots on the same graph (i.e. same axes), rather than in separate panels like the above answers.
Basically, use do.call to apply vioplot to a list of data. Ultimately, vioplot is not very well written (can't even set the title, axis names, etc.). I usually prefer base R, but this is a case where ggplot2 is probably the way to go.
library(vioplot)
x<-rnorm(1000)
fac<-rep(c(1:10),each=100)
listOfData<-tapply(x,fac,function(x){x},simplify=FALSE)
names(listOfData)[[1]]<-"x" #because vioplot requires a 'x' argument
do.call(vioplot,listOfData)

strip panels lattice

My problem is to strip my panels with lattice framework.
testData<-data.frame(star=rnorm(1200),frame=factor(rep(1:12,each=100))
,n=factor(rep(rep(c(4,10,50),each=100),4))
,var=factor(rep(c("h","i","h","i"),each=300))
,stat=factor(rep(c("c","r"),each=600))
)
levels(testData$frame)<-c(1,7,4,10,2,8,5,11,3,9,6,12)# order of my frames
histogram(~star|factor(frame), data=testData
,as.table=T
,layout=c(4,3),type="density",breaks=20
,panel=function(x,params,...){
panel.grid()
panel.histogram(x,...,col=1)
panel.curve(dnorm(x,0,1), type="l",col=2)
}
)
What I'm looking for, is:
You should not need to add the factor call around items in the conditioning section of the formula when they are already factors. If you want to make a cross between two factors the interaction function is the best approach. It even has a 'sep' argument which will accept a new line character. This is the closest I can produce:
library(lattice)
library(latticeExtra) # provides useOuterStrips()
h<-histogram(~star|interaction(stat, var, sep="\n") + n, data=testData ,
as.table=T ,layout=c(4,3), type="density", breaks=20 ,
panel=function(x,params,...){ panel.grid()
panel.histogram(x,...,col=1)
panel.curve(dnorm(x,0,1), type="l",col=2) } )
plot(h)
useOuterStrips(h,strip.left = strip.custom(horizontal = FALSE),
strip.lines=2, strip.left.lines=1)
I get an error when I try to put in three factors separately and then try to use useOuterStrips. It won't accept three separate conditioning factors. I've searched for postings in Rhelp, but the only perfectly on-point question got an untested suggestion and when I tried it failed miserably.

Getting strings recognized as variable names in R

Related: Strings as variable references in R
Possibly related: Concatenate expressions to subset a dataframe
I've simplified the question per the comment request. Here goes with some example data.
dat <- data.frame(num=1:10,sq=(1:10)^2,cu=(1:10)^3)
set1 <- subset(dat,num>5)
set2 <- subset(dat,num<=5)
Now, I'd like to make a bubble plot from these. I have a more complicated data set with 3+ colors and complicated subsets, but I do something like this:
symbols(set1$sq,set1$cu,circles=set1$num,bg="red")
symbols(set2$sq,set2$cu,circles=set2$num,bg="blue",add=T)
I'd like to do a for loop like this:
colors <- c("red","blue")
sets <- c("set1","set2")
vars <- c("sq","cu","num")
for (i in 1:length(sets)) {
symbols(sets[[i]][,sq],sets[[i]][,cu],circles=sets[[i]][,num],
bg=colors[[i]],add=T)
}
I know you can have a variable evaluated to specify the column (like var="cu"; set1[,var]). I want to know how to get a variable to specify the data.frame itself (and another to evaluate the column).
Update: Ran across this post on r-bloggers which has this example:
x <- 42
eval(parse(text = "x"))
[1] 42
I'm able to do something like this now:
eval(parse(text=paste(set[[1]],"$",var1,sep="")))
In fiddling with this, I'm finding it interesting that the following are not equivalent:
vars <- data.frame("var1","var2")
eval(parse(text=paste(set[[1]],"$",var1,sep="")))
eval(parse(text=paste(set[[1]],"[,vars[[1]]]",sep="")))
I actually have to do this:
eval(parse(text=paste(set[[1]],"[,as.character(vars[[1]])]",sep="")))
Update2: The above works to output values... but not in trying to plot. I can't do:
for (i in 1:length(set)) {
symbols(eval(parse(text=paste(set[[i]],"$",var1,sep=""))),
eval(parse(text=paste(set[[i]],"$",var2,sep=""))),
circles=paste(set[[i]],".","circles",sep=""),
fg="white",bg=colors[[i]],add=T)
}
I get invalid symbol coordinates. I checked the class of set[[1]] and it's a factor. If I do is.numeric(as.numeric(set[[1]])) I get TRUE. Even if I add that above prior to the eval statement, I still get the error. Oddly, though, I can do this:
set.xvars <- as.numeric(eval(parse(text=paste(set[[i]],"$",var1,sep=""))))
set.yvars <- as.numeric(eval(parse(text=paste(set[[i]],"$",var2,sep=""))))
symbols(xvars,yvars,circles=data$var3)
Why is the behavior different when the values are stored in a variable rather than evaluated inside the symbols() call?
You found one answer, i.e. eval(parse()) . You can also investigate do.call() which is often simpler to implement. Keep in mind the useful as.name() tool as well, for converting strings to variable names.
The basic answer to the question in the title is eval(as.symbol(variable_name_as_string)) as Josh O'Brien uses. e.g.
var.name = "x"
assign(var.name, 5)
eval(as.symbol(var.name)) # outputs 5
Or more simply:
get(var.name) # 5
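Applied to the question's data, get() and its multi-name counterpart mget() might be used as in the sketch below. The variable names sets and xcol are chosen here for illustration; the data frames are the ones from the question.

```r
dat  <- data.frame(num = 1:10, sq = (1:10)^2, cu = (1:10)^3)
set1 <- subset(dat, num > 5)
set2 <- subset(dat, num <= 5)

# get() looks up one object by name; mget() does the same for several
# names and returns them as a named list
sets <- mget(c("set1", "set2"))

# Columns can be selected by string with [[ ]]
xcol <- "sq"
for (nm in names(sets)) {
  cat(nm, ":", sets[[nm]][[xcol]], "\n")
}
```

Once the data frames sit in a named list, the loop no longer needs to build or evaluate code from strings.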
Without any example data, it really is difficult to know exactly what you are wanting. For instance, I can't at all divine what your object set (or is it sets) looks like.
That said, does the following help at all?
set1 <- data.frame(x = 4:6, y = 6:4, z = c(1, 3, 5))
plot(1:10, type="n")
XX <- "set1"
with(eval(as.symbol(XX)), symbols(x, y, circles = z, add=TRUE))
EDIT:
Now that I see your real task, here is a one-liner that'll do everything you want without requiring any for() loops:
with(dat, symbols(sq, cu, circles = num,
bg = c("red", "blue")[(num>5) + 1]))
The one bit of code that may feel odd is the bit specifying the background color. Try out these two lines to see how it works:
c(TRUE, FALSE) + 1
# [1] 2 1
c("red", "blue")[c(F, F, T, T) + 1]
# [1] "red" "red" "blue" "blue"
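The indexing trick can be checked against a more explicit spelling. The sketch below uses the num column from the question and compares the result with ifelse(), which is offered here only as an equivalent for comparison.

```r
num <- 1:10

# (num > 5) is logical; adding 1 turns FALSE/TRUE into the indices 1/2
cols_index <- c("red", "blue")[(num > 5) + 1]

# ifelse() spells out the same mapping
cols_ifelse <- ifelse(num > 5, "blue", "red")

stopifnot(identical(cols_index, cols_ifelse))
```

The indexing form extends naturally to more than two groups if the logical test is replaced by an integer group index such as the result of cut() or findInterval().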
If you want to use a string as a variable name, you can use assign:
var1="string_name"
assign(var1, c(5,4,5,6,7))
string_name
[1] 5 4 5 6 7
Subsetting the data and combining them back is unnecessary. So are loops since those operations are vectorized. From your previous edit, I'm guessing you are doing all of this to make bubble plots. If that is correct, perhaps the example below will help you. If this is way off, I can just delete the answer.
library(ggplot2)
# let's look at the included dataset named trees.
# ?trees for a description
data(trees)
ggplot(trees,aes(Height,Volume)) + geom_point(aes(size=Girth))
# Great, now how do we color the bubbles by groups?
# For this example, I'll divide Volume into three groups: lo, med, high
trees$set[trees$Volume<=22.7]="lo"
trees$set[trees$Volume>22.7 & trees$Volume<=45.4]="med"
trees$set[trees$Volume>45.4]="high"
ggplot(trees,aes(Height,Volume,colour=set)) + geom_point(aes(size=Girth))
# Instead of just circles scaled by Girth, let's also change the symbol
ggplot(trees,aes(Height,Volume,colour=set)) + geom_point(aes(size=Girth,pch=set))
# Now let's choose a specific symbol for each set. Full list of symbols at ?pch
trees$symbol[trees$Volume<=22.7]=1
trees$symbol[trees$Volume>22.7 & trees$Volume<=45.4]=2
trees$symbol[trees$Volume>45.4]=3
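# scale_shape_identity() makes ggplot2 use the numeric codes as the symbols themselves
ggplot(trees,aes(Height,Volume,colour=set)) + geom_point(aes(size=Girth,pch=symbol)) + scale_shape_identity()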
What works best for me is using quote() and eval() together.
For example, let's print each column using a for loop:
library(magrittr) # provides %>%
Columns <- names(dat)
for (i in 1:ncol(dat)){
dat[, eval(quote(Columns[i]))] %>% print
}
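For this particular task a character string already works as a column selector, so the same loop can be sketched without quote(), eval(), or the pipe. The data frame is the one from the question.

```r
dat <- data.frame(num = 1:10, sq = (1:10)^2, cu = (1:10)^3)

# [[ ]] accepts the column name as a string directly
for (col in names(dat)) {
  print(dat[[col]])
}
```

Looping over names(dat) rather than 1:ncol(dat) also behaves sensibly for a data frame with zero columns, where 1:ncol(dat) would count down from 1 to 0.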

Resources