GNU R ships with a very odd notation for formulas and symbols. It is often discussed here and is documented in the R help page ?plotmath. For anyone who has ever written LaTeX, the code for even a simple formula in R looks unreadable and is error-prone to write.
Is there a better way to annotate with formulas? Is there a function like tex2r("x_2") that will generate the strange code?
edit:
I am looking for a solution without tikzDevice, because tikzDevice is still very fragile and the printout does not look exactly the same.
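For reference, this is the kind of plotmath notation the question is about. It is only an illustrative sketch: the names lab_sub, lab_frac and lab_mix are made up here, and the LaTeX in the comments is the rough equivalent, not something base R parses.

```r
# plotmath versions of a few simple LaTeX snippets (illustrative labels only)
lab_sub  <- expression(x[2])                    # LaTeX: x_2
lab_frac <- expression(frac(1, sqrt(2 * pi)))   # LaTeX: \frac{1}{\sqrt{2\pi}}

# bquote() substitutes a computed value into the label via .( )
lab_mix <- bquote(bar(x) == .(round(mean(c(1, 2, 4)), 2)))

# Any base graphics text argument accepts these objects
plot(1:10, xlab = lab_sub, ylab = lab_frac, main = lab_mix)
```

Running demo(plotmath) shows the full list of supported constructs.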
With the tikzDevice package (currently available only from the CRAN archive) you can use straight-up LaTeX markup to annotate your plots. (The package comes with a beautiful vignette that'll get you up and running).
The example below was lifted directly from this page, which also displays the figure it produces:
require(tikzDevice)
tikz('normal.tex', standAlone = TRUE, width=5, height=5)
# Normal distribution curve
x <- seq(-4.5,4.5,length.out=100)
y <- dnorm(x)
# Integration points
xi <- seq(-2,2,length.out=30)
yi <- dnorm(xi)
# plot the curve
plot(x,y,type='l',col='blue',ylab='$p(x)$',xlab='$x$')
# plot the panels
lines(xi,yi,type='s')
lines(range(xi),c(0,0))
lines(xi,yi,type='h')
#Add some equations as labels
title(main="$p(x)=\\frac{1}{\\sqrt{2\\pi}}e^{-\\frac{x^2}{2}}$")
int <- integrate(dnorm,min(xi),max(xi),subdivisions=length(xi))
text(2.8, 0.3, paste("\\small$\\displaystyle\\int_{", min(xi),
"}^{", max(xi), "}p(x)dx\\approx", round(int[['value']],3),
'$', sep=''))
#Close the device
dev.off()
# Compile the tex file
tools::texi2dvi('normal.tex', pdf = TRUE)
I just found a package that does exactly what the OP was asking for: latex2exp, which provides the function TeX.
E.g.:
library(latex2exp)
library(berryFunctions)
set.seed(1)
milk <- data.frame(Datum = as.Date(rep(seq(17500, 18460, by = 30), each = 30), origin = "1970-01-01"),
Milch = abs(rnorm(990, mean = 20, sd = 8)))
X11(width = 12, height = 7)
par(mar = c(3,5.5,3,1))
with(milk, plot(Datum, Milch, pch = "-", cex=1.5, col = rgb(red = 0, green = 0.5, blue = 0.5, alpha = 0.7),
xaxt = "n", xlab = "", ylab = "",
main = "Milchleistung am Wolkenhof"))
title(ylab = TeX("Milchmenge $\\,$ $\\left[\\frac{\\mathrm{kg}}{\\mathrm{Kuh} \\cdot \\mathrm{Tag}}\\right]$"), line = 2.5)
monthAxis()
yields:
Edit: The code now puts a space between "Milchmenge" and the left bracket "[", but I did not upload a new picture for that change.
Let's say I've assigned a plot in R to a variable name. Here's an example I'm currently working on, although any variable <- plotting code example will do:
myplot <- wireframe(sag.pr.dev ~ Col*Row, data=t22mapee,
xlab = "col",
ylab = "row",
ylim = c(33,1),
main = "T22 PR Sag Deviation",
#zlim=c(-0.6, 0.2),
drape = TRUE,
colorkey = TRUE,
scales = list(arrows=FALSE,cex=.5, tick.number = 10, z = list(arrows=F), distance =c(1.5, 1.5, 1.5)),
col.regions = terrain.colors(100),
screen = list(z = 30, x = -60))
Typing myplot will draw the plot on demand. But my question is: Is there a command/method to retrieve the code stored under myplot later? I'm aware of things like ls(myplot) and the like, but that only gives a list of commands invoked and not the actual code.
I need to do this because I have some plot code that has rolled off my screen in the current R session (due to foolishly listing out a few very long data frames), and I don't exactly remember how I created a few particular plots.
Thanks!
YES! You should be able to get back the code using
myplot$call
You can see this by typing str(myplot) and browsing the output.
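A minimal sketch of the same idea, assuming a lattice (trellis) object like the wireframe in the question. The mtcars data and the name code_text are stand-ins chosen for illustration, not part of the original problem.

```r
library(lattice)

# Store a lattice plot in a variable, as in the question
myplot <- xyplot(mpg ~ wt, data = mtcars, main = "MPG vs weight")

# The call that created the plot is kept as a component of the object
myplot$call

# Turn the call back into text, e.g. to paste it into a script
code_text <- paste(deparse(myplot$call), collapse = "\n")
cat(code_text, "\n")
```

This works for trellis objects because lattice stores the call; objects from other plotting systems may not keep it.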
I have a large data set with more than 20 million values. For privacy reasons, and to keep the code reproducible, I use mydata in its place.
set.seed(1234)
mydata <- rlnorm(28000000,3.14,1.3)
I want to find which known distribution fits mydata best, so I chose the function fitdist in the package fitdistrplus.
library(fitdistrplus)
fit.lnorm <- fitdist(mydata,"lnorm")
fit.weibull <- fitdist(mydata, "weibull")
fit.gamma <- fitdist(mydata, "gamma", lower = c(0, 0))
fit.exp <- fitdist(mydata,"exp")
Then I use the ppcomp function to draw a P-P plot to help me choose the best-fitting distribution.
library(RColorBrewer)
tiff("./pplot.tiff",res = 300,compression = "lzw",height = 6,width = 10,units = "in",pointsize = 12)
ppcomp(list(fit.lnorm,fit.weibull, fit.gamma,fit.exp), fitcol = brewer.pal(9,"Set1")[1:4],legendtext = c("lnorm","weibull", "gamma","exp"))
dev.off()
Clearly the lognormal fits mydata best. But look at the legend of the plot: the line annotation in different colors is missing and only the text shows.
I tried some data sets with fewer values, and there it worked, so the large data seems to cause the problem. What should I do to make the legend complete?
Many questions about how a function behaves can be answered with fix(function), which opens its source so we can see how it works.
fix(ppcomp)
And I found this code dealing with the legend:
if (addlegend) {
if (missing(legendtext))
legendtext <- paste("fit", 1:nft)
if (!largedata)
legend(x = xlegend, y = ylegend, bty = "n", legend = legendtext,
pch = fitpch, col = fitcol, ...)
else legend(x = xlegend, y = ylegend, bty = "n", legend = legendtext,
col = fitcol, ...)
}
In the largedata branch, legend() is called without pch or lty, so only the text is drawn. I added lty = 1 to that legend call, and it works.
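The effect can be reproduced with base graphics alone. This is only a sketch of the legend behaviour, not of ppcomp itself; the colors and the names cols, labs and leg are placeholders.

```r
cols <- c("red", "blue", "green", "purple")
labs <- c("lnorm", "weibull", "gamma", "exp")

# Empty plotting region to draw the legends on
plot(0:1, 0:1, type = "n", xlab = "", ylab = "")

# No pch, lty or fill: only the text is drawn, col has nothing to color
legend("topleft", legend = labs, col = cols, bty = "n")

# With lty = 1 each entry gets a colored line segment
leg <- legend("bottomright", legend = labs, col = cols, lty = 1, bty = "n")
```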
I'm trying to produce a non-ultrametric tree using the ape package in R and the function plot.phylo(). I'm struggling to find any documentation on how to keep the tip labels vertically aligned on their left edge, with a series of dots (of variable length) linking each species' name to the tip of its branch.
Any help would be much appreciated as well as links to other packages within R that may be able to achieve this.
An example of the newick tree
I don't have any tree examples of what I want, but the description should be self-explanatory: the labels would all be shifted to the far right and aligned on their left side, and a series of dots (.......) would link each label to where its old position was.
MLJTT = newickTree (as a string)
plot.phylo(read.tree(text = MLJTT), show.tip.label = T,use.edge.length = T, no.margin = T, cex = 0.55)
An example of a tree whose layout I want to copy, from here:
OK, I ended up slightly modifying the default plot.phylo code to accommodate such a change. Here's how it looks:
library(ape)
plot.phylo2 <- plot.phylo
environment(plot.phylo2) <- environment(plot.phylo)
body(plot.phylo2)[[c(34, 3, 6, 3, 4, 3)]] <- quote({
mx <- max(xx[1:Ntip])
segments(xx[1:Ntip], yy[1:Ntip] + loy, mx, yy[1:Ntip] + loy,
lty=2, col="grey")
text(mx + lox, yy[1:Ntip] + loy, x$tip.label, adj = adj,
font = font, srt = srt, cex = cex, col = tip.color)
})
This is somewhat fragile and may break with a different version of ape; I've tested it with ape_3.1-4. You can check whether it will work by verifying that
body(plot.phylo)[[c(34, 3, 6, 3, 4, 3)]]
returns
text(xx[1:Ntip] + lox, yy[1:Ntip] + loy, x$tip.label, adj = adj,
font = font, srt = srt, cex = cex, col = tip.color)
just to make sure we are changing the correct line. But the code above basically replaces that line where the labels are drawn by moving the x axis where they are drawn and adding in the segments for the dotted lines. Then you can run this with your test data
MLJTT = read.tree(text="..<sample data>..")
plot.phylo2(MLJTT,
show.tip.label = T,use.edge.length = T, no.margin = T, cex = 0.55)
And this produces
I think what you may be looking for is the argument to plot.phylo:
align.tip.label = TRUE
Have you tried this?
MLJTT <- rtree(100)
plot.phylo(MLJTT, show.tip.label = T, align.tip.label = T, use.edge.length = T, no.margin = T, cex = 0.55)
I am relatively new to R and I am trying to get my head around how to do ordination techniques in R, so that I don't need to use other software.
I am trying to get a PCA with environmental factors in the place of species.
As I have sites which differ qualitatively (in terms of land use) I wanted to be able to show that difference in the final plot (with different colours). Therefore, I used the method a la Gavin Simpson with the package vegan. So far so good. Here is also the code that I used for that:
with(fish, status)
scl <- -1 ## scaling = -1
colvec <- c("red2", "mediumblue")
plot(pond.pca, type = "n", scaling = scl)
with(fish, points(pond.pca, display = "sites", col = colvec[status], scaling = scl, pch = 21, bg = colvec[status]))
head(with(fish, colvec[status]))
text(pond.pca, display = "species", scaling = scl, cex = 0.8, col = "darkcyan")
with(fish, legend("topright", legend = levels(status), bty = "n", col = colvec, pch = 21, pt.bg = colvec))
The problem arises when I try to add arrows for my environmental variables to the ordination plot. If I use biplot or other functions like ordiplot, I will not be able to keep the different colours for my two types of sites, so I don't want to use those. If I use this command:
plot(envfit(pond.pca, PondEnv38, scaling=-1), add=TRUE, col="black")
I get nice arrows, only they are not aligned with (and in some cases are completely opposite to) the environmental variables that I plotted with the code above (line 5). I tried changing the scaling, but I cannot get them to align.
Does anyone know how to deal with that problem?
Any tips would be useful.
It is not clear what you are doing wrong, as you don't provide a reproducible example of the problem and I am having difficulty following your description. Here is a fully worked example for you to follow that does what you seem to be trying to do.
data(varespec)
data(varechem)
ord <- rda(varespec)
set.seed(1)
(fit <- envfit(ord, varechem, perm = 999))
## make up a fake `status`
status <- factor(rep(c("Class1","Class2"), times = nrow(varespec) / 2))
> head(status)
[1] Class1 Class2 Class1 Class2 Class1 Class2
Now plot
layout(matrix(1:2, ncol = 2))
## auto version
plot(fit, add = FALSE)
## manual version with extra things
colvec <- c("red","green")
scl <- -1
plot(ord, type = "n", scaling = scl)
points(ord, display = "sites", col = colvec[status], pch = (1:2)[status])
points(ord, display = "species", pch = "+")
plot(fit, add = TRUE, col = "black")
layout(1)
Which gives
And all the arrows seem to be pointing as they would if you plotted the envfit object directly.
I would like to overlay 2 density plots on the same device with R. How can I do that? I searched the web but I didn't find any obvious solution.
My idea would be to read data from a text file (columns) and then use
plot(density(MyData$Column1))
plot(density(MyData$Column2), add=T)
Or something in this spirit.
use lines for the second one:
plot(density(MyData$Column1))
lines(density(MyData$Column2))
Make sure the limits of the first plot are suitable, though: lines() does not rescale the axes.
ggplot2 is another graphics package that handles things like the range issue Gavin mentions in a pretty slick way. It also handles auto generating appropriate legends and just generally has a more polished feel in my opinion out of the box with less manual manipulation.
library(ggplot2)
#Sample data
dat <- data.frame(dens = c(rnorm(100), rnorm(100, 10, 5))
, lines = rep(c("a", "b"), each = 100))
#Plot.
ggplot(dat, aes(x = dens, fill = lines)) + geom_density(alpha = 0.5)
Here is a base graphics version that takes care of the y-axis limits, adds colors and works for any number of columns.
If we have a data set:
myData <- data.frame(std.normal=rnorm(1000, mean=0, sd=1),
wide.normal=rnorm(1000, mean=0, sd=2),
exponent=rexp(1000, rate=1),
uniform=runif(1000, min=-3, max=3)
)
Then to plot the densities:
dens <- apply(myData, 2, density)
plot(NA, xlim=range(sapply(dens, "[", "x")), ylim=range(sapply(dens, "[", "y")))
mapply(lines, dens, col=1:length(dens))
legend("topright", legend=names(dens), fill=1:length(dens))
Which gives:
Just to provide a complete set, here's a version of Chase's answer using lattice:
library(lattice)
dat <- data.frame(dens = c(rnorm(100), rnorm(100, 10, 5))
, lines = rep(c("a", "b"), each = 100))
densityplot(~dens,data=dat,groups = lines,
plot.points = FALSE, ref = TRUE,
auto.key = list(space = "right"))
which produces a plot like this:
That's how I do it in base (it's actually mentioned in the comments on the first answer, but I'll show the full code here, including the legend, as I cannot comment yet...)
First you need to get the info on the max values for the y axis from the density plots. So you need to actually compute the densities separately first
dta_A <- density(VarA, na.rm = TRUE)
dta_B <- density(VarB, na.rm = TRUE)
Then plot them according to the first answer and define min and max values for the y axis that you just got. (I set the min value to 0)
plot(dta_A, col = "blue", main = "2 densities on one plot",
ylim = c(0, max(dta_A$y, dta_B$y)))
lines(dta_B, col = "red")
Then add a legend to the top right corner
legend("topright", c("VarA","VarB"), lty = c(1,1), col = c("blue","red"))
I took the above lattice example and made a nifty function. There is probably a better way to do this with reshape via melt/cast. (Comment or edit if you see an improvement.)
library(lattice)

multi.density.plot <- function(data, main = paste(names(data), collapse = ' vs '), ...) {
  ## combines multiple density plots together when given a named list
  df <- data.frame()
  for (n in names(data)) {
    ## one block of rows per list element, labelled with the element name
    idf <- data.frame(x = data[[n]], label = rep(n, length(data[[n]])))
    df <- rbind(df, idf)
  }
  densityplot(~x, data = df, groups = label, plot.points = FALSE, ref = TRUE,
              auto.key = list(space = "right"), main = main, ...)
}
Example usage:
multi.density.plot(list(BN1=bn1$V1,BN2=bn2$V1),main='BN1 vs BN2')
multi.density.plot(list(BN1=bn1$V1,BN2=bn2$V1))
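One possible simplification, offered as a sketch rather than a drop-in replacement: stack() from the utils package turns a named list of numeric vectors into a long data frame with columns values and ind, which removes the need for the rbind loop. The function name multi_density_plot2 and the sample data are made up for this example.

```r
library(lattice)

multi_density_plot2 <- function(data, main = paste(names(data), collapse = " vs "), ...) {
  # stack() returns a data frame with columns 'values' and 'ind'
  df <- stack(data)
  densityplot(~values, data = df, groups = ind,
              plot.points = FALSE, ref = TRUE,
              auto.key = list(space = "right"), main = main, ...)
}

# Hypothetical sample data of unequal lengths
set.seed(1)
samples <- list(A = rnorm(50), B = rnorm(80, mean = 3))

long <- stack(samples)   # long format used inside the function
p <- multi_density_plot2(samples)
print(p)                 # lattice objects are drawn when printed
```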
You can use the ggjoy package (since superseded by ggridges). Let's say that we have three different beta distributions such as:
set.seed(5)
b1<-data.frame(Variant= "Variant 1", Values = rbeta(1000, 101, 1001))
b2<-data.frame(Variant= "Variant 2", Values = rbeta(1000, 111, 1011))
b3<-data.frame(Variant= "Variant 3", Values = rbeta(1000, 11, 101))
df<-rbind(b1,b2,b3)
You can get the three different distributions as follows:
library(tidyverse)
library(ggjoy)
ggplot(df, aes(x=Values, y=Variant))+
geom_joy(scale = 2, alpha=0.5) +
scale_y_discrete(expand=c(0.01, 0)) +
scale_x_continuous(expand=c(0.01, 0)) +
theme_joy()
Whenever there are issues of mismatched axis limits, the right tool in base graphics is to use matplot. The key is to leverage the from and to arguments to density.default. It's a bit hackish, but fairly straightforward to roll yourself:
set.seed(102349)
x1 = rnorm(1000, mean = 5, sd = 3)
x2 = rnorm(5000, mean = 2, sd = 8)
xrng = range(x1, x2)
#force the x values at which density is
# evaluated to be the same between 'density'
# calls by specifying 'from' and 'to'
# (and possibly 'n', if you'd like)
kde1 = density(x1, from = xrng[1L], to = xrng[2L])
kde2 = density(x2, from = xrng[1L], to = xrng[2L])
matplot(kde1$x, cbind(kde1$y, kde2$y))
Add bells and whistles as desired (matplot accepts all the standard plot/par arguments, e.g. lty, type, col, lwd, ...).
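To see why from and to matter, here is a small sketch with made-up samples a and b. Without them each density() call picks its own grid, so the y columns cannot be bound against a single x vector. The variable names and the line styling are illustrative choices.

```r
set.seed(1)
a <- rnorm(200)
b <- rnorm(300, mean = 4, sd = 2)
rng <- range(a, b)

# Default calls: each density gets its own x grid
d_free_a <- density(a)
d_free_b <- density(b)

# Shared grid: same from, to and n for both calls
d_a <- density(a, from = rng[1], to = rng[2], n = 512)
d_b <- density(b, from = rng[1], to = rng[2], n = 512)
same_grid <- isTRUE(all.equal(d_a$x, d_b$x))

# type = "l" draws lines instead of matplot's default point characters
matplot(d_a$x, cbind(d_a$y, d_b$y), type = "l", lty = 1,
        col = c("blue", "red"), xlab = "x", ylab = "Density")
legend("topright", legend = c("a", "b"), lty = 1, col = c("blue", "red"))
```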