Scaling in vegan RDA plots, how is it working? - r

I don't understand how scaling works in vegan when plotting ordinations.
I found this question, which will help clarify my point. From what I read in the book "Numerical Ecology with R", there are differences between scaling = 1 and scaling = 2. In particular, with scaling 1 "The angles among descriptor vectors do not reflect their correlations", while with scaling 2 "The angles between descriptors in the biplot reflect their correlations".
So I ran this code (partially copy-pasted from the cited question) and got two different plots (the axis spans differ, so the scaling parameter is presumably doing something), but I don't see much difference between the angles of the descriptor vectors, so I am trying to understand what, if anything, is wrong.
What am I missing here?
library("vegan")
data(varespec)
data(varechem)
ord <- rda(varespec)
set.seed(1)
(fit <- envfit(ord, varechem, perm = 999))
## make up a fake `status`
status <- factor(rep(c("Class1","Class2"), times = nrow(varespec) / 2))
## manual version with extra things
colvec <- c("red","green")
scl <- 1
plot(ord, type = "n", scaling = scl, main="Scaling 1")
points(ord, display = "sites", col = colvec[status], pch = (1:2)[status])
points(ord, display = "species", pch = "+")
plot(fit, add = TRUE, col = "black")
dev.new()
scl <- 2
plot(ord, type = "n", scaling = scl, main="Scaling 2")
points(ord, display = "sites", col = colvec[status], pch = (1:2)[status])
points(ord, display = "species", pch = "+")
plot(fit, add = TRUE, col = "black")
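One way to check whether the scaling argument has any effect is to compare the species scores directly instead of judging the plots by eye. The following is a minimal sketch, assuming the varespec data from above; the helper angle() is made up for illustration. It also assumes that the low-level points() method for vegan ordinations takes its own scaling argument, so that a plot built by hand needs the same scaling passed to every layer, not only to the initial plot() call.

```r
library("vegan")
data(varespec)
ord <- rda(varespec)

# Species scores on the first two axes under each scaling
sp1 <- scores(ord, display = "species", choices = 1:2, scaling = 1)
sp2 <- scores(ord, display = "species", choices = 1:2, scaling = 2)

# Hypothetical helper: angle (in degrees) of each species vector from axis 1
angle <- function(xy) atan2(xy[, 2], xy[, 1]) * 180 / pi
head(cbind(scaling1 = angle(sp1), scaling2 = angle(sp2)))

# When building a plot by hand, pass the same scaling to every layer
plot(ord, type = "n", scaling = 1, main = "Scaling 1")
points(ord, display = "sites", scaling = 1)
points(ord, display = "species", scaling = 1, pch = "+")
```

If the two columns of angles differ, the scaling is being applied to the scores, and any lack of difference in the plots comes from how the layers were drawn.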

Related

Homogenizing scale for density plot

I am making a series of plots from a point pattern (PPP) with the density (kernel) function. I would like the maximum plotted value to be 200 in all cases, with the heatmap colours adjusted accordingly (two of the images only go up to 100). I have not been able to find a solution to this problem using base R plotting.
Microglia_Density <- density(Microglia_PPP, sigma =0.1, equal.ribbon = TRUE, col = topo.colors, main = "")
plot(Microglia_Density, main = "Microglia density")
Astrocytes_Density <- density(Astrocytes_PPP, sigma =0.1, equal.ribbon = TRUE, col = topo.colors, main = "")
plot(Astrocytes_Density, main = "Astrocytes density")
Neurons_Density <- density(Neurons_PPP, sigma =0.1, equal.ribbon = TRUE, col = topo.colors, main = "")
plot(Neurons_Density, main = "Neuronal density")
I would appreciate recommendations. Regards
Since we don’t have access to your data I simulate fake data in a square.
There are several options to do what you want. First you should know that
density() is a generic function, so when you invoke it on a ppp like
Microglia_PPP actually the function density.ppp() is invoked.
This function returns an im object (effectively a 2-d “image” of values).
You plot this with plot() which in turn calls plot.im(), so you should
read the help file of plot.im(), where it says that the argument col
controls the colours used in the plot. Either you can make a colour map
covering the range of values you are interested in and supply that, or if you
know that one of the images has the colour map you want to use you can save
it and reuse for the others:
library(spatstat)
set.seed(42)
Microglia_PPP <- runifpoint(100)
Neurons_PPP <- runifpoint(200)
Neurons_Density <- density(Neurons_PPP, sigma = 0.1)
Microglia_Density <- density(Microglia_PPP, sigma = 0.1)
my_colourmap <- plot(Neurons_Density, main = "Neuronal density", col = topo.colors)
plot(Microglia_Density, main = "Microglia density", col = my_colourmap)
Notice the colour maps are the same, but they only cover the range from
approximately 80 to 310. Any values of the image outside this range will not
be plotted, so they appear white.
You can make a colour map first and then use it for all the plots
(see help(colourmap)):
my_colourmap <- colourmap(topo.colors(256), range = c(40,315))
plot(Neurons_Density, main = "Neuronal density", col = my_colourmap)
plot(Microglia_Density, main = "Microglia density", col = my_colourmap)
Finally another solution if you want the images side by side is to make them
an imlist (image list) and use plot.imlist() with equal.ribbon = TRUE:
density_list <- as.imlist(list(Neurons_Density, Microglia_Density))
plot(density_list, equal.ribbon = TRUE, main = "")
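Rather than hard-coding the limits passed to colourmap(), the shared range can be taken from the pixel values of all the images. This is a minimal sketch, assuming simulated patterns like the ones above and assuming that the pixel values of an im object are stored in its v component; the names common_range and shared_map are made up for illustration.

```r
library(spatstat)

set.seed(42)
Microglia_Density <- density(runifpoint(100), sigma = 0.1)
Neurons_Density <- density(runifpoint(200), sigma = 0.1)

# Common range of pixel values across both images
common_range <- range(Microglia_Density$v, Neurons_Density$v, na.rm = TRUE)

# One colour map spanning that range, reused for every plot
shared_map <- colourmap(topo.colors(256), range = common_range)
plot(Neurons_Density, main = "Neuronal density", col = shared_map)
plot(Microglia_Density, main = "Microglia density", col = shared_map)
```

To cap the scale at 200 as asked in the question, range = c(0, 200) could be used instead; as noted above, pixels with values outside the range would then be left white.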

Plot percentage change figure with 95% CI and stats

I am planning to reproduce the attached figure, but I have no clue how to do so:
Let's say I am using the CO2 example dataset and would like to plot the relative change in uptake according to Treatment. Instead of the three variables in the example figure, I would like to show the different Plants grouped for each day/Type.
So far, I managed only to get this bit of code, but this is far away from what it should look like.
aov1 <- aov(CO2$uptake~CO2$Type+CO2$Treatment+CO2$Plant)
plot(TukeyHSD(aov1, conf.level=.95))
The axes should be switched, and I would like to indicate statistically significant changes with letters or stars.
You can do this by building it in base R - this should get you started. See comments in code for each step, and I suggest running it line by line to see what's being done to customize for your specifications:
Set up data
# Run model
aov1 <- aov(CO2$uptake ~ CO2$Type + CO2$Treatment + CO2$Plant)
# Organize plot data
aov_plotdata <- data.frame(coef(aov1), confint(aov1))[-1,] # remove intercept
aov_plotdata$coef_label <- LETTERS[1:nrow(aov_plotdata)] # Example labels
Build plot
#set up plot elements
xvals <- 1:nrow(aov_plotdata)
yvals <- range(aov_plotdata[,2:3])
# Build plot
plot(x = range(xvals), y = yvals, type = 'n', axes = FALSE, xlab = '', ylab = '') # set up blank plot
points(x = xvals, y = aov_plotdata[,1], pch = 19, col = xvals) # add in point estimate
segments(x0 = xvals, y0 = aov_plotdata[,2], y1 = aov_plotdata[,3], lty = 1, col = xvals) # add in 95% CI lines
axis(1, at = xvals, label = aov_plotdata$coef_label) # add in x axis
axis(2, at = seq(floor(min(yvals)), ceiling(max(yvals)), 10)) # add in y axis
segments(x0=min(xvals), x1 = max(xvals), y0=0, lty = 2) #add in midline
legend(x = max(xvals)-2, y = max(yvals), aov_plotdata$coef_label, bty = "n", # add in legend
pch = 19,col = xvals, ncol = 2)
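For the "axes switched" and "stars" parts of the question, the same approach can be turned on its side. This is a minimal sketch, not a definitive implementation: the column names coef, lower, upper and star are made up for illustration, and a star here only marks a coefficient whose 95% CI excludes zero, which is an assumption about what "significant" should mean and not a post-hoc comparison such as TukeyHSD().

```r
# Same model as above, written with a data argument
aov1 <- aov(uptake ~ Type + Treatment + Plant, data = CO2)

# Estimates and 95% confidence limits, matched by coefficient name
cf <- coef(aov1)
ci <- confint(aov1)[names(cf), , drop = FALSE]
est <- data.frame(coef = cf, lower = ci[, 1], upper = ci[, 2])[-1, ]  # drop intercept

# Assumption: flag a coefficient when its 95% CI excludes zero
est$star <- ifelse(est$lower > 0 | est$upper < 0, "*", "")

yvals <- seq_len(nrow(est))
xlim <- extendrange(c(est$lower, est$upper), f = 0.1)  # leave room for the stars

op <- par(mar = c(4, 8, 2, 2))  # wider left margin for the labels
plot(x = xlim, y = range(yvals), type = "n", yaxt = "n",
     xlab = "Estimate (95% CI)", ylab = "")
abline(v = 0, lty = 2)                                      # reference line at zero
segments(x0 = est$lower, x1 = est$upper, y0 = yvals)        # horizontal CI lines
points(x = est$coef, y = yvals, pch = 19)                   # point estimates
axis(2, at = yvals, labels = rownames(est), las = 1)        # coefficient names
text(x = est$upper, y = yvals, labels = est$star, pos = 4)  # stars right of each CI
par(op)
```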

Using Bxp function in R with varwidth

I am quite new to R programming and have been given the task of representing some data in a boxplot. We were only provided the five-number summary of the data, i.e. the lowest value, lower quartile, median, upper quartile and highest value. We are also told the number of samples (n).
I read that bxp is a function similar to boxplot, but that it draws the boxplot from this five-number summary.
However, I know varwidth can be used to make the width of the boxes proportional to n, yet it does not seem to work here, as all boxes are the same width. This is what I need help with.
MORSEYear1 <- c(18.2,58.5,64.4,73.4,91.1)
MORSEYear2 <- c(22.3,56.4,64.3,75.7,97.4)
MORSEYear3 <- c(29.1,57.9,66.6,73.4,86.0)
MathStatYear1 <- c(46.8,54.8,66.1,71.4,84.1)
MathStatYear2 <- c(35.1,47.8,57.8,65.7,82.8)
MathStatYear3 <- c(32.6,56.3,61.1,75.6,89.4)
MORSE1<-list(stats=matrix(MORSEYear1,MORSEYear1[5],MORSEYear1[1]), n=139)
MORSE2<-list(stats=matrix(MORSEYear2,MORSEYear2[5],MORSEYear2[1]), n=132)
MORSE3<-list(stats=matrix(MORSEYear3,MORSEYear3[5],MORSEYear3[1]), n=131)
MS1 <- list(stats=matrix(MathStatYear1,MathStatYear1[5],MathStatYear1[1]), n= 21)
MS2 <- list(stats=matrix(MathStatYear2,MathStatYear2[5],MathStatYear2[1]), n=20)
MS3 <- list(stats=matrix(MathStatYear3,MathStatYear3[5],MathStatYear3[1]), n= 14)
bxp(MORSE1, xlim = c(0.5,6.5),ylim = c(0,100),varwidth= TRUE, main = "Graph comparing distribution of marks across different years of MORSE and MathStat",ylab = "Marks", xlab = "Course and year of study (Course,Year)", axes = FALSE)
par(new=T)
bxp(MORSE2, xlim = c(-0.5,5.5), ylim = c(0,100),axes= TRUE, varwidth=TRUE)
par(new=T)
bxp(MORSE3, xlim = c(-1.5,4.5), ylim = c(0,100), varwidth=TRUE, axes = FALSE)
par(new=T)
bxp(MS1, xlim = c(-2.5,3.5), ylim = c(0,100), varwidth=TRUE, axes = FALSE)
par(new=T)
bxp(MS2, xlim = c(-3.5,2.5), ylim = c(0,100), varwidth=TRUE, axes = FALSE)
par(new=T)
bxp(MS3, xlim = c(-4.5,1.5), ylim = c(0,100), varwidth=TRUE, axes = FALSE)
NOTE: My supervisor said to use par(new=T) and change the xlim to plot multiple graphs using bxp(), if someone could verify if this is the best method or not that would be great!
Thanks
Stumbled upon the same problem, without much experience with R.
The varwidth argument of the bxp() function requires multiple boxplots being plotted at once. Adding to an initial plot does not count, as no readjustment is possible after the fact.
The question is how to construct a multidimensional z argument for bxp(). To answer this, a look at the result of something like boxplot(c(c(1,1),c(2,2))~c(c(11,11),c(22,22))) helps.
First, a generic example with made-up data to aid anyone that lands here:
# data
d1 <- c(1,2,3,4,5)
d2 <- c(1,2,3,5,8,13,21,34)
# summaries (generated with quantile and structured accordingly)
z1 <- list(
stats=matrix(quantile(d1, c(0.05,0.25,0.5,0.75,0.85))),
n=length(d1)
)
z2 <- list(
stats=matrix(quantile(d2, c(0.05,0.25,0.5,0.75,0.85))),
n=length(d2)
)
# merging the summaries appropriately
z <- list(
stats=cbind(z1$stats,z2$stats),
n=c(z1$n,z2$n)
)
# check result
print(z)
# call bxp with needed parameters ("at" can/should also be used here)
bxp(z=z,varwidth=TRUE)
In the case of the original question, one should merge MORSE# and MS#. The code is far from optimal (there might be a better way to merge, and a function could be written for this), but the aim is ugly clarity and simplicity:
# each *$stats must be a 5 x 1 matrix, e.g. matrix(MORSEYear1, ncol = 1)
z <- list(
stats=cbind(MORSE1$stats, MORSE2$stats, MORSE3$stats, MS1$stats, MS2$stats, MS3$stats),
n=c(MORSE1$n, MORSE2$n, MORSE3$n, MS1$n, MS2$n, MS3$n)
)
bxp(z=z,varwidth=TRUE)
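The same result can be reached by building the summary straight from the five-number vectors given in the question, which avoids the intermediate one-box lists. This is a minimal sketch: the column labels are made up for illustration, and names is included so that the axis labels do not depend on bxp() defaults.

```r
# Five-number summaries from the question, one column per group
stats <- cbind(
  "MORSE,1"    = c(18.2, 58.5, 64.4, 73.4, 91.1),
  "MORSE,2"    = c(22.3, 56.4, 64.3, 75.7, 97.4),
  "MORSE,3"    = c(29.1, 57.9, 66.6, 73.4, 86.0),
  "MathStat,1" = c(46.8, 54.8, 66.1, 71.4, 84.1),
  "MathStat,2" = c(35.1, 47.8, 57.8, 65.7, 82.8),
  "MathStat,3" = c(32.6, 56.3, 61.1, 75.6, 89.4)
)

z <- list(
  stats = stats,                         # 5 x 6 matrix, one box per column
  n     = c(139, 132, 131, 21, 20, 14),  # sample sizes, used by varwidth
  names = colnames(stats)                # axis labels
)

# A single call draws all boxes, so their widths can be scaled relative to each other
bxp(z, varwidth = TRUE, ylim = c(0, 100),
    ylab = "Marks", xlab = "Course and year of study (Course,Year)")
```

With all six boxes in one call there is no need for par(new = TRUE) or shifted xlim values.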

Customising vegan ordination plot

I have a dataset with 100 species, which makes the plot very cluttered. So I want to pick out a subset of these species and plot them in an RDA plot. I have been following this guideline.
The code looks like this:
## load vegan
require("vegan")
## load the Dune data
data(dune, dune.env)
## PCA of the Dune data
mod <- rda(dune, scale = TRUE)
## plot the PCA
plot(mod, scaling = 3)
## build the plot up via vegan methods
scl <- 3 ## scaling == 3
colvec <- c("red2", "green4", "mediumblue")
plot(mod, type = "n", scaling = scl)
with(dune.env, points(mod, display = "sites", col = colvec[Use],
scaling = scl, pch = 21, bg = colvec[Use]))
text(mod, display = "species", scaling = scl, cex = 0.8, col = "darkcyan")
with(dune.env, legend("topright", legend = levels(Use), bty = "n",
col = colvec, pch = 21, pt.bg = colvec))
This is the plot you end up with. Now I would really like to remove some of the species from the plot, but not from the analysis, so that the plot only shows, for example, Salrep, Viclat, Aloge and Poatri.
Help is appreciated.
The functions you are doing the actual plotting with have an argument select (at least text.cca() and points.cca() do). select takes either a logical vector with one element per item, indicating whether the ith item should be plotted, or the (numeric) indices of the items to plot. The example would then become:
## Load vegan
library("vegan")
## load the Dune data
data(dune, dune.env)
## PCA of the Dune data
mod <- rda(dune, scale = TRUE)
## plot the PCA
plot(mod, scaling = 3)
## build the plot up via vegan methods
scl <- 3 ## scaling == 3
colvec <- c("red2", "green4", "mediumblue")
## Show only these spp
sppwant <- c("Salirepe", "Vicilath", "Alopgeni", "Poatriv")
sel <- names(dune) %in% sppwant
## continue plotting
plot(mod, type = "n", scaling = scl)
with(dune.env, points(mod, display = "sites", col = colvec[Use],
scaling = scl, pch = 21, bg = colvec[Use]))
text(mod, display = "species", scaling = scl, cex = 0.8, col = "darkcyan",
select = sel)
with(dune.env, legend("topright", legend = levels(Use), bty = "n",
col = colvec, pch = 21, pt.bg = colvec))
Which gives you:
You may also use the ordiselect() function from the goeveg-package:
https://CRAN.R-project.org/package=goeveg
It offers selection of species for ordination plots based on abundances and/or species fit to axes.
## Select spp. with filter: 50% most abundant and 50% best fitting
library(goeveg)
sel <- ordiselect(dune, mod, ablim = 0.5, fitlim = 0.5)
sel # 12 species selected
The result object of the function (containing the names of selected species) can be put into the select argument (as described above).
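Because select accepts either a logical vector or numeric indices, it can help to check the selection before plotting, in particular that the requested names match the column names exactly (the abbreviations typed in the question, such as Salrep, would match nothing). This is a minimal sketch in base R; the vector spp is a hard-coded handful of dune species names standing in for names(dune), so the example runs without vegan.

```r
# Assumption: a few of the dune column names, hard-coded for illustration
spp <- c("Achimill", "Alopgeni", "Poatriv", "Salirepe", "Vicilath")

# Names as typed in the question versus the names stored in the data
typed <- c("Salrep", "Viclat", "Aloge", "Poatri")
sppwant <- c("Salirepe", "Vicilath", "Alopgeni", "Poatriv")

setdiff(typed, spp)    # names that do not occur in the data
setdiff(sppwant, spp)  # empty when every wanted name exists

sel_logical <- spp %in% sppwant  # one TRUE/FALSE per species
sel_index <- which(sel_logical)  # the same selection as positions

spp[sel_logical]
spp[sel_index]
```

Either sel_logical or sel_index could then be passed as select in the text() call.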

Add arrows in RDA in R

I am relatively new to R and I am trying to get my head around how to do ordination techniques in R, so that I don't need to use other software.
I am trying to get a PCA with environmental factors in the place of species.
As I have sites which differ qualitatively (in terms of land use) I wanted to be able to show that difference in the final plot (with different colours). Therefore, I used the method a la Gavin Simpson with the package vegan. So far so good. Here is also the code that I used for that:
with(fish, status)
scl <- -1 ## scaling = -1
colvec <- c("red2", "mediumblue")
plot(pond.pca, type = "n", scaling = scl)
with(fish, points(pond.pca, display = "sites", col = colvec[status], scaling = scl, pch = 21, bg = colvec[status]))
head(with(fish, colvec[status]))
text(pond.pca, display = "species", scaling = scl, cex = 0.8, col = "darkcyan")
with(fish, legend("topright", legend = levels(status), bty = "n", col = colvec, pch = 21, pt.bg = colvec))
The problem arises when I try to add arrows for my environmental variables to the ordination plot. If I use biplot and other functions like ordiplot etc., I'll not be able to keep the different colours for my two types of sites, so I don't want to use those. If I use the command here:
plot(envfit(pond.pca, PondEnv38, scaling=-1), add=TRUE, col="black")
I get nice arrows, only they are not aligned (and in some cases are completely opposite) with the environmental variables that I've plotted with the code before (line 5). I tried to change the scaling, but they just cannot be made to align.
Does anyone know how to deal with that problem?
Any tips would be useful.
It is not clear what you are doing wrong, as you don't provide a reproducible example of the problem and I am having difficulty following your description of what is wrong. Here is a fully worked example for you to follow that does what you seem to be trying to do.
library("vegan")
data(varespec)
data(varechem)
ord <- rda(varespec)
set.seed(1)
(fit <- envfit(ord, varechem, perm = 999))
## make up a fake `status`
status <- factor(rep(c("Class1","Class2"), times = nrow(varespec) / 2))
head(status)
## [1] Class1 Class2 Class1 Class2 Class1 Class2
Now plot
layout(matrix(1:2, ncol = 2))
## auto version
plot(fit, add = FALSE)
## manual version with extra things
colvec <- c("red","green")
scl <- -1
plot(ord, type = "n", scaling = scl)
## pass the same scaling to each layer so the points match the axes
points(ord, display = "sites", scaling = scl, col = colvec[status], pch = (1:2)[status])
points(ord, display = "species", scaling = scl, pch = "+")
plot(fit, add = TRUE, col = "black")
layout(1)
Which gives
And all the arrows seem to be pointing as they would if you plotted the envfit object directly.
