Ordination Graph in R - r

Before I begin, I am very much a beginner at R, so apologies if this is an easy fix.
I need to create an ordination graph with species labels and a key. I found the 'vegan' package which I believe is relevant. I've read in the table and attached it and also have loaded the package:
Ordination<-read.table("pHWTD.txt", header = T, stringsAsFactors = T)
attach(Ordination)
library(vegan)
ordiplot(Mean_pH ~ Mean_Water_Table_Depth)
After this ordiplot function, the console returns: "Error in NROW(X) : object 'X' not found"
I don't understand this because Mean_pH and Mean_Water_Table_Depth are both variables in the attached table, so I am unsure what is not being found.
Below is a generalised version of the data I need to plot (where each species will be represented by a graphical point with an abbreviated label e.g. Species 1 could be spec.1):
Any advice, if possible, would be appreciated.
Many thanks in advance.
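As far as I know, ordiplot() is meant for plotting an ordination result (or a matrix of scores) rather than a formula of two variables, which may be why the call fails. If the goal is simply one labelled point per species plus a key, base graphics may be enough. The following is a minimal sketch, assuming a data frame with one row per species; the Species column name and all values are made up for illustration and are not taken from pHWTD.txt.

```r
# Hypothetical stand-in for pHWTD.txt: the Species column and all values are assumptions
Ordination <- data.frame(
  Species = c("Species 1", "Species 2", "Species 3"),
  Mean_pH = c(4.1, 5.3, 6.0),
  Mean_Water_Table_Depth = c(12.5, 8.0, 20.3)
)

# Abbreviated labels of the form spec.1, spec.2, ...
spec_labels <- paste0("spec.", seq_len(nrow(Ordination)))

# Scatter plot with a little vertical padding so the labels are not clipped
plot(Mean_pH ~ Mean_Water_Table_Depth, data = Ordination, pch = 16,
     ylim = range(Ordination$Mean_pH) + c(-0.5, 0.5),
     xlab = "Mean water table depth", ylab = "Mean pH")

# Put each abbreviated label just above its point
text(Ordination$Mean_Water_Table_Depth, Ordination$Mean_pH,
     labels = spec_labels, pos = 3)

# Key mapping the abbreviations back to the full species names
legend("topleft", legend = paste(spec_labels, "=", Ordination$Species),
       bty = "n", cex = 0.8)
```

Passing data = Ordination to plot() also avoids the need for attach().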

Related

Is there a way to add species to an ISOMAP plot in R?

I am using the isomap-function from vegan package in R to analyse community data of epiphytic mosses and lichens. I started analysing the data using NMDS but due to the structure of the data ran into problems which is why I switched to ISOMAP which works perfectly well and returns very nice results. So far so good... However, the output of the function does not support plotting of species within the ISOMAP plot as species scores are not available. Anyway, I would really like to add species information to enhance the interpretability of the output.
Does anyone have a solution or a hint for this problem? Is there a way to add species to the plot post hoc, as can be done with environmental data?
I would greatly appreciate any help on this topic!
Thank you and best regards,
Inga
No, there is no function to add species scores to isomap. If there were one, it could look like this:
`sppscores<-.isomap` <-
    function(object, value)
{
    value <- scale(value, center = TRUE, scale = FALSE)
    v <- crossprod(value, object$points)
    attr(v, "data") <- deparse(substitute(value))
    object$species <- v
    object
}
Or alternatively:
`sppscores<-.isomap` <-
    function(object, value)
{
    wa <- vegan::wascores(object$points, value, expand = TRUE)
    attr(wa, "data") <- deparse(substitute(value))
    object$species <- wa
    object
}
If ord is your isomap result and comm are your community data, you can use these as:
sppscores(ord) <- comm # either alternative
I have no idea (yet) which of these alternatives is more correct. The first adds species scores as vectors of their linear increase; the second adds them as their weighted averages in ordination space, expanded so that some species can be more extreme than the site units where they occur.
These add a new element, species, to the result object ord. Using it throughout vegan would need more coding, but you can extract the species scores with vegan::scores. Their scaling is based on the original scale of the community data, so they may be badly scaled relative to the points of the site units, and fixing this would require more work. However, you can plot them separately, or multiply them by a constant that gives a scaling similar to the site unit scores.
sp <- scores(ord, display="species", choices=1:2)
plot(sp, type = "n", asp = 1) # does not allow plotting text
text(sp, labels = rownames(sp)) # so we must add text
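The idea of multiplying the species scores by a constant can be sketched as follows. This is only an illustration with made-up matrices standing in for the site and species scores of ord; matching the largest absolute coordinate is one arbitrary choice of constant, not a recommendation from vegan.

```r
# Hypothetical score matrices standing in for the site and species scores of ord;
# all values are invented for illustration
sites <- matrix(c(-1.2, 0.4, 0.9, -0.3, 0.8, -0.5), ncol = 2,
                dimnames = list(paste0("site", 1:3), c("Dim1", "Dim2")))
sp <- matrix(c(40, -25, 10, -15, 30, 5), ncol = 2,
             dimnames = list(paste0("spec.", 1:3), c("Dim1", "Dim2")))

# One simple constant: make the largest absolute species coordinate
# equal to the largest absolute site coordinate
const <- max(abs(sites)) / max(abs(sp))
sp_scaled <- sp * const

# Set up axes that cover both sets of points, then draw sites and species
plot(rbind(sites, sp_scaled), type = "n", asp = 1)
points(sites, pch = 16)
text(sp_scaled, labels = rownames(sp_scaled), col = "red")
```

Because the constant is the same for every species and axis, the relative positions of the species are unchanged; only their overall spread is brought in line with the site units.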

R: $ cannot be used for atomic vectors (box plot)

I would love some advice on how to fix the error shown in the screenshot. I only started learning R yesterday so I'm not very familiar with it. I tried using % but this produced a different type of error (unexpected input). Are there any problems with how I've defined time_spen and species earlier on in the code?
Do you want to make a boxplot of the time_spen variable for each species? The ggplot2 package can help.
Since I don't have your data, here is an example from the penguins dataset in the palmerpenguins package.
library(palmerpenguins)
library(ggplot2)
ggplot(penguins) +
geom_boxplot(aes(x = species, y = body_mass_g))
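Since the screenshot is not available, the following is only a guess at a common cause: the $ operator works on lists and data frames, not on plain (atomic) vectors. A minimal base R sketch, assuming time_spen is a numeric vector and species is a vector of group labels (the values are invented):

```r
# Hypothetical data; only the names time_spen and species come from the question
time_spen <- c(3.2, 4.1, 5.0, 2.8, 6.3, 5.9)
species <- c("A", "A", "A", "B", "B", "B")

# Using `$` on an atomic vector signals an error, captured here instead of stopping
res <- tryCatch(time_spen$species, error = function(e) e)
inherits(res, "error")

# Putting both vectors in a data frame makes `$` valid and lets the
# formula interface draw one box per species
dat <- data.frame(time_spen, species = factor(species))
summary(dat$time_spen)
bp <- boxplot(time_spen ~ species, data = dat)
```

The same data frame can be passed to ggplot() in place of penguins, with aes(x = species, y = time_spen).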

define population level for PCA analysis in adegenet

I want to perform a PCA analysis in adegenet starting from a genepop file without defined populations.
I imported the data like this:
datapop <- read.genepop('tous.gen', ncode=3, quiet = FALSE)
it works, and I can perform a PCA after scaling the data.
But I would like to plot the results / individuals on the PCA axes according to their population of origin using s.class. I have a vcf file with a three-letter code for each individual. I imported it in R:
pops_list <- read.csv('liste_pops.csv', header=FALSE)
but now how can I use it to define population levels in the genind object datapop?
I tried something like this:
setPop(datapop, formula = NULL)
setPop(datapop) <- pops_list
but it doesn't work; even the first line doesn't work: I get this message:
"Erreur : formula must be a valid formula object."
And then how should I use it in s.class?
thanks
Didier
Without a working example it is kind of hard to tell but perhaps you can find the solution to your problem here: How to add strata information to a genind
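Either way, from your examples and given how the setPop method works, your line setPop(datapop, formula = NULL) would not work because you would not be defining anything. Since setPop expects a formula, assigning the populations directly would instead use the pop accessor, for example (assuming the codes are in the first column, which read.csv names V1 when header=FALSE):
pop(datapop) <- pops_list$V1
while also making sure that the assigned value is a factor with one entry per individual, in the same order as the individuals in datapop.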
I know this is a bit late, but the way to do this is to add pops_list as the strata and then use setPop() to select a certain column:
strata(datapop) <- pops_list
setPop(datapop) <- ~myPop # set the population to the column called "myPop" in the data frame
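One detail the strata approach depends on: the formula ~myPop refers to a column name, and a file read with header=FALSE gets the default name V1. A minimal sketch of preparing the data frame in base R, using invented three-letter codes in place of liste_pops.csv; the adegenet calls are left as comments because datapop and the PCA object (called pca1 here, a hypothetical name) are not reproduced.

```r
# Invented stand-in for liste_pops.csv: one three-letter code per individual
pops_list <- read.csv(text = "AAA\nAAA\nBBB\nBBB", header = FALSE)

# With header = FALSE the single column is called V1, so rename it
# to the name used in the formula
names(pops_list) <- "myPop"
pops_list$myPop <- factor(pops_list$myPop)
table(pops_list$myPop)

# With the genind object from the question, the remaining steps would be
# (not run here; pca1 is a hypothetical dudi.pca result):
# strata(datapop) <- pops_list
# setPop(datapop) <- ~myPop
# s.class(pca1$li, fac = pop(datapop))
```

The rows of pops_list need to be in the same order as the individuals in datapop for the assignment to be meaningful.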

R programming - Graphic edges too large error while using clustering.plot in EMA package

I'm an R programming beginner and I'm trying to implement the clustering.plot method available in R package EMA. My clustering works fine and I can see the results populated as well. However, when I try to generate a heat map using clustering.plot, it gives me an error "Error in plot.new (): graphic edges too large". My code below,
#Loading library
library(EMA)
library(colonCA)
#Some information about the data
data(colonCA)
summary(colonCA)
class(colonCA) #Expression set
#Extract expression matrix from colonCA
expr_mat <- exprs(colonCA)
#Applying average linkage clustering on colonCA data using Pearson correlation
expr_genes <- genes.selection(expr_mat, thres.num=100)
expr_sample <- clustering(expr_mat[expr_genes,],metric = "pearson",method = "average")
expr_gene <- clustering(data = t(expr_mat[expr_genes,]),metric = "pearson",method = "average")
expr_clust <- clustering.plot(tree = expr_sample,tree.sup=expr_gene,data=expr_mat[expr_genes,],title = "Heat map of clustering",trim.heatmap =1)
I do not get any error when it comes to actually executing the clustering process. Could someone help?
In your example, some of the rownames of expr_mat are very long (max(nchar(rownames(expr_mat))) = 271 characters). The clustering.plot function tries to make a margin large enough for all the names, but because the names are so long, there isn't room for anything else.
The really long names seem to have long stretches of periods in them. One way to condense the names of these genes is to replace runs of 2 or more periods with just one, so I would add in this line
#Extract expression matrix from colonCA
expr_mat <- exprs(colonCA)
rownames(expr_mat) <- gsub("\\.{2,}", ".", rownames(expr_mat))
Then you can run all the other commands and plot like normal.
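To see what the substitution does without loading colonCA, here is a small sketch on invented names (these are not actual rownames from the dataset):

```r
# Invented gene names imitating long runs of periods
long_names <- c("Hsa.3004", "Hsa.13491......gene....name")

# Collapse every run of two or more periods into a single period
short_names <- gsub("\\.{2,}", ".", long_names)
short_names

# Compare the name lengths before and after
nchar(long_names)
nchar(short_names)
```

Names that contain only single periods are left unchanged, so the substitution is safe to apply to all rownames at once.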

r - Add text to each lattice histogram with panel.text but has error "object x is missing"

In the following R code, I try to create 30 histograms for the variable allowed.clean by the factor zip_cpt (which has 30 levels).
For each of these histograms, I also want to add mean and sample size--they need to be calculated for each level of the factor zip_cpt. So I used panel.text to do this.
After I run this code, I get an error message inside each histogram which reads "Error using packet 21..."x" is missing, with..." (I am not able to read the whole error message because it is not displayed in full). I guess there's something wrong with the object x. Is it because mean(x) and length(x) don't actually apply to the data at each level of the factor zip_cpt?
I appreciate any help!
histogram(~allowed.clean|zip_cpt, data=cpt.IC_CAB1,
          type='density',
          nint=100,
          breaks=NULL,
          layout=c(10,3),
          scales=list(y=list(relation="free"),
                      x=list(relation="free")),
          panel=function(x,...) {
            mean.values <- mean(x)
            sample.n <- length(x)
            panel.text(lab=paste("Sample size = ",sample.n))
            panel.text(lab=paste("Mean = ",mean.values))
            panel.histogram(x,col="pink", ...)
            panel.mathdensity(dmath=dnorm, col="black",
                              args=list(mean=mean(x, na.rm = TRUE),sd=sd(x, na.rm = TRUE)), ...)})
A discussion I found online is helpful for adding customized text (e.g., basic statistics) on each of the histograms:
https://stat.ethz.ch/pipermail/r-help/2007-March/126842.html
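A likely reading of the "x is missing" message is that panel.text() is called without coordinates: it needs an x and y position in addition to the label. Inside a panel function, x already holds only the data for the current level of the factor, so mean(x) and length(x) are per-panel values. The following is a minimal sketch on simulated data (the toy data frame is an assumption, not cpt.IC_CAB1) that places the text near the top-left corner of each panel using current.panel.limits().

```r
library(lattice)

set.seed(1)
# Simulated stand-in for cpt.IC_CAB1 with three factor levels instead of 30
toy <- data.frame(
  allowed.clean = rnorm(300, mean = 100, sd = 15),
  zip_cpt = factor(rep(c("A", "B", "C"), each = 100))
)

p <- histogram(~ allowed.clean | zip_cpt, data = toy,
               type = "density", nint = 30,
               panel = function(x, ...) {
                 # Draw the histogram first so the text ends up on top
                 panel.histogram(x, col = "pink", ...)
                 # Axis limits of the current panel, used to position the text
                 lims <- current.panel.limits()
                 x_pos <- lims$xlim[1] + 0.03 * diff(lims$xlim)
                 y_top <- lims$ylim[2]
                 # panel.text() needs x and y coordinates as well as labels
                 panel.text(x_pos, y_top - 0.08 * diff(lims$ylim),
                            labels = paste("Sample size =", sum(!is.na(x))),
                            pos = 4)
                 panel.text(x_pos, y_top - 0.18 * diff(lims$ylim),
                            labels = paste("Mean =", round(mean(x, na.rm = TRUE), 1)),
                            pos = 4)
               })
print(p)
```

The offsets 0.03, 0.08 and 0.18 are arbitrary fractions of the panel range and can be adjusted to taste.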
