NMDS plot with vegan not coloured by groups - r

I am trying to use a script to plot an NMDS that worked perfectly before, but since I changed R version (now R 4.1.2 on Ubuntu 20.04) I can no longer get coloured NMDS graphs.
I get this error:
"species scores not available"
I get the correct NMDS representation, but I cannot get it coloured according to "TypeV" (see the code below), which worked before.
My data: TESTNMDS.csv
data <- read.table('TESTNMDS.csv',header = T, sep = ",")
coldit=c( "#FFFF33","#E141B9", "#33FF66", "#3333FF")
abundance.matrix <- data[,3:50]
for (j in ncol(abundance.matrix):1) if (colSums(abundance.matrix[j]) < 1)
abundance.matrix <-abundance.matrix[ , -j]
dist_data<-vegdist(abundance.matrix^0.5, method='bray')
nmds <- metaMDS(dist_data, trace =TRUE, try=500)
plot(nmds)
> plot(nmds)
species scores not available
If I try to plot with the various representations (that worked before), I cannot get colours:
orditorp(nmds,display="sites",col=coldit[data$TypeV])
ordipointlabel(nmds, display = "sites", cex=1, pch=16, col=coldit[data$TypeV])
ordispider(nmds, groups=data$TypeV,label=TRUE)
ordihull(nmds, groups=data$TypeV, lty="dotted")
THANKS A LOT !

The main problem is that, since R 4.0.0, read.table() no longer automatically converts character columns to factors (the default is now stringsAsFactors = FALSE), so you have to do that explicitly. Here is your code with a few simplifications:
library(vegan)
data <- read.table('TESTNMDS.csv',header = T, sep = ",")
data$TypeV <- factor(data$TypeV) # make TypeV a factor
coldit=c( "#FFFF33","#E141B9", "#33FF66", "#3333FF")
abundance.matrix <- data[,3:50]
idx <- colSums(abundance.matrix) > 0
abundance.matrix <- abundance.matrix[ , idx]
dist_data<-vegdist(abundance.matrix^0.5, method='bray')
nmds <- metaMDS(dist_data, trace =TRUE, try=500)
plot(nmds)
orditorp(nmds,display="sites",col=coldit[data$TypeV])
You do not need a loop to eliminate columns since R is vectorized.
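To see why the factor conversion matters for the colour lookup, here is a minimal sketch. The group labels in type_chr are made up for illustration (the real values of TypeV are not shown in the question); only coldit comes from the code above.

```r
coldit <- c("#FFFF33", "#E141B9", "#33FF66", "#3333FF")

# Hypothetical group labels standing in for data$TypeV
type_chr <- c("A", "B", "A", "D", "C")

# Character indexing looks up element *names*; coldit has none, so every result is NA
cols_chr <- coldit[type_chr]
print(cols_chr)

# A factor is indexed by its integer codes (A = 1, B = 2, C = 3, D = 4)
type_fct <- factor(type_chr)
cols_fct <- coldit[type_fct]
print(cols_fct)
```

With a character column, every colour is NA and the points fall back to being uncoloured, which matches the symptom described in the question.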

Related

Saving output plot in R with grid.grab() doesn't work

I've been trying to save multiple plots generated with the meta package in R, which is used to conduct meta-analyses, but I am having some trouble. I need to save these plots so that I can arrange them in a multi-panel figure.
Example data:
s <- data.frame(Study = paste0("Study", 1:15),
event.e = sample(1:100, 15),
n.e = sample(100:300, 15))
meta1 <- meta::metaprop(event = event.e,
n= n.e,
data=s,
studlab = Study)
Here is the code:
meta::funnel(meta1)
funnelplot <- grid::grid.grab()
I can see the figure in the "Plots" tab in RStudio. However, if I look up the funnelplot object in the environment, it says that it is of type "NULL", and obviously trying to recall it doesn't work.
How can I fix it?

How do I plot multiple lines on the same graph?

I am using R. I am trying to use the "lines" command in ggplot2 to show the predicted values vs. the actual values for a statistical model (ARIMA, time series). Yet, when I run the code, I can only see a line of one color.
I simulated some data in R and then tried to make plots that show actual vs predicted:
#set seed
set.seed(123)
#load libraries
library(xts)
library(stats)
#create data
date_decision_made = seq(as.Date("2014/1/1"), as.Date("2016/1/1"),by="day")
date_decision_made <- format(as.Date(date_decision_made), "%Y/%m/%d")
property_damages_in_dollars <- rnorm(731,100,10)
final_data <- data.frame(date_decision_made, property_damages_in_dollars)
#aggregate
y.mon<-aggregate(property_damages_in_dollars~format(as.Date(date_decision_made),
format="%W-%y"),data=final_data, FUN=sum)
y.mon$week = y.mon$`format(as.Date(date_decision_made), format = "%W-%y")`
ts = ts(y.mon$property_damages_in_dollars, start = c(2014,1), frequency = 12)
#statistical model
fit = arima(ts, order = c(4, 1, 1))
Here were my attempts at plotting the graphs:
#first attempt at plotting (no second line?)
plot(fit$residuals, col="red")
lines(fitted(fit),col="blue")
#second attempt at plotting (no second line?)
par(mfrow = c(2,1),
oma = c(0,0,0,0),
mar = c(2,4,1,1))
plot(ts, main="as-is") # plot original sim
lines(fitted(fit), col = "red") # plot fitted values
legend("topleft", legend = c("original","fitted"), col = c("black","red"),lty = 1)
#third attempt (plot actual, predicted and 5 future values - here, the actual and future values show up, but not the predicted)
pred = predict(fit, n.ahead = 5)
ts.plot(ts, pred$pred, lty = c(1,3), col=c(5,2))
However, none of these seem to be working correctly. Could someone please tell me what I am doing wrong? (note: the computer I am using for my work does not have an internet connection or a usb port - it only has R with some preloaded packages. I do not have access to the forecast package.)
Thanks
Sources:
In R plot arima fitted model with the original series
R fitted ARIMA off by one timestep? pkg:Forecast
Plotting predicted values in ARIMA time series in R
You seem to be confusing a couple of things:
fitted usually does not work on an object of class arima. Usually, you can load the forecast package first and then use fitted. But since you do not have access to the forecast package, you cannot use fitted(fit): it always returns NULL. I have had problems with fitted before.
You want to compare the actual series (x) to the fitted series (y), yet in your first attempt you work with the residuals (e = x - y).
You say you are using ggplot2, but actually you are not.
So here is a small example on how to plot the actual series and the fitted series without ggplot.
set.seed(1)
x <- cumsum(rnorm(10))
y <- stats::arima(x, order = c(1, 0, 0))
plot(x, col = "red", type = "l")
lines(x - y$residuals, col = "blue")
I hope this answer helps you get back on track.
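The same idea (fitted = actual - residual) can be combined with the forecasts from the third attempt in the question. This is only a sketch using base R and stats: the simulated monthly series x and the order c(1, 1, 0) are placeholders chosen for illustration, not the asker's data or model.

```r
set.seed(123)

# Simulated monthly series (placeholder for the real data)
x <- ts(cumsum(rnorm(60)), start = c(2014, 1), frequency = 12)

# Fit an ARIMA model with stats::arima (order chosen only for illustration)
fit <- arima(x, order = c(1, 1, 0))

# Without the forecast package: fitted values = actual values - residuals
fitted_vals <- x - residuals(fit)

# Forecast 5 steps ahead
pred <- predict(fit, n.ahead = 5)

# Actual (black), fitted (red) and forecast (blue, dotted) on one graph
ts.plot(x, fitted_vals, pred$pred,
        col = c("black", "red", "blue"), lty = c(1, 1, 3))
legend("topleft", legend = c("actual", "fitted", "forecast"),
       col = c("black", "red", "blue"), lty = c(1, 1, 3))
```

ts.plot puts all three series on a common time axis, so the forecast continues to the right of the observed data instead of being drawn over it.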

R correlation analysis: trying to reproduce a ggcorrplot with a subset of variables vs a different subset, using ggcorrplot2 instead

I am trying to make a correlation plot with a subset of variables versus a different subset.
Using the mtcars data I do the following using ggcorrplot:
data(mtcars)
corrtest <- psych::corr.test(mtcars[,1:7], adjust="none")
all_matrix <- corrtest$r
all_pmat <- corrtest$p
pheno_markers <- names(mtcars)[1:4]
serol_markers <- names(mtcars)[5:7]
sub_matrix <- all_matrix[pheno_markers, serol_markers]
sub_pmat <- all_pmat[pheno_markers, serol_markers]
grDevices::pdf(file="heat_duo.pdf", height=4, width=4)
print(
ggcorrplot::ggcorrplot(sub_matrix, p.mat=sub_pmat, method="circle")
)
grDevices::dev.off()
This produces the following plot, which is good:
Now I want to reproduce the same plot with ggcorrplot2 instead, because it allows me to overlay the significance levels of the comparisons as ***. I usually use this package with no problem, but I cannot seem to get this case right; it seems it can only deal with symmetrical matrices with colnames == rownames...
I tried the following:
grDevices::pdf(file="heat_duo2.pdf", height=4, width=4)
print(
ggcorrplot2::ggcorrplot(sub_matrix, p.mat=sub_pmat, method="circle",
insig = "label_sig", sig.lvl = c(0.05, 0.01, 0.001))
)
grDevices::dev.off()
But the result is obviously wrong:
Any idea on how to deal with a case like this in ggcorrplot2 (ggcorrplot makes it so easy)?

R: Find outliers with mvBACON

I'm new to R and working on an assignment where I am supposed to replicate the results from a linear regression (time series data with 1360 observations and 52 variables, 11 of which are in the regression model). In the original study the researchers identified outliers with the Hadi method. It seems that this is best done in R with the mvBACON function. Is this correct? I cannot seem to find a good explanation of how to use it, though. Could anyone please tell me how I can use this function to find the outliers?
(I would very much appreciate an answer that is explained as simply as possible since R is very new to me).
Thank you very much!
Yes, mvBACON is for outlier identification based on a distance measure. The default one is the Mahalanobis distance.
The following code will walk you through a simple example on the mtcars subdataset on how to identify outliers with mvBACON:
# load packages
library(dplyr)
library(magrittr)
# Use mtcars (sub)dataset and plot it
data <- mtcars %>% select(mpg, disp)
plot(data, main = "mtcars")
# Add some outliers and plot again
data <- rbind(data,
data.frame(mpg = c(1, 80), disp = c(800, 1000)))
plot(data, main = "mtcars")
# Use mvBACON to calculate the distances and get the outliers
# install.packages("robustX") # uncomment line to install package
library(robustX)
# compute distances - default is Mahalanobis
distances <- mvBACON(data)
# Plot it again...
plot(data, main = "mtcars")
# ...with highlighting the outliers
points(data[!distances$subset, ], col = "red", pch = 19)
# Some fine tuning, since many of the outliers seem to be still good for regression
distances <- mvBACON(data, alpha = 0.6)
# update plot
plot(data, main = "mtcars")
points(data[!distances$subset, ], col = "red", pch = 19)
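To get a feel for the distance that mvBACON starts from, here is a sketch that flags points by their plain (non-robust) Mahalanobis distance using only base R. Note that this is not the BACON algorithm itself: it uses the ordinary mean and covariance of all points, and the 97.5% chi-squared cutoff is an assumption chosen for illustration.

```r
# Same data as above: mpg and disp from mtcars plus two artificial outliers
data <- mtcars[, c("mpg", "disp")]
data <- rbind(data, data.frame(mpg = c(1, 80), disp = c(800, 1000)))

# Squared Mahalanobis distance of each row from the overall mean
d2 <- mahalanobis(data, center = colMeans(data), cov = cov(data))

# Flag rows beyond the 97.5% quantile of a chi-squared distribution
# with as many degrees of freedom as there are variables (illustrative cutoff)
cutoff <- qchisq(0.975, df = ncol(data))
outlier_rows <- which(d2 > cutoff)
print(outlier_rows)

# Highlight the flagged points
plot(data, main = "mtcars")
points(data[outlier_rows, ], col = "red", pch = 19)
```

Because the mean and covariance here are themselves influenced by the outliers, this simple version can miss outliers in less clear-cut data; BACON addresses that by growing a "clean" subset iteratively.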

How can I plot a biplot for LDA in r?

I did a linear discriminant analysis using the function lda() from the package MASS. Now I would like to plot a biplot like the one in the ade4 package (for LDA). Do you know how I can do this?
If I try to use the biplot() function it doesn't work. For example, if I use the Iris data and make LDA:
dis2 <- lda(as.matrix(iris[, 1:4]), iris$Species)
then I can plot it using the function plot(), but if I use the function biplot() it doesn't work:
biplot(dis2)
Error in nrow(y) : argument "y" is missing, with no default
How can I plot the arrows of variables?
I wrote the following function to do this:
lda.arrows <- function(x, myscale = 1, tex = 0.75, choices = c(1,2), ...){
## adds `biplot` arrows to an lda using the discriminant function values
heads <- coef(x)
arrows(x0 = 0, y0 = 0,
x1 = myscale * heads[,choices[1]],
y1 = myscale * heads[,choices[2]], ...)
text(myscale * heads[,choices], labels = row.names(heads),
cex = tex)
}
For your example:
dis2 <- lda(as.matrix(iris[, 1:4]), iris$Species)
plot(dis2, asp = 1)
lda.arrows(dis2, col = 2, myscale = 2)
The length of the arrows is arbitrary relative to the lda plot (but not to each other, of course!). If you want longer or shorter arrows, change the value of myscale accordingly. By default, this plots arrows for the first and second axes. If you want to plot other axes, change choices to reflect this.
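To check what lda.arrows actually draws, the arrow tips can be inspected directly. This is a small sketch for the iris example; the name arrow_tips is introduced here only for illustration.

```r
library(MASS)  # provides lda() and the coef() method for lda objects

dis2 <- lda(as.matrix(iris[, 1:4]), iris$Species)

# Coefficients of the linear discriminants: one row per variable, one column per axis
heads <- coef(dis2)

# Arrow tips as drawn by lda.arrows(dis2, myscale = 2) for the first two axes
myscale <- 2
arrow_tips <- myscale * heads[, c(1, 2)]
print(arrow_tips)
```

Each row gives the end point of one variable's arrow, with the arrow starting at the origin.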
My understanding is that biplots of linear discriminant analyses can be done; this is in fact implemented in the R package ggbiplot (see https://github.com/vqv/ggbiplot/tree/experimental) and in the package ggord (see https://github.com/fawda123/ggord). For your example:
install.packages("devtools")
library(devtools)
install_github("fawda123/ggord")
library(MASS) # provides lda()
library(ggord)
ord <- lda(Species ~ ., iris, prior = rep(1, 3)/3)
ggord(ord, iris$Species)
Also the book "Biplots in practice" by M. Greenacre has one chapter (chapter 11) on it and in Figure 11.5 it shows a biplot of a linear discriminant analysis of the iris dataset:
You can achieve this using the ggord package from GitHub. The dataset used is the iris dataset.
library(MASS) # provides lda()
IR <- iris # assumption: IR is the iris data frame
# --- data partition -- #
set.seed(555)
IRSam <- sample.int(n = nrow(IR), size = floor(.60*nrow(IR)), replace = FALSE, prob = NULL)
IRTrain <- IR[IRSam,]
IRTest <- IR[-IRSam,]
# --- model fit (assumed; the original post does not show how IR.lda was created) --- #
IR.lda <- lda(Species ~ ., data = IRTrain)
# --- Prediction --- #
p<- predict(IR.lda, IRTrain)
# --- plotting a biplot --- #
library(devtools)
# install_github('fawda123/ggord') # installs ggord from GitHub; requires devtools
library(ggord)
ggord(IR.lda, IRTrain$Species, ylim=c(-5,5), xlim=c(-10,10))
