Select argument doesn't work on cca objects - r

I created an object of class cca in vegan and now I am trying to tidy up the triplot. However, I seemingly can't use the select argument to only show specified items.
My code looks like this:
data("varechem")
data("varespec")
ord <- cca(varespec ~ Al+S, varechem)
plot(ord, type = "n")
text(ord, display = "sites", select = c("18", "21"))
I want only the two specified sites (18 and 21) to appear in the plot, but when I run the code nothing happens. I do not even get an error message.
I'm really stuck, but I am fairly certain that this bit of code is correct. Can someone help me?

I can't recall now, but I don't think the intention was to allow "names" to select which rows of the scores should be selected. The documentation speaks of select being a logical vector, or indices of the scores to be selected. By indices it was meant numeric indices, not rownames.
The example fails because select is also used to subset the labels character vector of values to be plotted in text(), and this labels character vector is not named. Using a character vector to subset another vector requires that the other vector be named.
Your example works if you do:
data("varechem")
data("varespec")
ord <- cca(varespec ~ Al + S, varechem)
plot(ord, type = "n")
take <- which(rownames(varechem) %in% c("18", "21"))
# or
# take <- rownames(varechem) %in% c("18", "21")
text(ord, display = "sites", select = take)
I'll have a think about whether it will be simple to support the use case of your example.
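The subsetting behaviour described in this answer can be reproduced in base R alone. The following is a minimal sketch with made-up labels and site IDs (site_labels, named_labels and site_ids are illustrative names, not objects from vegan):

```r
# An unnamed character vector, like the labels vector described above
site_labels <- c("site A", "site B", "site C")

# Character indices look up names; an unnamed vector has none, so this yields NA NA
site_labels[c("18", "21")]

# The same lookup succeeds once the vector carries names
named_labels <- c("18" = "site A", "21" = "site B", "24" = "site C")
named_labels[c("18", "21")]

# Converting the IDs to numeric positions works on the unnamed vector too
site_ids <- c("18", "21", "24")          # hypothetical row names
take <- which(site_ids %in% c("18", "21"))
site_labels[take]
```

This is why converting the row names to positions (or to a logical vector) with %in% works for select.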

The following code probably gives the result you want to achieve:
First, create an object to store the blank CCA1-CCA2 plot
p1 = plot(ord, type = "n")
Find and then save the coordinates of the sites 18 and 21
p1$sites[c("18", "21"),]
# CCA1 CCA2
#18 0.3496725 -1.334061
#21 -0.8617759 -1.588855
site18 = p1$sites["18",]
site21 = p1$sites["21",]
Overlay the blank CCA1-CCA2 plot with the points of site 18 and 21. Setting different colors to different points might be a good idea.
points(p1$sites[c("18", "21"),], pch = 19, col = c("blue", "red"))
Showing labels might be informative.
text(x = site18[1], y = site18[2] + 0.3, labels = "site 18")
text(x = site21[1], y = site21[2] + 0.3, labels = "site 21")
Here is the resulting plot.

Related

Use a for-loop of characters to plot several lines with specific colors

I would like to plot 13 lines on a single graph. Each line represents a subset of my data, grouped by the characters in column 'basin'. What I have works, but I'd like to make it more efficient using a for-loop.
Here's what the output looks like.
A simplified dataframe to work with:
env <- data.frame(basin = c('BLK','DUC','WHP','BLK','DUC','WHP','BLK','DUC','WHP'),
sal = c(5,6,3,2,4,5,6,8,4),
date = c(2013,2013,2013,2015,2015,2015,2017,2017,2017))
And a simplified version of what didn't work (it runs, but makes all the lines blue and solid):
basinlist <- c('BLK','DUC','WHP')
plot(sal~date, data = env, type = 'n', ylim = c(0,10), ylab = 'Salinity')
for(i in basinlist){
lines(sal[basin==i] ~ date[basin==i], data = env,
col = c(4,4,2),
lty = c(1,1,2))
}
The issue is that I don't know how to change the colors with each iteration when i is a character. Searching for this issue yields solutions for when i is a number, or for creating lines that are all different colors, neither of which are my goal.
This is the first time I've asked my own question rather than finding the answer posted elsewhere on SO, so let me know if you need anything else.
In this case for each iteration in your loop, you need both the index of the vector and the variable itself. An easy way to get the plot you want is to iterate over the indices (ii in the example below) and also get the vector element with each iteration (i as you had before).
env <- data.frame(basin = c('BLK','DUC','WHP','BLK','DUC','WHP','BLK','DUC','WHP'),
sal = c(5,6,3,2,4,5,6,8,4),
date = c(2013,2013,2013,2015,2015,2015,2017,2017,2017))
basinlist <- c('BLK','DUC','WHP')
plot(sal~date, data = env, type = 'n', ylim = c(0,10), ylab = 'Salinity')
for (ii in seq_along(basinlist)) {
i <- basinlist[ii]
lines(sal[basin==i] ~ date[basin==i], data = env,
col = c(4,4,2)[ii],
lty = c(1,1,2)[ii])
}
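An alternative sketch, not part of the answer above, is to key the colours and line types by basin name with named vectors, so that the character loop variable can look them up directly. The names basin_col and basin_lty are introduced here for illustration only:

```r
env <- data.frame(basin = c('BLK','DUC','WHP','BLK','DUC','WHP','BLK','DUC','WHP'),
                  sal = c(5,6,3,2,4,5,6,8,4),
                  date = c(2013,2013,2013,2015,2015,2015,2017,2017,2017))

# Named vectors: the basin code is the name, the style is the value
basin_col <- c(BLK = 4, DUC = 4, WHP = 2)
basin_lty <- c(BLK = 1, DUC = 1, WHP = 2)

plot(sal ~ date, data = env, type = 'n', ylim = c(0, 10), ylab = 'Salinity')
for (b in names(basin_col)) {
  rows <- env[env$basin == b, ]          # rows belonging to this basin
  lines(rows$date, rows$sal,
        col = basin_col[[b]],            # look up the style by name
        lty = basin_lty[[b]])
}
```

With this layout, adding a basin means adding one named entry to each vector, and the order of the entries no longer matters.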

R: PCA plot with different colors for Sites

I'm currently trying to analyse my data and want to make the graphs a little nicer, but I'm failing at this.
So I have a data set with 144 sites and 5 environmental variables. It's basically about the substrate composition around an island and the fish abundance. On this island there is supposed to be a difference in the substrate composition between the north and the south side. Right now I am doing a PCA, and with the biplot function it works quite fine, but I would like to change the plot a bit.
I need one where the sites are just points and not numbered, arrows point to the different variables, and the sites are colored according to their location (north or south side). So I tried everything I could find.
Most examples were with the dune data and suggested something like this:
library(vegan)
library(biplot)
data(dune)
mod <- rda(dune, scale = TRUE)
biplot(mod, scaling = 3, type = c("text", "points"))
So according to this I would just need to say text and points, and R would label the variables and just make points for the sites. When I do this, however, I get the error:
Error in plot.default(x, type = "n", xlim = xlim, ylim = ylim, col = col[1L], :
formal argument "type" matched by multiple actual arguments
No idea how to get around this.
So next strategy I found, is to make a plot manually like this:
require("vegan")
data(dune, dune.env)
mod <- rda(dune, scale = TRUE)
scl <- 3 ## scaling == 3
colvec <- c("red2", "green4", "mediumblue")
plot(mod, type = "n", scaling = scl)
with(dune.env, points(mod, display = "sites", col = colvec[Use],
scaling = scl, pch = 21, bg = colvec[Use]))
text(mod,display="species", scaling = scl, cex = 0.8, col = "darkcyan")
with(dune.env, legend("bottomright", legend = levels(Use), bty = "n",
col = colvec, pch = 21, pt.bg = colvec))
This works fine so far as well: I get different colors and points, but now the arrows are missing. I found that this should be easy to correct if I just put display = "bp" in the text line. But this doesn't work either. Every time I put "bp", R says:
Error in match.arg(display) :
argument "display" is missing, with no default
So I'm kind of desperate now. I looked through all the answers here and I don't understand why display = "bp" and type = c("text", "points") are not working for me.
If anyone has an idea I would be super grateful.
https://www.dropbox.com/sh/y8xzq0bs6mus727/AADmasrXxUp6JTTHN5Gr9eufa?dl=0
This is the link to my dropbox folder. It contains my R-script and the csv files. The one named environmentalvariables_Kon1 also contains the data about north and southside.
So yeah... if anyone could help me, that would be awesome. I really don't know what to do anymore.
Best regards,
Nancy
You can add arrows with arrows(). See the code for vegan:::biplot.rda to see how it works in the original function.
With your plot, add
g <- scores(mod, display = "species")
len <- 1
arrows(0, 0, len * g[, 1], len * g[, 2], length = 0.05, col = "darkcyan")
You might want to adjust the value of len to make the arrows longer.
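For readers without vegan at hand, the same recipe (coloured points for sites, arrows from the origin for variables) can be sketched with base R's prcomp() instead of rda(). This swaps in a different function and uses the built-in iris data as a stand-in for the real data set, with Species playing the role of the north/south grouping, so scores and scaling will not match vegan's output:

```r
# Base-R stand-in: prcomp() replaces vegan::rda(), iris replaces the real data
pca <- prcomp(iris[, 1:4], scale. = TRUE)
site_scores <- pca$x[, 1:2]              # one row per observation ("site")
loadings <- pca$rotation[, 1:2]          # one row per variable

colvec <- c("red2", "green4", "mediumblue")
grp <- iris$Species                      # grouping factor used for colouring

plot(site_scores, type = "n", asp = 1, xlab = "PC1", ylab = "PC2")
points(site_scores, pch = 21, col = colvec[grp], bg = colvec[grp])

len <- 2                                 # arrow stretch factor, adjust to taste
arrows(0, 0, len * loadings[, 1], len * loadings[, 2],
       length = 0.05, col = "darkcyan")
text(len * loadings[, 1], len * loadings[, 2],
     labels = rownames(loadings), col = "darkcyan", pos = 3, cex = 0.8)
legend("topright", legend = levels(grp), bty = "n",
       col = colvec, pch = 21, pt.bg = colvec)
```

Indexing colvec by the factor works because a factor used as an index is treated as its integer codes.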

How to create a plot which shows objects on y-axis and number on x-axis

I have a lot of data which I want to plot in a special way, but I don't know how to do this in R.
The input is a csv file containing several columns. The columns I want to plot are A and D.
A contains text and D contains numbers. The same text can occur several times in column A, but that does not matter.
In the end I want to get a plot which shall demonstrate the following:
I actually have no idea how to plot this.
I've tried plot(data1$COLUMND, data1$COLUMNA, xlab = "COLUMND", ylab = "COLUMNA"), but the result is that the text in column A is replaced by a number, so the axis is labelled from 0 to 3 in this case.
I also tried to change the labels with the labels command, but this led to the problem that the labels were in ascending order, while the data in the column are not (in my example above they are, but not in my real data). Therefore R should replace 0 with the corresponding text from column A.
For this I used the methods shown in the Quick-R guide,
but they did not work as desired and replaced the entries with null.
You have to do two steps.
1) Make a list of vectors. Every vector is named after a unique element of column A and contains the corresponding values from column D.
2) Use the stripchart() function with this list.
My code approach:
## your data
data <- data.frame(A = c("AAA", "AAB", "AAC", "AAA", "AAE", "AAC"),
B = rep(12.3),
C = rep(20160729),
D = c(100,80,10,0,5,20))
## empty list to fill in the following loop
## (not called "list", to avoid masking the base function list())
values_by_label <- list()
## get the values in column D for every unique value in column A
## and add them to the list
for (i in unique(data$A)) values_by_label[[i]] <- data$D[data$A == i]
## plot the list
stripchart(values_by_label,
xlab = "Column D", ylab = "Column A",
pch = 16, col = "red")
The result is a strip chart with the values of column A along the y-axis and the values of column D along the x-axis.
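The two steps of this answer can also be written without an explicit loop. The sketch below uses base R's split() to build the list, and the formula interface of stripchart() as a second option; the data frame is the sample data from the answer, and by_label is a name introduced here:

```r
data <- data.frame(A = c("AAA", "AAB", "AAC", "AAA", "AAE", "AAC"),
                   D = c(100, 80, 10, 0, 5, 20))

# Step 1: one vector of D values per unique value of A
by_label <- split(data$D, data$A)

# Step 2: plot the list, one row per label
stripchart(by_label, xlab = "Column D", ylab = "Column A",
           pch = 16, col = "red")

# Equivalent call through the formula interface, without building the list
stripchart(D ~ A, data = data, xlab = "Column D", ylab = "Column A",
           pch = 16, col = "red")
```

Note that split() orders the groups alphabetically (by factor level), whereas the loop over unique() keeps the order of first appearance; for this sample data the two orders coincide.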
Have you tried using the axis function?
First, note that "AAD" was not in the sample data that you provided. We have to tell R about the values in Column A and how we want them to be ordered:
data1 <- data.frame(A=c('AAA', 'AAB', 'AAC', 'AAA', 'AAE', 'AAC'),
D=c(100, 80, 10, 0, 5, 20))
data1$A <- factor(data1$A, levels=paste0('AA',LETTERS[1:5]))
Now we can plot. We tell R to leave out the Y-axis for now (using the yaxt argument); we'll add it in manually later.
par(mar=c(6,6,4,2)) # Set margins for plot
plot(data1$D, data1$A, xlab = "Column D", ylab = "", yaxt="n", las=1)
Finally we add in the Y-axis labels, using the level names instead of the underlying integer codes (i.e. the numbers).
axis(2, at=1:length(levels(data1$A)), labels=levels(data1$A), las=2)
mtext("Column A", side=2, line=1, las=2, at=3.2)
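The reason the text turns into numbers on the axis is that a factor is stored as integer codes with a separate set of level names. A small sketch using the sample data from this answer shows the mapping that the axis() call relies on:

```r
data1 <- data.frame(A = c('AAA', 'AAB', 'AAC', 'AAA', 'AAE', 'AAC'),
                    D = c(100, 80, 10, 0, 5, 20))
data1$A <- factor(data1$A, levels = paste0('AA', LETTERS[1:5]))

levels(data1$A)        # the five level names, including the unused "AAD"
codes <- as.integer(data1$A)
codes                  # positions of each value within levels(): 1 2 3 1 5 3

# The tick positions 1..5 and the level names line up one-to-one
data.frame(at = seq_along(levels(data1$A)), label = levels(data1$A))
```

Because the codes follow the order given in levels, changing that order is how the position of each label on the axis is controlled.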

How to extract outliers from box plot in R

Could you explain to me whether there is a way to extract outliers from a box plot? I have plotted a box plot and I want to extract only the outliers.
Here is the code for the box plot.
# melting down
require(reshape)
melt_nx <- melt(nx, id.vars = c("x", "y"))
boxplot(data = melt_nx, main = "NX", value ~ variable, las = 2,
par(mar = c(15, 5, 4, 2) + 0.1),
names = c("We1", "We2", "we3"))
Is it possible from the box plot to extract the outliers only?
The boxplot function returns a list, one of whose elements is named "out". These are the values that lie beyond the "whiskers". I don't know about executing par within the argument list, but if you want these particular values, then use this:
vals <- boxplot(data = melt_nx, main = "NX", value ~ variable, las = 2,
names = c("We1", "We2", "we3"))
vals$out
And do read all these help pages:
?boxplot
?boxplot.stats
?bxp
?fivenum
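Since melt_nx is not available here, the following sketch uses a small made-up data frame (toy) to show how the "out" element pairs up with the "group" and "names" elements of the returned list, so each outlier can be traced back to its box. Passing plot = FALSE returns the statistics without drawing anything:

```r
# Made-up data for illustration only: one extreme value per group
toy <- data.frame(
  variable = rep(c("g1", "g2"), each = 6),
  value = c(1, 2, 3, 4, 5, 100, 10, 11, 12, 13, 14, -50)
)

bp <- boxplot(value ~ variable, data = toy, plot = FALSE)

# bp$out holds the outlying values, bp$group the index of the box they belong to
outliers <- data.frame(variable = bp$names[bp$group], value = bp$out)
outliers

# The same values per group, computed directly with boxplot.stats()
per_group <- lapply(split(toy$value, toy$variable),
                    function(v) boxplot.stats(v)$out)
per_group
```

Both routes use the default whisker rule (1.5 times the hinge spread), which is what the plotted box plot uses unless range or coef is changed.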
I know this has been answered, but for me there is an alternative method using the Boxplot method from the car package. Note the capital B in the Boxplot function call.
This is the code that does it for me, it returns the row numbers of the outliers which you can then use in your dataframe to filter out or extract, etc...
outliers<-Boxplot(x~y, data=df, id.method="y")
Note that the extracted values are of type Character. Then to exclude them you could do something like:
df2 <- df[-as.numeric(outliers),]
Hope this helps a little

Heatmap like plot with Lattice

I cannot figure out how the lattice levelplot works. I have played with this for some time now, but could not find a reasonable solution.
Sample data:
Data <- data.frame(x=seq(0,20,1),y=runif(21,0,1))
Data.mat <- data.matrix(Data)
Plot with levelplot:
rgb.palette <- colorRampPalette(c("darkgreen","yellow", "red"), space = "rgb")
levelplot(Data.mat, main="", xlab="Time", ylab="", col.regions=rgb.palette(100),
cuts=100, at=seq(0,1,0.1), ylim=c(0,2), scales=list(y=list(at=NULL)))
This is the outcome:
Since I do not understand how levelplot really works, I cannot make it work. What I would like is for the colour strips to fill the whole window at the corresponding x (Time).
Alternative solution with other method.
Basically, I'm trying here to plot the increasing risk over time, where the red is the highest risk = 1. I would like to visualize the sequence of possible increase or clustering risk over time.
From ?levelplot we're told that if the first argument is a matrix then "'x' provides the 'z' vector described above, while its rows and columns are interpreted as the 'x' and 'y' vectors respectively.", so
> m = Data.mat[, 2, drop=FALSE]
> dim(m)
[1] 21 1
> levelplot(m)
plots a levelplot with 21 columns and 1 row, where the levels are determined by the values in m. The formula interface might look like
> df <- data.frame(x=1, y=1:21, z=runif(21))
> levelplot(z ~ y + x, df)
(these approaches do not quite result in the same image).
Unfortunately I don't know much about lattice, but I noted your "Alternative solution with other method", so may I suggest another possibility:
library(plotrix)
color2D.matplot(t(Data[ , 2]), show.legend = TRUE, extremes = c("yellow", "red"))
Heaps of things to do to make it prettier. Still, a start. Of course it is important to consider the breaks in your time variable. In this very simple attempt, regular intervals are implicitly assumed, which happens to be the case in your example.
Update
Following the advice in the 'Details' section in ?color2D.matplot: "The user will have to adjust the plot device dimensions to get regular squares or hexagons, especially when the matrix is not square". Well, well, quite ugly solution.
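windows(width = 10, height = 2.5) # open the device first (windows() exists on Windows only)
par(mar = c(5.1, 4.1, 0, 2.1)) # par() affects the current device, so set margins after opening it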
color2D.matplot(t(Data[ , 2]),
show.legend = TRUE,
axes = TRUE,
xlab = "",
ylab = "",
extremes = c("yellow", "red"))
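Another option, not taken from the answers above, is base graphics' image(), which draws one coloured cell per matrix entry and fills the whole plot region. This is only a sketch under the assumption that a single strip of 21 time steps is wanted; set.seed() and the matrix z are additions for reproducibility and illustration:

```r
set.seed(1)                                   # only to make the random data repeatable
Data <- data.frame(x = seq(0, 20, 1), y = runif(21, 0, 1))
rgb.palette <- colorRampPalette(c("darkgreen", "yellow", "red"), space = "rgb")

# image() expects z with one row per x value and one column per y value
z <- matrix(Data$y, ncol = 1)

image(x = Data$x,            # cell midpoints along the time axis
      y = c(0, 1),           # lower and upper boundary of the single strip
      z = z,
      col = rgb.palette(100),
      zlim = c(0, 1),        # fix the colour scale to the full risk range
      xlab = "Time", ylab = "", yaxt = "n")
```

Fixing zlim to c(0, 1) keeps the colours comparable between plots, so red always means a risk close to 1 rather than the largest value in the current sample.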
