R correlation plot shows numbers instead of text labels

When I plot the correlation matrix, numbers are displayed instead of the column names.
Why does this happen, and how can I fix it?
Below is the code:
library(corrplot)
# keep only the numeric columns, since cor() needs numeric input
espAlltmNum <- espAlltm[, sapply(espAlltm, is.numeric)]
M <- cor(espAlltmNum, use = "pairwise", method = "pearson")
corrplot(M, method = "circle", tl.pos = "d", tl.cex = 0.5, tl.col = "black",
         order = "hclust", diag = TRUE, title = "Correlation Plot",
         mar = c(1, 1, 1, 1))
The output is (image not included):

I see an issue in how the corrplot and GGally packages interact. If the correlation matrix is computed before GGally is loaded, the matrix contains the column names (as text).
If the correlation matrix is computed after GGally is loaded, the matrix contains the index numbers of the columns instead, and the plot then shows those index numbers, as in the image attached to the question.
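A minimal sketch of one way to guard against this, using the built-in iris data as a stand-in for espAlltm. It assumes the cause is that a different cor() is picked up once another package is loaded, which I have not verified; calling stats::cor() explicitly and restoring the dimnames when they are missing are both defensive steps rather than a confirmed fix.

```r
# iris stands in for the asker's data; Species is dropped as non-numeric
num <- iris[, sapply(iris, is.numeric)]

# Call the base implementation explicitly so a masking function cannot be used
M <- stats::cor(num, use = "pairwise.complete.obs", method = "pearson")

# If the labels were lost anyway, put the column names back before plotting
if (!identical(colnames(M), names(num))) {
  dimnames(M) <- list(names(num), names(num))
}

colnames(M)
```

With the names in place, the corrplot() call from the question can be used unchanged.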

Related

How to specify tm_fill() if I want it to be a variable from a new object?

I am trying to create an R function that runs a GWR on variables that the user specifies from a Spatial Polygons Data Frame. The end result of running the function is two maps: one of the independent variable's values and one of the coefficient values from the GWR model. I'm having trouble with the second map.
I have managed to create the GWR model and a 'results' object for the coefficients that I would be visualizing.
gwr.model <- gwr(SpatialPolygonsDataFrame@data[, y] ~ SpatialPolygonsDataFrame@data[, x],
                 data = SpatialPolygonsDataFrame,
                 adapt = GWRbandwidth,
                 hatmatrix = TRUE,
                 se.fit = TRUE)
results <- as.data.frame(gwr.model$SDF)
gwr.map <- SpatialPolygonsDataFrame
gwr.map@data <- cbind(SpatialPolygonsDataFrame@data, as.matrix(results))
To create the visualization of the GWR coefficients, I have to set the tm_fill() argument to a column from the 'results' object, but I do not know how to do this so that the function can be used with any Spatial Polygons Data Frame. So far, I have tried using the paste0() function, like so:
map2 <- tm_shape(gwr.map) + tm_fill(paste0("SpatialPolygonsDataFrame.", x), n = 5, style = "quantile", title = "Coefficient") +
tm_layout(frame = FALSE, legend.text.size = 0.5, legend.title.size = 0.6)
But I got an error saying that the fill argument is neither colors nor a valid variable name.
I'll be grateful for any tips that could help me resolve the issue.
Switching to the sf package, and leaving sp behind, will probably solve your problem here.
In the absence of a reproducible example, let me suggest the following:
Convert your map object with gwr.map.sf <- sf::st_as_sf(gwr.map). Then add the results of your GWR as a new column: gwr.map.sf$results <- results (my understanding is that the dimensions should fit).
Finally you should be able to plot like this:
map2 <- tm_shape(gwr.map.sf) + tm_fill("results", n = 5, style = "quantile", title = "Coefficient") +
tm_layout(frame = FALSE, legend.text.size = 0.5, legend.title.size = 0.6)
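A small sketch of why the pasted column name may not match, using lm() on the built-in mtcars data as a stand-in for gwr(). It assumes gwr() labels its coefficient columns from the formula terms in the same way lm() labels coefficients, which I have not checked. Under that assumption, building the formula with reformulate() gives a coefficient named after the variable itself, so the name passed to tm_fill() could simply be x.

```r
# Variable names supplied by the user of the function (illustrative values)
y <- "mpg"
x <- "wt"

# Indexing the data inside the formula: the term is labelled with the
# literal expression, not with the variable name
fit_indexed <- lm(mtcars[, y] ~ mtcars[, x])
names(coef(fit_indexed))

# Building the formula from the names: the term is labelled "wt"
fit_named <- lm(reformulate(x, response = y), data = mtcars)
names(coef(fit_named))

# The coefficient column can then be looked up by the name the user passed in
coef_col <- x
coef(fit_named)[[coef_col]]
```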

How to color the background of a corrplot by group?

Consider this data, where we have several groups with 10 observations each, and we conduct a pairwise.t.test():
set.seed(123)
data <- data.frame(group = rep(letters[1:18], each = 10),
var = rnorm(180, mean = 2, sd = 5))
# no p-value adjustment, just to make sure I get some significant results for the example
ttres <- pairwise.t.test(x = data$var, g = data$group, p.adjust.method = "none")
Now let's get the matrix of p-values, convert it to a binary matrix marking significant and non-significant values, and plot it with corrplot(), so that we can see which groups differ:
library(corrplot)
pmat <- as.matrix(ttres$p.value)
pmat<-round(pmat,2)
pmat <- +(pmat <= 0.1)
pmat
corrplot(pmat, insig = "blank", type = "lower")
Does anyone know a way to color the background of each square according to a grouping label? For instance, say we want the squares for groups a:g to be yellow, the squares for groups h:n to be blue, and the squares for groups o:r to be red. Or is there an alternative way to do this with ggplot?
You can pass a vector of background colors via the bg= parameter. The trick is just making sure they are in the right order. Here's one way to do that:
bgcolors <- matrix("white", nrow(pmat), ncol(pmat), dimnames = dimnames(pmat))
bgcolors[1:6, ] <- "yellow"   # rows b:g
bgcolors[7:13, ] <- "blue"    # rows h:n
bgcolors[14:17, ] <- "red"    # rows o:r
bgcolors <- bgcolors[lower.tri(bgcolors, diag=TRUE)]
corrplot(pmat, insig = "blank", type = "lower", bg=bgcolors)
Basically we just make a matrix the same shape as our input, then we set the colors we want for the different rows, and then we just pass the lower triangle of that matrix to the function.
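As a variation on the same idea, the row ranges need not be hard-coded. The sketch below looks the colours up from the row labels instead; group_cols is a hypothetical lookup vector introduced here for illustration, and only base R is used to build the colour vector.

```r
set.seed(123)
data <- data.frame(group = rep(letters[1:18], each = 10),
                   var = rnorm(180, mean = 2, sd = 5))
pmat <- pairwise.t.test(data$var, data$group, p.adjust.method = "none")$p.value

# Hypothetical lookup: one background colour per group label
group_cols <- c(setNames(rep("yellow", 7), letters[1:7]),    # a:g
                setNames(rep("blue", 7), letters[8:14]),     # h:n
                setNames(rep("red", 4), letters[15:18]))     # o:r

# One colour per row of pmat, recycled down every column
row_cols <- unname(group_cols[rownames(pmat)])
bgcolors <- matrix(row_cols, nrow(pmat), ncol(pmat), dimnames = dimnames(pmat))

# Lower triangle in column-major order, as in the answer above
bg_lower <- bgcolors[lower.tri(bgcolors, diag = TRUE)]
```

bg_lower can then be passed as bg= in the corrplot() call shown above.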

How to fix "Error in FUN(X[[i]], ...) : only defined on a data frame with all numeric variables"

I want to draw a Q-Q plot of my data, but I get an error telling me that the qqnorm function only works on numeric data.
As the effects include A, B, C, D and their two-, three- and four-way interactions, I have no idea how to convert them into numerical form.
The data is as follows:
Effects,Value
A,76.95
B,-67.52
C,-7.84
D,-18.73
AB,-51.32
AC,11.69
AD,9.78
BC,20.78
BD,14.74
CD,1.27
ABC,-2.82
ABD,-6.5
ACD,10.2
BCD,-7.98
ABCD,-6.25
My code is as follows:
library(readr)
data621 <- read_csv("Desktop/data621.csv")
data621_qq<-qqnorm(data621,xlab = "effects",datax = T)
qqline(data621,probs=c(0.3,0.7),datax = T)
text(data621_qq$x,data621_qq$y,names(data621),pos=4)
Your code would work if you passed the relevant columns instead of the entire data frame. For example:
data621_qq <- qqnorm(data621$Value, xlab = "Effects", datax = TRUE)
qqline(data621$Value, probs = c(0.3, 0.7), datax = TRUE)
text(data621_qq$x, data621_qq$y, data621$Effects, pos=4)
By the way, names(data621) would give you the column names, instead of the effect names (which are stored as values in a column).
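For reference, a self-contained sketch of the same fix that reads the data from the question inline instead of from a file. The object name d is illustrative, and plot.it = FALSE is used only so the quantiles can be inspected without opening a plot device.

```r
# Data from the question, read from a string instead of "Desktop/data621.csv"
d <- read.csv(text = "Effects,Value
A,76.95
B,-67.52
C,-7.84
D,-18.73
AB,-51.32
AC,11.69
AD,9.78
BC,20.78
BD,14.74
CD,1.27
ABC,-2.82
ABD,-6.5
ACD,10.2
BCD,-7.98
ABCD,-6.25", stringsAsFactors = FALSE)

# The Value column is numeric; the Effects column holds the point labels
qq <- qqnorm(d$Value, datax = TRUE, plot.it = FALSE)
labels <- d$Effects
```

Dropping plot.it = FALSE draws the plot, after which text(qq$x, qq$y, labels, pos = 4) adds the effect names as in the answer.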

How to plot an nmds with coloured/symbol points based on SIMPROF

Hi, I am trying to plot an NMDS of assemblage data that is stored as a Bray-Curtis dissimilarity matrix in R. I have been able to apply ordiellipse() and ordihull(), and even change the colours based on group factors created by applying cutree() to an hclust() result,
e.g. using the dune data from the vegan package:
library(vegan)
data(dune)
Dune.dis <- vegdist(dune, method = "bray")
Dune.mds <- metaMDS(dune, distance = "bray", k = 2)
# hierarchical cluster
clua <- hclust(Dune.dis, "average")
plot(clua, hang = -1)
# set groupings
rect.hclust(clua, 4)
grp <- cutree(clua, 4)
# plot mds
plot(Dune.mds, display = "sites", type = "text", cex = 1.5)
# show groupings
ordiellipse(Dune.mds, groups = grp, border = 1, col = "red", lwd = 3)
or even colour the points just by the cutree
colvec <- c("red2", "cyan", "deeppink3", "green3")
colvec[grp]
plot(Dune.mds, display = "sites", type = "text", cex = 1.5) # or use type = "points"
points(Dune.mds, display = "sites", col = colvec[grp], bg = colvec[grp], pch = 21)
However, what I really want to do is use the simprof function from the "clustsig" package and then colour the points based on the significant groupings. This is more of a technical coding question: I could create the vector of factors by hand, but I am sure there is a more efficient way to do it.
Here's my code so far for that:
simp <- simprof(Dune.dis, num.expected = 1000, num.simulated = 999, method.cluster = "average", method.distance = "braycurtis", alpha = 0.05, sample.orientation = "row")
#plot dendrogram
simprof.plot(simp, plot = TRUE)
Now I am just not sure how to do the next step and plot the NMDS using the groupings defined by SIMPROF. How do I turn the SIMPROF results into a factor vector without typing it out myself?
Thanks in advance.
You wrote that you know how to get colours from an hclust object with cutree. Now read the documentation of clustsig::simprof. It says that simprof returns an hclust object within its result object. It also returns numgroups, which is the suggested number of clusters. That is all the information you need to use cutree on an hclust object, which you already know how to do. If your simprof result is called simp, use cutree(simp$hclust, simp$numgroups) to extract the integer vector corresponding to the clustsig::simprof result, and use this vector to assign the colours.
I have never used simprof or clustsig, but I gathered all this information from its documentation.
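A sketch of that last step using only base R. Here simp_like is a hypothetical stand-in for the simprof result: a plain list with an hclust element and a numgroups element, mirroring the two components described above. The data (rows of USArrests) and the value 4 are illustrative, not output from simprof.

```r
# Stand-in for a simprof result: a list with $hclust and $numgroups
d <- dist(scale(USArrests[1:20, ]))
simp_like <- list(hclust = hclust(d, "average"), numgroups = 4)

# Integer group membership for every site, as with the cutree() call in the question
grp <- cutree(simp_like$hclust, simp_like$numgroups)

# One colour per group; indexing by grp gives one colour per point
colvec <- c("red2", "cyan", "deeppink3", "green3")
pt_cols <- colvec[grp]
```

With a real simprof object, replacing simp_like by simp should give a pt_cols vector that can be passed as col and bg to points(), provided colvec has at least simp$numgroups entries.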

How to draw line around significant values in R's corrplot package

I have been asked to produce a correlation plot for a collaborator.
My choice is to use R for the task, specifically the corrplot package.
I have been researching on the internet and I found multiple ways to obtain such graphics, but not the specific graphic I was asked for (as you can see in the picture the significant values are highlighted by drawing a square around the significant tile), which is puzzling me.
Example of the correlation plot required
The closest result I have achieved uses the code below, but I cannot find an option to draw a line around the significant tiles (if one exists).
# Insignificant correlations are left blank
corrplot(res3$r, type="upper", order="hclust",
p.mat = res3$P, sig.level = 0.01, insig = "blank")
I tried adding the "addrect" parameter but it didn't work.
#Insignificant correlation are crossed
corrplot(res3$r, type="upper", order="hclust", p.mat = res3$P,
addrect=2, sig.level = 0.01, insig = "blank")
Any help will be appreciated.
corrplot allows you to add new plots to an already existing one. Therefore, once you've created the plot of the initial correlation matrix, you can simply add those cells that you want to highlight in an iterative manner using corrplot(..., add = TRUE).
The only thing required to achieve your goal is an index vector (which I called 'ids') to tell R which cells to highlight. Note that, for simplicity, I took a random sample of cells from the initial correlation matrix, but something like ids <- which(p.value < 0.01) (assuming that you've stored your p-values in a separate object of the same shape) would work similarly.
library(corrplot)
## create and visualize correlation matrix
data(mtcars)
M <- cor(mtcars)
corrplot(M, cl.pos = "n", na.label = " ")
## select cells to highlight (e.g., statistically significant values)
set.seed(10)
ids <- sample(1:length(M), 15L)
## duplicate correlation matrix and reject all irrelevant values
N <- M
N[-ids] <- NA
## add significant cells to the initial corrplot iteratively
for (i in ids) {
  O <- N
  O[-i] <- NA
  corrplot(O, cl.pos = "n", na.label = " ", addgrid.col = "black", add = TRUE,
           bg = "transparent", tl.col = "transparent")
}
Note that you could also add all values to highlight in one go (i.e., without requiring a for loop) using corrplot(N, ...), but in that case, an undesirable black margin is drawn all around the plotting area.
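To replace the random sample with real significance values, ids can be derived from a matrix of p-values. The sketch below builds such a matrix with cor.test() from base R; P is an illustrative name, and the 0.01 threshold is the one used in the question.

```r
M <- cor(mtcars)
n <- ncol(mtcars)

# Pairwise p-values for the Pearson correlations; the diagonal stays NA
P <- matrix(NA_real_, n, n, dimnames = dimnames(M))
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    if (i != j) P[i, j] <- cor.test(mtcars[[i]], mtcars[[j]])$p.value
  }
}

# Linear indices of the significant cells (which() skips the NA diagonal)
ids <- which(P < 0.01)

# Same masking step as above: keep only the cells to highlight
N <- M
N[-ids] <- NA
```

The loop over ids from the answer can then be run unchanged.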

Resources