Plot class probability by neuron in self-organizing maps - R

I found a nice tutorial on self-organizing map clustering in R that explains how to display your input data in the unit space (see below). To set up some rules for the labeling, I would like to compute the probability of each class in each neuron and plot it. Computing the probability is straightforward: for each unit, take the number of observations of class i and divide it by the total number of observations in that unit. I end up with the data frame pc. Now I am struggling to map this result; any clue on how to do it?
library(kohonen)
data(yeast)
set.seed(7)
yeast.supersom <- supersom(yeast, somgrid(8, 8, "hexagonal"),whatmap = 3:6)
classes <- levels(yeast$class)
colors <- c("yellow", "green", "blue", "red", "orange")
par(mfrow = c(3, 2))
plot(yeast.supersom, type = "mapping", pch = 1, main = "All",
     keepMargins = TRUE, bgcol = gray(0.85))

library(plyr)
pc <- data.frame(Var1 = c(1:64))

for (i in seq(along = classes)) {
  X.class <- lapply(yeast, function(x) subset(x, yeast$class == classes[i]))
  X.map <- map(yeast.supersom, X.class)
  plot(yeast.supersom, type = "mapping", classif = X.map,
       col = colors[i], pch = 1, main = classes[i], add = TRUE)

  # compute percentage per unit
  v1F <- levels(as.factor(X.map$unit.classif))
  v2F <- levels(as.factor(yeast.supersom$unit.classif))
  fList <- base::union(v2F, v1F)
  pc <- join(pc,
             as.data.frame(table(factor(X.map$unit.classif, levels = fList)) /
                           table(factor(yeast.supersom$unit.classif, levels = fList)) * 100),
             by = 'Var1')
  colnames(pc)[NCOL(pc)] <- classes[i]
}
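As an aside, the same per-unit class percentages can also be computed more compactly with a two-way table; this is just a sketch, assuming the supersom fit above was run with keep.data = TRUE (its default) so that unit.classif is available.
units <- factor(yeast.supersom$unit.classif, levels = 1:64)  # one level per unit of the 8 x 8 grid
tab   <- table(unit = units, class = yeast$class)            # counts of each class per unit
pc2   <- prop.table(tab, margin = 1) * 100                   # row-wise percentages (NaN for empty units)
head(pc2)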

Okay, here is a solution:
Once I have computed the probability, the code derives a color code from a defined gradient (rbPal). The gradient is bounded by a lower and an upper value, and the shade assigned to each unit is proportional to where its probability falls within that interval. This is done with the function findInterval.
# compute percentage per unit
v1F <- levels(as.factor(X.map$unit.classif))
v2F <- levels(as.factor(yeast.supersom$unit.classif))
fList <- base::union(v2F, v1F)
pc <- join(pc,
           as.data.frame(table(factor(X.map$unit.classif, levels = fList)) /
                         table(factor(yeast.supersom$unit.classif, levels = fList)) * 100),
           by = 'Var1')
colnames(pc)[NCOL(pc)] <- classes[i]

rbPal <- colorRampPalette(c('blue', 'yellow', 'red'))
plot(yeast.supersom, type = "mapping",
     bgcol = rbPal(100)[findInterval(pc[, which(colnames(pc) == as.character(classes[i]))],
                                     seq(0:100)) + 1],
     main = paste("Probability Clusters:", classes[i]))

Related

Error: x,y coords given but no window specified (spatstat)

I am generating a landscape pattern that evolves over time. The problem is that I have clearly defined a window for the object raising the error, yet the window is not being recognised. I also do not see how any points could be falling outside of the window, or how that would make a difference.
library(spatstat)
library(dplyr)
# Define the window
win <- owin(c(0, 100), c(0, 100))
# Define the point cluster
cluster1 <- rMatClust(kappa = 0.0005, scale = 0.1, mu = 20,
                      win = win, center = c(5, 5))

# define the spread of the points
spread_rate <- 1
new_nests_per_year <- 5
years <- 10

# Plot the initial cluster
plot(win, main = "Initial cluster")
points(cluster1, pch = 20, col = "red")

newpoints <- list()

# Loop for n years
for (i in 1:years) {
  # Generate new points that spread from the cluster
  newpoints[[1]] <- rnorm(new_nests_per_year, mean = centroid.owin(cluster1)$y, sd = spread_rate)
  newpoints[[2]] <- rnorm(new_nests_per_year, mean = centroid.owin(cluster1)$x, sd = spread_rate)

  # Convert the list to a data frame
  newpoints_df <- data.frame(newpoints)

  # Rename the columns of the data frame
  colnames(newpoints_df) <- c("x", "y")

  # Combine the new points with the existing points
  cluster1_df <- data.frame(cluster1)
  newtotaldf <- bind_rows(cluster1_df, newpoints_df)
  cluster1 <- as.ppp(newtotaldf, x = newtotaldf$x, y = newtotaldf$y,
                     window = win)

  # Plot the updated cluster
  plot(win, main = paste("Cluster after year", i))
  points(cluster1, pch = 20, col = "red")
}
However, when I run the line:
cluster1 <- as.ppp(newtotaldf, x = newtotaldf$x, y = newtotaldf$y,
                   window = win)
I receive the error:
Error: x,y coords given but no window specified
Why would this be the case?
In your code, using the argument W = win should solve the issue: as.ppp() expects the window to be supplied via the argument W, so window = win is not picked up and the coordinates arrive without a window. I also believe you can simplify the call without specifying x and y:
## ...[previous code]...
cluster1 <- as.ppp(newtotaldf, W = win)
plot(win)
points(cluster1, pch = 20, col = "red")

Plot causes "Error: Incorrect Number of Dimensions"

I am learning about the "kohonen" package in R for the purpose of making Self Organizing Maps (SOM, also called Kohonen Networks - a type of Machine Learning algorithm). I am following this R language tutorial over here: https://www.rpubs.com/loveb/som
I tried to create my own data (this time with both "factor" and "numeric" variables) and run the SOM algorithm (this time using the "supersom()" function instead):
#load libraries and adjust colors
library(kohonen) #fitting SOMs
library(ggplot2) #plots
library(RColorBrewer) #colors, using predefined palettes
contrast <- c("#FA4925", "#22693E", "#D4D40F", "#2C4382", "#F0F0F0", "#3D3D3D") #my own, contrasting pairs
cols <- brewer.pal(10, "Paired")
#create and format data
a <- rnorm(1000, 10, 10)
b <- rnorm(1000, 10, 5)
c <- rnorm(1000, 5, 5)
d <- rnorm(1000, 5, 10)
e <- sample(LETTERS[1:4], 100, replace = TRUE, prob = c(0.25, 0.25, 0.25, 0.25))
f <- sample(LETTERS[1:5], 100, replace = TRUE, prob = c(0.2, 0.2, 0.2, 0.2, 0.2))
g <- sample(LETTERS[1:2], 100, replace = TRUE, prob = c(0.5, 0.5))

data <- data.frame(a, b, c, d, e, f, g)
data$e <- as.factor(data$e)
data$f <- as.factor(data$f)
data$g <- as.factor(data$g)

cols <- 1:4
data[cols] <- scale(data[cols])

# som model
som <- supersom(data = as.list(data), grid = somgrid(10, 10, "hexagonal"),
                dist.fct = "euclidean", keep.data = TRUE)
From here, I was able to successfully make some of the basic plots:
#plots
#pretty gradient colors
colour1 <- tricolor(som$grid)
colour4 <- tricolor(som$grid, phi = c(pi/8, 6, -pi/6), offset = 0.1)
plot(som, type="changes")
plot(som, type="count")
plot(som, type="quality", shape = "straight")
plot(som, type="dist.neighbours", palette.name=grey.colors, shape = "straight")
However, the problem arises when I try to make individual plots for each variable:
#error
var <- 1 #define the variable to plot
plot(som, type = "property", property = getCodes(som)[,var], main=colnames(getCodes(som))[var], palette.name=terrain.colors)
var <- 6 #define the variable to plot
plot(som, type = "property", property = getCodes(som)[,var], main=colnames(getCodes(som))[var], palette.name=terrain.colors)
This produces an error: "Error: Incorrect Number of Dimensions"
A similar error (NAs by coercion) is produced when attempting to cluster the SOM Network:
#cluster (error)
set.seed(33) # for reproducibility
fit_kmeans <- kmeans(data, 3) #3 clusters are used, as indicated by the wss development.
cl_assignmentk <- fit_kmeans$cluster[data$unit.classif]
par(mfrow=c(1,1))
plot(som, type="mapping", bg = rgb(colour4), shape = "straight", border = "grey",col=contrast)
add.cluster.boundaries(som, fit_kmeans$cluster, lwd = 3, lty = 2, col=contrast[4])
Can someone please tell me what I am doing wrong?
Thanks
Sources: https://www.rdocumentation.org/packages/kohonen/versions/2.0.5/topics/supersom
getCodes() produces a list here, and as such you have to treat it like one.
Calling getCodes(som) returns a list containing 7 items named a-g, so you should select items from the list using either $ or [[ ]],
e.g.
plot(som, type = "property", property = getCodes(som)[[1]], main=names(getCodes(som))[1], palette.name=terrain.colors)
or
plot(som, type = "property", property = getCodes(som)$a, main="a", palette.name=terrain.colors)
or
plot(som, type = "property", property = getCodes(som)[["a"]], main="a", palette.name=terrain.colors)
If you must set the variable prior to calling the plot, you can do so like this:
var <- 1
plot(som, type = "property", property = getCodes(som)[[var]], main=names(getCodes(som))[var], palette.name=terrain.colors)
Regarding kmeans():
kmeans() needs a matrix or an object that can be coerced into one. You have factors (categorical data), which cannot be coerced to numeric, so either drop the factors or find another method (a sketch of one alternative is shown after the code below).
drop the factors:
#cluster
set.seed(33) # for reproducibility
fit_kmeans <- kmeans(as.matrix(data[1:4]), 3) # 3 clusters are used, as indicated by the wss development.
cl_assignmentk <- fit_kmeans$cluster[data$unit.classif]
par(mfrow = c(1, 1))
plot(som, type = "mapping", bg = rgb(colour4), shape = "straight", border = "grey", col = contrast)
add.cluster.boundaries(som, fit_kmeans$cluster, lwd = 3, lty = 2, col = contrast[4])
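If you would rather keep the factors, one possible "other method" (not part of the original answer, just a sketch) is k-medoids on Gower dissimilarities, which handles mixed numeric/factor data directly:
library(cluster)
gower_d <- daisy(data, metric = "gower")     # dissimilarities for mixed numeric/factor columns
fit_pam <- pam(gower_d, k = 3, diss = TRUE)  # 3 medoid-based clusters
table(fit_pam$clustering)                    # cluster sizes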
Edit:
Alternatively, you can select the codebook layer directly from getCodes() by using the idx argument, like so:
plot(som, type = "property", property = getCodes(som, idx = 1), main = "a", palette.name = terrain.colors)

Partial Row Labels Heatmap - R

I was wondering if anyone knows of a package that allows partial row labeling of heatmaps. I am currently using pheatmap() to construct my heatmaps, but I can use any package that has this functionality.
I have plots with many rows of differentially expressed genes and I would like to label a subset of them. There are two main things to consider (that I can think of):
1. The placement of the text annotation depends on the height of the row. If the rows are too narrow, then the text label will be ambiguous without some sort of pointer.
2. If multiple adjacent rows are significant (i.e. will be labelled), then these will need to be offset, and again, a pointer will be needed.
Below is an example of a partial solution that really only gets maybe halfway there, but I hope illustrates what I'd like to be able to do.
set.seed(1)
require(pheatmap)
require(RColorBrewer)
require(grid)
### Data to plot
data_mat <- matrix(sample(1:10000, 300), nrow = 50, ncol = 6)
rownames(data_mat) <- paste0("Gene", 1:50)
colnames(data_mat) <- c(paste0("A", 1:3), paste0("B", 1:3))
### Set how many genes to annotate
### TRUE - make enough labels that some overlap
### FALSE - no overlap
tooMany <- TRUE

### Select a few genes to annotate
if (tooMany) {
  sigGenes_v <- paste0("Gene", c(5, 20, 26, 42, 47, 16, 28))
  newMain_v <- "Too Many Labels"
} else {
  sigGenes_v <- paste0("Gene", c(5, 20, 26, 42))
  newMain_v <- "OK Labels"
}
### Make color list
colors_v <- brewer.pal(8, "Dark2")
colors_v <- colors_v[c(1:length(sigGenes_v), 8)]
names(colors_v) <- c(sigGenes_v, "No")
annColors_lsv <- list("Sig" = colors_v)
### Column Metadata
colMeta_df <- data.frame(Treatment = c(rep("A", 3), rep("B", 3)),
                         Replicate = c(rep(1:3, 2)),
                         stringsAsFactors = F,
                         row.names = colnames(data_mat))

### Row metadata
rowMeta_df <- data.frame(Sig = rep("No", 50),
                         stringsAsFactors = F,
                         row.names = rownames(data_mat))
for (gene_v in sigGenes_v) rowMeta_df[rownames(rowMeta_df) == gene_v, "Sig"] <- gene_v

### Heatmap
heat <- pheatmap(data_mat,
                 annotation_row = rowMeta_df,
                 annotation_col = colMeta_df,
                 annotation_colors = annColors_lsv,
                 cellwidth = 10,
                 main = "Original Heat")
### Get order of genes after clustering
genesInHeatOrder_v <- heat$tree_row$labels[heat$tree_row$order]
whichSigInHeatOrder_v <- which(genesInHeatOrder_v %in% sigGenes_v)
whichSigInHeatOrderLabels_v <- genesInHeatOrder_v[whichSigInHeatOrder_v]
sigY <- 1 - (0.02 * whichSigInHeatOrder_v)
### Change title
whichMainGrob_v <- which(heat$gtable$layout$name == "main")
heat$gtable$grobs[[whichMainGrob_v]] <- textGrob(label = newMain_v,
                                                 gp = gpar(fontsize = 16))

### Remove rows
whichRowGrob_v <- which(heat$gtable$layout$name == "row_names")
heat$gtable$grobs[[whichRowGrob_v]] <- textGrob(label = whichSigInHeatOrderLabels_v,
                                                y = sigY,
                                                vjust = 1)
grid.newpage()
grid.draw(heat)
Here are a few outputs (images not reproduced here):
- original heatmap
- ok labels
- ok labels, with flags
- too many labels
- too many labels, with flags
The "with flags" outputs are the desired final results.
I just saved these as images from the RStudio plot viewer. I recognize that I could save them as PDFs at a larger size to get rid of the label overlap, but then the individual cells would be larger than I want.
Based on your code, you seem fairly comfortable with gtables & grobs. A (relatively) straightforward way to achieve the look you want is to zoom in on the row label grob, & make some changes there:
replace unwanted labels with "";
evenly spread out labels within the available space;
add line segments joining the old and new label positions.
I wrote a wrapper function for this, which works as follows:
# heat refers to the original heatmap produced from the pheatmap() function
# kept.labels should be a vector of labels you wish to show
# repel.degree is a number in the range [0, 1], controlling how much the
# labels are spread out from one another
add.flag(heat,
kept.labels = sigGenes_v,
repel.degree = 0)
add.flag(heat,
kept.labels = sigGenes_v,
repel.degree = 0.5)
add.flag(heat,
kept.labels = sigGenes_v,
repel.degree = 1)
Function (explanations in annotations):
add.flag <- function(pheatmap,
                     kept.labels,
                     repel.degree) {

  # repel.degree = number within [0, 1], which controls how much
  # space to allocate for repelling labels.
  ## repel.degree = 0: spread out labels over existing range of kept labels
  ## repel.degree = 1: spread out labels over the full y-axis

  heatmap <- pheatmap$gtable

  new.label <- heatmap$grobs[[which(heatmap$layout$name == "row_names")]]

  # keep only labels in kept.labels, replace the rest with ""
  new.label$label <- ifelse(new.label$label %in% kept.labels,
                            new.label$label, "")

  # calculate evenly spaced out y-axis positions
  repelled.y <- function(d, d.select, k = repel.degree){
    # d = vector of distances for labels
    # d.select = vector of T/F for which labels are significant

    # recursive function to get current label positions
    # (note the unit is "npc" for all components of each distance)
    strip.npc <- function(dd){
      if(!"unit.arithmetic" %in% class(dd)) {
        return(as.numeric(dd))
      }
      d1 <- strip.npc(dd$arg1)
      d2 <- strip.npc(dd$arg2)
      fn <- dd$fname
      return(lazyeval::lazy_eval(paste(d1, fn, d2)))
    }

    full.range <- sapply(seq_along(d), function(i) strip.npc(d[i]))
    selected.range <- sapply(seq_along(d[d.select]), function(i) strip.npc(d[d.select][i]))

    return(unit(seq(from = max(selected.range) + k*(max(full.range) - max(selected.range)),
                    to = min(selected.range) - k*(min(selected.range) - min(full.range)),
                    length.out = sum(d.select)),
                "npc"))
  }

  new.y.positions <- repelled.y(new.label$y,
                                d.select = new.label$label != "")
  new.flag <- segmentsGrob(x0 = new.label$x,
                           x1 = new.label$x + unit(0.15, "npc"),
                           y0 = new.label$y[new.label$label != ""],
                           y1 = new.y.positions)

  # shift position for selected labels
  new.label$x <- new.label$x + unit(0.2, "npc")
  new.label$y[new.label$label != ""] <- new.y.positions

  # add flag to heatmap
  heatmap <- gtable::gtable_add_grob(x = heatmap,
                                     grobs = new.flag,
                                     t = 4,
                                     l = 4)

  # replace label positions in heatmap
  heatmap$grobs[[which(heatmap$layout$name == "row_names")]] <- new.label

  # plot result
  grid.newpage()
  grid.draw(heatmap)

  # return a copy of the heatmap invisibly
  invisible(heatmap)
}

How to programmatically determine the column indices of principal components using FactoMineR package?

Given a data frame containing mixed variables (i.e. both categorical and continuous) like,
digits = 0:9
# set seed for reproducibility
set.seed(17)
# function to create random string
createRandString <- function(n = 5000) {
  a <- do.call(paste0, replicate(5, sample(LETTERS, n, TRUE), FALSE))
  paste0(a, sprintf("%04d", sample(9999, n, TRUE)), sample(LETTERS, n, TRUE))
}

df <- data.frame(ID = c(1:10), name = sample(letters[1:10]),
                 studLoc = sample(createRandString(10)),
                 finalmark = sample(c(0:100), 10),
                 subj1mark = sample(c(0:100), 10), subj2mark = sample(c(0:100), 10))
I perform unsupervised feature selection using the package FactoMineR
df.princomp <- FactoMineR::FAMD(df, graph = FALSE)
The variable df.princomp is a list.
Thereafter, to visualize the principal components I use fviz_screeplot() and fviz_contrib(), like so:
#library(factoextra)
factoextra::fviz_screeplot(df.princomp, addlabels = TRUE,
barfill = "gray", barcolor = "black",
ylim = c(0, 50), xlab = "Principal Component",
ylab = "Percentage of explained variance",
main = "Principal Component (PC) for mixed variables")
factoextra::fviz_contrib(df.princomp, choice = "var",
axes = 1, top = 10, sort.val = c("desc"))
which gives the following Fig1
and Fig2
Explanation of Fig1: Fig1 is a scree plot. A scree plot is a simple line-segment plot that shows the fraction of the total variance in the data explained by each Principal Component (PC). We can see that the first three PCs collectively account for 43.8% of the total variance. The question then naturally arises: "What are these variables?" This is shown in Fig2.
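(The percentages behind the scree plot can also be read programmatically from the FAMD result rather than off the figure; a sketch:)
# eigenvalue / variance table behind the scree plot
df.princomp$eig
# or, equivalently, via factoextra
factoextra::get_eigenvalue(df.princomp)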
Explanation of Fig2: This figure visualizes the contribution of rows/columns from the results of the Principal Component Analysis (PCA). From here I can see that the variables name, studLoc and finalmark are the most important ones for further analysis.
Further analysis, where I'm stuck: to derive the contribution of the aforementioned variables name, studLoc and finalmark, I use the principal component object df.princomp (see above) like df.princomp$quanti.var$contrib[,4] and df.princomp$quali.var$contrib[,2:3].
I have to manually specify the column indices [,2:3] and [,4].
What I want: how can I assign the column indices dynamically, so that I do not have to hard-code indices like [,2:3] when working with the list df.princomp?
I've already looked at the following similar questions (1, 2, 3 and 4) but cannot find a solution. Any help or suggestions would be appreciated.
Not sure if my interpretation of your question is correct; apologies if not. From what I gather, you are using PCA as an initial tool to show you which variables are the most important in explaining the dataset. You then want to go back to your original data, select these variables quickly without manually coding them each time, and use them for some other analysis.
If this is correct, then below I save the data from the contribution plot, keep the variables with the greatest contribution, and use that result to create a new data frame with these variables alone.
digits = 0:9
# set seed for reproducibility
set.seed(17)
# function to create random string
createRandString <- function(n = 5000) {
  a <- do.call(paste0, replicate(5, sample(LETTERS, n, TRUE), FALSE))
  paste0(a, sprintf("%04d", sample(9999, n, TRUE)), sample(LETTERS, n, TRUE))
}

df <- data.frame(ID = c(1:10), name = sample(letters[1:10]),
                 studLoc = sample(createRandString(10)),
                 finalmark = sample(c(0:100), 10),
                 subj1mark = sample(c(0:100), 10), subj2mark = sample(c(0:100), 10))
df.princomp <- FactoMineR::FAMD(df, graph = FALSE)
factoextra::fviz_screeplot(df.princomp, addlabels = TRUE,
barfill = "gray", barcolor = "black",
ylim = c(0, 50), xlab = "Principal Component",
ylab = "Percentage of explained variance",
main = "Principal Component (PC) for mixed variables")
#find the top contributing variables to the overall variation in the dataset
#here I am choosing the top 10 variables (although we only have 6 in our df).
#note you can specify which axes you want to look at with axes=, you can even do axes=c(1,2)
f<-factoextra::fviz_contrib(df.princomp, choice = "var",
axes = c(1), top = 10, sort.val = c("desc"))
#save data from contribution plot
dat<-f$data
#keep the variables whose contribution is higher than, say, 20%
r<-rownames(dat[dat$contrib>20,])
#extract these from your original data frame into a new data frame for further analysis
new<-df[r]
new
#finalmark name studLoc
#1 53 b POTYQ0002N
#2 73 i LWMTW1195I
#3 95 d VTUGO1685F
#4 39 f YCGGS5755N
#5 97 c GOSWE3283C
#6 58 g APBQD6181U
#7 67 a VUJOG1460V
#8 64 h YXOGP1897F
#9 15 j NFUOB6042V
#10 81 e QYTHG0783G
Based on your comment, where you said you wanted to 'Find variables with value greater than 5 in Dim.1 AND Dim.2 and save these variables to a new data frame', I would do this:
#top contributors to both Dim 1 and 2
f<-factoextra::fviz_contrib(df.princomp, choice = "var",
axes = c(1,2), top = 10, sort.val = c("desc"))
#save data from contribution plot
dat<-f$data
#keep the variables whose contribution is higher than 5%
r<-rownames(dat[dat$contrib>5,])
#extract these from your original data frame into a new data frame for further analysis
new<-df[r]
new
(This keeps all the original variables in our new data frame since they all contributed more than 5% to the total variance)
There are a lot of ways to extract the contributions of individual variables to PCs. For numeric input, one can run a PCA with prcomp and look at $rotation (I spoke too soon and forgot you've got factors here, so prcomp won't work directly). Since you are using factoextra::fviz_contrib, it makes sense to check how that function extracts this information under the hood. Type factoextra::fviz_contrib at the console and read the function:
> factoextra::fviz_contrib
function (X, choice = c("row", "col", "var", "ind", "quanti.var",
"quali.var", "group", "partial.axes"), axes = 1, fill = "steelblue",
color = "steelblue", sort.val = c("desc", "asc", "none"),
top = Inf, xtickslab.rt = 45, ggtheme = theme_minimal(),
...)
{
sort.val <- match.arg(sort.val)
choice = match.arg(choice)
title <- .build_title(choice[1], "Contribution", axes)
dd <- facto_summarize(X, element = choice, result = "contrib",
axes = axes)
contrib <- dd$contrib
names(contrib) <- rownames(dd)
theo_contrib <- 100/length(contrib)
if (length(axes) > 1) {
eig <- get_eigenvalue(X)[axes, 1]
theo_contrib <- sum(theo_contrib * eig)/sum(eig)
}
df <- data.frame(name = factor(names(contrib), levels = names(contrib)),
contrib = contrib)
if (choice == "quanti.var") {
df$Groups <- .get_quanti_var_groups(X)
if (missing(fill))
fill <- "Groups"
if (missing(color))
color <- "Groups"
}
p <- ggpubr::ggbarplot(df, x = "name", y = "contrib", fill = fill,
color = color, sort.val = sort.val, top = top, main = title,
xlab = FALSE, ylab = "Contributions (%)", xtickslab.rt = xtickslab.rt,
ggtheme = ggtheme, sort.by.groups = FALSE, ...) + geom_hline(yintercept = theo_contrib,
linetype = 2, color = "red")
p
}
<environment: namespace:factoextra>
So it's really just calling facto_summarize from the same package. By analogy you can do the same thing, simply call:
> dd <- factoextra::facto_summarize(df.princomp, element = "var", result = "contrib", axes = 1)
> dd
name contrib
ID ID 0.9924561
finalmark finalmark 21.4149175
subj1mark subj1mark 7.1874438
subj2mark subj2mark 16.6831560
name name 26.8610132
studLoc studLoc 26.8610132
And that's the table corresponding to your figure 2. For PC2 use axes = 2 and so on.
Regarding "how to programmatically determine the column indices of the PCs", I'm not 100% sure I understand what you want, but if you just want to say for column "finalmark", grab its contribution to PC3 you can do the following:
library(tidyverse)
# make a tidy table of all column names in the original df with their contributions to all PCs
contribution_df <- map_df(set_names(1:5), ~factoextra::facto_summarize(df.princomp, element = "var", result = "contrib", axes = .x), .id = "PC")
# get the contribution of column 'finalmark' by name
contribution_df %>%
filter(name == "finalmark")
# get the contribution of column 'finalmark' to PC3
contribution_df %>%
filter(name == "finalmark" & PC == 3)
# or, just the numeric value of contribution
filter(contribution_df, name == "finalmark" & PC == 3)$contrib
BTW I think ID in your example is treated as numeric instead of factor, but since it's just an example I'm not bothering with it.
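(If ID is indeed meant to be categorical, a minimal fix, shown here only as a sketch, would be to convert it before refitting:)
df$ID <- as.factor(df$ID)                            # treat ID as categorical
df.princomp <- FactoMineR::FAMD(df, graph = FALSE)   # refit with ID as a factor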

Print multiple corrplots (R) in the same figure

I am working with corrplot and, following the example here, Plotting multiple corrplots (R) in the same graph, I can display multiple corrplots in the same graph. However, I would like to save the result to a TIFF file, and because I am working with a loop I don't know how to achieve this. See the code below.
I loop through several blocks of my experiments (Block1, Block2) and plot the corrplots one next to another. This works, but I don't understand how to direct the output to a TIFF file. In particular, where do I put
tiff(file = 'Figure4Plots.tiff', width = 12, height = 12, units = "in", res = 300) and dev.off()? I tried placing them after dflist and in several other spots, but it does not work. Thanks!
dflist <- c('Block1', 'Block2')
par(mfrow = c(2, 2))
for (i in seq_along(dflist)) {
  # Subset different Blocks
  dataCorr <- subset(total, (block == dflist[i]))
  p.mat <- cor.mtest(dataCorr)
  M <- cor(dataCorr)
  col <- colorRampPalette(c("#BB4444", "#EE9988", "#FFFFFF", "#77AADD", "#4477AA"))
  corrplot(M, method = "color", col = col(200),
           type = "upper", title = title,
           addCoef.col = "black", # Add coefficient of correlation
           tl.col = "red", tl.srt = 45, # Text label color and rotation
           # Combine with significance
           p.mat = p.mat, sig.level = 0.05, insig = "blank",
           diag = TRUE,
           mar = c(0, 0, 1, 0))
}
I don't have your original data, and I'm not familiar with the corrplot package, so I've made some dummy data and used just a simple plot() function. Unless there's something particular about the corrplot() function, you should be able to enclose most of your code in a tiff() block, like this:
dflist <- c('Block1', 'Block2', 'Block3', 'Block4')
total <- data.frame(block = sample(dflist, size = 100, replace = TRUE),
                    x = runif(100), y = runif(100) * 2)

tiff(file = 'Figure4Plots.tiff', width = 12, height = 12, units = "cm", res = 72)
par(mfrow = c(2, 2))
for (thisBlock in dflist) {
  # Subset different Blocks
  dataCorr <- subset(total, (block == thisBlock))
  dataCorr <- dataCorr[, c('x', 'y')]
  plot(dataCorr)
}
dev.off()
This code produces Figure4Plots.tiff:
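In terms of the original loop, the pattern is the same: open the device before par(mfrow = ...), run the loop unchanged, then close the device. This is only a sketch; the loop body is the corrplot code from the question.
tiff(file = 'Figure4Plots.tiff', width = 12, height = 12, units = "in", res = 300)
par(mfrow = c(2, 2))
for (i in seq_along(dflist)) {
  ## ... the subset(), cor.mtest(), cor() and corrplot() calls from the
  ## question go here, unchanged ...
}
dev.off()  # closes the device and writes Figure4Plots.tiff to disk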
