FactorMiner plot.HCPC function for cluster labeling - r

This is the function that is part of FactorMiner package
https://github.com/cran/FactoMineR/blob/master/R/plot.HCPC.R
As an example this is the code I ran
res.pca <- PCA(iris[, -5], scale = TRUE)
hc <- HCPC(res.pca, nb.clust=-1,)
plot.HCPC(hc, choice="3D.map", angle=60)
hc$call$X$clust <- factor(hc$call$X$clust, levels = unique(hc$call$X$clust))
plot(hc, choice="map")
The difference is when i run this hc$call$X$clust <- factor(hc$call$X$clust, levels = unique(hc$call$X$clust))
before plot.HCPC this doesn't change the annotation in the figure but when I do the same thing before I ran this plot(hc, choice="map") it is reflected in the final output.
When i see the plot.HCPC function this is the line of the code that does embed the cluster info into the figure
for(i in 1:nb.clust) leg=c(leg, paste("cluster",levs[i]," ", sep=" "))
legend("topleft", leg, text.col=as.numeric(levels(X$clust)),cex=0.8)
My question I have worked with small function where I understand when i edit or modify which one goes where and does what here in this case its a complicated function at least to me so Im not sure how do I modify that part and get what I would like to see.
I would like to see in case of my 3D dendrogram each of the cluster are labelled with group the way we can do in complexheatmap where we can annotate that are in row or column with a color code so it wont matter what the order in the data-frame we can still identify(it's just visual thing I know but I would like to learn how to modify these)

Related

Change of colors in compare.matrix command in r

I'm trying to change the colors for the compare.matrix command in r, but the error is always the same:
Error in image.default(x = mids, y = mids, z = mdata, col = c(heat.colors(10)[10:1]), :
formal argument "col" matched by multiple actual arguments
My code is very simple:
compare.matrix(current,ech_b1,nbins=40)
and some of my attempts are:
compare.matrix(current,ech_b1,nbins=40,col=c(grey.colors(5)))
compare.matrix(current,ech_b1,nbins=40,col=c(grey.colors(10)[10:1]))
Assuming you're using compare.matrix() from the SDMTools package, the color arguments appear to be hard-coded into the function, so you'll need to redefine the function in order to make them flexible:
# this shows you the code in the console
SDMTools::compare.matrix
function(x,y,nbins,...){
#---- preceding code snipped ----#
suppressWarnings(image(x=mids, y=mids, z=mdata, col=c(heat.colors(10)[10:1]),...))
#overlay contours
contour(x=mids, y=mids, z=mdata, col="black", lty="solid", add=TRUE,...)
}
So you can make a new one like so, but bummer, there are two functions using the ellipsis that have a col argument predefined. If you'll only be using extra args to image() and not to contour(), this is cheap and easy.
my.compare.matrix <- function(x,y,nbins,...){
#---- preceding code snipped ----#
suppressWarnings(image(x=mids, y=mids, z=mdata,...))
#overlay contours
contour(x=mids, y=mids, z=mdata, col="black", lty="solid", add=TRUE)
}
If, however, you want to use ... for both internal calls, then the only way I know of to avoid confusion about redundant argument names is to do something like:
my.compare.matrix <- function(x,y,nbins,
image.args = list(col=c(heat.colors(10)[10:1])),
contour.args = list(col="black", lty="solid")){
#---- preceding code snipped ----#
contour.args[[x]] <- contour.args[[y]] <- image.args[[x]] <- image.args[[y]] <- mids
contour.args[[z]] <- image.args[[z]] <- mdata
suppressWarnings(do.call(image, image.args))
#overlay contours
do.call(contour, contour.args)
}
Decomposing this change: instead of ... make a named list of arguments, where the previous hard codes are now defaults. You can then change these items by renaming them in the list or adding to the list. This could be more elegant on the user side, but it gets the job done. Both of the above modifications are untested, but should get you there, and this is all prefaced by my above comment. There may be some other problem that cannot be detected by SO Samaritans because you didn't specify the package or the data.

New gplots update in R cannot find function distfun in heatmap.2

I have a bit of R-code to make a heatmap from a correlation matrix, which worked the last time I used it (prior to the 2013 Oct 17 update of gplots; after updating to R Version 3.0.2). This makes me think that something changed in the most recent gplots update, but I can not figure out what.
What used to present a nice plot now gives me this error:
" Error in hclustfun(distfun(x)) : could not find function "distfun" "
and won't plot anything. Below is the code to reproduce the plot (heavily commented as I was using it to teach an undergrad how to use heatmaps for a project). I tried adding the last line to explicitly set the functions, but it didn't help resolve the problem.
EDIT: I changed the last line of code to read:
,distfun =function(c) {as.dist(1-c,upper=FALSE)}, hclustfun=hclust)
and it worked. When I used just "dist=as.dist" I got a plot, but it wasn't sorted right, and several of the dendrogram branches didn't connect to the tree. Not sure what happened, or why this is working, but it appears to be.
Any help would be greatly appreciated.
Thanks in advance,
library(gplots)
set.seed(12345)
randData <- as.data.frame(matrix(rnorm(600),ncol=6))
randDataCorrs <- randData+(rnorm(600))
names(randDataCorrs) <- paste(names(randDataCorrs),"_c",sep="")
randDataExtra <- cbind(randData,randDataCorrs)
randDataExtraMatrix <- cor(randDataExtra)
heatmap.2(randDataExtraMatrix, # sets the correlation matrix to use
symm=TRUE, #tells whether it is symmetrical or not
main= "Correlation matrix\nof Random Data Cor", # Names plot
xlab= "x groups",ylab="", # Sets the x and y labels
scale="none", # Tells it not to scale the data
col=redblue(256),# Sets the colors (can be manual, see below)
trace="none", # tells it not to add a trace
symkey=TRUE,symbreaks=TRUE, # Tells it to keep things symmetric around 0
density.info = "none"#) # can be "histogram" if you want a hist of your corr values here
#,distfun=dist, hclustfun=hclust)
,distfun =function(c) {as.dist(1-c,upper=FALSE)}, hclustfun=hclust) # new last line
I had the same error, then I noticed that I had made a variable called dist, which is the default call for distfun= dist. I renamed the variable and then everything ran fine. You likely made the same error, as your new code is working since you have altered the default call of distfun.

Trying to determine why my heatmap made using heatmap.2 and using breaks in R is not symmetrical

I am trying to cluster a protein dna interaction dataset, and draw a heatmap using heatmap.2 from the R package gplots. My matrix is symmetrical.
Here is a copy of the data-set I am using after it is run through pearson:DataSet
Here is the complete process that I am following to generate these graphs: Generate a distance matrix using some correlation in my case pearson, then take that matrix and pass it to R and run the following code on it:
library(RColorBrewer);
library(gplots);
library(MASS);
args <- commandArgs(TRUE);
matrix_a <- read.table(args[1], sep='\t', header=T, row.names=1);
mtscaled <- as.matrix(scale(matrix_a))
# location <- args[2];
# setwd(args[2]);
pdf("result.pdf", pointsize = 15, width = 18, height = 18)
mycol <- c("blue","white","red")
my.breaks <- c(seq(-5, -.6, length.out=6),seq(-.5999999, .1, length.out=4),seq(.100009,5, length.out=7))
#colors <- colorpanel(75,"midnightblue","mediumseagreen","yellow")
result <- heatmap.2(mtscaled, Rowv=T, scale='none', dendrogram="row", symm = T, col=bluered(16), breaks=my.breaks)
dev.off()
The issue I am having is once I use breaks to help me control the color separation the heatmap no longer looks symmetrical.
Here is the heatmap before I use breaks, as you can see the heatmap looks symmetrical:
Here is the heatmap when breaks are used:
I have played with the cutoff's for the sequences to make sure for instance one sequence does not end exactly where the other begins, but I am not able to solve this problem. I would like to use the breaks to help bring out the clusters more.
Here is an example of what it should look like, this image was made using cluster maker:
I don't expect it to look identical to that, but I would like it if my heatmap is more symmetrical and I had better definition in terms of the clusters. The image was created using the same data.
After some investigating I noticed was that after running my matrix through heatmap, or heatmap.2 the values were changing, for example the interaction taken from the provided data set of
Pacdh-2
and
pegg-2
gave a value of 0.0250313 before the matrix was sent to heatmap.
After that I looked at the matrix values using result$carpet and the values were then
-0.224333135
-1.09805379
for the two interactions
So then I decided to reorder the original matrix based on the dendrogram from the clustered matrix so that I was sure that the values would be the same. I used the following stack overflow question for help:
Order of rows in heatmap?
Here is the code used for that:
rowInd <- rev(order.dendrogram(result$rowDendrogram))
colInd <- rowInd
data_ordered <- matrix_a[rowInd, colInd]
I then used another program "matrix2png" to draw the heatmap:
I still have to play around with the colors but at least now the heatmap is symmetrical and clustered.
Looking into it even more the issue seems to be that I was running scale(matrix_a) when I change my code to just be mtscaled <- as.matrix(matrix_a) the result now looks symmetrical.
I'm certainly not the person to attempt reproducing and testing this from that strange data object without code that would read it properly, but here's an idea:
..., col=bluered(20)[4:20], ...
Here's another though which should return the full rand of red which tha above strategy would not:
shift.BR<- colorRamp(c("blue","white", "red"), bias=0.5 )((1:16)/16)
heatmap.2( ...., col=rgb(shift.BR, maxColorValue=255), .... )
Or you can use this vector:
> rgb(shift.BR, maxColorValue=255)
[1] "#1616FF" "#2D2DFF" "#4343FF" "#5A5AFF" "#7070FF" "#8787FF" "#9D9DFF" "#B4B4FF" "#CACAFF" "#E1E1FF" "#F7F7FF"
[12] "#FFD9D9" "#FFA3A3" "#FF6C6C" "#FF3636" "#FF0000"
There was a somewhat similar question (also today) that was asking for a blue to red solution for a set of values from -1 to 3 with white at the center. This it the code and output for that question:
test <- seq(-1,3, len=20)
shift.BR <- colorRamp(c("blue","white", "red"), bias=2)((1:20)/20)
tpal <- rgb(shift.BR, maxColorValue=255)
barplot(test,col = tpal)
(But that would seem to be the wrong direction for the bias in your situation.)

Clearing plotted points in R

I am trying to use the animation package to generate an "evolving" plot of points on a map. The map is generated from shapefiles (from the readShapeSpatial/readShapeLines functions).
The problem is when it's plotted in a for loop, the result is additive, whereas the ideal result is to have it evolve.
Are there ways of using par() that I am missing?
My question is: is there a way to clear just the points ploted from the points function
and not clearing the entire figure thus not having to regraph the shapefiles?
in case someone wants to see code:
# plotting underlying map
newyork <- readShapeSpatial('nycpolygon.shp')
routes <- readShapeLines('nyc.shp')
par(bg="grey25")
plot(newyork, lwd=2, col ="lightgray")
plot(routes,add=TRUE,lwd=0.1,col="lightslategrey")
# plotting points and save to GIF
ani.options(interval=.05)
saveGIF({
par(bg="grey25")
# Begin loop
for (i in 13:44){
infile <-paste("Week",i,".csv",sep='')
mydata <-read.csv(file = infile, header = TRUE, sep=",")
plotvar <- Var$Para
nclr <- 4
plotclr <-brewer.pal(nclr,"RdPu")
class<- classIntervals(plotvar,nclr,style = "pretty")
colcode <- findColours(class,plotclr)
points(Var$Lon,Var$Lat,col=colcode)
}
})
If you can accept a residual shadow or halo of ink, you can over-plot with color ="white" or == to your background choices. We cannot access your shape file but you can try it out by adding this line:
points(Var$Lon, Var$Lat, col="grey25")
It may leave gaps in other previously plotted figures or boundaries, because it's definitely not object-oriented. The lattice and ggplot2 graphics models are more object oriented, so if you want to post a reproducible example, that might be an alternate path to "moving" forward. I seem to remember that the rgl package has animation options in its repetoire.

Multiple plots with high-level plotting functions, especially plot.rqs()

I am trying to plot two regression summaries side-by-side with one centered title. Each regression summary is generated by plot.rqs() and amounts to a set of 9 plots.
I've tried using par(mfrow=c(1,2)) already, but as I learnt from Paul Murrel's (2006) book, high-level functions like plot.rqs() or pairs() save the graphics state before drawing and then restore the graphics state once completed, so that pre-emptive calls to par() or layout() can't help me. plot.rqs() doesn't have a 'panel' function either.
It seems that the only way to achieve the result is to modify the plot.rqs() function to get a new function, say modified.plot.rqs(), and then run
par(mfrow=c(1,2))
modified.plot.rqs(summary(fit1))
modified.plot.rqs(summary(fit2))
par(mfrow=c(1,1))
From there I might be able to work out how to add an overall title to the image using layout(). Does anyone know how to create a modified.plot.rqs() function that could be used in this way?
Thanks
You can patch a function as follows:
use dput and capture.output to retrieve
the code of the function, as a string;
change it as you want (here, I just replace each occurrence of par
with a function that does nothing);
finally evaluate the result to produce a new function.
library(quantreg)
a <- capture.output(dput(plot.summary.rqs))
b <- gsub("^\\s*par\\(", "nop(", a)
nop <- function(...) {}
my.plot.summary.rqs <- eval(parse(text=b))
First we generate an example object, fm . Then we copy plot.rqs and use trace on the copy to insert par <- list at top effectively nullifying any use of par within the function. Then we do the same with plot.summary.rqs. Finally we test it out with our own par:
library(quantreg)
example(plot.rqs) # fm to use in example
# plot.rqs
plot.rqs <- quantreg::plot.rqs
trace("plot.rqs", quote(par <- list), print = FALSE)
# plot.summary.rqs
plot.summary.rqs <- quantreg::plot.summary.rqs
trace("plot.summary.rqs", quote(par <- list), print = FALSE)
# test it out
op <- par(mfrow = c(2, 2))
plot(summary(fm))
plot(fm)
title("My Plots", outer = TRUE, line = -1)
par(op)
EDIT: added plot.summary.rqs.

Resources