I have mixed-type data containing numeric and categorical attributes, to which I am planning to apply clustering algorithms.
As a first step, I produced a distance matrix using the daisy() function with the Gower distance measure. I have displayed the distance matrix using the heatmap() and levelplot() functions in R.
It seems as if there is strong similarity between some of the objects in my data and I want to check some of the similar/dissimilar objects to satisfy myself that the measure is working well on my data.
How do I select the similar/dissimilar objects from the heatmap and link them to the original data set to be able to evaluate them?
This is how I plot my heatmap in R. IDX is my distance matrix.
library(lattice) #provides levelplot()
new.palette=colorRampPalette(c("black","yellow","#007FFF","white"),space="rgb")
levelplot(IDX_as[1:ncol(IDX_as),ncol(IDX_as):1],col.regions=new.palette(20))
quartz(width=7,height=6) #make a new quartz window of a given size
par(mar=c(2,3,2,1)) #set the margins of the figures to be smaller than default
layout(matrix(c(1,2),1,2,byrow=TRUE),widths=c(7,1)) #set the layout of the quartz window. This will create two plotting regions, with width ratio of 7 to 1
image(IDX_as[1:ncol(IDX_as),ncol(IDX_as):1],col=new.palette(20),xaxt="n",yaxt="n") #plot a heat map matrix with no tick marks or axis labels
axis(1,at=seq(0,1,length=20),labels=rep("",20)) #draw in tick marks
axis(2,at=seq(0,1,length=20),labels=rep("",20))
#adding a color legend
s=seq(min(IDX_as),max(IDX_as),length=20) #20 values between the minimum and maximum of IDX_as
l=matrix(s,ncol=length(s),byrow=TRUE) #coerce it into a horizontal matrix
image(y=s,z=l,col=new.palette(20),ylim=c(min(IDX_as),max(IDX_as)),xaxt="n",las=1) #plot a one-column heat map as the legend
heatmap(IDX_as,symm=TRUE,col=new.palette(20))
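One possible way to link cells of the heatmap back to the data is to skip the picture and rank the pairs directly from the distance matrix. The following is only a sketch under assumptions: `toy` is a made-up stand-in for the real data set, `extreme_pairs` is a hypothetical helper name, and the distance matrix is assumed to come from daisy() converted with as.matrix().

```r
library(cluster)  # provides daisy()

# Toy mixed-type data standing in for the real data set (hypothetical)
toy <- data.frame(
  height = c(1.2, 1.3, 5.0, 5.3, 3.0),
  group  = factor(c("a", "a", "b", "b", "a"))
)

# Gower dissimilarities as a plain square matrix
d_mat <- as.matrix(daisy(toy, metric = "gower"))

# Return the k closest (or farthest) pairs as row indices into the data
extreme_pairs <- function(d_mat, k = 3, most_similar = TRUE) {
  idx <- which(upper.tri(d_mat), arr.ind = TRUE)  # each pair once, no diagonal
  out <- data.frame(i = idx[, "row"], j = idx[, "col"], dist = d_mat[idx])
  out <- out[order(out$dist, decreasing = !most_similar), ]
  head(out, k)
}

closest  <- extreme_pairs(d_mat, k = 2)
farthest <- extreme_pairs(d_mat, k = 2, most_similar = FALSE)

# Look up the original records behind the closest pair
toy[c(closest$i[1], closest$j[1]), ]
```

The `i` and `j` columns are row numbers of the original data frame, so the records can be compared side by side. Note that heatmap() reorders rows and columns according to its dendrogram; its (invisible) return value contains `rowInd` and `colInd`, which give that permutation if a position in the plot has to be mapped back to a row number.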
Related
I have made ordination plots of microbiome data using the R phyloseq functions ordinate and plot_ordination with a phyloseq object and a previously calculated distance matrix (unweighted UniFrac distances) as inputs. I would like to add some arrows that indicate which species relative abundances mainly drive the distance between the samples along the axes.
Adding scores using the function scores or specscores.dbrda has not worked. Is there another way to add these arrows/vectors?
Here's the code I used:
MDS <- ordinate(physeqobject, "MDS", distance=unifracmatrix)
plot <- plot_ordination(physeqobject, MDS, color = "variable1")
speciesscores <- scores(MDS, display = "species")
The first part works fine, but the scores function returns this error:
Error in scores.default(MDS, display = "species") :
cannot find scores
I also tried it without a separate distance matrix with the (default) Bray-Curtis distances calculated within ordinate, but I still get the same error:
MDS <- ordinate(physeqobject, "MDS")
I am using the contour function from Julia's Plots to plot level curves. I want to extract a list of x coordinates and a list of y coordinates corresponding to the level curves from the plot, e.g., something like this. Is there a way to do it in Julia?
Not for contour, unfortunately. For most plot types you can extract the input data of, e.g., the first series in the first subplot with p[1][1][:x]. But for contour in particular, Plots does not generate the level curves; it simply passes the matrix to the backend, which then does the computation and display.
I want to visualize a data transformation. My data is in 2 dimensions and the transformation is in 3 dimensions. I am interested in seeing how each data point is transformed into the higher dimension, so I overlaid two scatter plots using the scatterplot3d package. The 2D data has its third coordinate set to 0 for plotting purposes. To keep track of each data point, I want to assign it a unique color and shape in 2D, and give the transformed point the same color and shape in 3D. I was able to assign unique colors but not shapes, since the available shapes are limited (my n=50). Any ideas for making this visualization better? Here is my reproducible example.
install.packages("scatterplot3d")
library(scatterplot3d)
set.seed(20)
# example
n<-50 # number of data points
M<-cbind(a=runif(n),b=runif(n),c=rep(0,n))
N<-cbind(d=rnorm(50),e=rnorm(50),f=rnorm(50))
s3d<-scatterplot3d(N[,1],N[,2],N[,3],color=rainbow(n),type="p",pch=0
,xlab="x",ylab="y ",zlab="z")
s3d$points3d(M[,1],M[,2],M[,3],col=rainbow(n),type="p",pch=15)
Is it possible to generate a heatmap taking into consideration both the color and the transparency, with these two parameters given from two different matrices (matrix 1 defines color, matrix 2 defines alpha)?
A little more information on what I'm after:
I have successfully used R and the heatmap.2 function in the gplots package to generate heatmaps, in this case to visualize miRNA interactions. Here, what I want to show is the probability that a particular nucleotide along the typical 20-24 nucleotides of the miRNA is engaged in target pairing. My heatmap matrix consists of miRNAs (rows) and positions 1-24 (columns), with a numeric pairing probability in each cell. An example would be changing the alpha parameter of the color determined by the matrix values, such that white=no pairing and dark red=high pairing.
The heatmap.2 function works great for a single such plot, but I would now like to take in overlap information from two different species. Thus, I would need my heatmap to basically consider two matrices:
1) A matrix with the degree of species overlap, e.g. ranging from red-purple-blue for species1-only to species1+2 to species2-only.
2) A matrix with the average degree of pairing, e.g. visualized by the alpha parameter going from a weak-to-strong average pairing (whatever the color) at a given position in matrix 1.
I have tried to use the principles from this post:
Place 1 heatmap on another with transparency in R
But haven't been able to apply its suggestions to my own question.
Thanks in advance!
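One approach that might work, sketched here in base R rather than with heatmap.2: build one color per cell from the two matrices and draw the result as a raster. Everything named below is an assumption for illustration: `two_matrix_colours` is a hypothetical helper, both matrices are assumed to be already scaled to [0, 1], and the red-purple-blue palette simply mirrors the description above.

```r
# Combine two equally sized matrices into one matrix of "#RRGGBBAA" colours.
# hue_mat   : values in [0, 1], mapped along the palette (species overlap)
# alpha_mat : values in [0, 1], mapped to opacity (average pairing)
two_matrix_colours <- function(hue_mat, alpha_mat,
                               palette = c("red", "purple", "blue")) {
  stopifnot(all(dim(hue_mat) == dim(alpha_mat)))
  ramp <- colorRamp(palette)              # maps [0, 1] to RGB in 0..255
  rgb_vals <- ramp(as.vector(hue_mat))
  cols <- rgb(rgb_vals[, 1], rgb_vals[, 2], rgb_vals[, 3],
              alpha = as.vector(alpha_mat) * 255, maxColorValue = 255)
  matrix(cols, nrow = nrow(hue_mat))
}

# Made-up 2 x 3 example: rows are miRNAs, columns are positions
overlap <- matrix(c(0, 1, 0.5, 0.5, 1, 0), nrow = 2)
pairing <- matrix(c(1, 0, 0.5, 1,   1, 0.2), nrow = 2)
cols <- two_matrix_colours(overlap, pairing)

# Draw the colour matrix; row 1 appears at the top
plot.new()
plot.window(xlim = c(0, ncol(cols)), ylim = c(0, nrow(cols)))
rasterImage(as.raster(cols), 0, 0, ncol(cols), nrow(cols), interpolate = FALSE)
```

This gives up the dendrograms and color key that heatmap.2 draws, so row ordering and a legend would have to be handled separately.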
Is there a way to calculate the area of a region in a filled-contour-like plot in R?
This image is just an example and not representative of my data. But I would, for example, want to calculate all areas above 1600. My data is in matrix form, with a speed in each cell. The columns represent evenly spaced time intervals; however, the rows (my y axis) each represent a particular length and are not evenly spaced. The filled contour plot would be interpolating. I would like to bypass creating a color plot altogether and just find the area of, say, speeds less than 35 mph.
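A rough way to get such an area without plotting, offered as a sketch rather than a definitive method: treat every matrix cell as a rectangle whose edges lie halfway between neighboring grid coordinates, then sum the rectangles whose value meets the condition. This is a piecewise-constant approximation and will not exactly match the interpolation a filled contour plot performs. The names `cell_widths` and `region_area`, and the example numbers, are hypothetical.

```r
# Width of the interval each grid coordinate "owns":
# boundaries sit at the midpoints between neighbouring coordinates
cell_widths <- function(coord) {
  mids  <- (head(coord, -1) + tail(coord, -1)) / 2
  edges <- c(coord[1], mids, coord[length(coord)])
  diff(edges)
}

# z has one row per y value and one column per x value;
# keep is a function returning TRUE for cells that should be counted
region_area <- function(z, y, x, keep) {
  cell_area <- outer(cell_widths(y), cell_widths(x))
  sum(cell_area[keep(z)])
}

# Made-up example: unevenly spaced lengths (rows), evenly spaced times (columns)
y <- c(0, 1, 3, 7)
x <- 0:4
speed <- matrix(50, nrow = length(y), ncol = length(x))
speed[1:2, ] <- 30   # the two shortest lengths are slow at all times

area_slow  <- region_area(speed, y, x, function(v) v < 35)
area_total <- region_area(speed, y, x, function(v) TRUE)
```

The result is in units of length times time. Missing values in the matrix are not handled here and would need to be dealt with before calling the function.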