I am trying to run the function pvclust using Simpson dissimilarities instead of one of the default distances. Can I supply a distance function to pvclust (via method.dist)? I already have my Simpson dissimilarity index as a dist object from the package betapart, and that is the one I want to use in pvclust.
Thanks!
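A minimal sketch of one possible approach, assuming a pvclust version whose method.dist argument accepts a function that takes the data matrix and returns a dist object. A precomputed dist object cannot be reused directly, because pvclust recomputes the distances for every bootstrap replicate. The function name simpson_dist and the toy matrix comm_t are hypothetical. pvclust clusters columns, so sites are placed in columns here.

```r
# Simpson dissimilarity between the COLUMNS of a presence/absence matrix:
#   beta_sim = min(b, c) / (a + min(b, c)) = 1 - a / min(n_i, n_j)
# where a = shared species and n_i, n_j = species richness of the two sites.
# Note: a column with no species at all yields NaN (division by zero).
simpson_dist <- function(x) {
  x <- (as.matrix(x) > 0) * 1                  # coerce to 0/1 presence/absence
  shared   <- crossprod(x)                     # a: shared species per site pair
  richness <- colSums(x)                       # n: species count per site
  smaller  <- outer(richness, richness, pmin)  # min(n_i, n_j)
  d <- as.dist(1 - shared / smaller)
  attr(d, "method") <- "simpson"
  d
}

# Toy data: 4 species (rows) x 3 sites (columns)
comm_t <- cbind(A = c(1, 1, 0, 0),
                B = c(1, 1, 1, 0),
                C = c(0, 0, 1, 1))
d <- simpson_dist(comm_t)
print(d)
```

With pvclust installed, the function could then be passed as pvclust(comm_t, method.dist = simpson_dist, method.hclust = "average"), using real data with sites in columns (for example the transpose of the site-by-species table given to betapart).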
Related
I want to do agglomerative clustering in R, but I want to use my own distance as a linkage method instead of the predefined ones. How can I embed my own distance as a linkage method using the function hclust in R?
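One way to read this question: hclust() takes a dist object as its first argument, so a custom pairwise distance can be computed up front and converted with as.dist(), while the linkage itself is still chosen through the method argument. The sketch below assumes that reading. The function my_dist is a hypothetical stand-in (it happens to be the Manhattan distance, so the result can be checked against dist()).

```r
# Hypothetical custom distance between two numeric vectors
my_dist <- function(u, v) sum(abs(u - v))

x <- as.matrix(iris[1:10, 1:4])   # small example data set
n <- nrow(x)

# Fill a symmetric matrix with all pairwise custom distances
d <- matrix(0, n, n, dimnames = list(rownames(x), rownames(x)))
for (i in seq_len(n - 1)) {
  for (j in (i + 1):n) {
    d[i, j] <- d[j, i] <- my_dist(x[i, ], x[j, ])
  }
}

# Convert to a dist object and cluster with any built-in linkage
hc <- hclust(as.dist(d), method = "average")
print(hc$order)
```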
I used the R package corrplot to visualize the correlation matrix of my data, with the variables clustered via the built-in hclust option.
The invocation of the command was like this (plus various arrangements of titles, axes etc):
corrplot(Rbas,type="upper",order="hclust",method="ellipse")
But now I am performing analyses and visualizations with other packages, and the question of compatibility of results arose. In particular, I have to repeat the clustering of the correlation matrix manually. The corrplot documentation leaves one point obscure: what dissimilarity measure does corrplot use behind its reasonable defaults? Is it 1-|corr|, sqrt(1-corr^2), or something else? The literature offers multiple choices, for example, as described in this article
Update to answer my own question. I tried a guess, using the dissimilarity measure 1-corr. That is, I coded (Rbas is the correlation matrix):
dissim1<-1-Rbas
dist1<-as.dist(dissim1)
plot(hclust(dist1))
and recovered an ordering of variables that coincides with the one produced by corrplot with order="hclust". But it is not clear whether this is indeed the mechanism corrplot uses, and whether it will hold for any other matrix.
The function used by corrplot to reorder variables is corrMatOrder (try ?corrMatOrder). It returns a single permutation vector.
When order="hclust" is selected in corrplot, corrMatOrder invokes the corrplot:::reorder_using_hclust function:
function (corr, hclust.method)
{
    hc <- hclust(as.dist(1 - corr), method = hclust.method)
    order.dendrogram(as.dendrogram(hc))
}
This function uses 1-corr as the dissimilarity measure.
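The same ordering can be reproduced in base R without corrplot. The sketch below assumes the "complete" linkage (believed to be corrplot's default hclust.method) and uses cor(mtcars) as a stand-in for the correlation matrix Rbas from the question.

```r
R_mat <- cor(mtcars)   # stand-in for Rbas; any correlation matrix works

# Same steps as the function shown above: 1 - corr as dissimilarity
hc  <- hclust(as.dist(1 - R_mat), method = "complete")
ord <- order.dendrogram(as.dendrogram(hc))

# Variable names in clustered order, and the reordered matrix
print(colnames(R_mat)[ord])
R_sorted <- R_mat[ord, ord]
```

The reordered matrix R_sorted can then be passed to other plotting packages, so that the variable order matches the corrplot figure.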
I would like to apply k-NN to a learning set to predict the class of the data using the Euclidean distance.
I am having some difficulty implementing this method.
You can try the knn() function in the class package. It uses the Euclidean distance.
You can check its documentation for more detail:
https://stat.ethz.ch/R-manual/R-devel/library/class/html/knn.html
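A minimal usage sketch of knn(), assuming the built-in iris data as an illustrative learning set. The 100/50 split, the seed and k = 5 are arbitrary choices for the example.

```r
library(class)   # provides knn()

set.seed(42)                            # reproducible split
idx     <- sample(nrow(iris), 100)      # 100 rows form the learning set
train_x <- iris[idx, 1:4]               # numeric predictors only
test_x  <- iris[-idx, 1:4]
train_y <- iris$Species[idx]            # known classes of the learning set

# Classify each test row by majority vote among its 5 nearest
# training rows (Euclidean distance)
pred <- knn(train = train_x, test = test_x, cl = train_y, k = 5)

accuracy <- mean(pred == iris$Species[-idx])
print(table(predicted = pred, actual = iris$Species[-idx]))
```

Because the distance is Euclidean, predictors on very different scales are usually standardized first (for example with scale()).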
I am using the function kmeans to perform K-means clustering.
I have special data which need a custom distance measure function and a custom mean function.
Can I pass (1) a custom distance measure function and (2) a custom mean function to the kmeans function?
It seems to use the Euclidean measure only.
The standard kmeans does not allow this, for good reasons. It uses some clever algorithms (Hartigan and Wong, which is why it is much faster than the standard Lloyd textbook algorithm you find in about 100 other R packages). But these only work for the classic k-means scenario with squared deviations (which means assigning each point to the Euclidean-nearest center, but it actually optimizes least squares, not Euclidean distances).
I doubt you can simply plug other distances and centroid functions into the Hartigan and Wong method (apart from it being written in Fortran, so you cannot just plug in an R function there anyway).
Beware that there are very few known combinations where other distances and means are known to always converge well. Bregman divergences should be fine, and cosine is equivalent to squared Euclidean on a sphere, so it will also work.
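If a custom distance and center are still wanted, the textbook Lloyd iteration is easy to write by hand in R. The sketch below is only an illustration of that idea, not what kmeans() does internally. The names custom_kmeans, dist_fun and centre_fun are hypothetical, and, as noted above, convergence is not guaranteed for arbitrary combinations. The example pairs Manhattan distance with the component-wise median (k-medians).

```r
# Lloyd-style loop with user-supplied distance and centre functions.
#   dist_fun(x, centre) -> distances from every row of x to one centre
#   centre_fun(x)       -> one representative vector for the rows of x
custom_kmeans <- function(x, k, dist_fun, centre_fun, max_iter = 100) {
  x <- as.matrix(x)
  centres <- x[sample(nrow(x), k), , drop = FALSE]   # random initial centres
  cluster <- integer(nrow(x))
  for (iter in seq_len(max_iter)) {
    # Assignment step: n x k matrix of point-to-centre distances
    d <- sapply(seq_len(k), function(j) dist_fun(x, centres[j, ]))
    new_cluster <- unname(apply(d, 1, which.min))
    if (identical(new_cluster, cluster)) break       # assignments stable
    cluster <- new_cluster
    # Update step: recompute the centre of every non-empty cluster
    for (j in seq_len(k)) {
      members <- x[cluster == j, , drop = FALSE]
      if (nrow(members) > 0) centres[j, ] <- centre_fun(members)
    }
  }
  list(cluster = cluster, centers = centres, iter = iter)
}

# Example pair: Manhattan distance with the component-wise median
manhattan  <- function(x, centre) rowSums(abs(sweep(x, 2, centre)))
col_median <- function(x) apply(x, 2, median)

set.seed(1)
pts <- rbind(matrix(rnorm(40, mean = 0),  ncol = 2),
             matrix(rnorm(40, mean = 10), ncol = 2))
fit <- custom_kmeans(pts, k = 2, dist_fun = manhattan, centre_fun = col_median)
print(table(fit$cluster))
```

This runs in plain R and will be much slower than kmeans() on large data. The max_iter cap is there because the loop may cycle when the distance and the center function do not match.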
I am dealing with a 2-dimensional parametric curve f.
Is there any function in R (in any package) which gives the arc-length parametrization for any such given parametric f?
I know how to derive the arc-length parametrized function from a given function. It involves derivative and integration as here. But I am looking for whether there is any R function for computing the arc-length parametrization.
The pracma package seems to have the functions you are looking for. See arclength() in particular on page 23 of the pracma documentation.
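For comparison, the derivative-and-integration route mentioned in the question can be sketched in base R with integrate() and uniroot(). This is a rough numerical illustration, not the pracma implementation. The helper names speed, arc_len and f_by_arclength are hypothetical, and the finite-difference step and the root-finding tolerance are assumptions. The unit circle serves as a test curve because its arc length from 0 to t is t itself.

```r
# Test curve: unit circle, f(t) = (cos t, sin t)
f <- function(t) c(cos(t), sin(t))

# Speed |f'(t)| via a central finite difference (step h is an assumption)
speed <- function(t, h = 1e-5) {
  sapply(t, function(ti) sqrt(sum(((f(ti + h) - f(ti - h)) / (2 * h))^2)))
}

# Arc length s(t): integral of the speed from a to t
arc_len <- function(t, a = 0) {
  if (t == a) return(0)
  integrate(speed, lower = a, upper = t)$value
}

# Invert s(t) numerically to get t(s), then evaluate the curve there
f_by_arclength <- function(s, a = 0, b = 2 * pi) {
  t_s <- uniroot(function(t) arc_len(t, a) - s,
                 lower = a, upper = b, tol = 1e-8)$root
  f(t_s)
}

p <- f_by_arclength(pi / 2)   # point a quarter of the way round the circle
print(p)
```

For the total length alone, pracma's arclength(f, a, b) takes the same kind of function f, one that returns the coordinate vector for a scalar parameter.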