nearest shrunken centroids with sample weights - r

I am trying to train a nearest shrunken centroid classifier on my data using the pamr.train() function from the pamr package in R. In addition to the training data, I also have a vector of sample weights. Is there any way to make this function take these sample weights into account?
Or is there a way to obtain the source code of this function? If so, I could write the code for weighted means and weighted variances in place of the unweighted ones.
Thank you,
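The source of an R function can usually be seen by typing its name without parentheses (for example pamr.train once pamr is loaded) or with getAnywhere("pamr.train"). The weighted statistics themselves can be sketched in base R. This is only a minimal sketch and not pamr's implementation: the name weighted_class_stats and the choice to normalise the pooled variance by the total weight are assumptions made for illustration, and x is taken to hold features in rows and samples in columns.

```r
# Weighted class centroids and pooled within-class standard deviations.
# x: numeric matrix (features x samples), y: class labels, w: sample weights.
# Hypothetical helper, not part of pamr.
weighted_class_stats <- function(x, y, w) {
  stopifnot(ncol(x) == length(y), length(y) == length(w), all(w >= 0))
  classes <- sort(unique(y))
  # Weighted mean of every feature within each class
  centroids <- vapply(classes, function(k) {
    idx <- y == k
    drop(x[, idx, drop = FALSE] %*% w[idx]) / sum(w[idx])
  }, numeric(nrow(x)))
  centroids <- matrix(centroids, nrow = nrow(x),
                      dimnames = list(rownames(x), as.character(classes)))
  # Weighted squared deviations of each sample from its own class centroid
  dev2 <- (x - centroids[, match(y, classes), drop = FALSE])^2
  # Assumption: normalise by the total weight
  pooled_sd <- sqrt(drop(dev2 %*% w) / sum(w))
  list(centroids = centroids, sd = pooled_sd)
}

set.seed(1)
x <- matrix(rnorm(5 * 12), nrow = 5)      # 5 features, 12 samples
y <- rep(c("a", "b", "c"), each = 4)
w <- runif(12)
stats_w <- weighted_class_stats(x, y, w)
```

With all weights equal, the centroids reduce to the ordinary per-class means, which is a quick sanity check before substituting them into the shrinkage step.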

Related

How to transform data after fitting a distribution with gamlss?

I have a data set where observations come from highly distinct groups. Each group may have a wildly different distribution, so I am trying to find the best distribution using fitdist from fitdistrplus, then use gamlssML from the gamlss package to find the best parameters.
My issue is with transforming the data after this step. For some of the distributions, like the Box-Cox t, I can find the equation for normalizing the data using the BCT coefficients, but for many of these distributions I cannot.
Does gamlss have a function that normalizes the data after fitting? Its documentation only provides the transformations for a small number of distributions: https://www.gamlss.com/wp-content/uploads/2018/01/DistributionsForModellingLocationScaleandShape.pdf
Thanks a lot
The normalised data values (for any distribution) are exactly equal to the residuals from a gamlss fit,
m1 <- gamlss()
which can be accessed by
residuals(m1) or
m1$residuals
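For a continuous response these residuals are normalised quantile residuals: the fitted cumulative distribution function is applied to each observation and the result is passed through the standard normal quantile function. The base-R sketch below illustrates that transformation for a gamma distribution. The method-of-moments estimates are only a stand-in for a gamlss fit, and the variable names are made up for the example.

```r
set.seed(42)
y <- rgamma(500, shape = 2, rate = 1)   # skewed example data

# Stand-in for fitted parameters (method of moments, not gamlss)
shape_hat <- mean(y)^2 / var(y)
rate_hat  <- mean(y) / var(y)

# Normalised values: z = qnorm(F(y)), with F the fitted CDF
z <- qnorm(pgamma(y, shape = shape_hat, rate = rate_hat))
```

If the fitted distribution is adequate, z should look approximately standard normal, which can be checked with qqnorm(z).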

Extract sample variance from svykm (survey package by Lumley) for complex survey analysis

In order to compare two survival curves at a fixed point in time and perform basically a two sample test, I need to extract the sample variance of the estimate at a given point in time.
For an object created with the svykm function from Thomas Lumley's survey package in R, this should be accessible in the varlog list. Do the entries in this list constitute the transformed variances on the log scale or the untransformed variances?
I have read the documentation provided for the survey package, but could not reach a firm conclusion. I note that confidence intervals are computed on the log(survival) scale, following the default in the survival package, and that their bounds are given as exp(log(x$surv)+1.96*sqrt(x$varlog)) and exp(log(x$surv)-1.96*sqrt(x$varlog)) in the R package documentation.
They are variances on the log scale.
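Because varlog is the variance of log(S(t)), a delta-method approximation gives Var(S(t)) ≈ S(t)^2 * varlog on the survival scale, and two curves can be compared at a fixed time on the log scale. The sketch below assumes the two estimates are independent, which may not hold for groups drawn from the same survey design, and that each curve is supplied as time, surv and varlog vectors. The helper names and the numbers are hypothetical.

```r
# Value of a right-continuous step function at time t0
# (NA before the first recorded time)
step_at <- function(time, value, t0) {
  i <- findInterval(t0, time)
  if (i == 0) NA_real_ else value[i]
}

# Compare two survival curves at t0 on the log scale.
# Assumption: the two estimates are independent.
compare_at <- function(c1, c2, t0) {
  s  <- c(step_at(c1$time, c1$surv, t0),   step_at(c2$time, c2$surv, t0))
  vl <- c(step_at(c1$time, c1$varlog, t0), step_at(c2$time, c2$varlog, t0))
  z  <- (log(s[1]) - log(s[2])) / sqrt(sum(vl))
  list(surv = s,
       var_surv = s^2 * vl,          # delta method: back to the survival scale
       z = z,
       p_value = 2 * pnorm(-abs(z)))
}

# Made-up curves standing in for the components of two svykm fits
curve1 <- list(time = c(1, 2, 3),   surv = c(0.90, 0.80, 0.70),
               varlog = c(0.001, 0.002, 0.004))
curve2 <- list(time = c(1, 2.5, 3), surv = c(0.95, 0.85, 0.80),
               varlog = c(0.001, 0.003, 0.005))
res <- compare_at(curve1, curve2, t0 = 2.6)
```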

R: functions to determine distance of multivariate data to normal distribution

I have multivariate data and I would like to compute how far the complete data set is from a multivariate normal distribution, using R. I have seen functions such as the Shapiro-Wilk test, but from them I can only tell that the data do not follow a normal distribution when the p-value is below 0.05. I want to know how far the data are from the normal distribution. Can anyone point me to functions I could use?
Use the mqqnorm function from the RVAideMemoire package. It shows, among others, Mahalanobis distances. From the function example:
library(RVAideMemoire)
x <- 1:30+rnorm(30)
y <- 1:30+rnorm(30,1,3)
mqqnorm(cbind(x,y))
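The distances behind such a plot can also be computed in base R. The sketch below computes squared Mahalanobis distances with mahalanobis() and compares them with the chi-squared distribution they would approximately follow under multivariate normality. Using the largest gap between the two distribution functions as a single "how far" number is one possible choice made for illustration, not something taken from RVAideMemoire.

```r
set.seed(1)
x <- 1:30 + rnorm(30)
y <- 1:30 + rnorm(30, 1, 3)
X <- cbind(x, y)

# Squared Mahalanobis distance of each row from the sample mean
d2 <- mahalanobis(X, colMeans(X), cov(X))

# Under multivariate normality d2 is roughly chi-squared with ncol(X) df
p_theory <- pchisq(sort(d2), df = ncol(X))
p_empir  <- ppoints(length(d2))

# Illustrative summary: largest gap between empirical and theoretical CDF
gap <- max(abs(p_empir - p_theory))
```

A gap near 0 means the distances match the chi-squared reference closely; larger values indicate a stronger departure.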

Bootstrap for phylogenetic tree generated using Mahalanobis distance (R)

I created a phylogenetic NJ tree in R using the ape package. My data contain metric measurements from multiple individuals belonging to known groups, so I decided to calculate the Mahalanobis distances between these groups in order to incorporate the covariance structure in my analyses. Creating the NJ tree was not a problem:
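library(ape)   # nj()
library(MASS)  # lda()
# y: matrix of measurements, ynames: group label of each individual
fit <- lda(y, as.factor(ynames))
# Euclidean distances between group means in discriminant space
d <- dist(predict(fit, fit$means)$x, upper=TRUE, diag=TRUE)
plot(nj(d))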
However, I would now like to calculate bootstrap values for the branch splits. I would use the boot.phylo function, but I have no idea how to write the FUN (function) argument so that the Mahalanobis distances are calculated correctly for each bootstrapped data set.
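One possible direction, sketched here in base R only: as far as the ape documentation describes, boot.phylo resamples the columns of its data matrix, so resampling individuals within groups may be easier to do by hand. The helpers group_mahalanobis and boot_group_dists are hypothetical names, and the distances use the pooled within-group covariance, which is an assumption about what "Mahalanobis distance between groups" should mean here.

```r
# Mahalanobis distances between group means, pooled within-group covariance
group_mahalanobis <- function(x, g) {
  g <- droplevels(as.factor(g))
  means <- rowsum(x, g) / as.vector(table(g))
  resid <- x - means[as.character(g), , drop = FALSE]
  S <- crossprod(resid) / (nrow(x) - nlevels(g))
  k <- nrow(means)
  D <- matrix(0, k, k, dimnames = list(rownames(means), rownames(means)))
  for (i in seq_len(k)) D[i, ] <- sqrt(mahalanobis(means, means[i, ], S))
  as.dist(D)
}

# Resample individuals with replacement inside each group, B times
boot_group_dists <- function(x, g, B = 100) {
  g <- as.factor(g)
  lapply(seq_len(B), function(b) {
    idx <- unlist(lapply(split(seq_along(g), g), function(i)
      i[sample.int(length(i), replace = TRUE)]))
    group_mahalanobis(x[idx, , drop = FALSE], g[idx])
  })
}

# Made-up data: three groups of ten individuals, two measurements each
set.seed(7)
g <- rep(c("A", "B", "C"), each = 10)
x <- matrix(rnorm(30 * 2), ncol = 2) + cbind(rep(c(0, 3, 6), each = 10), 0)
d_obs  <- group_mahalanobis(x, g)
d_boot <- boot_group_dists(x, g, B = 20)
```

Each element of d_boot could then be passed to nj(), and the resulting trees compared with the original tree, for example with ape's prop.clades, to obtain support values for the splits.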

R: local Moran's I produces negative variance (localmoran)

I have to compute the local Moran test statistic for a set of spatial data. I am using the R function localmoran from the spdep package. For some of my geographical units the variance of the local Moran statistic is NEGATIVE. This prevents me from plotting, for example, LISA maps, because it makes it impossible to compute the significance of the local Moran test for each unit.
Note that for the majority of the units the computation works fine, so I would not ascribe the error to the function, and the weights matrix is also correct. Could it be that some of the data are simply not compatible with the localmoran function?
Any idea how this is possible?
Thanks!

Resources