I am trying to run a meta-analysis using the "gemtc" package, and the code works well on my test data.
The code is listed:
library(gemtc)

data <- read.csv("input.txt", sep = ",", header = TRUE)
network <- mtc.network(data, description = "Example")
result.anohe <- mtc.anohe(network, n.adapt = 10000, n.iter = 50000)
# The file (problem.txt) is also attached.
However, when I use my real data, I get the following error:
Error in decompose.study(study.samples[, colIndexes, drop = FALSE], studies[i]) :
Decomposed variance ill-defined for 1. Most likely the USE did not converge:
[,1] [,2] [,3] [,4]
[1,] 0.000 2478.307 2491.482 2485.044
[2,] 2478.307 0.000 1106288.727 -440067.825
[3,] 2491.482 1106288.727 0.000 -1459996.199
[4,] 2485.044 -440067.825 -1459996.199 0.000
Thanks very much in advance!
The input file causing the problem is attached.
When I run the R command:
outer(37:42, 37:42, complex, 1)
I get an error
"Error in dim(robj) <- c(dX, dY) : dims [product 36] do not match the length of object [37]"
in my R session. But when I run
outer(36:42, 36:42, complex, 1)
I get a valid matrix as a result. The problem persists for all values greater than 36, and there is no problem for values less than 37.
Is this a bug?
My system: Microsoft R Open 3.4.4 / RStudio 1.1.447 / Ubuntu 16.04
More specifically, when the function is called as outer(m:n, m:n, complex, 1), the whole expanded first argument is passed to complex's first parameter, length.out, and only its first element m is actually used as the requested output length. complex() therefore returns a vector of length max(m, (n-m+1)^2), which matches the (n-m+1)^2 cells outer() needs only when m <= (n-m+1)^2. That is why outer(20:23, 20:23, complex, 1) fails (20 > 16) while outer(20:24, 20:24, complex, 1) works (20 <= 25), and why your call with 37:42 fails (37 > 36): the error message even reports the stray length of 37. Please correct me if I am wrong, but I think what you want to do is the following:
outer(37:42, 37:42, function(x,y) {complex(1, real = x, imaginary = y)})
Which outputs:
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 37+37i 37+38i 37+39i 37+40i 37+41i 37+42i
[2,] 38+37i 38+38i 38+39i 38+40i 38+41i 38+42i
[3,] 39+37i 39+38i 39+39i 39+40i 39+41i 39+42i
[4,] 40+37i 40+38i 40+39i 40+40i 40+41i 40+42i
[5,] 41+37i 41+38i 41+39i 41+40i 41+41i 41+42i
[6,] 42+37i 42+38i 42+39i 42+40i 42+41i 42+42i
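For completeness, here is a small illustration of what outer() effectively hands to complex() in the failing call (X and Y are just hypothetical names for the expanded arguments that outer() builds internally):
X <- rep(37:42, times = 6)   # expanded first argument, 36 values starting with 37
Y <- rep(37:42, each = 6)    # expanded second argument, 36 values
length(complex(X, Y, 1))     # 37: only X[1] is used as length.out, and 37 != 36
# outer() then fails at dim(robj) <- c(dX, dY), because 37 values cannot fill 36 cells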
Hope this helps.
The problem is the fourth argument of the outer() call: pass it by name, so that it is matched to complex's length.out and the two unnamed vectors fall through to real and imaginary:
outer(37:42, 37:42, complex, length.out = 1)
works fine and gives the same matrix as above!
I would like to know the easiest way to put a regression output (an splm object) into TeX. stargazer, texreg and latex do not recognize this type of object, so the table has to be built more or less manually. I have already put the coefficients and standard errors into a matrix in the following way, with each standard error below its coefficient and each column a different regression:
[,1] [,2] [,3] [,4] [,5] [,6]
lambda -0.550153770 -0.606755198 -1.0894505645 0.703821961 -0.560769652 -0.698232106
0.056878033 0.056878033 0.0568780329 0.056878033 0.056878033 0.056878033
rho 0.571742772 0.618236404 0.7365074175 -1.017060680 0.745559212 0.733598140
0.034064728 0.034064728 0.0340647282 0.034064728 0.034064728 0.034064728
However, I don't know how to add the significance stars to this matrix (I have them in a vector), wrap the standard errors in parentheses, and finally export the matrix to TeX including the row names.
It's not a perfect answer, but you can piece something together:
library(pander)

smry <- summary(splm_lag)                  # splm_lag is the fitted splm model
pander(data.frame(R.Square = smry$rsqr))
pander(smry$CoefTable)
----------
R.Square
----------
0.9161
----------
-----------------------------------------------------------------------
Estimate Std. Error t-value Pr(>|t|)
-------------------------- ---------- ------------ --------- ----------
**lambda** 0.574 0.05808 9.883 4.943e-23
**PC1** -0.06165 0.03741 -1.648 0.09931
**PC2** 0.05824 0.02296 2.537 0.01118
**PC3** 0.02966 0.01937 1.531 0.1258
**PC4** -0.04165 0.02289 -1.82 0.06879
**I(as.numeric(years))** 0.03059 0.00939 3.258 0.001122
-----------------------------------------------------------------------
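If you would rather skip pander and write the stars and parentheses into a plain LaTeX tabular yourself, here is a minimal sketch; est, se and pval are hypothetical matrices (one row per coefficient, one column per regression) that you would fill from your splm summaries, so adapt the names to your own objects:
make_cell <- function(est, se, p) {
  # significance stars based on the usual 1% / 5% / 10% thresholds
  stars <- if (p < 0.01) "***" else if (p < 0.05) "**" else if (p < 0.1) "*" else ""
  sprintf("%.3f%s (%.3f)", est, stars, se)
}

cells <- matrix(mapply(make_cell, est, se, pval),
                nrow = nrow(est), dimnames = dimnames(est))

# write the body of a LaTeX tabular, one row per coefficient
cat("\\begin{tabular}{l", strrep("c", ncol(cells)), "}\n", sep = "")
for (i in seq_len(nrow(cells)))
  cat(rownames(cells)[i], "&", paste(cells[i, ], collapse = " & "), "\\\\\n")
cat("\\end{tabular}\n")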
I am creating a PCA plot from data:
label <- read.table('label_clusters.tsv')
mydata <- read.table('raw_clusters.tsv')
GP.svd <- svd(mydata)
dat <- data.frame(pc1  = GP.svd$u[, 1],
                  pc2  = GP.svd$u[, 2],
                  Data = c(rep("my", nsamples(our.obj2)),
                           rep("zeller", nsamples(z.obj))))
GP.svd is a large list in the form of:
[,97] [,98] [,99] [,100] [,101] [,102]
[1,] -9.616173e-02 -0.0779788701 -0.1087899396 -0.0653396699 -0.140911786 -5.064931e-02
[2,] 1.101038e-01 0.0465664554 0.0237686772 0.1344639223 0.035536326 2.715842e-02
[3,] -3.247248e-02 0.0295960109 0.0148926826 0.0021550661 -0.003509716 -1.887659e-02
When I run the code thus far, I get this error:
Error in data.frame(pc1 = GP.svd$u[, 1], pc2 = GP.svd$u[, 2], Data = c(rep("my", :
could not find function "nsamples"
I am not sure why this is happening; any help is appreciated.
Your code cannot find the nsamples function. This means that you either:
have to load a package that provides nsamples (for example, the phyloseq package defines an nsamples() method for its objects), or
write an nsamples function yourself that works correctly on our.obj2, or
use a different function, for example nrow() if our.obj2 is a data frame, as in the sketch below.
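For the third option, a minimal sketch, assuming our.obj2 and z.obj are data frames or matrices with one sample per row (the question does not say what they actually are):
# nsamples() replaced by nrow(); the two rep() lengths together must equal nrow(GP.svd$u)
dat <- data.frame(pc1  = GP.svd$u[, 1],
                  pc2  = GP.svd$u[, 2],
                  Data = c(rep("my", nrow(our.obj2)),
                           rep("zeller", nrow(z.obj))))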
With the program below, I get the error:
solve.default(Sigma0[cs.idx, cs.idx]) : 'a' is 0-diml
But when I step through the body of em() line by line outside the function, solve() raises no error. So I am confused and desperate for help. Thank you!
###----------------------------------------------------------------
### Maximal Likelihood estimation of mean and covariance
### for multivariate normal distribution by EM algorithm,
### for demonstration purposes only
###----------------------------------------------------------------
em <- function(xdata, mu0, Sigma0) {
  n <- nrow(xdata)
  p <- ncol(xdata)
  # distance between successive parameter estimates, used as the stopping rule
  err <- function(mu0, Sigma0, mu1, Sigma1) {
    th0 <- c(mu0, as.vector(Sigma0))
    th1 <- c(mu1, as.vector(Sigma1))
    sqrt(sum((th0 - th1) * (th0 - th1)))
  }
  mu1 <- mu0 + 1
  Sigma1 <- Sigma0 + 1
  while (err(mu0, Sigma0, mu1, Sigma1) > 1e-6) {
    mu1 <- mu0
    Sigma1 <- Sigma0
    zdata <- xdata
    Ai <- matrix(0, p, p)
    # E-step: impute the missing entries of each row by their conditional mean
    # given the observed entries and the current mu0, Sigma0
    for (i in 1:n) {
      if (any(is.na(xdata[i, ]))) {
        zi <- xdata[i, ]
        na.idx <- (1:p)[is.na(zi)]
        cs.idx <- (1:p)[-na.idx]
        Sigma012 <- Sigma0[na.idx, cs.idx, drop = FALSE]
        Sigma022.iv <- solve(Sigma0[cs.idx, cs.idx])
        zdata[i, na.idx] <- mu0[na.idx] + (Sigma012 %*% Sigma022.iv) %*% (zi[cs.idx] - mu0[cs.idx])
        Ai[na.idx, na.idx] <- Ai[na.idx, na.idx] + Sigma0[na.idx, na.idx] - Sigma012 %*% Sigma022.iv %*% t(Sigma012)
      }
    }
    # M-step: update the mean and covariance from the completed data
    mu0 <- colMeans(zdata)
    Sigma0 <- (n - 1) * cov(zdata) / n + Ai / n
  }
  return(list(mu = mu0, Sigma = Sigma0))
}
##A simulation example
library(MASS)
set.seed(1200)
p=3
mu<-c(1,0,-1)
n<-1000
Sig <- matrix(c(1, .7, .6, .7, 1, .4, .6, .4, 1), nrow = 3)
triv<-mvrnorm(n,mu,Sig)
misp<-0.2 #MCAR probability
misidx<-matrix(rbinom(3*n,1,misp)==1,nrow=n)
triv[misidx]<-NA
# exclude rows whose entries are all missing
er<-which(apply(apply(triv,1,is.na),2,sum)==p)
if(length(er)>=1) triv<-triv[-er,]
#initial values
mu0<-rep(0,p)
Sigma0<-diag(p)
system.time(rlt<-em(triv,mu0,Sigma0))
# better initial values
mu0<-apply(triv,2,mean,na.rm=TRUE)
nas<-is.na(triv)
na.num<-apply(nas,2,sum)
zdata<-triv
zdata[nas]<-rep(mu0,na.num)
Sigma0<-cov(zdata)
system.time(rlt<-em(triv,mu0,Sigma0))
The problem is in the line that is supposed to exclude the completely missing rows, er <- which(apply(apply(triv, 1, is.na), 2, sum) == ...). As the comment above it states, you wish to remove the rows whose entries are all NA, so the per-row NA count should be compared against the number of columns: er <- which(apply(apply(triv, 1, is.na), 2, sum) == ncol(triv)) is the right piece of code.
The error itself happens when a completely missing row is still present in triv when it is passed to em(): for such a row cs.idx is empty, so Sigma0[cs.idx, cs.idx] is a 0 x 0 matrix, which is exactly what the "'a' is 0-diml" message is complaining about.
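You can reproduce the error in isolation (purely illustrative, not part of the original script):
Sigma0 <- diag(3)
na.idx <- 1:3                     # a row in which every entry is missing
cs.idx <- (1:3)[-na.idx]          # integer(0): no observed columns are left
solve(Sigma0[cs.idx, cs.idx])     # Error in solve.default(...) : 'a' is 0-diml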
However, if the correction above is applied, everything runs fine:
> system.time(rlt<-em(triv,mu0,Sigma0))
user system elapsed
0.46 0.00 0.47
> rlt
$mu
[1] 0.963058487 -0.006246175 -1.024260183
$Sigma
[,1] [,2] [,3]
[1,] 0.9721301 0.6603700 0.5549126
[2,] 0.6603700 1.0292379 0.3745184
[3,] 0.5549126 0.3745184 0.9373208
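As a side note (an equivalent formulation, not taken from the answer above), the same row filtering can be written without the nested apply() calls:
# keep only the rows that have at least one observed value
triv <- triv[rowSums(is.na(triv)) < ncol(triv), , drop = FALSE]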
I'm working on a text mining/clustering project and am trying to create a table which contains number of clusters as rows and 6 columns representing the following 6 metrics:
max.diameter, min.separation, average.within, average.between, avg.silwidth, dunn.
I need to create the tables for 3 methods - kmeans, pam and hclust.
I was able to create something for kmeans
dtm0.90Dist = dist(dtm0.90)
foreachcluster = function(k) {
kmeans.result = kmeans(dtm0.90, k);
kmeans.stats = cluster.stats(dtm0.90Dist,kmeans.result$cluster);
c(kmeans.stats$min.separation, kmeans.stats$max.diameter,
kmeans.stats$average.within, kmeans.stats$avearge.between,
kmeans.stats$avg.silwidth, kmeans.stats$dunn)
}
rbind(foreachcluster(2), foreachcluster(3), foreachcluster(4), foreachcluster(5),
foreachcluster(6), foreachcluster(7),foreachcluster(8))
and I get the following output
[,1] [,2] [,3] [,4] [,5]
[1,] 3.162278 30.19934 5.831550 0.5403872 0.10471348
[2,] 2.236068 28.37252 5.006058 0.3923446 0.07881104
[3,] 1.000000 28.37252 4.995478 0.2496066 0.03524537
[4,] 1.000000 26.40076 4.387212 0.2633338 0.03787770
[5,] 1.000000 26.40076 4.353248 0.2681947 0.03787770
[6,] 1.000000 26.40076 4.163757 0.1633954 0.03787770
[7,] 1.000000 26.40076 4.128927 0.2676423 0.03787770
I need similar output for the hclust and pam methods, but for the life of me I can't get the same function to work for either of the two.
OK, so I was able to make the function for HCLUST
forhclust = function(k) {
  dfDist = dist(dtm0.90)
  hclust.result = hclust(dfDist)
  hclust.cluster = cutree(hclust.result, k)
  cluster.stats(dfDist, hclust.cluster)
  c(cluster.stats$min.separation)
}
But I get an error when I run this:
Error in cluster.stats$min.separation :
object of type 'closure' is not subsettable
What I need is for it to print the "min.separation" output.
I would really appreciate all the help and perhaps some guidance in understanding why my approach is failing in hclust.
Also, is there a good source that can explain the functioning and application of these methods, step by step, in detail?
Thank You
Your forhclust() fails because cluster.stats$min.separation tries to subset the function cluster.stats itself (a closure) rather than the list it returned on the previous line; you need to save that result and subset the saved object, as below.
library(fpc)   # cluster.stats()

foreachcluster2 = function(k) {
  hc = hclust(mDist, method = "ave")   # mDist is the distance matrix, e.g. dist(dtm0.90)
  hresult = cutree(hc, k)
  h.stats = cluster.stats(mDist, hresult)
  c(max.dia  = h.stats$max.diameter,
    min.sep  = h.stats$min.separation,
    avg.wi   = h.stats$average.within,
    avg.bw   = h.stats$average.between,
    silwidth = h.stats$avg.silwidth,
    dunn     = h.stats$dunn)
}
t2 = rbind(foreachcluster2(2), foreachcluster2(3), foreachcluster2(4), foreachcluster2(5),foreachcluster2(6),
foreachcluster2(7), foreachcluster2(8), foreachcluster2(9), foreachcluster2(10),
foreachcluster2(11), foreachcluster2(12),foreachcluster2(13),foreachcluster2(14))
rownames(t2) = 2:14
t2
This should work. For pam():
pamC <- pam(x = m, k = 2)   # pam() is in the cluster package; m is your data matrix or dist object
pamC
pamC$clustering
Use $clustering instead of $cluster; the rest is the same.
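Putting those pieces together, a pam() version of the helper might look like this (a sketch assuming the same mDist distance matrix and the cluster and fpc packages):
library(cluster)   # pam()
library(fpc)       # cluster.stats()

foreachcluster3 <- function(k) {
  pamC <- pam(x = mDist, k = k)                   # pam() accepts a dist object directly
  p.stats <- cluster.stats(mDist, pamC$clustering)
  c(max.dia  = p.stats$max.diameter,
    min.sep  = p.stats$min.separation,
    avg.wi   = p.stats$average.within,
    avg.bw   = p.stats$average.between,
    silwidth = p.stats$avg.silwidth,
    dunn     = p.stats$dunn)
}

t3 <- do.call(rbind, lapply(2:14, foreachcluster3))
rownames(t3) <- 2:14
t3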