Interpreting the PCA axis Dim1 and Dim2 from CLARA plot results directly - r

I have a large dataset containing more than 300,000 rows/observations and 22 variables. I used the CLARA method for clustering and plotted the results with fviz_cluster. The silhouette method suggested 10 clusters, which I then passed to the CLARA algorithm.
clara.res <- clara(df, 10, samples = 50, trace = 1, sampsize = 1000, pamLike = TRUE)
str(clara.res)
List of 10
$ sample : chr [1:1000] "100046" "100303" "10052" "100727" ...
$ medoids : num [1:10, 1:22] 0.925 0.125 0.701 0 0 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : chr [1:10] "193751" "137853" "229261" "257462" ...
.. ..$ : chr [1:22] "COD" "DMW" "HER" "SPR" ...
$ i.med : int [1:10] 104171 42062 143627 174961 300065 13836 192832 207079 185241 228575
$ clustering: Named int [1:302251] 1 1 1 2 3 4 5 3 3 3 ...
..- attr(*, "names")= chr [1:302251] "1" "10" "100" "1000" ...
$ objective : num 0.37
$ clusinfo : num [1:10, 1:4] 71811 40181 46271 10155 31309 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : NULL
.. ..$ : chr [1:4] "size" "max_diss" "av_diss" "isolation"
$ diss : 'dissimilarity' num [1:499500] 1.392 2.192 0.937 2.157 1.643 ...
..- attr(*, "Size")= int 1000
..- attr(*, "Metric")= chr "euclidean"
..- attr(*, "Labels")= chr [1:1000] "100046" "100303" "10052" "100727" ...
$ call : language clara(x = df, k = 10, samples = 50, sampsize = 1000, trace = 1, pamLike = TRUE)
$ silinfo :List of 3
..$ widths : num [1:1000, 1:3] 1 1 1 1 1 1 1 1 1 1 ...
.. ..- attr(*, "dimnames")=List of 2
.. .. ..$ : chr [1:1000] "83395" "181310" "34452" "42991" ...
.. .. ..$ : chr [1:3] "cluster" "neighbor" "sil_width"
..$ clus.avg.widths: num [1:10] 0.645 0.408 0.487 0.513 0.839 ...
..$ avg.width : num 0.612
$ data : num [1:302251, 1:22] 1 1 1 0.366 0.35 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : chr [1:302251] "1" "10" "100" "1000" ...
.. ..$ : chr [1:22] "COD" "DMW" "HER" "SPR" ...
- attr(*, "class")= chr [1:2] "clara" "partition"
For the plot:
fviz_cluster(clara.res,
             palette = c("#004c6d", "#00a1c1", "#ffc334", "#78ab63", "#00ffff",
                         "#00cfe3", "#6efa75", "#cc0089", "#ff9509", "#ffb6de"), # color palette
             ellipse.type = "t",
             geom = "point",
             show.clust.cent = TRUE,
             repel = TRUE,
             pointsize = 0.5,
             ggtheme = theme_classic()) +
  xlim(-7, 3) + ylim(-5, 4) + labs(title = "Plot of clusters")
The result:
I gather that this cluster plot is based on PCA, and I have been trying to figure out what Dim1 and Dim2 (the x- and y-axes) represent in terms of my original variables. How can I find out what Dim1 and Dim2 are, along with the eigenvalues/variance of all the dimensions, without running PCA separately?
I saw there are other functions/packages for PCA, such as get_eigenvalue in factoextra and FactoMineR, but it seemed those would require me to run the PCA algorithm from scratch. How can I integrate them directly with my CLARA results?
Also, my Dim1 accounts for only 12.3% of the variance and Dim2 for 8.8%. Does that mean these dimensions are not representative enough? Considering that there are 22 dimensions in total (from my 22 variables), I think that's reasonable, no? I am not sure how these percentages for Dim1 and Dim2 affect my cluster results. I also wanted to make a scree plot from my CLARA results, but I can't figure that out either.
I'd appreciate any insights.
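When there are more than two variables, fviz_cluster reduces the data to two dimensions with PCA, so the axes can be recovered by running stats::prcomp on the data stored inside the clara object. Here is a minimal sketch; iris stands in for the real 300k-row `df` (which isn't reproducible here), and the assumption that fviz_cluster's Dim1/Dim2 match an unscaled prcomp of the same data is mine:

```r
library(cluster)  # clara() ships with R as a recommended package

# Toy stand-in for the real data frame `df`
clara.res <- clara(iris[, 1:4], k = 3, samples = 5, sampsize = 50, pamLike = TRUE)

# Recover the PCA behind Dim1/Dim2 from the data kept inside the clara object
pca <- prcomp(clara.res$data, scale. = FALSE)
eig <- pca$sdev^2                 # eigenvalue of every dimension
var.pct <- 100 * eig / sum(eig)   # var.pct[1] is Dim1's %, var.pct[2] is Dim2's
pca$rotation[, 1:2]               # loadings: each variable's weight on Dim1/Dim2
screeplot(pca, type = "lines")    # scree plot without extra packages
```

If factoextra is available, get_eigenvalue(pca) and fviz_screeplot(pca) report the same eigenvalues and percentages in its own format. Dim1 and Dim2 are not individual original variables but weighted combinations of all 22; the rotation (loadings) matrix shows which variables dominate each axis.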

Related

R: How to "extract" the p-values of a Dunnetts-test (Post-Hoc after ANOVA)

I am a biochemist working with R as a non-professional and have run into a problem. I have a data frame and want to compare my different treatment groups and the positive control with a medium control. The statistical test I want to use is an ANOVA followed by Dunnett's test. I used the multcomp and DescTools packages for this, and I get there with this code:
Particle <- factor(c("Medium", "PosCon", "Trt1", "Trt2", "Trt3", "Medium", "PosCon", "Trt1", "Trt2", "Trt3", "Medium", "PosCon", "Trt1", "Trt2", "Trt3"))
Values <- c(1.0, 263.0, 3.1, 1.2, 0.9, 1.0, 244.0, 2.4, 1.6, 1.1, 1.0, 255.0, 3.8, 2.0, 0.8)
myDataframe <- data.frame(Particle, Values)
str(myDataframe)
a1 <- aov(Values ~ Particle, data= myDataframe)
summary(a1)
#Output
# Df Sum Sq Mean Sq F value Pr(>F)
#Particle 4 152832 38208 2084 1.48e-14 ***
#Residuals 10 183 18
#---
#Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
myDataframe.dunnett <- glht(a1, linfct = mcp(Particle= "Dunnett"))
myDataframe.dunnett
summary(myDataframe.dunnett)
# Output:
# Simultaneous Tests for General Linear Hypotheses
#
#Multiple Comparisons of Means: Dunnett Contrasts
#
#
#Fit: aov(formula = Values ~ Particle, data = myDataframe)
#
#Linear Hypotheses:
# Estimate Std. Error t value Pr(>|t|)
#PosCon - Medium == 0 253.00000 3.49616 72.365 <0.001 ***
#Trt1 - Medium == 0 2.10000 3.49616 0.601 0.930
#Trt2 - Medium == 0 0.60000 3.49616 0.172 0.999
#Trt3 - Medium == 0 -0.06667 3.49616 -0.019 1.000
#---
#Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#(Adjusted p values reported -- single-step method)
Now I want to extract the p-values (the Pr(>|t|) column) as four-digit numbers (three would also work). I used str(summary(myDataframe.dunnett)) and names(summary(myDataframe.dunnett)) to find out what to extract, but when I extract it, the values print with no digits, like this:
str(summary(myDataframe.dunnett))
names(summary(myDataframe.dunnett)) #just to get to know the names
x <- summary(myDataframe.dunnett)$test$pvalues
x
[1] 0 1 1 1
attr(,"error")
[1] 0.0001612462
Does anyone know what is going on here, or a better way to extract the significance levels after an ANOVA and Dunnett's test into a vector? I need them to place significance stars above a plot.
I have the feeling that this might help, but I could not figure out how to modify it for my data.
Thanks for your help!
If you look at the structure of the summary object you can see that the values you are hoping to extract are in a list element named test
str( summary(myDataframe.dunnett) )
List of 10
$ model :List of 13
..$ coefficients : Named num [1:5] 1 253 2.1 0.6 -0.0667
.. ..- attr(*, "names")= chr [1:5] "(Intercept)" "ParticlePosCon" "ParticleTrt1" "ParticleTrt2" ...
..$ residuals : Named num [1:15] -1.48e-15 9.00 1.92e-15 -4.00e-01 -3.33e-02 ...
.. ..- attr(*, "names")= chr [1:15] "1" "2" "3" "4" ...
..$ effects : Named num [1:15] -201.8857 -390.926 -2.8833 -0.8957 0.0816 ...
.. ..- attr(*, "names")= chr [1:15] "(Intercept)" "ParticlePosCon" "ParticleTrt1" "ParticleTrt2" ...
..$ rank : int 5
..$ fitted.values: Named num [1:15] 1 254 3.1 1.6 0.933 ...
.. ..- attr(*, "names")= chr [1:15] "1" "2" "3" "4" ...
..$ assign : int [1:5] 0 1 1 1 1
..$ qr :List of 5
.. ..$ qr : num [1:15, 1:5] -3.873 0.258 0.258 0.258 0.258 ...
.. .. ..- attr(*, "dimnames")=List of 2
.. .. .. ..$ : chr [1:15] "1" "2" "3" "4" ...
.. .. .. ..$ : chr [1:5] "(Intercept)" "ParticlePosCon" "ParticleTrt1" "ParticleTrt2" ...
.. .. ..- attr(*, "assign")= int [1:5] 0 1 1 1 1
.. .. ..- attr(*, "contrasts")=List of 1
.. .. .. ..$ Particle: chr "contr.treatment"
.. ..$ qraux: num [1:5] 1.26 1.54 1.54 1.53 1.52
.. ..$ pivot: int [1:5] 1 2 3 4 5
.. ..$ tol : num 1e-07
.. ..$ rank : int 5
.. ..- attr(*, "class")= chr "qr"
..$ df.residual : int 10
..$ contrasts :List of 1
.. ..$ Particle: chr "contr.treatment"
..$ xlevels :List of 1
.. ..$ Particle: chr [1:5] "Medium" "PosCon" "Trt1" "Trt2" ...
..$ call : language aov(formula = Values ~ Particle, data = myDataframe)
..$ terms :Classes 'terms', 'formula' language Values ~ Particle
.. .. ..- attr(*, "variables")= language list(Values, Particle)
.. .. ..- attr(*, "factors")= int [1:2, 1] 0 1
.. .. .. ..- attr(*, "dimnames")=List of 2
.. .. .. .. ..$ : chr [1:2] "Values" "Particle"
.. .. .. .. ..$ : chr "Particle"
.. .. ..- attr(*, "term.labels")= chr "Particle"
.. .. ..- attr(*, "order")= int 1
.. .. ..- attr(*, "intercept")= int 1
.. .. ..- attr(*, "response")= int 1
.. .. ..- attr(*, ".Environment")=<environment: R_GlobalEnv>
.. .. ..- attr(*, "predvars")= language list(Values, Particle)
.. .. ..- attr(*, "dataClasses")= Named chr [1:2] "numeric" "factor"
.. .. .. ..- attr(*, "names")= chr [1:2] "Values" "Particle"
..$ model :'data.frame': 15 obs. of 2 variables:
.. ..$ Values : num [1:15] 1 263 3.1 1.2 0.9 1 244 2.4 1.6 1.1 ...
.. ..$ Particle: Factor w/ 5 levels "Medium","PosCon",..: 1 2 3 4 5 1 2 3 4 5 ...
.. ..- attr(*, "terms")=Classes 'terms', 'formula' language Values ~ Particle
.. .. .. ..- attr(*, "variables")= language list(Values, Particle)
.. .. .. ..- attr(*, "factors")= int [1:2, 1] 0 1
.. .. .. .. ..- attr(*, "dimnames")=List of 2
.. .. .. .. .. ..$ : chr [1:2] "Values" "Particle"
.. .. .. .. .. ..$ : chr "Particle"
.. .. .. ..- attr(*, "term.labels")= chr "Particle"
.. .. .. ..- attr(*, "order")= int 1
.. .. .. ..- attr(*, "intercept")= int 1
.. .. .. ..- attr(*, "response")= int 1
.. .. .. ..- attr(*, ".Environment")=<environment: R_GlobalEnv>
.. .. .. ..- attr(*, "predvars")= language list(Values, Particle)
.. .. .. ..- attr(*, "dataClasses")= Named chr [1:2] "numeric" "factor"
.. .. .. .. ..- attr(*, "names")= chr [1:2] "Values" "Particle"
..- attr(*, "class")= chr [1:2] "aov" "lm"
$ linfct : num [1:4, 1:5] 0 0 0 0 1 0 0 0 0 1 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : Named chr [1:4] "PosCon - Medium" "Trt1 - Medium" "Trt2 - Medium" "Trt3 - Medium"
.. .. ..- attr(*, "names")= chr [1:4] "Particle1" "Particle2" "Particle3" "Particle4"
.. ..$ : chr [1:5] "(Intercept)" "ParticlePosCon" "ParticleTrt1" "ParticleTrt2" ...
..- attr(*, "type")= chr "Dunnett"
$ rhs : num [1:4] 0 0 0 0
$ coef : Named num [1:5] 1 253 2.1 0.6 -0.0667
..- attr(*, "names")= chr [1:5] "(Intercept)" "ParticlePosCon" "ParticleTrt1" "ParticleTrt2" ...
$ vcov : num [1:5, 1:5] 6.11 -6.11 -6.11 -6.11 -6.11 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : chr [1:5] "(Intercept)" "ParticlePosCon" "ParticleTrt1" "ParticleTrt2" ...
.. ..$ : chr [1:5] "(Intercept)" "ParticlePosCon" "ParticleTrt1" "ParticleTrt2" ...
$ df : int 10
$ alternative: chr "two.sided"
$ type : chr "Dunnett"
$ focus : chr "Particle"
$ test :List of 7
..$ pfunction :function (type = c("univariate", "adjusted", p.adjust.methods), ...)
..$ qfunction :function (conf.level, adjusted = TRUE, ...)
..$ coefficients: Named num [1:4] 253 2.1 0.6 -0.0667
.. ..- attr(*, "names")= chr [1:4] "PosCon - Medium" "Trt1 - Medium" "Trt2 - Medium" "Trt3 - Medium"
..$ sigma : Named num [1:4] 3.5 3.5 3.5 3.5
.. ..- attr(*, "names")= chr [1:4] "PosCon - Medium" "Trt1 - Medium" "Trt2 - Medium" "Trt3 - Medium"
..$ tstat : Named num [1:4] 72.3652 0.6007 0.1716 -0.0191
.. ..- attr(*, "names")= chr [1:4] "PosCon - Medium" "Trt1 - Medium" "Trt2 - Medium" "Trt3 - Medium"
..$ pvalues : num [1:4] 0 0.93 0.999 1
.. ..- attr(*, "error")= num 0.000305
..$ type : chr "single-step"
..- attr(*, "class")= chr "mtest"
- attr(*, "class")= chr [1:2] "summary.glht" "glht"
... and that the test list is a rather complicated list in its own right, ...
> ( summary(myDataframe.dunnett)$test )
$pfunction
function (type = c("univariate", "adjusted", p.adjust.methods),
...)
{
type <- match.arg(type)
pfct <- function(q) {
switch(object$alternative, two.sided = {
low <- rep(-abs(q), dim)
upp <- rep(abs(q), dim)
}, less = {
low <- rep(q, dim)
upp <- rep(Inf, dim)
}, greater = {
low <- rep(-Inf, dim)
upp <- rep(q, dim)
})
pmvt(lower = low, upper = upp, df = df, corr = cr, ...)
}
switch(object$alternative, two.sided = {
if (df > 0) pvals <- 2 * (1 - pt(abs(tstat), df)) else pvals <- 2 *
(1 - pnorm(abs(tstat)))
}, less = {
if (df > 0) pvals <- pt(tstat, df) else pvals <- pnorm(tstat)
}, greater = {
if (df > 0) pvals <- 1 - pt(tstat, df) else pvals <- 1 -
pnorm(tstat)
})
if (type == "univariate")
return(pvals)
if (type == "adjusted") {
ret <- numeric(length(tstat))
error <- 0
for (i in 1:length(tstat)) {
tmp <- pfct(tstat[i])
if (attr(tmp, "msg") != "Normal Completion" && length(grep("^univariate",
attr(tmp, "msg"))) == 0)
warning(attr(tmp, "msg"))
if (error < attr(tmp, "error"))
error <- attr(tmp, "error")
ret[i] <- tmp
}
ret <- 1 - ret
attr(ret, "error") <- error
return(ret)
}
return(p.adjust(pvals, method = type))
}
<bytecode: 0x55c5419b3f18>
<environment: 0x55c540a7c100>
$qfunction
function (conf.level, adjusted = TRUE, ...)
{
tail <- switch(object$alternative, two.sided = "both.tails",
less = "lower.tail", greater = "upper.tail")
if (adjusted) {
calpha <- qmvt(conf.level, df = df, corr = cr, tail = tail,
...)
}
else {
calpha <- qmvt(conf.level, df = df, corr = matrix(1),
tail = tail, ...)
}
ret <- calpha$quantile
attr(ret, "error") <- calpha$estim.prec
return(ret)
}
<bytecode: 0x55c5419b9d20>
<environment: 0x55c540a7c100>
$coefficients
PosCon - Medium Trt1 - Medium Trt2 - Medium Trt3 - Medium
253.00000000 2.10000000 0.60000000 -0.06666667
$sigma
PosCon - Medium Trt1 - Medium Trt2 - Medium Trt3 - Medium
3.496157 3.496157 3.496157 3.496157
$tstat
PosCon - Medium Trt1 - Medium Trt2 - Medium Trt3 - Medium
72.36517911 0.60065959 0.17161703 -0.01906856
$pvalues
[1] 0.0000000 0.9298994 0.9992779 0.9999999
attr(,"error")
[1] 0.0001864546
$type
[1] "single-step"
attr(,"class")
[1] "mtest"
... and that the p-values are in a sublist of that list in an element named pvalues:
> ( summary(myDataframe.dunnett)$test$pvalues )
[1] 0.0000000 0.9299389 0.9992772 0.9999999
attr(,"error")
[1] 0.0001414288
... or with the DescTools function DunnettTest:
(z <- DunnettTest(formula(a1$call)))
## Dunnett's test for comparing several treatments with a control :
## 95% family-wise confidence level
##
## $Medium
## diff lwr.ci upr.ci pval
## PosCon-Medium 253.00000000 242.883587 263.11641 <2e-16 ***
## Trt1-Medium 2.10000000 -8.016413 12.21641 0.9299
## Trt2-Medium 0.60000000 -9.516413 10.71641 0.9993
## Trt3-Medium -0.06666667 -10.183079 10.04975 1.0000
##
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
z$Medium[,"pval"]
## PosCon-Medium Trt1-Medium Trt2-Medium Trt3-Medium
## 0.0000000 0.9298875 0.9992763 0.9999999
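As for getting four-digit numbers: the extracted values are stored at full precision, and a coarse print such as `[1] 0 1 1 1` is only a display issue (e.g. a low digits option in the session). A small sketch, where `pv` is a hypothetical stand-in for summary(myDataframe.dunnett)$test$pvalues:

```r
# pv stands in for summary(myDataframe.dunnett)$test$pvalues
pv <- c(0.0000000, 0.9299389, 0.9992772, 0.9999999)
pv4 <- sprintf("%.4f", as.numeric(pv))  # fixed four-digit character vector
pv4
# "0.0000" "0.9299" "0.9993" "1.0000"
```

as.numeric() strips the "error" attribute before formatting; keep the numeric pv around if you still need the values for comparisons rather than display.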

R kohonen - Is the input data scaled and centred automatically?

I have been following an online example for R Kohonen self-organising maps (SOM) which suggested that the data should be centred and scaled before computing the SOM.
However, I've noticed that the created object seems to have attributes for centre and scale, in which case am I applying a redundant step by centring and scaling first? Example script below:
# Load package
require(kohonen)
# Set data
data(iris)
# Scale and centre
dt <- scale(iris[, 1:4],center=TRUE)
# Prepare SOM
set.seed(590507)
som1 <- som(dt,
somgrid(6,6, "hexagonal"),
rlen=500,
keep.data=TRUE)
str(som1)
The output from the last line of the script is:
List of 13
$ data :List of 1
..$ : num [1:150, 1:4] -0.898 -1.139 -1.381 -1.501 -1.018 ...
.. ..- attr(*, "dimnames")=List of 2
.. .. ..$ : NULL
.. .. ..$ : chr [1:4] "Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Width"
.. ..- attr(*, "scaled:center")= Named num [1:4] 5.84 3.06 3.76 1.2
.. .. ..- attr(*, "names")= chr [1:4] "Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Width"
.. ..- attr(*, "scaled:scale")= Named num [1:4] 0.828 0.436 1.765 0.762
.. .. ..- attr(*, "names")= chr [1:4] "Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Width"
$ unit.classif : num [1:150] 3 5 5 5 4 2 4 4 6 5 ...
$ distances : num [1:150] 0.0426 0.0663 0.0768 0.0744 0.1346 ...
$ grid :List of 6
..$ pts : num [1:36, 1:2] 1.5 2.5 3.5 4.5 5.5 6.5 1 2 3 4 ...
.. ..- attr(*, "dimnames")=List of 2
.. .. ..$ : NULL
.. .. ..$ : chr [1:2] "x" "y"
..$ xdim : num 6
..$ ydim : num 6
..$ topo : chr "hexagonal"
..$ neighbourhood.fct: Factor w/ 2 levels "bubble","gaussian": 1
..$ toroidal : logi FALSE
..- attr(*, "class")= chr "somgrid"
$ codes :List of 1
..$ : num [1:36, 1:4] -0.376 -0.683 -0.734 -1.158 -1.231 ...
.. ..- attr(*, "dimnames")=List of 2
.. .. ..$ : chr [1:36] "V1" "V2" "V3" "V4" ...
.. .. ..$ : chr [1:4] "Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Width"
$ changes : num [1:500, 1] 0.0445 0.0413 0.0347 0.0373 0.0337 ...
$ alpha : num [1:2] 0.05 0.01
$ radius : Named num [1:2] 3.61 0
..- attr(*, "names")= chr [1:2] "66.66667%" ""
$ user.weights : num 1
$ distance.weights: num 1
$ whatmap : int 1
$ maxNA.fraction : int 0
$ dist.fcts : chr "sumofsquares"
- attr(*, "class")= chr "kohonen"
Notice that lines 7 and 10 of the output contain references to centre and scale. I would appreciate an explanation of what is happening here.
Your scaling step is not redundant: there is no scaling in the som() source code, and the attributes you see on lines 7 and 10 are carried over from the training dataset (scale() attached them before som() was ever called).
To check this, run and compare the results of this chunk of code:
# Load package
require(kohonen)
# Set data
data(iris)
# Scale and centre
dt <- scale(iris[, 1:4],center=TRUE)
#compare train datasets
str(dt)
str(as.matrix(iris[, 1:4]))
# Prepare SOM
set.seed(590507)
som1 <- kohonen::som(dt,
kohonen::somgrid(6,6, "hexagonal"),
rlen=500,
keep.data=TRUE)
#without scaling
som2 <- kohonen::som(as.matrix(iris[, 1:4]),
kohonen::somgrid(6,6, "hexagonal"),
rlen=500,
keep.data=TRUE)
#compare results of som function
str(som1)
str(som2)
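Those attributes can be traced without kohonen at all: scale() itself attaches them to its result, and som() merely stores the matrix it was given. A quick base-R check:

```r
dt <- scale(iris[, 1:4], center = TRUE)
# scale() attaches these attributes before som() ever sees the data:
attr(dt, "scaled:center")  # per-column means (e.g. Sepal.Length 5.84)
attr(dt, "scaled:scale")   # per-column standard deviations
```

The values match lines 7 and 10 of the str(som1) output exactly, confirming they were inherited from the input matrix rather than computed by som().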

Extract statistics from Anderson-Darling test (list)

I would like to extract the p-values from the Anderson-Darling test (ad.test from the kSamples package). Each test result is a list of 12 elements; element 7 is a 2x3 matrix that contains the p-values.
When using the following code:
lapply(AD_result, "[[", 7)
I get the following subset of AD test results (first 2 of a total of 50 shown)
[[1]]
AD T.AD asympt. P-value
version 1: 1.72 0.94536 0.13169
version 2: 1.51 0.66740 0.17461
[[2]]
AD T.AD asympt. P-value
version 1: 12.299 14.624 6.9248e-07
version 2: 11.900 14.144 1.1146e-06
My question is how to extract only the p-value (e.g. for version 1) and put the 50 results into a vector.
The output from str(AD_result) is:
List of 55
$ :List of 12
..$ test.name : chr "Anderson-Darling"
..$ k : int 2
..$ ns : int [1:2] 103 2905
..$ N : int 3008
..$ n.ties : int 2873
..$ sig : num 0.762
..$ ad : num [1:2, 1:3] 1.72 1.51 0.945 0.667 0.132 ...
.. ..- attr(*, "dimnames")=List of 2
.. .. ..$ : chr [1:2] "version 1:" "version 2:"
.. .. ..$ : chr [1:3] "AD" "T.AD" " asympt. P-value"
..$ warning : logi FALSE
..$ null.dist1: NULL
..$ null.dist2: NULL
..$ method : chr "asymptotic"
..$ Nsim : num 1
..- attr(*, "class")= chr "kSamples"
You could try:
unlist(lapply(AD_result, function(x) x$ad[,3]))
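That returns both versions' p-values interleaved. To keep only the version-1 value from each result as a plain numeric vector, indexing by row name works too. A sketch with a hand-built stand-in for AD_result, since the real objects aren't reproducible here:

```r
# Hand-built stand-in mimicking the $ad matrix of a kSamples::ad.test result
ad <- matrix(c(1.72, 1.51, 0.94536, 0.6674, 0.13169, 0.17461), nrow = 2,
             dimnames = list(c("version 1:", "version 2:"),
                             c("AD", "T.AD", " asympt. P-value")))
AD_result <- list(list(ad = ad), list(ad = ad))

# One p-value per test, version 1 only:
p1 <- vapply(AD_result, function(x) x$ad["version 1:", 3], numeric(1))
p1
# 0.13169 0.13169
```

vapply is used instead of sapply so the result is guaranteed to be a numeric vector of the right length even if one element is malformed.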

Vegan NA values are breaking envfit, even with na.rm = T. Example images in post

I am trying to overlay envfit arrows onto an NMDS chart like this one (which is what I get when I replace missing values with fake numbers):
However, with our actual data, it doesn't draw the arrows but labels each point individually, like this:
Any suggestions would be appreciated.
Code:
# Make MDS
x.mds <- metaMDS(x_matrix, trace = FALSE)
# Extract point co-ordinates for putting into ggplot
NMDS <- data.frame(MDS1 = x.mds$points[,1], MDS2 = x.mds$points[,2])
p <- ggplot(NMDS, aes(MDS1, MDS2))
p + geom_point()
#environmental variables
ef <- envfit(x.mds ~ pH + Ammonia + DO, x.env)
ef <- envfit(x.mds ~ pH + Ammonia + DO, x.env, na.rm = TRUE) ##ALTERNATIVE
plot(ef)
Data:
Sample Region pH Ammonia Nitrate BOD DO
15 N 7.618 0.042 0.845 1 NA
34 N 7.911 0.04 7.41 8 5.62
42 SE 7.75 NA 3.82 1 21.629
........
> ef
***VECTORS
NMDS1 NMDS2 r2 Pr(>r)
pH 0.50849 -0.86107 0.0565 0.719
Ammonia 0.99050 -0.13751 0.0998 0.504
DO -0.88859 -0.45871 0.1640 0.319
P values based on 999 permutations.
1 observation deleted due to missingness
> str(ef)
List of 3
$ vectors :List of 4
..$ arrows : num [1:3, 1:2] 0.508 0.991 -0.889 -0.861 -0.138 ...
.. ..- attr(*, "dimnames")=List of 2
.. .. ..$ : chr [1:3] "pH" "Ammonia" "DO"
.. .. ..$ : chr [1:2] "NMDS1" "NMDS2"
.. ..- attr(*, "decostand")= chr "normalize"
..$ r : Named num [1:3] 0.0565 0.0998 0.164
.. ..- attr(*, "names")= chr [1:3] "pH" "Ammonia" "DO"
..$ permutations: num 999
..$ pvals : num [1:3] 0.719 0.504 0.319
..- attr(*, "class")= chr "vectorfit"
$ factors : NULL
$ na.action:Class 'omit' int 17
- attr(*, "class")= chr "envfit"
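One workaround (my assumption: the odd plot comes from envfit silently dropping the incomplete row, as the "1 observation deleted due to missingness" note shows, while the NMDS keeps all rows) is to subset both steps to complete rows up front. A base-R sketch with a tiny hypothetical stand-in for x.env:

```r
# Hypothetical three-row stand-in for x.env
env <- data.frame(pH      = c(7.618, 7.911, 7.750),
                  Ammonia = c(0.042, 0.040, NA),
                  DO      = c(NA, 5.620, 21.629))
keep <- complete.cases(env[, c("pH", "Ammonia", "DO")])
keep
# FALSE  TRUE FALSE
# Then fit both steps on the same rows, e.g.:
#   x.mds <- metaMDS(x_matrix[keep, ], trace = FALSE)
#   ef    <- envfit(x.mds ~ pH + Ammonia + DO, x.env[keep, ])
```

This keeps the ordination scores and the environmental variables aligned row for row, so envfit never has to delete observations on its own.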

Accessing control chart results in R?

I have a short R script that loads a bunch of data and plots it in an XBar chart. Using the following code, I can plot the data and view the various statistical information.
library(qcc)
tir <- read.table("data.dat", header = TRUE, sep = "\t")
names(tir)
attach(tir)
rand <- sample(tir)
xbarchart <- qcc(rand[1:100,],type="R")
summary(xbarchart)
I want to do some process capability analysis (described here (PDF) on page 5) immediately after the XBar chart is created. To create the analysis chart, I need to store the LCL and UCL from the XBar chart results as variables. Is there any way to do this?
I shall answer your question using the example in the ?qcc help file.
x <- c(33.75, 33.05, 34, 33.81, 33.46, 34.02, 33.68, 33.27, 33.49, 33.20,
33.62, 33.00, 33.54, 33.12, 33.84)
xbarchart <- qcc(x, type="xbar.one", std.dev = "SD")
A useful function to inspect the structure of variables and function results is str(), short for structure.
str(xbarchart)
List of 11
$ call : language qcc(data = x, type = "xbar.one", std.dev = "SD")
$ type : chr "xbar.one"
$ data.name : chr "x"
$ data : num [1:15, 1] 33.8 33 34 33.8 33.5 ...
..- attr(*, "dimnames")=List of 2
.. ..$ Group : chr [1:15] "1" "2" "3" "4" ...
.. ..$ Samples: NULL
$ statistics: Named num [1:15] 33.8 33 34 33.8 33.5 ...
..- attr(*, "names")= chr [1:15] "1" "2" "3" "4" ...
$ sizes : int [1:15] 1 1 1 1 1 1 1 1 1 1 ...
$ center : num 33.5
$ std.dev : num 0.342
$ nsigmas : num 3
$ limits : num [1, 1:2] 32.5 34.5
..- attr(*, "dimnames")=List of 2
.. ..$ : chr ""
.. ..$ : chr [1:2] "LCL" "UCL"
$ violations:List of 2
..$ beyond.limits : int(0)
..$ violating.runs: num(0)
- attr(*, "class")= chr "qcc"
You will notice the second to last element in this list is called $limits and contains the two values for LCL and UCL.
It is simple to extract this element:
limits <- xbarchart$limits
limits
LCL UCL
32.49855 34.54811
Thus LCL <- limits[1] and UCL <- limits[2]
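From there, the capability analysis can follow directly: qcc provides process.capability(), which takes the qcc object plus your specification limits (the engineering tolerances, not the control limits). Below is a base-R sketch of the underlying Cp/Cpk arithmetic with hypothetical spec limits, followed by the qcc call:

```r
x <- c(33.75, 33.05, 34, 33.81, 33.46, 34.02, 33.68, 33.27, 33.49, 33.20,
       33.62, 33.00, 33.54, 33.12, 33.84)
LSL <- 32; USL <- 35            # hypothetical specification limits
s <- sd(x)                      # matches std.dev = "SD" used above
Cp  <- (USL - LSL) / (6 * s)    # potential capability
Cpk <- min(USL - mean(x), mean(x) - LSL) / (3 * s)  # accounts for off-centre mean
c(Cp = Cp, Cpk = Cpk)

# With qcc itself, reusing the chart object:
#   process.capability(xbarchart, spec.limits = c(LSL, USL))
```

process.capability() additionally draws the histogram with spec limits overlaid, as in the linked PDF.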
