Composite scores for consensus clustering in R

I am using the ConsensusClusterPlus package in R to cluster my omics data, and I want to use the resulting clusters in a regression. Is there a way to create composite scores if, say, I reduce 1000 genes to 7 clusters and use those 7 clusters as predictors?
I looked at the structure of the clustering result in R:
results = ConsensusClusterPlus(d1,maxK=maxK,reps=1000,pItem=0.8,pFeature=1, title=title,clusterAlg="hc",distance="pearson",seed=1262118388.71279,plot="png")
icl = calcICL(results,title=title,plot="png")
str(results[[7]])
List of 5
$ consensusMatrix: num [1:40, 1:40] 1 0.689 0.976 1 1 ...
$ consensusTree :List of 7
..$ merge : int [1:39, 1:2] -1 -5 -7 -8 -9 -10 -11 -12 -13 -14 ...
..$ height : num [1:39] 0 0 0 0 0 0 0 0 0 0 ...
..$ order : int [1:40] 40 34 35 28 6 32 22 18 21 19 ...
..$ labels : NULL
..$ method : chr "average"
..$ call : language hclust(d = as.dist(1 - fm), method = finalLinkage)
..$ dist.method: NULL
..- attr(*, "class")= chr "hclust"
$ consensusClass : Named int [1:40] 1 1 1 1 1 2 1 1 1 1 ...
..- attr(*, "names")= chr [1:40] "CAR 12:0" "CAR 12:1" "CAR 13:0" "CAR 14:0" ...
$ ml : num [1:40, 1:40] 1 0.689 0.976 1 1 ...
$ clrs :List of 3
..$ : chr [1:40] "#A6CEE3" "#A6CEE3" "#A6CEE3" "#A6CEE3" ...
..$ : num 8
..$ : chr [1:7] "#A6CEE3" "#FB9A99" "#FF7F00" "#FDBF6F" ...
How can I compute composite scores from these clusters?
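One common approach, sketched here under assumptions and not as a built-in ConsensusClusterPlus feature, is to z-score each feature and average the features assigned to the same cluster, which gives one score per sample per cluster. The sketch uses simulated stand-ins: d1 is assumed to hold samples in rows and the clustered features in columns, cls stands in for results[[7]]$consensusClass, and y is a hypothetical outcome. Taking the first principal component of each cluster is another common choice.

```r
set.seed(1)

# Hypothetical stand-in for d1: 20 samples (rows) x 40 features (columns)
d1 <- matrix(rnorm(20 * 40), nrow = 20,
             dimnames = list(paste0("sample", 1:20), paste0("feat", 1:40)))

# Hypothetical stand-in for results[[7]]$consensusClass:
# a named integer vector mapping each feature to one of 7 clusters
cls <- setNames(rep(1:7, length.out = 40), colnames(d1))

# z-score every feature so that no single feature dominates a cluster mean
z <- scale(d1)

# One composite score per sample and cluster: the mean z-score of its features
scores <- sapply(split(names(cls), cls),
                 function(f) rowMeans(z[, f, drop = FALSE]))
colnames(scores) <- paste0("cluster", colnames(scores))

# Use the 7 composite scores as predictors (y is a made-up outcome)
y <- rnorm(20)
fit <- lm(y ~ ., data = data.frame(y = y, scores))
```

With real data, cls would be replaced by results[[7]]$consensusClass and d1 by the matrix passed to ConsensusClusterPlus, transposed if necessary so that the clustered items are in columns.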

Related

MCMCglmm interaction plots in R with ggeffects-package or sjPlot-package

I have fitted many Bayesian models using the MCMCglmm package in R, like this one:
model=MCMCglmm(scale(lifespan)~scale(weight)*scale(littersize),
random=~idv(DNA1)+idv(DNA2),
data=df,
family="gaussian",
prior=prior1,
thin=50,
burnin=5000,
nitt=50000,
verbose=F)
summary(model)
post.mean l-95% CI u-95% CI eff.samp pMCMC
(Intercept) 11.23327 8.368 13.73756 6228 <2e-04 ***
weight -1.63770 -2.059 -1.23457 6600 <2e-04 ***
littersize 0.40960 0.024 0.80305 6600 0.0415 *
weight:littersize -0.33411 -0.635 -0.04406 5912 0.0248 *
I would like to plot the resulting interaction (weight:littersize) with the ggeffects or sjPlot packages, like this:
plot_model(model,
type = "int",
terms = c("scale(lifespan)", "scale(weight)", "scale(littersize)"),
mdrt.values = "meansd",
ppd = TRUE)
But I get the following output:
`scale(weight)` was not found in model terms. Maybe misspelled?
`scale(littersize)` was not found in model terms. Maybe misspelled?
Error in terms.default(model) : no terms component nor attribute
In addition: Warning messages:
1: Some model terms could not be found in model data. You probably need to load the data into the environment.
2: Some model terms could not be found in model data. You probably need to load the data into the environment.
The data is already loaded. I tried writing the terms without the "scale(x)" wrapper, and I also changed the model so that the term names match, but I still get this error message. I am also open to plotting this interaction with a different package.
The output of str(model) is:
>str(model)
List of 20
$ Sol : 'mcmc' num [1:6600, 1:4] -0.814 1.215 -2.119 -0.125 -1.648 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : NULL
.. ..$ : chr [1:4] "(Intercept)" "scale(weight)" "scale(littersize)" "scale(weight):scale(littersize)"
..- attr(*, "mcpar")= num [1:3] 7e+04 4e+05 5e+01
$ Lambda : NULL
$ VCV : 'mcmc' num [1:6600, 1:3] 1.094 0.693 1.58 0.645 1.161 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : NULL
.. ..$ : chr [1:3] "phylo." "haplo." "units"
..- attr(*, "mcpar")= num [1:3] 7e+04 4e+05 5e+01
$ CP : NULL
$ Liab : NULL
$ Fixed :List of 3
..$ formula:Class 'formula' language scale(lifespan) ~ scale(weight) * scale(littersize)
.. .. ..- attr(*, ".Environment")=<environment: R_GlobalEnv>
..$ nfl : int 4
..$ nll : num 0
$ Random :List of 5
..$ formula:Class 'formula' language ~idv(phylo) + idv(haplo)
.. .. ..- attr(*, ".Environment")=<environment: R_GlobalEnv>
..$ nfl : num [1:2] 1 1
..$ nrl : int [1:2] 92 92
..$ nat : num [1:2] 0 0
..$ nrt : int [1:2] 1 1
$ Residual :List of 6
..$ formula :Class 'formula' language ~units
.. .. ..- attr(*, ".Environment")=<environment: 0x0000025ba05f8938>
..$ nfl : num 1
..$ nrl : int 92
..$ nrt : int 1
..$ family : chr "gaussian"
..$ original.family: chr "gaussian"
$ Deviance : 'mcmc' num [1:6600] -262.6 -137.3 -203.6 -83.6 -29.1 ...
..- attr(*, "mcpar")= num [1:3] 7e+04 4e+05 5e+01
$ DIC : num -158
$ X :Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
.. ..# i : int [1:368] 0 1 2 3 4 5 6 7 8 9 ...
.. ..# p : int [1:5] 0 92 184 276 368
.. ..# Dim : int [1:2] 92 4
.. ..# Dimnames:List of 2
.. .. ..$ : chr [1:92] "1.1" "2.1" "3.1" "4.1" ...
.. .. ..$ : chr [1:4] "(Intercept)" "scale(weight)" "scale(littersize)" "scale(weight):scale(littersize)"
.. ..# x : num [1:368] 1 1 1 1 1 1 1 1 1 1 ...
.. ..# factors : list()
$ Z :Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
.. ..# i : int [1:16928] 0 1 2 3 4 5 6 7 8 9 ...
.. ..# p : int [1:185] 0 92 184 276 368 460 552 644 736 828 ...
.. ..# Dim : int [1:2] 92 184
.. ..# Dimnames:List of 2
.. .. ..$ : NULL
.. .. ..$ : chr [1:184] "phylo1.NA.1" "phylo2.NA.1" "phylo3.NA.1" "phylo4.NA.1" ...
.. ..# x : num [1:16928] 0.4726 0.0869 0.1053 0.087 0.1349 ...
.. ..# factors : list()
$ ZR :Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
.. ..# i : int [1:92] 0 1 2 3 4 5 6 7 8 9 ...
.. ..# p : int [1:93] 0 1 2 3 4 5 6 7 8 9 ...
.. ..# Dim : int [1:2] 92 92
.. ..# Dimnames:List of 2
.. .. ..$ : NULL
.. .. ..$ : chr [1:92] "units.1" "units.2" "units.3" "units.4" ...
.. ..# x : num [1:92] 1 1 1 1 1 1 1 1 1 1 ...
.. ..# factors : list()
$ XL : NULL
$ ginverse : NULL
$ error.term : int [1:92] 1 1 1 1 1 1 1 1 1 1 ...
$ family : chr [1:92] "gaussian" "gaussian" "gaussian" "gaussian" ...
$ Tune : num [1, 1] 1
..- attr(*, "dimnames")=List of 2
.. ..$ : chr "1"
.. ..$ : chr "1"
$ meta : logi FALSE
$ y.additional: num [1:92, 1:2] 0 0 0 0 0 0 0 0 0 0 ...
- attr(*, "class")= chr "MCMCglmm"
Thank you.

Try scaling your variables before fitting the model, i.e.
df$lifespan <- as.vector(scale(df$lifespan))
Or better, use effectsize::standardize(), which does not create a matrix for a one-dimensional vector when scaling your variables:
df <- effectsize::standardize(df, select = c("lifespan", "weight", "littersize"))
Then you can call your model like this:
model <- MCMCglmm(lifespan ~ weight * littersize,
random=~idv(DNA1)+idv(DNA2),
data=df,
family="gaussian",
prior=prior1,
thin=50,
burnin=5000,
nitt=50000,
verbose=F)
Does this work?
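To illustrate the point about scale(), here is a small base-R sketch with a made-up data frame (the values are hypothetical). scale() returns a one-column matrix carrying centering and scaling attributes, whereas as.vector() strips it back to a plain numeric vector, so the model terms are ordinary column names:

```r
# Hypothetical toy data; only the column name mirrors the question
df <- data.frame(weight = c(2.1, 3.4, 5.0, 4.2))

s <- scale(df$weight)   # one-column matrix with "scaled:center"/"scaled:scale" attributes
is.matrix(s)            # TRUE

v <- as.vector(s)       # plain numeric vector, same standardized values
is.matrix(v)            # FALSE

df$weight <- v          # the model formula can now simply use `weight`
```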

R Unable to plot loaded randomForest object

I'm unable to plot a randomForest object with plot() after loading it from an RData file.
library("randomForest")
load("rf.RData")
plot(rf)
I get the error:
Error in array(x, c(length(x), 1L), if (!is.null(names(x))) list(names(x), :
'data' must be of a vector type, was 'NULL'
I get the same error when I call randomForest:::plot.randomForest(rf).
Other function calls on rf work just fine.
EDIT:
See output of str(rf)
str(rf)
List of 15
$ call : language randomForest(x = data[, match("feat1", names(data)):match("feat_n", names(data))], y = data[, match("my_y", n| __truncated__ ...
$ type : chr "regression"
$ predicted : Named num [1:723012] -1141 -1767 -1577 NA -1399 ...
..- attr(*, "names")= chr [1:723012] "1" "2" "3" "4" ...
$ oob.times : int [1:723012] 3 4 6 3 2 3 2 6 7 5 ...
$ importance : num [1:150, 1:2] 6172 928 6367 5754 1013 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : chr [1:150] "feat1" "feat2" "feat3" "feat4" ...
.. ..$ : chr [1:2] "%IncMSE" "IncNodePurity"
$ importanceSD : Named num [1:150] 400.9 96.7 500.1 428.9 194.8 ...
..- attr(*, "names")= chr [1:150] "feat1" "feat2" "feat3" "feat4" ...
$ localImportance: NULL
$ proximity : NULL
$ ntree : num 60
$ mtry : num 10
$ forest :List of 11
..$ ndbigtree : int [1:60] 392021 392219 392563 392845 393321 392853 392157 392709 393223 392679 ...
..$ nodestatus : num [1:393623, 1:60] -3 -3 -3 -3 -3 -3 -3 -3 -3 -3 ...
..$ leftDaughter : num [1:393623, 1:60] 2 4 6 8 10 12 14 16 18 20 ...
..$ rightDaughter: num [1:393623, 1:60] 3 5 7 9 11 13 15 17 19 21 ...
..$ nodepred : num [1:393623, 1:60] -8.15 -31.38 5.62 -59.87 -16.06 ...
..$ bestvar : num [1:393623, 1:60] 118 57 82 77 65 148 39 39 12 77 ...
..$ xbestsplit : num [1:393623, 1:60] 1.08e+02 -8.26e+08 -2.50 8.55e+03 1.20e+04 ...
..$ ncat : Named int [1:150] 1 1 1 1 1 1 1 1 1 1 ...
.. ..- attr(*, "names")= chr [1:150] "feat1" "feat2" "feat3" "feat4" ...
..$ nrnodes : int 393623
..$ ntree : num 60
..$ xlevels :List of 150
.. ..$ feat1 : num 0
.. ..$ feat2 : num 0
.. ..$ feat3 : num 0
.. ..$ feat4 : num 0
.. ..$ featn : num 0
.. .. [list output truncated]
$ coefs : NULL
$ y : num [1:723012] -1885 -1918 -1585 -1838 -2035 ...
$ test : NULL
$ inbag : NULL
- attr(*, "class")= chr "randomForest"

Error 'duplicate subscripts for columns' on using CreateTableOne

I was trying to run CreateTableOne from the tableone package on my data set, m.dataaaaaa, using the following code:
CreateTableOne(vars =Vars,strata = "ejecfraclesstha40_gps", factorVars =Catvars, data = m.dataaaaaa, test = T)
But I got the following error:
Error in [<-.data.frame(x, i, value = value) : duplicate subscripts for columns
In addition: Warning message:
In ModuleReturnVarsExist(vars, data) :
The data frame does not have: ejecfraclesstha40 Dropped
Part of the structure of the data is shown below, since it is a large data set:
str(m.dataaaaaa)
Classes ‘data.table’ and 'data.frame': 194 obs. of 203 variables:
$ ejecfraclesstha40_gps : num 1 0 1 0 0 0 1 1 1 0 ...
$ Serial.ID : num 2 3 4 7 10 14 17 20 23 24 ...
..- attr(*, "format.spss")= chr "F4.0"
$ Serial.ID_matched.EF.cohort.Ivan1.to.2 : num 2 NA 4 NA NA NA 17 20 23 NA ...
..- attr(*, "format.spss")= chr "F8.0"
$ ps..matched.EF.cohort.Ivan1.to.2 : num 0.138 NA 0.19 NA NA NA 0.176 0.286 0.152 NA ...
..- attr(*, "format.spss")= chr "F8.3"
$ psweight1.to.2 : num 1 NA 1 NA NA NA 1 1 1 NA ...
..- attr(*, "format.spss")= chr "F8.2"
$ matched_ID1.to.2 : num 483 NA 763 NA NA NA 180 176 239 NA ...
..- attr(*, "format.spss")= chr "F8.2"
$ matched_cases_in_control1.to.2 : num 2 NA 2 NA NA NA 2 2 2 NA ...
..- attr(*, "format.spss")= chr "F8.2"
$ ejecfrac_4gps : num 1 3 1 3 3 3 1 1 1 3 ...
..- attr(*, "format.spss")= chr "F8.2"
..- attr(*, "labels")= Named num 1 2 3 4
.. ..- attr(*, "names")= chr "EF<35%" "EF=35 - <40%" "EF=40 - <=50" "EF>50%"
$ ejecfrac_4gps30 : num 1 4 1 3 3 4 1 1 1 4 ...
..- attr(*, "format.spss")= chr "F8.2"
..- attr(*, "labels")= Named num 1 2 3 4
.. ..- attr(*, "names")= chr "EF<=30%" "EF>30 - 39%" "EF=40 - 49%" "EF>=50%"
$ renisch : num 29 31 23 18 48 19 10 29 17 13 ...
..- attr(*, "label")= chr "renal + visceral ischemic time"
..- attr(*, "format.spss")= chr "F3.0"
..- attr(*, "display_width")= int 12
$ totxct : num 46 31 55 46 48 19 54 29 17 37 ...
..- attr(*, "label")= chr "total cross-clamp time"
..- attr(*, "format.spss")= chr "F4.0"
..- attr(*, "display_width")= int 12
The original database was read from SPSS into R.
My main problem is with this error:
Error in [<-.data.frame(x, i, value = value) : duplicate subscripts for columns
Any advice will be greatly appreciated.

How Can I Quickly Inspect Built-in Data Sets (PSA)?

One of the best ways to make a question reproducible is to use one of the built in data sets. Using data(), however, is frustrating because no information about the structure of the data set is provided.
How can I quickly view the structure of available data sets?

The following function may help:
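dataStr <- function(fun = function(x) TRUE) {
  str(
    Filter(
      fun,
      Filter(
        Negate(is.null),
        # look up every item listed by data(); items that are not found become NULL
        mget(data()$results[, "Item"], inherits = TRUE, ifnotfound = list(NULL))
      )
    )
  )
}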
It accepts a filtering function, applies it to all the data sets, and prints out the structure of the matching data sets. For example, if we're looking for matrices:
> dataStr(is.matrix)
List of 8
$ WorldPhones : num [1:7, 1:7] 45939 60423 64721 68484 71799 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : chr [1:7] "1951" "1956" "1957" "1958" ...
.. ..$ : chr [1:7] "N.Amer" "Europe" "Asia" "S.Amer" ...
$ occupationalStatus : 'table' int [1:8, 1:8] 50 16 12 11 2 12 0 0 19 40 ...
..- attr(*, "dimnames")=List of 2
.. ..$ origin : chr [1:8] "1" "2" "3" "4" ...
.. ..$ destination: chr [1:8] "1" "2" "3" "4" ...
$ volcano : num [1:87, 1:61] 100 101 102 103 104 105 105 106 107 108 ...
--- 5 entries omitted ---
Or for data frames (also omitting entries):
> dataStr(is.data.frame)
List of 42
$ BOD :'data.frame': 6 obs. of 2 variables:
..$ Time : num [1:6] 1 2 3 4 5 7
..$ demand: num [1:6] 8.3 10.3 19 16 15.6 19.8
..- attr(*, "reference")= chr "A1.4, p. 270"
$ CO2 :Classes ‘nfnGroupedData’, ‘nfGroupedData’, ‘groupedData’ and 'data.frame': 84 obs. of 5 variables:
..$ Plant : Ord.factor w/ 12 levels "Qn1"<"Qn2"<"Qn3"<..: 1 1 1 1 1 1 1 2 2 2 ...
..$ Type : Factor w/ 2 levels "Quebec","Mississippi": 1 1 1 1 1 1 1 1 1 1 ...
..$ Treatment: Factor w/ 2 levels "nonchilled","chilled": 1 1 1 1 1 1 1 1 1 1 ...
..$ conc : num [1:84] 95 175 250 350 500 675 1000 95 175 250 ...
..$ uptake : num [1:84] 16 30.4 34.8 37.2 35.3 39.2 39.7 13.6 27.3 37.1 ...
--- 40 entries omitted ---
Or even for simple vectors:
> dataStr(function(x) is.atomic(x) && is.vector(x) && !is.ts(x))
List of 4
$ euro : Named num [1:11] 13.76 40.34 1.96 166.39 5.95 ...
..- attr(*, "names")= chr [1:11] "ATS" "BEF" "DEM" "ESP" ...
$ islands: Named num [1:48] 11506 5500 16988 2968 16 ...
..- attr(*, "names")= chr [1:48] "Africa" "Antarctica" "Asia" "Australia" ...
$ precip : Named num [1:70] 67 54.7 7 48.5 14 17.2 20.7 13 43.4 40.2 ...
..- attr(*, "names")= chr [1:70] "Mobile" "Juneau" "Phoenix" "Little Rock" ...
$ rivers : num [1:141] 735 320 325 392 524 ...
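For a more compact overview than str(), a rough sketch along the same lines tabulates the first class and the dimensions of every object in the datasets package. It assumes the datasets package is attached, as it is by default; the name overview is just illustrative.

```r
# Names of all objects exported by the attached datasets package
nms <- ls("package:datasets")

overview <- data.frame(
  name  = nms,
  # first class of each object, e.g. "data.frame", "matrix", "ts"
  class = sapply(nms, function(n) class(get(n, "package:datasets"))[1]),
  # dimensions as text; empty string for objects without a dim attribute
  dims  = sapply(nms, function(n)
    paste(dim(get(n, "package:datasets")), collapse = " x ")),
  row.names = NULL
)

head(overview)
```

Filtering is then ordinary subsetting, e.g. overview[overview$class == "data.frame", ].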

Extract all standard errors of coefficients from list of logistic regressions

I want to extract the standard errors from a list of logistic regression models.
This is the logistic regression function, written this way so I can run more than one analysis at once:
glmfunk <- function(x) glm( ldata$DFREE ~ x , family=binomial)
I run it on a subset of the variables in the dataframe ldata:
glmkort <- lapply(ldata[,c(2,3,5,6,7,8)],glmfunk)
I can extract the coefficients like this:
sapply(glmkort, "[[", "coefficients")
But how do I extract the standard errors of the coefficients? I can't seem to find them in str(glmkort).
Here is the str(glmkort) output for AGE, where I am looking for the standard error:
str(glmkort)
List of 6
$ AGE :List of 30
..$ coefficients : Named num [1:2] -1.17201 -0.00199
.. ..- attr(*, "names")= chr [1:2] "(Intercept)" "x"
..$ residuals : Named num [1:40] -1.29 -1.29 -1.29 -1.29 4.39 ...
.. ..- attr(*, "names")= chr [1:40] "1" "2" "3" "4" ...
..$ fitted.values : Named num [1:40] 0.223 0.225 0.225 0.225 0.228 ...
.. ..- attr(*, "names")= chr [1:40] "1" "2" "3" "4" ...
..$ effects : Named num [1:40] 3.2662 -0.0282 -0.4595 -0.4464 2.042 ...
.. ..- attr(*, "names")= chr [1:40] "(Intercept)" "x" "" "" ...
..$ R : num [1:2, 1:2] -2.64 0 -86.01 14.18
.. ..- attr(*, "dimnames")=List of 2
.. .. ..$ : chr [1:2] "(Intercept)" "x"
.. .. ..$ : chr [1:2] "(Intercept)" "x"
..$ rank : int 2
..$ qr :List of 5
.. ..$ qr : num [1:40, 1:2] -2.641 0.158 0.158 0.158 0.159 ...
.. .. ..- attr(*, "dimnames")=List of 2
.. .. .. ..$ : chr [1:40] "1" "2" "3" "4" ...
.. .. .. ..$ : chr [1:2] "(Intercept)" "x"
.. ..$ rank : int 2
.. ..$ qraux: num [1:2] 1.16 1.01
.. ..$ pivot: int [1:2] 1 2
.. ..$ tol : num 1e-11
.. ..- attr(*, "class")= chr "qr"
..$ family :List of 12
.. ..$ family : chr "binomial"
.. ..$ link : chr "logit"
.. ..$ linkfun :function (mu)
.. ..$ linkinv :function (eta)
.. ..$ variance :function (mu)
.. ..$ dev.resids:function (y, mu, wt)
.. ..$ aic :function (y, n, mu, wt, dev)
.. ..$ mu.eta :function (eta)
.. ..$ initialize: expression({ if (NCOL(y) == 1) { if (is.factor(y)) y <- y != levels(y)[1L] n <- rep.int(1, nobs) y[weights == 0] <- 0 if (any(y < 0 | y > 1)) stop("y values must be 0 <= y <= 1") mustart <- (weights * y + 0.5)/(weights + 1) m <- weights * y if (any(abs(m - round(m)) > 0.001)) warning("non-integer #successes in a binomial glm!") } else if (NCOL(y) == 2) { if (any(abs(y - round(y)) > 0.001)) warning("non-integer counts in a binomial glm!") n <- y[, 1] + y[, 2] y <- ifelse(n == 0, 0, y[, 1]/n) weights <- weights * n mustart <- (n * y + 0.5)/(n + 1) } else stop("for the binomial family, y must be a vector of 0 and 1's\n", "or a 2 column matrix where col 1 is no. successes and col 2 is no. failures") })
.. ..$ validmu :function (mu)
.. ..$ valideta :function (eta)
.. ..$ simulate :function (object, nsim)
.. ..- attr(*, "class")= chr "family"
..$ linear.predictors: Named num [1:40] -1.25 -1.24 -1.24 -1.24 -1.22 ...
.. ..- attr(*, "names")= chr [1:40] "1" "2" "3" "4" ...
..$ deviance : num 42.7
..$ aic : num 46.7
..$ null.deviance : num 42.7
..$ iter : int 4
..$ weights : Named num [1:40] 0.173 0.174 0.174 0.174 0.176 ...
.. ..- attr(*, "names")= chr [1:40] "1" "2" "3" "4" ...
..$ prior.weights : Named num [1:40] 1 1 1 1 1 1 1 1 1 1 ...
.. ..- attr(*, "names")= chr [1:40] "1" "2" "3" "4" ...
..$ df.residual : int 38
..$ df.null : int 39
..$ y : Named num [1:40] 0 0 0 0 1 0 1 0 0 0 ...
.. ..- attr(*, "names")= chr [1:40] "1" "2" "3" "4" ...
..$ converged : logi TRUE
..$ boundary : logi FALSE
..$ model :'data.frame': 40 obs. of 2 variables:
.. ..$ ldata$DFREE: int [1:40] 0 0 0 0 1 0 1 0 0 0 ...
.. ..$ x : int [1:40] 39 33 33 32 24 30 39 27 40 36 ...
.. ..- attr(*, "terms")=Classes 'terms', 'formula' length 3 ldata$DFREE ~ x
.. .. .. ..- attr(*, "variables")= language list(ldata$DFREE, x)
.. .. .. ..- attr(*, "factors")= int [1:2, 1] 0 1
.. .. .. .. ..- attr(*, "dimnames")=List of 2
.. .. .. .. .. ..$ : chr [1:2] "ldata$DFREE" "x"
.. .. .. .. .. ..$ : chr "x"
.. .. .. ..- attr(*, "term.labels")= chr "x"
.. .. .. ..- attr(*, "order")= int 1
.. .. .. ..- attr(*, "intercept")= int 1
.. .. .. ..- attr(*, "response")= int 1
.. .. .. ..- attr(*, ".Environment")=<environment: 0x017a5674>
.. .. .. ..- attr(*, "predvars")= language list(ldata$DFREE, x)
.. .. .. ..- attr(*, "dataClasses")= Named chr [1:2] "numeric" "numeric"
.. .. .. .. ..- attr(*, "names")= chr [1:2] "ldata$DFREE" "x"
..$ call : language glm(formula = ldata$DFREE ~ x, family = binomial)
..$ formula :Class 'formula' length 3 ldata$DFREE ~ x
.. .. ..- attr(*, ".Environment")=<environment: 0x017a5674>
..$ terms :Classes 'terms', 'formula' length 3 ldata$DFREE ~ x
.. .. ..- attr(*, "variables")= language list(ldata$DFREE, x)
.. .. ..- attr(*, "factors")= int [1:2, 1] 0 1
.. .. .. ..- attr(*, "dimnames")=List of 2
.. .. .. .. ..$ : chr [1:2] "ldata$DFREE" "x"
.. .. .. .. ..$ : chr "x"
.. .. ..- attr(*, "term.labels")= chr "x"
.. .. ..- attr(*, "order")= int 1
.. .. ..- attr(*, "intercept")= int 1
.. .. ..- attr(*, "response")= int 1
.. .. ..- attr(*, ".Environment")=<environment: 0x017a5674>
.. .. ..- attr(*, "predvars")= language list(ldata$DFREE, x)
.. .. ..- attr(*, "dataClasses")= Named chr [1:2] "numeric" "numeric"
.. .. .. ..- attr(*, "names")= chr [1:2] "ldata$DFREE" "x"
..$ data :<environment: 0x017a5674>
..$ offset : NULL
..$ control :List of 3
.. ..$ epsilon: num 1e-08
.. ..$ maxit : num 25
.. ..$ trace : logi FALSE
..$ method : chr "glm.fit"
..$ contrasts : NULL
..$ xlevels : Named list()
..- attr(*, "class")= chr [1:2] "glm" "lm"
$ BECK :List of 30
Here is the data I used for the example. I shortened it for the purpose of this question:
> ldata
ID AGE BECK IVHX NDRUGTX RACE TREAT SITE DFREE
1 1 39 9.000 3 1 0 1 0 0
2 2 33 34.000 2 8 0 1 0 0
3 3 33 10.000 3 3 0 1 0 0
4 4 32 20.000 3 1 0 0 0 0
5 5 24 5.000 1 5 1 1 0 1
6 6 30 32.550 3 1 0 1 0 0
7 7 39 19.000 3 34 0 1 0 1
8 8 27 10.000 3 2 0 1 0 0
9 9 40 29.000 3 3 0 1 0 0
10 10 36 25.000 3 7 0 1 0 0
11 11 38 18.900 3 8 0 1 0 0
12 12 29 16.000 1 1 0 1 0 0
13 13 32 36.000 3 2 1 1 0 1
14 14 41 19.000 3 8 0 1 0 0
15 15 31 18.000 3 1 0 1 0 0
16 16 27 12.000 3 3 0 1 0 0
17 17 28 34.000 3 6 0 1 0 0
18 18 28 23.000 2 1 0 1 0 0
19 19 36 26.000 1 15 1 1 0 1
20 20 32 18.900 3 5 0 1 0 1
21 21 33 15.000 1 1 0 0 0 1
22 22 28 25.200 3 8 0 0 0 0
23 23 29 6.632 2 0 0 0 0 0
24 24 35 2.100 3 9 0 0 0 0
25 25 45 26.000 3 6 0 0 0 0
26 26 35 39.789 3 5 0 0 0 0
27 27 24 20.000 1 3 0 0 0 0
28 28 36 16.000 3 7 0 0 0 0
29 29 39 22.000 3 9 0 0 0 1
30 30 36 9.947 2 10 0 0 0 0
31 31 37 9.450 3 1 0 0 0 0
32 32 30 39.000 3 1 0 0 0 0
33 33 44 41.000 3 5 0 0 0 0
34 34 28 31.000 1 6 1 0 0 1
35 35 25 20.000 1 3 1 0 0 0
36 36 30 8.000 3 7 0 1 0 0
37 37 24 9.000 1 1 0 0 0 0
38 38 27 20.000 1 1 0 0 0 0
39 39 30 8.000 1 2 1 0 0 1
40 40 34 8.000 3 0 0 1 0 0

Using an example from ?glm
## Dobson (1990) Page 93: Randomized Controlled Trial :
counts <- c(18,17,15,20,10,20,25,13,12)
outcome <- gl(3,1,9)
treatment <- gl(3,3)
glm.D93 <- glm(counts ~ outcome + treatment, family=poisson())
## copy twice to a list to illustrate
lmod <- list(mod1 = glm.D93, mod2 = glm.D93)
Then we could compute them as summary() would, or extract them after calling summary(). The former is far more efficient as you only compute what you want. The latter doesn't rely on knowing how the standard errors are derived.
Compute the standard errors directly
The standard errors can be computed from the variance-covariance matrix of the model. The diagonal of this matrix contains the variances of the coefficients, and the standard errors are simply the square root of these variances. The vcov() extractor function gets the variance-covariance matrix for us and we square root the diagonals with sqrt(diag()):
> lapply(lmod, function(x) sqrt(diag(vcov(x))))
$mod1
(Intercept) outcome2 outcome3 treatment2 treatment3
0.1708987 0.2021708 0.1927423 0.2000000 0.2000000
$mod2
(Intercept) outcome2 outcome3 treatment2 treatment3
0.1708987 0.2021708 0.1927423 0.2000000 0.2000000
Extract them from a call to summary()
Or we can let summary() compute the standard errors (and a lot more), then use lapply() or sapply() to apply an anonymous function that extracts coef(summary(x)) and takes the second column (in which the standard errors are stored).
lapply(lmod, function(x) coef(summary(x))[,2])
Which gives
> lapply(lmod, function(x) coef(summary(x))[,2])
$mod1
(Intercept) outcome2 outcome3 treatment2 treatment3
0.1708987 0.2021708 0.1927423 0.2000000 0.2000000
$mod2
(Intercept) outcome2 outcome3 treatment2 treatment3
0.1708987 0.2021708 0.1927423 0.2000000 0.2000000
whereas sapply() would give:
> sapply(lmod, function(x) coef(summary(x))[,2])
mod1 mod2
(Intercept) 0.1708987 0.1708987
outcome2 0.2021708 0.2021708
outcome3 0.1927423 0.1927423
treatment2 0.2000000 0.2000000
treatment3 0.2000000 0.2000000
Depending on what you want to do, you could extract both the coefficients and the standard errors with a single call:
> lapply(lmod, function(x) coef(summary(x))[,1:2])
$mod1
Estimate Std. Error
(Intercept) 3.044522e+00 0.1708987
outcome2 -4.542553e-01 0.2021708
outcome3 -2.929871e-01 0.1927423
treatment2 1.337909e-15 0.2000000
treatment3 1.421085e-15 0.2000000
$mod2
Estimate Std. Error
(Intercept) 3.044522e+00 0.1708987
outcome2 -4.542553e-01 0.2021708
outcome3 -2.929871e-01 0.1927423
treatment2 1.337909e-15 0.2000000
treatment3 1.421085e-15 0.2000000
But you might prefer them separately?
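Applied to a list of one-predictor logistic regressions like the one in the question, the same idea might look as follows. This is only a sketch: the built-in mtcars data stands in for ldata, with am as the binary response, and fit_one is a hypothetical counterpart of glmfunk.

```r
# Stand-in for glmfunk: one logistic regression per predictor column
fit_one <- function(x) glm(mtcars$am ~ x, family = binomial)

# Fit one model per selected column, as in lapply(ldata[, ...], glmfunk)
fits <- lapply(mtcars[, c("mpg", "wt", "hp")], fit_one)

# Standard errors computed directly from the variance-covariance matrix
se <- sapply(fits, function(m) sqrt(diag(vcov(m))))

# The same values taken from the summary() coefficient table
se_summary <- sapply(fits, function(m) coef(summary(m))[, "Std. Error"])

se
```

Both se and se_summary are matrices with one column per model and one row per coefficient ("(Intercept)" and "x").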
