How to extract p-values from lmekin objects in coxme package - r

I want to be able to view the p-values for the lmekin object produced by the coxme package.
eg.
model= lmekin(formula = height ~ score + sex + age + (1 | IID), data = phenotype_df, varlist = kinship_matrix)
I tried:
summary(model)
coef(summary(model))
summary(model$coefficient$fixed)
fixef(model)/ sqrt(diag(vcov(model)) #(Calculates Z-scores but not p-values)
But these did not work. So how do I view the p-values for this linear mixed model?

It took me ages of searching to figure this out, but I noticed a lot of other similar questions without proper answers, so I wanted to answer this.
You use:
library(coxme)
print(model)
Note it is important to load the coxme package beforehand or it will not work.
I've also noticed a lot of posts about how to extract the p-value from lmekin objects, or how to extract the p-value from coxme objects in general. I wrote this function, which is based on the coxme:::print.coxme function code (to view code type coxme:::print.coxme directly into R). print calculates p-values on the fly - this function allows the extraction of p-values and saves them to an object.
Note that mod is your model, eg. mod <- lmekin(y~x+a+b)
Use print(mod) to double check that the tables match.
extract_coxme_table <- function (mod){
beta <- mod$coefficients$fixed
nvar <- length(beta)
nfrail <- nrow(mod$var) - nvar
se <- sqrt(diag(mod$var)[nfrail + 1:nvar])
z<- round(beta/se, 2)
p<- signif(1 - pchisq((beta/se)^2, 1), 2)
table=data.frame(cbind(beta,se,z,p))
return(table)
}

I arrived at this topic because I was looking for the same thing for just the coxme object. The function of IcedCoffee works with a micro adjustment:
extract_coxme_table <- function (mod){
beta <- mod$coefficients #$fixed is not needed
nvar <- length(beta)
nfrail <- nrow(mod$var) - nvar
se <- sqrt(diag(mod$var)[nfrail + 1:nvar])
z<- round(beta/se, 2)
p<- signif(1 - pchisq((beta/se)^2, 1), 2)
table=data.frame(cbind(beta,se,z,p))
return(table)
}

Related

Time-dependent covariates- is there something wrong with this code? (R program)

I am checking a few of my Cox multivariate regression analyses' proportional hazard assumptions using time-dependent co-variates, using the survival package. The question is looking at survival in groups with different ADAMTS13 levels (a type of enzyme).
Could I check if something is wrong with my code itself? It keeps saying Error in tt(TMAdata$ADAMTS13level.f) : could not find function "tt" . Why?
Notably, ADAMTS13level.f is a factor variable.
cox_multivariate_survival_ADAMTS13 <- coxph(Surv(TMAdata$Daysalive, TMAdata$'Dead=1')
~TMAdata$ADAMTS13level.f
+TMAdata$`Age at diagnosis`
+TMAdata$CCIwithoutage
+TMAdata$Gender.f
+TMAdata$`Peak Creatinine`
+TMAdata$DICorcrit.f,
tt(TMAdata$ADAMTS13level.f),
tt = function(x, t, ...)
{mtrx <- model.matrix(~x)[,-1]
mtrx * log(t)})
Thanks- starting with the fundamentals of my actual code or typos- I have tried different permutations to no avail yet.
#Limey was on the right track!
The time-transformed version of ADAMTS13level.f needs to be added to the model, instead of being separated into a separate argument of coxph(...).
The form of coxph call when testing the time-dependent categorical variables is described in How to use the timeSplitter by Max Gordon.
Other helpful documentation:
coxph - fit proportional hazards regression model
cox_multivariate_survival_ADAMTS13 <-
coxph(
Surv(
Daysalive,
'Dead=1'
) ~
ADAMTS13level.f
+ `Age at diagnosis`
+ CCIwithoutage
+ Gender.f
+ `Peak Creatinine`
+ DICorcrit.f
+ tt(ADAMTS13level.f),
tt = function(x, t, ...) {
mtrx <- model.matrix(~x)[,-1]
mtrx * log(t)
},
data = TMAdata
)
p.s. with the original data, there was also a problem because Daysalive included a zero (0) value, which eventually resulted in an 'infinite predictor' error from coxph, probably because tt transformed the data using a log(t). (https://rdrr.io/github/therneau/survival/src/R/coxph.R)

R Output of fGarch

I am modelling a time series as a GARCH(1,1)-process:
And the z_t are t-distributed.
In R, I do this in the fGarch-package via
model <- garchFit(formula = ~garch(1,1), cond.dist = "std", data=r)
Is this correct?
Now, I would like to understand the output of this to check my formula.
Obviously, model#fit$coefs gives me the coefficients and model#fitted gives me the fitted r_t.
But how do I get the fitted sigma_t and z_t?
I believe that the best way is to define extractor functions when generics are not available and methods when generics already exist.
The first two functions extract the values of interest from the fitted objects.
get_sigma_t <- function(x, ...){
x#sigma.t
}
get_z_t <- function(x, ...){
x#fit$series$z
}
Here a logLik method for objects of class "fGARCH" is defined.
logLik.fGARCH <- function(x, ...){
x#fit$value
}
Now use the functions, including the method. The data comes from the first example in help("garchFit").
N <- 200
r <- as.vector(garchSim(garchSpec(rseed = 1985), n = N)[,1])
model <- garchFit(~ garch(1, 1), data = r, trace = FALSE)
get_sigma_t(model) # output not shown
get_z_t(model) # output not shown
logLik(model)
#LogLikelihood
# -861.9494
Note also that methods coef and fitted exist, there is no need for model#fitted or model#fit$coefs, like is written in the question.
fitted(model) # much simpler
coef(model)
# mu omega alpha1 beta1
#3.541769e-05 1.081941e-06 8.885493e-02 8.120038e-01
It is a list structure. Can find the structure with
str(model)
From the structure, it is easier to extract with $ or #
model#fit$series$z
model#sigma.t

Is it possible to adapt standard prediction interval code for dlm in R with other distribution?

Using the dlm package in R I fit a dynamic linear model to a time series data set, consisting of 20 observations. I then use the dlmForecast function to predict future values (which I can validate against the genuine data for said period).
I use the following code to create a prediction interval;
ciTheory <- (outer(sapply(fut1$Q, FUN=function(x) sqrt(diag(x))), qnorm(c(0.05,0.95))) +
as.vector(t(fut1$f)))
However my data does not follow a normal distribution and I wondered whether it would be possible to
adapt the qnorm function for other distributions. I have tried qt, but am unable to apply qgamma.......
Just wondered if anyone knew how you would go about sorting this.....
Below is a reproduced version of my code...
library(dlm)
data <- c(20.68502, 17.28549, 12.18363, 13.53479, 15.38779, 16.14770, 20.17536, 43.39321, 42.91027, 49.41402, 59.22262, 55.42043)
mod.build <- function(par) {
dlmModPoly(1, dV = exp(par[1]), dW = exp(par[2]))
}
# Returns most likely estimate of relevant values for parameters
mle <- dlmMLE(a2, rep(0,2), mod.build); #nileMLE$conv
if(mle$convergence==0) print("converged") else print("did not converge")
mod1 <- dlmModPoly(dV = v, dW = c(0, w))
mod1Filt <- dlmFilter(a1, mod1)
fut1 <- dlmForecast(mod1Filt, n = 7)
Cheers

Pasting object names inside functions

This is a follow-up question to this (see data and previous commands).
Starting with a list of models in mods, i am now able to find the model with the least AIC (corresponds to the best model):
mods <- lapply(methods, function(m)
update(amod.null, correlation = getFunction(m)(1, form = ~ x + y), method="ML"))
names(mods) <- methods
list.AIC <- lapply(mods, function(x) AIC(x))
best.mod <- names(which.min(list.AIC))
Now, i need to do some testing on the model, e.g. Tukey between dates. The syntax is very simple, e.g. for amod.null
library(multcomp)
res <- glht(amod.null, mcp(Date = "Tukey"))
The tricky part is, how can i tell glht to use the model which was put into best.mod (note: this is all happening within a loop). I tried
res <- glht(paste("mods$", as.factor(best.mod),sep = "") , mcp(Date = "Tukey"))
but to no avail, as glht needs to find a model-object in the first argument.
/edit:
Possibly useful:
names(mods)
[1] "corExp" "corGaus" "corLin" "corRatio" "corSpher"
Since the models are stored in the list mods, you can access the "best model" by using the index of which.min(list.AIC):
list.AIC <- sapply(mods, AIC)
best.mod <- mods[which.min(list.AIC)]
best.mod[[1]]

How to extract info from package in R and use in function?

I apologize for the vague question title. What I want to do is run a regression in R using geeglm from the geepack R package, then use information from that to calculate a quasilikelihood information criteria (QIC; Pan 2001). I can do this fairly easily for single models but I would like to write a general function that can do this for a variety of different types of models. I guess my real question is whether there is a better alternative than having a long series of nested ifelse statements?
Here's my current code:
library(geepack)
data(dietox) #data from the geepack package
# Run gee regression
dietox$Cu <- as.factor(dietox$Cu)
mf <- formula(Weight ~ Cu * (Time + I(Time^2) + I(Time^3)))
gee1 <- geeglm(mf, data = dietox, id = Pig, family = gaussian, corstr = "ar1")
Then I can run a function to calculate the quasilikelihood:
QlogLik.normal <- function(model.R) {
library(MASS)
mu.R <- model.R$fitted.values
y <- model.R$y
# Quasi Likelihood for Normal
quasi.R <- sum(((y - mu.R)^2)/-2)
quasi.R
}
However, I would like to write a function that is more general because the quasilikelihood function is different for every distribution. The above function would work for gee1 because it had a gaussian (normal) distribution. If I wanted to generalize it for a variety of distributions I could use a series of nested ifelse statements (below), but I don't know if this is the best way to do this. Does anyone have other options or a better solution? This just doesn't seem very elegant to say the least (clearly I don't have much programming or R experience).
QlogLik <- function(model.R) {
library(MASS)
mu.R <- model.R$fitted.values
y <- model.R$y
ifelse(model.R$modelInfo$variance == "poisson",
# Quasi Likelihood for Poisson
quasi.R <- sum((y*log(mu.R)) - mu.R),
ifelse(model.R$modelInfo$variance == "gaussian",
# Quasi Likelihood for Normal
quasi.R <- sum(((y - mu.R)^2)/-2),
ifelse(model.R$modelInfo$variance == "binomial",
# Quasilikelihood for Binomial
quasi.R <- sum(y*log(mu.R/(1 - mu.R)) + log(1 - mu.R)),
quasi.R <- "Error: distribution not recognized")))
quasi.R
}
In this example, I used the model output from geeglm to extract the type of distribution used to model the variance
model.R$modelInfo$variance
but there may be other ways to determine what distribution was used in the geeglm model. Any help would be appreciated.
You should be able to rewrite your function like this:
QlogLik <- function(model.R) {
library(MASS)
mu.R <- model.R$fitted.values
y <- model.R$y
type <- family(model.R)$family
switch(type,
poisson = sum((y*log(mu.R)) - mu.R),
gaussian = sum(((y - mu.R)^2)/-2),
binomial = sum(y*log(mu.R/(1 - mu.R)) + log(1 - mu.R)),
stop("Error: distribution not recognized"))
}
As #baptise points out, switch useful in these cases. You use family(model.R)$family to automatically detect what family type should be used with switch.
Also, if your commands for what to do in different cases run beyond one line, you can wrap the lines with curly brackets ({ do something here }) instead.
switch(type,
type1 = { something <- do(this)
thisis(something) },
type2 = do(that))
I hope this helps!
You may also use model.R$family$family which gives the type of distribution used to model the variance, but so far I didn't know if you could eliminate those ifelse statements. The quasi.R in your code differs among different distributions, so you have to define each of them separately.
BTW, it is a good question and thanks for posting it: I had similar situations in the past, and hope to get some advice on how to write the codes more efficiently.

Resources