Extracting p-values and se(coef) from coxph in R

I am trying to streamline my code to avoid for loops, but once I have fitted my Cox proportional hazards models I am having a hard time extracting the p-values and standard errors of the coefficients. My code is as follows:
library(survival)
#Generate Data
x = matrix(rbinom(10000,1,.5),ncol=100)
y = rexp(ncol(x),.02)
censor = rbinom(ncol(x),1,.5)
event = ifelse(censor==1,0,1)
#Fit the coxph model to the data
ans = apply(x,1,function(x,y,event)coxph(Surv(y,event)~x),y=y,event=event)
#Extract the coefficients from ans
coef = unname(sapply(ans,function(x)x$coef))
So as you can see, I am able to extract the coefficients from the object ans, but I cannot extract the p-values and standard errors. Is there an easy way to do this from my ans object? Or a simple way to modify this code to do it?

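You can add these two lines of code to get the p-values and the standard errors. Selecting the columns of the coefficient table by name is safer than relying on their position, and avoids masking the built-in sd() function.
pValues <- sapply(ans, function(fit) summary(fit)$coefficients[, "Pr(>|z|)"])
stdErrors <- sapply(ans, function(fit) summary(fit)$coefficients[, "se(coef)"])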
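If all three quantities are wanted at once, they can be collected into a single matrix. The following is a minimal sketch under the same setup as the question (one binary covariate per model); the names fits and stats, and the smaller simulated data set, are chosen here for illustration only.

```r
library(survival)

set.seed(1)
x <- matrix(rbinom(2000, 1, 0.5), ncol = 100)  # 20 covariates (rows), 100 subjects
y <- rexp(ncol(x), 0.02)                        # survival times
event <- rbinom(ncol(x), 1, 0.5)                # 1 = event observed, 0 = censored

# Fit one single-covariate Cox model per row of x
fits <- apply(x, 1, function(row) coxph(Surv(y, event) ~ row))

# One row per model; columns are picked by name from the summary table
stats <- t(sapply(fits, function(fit) {
  summary(fit)$coefficients[1, c("coef", "se(coef)", "Pr(>|z|)")]
}))
head(stats)
```

Each row of stats then corresponds to one fitted model, which makes it straightforward to filter or sort the models by p-value.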

Related

Storing model coefficients in a loop in R

This seems like a very basic problem, but I have not been able to find a solution. I essentially wish to run a linear regression in a for loop and store the model coefficients (and standard errors if possible) for each iteration in a CSV file.
For reference, I am running Fama-MacBeth regressions on macroeconomic "shocks," (the residuals of macroeconomic factors regressed on their lagged values).
My code for the loop is as follows
for (i in 7:69){
model <- lm(data = data, data[[i]]~TM2R+IPR+InfR+UnR+OilR)
#Model coefficients
print(model$coefficients)
#Standard Errors in regression results
model$vcov <- vcovHC(model, type = "HC1")
print(model$vcov)
}
You can use the broom package to transform the output of lm() into a data frame, then append the results from vcovHC() to it.
Finally, export the result as a CSV file.
library(broom)
library(sandwich)
for (i in 7:69){
model_name <- paste0("model_", i, ".csv")
model_i <- lm(data = data, data[[i]]~TM2R+IPR+InfR+UnR+OilR)
tidy_model <- tidy(model_i)
tidy_model$vcov <- vcovHC(model_i, type = "HC1")
write.csv(tidy_model, file = model_name)
}
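Note that vcovHC() returns the full covariance matrix rather than standard errors. If only the robust standard errors are needed, they are the square roots of its diagonal. The sketch below illustrates this on the built-in mtcars data, since the original data set is not available; the response and predictor names, the results object, and the output file name are placeholders.

```r
library(sandwich)  # provides vcovHC()

responses <- c("mpg", "qsec")  # placeholder response columns

results <- lapply(responses, function(resp) {
  # Build the formula resp ~ wt + hp and fit the model
  fit <- lm(reformulate(c("wt", "hp"), response = resp), data = mtcars)
  data.frame(
    response  = resp,
    term      = names(coef(fit)),
    estimate  = unname(coef(fit)),
    # HC1 robust standard errors: square root of the covariance diagonal
    robust_se = unname(sqrt(diag(vcovHC(fit, type = "HC1")))),
    row.names = NULL
  )
})

# Stack all models into one table and write a single CSV file
results <- do.call(rbind, results)
write.csv(results, file.path(tempdir(), "coefficients.csv"), row.names = FALSE)
```

Collecting everything in one long table with a response column avoids producing one file per model, although separate files work just as well.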

Asymptotic regression function not correlating with raw data

I'm trying to model raw data by an asymptotic function with the equation $$f(x) = a + (b-a)(1-\exp(-c x))$$ using R. To do so I used the following code:
rawData <- import("path/StackTestData.tsv")
# executing regression
X <- rawData$x
Y <- rawData$y
model <- drm(Y ~ X, fct = DRC.asymReg())
# creating the regression function
f_0_ <- model$coefficients[1] #value for y if x=0
steepness <- model$coefficients[2]
plateau <- model$coefficients[3]
eq <- function(x){f_0_+(plateau-f_0_)*(1-exp(-steepness*x))}
# plotting the regression function together with the raw data
ggplot(rawData,aes(x=x,y=y)) +
geom_line(col="red") +
stat_function(fun=eq,col="blue") +
ylim(10,12.5)
In some cases, I got a proper regression function. However, with the attached data I don't get one. The regression function is not showing any correlation with the raw data whatsoever, as shown in the figure below. Can you perhaps offer a better solution for performing the asymptotic regression or do you know where the error lies?
Best, Max
R 4.1.2 was used with RStudio 1.4.1106. The ggpubr package was loaded for ggplot, and the aomisc and drc packages were loaded for DRC.asymReg().
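For comparison, the same curve can also be fitted with base R alone. This sketch is not the drc/aomisc approach used in the question: it uses nls() with the self-starting SSasymp() model on simulated data (the "true" parameter values and the noise level are made up). It relies on the identity a + (b-a)(1-exp(-cx)) = b + (a-b)exp(-cx), so SSasymp's R0, Asym and exp(lrc) correspond to a, b and c.

```r
set.seed(42)

# Hypothetical true parameters: start value a, plateau b, rate c
a <- 10; b <- 12; c_rate <- 0.3

sim <- data.frame(x = seq(0, 20, by = 0.5))
sim$y <- a + (b - a) * (1 - exp(-c_rate * sim$x)) + rnorm(nrow(sim), sd = 0.05)

# SSasymp models Asym + (R0 - Asym) * exp(-exp(lrc) * x) and picks its own start values
fit <- nls(y ~ SSasymp(x, Asym, R0, lrc), data = sim)
est <- coef(fit)

# Map the estimates back to the parameterisation of the equation above
params <- c(a = unname(est["R0"]),
            b = unname(est["Asym"]),
            c = unname(exp(est["lrc"])))
params
```

Comparing such a fit against the drm() result on the real data may help to tell whether the problem lies in the fit itself or in how the coefficients are mapped to the plotted function.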

How to obtain standard errors of local regression coefficients in spgwr::ggwr()?

I am using spgwr::ggwr() to fit a generalized geographically weighted regression with a Poisson model and log link function. The results provide local coefficient estimates, but I cannot see how to get their standard errors (or t statistics) to compute pseudo p-values.
Below is a toy example using SpatialEpi::NYleukemia dataset:
library(SpatialEpi)
library(spgwr)
## Load data
data(NYleukemia)
population <- NYleukemia$data$population
cases <- ceiling(NYleukemia$data$cases * 100)
centroids <- latlong2grid(NYleukemia$geo[, 2:3])
# data frame
nyleuk <- data.frame(centroids, cases, population)
# set coordinates as vector
coordny <- cbind(centroids[,1],centroids[,2])
# set a kernel bandwidth
bw <- 0.5
# fit ggwr()
m_pois <- ggwr(cases ~ offset(log(population)),
data = nyleuk, gweight = gwr.Gauss,
adapt = bw, family = poisson(link="log"),
type="working", coords = coordny)
# returns spatial point with coefficients
# but no standard errors :(
head(m_pois$SDF@data)
Is there any way I can get standard errors of the coefficients?
Thanks!
You can obtain standard errors for the local coefficients by running GWmodel::ggwr.basic() instead. spgwr::ggwr() returns the coefficients but not their standard errors.

lm (linear regression) function generated un-removable outliers

I have performed a linear regression (lm) on two types of adjusted p-values: q-values and Benjamini-Hochberg adjusted p-values. The results show two astronomical outliers; however, after removing those, new outliers always appear. Could someone please run the code and see whether the issue persists? What could be the source of the issue?
Here is the full code for easy copy/paste:
library(qvalue)
p = 50
m = 10
pval = c(rbeta(m,1,100), runif(p-m,0,1))
BHpval <- p.adjust(pval,method="BH")
qval_ <- qvalue(pval)
print(qval_$pi0)
fit2 <- lm(qval_$qvalues ~ BHpval)
plot(fit2)

R: obtain coefficients&CI from bootstrapping mixed-effect model results

The working data looks like:
set.seed(1234)
df <- data.frame(y = rnorm(1:30),
fac1 = as.factor(sample(c("A","B","C","D","E"),30, replace = T)),
fac2 = as.factor(sample(c("NY","NC","CA"),30,replace = T)),
x = rnorm(1:30))
The mixed model is fitted with lmer as:
library(lme4)
mixed <- lmer(y ~ x + (1|fac1) + (1|fac2), data = df)
I used bootMer to run the parametric bootstrapping and I can successfully obtain the coefficients (intercept) and SEs for fixed&random effects:
mixed_boot_sum <- function(data){s <- sigma(data)
c(beta = getME(data, "fixef"), theta = getME(data, "theta"), sigma = s)}
mixed_boot <- bootMer(mixed, FUN = mixed_boot_sum, nsim = 100, type = "parametric", use.u = FALSE)
My first question is how to obtain the coefficients(slope) of each individual levels of the two random effects from the bootstrapping results mixed_boot ?
I have no problem extracting the coefficients(slope) from mixed model by using augment function from broom package, see below:
library(broom)
mixed.coef <- augment(mixed, df)
However, it seems that broom can't deal with boot class objects, so I can't use the functions above directly on mixed_boot.
I also tried to modify mixed_boot_sum by adding mmList (I thought this would be what I am looking for), but R complains:
Error in bootMer(mixed, FUN = mixed_boot_sum, nsim = 100, type = "parametric", :
bootMer currently only handles functions that return numeric vectors
Furthermore, is it possible to obtain CIs for both the fixed and random effects by specifying FUN as well?
I am now quite confused about how to specify FUN correctly to achieve this. Any help with my question would be greatly appreciated!
My first question is how to obtain the coefficients(slope) of each individual levels of the two random effects from the bootstrapping results mixed_boot ?
I'm not sure what you mean by "coefficients(slope) of each individual level". broom::augment(mixed, df) gives the predictions (residuals, etc.) for every observation. If you want the predicted coefficients at each level I would try
mixed_boot_coefs <- function(fit){
unlist(coef(fit))
}
which for the original model gives
mixed_boot_coefs(mixed)
## fac1.(Intercept)1 fac1.(Intercept)2 fac1.(Intercept)3 fac1.(Intercept)4
## -0.4973925 -0.1210432 -0.3260958 0.2645979
## fac1.(Intercept)5 fac1.x1 fac1.x2 fac1.x3
## -0.6288728 0.2187408 0.2187408 0.2187408
## fac1.x4 fac1.x5 fac2.(Intercept)1 fac2.(Intercept)2
## 0.2187408 0.2187408 -0.2617613 -0.2617613
## ...
If you want the resulting object to be more clearly named you can use:
flatten <- function(cc) setNames(unlist(cc),
outer(rownames(cc),colnames(cc),
function(x,y) paste0(y,x)))
mixed_boot_coefs <- function(fit){
unlist(lapply(coef(fit),flatten))
}
When run through bootMer/confint/boot::boot.ci these functions will give confidence intervals for each of these values (note that all of the slopes facW.xZ are identical across groups because the model assumes random variation in the intercept only). In other words, whatever information you know how to extract from a fitted model (conditional modes/BLUPs [ranef], predicted intercepts and slopes for each level of the grouping variable [coef], parameter estimates [fixef, getME], random-effects variances [VarCorr], predictions under specific conditions [predict] ...) can be used in bootMer's FUN argument, as long as you can flatten its structure into a simple numeric vector.
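Putting these pieces together, here is a minimal sketch of the whole workflow, using the data and model from the question. The percentile intervals are computed by hand with quantile() as one simple option (not necessarily what confint or boot::boot.ci would report), and the names boot_coefs and ci are illustrative. Singular-fit messages may appear because the data are pure noise.

```r
library(lme4)

set.seed(1234)
df <- data.frame(y = rnorm(30),
                 fac1 = as.factor(sample(c("A", "B", "C", "D", "E"), 30, replace = TRUE)),
                 fac2 = as.factor(sample(c("NY", "NC", "CA"), 30, replace = TRUE)),
                 x = rnorm(30))
mixed <- lmer(y ~ x + (1 | fac1) + (1 | fac2), data = df)

# FUN must return a plain numeric vector, so flatten the per-level coefficients
mixed_boot_coefs <- function(fit) unlist(coef(fit))

boot_coefs <- bootMer(mixed, FUN = mixed_boot_coefs, nsim = 50,
                      type = "parametric", use.u = FALSE)

# 95% percentile intervals: one row per coefficient, columns = lower/upper bound
ci <- t(apply(boot_coefs$t, 2, quantile, probs = c(0.025, 0.975), na.rm = TRUE))
rownames(ci) <- names(boot_coefs$t0)
head(ci)
```

A larger nsim would normally be used in practice; 50 just keeps the example quick to run.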
