MLE of covariate parameter - r

Hi, I'm currently writing code to simulate data using the inverse-transform method. I'm using a parallel exponential model where I let lambda = b0 + b1*x. My simulation is based on survival analysis. (For a parallel system of two independent exponential lifetimes the CDF is F(t) = (1 - exp(-lambda*t))^2, so solving u = F(t) for t gives t = -log(1 - sqrt(u))/lambda, which is the inversion used for t1 below.)
#generate data
gen <- function(n, lambda, b0, b1){
  set.seed(1)
  u  <- runif(n, 0, 1)
  c1 <- rexp(n, lambda)                   # censoring times
  x  <- rnorm(n, 0, 1)                    # covariate
  t1 <- -log(1 - sqrt(u)) / (b0 + b1*x)   # inverse-transform method
  c  <- 1*(t1 < c1)                       # event indicator (1 = event observed)
  t  <- pmin(t1, c1)                      # observed time
  data1 <- data.frame(x, t, t1, c1, c)
  return(data1)
}
data2 <- gen(20,0.01,2,4)
data2
x = data2$x
t = data2$t
xsum = sum(x)
tsum = sum(t)
The problem is that when I run the second piece of code below, it won't show my MLEs for b0 and b1.
#Likelihood
library(maxLik)
LLF <- function(para){
set.seed(1)
b0 = para[1]
b1 = para[2]
n = 1
z1 = (n*log(2)) + (n*log(b0+b1*xsum)) - ((b0+b1*xsum)*tsum) + (n*log(1-exp((-(b0 + b1*xsum)*tsum))))
return(z1)
}
mle <- maxLik(LLF, start = c(2,4))

The problem is that you assigned n = 1 in the LLF. Since we maximize the likelihood over the entire data set, n should equal the number of observations. If you update this, your MLE will converge. For example,
n<-nrow(data2)
#Likelihood
library(maxLik)
LLF <- function(para){
set.seed(1)
b0 = para[1]
b1 = para[2]
#n = 1
z1 = (n*log(2)) + (n*log(b0+b1*xsum)) - ((b0+b1*xsum)*tsum) + (n*log(1-exp((-(b0 + b1*xsum)*tsum))))
return(z1)
}
mle <- maxLik(LLF, start = c(2,4))
summary(mle)
Maximum Likelihood estimation
Newton-Raphson maximisation, 3 iterations
Return code 1: gradient close to zero
Log-Likelihood: -22.7055
2 free parameters
Estimates:
Estimate Std. error t value Pr(> t)
[1,] 1.986 NA NA NA
[2,] 3.986 NA NA NA

Related

Estimating multiple interrelated parameters with mle2 (bbmle) and custom log likelihood function

To cut a long story short: I would like to estimate the parameter r of a utility function, which depends on demographic variables. I wrote code for the utility function (which should be maximised), transformed it into a log-likelihood function, and used it in mle2. I get a parameter estimate for r, but I do not get estimates for the demographic variables.
Now the long story:
In specific, I want to estimate the parameter r in the following function:
CRRA <- function(x, r){
u <- (x ^ (1-r)) / (1-r)
return(u)
}
r describes risk attitude and I want to maximise utility u. r in turn depends on personal characteristics such that
r <- beta0 + beta1 * covar[1] + beta2 * covar[2] + beta3 * covar[3]
where the covars are known and I want to estimate beta0, beta1, ... , betaN and r.
I have choice data from different lottery games (multiple price lists) with 10 decision situations each. In each decision situation, the player needs to decide whether he/she wants to play lottery A or lottery B (choice data). A1, A2, B1 and B2 are the possible payoffs x, and prob is the probability of receiving the associated payoff. For lottery A, the expected utility in a specific decision situation is
EU <- function(p, x, r){ #expected utility of one of the two lotteries in a pair
eu <- p * CRRA(x[[1]], r) + (1-p) * CRRA(x[[2]], r)
return(eu)
}
EU(p = d[, "prob"], x = d[, c("A1", "A2", "B1", "B2")], r)
The expected utility of the entire decision situation is
DELTA <- function(x, p, r){ #difference of lottery B (right, more risky) and lottery A (left, safe)
delta <- EU(p, x[c(1,2)], r) - EU(p, x[c(3,4)], r)
return(delta)
}
and depends on the choice variable. The choice variable is = 0 when lottery A is chosen and = 1 when lottery B is chosen.
Accordingly, the log likelihood function looks as follows:
Lc2 <- function(r){
  dl <- DELTA(x, prob, r)
  lc <- -sum(choice * pnorm(dl[1], log.p = TRUE) +
             (1 - choice) * pnorm(1 - dl[1], log.p = TRUE))
  return(lc)
}
Using Lc2 in mle2 provides the following results:
> fit2 <- mle2(Lc2,
+ data = df,
+ start = c(r=0.76),
+ parameters = list(r ~ age+ha+microbes))
> tidy(fit2)
# A tibble: 1 x 5
term estimate std.error statistic p.value
<chr> <dbl> <dbl> <dbl> <dbl>
1 r 0.512 0.0445 11.5 1.21e-30
> summary(fit2)
Maximum likelihood estimation
Call:
mle2(minuslogl = Lc2, start = c(r = 0.76), data = df, parameters = list(r ~
covar[1] + covar[2] + covar[3]))
Coefficients:
Estimate Std. Error z value Pr(z)
r 0.512122 0.044503 11.508 < 2.2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
-2 log L: 743.4722
How do I get mle2 to also calculate parameter estimates for the betas? Why does parameters = list(r ~ covar[1] + covar[2] + covar[3]) inside mle2 not give me the beta estimates for each covariate? Where do I need to place the r <- ... function?
Thanks a lot in advance for your help on this.
I tried the following:
U <- function(x, prob, params, covar){
A1 <- x[1]
A2 <- x[2]
B1 <- x[3]
B2 <- x[4]
prob <- prob
r <- params[1]
beta0 <- params[2]
beta1 <- params[3]
beta2 <- params[4]
beta3 <- params[5]
r <- beta0 + beta1 * covar[1] + beta2 * covar[2] + beta3 * covar[3]
u <- DELTA(x = c(A1, A2, B1, B2), prob = prob, r)
return(u)
}
loli <- function(params){
d <- datan
x <- d[,c("A1","A2","B1","B2")]
prob <- d[,"prob"]
covar <- d[,c("cov1", "cov2", "cov3")] %>% as.matrix(.)
dl <- U(x = x, prob = prob, params = params, covar = covar)
choice <- d[,"choice"]
lc <- -sum(choice * pnorm(dl[[1]], log.p = TRUE) +
(1 - choice) * pnorm(1 - dl[[1]], log.p = TRUE))
return(lc)
}
guess <- list(params = c(0.76, 1, 1, 1, 1))
... without success:
> fit3 <- mle2(loli,
+ start = guess,
+ data = choices_DL)
Error in validObject(.Object) :
invalid class “mle2” object: invalid object for slot "fullcoef" in class "mle2": got class "NULL", should be or extend class "numeric"
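A possible workaround (my sketch, not code from the original post): the "fullcoef" error seems to come from passing start as a list containing a single vector. If the minus log-likelihood instead takes one named scalar argument per coefficient, mle2 can build a coefficient table with one row per beta. The column names (A1, A2, B1, B2, prob, choice, cov1-cov3), the data frame datan, and the use of pnorm(-dl) for the lottery-A branch are assumptions taken from the question, not tested code.
library(bbmle)
nll <- function(beta0, beta1, beta2, beta3) {
  d  <- datan
  r  <- beta0 + beta1*d$cov1 + beta2*d$cov2 + beta3*d$cov3  # r as a linear function of the covariates
  dl <- DELTA(x = d[, c("A1", "A2", "B1", "B2")], p = d$prob, r = r)
  -sum(d$choice * pnorm(dl, log.p = TRUE) +
       (1 - d$choice) * pnorm(-dl, log.p = TRUE))           # assumes P(choose A) = pnorm(-dl)
}
fit3 <- mle2(nll, start = list(beta0 = 0.76, beta1 = 0, beta2 = 0, beta3 = 0))
summary(fit3)  # one estimate per beta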

Predict using felm output with standard errors

Is there a way to get predict-style behavior with standard errors from lfe::felm if the fixed effects are swept out using the projection method in felm? This question is very similar to the question here, but none of the answers to that question can be used to estimate standard errors or confidence/prediction intervals. I know that there's currently no predict.felm, but I am wondering if there are workarounds, similar to those linked above, that might also work for estimating the prediction interval.
library(DAAG)
library(lfe)
model1 <- lm(data = cps1, re74 ~ age + nodeg + marr)
predict(model1, newdata = data.frame(age=40, nodeg = 0, marr=1), se.fit = T, interval="prediction")$fit
# Result: fit lwr upr
# 1 18436.18 2339.335 34533.03
model2 <- felm(data = cps1, re74 ~ age | nodeg + marr)
predict(model2, newdata = data.frame(age=40, nodeg = 0, marr=1), se.fit = T, interval="prediction")$fit
# Does not work
The goal is to estimate a prediction interval for yhat, for which I think I'd need to compute the full variance-covariance matrix (including the fixed effects). I haven't been able to figure out how to do this, and I'm wondering if it's even computationally feasible.
After conversations with several people, I don't believe it is possible to obtain an estimate of the distribution of yhat = Xb (where X includes both the covariates and the fixed effects) directly from felm, which is what this question boils down to. It is possible to bootstrap it, however. The following code does so in parallel. There is scope for performance improvements, but this gives the general idea.
Note: here I do not compute the full prediction interval, just the SEs on Xb, but obtaining the prediction interval is straightforward: add sigma^2 to the squared SE and take the square root (see the sketch after the plotting code below).
library(DAAG)
library(lfe)
library(parallel)
model1 <- lm(data = cps1, re74 ~ age + nodeg + marr)
yhat_lm <- predict(model1, newdata = data.frame(age=40, nodeg = 0, marr=1), se.fit = T)
set.seed(42)
boot_yhat <- function(b) {
print(b)
n <- nrow(cps1)
boot <- cps1[sample(1:n, n, replace=T),]
lm.model <- lm(data=demeanlist(boot[, c("re74", "age")], list(factor(boot$nodeg), factor(boot$marr))),
formula = re74 ~ age)
fe <- getfe(felm(data = boot, re74 ~ age | nodeg + marr))
bootResult <- predict(lm.model, newdata = data.frame(age = 40)) +
fe$effect[fe$fe == "nodeg" & fe$idx==0] +
fe$effect[fe$fe == "marr" & fe$idx==1]
return(bootResult)
}
B = 1000
yhats_boot <- mclapply(1:B, boot_yhat)
plot(density(rnorm(10000, mean=yhat_lm$fit, sd=yhat_lm$se.fit)))
lines(density(unlist(yhats_boot)), col="red")  # bootstrap distribution of Xb
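To go from the bootstrap draws of Xb to an approximate prediction interval, one option (my own sketch, reusing yhats_boot and model2 from above) is to add the residual variance of the felm fit to the bootstrap variance of Xb:
yhat_b <- unlist(yhats_boot)                               # bootstrap draws of Xb
se_xb  <- sd(yhat_b)                                       # bootstrap SE of Xb
sigma2 <- sum(model2$residuals^2) / model2$df.residual     # residual variance sigma^2
mean(yhat_b) + c(-1, 1) * qnorm(0.975) * sqrt(se_xb^2 + sigma2)  # approx. 95% prediction interval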
From your first model predict(.) yields this:
# fit lwr upr
# 1 18436.18 2339.335 34533.03
Following 李哲源 we can achieve these results manually, too.
beta.hat.1 <- coef(model1) # save coefficients
# model matrix: age=40, nodeg = 0, marr=1:
X.1 <- cbind(1, matrix(c(40, 0, 1), ncol=3))
pred.1 <- as.numeric(X.1 %*% beta.hat.1) # prediction
V.1 <- vcov(model1) # save var-cov matrix
se2.1 <- unname(rowSums((X.1 %*% V.1) * X.1)) # prediction var
alpha.1 <- qt((1-0.95)/2, df = model1$df.residual) # 5 % level
pred.1 + c(alpha.1, -alpha.1) * sqrt(se2.1) # 95%-CI
# [1] 18258.18 18614.18
sigma2.1 <- sum(model1$residuals ^ 2) / model1$df.residual # sigma.sq
PI.1 <- pred.1 + c(alpha.1, -alpha.1) * sqrt(se2.1 + sigma2.1) # prediction interval
matrix(c(pred.1, PI.1), nrow = 1, dimnames = list(1, c("fit", "lwr", "upr")))
# fit lwr upr
# 1 18436.18 2339.335 34533.03
Now, applying your linked example to multiple FE, we get these results:
lm.model <- lm(data=demeanlist(cps1[, c(8, 2)],
list(as.factor(cps1$nodeg),
as.factor(cps1$marr))), re74 ~ age)
fe <- getfe(model2)
predict(lm.model, newdata = data.frame(age = 40)) + fe$effect[fe$idx=="1"]
# [1] 15091.75 10115.21
The first value is with, and the second without, the added FE (try fe$effect[fe$idx=="1"]).
Now we're following the manual approach above.
beta.hat <- coef(model2) # coefficient
x <- 40 # age = 40
pred <- as.numeric(x %*% beta.hat) # prediction
V <- model2$vcv # var/cov
se2 <- unname(rowSums((x %*% V) * x)) # prediction var
alpha <- qt((1-0.95)/2, df = model2$df.residual) # 5% level
pred + c(alpha, -alpha) * sqrt(se2) # CI
# [1] 9599.733 10630.697
sigma2 <- sum(model2$residuals ^ 2) / model2$df.residual # sigma^2
PI <- pred + c(alpha, -alpha) * sqrt(se2 + sigma2) # PI
matrix(c(pred, PI), nrow = 1, dimnames = list(1, c("fit", "lwr", "upr"))) # output
# fit lwr upr
# 1 10115.21 -5988.898 26219.33
As we see, the fit is the same as with the linked example approach, but now with a prediction interval. (Disclaimer: the logic of the approach should be straightforward; the values of the PI should still be validated, e.g. in Stata with reghdfe.)
Edit: in case you want to achieve exactly the same output from felm() that predict.lm() yields for the linear model1, you simply need to "include" the fixed effects in your model again (see model3 below) and then follow the same approach. For convenience you can easily wrap it in a function.
library(DAAG)
library(lfe)
model3 <- felm(data = cps1, re74 ~ age + nodeg + marr)
pv <- c(40, 0, 1) # prediction x-values
predict0.felm <- function(mod, pv.=pv) {
  beta.hat <- coef(mod)                              # coefficient
  x <- cbind(1, matrix(pv., ncol=3))                 # prediction vector
  pred <- as.numeric(x %*% beta.hat)                 # prediction
  V <- mod[['vcv']]                                  # var/cov
  se2 <- unname(rowSums((x %*% V) * x))              # prediction var
  alpha <- qt((1-0.95)/2, df = mod[['df.residual']]) # 5% level
  CI <- structure(pred + c(alpha, -alpha) * sqrt(se2),
                  names=c("CI lwr", "CI upr"))       # CI
  sigma2 <- sum(mod[['residuals']] ^ 2) / mod[['df.residual']]  # sigma^2
  PI <- pred + c(alpha, -alpha) * sqrt(se2 + sigma2) # PI
  mx <- matrix(c(pred, PI), nrow = 1,
               dimnames = list(1, c("PI fit", "PI lwr", "PI upr")))  # output
  list(CI, mx)
}
predict0.felm(model3)[[2]]
# PI fit PI lwr PI upr
# 1 18436.18 2339.335 34533.03
This way you can achieve the same prediction interval with felm() as with predict.lm().
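For completeness, the first list element of predict0.felm() holds the confidence interval; a quick usage check (reusing model3 from above) would be:
predict0.felm(model3)[[1]]
# should reproduce the manual 95% CI for model1 computed further above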

Maximizing a likelihood function that contains pbivnorm

I have a likelihood function that contains a bivariate normal CDF. I keep getting values close to one for the correlation, even when the true value is zero.
The R package sampleSelection maximizes a likelihood function that contains a bivariate normal CDF (as in Van de Ven and Van Praag (1981)). I tried looking at the source code for the package, but couldn't find how they write the likelihood. For reference, Van de Ven and Van Praag's paper:
The Demand for Deductibles in Private Health Insurance: A Probit Model with Sample Selection.
The likelihood function is Equation (19), where H denotes the standard normal CDF and H_2 denotes the bivariate normal CDF.
My question:
Can someone tell me how the likelihood function is written in the sampleSelection package? or
Can someone tell me why I'm getting values of close to one for the correlation in the code below?
Here's the code that's keeping me up at night:
########################################################
#
# Trying to code Van de Ven and Van Praag (1981)
#
#
########################################################
library(MASS)
library(pbivnorm)
library(mnormt)
library(maxLik)
library(sampleSelection)
set.seed(1)
# Sample size
full_sample <- 1000
# Parameters
rho <- .1
beta0 <- 0
beta1 <- 1
gamma0 <- .2
gamma1 <- .5
gamma2 <- .5
varcovar <- matrix(c(1,rho,rho,1), nrow = 2, ncol = 2)
# Vectors for storing values
y <- rep(0,full_sample)
s <- rep(0,full_sample)
outcome <- rep(0,full_sample)
select <- rep(0,full_sample)
#######################
# Simulate data
#######################
x <- rnorm(full_sample)
z <- rnorm(full_sample)
errors <- mvrnorm(full_sample, rep(0,2), varcovar)
# note: 1st element for selection eq; 2nd outcome
s <- gamma0 + gamma1*x + gamma2*z + errors[,1]
y <- beta0 + beta1*x + errors[,2]
for(i in 1:full_sample){
  if(s[i]>=0){
    select[i] <- 1
    if(y[i]>=0){
      outcome[i] <- 1
    }else{
      outcome[i] <- 0
    }
  }else{
    outcome[i] <- NA
    select[i] <- 0
  }
}
#######################################
# Writing the log likelihood
##
# Note: vega1= beta0,
# vega2= beta1,
# vega3= gamma0,
# vega4= gamma1,
# vega5= gamma2,
# vega6= rho
#######################################
first.lf <- function(vega) {
  # Transforming this parameter because
  # correlation is bounded between -1 and 1
  corr <- tanh(vega[6])
  # Set up vectors for writing log likelihood
  y0 <- 1-outcome
  for(i in 1:full_sample) {
    if(is.na(y0[i])){ y0[i] <- 0 }
    if(is.na(outcome[i])){ outcome[i] <- 0 }
  }
  yt0 <- t(y0)
  yt1 <- t(outcome)
  missing <- 1 - select
  ytmiss <- t(missing)
  # Terms in the selection and outcome equations
  A <- vega[3]+vega[4]*x+vega[5]*z
  B <- vega[1]+vega[2]*x
  term1 <- pbivnorm(A,B,corr)
  term0 <- pbivnorm(A,-B,corr)
  term_miss <- pnorm(-A)
  log_term1 <- log(term1)
  log_term0 <- log(term0)
  log_term_miss <- log(term_miss)
  # The log likelihood
  logl <- sum( yt1%*%log_term1 + yt0%*%log_term0 + ytmiss%*%log_term_miss )
  return(logl)
}
startv <- c(beta0,beta1,gamma0,gamma1,gamma2,rho)
# Maximizing my likelihood gives
maxLik(logLik = first.lf, start = startv, method="BFGS")
# tanh(7.28604) = 0.9999991, far from the true value of .1
# Using sampleSelection package for comparison
outcomeF<-factor(outcome)
selectEq <- select ~ x + z
outcomeEq <- outcomeF ~ x
selection( selectEq, outcomeEq)
# Notice the value of -0.2162 for rho compared to my 0.9999991
It turns out that there is a typo in equation (19) of the paper: the terms from i = N_1 + 1 to N should have -rho rather than rho. Hence, using
term0 <- pbivnorm(A,-B,-corr)
gives
maxLik(logLik = first.lf, start = startv, method="BFGS")
# Maximum Likelihood estimation
# BFGS maximization, 40 iterations
# Return code 0: successful convergence
# Log-Likelihood: -832.5119 (6 free parameter(s))
# Estimate(s): 0.3723783 0.9307454 0.1349979 0.4693686 0.4572421 -0.219618
as needed.
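As a quick sanity check (my addition, not part of the original answer): the sixth estimate is on the unconstrained atanh scale, so transform it back before comparing it with the rho reported by sampleSelection:
tanh(-0.219618)
# roughly -0.216, in line with the -0.2162 from selection() above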

R - Fitting a constrained AutoRegression time series

I have a time series which I need to fit with an AR (autoregressive) model.
The AR model has the form:
x(t) = a0 + a1*x(t-1) + a2*x(t-2) + ... + aq*x(t-q) + noise.
I have two constraints:
Find the best AR fit when lag.max = 50.
Sum of all coefficients a0 + a1 + ... + aq = 1
I wrote the below code:
require(FitAR)
data(lynx) # my real data comes from the stock market.
z <- -log(lynx)
#find best model
step <- SelectModel(z, ARModel = "AR" ,lag.max = 50, Criterion = "AIC",Best=10)
summary(step) # display results
# fit the model and get coefficients
arfit <- ar(z, p=1, order.max=ceiling(mean(step[,1])), aic=FALSE)
#check if sum of coefficients are 1
sum(arfit$ar)
[1] 0.5784978
My question is: how do I add the constraint that the sum of all coefficients equals 1?
I looked at this question, but I do not see how to apply it here.
UPDATE
I think I managed to solve it as follows.
library(quadprog)
coeff <- arfit$ar
# Build y (the coefficient-weighted combination of lagged series from the
# unconstrained fit) and the matrix X of lagged series, one column per lag
y <- 0
for (i in 1:length(coeff)) {
  y <- y + coeff[i]*c(z[(i+1):length(z)], rep(0,i))
  ifelse(i==1, X <- c(z[2:length(z)], 0), X <- cbind(X, c(z[(i+1):length(z)], rep(0,i))))
}
# Quadratic program: minimise ||y - X b||^2 subject to sum(b) = 1
Dmat <- t(X) %*% X
s <- solve.QP(Dmat, t(y) %*% X, matrix(1, nrow=length(coeff), ncol=1), 1, meq=1)
s$solution
# The coefficients should sum up to 1
sum(s$solution)

Random draws from an ANOVA-like design with given population effect sizes

Let's say that you have a normally distributed variable y with a 3-group categorical predictor x that has the orthogonal contrasts c1 and c2. I am trying to create a program in R that, given x, c1, and c2, creates y such that c1 and c2 have effect sizes r1 and r2 specified by the user.
For example, let's say that x, c1, c2, r1, and r2 were created like the following:
x <- factor(rep(c(1, 2, 3), 100))
contrasts(x) <- matrix(c(0, -.5, .5, -2/3, 1/3, 1/3),
nrow = 3, ncol = 2, dimnames = list(c("1", "2", "3"), c("c1", "c2")))
contrasts(x)
c1 c2
1 0.0 -0.6666667
2 -0.5 0.3333333
3 0.5 0.3333333
r1 <- .09
r2 <- 0
I would like the program to create y such that the variance in y accounted for by c1 equals r1 (.09) and the variance in y accounted for by c2 equals r2 (0).
Does anybody know how I might go about this? I know that I should be using the rnorm function, but I'm stuck on which population means / sds rnorm should use when it does its sampling.
Courtesy of some generous advice from my colleagues, I now have one function that creates simulated data given a specified number of groups, a set of contrasts, a set of regression coefficients, a specified N per cell, and a specified within-group variance:
sim.factor <- function(levels, contr, beta, perCell, errorVar){
# Build design matrix X
X <- cbind(rep(1,levels*perCell), kronecker(contr, rep(1,perCell)))
# Generate y
y <- X %*% beta + rnorm(levels*perCell, sd=sqrt(errorVar))
# Build and return data frame
dat <- cbind.data.frame(y, X[,-1])
names(dat)[-1] <- colnames(contr)
return(dat)
}
I also wrote a function that, given a set of regression coefficients, N per cell, number of groups, set of orthogonal contrasts, desired delta-R^2 for a contrast of interest, returns the required within-group variance:
ws.var <- function(levels, contr, beta, perCell, dc){
# Build design matrix X
X <- cbind(rep(1,levels), contr)
# Generate the expected means
means <- X %*% beta
# Find the sum of squares due to each contrast
var <- (t(means) %*% contr)^2 / apply(contr^2 / perCell, 2, sum)
# Calculate the within-conditions sum of squares
wvar <- var[1] / dc - sum(var)
# Convert the sum of squares to variance
errorVar <- wvar / (3 * (perCell - 1))
return(errorVar)
}
After doing some testing as follows, the functions seem to generate the desired delta R^2 for contrast c1.
contr <- contr.helmert(3)
colnames(contr) <- c("c1","c2")
beta <- c(0, 1, 0)
perCell <- 50
levels = 3
dc <- .08
N <- 1000
# Calculate the error variance
errorVar <- ws.var(levels, contr, beta, perCell, dc)
# To store delta R^2 values
d1 <- vector("numeric", length = N)
# Use the functions
for(i in 1:N)
{
d <- sim.factor(levels=3,
contr=contr,
beta=beta,
perCell=perCell,
errorVar=errorVar)
d1[i] <- lm.sumSquares(lm(y ~ c1 + c2, data = d))[1, 2] # From the lmSupport package
}
m <- round(mean(d1), digits = 3)
bmp("Testing simulation functions.bmp")
hist(d1, xlab = "Percentage of variance due to c1", main = "")
text(.18, 180, labels = paste("Mean =", m))
dev.off()
Patrick
