I am working through "Forecasting with Exponential Smoothing". I am stuck on exercise 16.4 on the part that states:
The data set partx contains a history of monthly sales of an automobile part. Apply a local Poisson model. Parameters should be estimated by either maximizing the likelihood or minimizing the sum of squared errors.
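The local Poisson model is defined as:
$p(y_t \mid \lambda_{t-1}) = \frac{\lambda_{t-1}^{y_t} e^{-\lambda_{t-1}}}{y_t!}$
where $y_t = 0, 1, 2, \ldots$ and $\lambda_t = (1-\alpha)\lambda_{t-1} + \alpha y_t$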
I have the following code, but it seems to be stuck. The optimization always returns something close to the starting values.
Am I fitting the local Poisson model correctly?
library(expsmooth)
data("partx")
S <- function(x) {
a <- x[1]
if(a < 0 | a > 1)
return(Inf)
n <- length(partx)
lambda <- numeric(n+1)
error <- numeric(n)
lambda[1] <- x[2]
for(i in 1:n) {
error[i] <- partx[i]-rpois(1,lambda[i])
lambda[i+1] <- (1-a)*lambda[i] + a*partx[i]
}
return(sum(error^2))
}
# returns a = 0.5153971 and lambda = 5.9282414
op1 <- optim(c(0.5,5L),S, control = list(trace = 1))
# returns a = 0.5999655 and lambda = 2.1000131
op2 <- optim(c(0.5,2L),S, control = list(trace = 1))
I know the book says you can use either the sum of squared errors or MLE, but the first option seems weird here because you have to sample from a Poisson distribution, so even if you fix the parameters you get a different sum of squared errors every time. Since you do not say that you have tried the MLE approach, I programmed it. The math is as follows:
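As a sketch reconstructed from the code in this answer (not copied from the book), the Poisson log-likelihood with one-step-ahead mean $\lambda_{t-1}$ can be written as below. The function passed to optim returns the negative of this quantity, so minimizing it maximizes the likelihood.

```latex
\ell(\alpha, \lambda_0)
  = \sum_{t=1}^{n} \left[ y_t \log \lambda_{t-1} - \lambda_{t-1} - \log(y_t!) \right],
\qquad
\lambda_t = (1-\alpha)\,\lambda_{t-1} + \alpha\, y_t
```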
And the code that implements it is
MLELocalPoisson = function(par,y){
alpha = par[1]
lambda_ini = par[2]
n = length(y)
vec_lambda = rep(NA, n)
for(i in 1:n){
if(i==1){
vec_lambda[i] = (1-alpha)*lambda_ini+alpha*y[i]
}else{
vec_lambda[i] = (1-alpha)*vec_lambda[i-1]+alpha*y[i]
}
}
vec_lambda = c(lambda_ini,vec_lambda[-n])
sum_factorial = sum(sapply(y,function(x)log(factorial(x))))
sum_lambda = sum(vec_lambda)
sum_prod = sum(log(vec_lambda)*y)
loglike = -sum_prod+sum_lambda+sum_factorial
return(loglike)
}
optim(par = c(0.1,1),fn = MLELocalPoisson,y = partx, method = "L-BFGS-B",
lower=c(1e-10,1e-10),upper = c(1,Inf),control = list(maxit = 10000))
The lower bounds are set to 1e-10 so that the optimization does not try c(0,0), which would produce a log-likelihood of NaN.
EDIT
Looking at the Poisson regression literature, they usually define $\lambda = \exp(x\beta)$ and calculate the residuals as $y - \exp(x\beta)$. So it might be possible to do the same in this problem, using the formula given by the author for $\lambda$, like this:
LocalPoisson = function(par,y,optim){
alpha = par[1]
lambda_ini = par[2]
n = length(y)
vec_lambda = rep(NA, n)
y_hat = rep(NA, n)
for(i in 1:n){
if(i==1){
vec_lambda[i] = (1-alpha)*lambda_ini+alpha*y[i]
}else{
vec_lambda[i] = (1-alpha)*vec_lambda[i-1]+alpha*y[i]
}
}
# one-step-ahead forecasts: lambda_{t-1} is the forecast for y_t
y_hat = c(lambda_ini,vec_lambda[-n])
if(optim){
return(sum((y_hat-y)^2))
} else {
return(data.frame(y_hat = y_hat,y=y, lambda = vec_lambda))
}
}
optim(par = c(0.1,1),fn = LocalPoisson,y = partx, optim =T,method = "L-BFGS-B",
lower=c(1e-10,1e-10),upper = c(1,Inf),control = list(maxit = 10000))
It does not yield the same results as the MLE (I feel more comfortable with the MLE, but this might be another possible way to estimate the parameters).
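To illustrate why the rpois-based objective from the question stalls, here is a minimal sketch on simulated counts (a stand-in for partx; the helper names are hypothetical). It compares a sum of squared errors computed against the one-step-ahead mean with one computed against a fresh Poisson draw.

```r
set.seed(1)
y <- rpois(60, lambda = 3)  # simulated counts, used instead of partx

# One-step-ahead means: lambda[t] is the forecast mean for y[t]
one_step_means <- function(alpha, lambda0, y) {
  n <- length(y)
  lambda <- numeric(n)
  prev <- lambda0
  for (t in seq_len(n)) {
    lambda[t] <- prev
    prev <- (1 - alpha) * prev + alpha * y[t]
  }
  lambda
}

# Deterministic SSE: compares y with its conditional mean
sse_mean <- function(par, y) {
  sum((y - one_step_means(par[1], par[2], y))^2)
}

# Noisy SSE: compares y with a fresh Poisson draw, as in the question
sse_draw <- function(par, y) {
  lambda <- one_step_means(par[1], par[2], y)
  sum((y - rpois(length(y), lambda))^2)
}

par0 <- c(0.5, 3)
same_mean <- identical(sse_mean(par0, y), sse_mean(par0, y))
draws <- replicate(20, sse_draw(par0, y))
range(draws)  # varies although the parameters are fixed
```

Because the noisy objective changes between calls with identical parameters, optim cannot tell a real improvement from sampling noise, which would be consistent with it stopping near the starting values.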
I am trying to establish a method of estimating infectious disease parameters by comparing real epidemic curves with simulations of a stochastic SIR model. To construct the stochastic SIR model, I am using the deSolve package and instead of using fixed parameter values I would like to draw the parameter value used in the equations at each time point from a Poisson distribution centered on the original parameter values.
Using the parameter beta as an example, beta represents the average number of transmission events per capita and is the product of the average number of contacts and the probability that transmission occurs upon contact. Realistically, there is variation in the number of contacts a person will have and since transmission is also a probabilistic event there is variation surrounding this too.
So even if the average transmission rate were to be 2.4 (for example), an individual can go on to infect 0, 1, 2 or 3 ... etc. people with varying probabilities.
I have tried to incorporate this into my code below using the rpois function and reassigning the parameters used in the equations to the outputs of the rpois.
I have run my code with the same initial values and parameters multiple times and all the curves are different indicating that SOMETHING "stochastic" is going on, but I am unsure whether the code is sampling using the rpois at each time point or just once at the beginning. I have only started coding very recently so do not have much experience.
I would be grateful if anyone more experienced than myself could verify what my code is ACTUALLY doing and whether it is sampling using rpois at each time point or not. If not I would be grateful for any suggestions for achieving this. Perhaps a loop is needed?
library('deSolve')
library('reshape2')
library('ggplot2')
#MODEL INPUTS
initial_state_values <- c(S = 10000,
I = 1,
R = 0)
#PARAMETERS
parameters <- c(beta = 2.4,
gamma = 0.1)
#POISSON MODELLING OF PARAMETERS
#BETA
beta_p <- rpois(1, parameters[1])
#GAMMA
infectious_period_p <- rpois(1, 1/(parameters[2]))
gamma_p <- 1/infectious_period_p
#TIMESTEPS
times <- seq(from = 0, to = 50,by = 1)
#SIR MODEL FUNCTION
sir_model <- function(time, state, parameters) {
with(as.list(c(state, parameters)), {
N <- S + I + R
lambda <- beta_p * I/N
dS <- -lambda * S
dI <- lambda*S - gamma_p*I
dR <- gamma_p*I
return(list(c(dS, dI, dR)))
})
}
output<- as.data.frame(ode(y= initial_state_values,
times = times,
func = sir_model,
parms = parameters))
The code given in the question runs the model with parameters that stay constant over time, because rpois is called only once, before the solver starts. Here is an example with parameters varying over time. However, this setting assumes that for a given time step, the parameters are equal for all individuals of the population. If you want individual variability, you can either use a matrix formulation for different sub-populations or use an individual-based model instead.
Model with fluctuating population parameters:
library('deSolve')
initial_state_values <- c(S = 10000,
I = 1,
R = 0)
parameters <- c(beta = 2.4, gamma = 0.1)
times <- seq(from = 0, to = 50, by = 1) # note time step = 1!
# +1 to add one for time = zero
beta_p <- rpois(max(times) + 1, parameters[1])
infectious_period_p <- rpois(max(times) + 1, 1/(parameters[2]))
gamma_p <- 1/infectious_period_p
sir_model <- function(time, state, parameters) {
# cat(time, "\n") # show time steps for debugging
with(as.list(c(state, parameters)), {
# this overwrites the parms passed via parameters
beta <- beta_p[floor(time) + 1]
gamma <- gamma_p[floor(time) + 1]
N <- S + I + R
lambda <- beta * I/N
dS <- -lambda * S
dI <- lambda * S - gamma * I
dR <- gamma * I
list(c(dS, dI, dR))
})
}
output <- ode(y = initial_state_values,
times = times,
func = sir_model,
parms = parameters)
plot(output)
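Two edge cases may be worth guarding against in the lookup above; the following is only a defensive sketch with hypothetical helper names. First, rpois can return 0, which makes 1/infectious_period_p infinite. Second, an ODE solver may evaluate the model at times between, and possibly slightly beyond, the requested output times, so the index floor(time) + 1 is clamped to the valid range here.

```r
set.seed(1)
times <- seq(from = 0, to = 50, by = 1)
n_steps <- max(times) + 1

# Assumption: force at least one time step of infectiousness so that
# gamma_p never becomes Inf
infectious_period_p <- pmax(1, rpois(n_steps, 1 / 0.1))
gamma_p <- 1 / infectious_period_p

# Hypothetical helper: value for the current time step, with the index
# clamped to 1..length(values)
param_at <- function(values, time) {
  idx <- min(max(floor(time) + 1, 1), length(values))
  values[idx]
}

param_at(gamma_p, 0)     # value for the first time step
param_at(gamma_p, 50.7)  # value for the last time step
param_at(gamma_p, 51.2)  # still the last value instead of NA
```

Inside the model function, beta_p[floor(time) + 1] could then be replaced by param_at(beta_p, time).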
Here is another, slightly more generalized version. It is added as a second answer to keep the original version compact and simple. The new version differs in the following respects:
generalized, so that it can work with fixed parameters and stochastic forcing
pass parameters as list
run a basic Monte-Carlo simulation
library('deSolve')
sir_model <- function(time, state, parameters) {
with(as.list(c(state, parameters)), {
# this overwrites the parms passed via parameters
if (time_dependent) {
beta <- beta_p[floor(time) + 1]
gamma <- gamma_p[floor(time) + 1]
}
N <- S + I + R
lambda <- beta * I/N
dS <- -lambda * S
dI <- lambda * S - gamma * I
dR <- gamma * I
list(c(dS, dI, dR))
})
}
initial_state_values <- c(S = 10000, I = 1, R = 0)
times <- seq(from = 0, to = 50, by = 1) # note time step = 1!
## (1) standard simulation with constant parameters
parameters <- c(beta = 2.4, gamma = 0.1)
out0 <- ode(y= initial_state_values,
times = times,
func = sir_model,
parms = c(parameters, time_dependent = FALSE))
plot(out0)
## (2) single simulation with time varying parameters
beta_p <- rpois(max(times) + 1, parameters[1])
infectious_period_p <- rpois(max(times) + 1, 1/(parameters[2]))
gamma_p <- 1/infectious_period_p
## here we need pass the vectorized parameters globally
## for simplicity, it can also be done as list
out1 <- ode(y = initial_state_values, times = times,
func = sir_model, parms = c(time_dependent = TRUE))
plot(out0, out1)
## (3) a sample of simulations
monte_carlo <- function(i) {
#parameters <- c(beta = 2.4, gamma = 0.1)
beta_p <- rpois(max(times) + 1, parameters[1])
infectious_period_p <- rpois(max(times) + 1, 1 / (parameters[2]))
gamma_p <- 1/infectious_period_p
ode(y = initial_state_values, times = times,
func = sir_model, parms = list(beta_p = beta_p,
gamma_p = gamma_p,
time_dependent = TRUE))
}
## run 10 simulations
out_mc <- lapply(1:10, monte_carlo)
plot(out0, out_mc, mfrow=c(1, 3))
I hope you are well and having a nice day. I am attempting to code an MH algorithm for a multiple linear regression model. I am following an online tutorial for simple linear regression using an MH algorithm and I plan on applying the principles myself to my multiple linear regression model, but I am running into a problem figuring out how to specify prior distributions for the parameters. Below is the code.
trueA <- 5
trueB <- 0
trueSd <- 10
sampleSize <- 31
# create independent x-values
x <- (-(sampleSize-1)/2):((sampleSize-1)/2)
# create dependent values according to ax + b + N(0,sd)
y <- trueA * x + trueB + rnorm(n=sampleSize,mean=0,sd=trueSd)
likelihood <- function(param){
a = param[1]
b = param[2]
sd = param[3]
pred = a*x + b
singlelikelihoods = dnorm(y, mean = pred, sd = sd, log = T)
sumll = sum(singlelikelihoods)
return(sumll)
}
This is the portion where I am running into problems. In the tutorial, the author now specifies that they define prior distributions for the parameters. In this case, I would like to specify uninformative prior distributions for the slopes, coefficients, and standard deviation. The author says that they used uniform/normal distributions.
I found an example using Poisson regression, which looks like:
LogPriorFunction <- function(param){
beta0 <- param[1]
beta1 <- param[2]
beta0prior <- dnorm(beta0, 0, sqrt(100), log=TRUE)
beta1prior <- dnorm(beta1, 0, sqrt(100), log=TRUE)
return(beta0prior + beta1prior) # Logarithm of prior distributions
}
But I am having trouble adapting it to the simple linear regression. Here is my attempt:
prior <- function(param){
a = param[1]
b = param[2]
sd = param[3]
aprior = dnorm(a, 0, 1000, log=TRUE)
bprior = dnorm(b, 0, 1000, log=TRUE)
sdprior = 1/rgamma(sd, shape = .001, scale = .001)
return(aprior + bprior + sdprior) # Logarithm of prior distributions
}
The rest of the code is:
posterior <- function(param){
return (likelihood(param) + prior(param))
}
proposalfunction <- function(param){
return(rnorm(3,mean = param, sd= c(0.1,0.5,0.3)))
}
run_metropolis_MCMC <- function(startvalue, iterations){
chain = array(dim = c(iterations+1,3))
chain[1,] = startvalue
for (i in 1:iterations){
proposal = proposalfunction(chain[i,])
probab = exp(posterior(proposal) - posterior(chain[i,]))
if (runif(1) < probab){
chain[i+1,] = proposal
}else{
chain[i+1,] = chain[i,]
}
}
return(chain)
}
startvalue = c(4,0,10)
chain = run_metropolis_MCMC(startvalue, 10000)
burnIn = 5000
acceptance = 1-mean(duplicated(chain[-(1:burnIn),]))
Finally, along with the problem of specifying the priors, I keep getting the following error and warning message:
Error in if (runif(1) < probab) { : missing value where TRUE/FALSE needed
In addition: Warning message:
In if (runif(1) < probab) { :
the condition has length > 1 and only the first element will be used
I am not sure if this is related to my incorrectly specifying the prior distributions, as the author of the article gets similar results from the MH algorithm as from running a simple linear regression model using the lm() function. Any help with these issues would be greatly appreciated. I look forward to hearing from you. Thank you.
As a reference, here is the link to the article:
https://khayatrayen.github.io/MCMC.html#defining_the_prior
Finally, here is a link to the article from where I tried to emulate the priors, but does Poisson regression and not linear regression:
https://rpubs.com/SaraGarcesCespedes/586440
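One likely source of the error above is the line using rgamma: the r* functions draw random numbers and treat their first argument as the number of draws, so the prior becomes a random vector rather than a single log-density, and the acceptance probability is then a vector that can contain NaN. A minimal sketch of a log-prior that evaluates densities instead follows. The uniform prior on sd and its upper bound of 100 are assumptions chosen for illustration, and log_lik is a placeholder for the likelihood function defined earlier.

```r
# Log-prior: evaluate densities (d* functions), never draw random numbers
log_prior <- function(param) {
  a  <- param[1]
  b  <- param[2]
  sd <- param[3]
  if (sd <= 0) return(-Inf)  # sd must be positive
  a_prior  <- dnorm(a, mean = 0, sd = 1000, log = TRUE)
  b_prior  <- dnorm(b, mean = 0, sd = 1000, log = TRUE)
  sd_prior <- dunif(sd, min = 0, max = 100, log = TRUE)  # assumed vague prior
  a_prior + b_prior + sd_prior
}

# Log-posterior: skip the likelihood when the prior rules the value out,
# so that a negative sd never reaches dnorm() and produces NaN
log_posterior <- function(param, log_lik) {
  lp <- log_prior(param)
  if (!is.finite(lp)) return(-Inf)
  lp + log_lik(param)
}

log_prior(c(4, 0, 10))   # a single finite number
log_prior(c(4, 0, -1))   # -Inf, so the proposal is always rejected
```

With this form, exp(posterior(proposal) - posterior(current)) is a single number, and a proposal with a non-positive sd gets acceptance probability 0.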
I want to estimate a model in R.
One part of it is a finite mixture model that consists of two OLS regressions.
As a beginner in R, I don't know how to write this probability density function in R.
I wonder if you can give me some help.
The probability density function is as follows:
f(y|x)=(p/σ1)*φ((y-x*b1)/σ1)+((1-p)/σ2)*φ((y-x*b2)/σ2)
I have written an example in Stata:
gen double `f1'=normalden($ML_y1,`xb1',exp(`lns1'))
gen double `f2'=normalden($ML_y1,`xb2',exp(`lns2'))
tempvar p
gen double `p'=exp(`lp')/(1+exp(`lp'))
replace `lnf'=ln(`p'*`f1'+(1-`p')*`f2')
I wonder if you can show me how to write this function in R.
Thanks a lot and I am looking forward to your help
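A minimal sketch of how the density above might be written directly in R, mirroring the Stata parameterization (log standard deviations and a logit-scale mixing weight). The function name, the parameter ordering, and the simulated data are assumptions for illustration only. The mixture is summed on the log scale to avoid underflow.

```r
# Negative log-likelihood of a two-component normal regression mixture.
# par = c(b1, b2, lns1, lns2, lp), where b1 and b2 each have ncol(X) entries
mix_negloglik <- function(par, y, X) {
  k <- ncol(X)
  b1 <- par[1:k]
  b2 <- par[(k + 1):(2 * k)]
  lns1 <- par[2 * k + 1]
  lns2 <- par[2 * k + 2]
  lp <- par[2 * k + 3]
  # log(p * f1) and log((1 - p) * f2), with p = exp(lp) / (1 + exp(lp))
  l1 <- plogis(lp, log.p = TRUE) +
    dnorm(y, mean = drop(X %*% b1), sd = exp(lns1), log = TRUE)
  l2 <- plogis(-lp, log.p = TRUE) +
    dnorm(y, mean = drop(X %*% b2), sd = exp(lns2), log = TRUE)
  m <- pmax(l1, l2)
  -sum(m + log(exp(l1 - m) + exp(l2 - m)))
}

# Simulated example data (not from the question)
set.seed(123)
n <- 300
X <- cbind(1, rnorm(n))
z <- rbinom(n, 1, 0.4)
mu1 <- drop(X %*% c(1, 2))
mu2 <- drop(X %*% c(-1, 0.5))
y <- ifelse(z == 1, rnorm(n, mu1, 0.5), rnorm(n, mu2, 1))

start <- c(0.5, 1.5, -0.5, 1, 0, 0, 0)  # rough starting values
fit <- optim(start, mix_negloglik, y = y, X = X, method = "BFGS")
p_hat <- plogis(fit$par[7])  # mixing weight back on the probability scale
```

Mixture likelihoods can have several local optima, so trying a few starting values is usually advisable.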
See the function FLXMRglm. The density is estimated with dnorm
library(flexmix)
FLXMRglm
# your case
if (family == "gaussian") {
z@defineComponent <- function(para) {
predict <- function(x, ...) {
dotarg = list(...)
if ("offset" %in% names(dotarg))
offset <- dotarg$offset
p <- x %*% para$coef
if (!is.null(offset))
p <- p + offset
p
}
logLik <- function(x, y, ...) dnorm(y, mean = predict(x,
...), sd = para$sigma, log = TRUE)
new("FLXcomponent", parameters = list(coef = para$coef,
sigma = para$sigma), logLik = logLik, predict = predict,
df = para$df)
}
z@fit <- function(x, y, w, component) {
fit <- lm.wfit(x, y, w = w, offset = offset)
z@defineComponent(para = list(coef = coef(fit), df = ncol(x) +
1, sigma = sqrt(sum(fit$weights * fit$residuals^2/mean(fit$weights))/(nrow(x) -
fit$rank))))
}
}
I need to implement a logistic regression manually, using the Score/GMM approach, without the use of GLM. This is because at later stages the model will be much more complicated. Currently I am running into a problem where, for the logistic regression, the optimization procedures are very dependent on the initial point. To illustrate, here is my code using an online dataset. More details about the procedure are in the comments:
library(data.table)
library(nleqslv)
library(Matrix)
mydata <- read.csv("https://stats.idre.ucla.edu/stat/data/binary.csv")
data_analysis<-data.table(mydata)
data_analysis[,constant:=1]
#Likelihood function for logit
#The logistic regression will regress the binary variable
#admit on a constant and the variable gpa
LL <- function(beta){
beta=as.numeric(beta)
data_temp=data_analysis
mat_temp2 = cbind(data_temp$constant,
data_temp$gpa)
one = rep(1,dim(mat_temp2)[1])
h = exp(beta %*% t(mat_temp2))
choice_prob = h/(1+h)
llf <- sum(data_temp$admit * log(choice_prob)) + (sum((one-data_temp$admit) * log(one-choice_prob)))
return(-1*llf)
}
#Score to be used when optimizing using LL
#Identical to the Score function below but returns negative output
Score_LL <- function(beta){
data_temp=data_analysis
mat_temp2 = cbind(data_temp$constant,
data_temp$gpa)
one = rep(1,dim(mat_temp2)[1])
h = exp(beta %*% t(mat_temp2))
choice_prob = h/(1+h)
resid = as.numeric(data_temp$admit - choice_prob)
score_final2 = t(mat_temp2) %*% Diagonal(length(resid), x=resid) %*% one
return(-1*as.numeric(score_final2))
}
#The Score/Deriv/Jacobian of the Likelihood function
Score <- function(beta){
data_temp=data_analysis
mat_temp2 = cbind(data_temp$constant,
data_temp$gpa)
one = rep(1,dim(mat_temp2)[1])
h = exp(beta %*% t(mat_temp2))
choice_prob = as.numeric(h/(1+h))
resid = as.numeric(data_temp$admit - choice_prob)
score_final2 = t(mat_temp2) %*% Diagonal(length(resid), x=resid) %*% one
return(as.numeric(score_final2))
}
#Derivative of the Score function
Score_Deriv <- function(beta){
data_temp=data_analysis
mat_temp2 = cbind(data_temp$constant,
data_temp$gpa)
one = rep(1,dim(mat_temp2)[1])
h = exp(beta %*% t(mat_temp2))
weight = (h/(1+h)) * (1- (h/(1+h)))
weight_mat = Diagonal(length(weight), x=weight)
deriv = t(mat_temp2)%*%weight_mat%*%mat_temp2
return(-1*as.array(deriv))
}
#Quadratic Gain function
#Minimized at Score=0 and so minimizing is equivalent to solving the
#FOC of the Likelihood. This is the GMM approach.
Quad_Gain<- function(beta){
h=Score(as.numeric(beta))
return(sum(h*h))
}
#Derivative of the Quadratic Gain function
Quad_Gain_deriv <- function(beta){
return(2*t(Score_Deriv(beta))%*%Score(beta))
}
sol1=glm(admit ~ gpa, data = data_analysis, family = "binomial")
sol2=optim(c(2,2),Quad_Gain,gr=Quad_Gain_deriv,method="BFGS")
sol3=optim(c(0,0),Quad_Gain,gr=Quad_Gain_deriv,method="BFGS")
When I run this code, I get that sol3 matches what glm produces (sol1) but sol2, with a different initial point, differs from the glm solution by a lot. This is something happening in my main code with the actual data as well. One solution is to create a grid and test multiple starting points. However, my main data set has 10 parameters and this would make the grid very large and the program computationally infeasible. Is there a way around this problem?
Your code seems overly complicated. The following two functions define the negative log-likelihood and negative score vector for a logistic regression with the logit link:
logLik_Bin <- function (betas, y, X) {
eta <- c(X %*% betas)
- sum(dbinom(y, size = 1, prob = plogis(eta), log = TRUE))
}
score_Bin <- function (betas, y, X) {
eta <- c(X %*% betas)
- crossprod(X, y - plogis(eta))
}
Then you can use it as follows:
# load the data
mydata <- read.csv("https://stats.idre.ucla.edu/stat/data/binary.csv")
# fit with optim()
opt1 <- optim(c(-1, 1, -1), logLik_Bin, score_Bin, method = "BFGS",
y = mydata$admit, X = cbind(1, mydata$gre, mydata$gpa))
opt1$par
# compare with glm()
glm(admit ~ gre + gpa, data = mydata, family = binomial())
Typically, for well-behaved covariates (i.e., when you expect the coefficients to lie in the interval [-4, 4]), starting at 0 is a good idea.
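The log-likelihood of a logistic regression is concave, so its negative has a single minimum, whereas a squared-score objective is not guaranteed to be convex. As a small sketch on simulated data (not the UCLA dataset; all names here are illustrative), minimizing the negative log-likelihood from two different starting points should lead to essentially the same estimates:

```r
set.seed(2024)
n <- 400
x <- rnorm(n)                    # simulated, standardized covariate
X <- cbind(1, x)
y <- rbinom(n, 1, plogis(-0.5 + 1 * x))

# Negative log-likelihood and its gradient (the negative score)
neg_loglik <- function(b) {
  -sum(dbinom(y, size = 1, prob = plogis(drop(X %*% b)), log = TRUE))
}
neg_score <- function(b) {
  -drop(crossprod(X, y - plogis(drop(X %*% b))))
}

fit_a <- optim(c(0, 0), neg_loglik, neg_score, method = "BFGS")
fit_b <- optim(c(2, 2), neg_loglik, neg_score, method = "BFGS")

rbind(start_00 = fit_a$par, start_22 = fit_b$par)
coef(glm(y ~ x, family = binomial()))  # reference fit
```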
I am trying to write a function that can calculate Approximate Bayesian Computation using the Population Monte Carlo method. However, I ran into some troubles with my R code with the following error.
>head(ABC_PMC(2,5000,1,observedData,observedSummary))
Error in while (prior == 0) { : missing value where TRUE/FALSE needed
In addition: Warning message:
In log(importance.sample[2]) : NaNs produced
The code doesn't have a problem when I run it with one iteration, t=1. The issue lies with the second iteration, and I think it is in the assignment in the importance sampling step. But I can't figure out what the error is, as I tried to run the code manually and it works.
Any and all help will be appreciated.
# initialise variables:
library(mvtnorm) # provides rmvnorm
set.seed(12345)
n=5000
weight = rep(1,n)
theta = matrix(NA, nrow = n, ncol = 2)
for (i in 1:n) {
fit = c(runif(1,-2,2), runif(1,0,4))
theta[i,] = fit
}
#Calculate importance density for current t to use in next iteration
old.prop.weight = weight / sum(weight)
old.theta = theta
Kernel.mean = cbind(old.theta[,1],old.theta[,2])
Kernel.cov = cov(old.theta)
# Main loop:
for (i in 1:n) {
prior=0
while (prior==0) {
#Loop to test is prior=0
choice = sample(1:n, size = 1, prob = old.prop.weight)
importance.sample = rmvnorm(1,mean = Kernel.mean[choice,], sigma = Kernel.cov)
#logsigma to ensure theta2 are all nonnegative
if (importance.sample[2] != 0) {
importance.sample[2] = log(importance.sample[2])
fit = c(importance.sample[1],importance.sample[2])
}
prior = as.numeric(dunif(fit[1],min = -2,max = 2) * dunif(exp(fit[2]),min = 0,max = 4))
}
theta[i,] = fit
##End of step 2 loop
}
Edit: Tried to make the code into a reproducible example.
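One plausible cause of the error, offered as a sketch rather than a confirmed diagnosis: when the multivariate normal draw has a negative second component, log() returns NaN (hence the warning), dunif(NaN, ...) is NaN, and while (prior == 0) then fails with "missing value where TRUE/FALSE needed". A possible rearrangement is to evaluate the prior on the original scale first and take the log only for accepted draws. The version below uses base R only (a Cholesky factor instead of rmvnorm), a smaller n, and hypothetical names.

```r
set.seed(12345)
n <- 200
theta <- cbind(runif(n, -2, 2), runif(n, 0, 4))  # particles from the prior
weights <- rep(1 / n, n)
kernel_chol <- chol(cov(theta))  # upper-triangular factor of the kernel covariance

draw_particle <- function() {
  repeat {
    choice <- sample.int(n, size = 1, prob = weights)
    # multivariate normal draw centred on the chosen particle
    proposal <- theta[choice, ] + drop(rnorm(2) %*% kernel_chol)
    # prior density on the original scale; 0 outside the support
    prior <- dunif(proposal[1], min = -2, max = 2) *
      dunif(proposal[2], min = 0, max = 4)
    # take the log only when the draw is strictly positive and in support
    if (prior > 0 && proposal[2] > 0) {
      return(c(proposal[1], log(proposal[2])))
    }
  }
}

new_theta <- t(replicate(n, draw_particle()))
```

Here the second column of new_theta holds the log of the second parameter, matching the log transformation in the question.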