My task is to simulate a compound Poisson process defined as
X(t) = sum_{i=1}^{N(t)} Y_i,
where N(t) is a Poisson process with rate lambda and the Y_i are Gamma(shape, scale) distributed. This is my R code:
# parameter for Poisson distribution.
lambda = 1
# parameters for Gamma distribution.
shape = 7.5
scale = 1
comp.pois = function(t.max, lambda) {
  stopifnot(t.max >= 0 && t.max %% 1 == 0)
  # offset ns by 1 because first y is 0.
  # generate N(t), that is number of arrivals until time t.
  ns = cumsum(rpois(n = t.max, lambda = lambda)) + 1
  # generate gamma distributed random variables Y_i.
  ys = c(0, rgamma(n = max(ns), shape = shape, scale = scale))
  # generate all X(t) for t <= t.max.
  return(c(0, cumsum(x = ys[ns])))
}
I then compute a random sample of X(10) and compare the sample and theoretical means and variances:
# sample size.
size = 1000
t = 10
# ts is a vector of sample values for X(10).
ts = sapply(1:size, function(i) comp.pois(t, lambda)[t])
# sample mean and variance:
(mean.s = mean(ts))
(var.s = var(ts))
# theoretical mean and variance:
(mean.t = lambda * t * shape * scale)
(var.t = (shape + 1) * shape * scale^2)
output:
> # sample:
> (mean.s = mean(ts))
[1] 63.38403
> (var.s = var(ts))
[1] 184.3264
> # theoretical:
> (mean.t = lambda * t * shape * scale)
[1] 75
> (var.t = (shape + 1) * shape * scale^2)
[1] 63.75
The sample variance is gigantic compared to the theoretical one, but I cannot spot my mistake. Please help. Thank you.
EDIT:
I used the following algorithm to generate the N(t). I don't know why it is supposed to be better; I took it from Rizzo, Maria L., Statistical Computing with R, CRC Press, 2007. The mean is good, but the variance is even worse. I tried sampling from the Gamma distribution only once for the entire simulation (although I'm pretty sure this does not reflect the problem very well), and the mean was off by around 10-40 for t = 10. When resampling for every X(t) (which is what the following code does), the mean is very accurate. As pointed out, the variance is horrifying. This is probably not a good solution, but I suppose it is as good as it gets.
lambda = 3
shape = 6
scale = 2
size = 10000
eps = 1e-8
t = 10
# with probability 1-eps, n or fewer gamma distributed random variables are needed.
n = qpois(1-eps, lambda = lambda * t)
# sample from the gamma distribution. Not sure if it's ok to use the same sample every time.
# with this, the mean is off by about 10%.
# ys = c(rgamma(n = n, shape = shape, scale = scale))
# the interarrival times are exponentially distributed with rate lambda.
pp.exp = function (t0) {
  # not sure how many Tn are needed :/
  Tn = rexp(1000, lambda)
  Sn = cumsum(Tn)
  return(min(which(Sn > t0)) - 1)
}
# generate N(t) which follow the poisson process.
ns = sapply(1:size, function (i) pp.exp(t))
# generate X(t) as in the problem description.
xs = sapply(ns, function (n) {
  ys = c(rgamma(n = n, shape = shape, scale = scale))
  sum(ys[1:n])
})
output (t=10) in this case:
> # compare mean and variance of 'size' samples of X(t) for verification.
> # sample:
> (mean.s = mean(xs))
[1] 359.864
> (var.s = var(xs))
[1] 4933.277
> # theoretical:
> (mean.t = lambda * t * shape * scale)
[1] 360
> (var.t = (shape + 1) * shape * scale^2)
[1] 168
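For comparison, a direct simulation of X(t) straight from the definition (draw N(t) ~ Poisson(lambda*t), then sum that many Gamma variables) is a useful sanity check; a minimal sketch, using the standard compound-Poisson moments E[X(t)] = lambda*t*shape*scale and Var[X(t)] = lambda*t*shape*(shape+1)*scale^2 as reference values:
lambda = 3; shape = 6; scale = 2; t = 10; size = 10000
# draw N(t) ~ Poisson(lambda * t), then sum N(t) Gamma(shape, scale) variables
xs.direct = replicate(size, sum(rgamma(rpois(1, lambda * t), shape = shape, scale = scale)))
mean(xs.direct)  # reference: lambda * t * shape * scale                 = 360
var(xs.direct)   # reference: lambda * t * shape * (shape + 1) * scale^2 = 5040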
Related
Using R, I wish to estimate a vector of parameters a_i (of arbitrary length, i.e. i = 1,...,s) with a multinomial likelihood using a corresponding vector of observations n_i totaling a sample size of N=sum_i (n_i). The probabilities p_i of the multinomial are determined by said a parameters and measurements of variable x such that p_i = (a_i * x_i)/sum_i (a_i * x_i). I wish further to impose the constraint that sum_i a_i = 1.
I've managed to get optim() to do the job as follows, implementing the two tricks I've seen: estimating the first parameter a_1 as 1 - sum_{i=2} a_i, and additionally renormalizing all estimates to sum to 1. However, the accuracy and reliability of convergence remain rather variable (in addition to being sensitive to the vector of starting estimates I provide), even when N is very large.
I would appreciate guidance on more robust alternatives and/or improvements.
s <- 10 # vector length
N <- 1000 # total sample size
# variable
x_i <- round(rlnorm(s, 2.5, 1.5))
# true parameter values
a_i <- rbeta(s, 2, 2)
a_i <- a_i / sum(a_i)
# generate observations
n_i <- rmultinom(1, N, (a_i * x_i) / sum(a_i * x_i))
# negative log-likelihood for parameters `par'
nll = function(par) {
  if (any(0 > par | par > 1)) {
    return(NA)
  } else {
    par <- c(1 - sum(par), par)          # estimate first as remainder
    par <- par / sum(par)                # normalize
    p_i <- (par * x_i) / sum(par * x_i)  # model for probabilities
    -sum(dmultinom(x = n_i, size = N, prob = p_i, log = TRUE))
  }
}
# starting values (dropping first)
start = rep(1/s, s-1)
fit <- optim(par = start,
fn = nll,
control = list(maxit = 10000)
)
ests = c(1 - sum(fit$par), fit$par)
cbind(a_i, ests)
par(pty = 's')
plot(a_i, ests)
abline(0, 1)
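One commonly used, more robust alternative to the "1 - sum" trick is to optimize s - 1 unconstrained logits and map them onto the simplex with a softmax, so every candidate automatically satisfies 0 <= a_i <= 1 and sum(a_i) = 1 and the likelihood never has to return NA. A minimal sketch (helper names like softmax and nll_logit are illustrative, not from the code above):
softmax <- function(z) {
  z <- c(0, z)     # fix the first logit at 0 for identifiability
  z <- z - max(z)  # guard against overflow in exp()
  exp(z) / sum(exp(z))
}
# negative log-likelihood on the logit scale
nll_logit <- function(z) {
  a   <- softmax(z)                # a_i >= 0 and sum(a_i) = 1 by construction
  p_i <- (a * x_i) / sum(a * x_i)  # model for the multinomial probabilities
  -dmultinom(x = n_i, size = N, prob = p_i, log = TRUE)
}
fit2  <- optim(par = rep(0, s - 1), fn = nll_logit,
               method = "BFGS", control = list(maxit = 10000))
ests2 <- softmax(fit2$par)
cbind(a_i, ests2)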
I have found no examples of neldermead() on the internet, so I figured I would post the following. That said, I can't figure out how to control the maximum number of iterations of the algorithm.
https://cran.r-project.org/web/packages/neldermead/neldermead.pdf
Which states the following:
optbase An object of class ’optimbase’, i.e. a list created by optimbase() and containing the following elements:
iterations The number of iterations.
Working example of using Nelder-Mead to fit a parabola by minimizing the residuals:
library(neldermead); library(nloptr);
## ========= Minimizing the residuals for a 2d quadrature =========== ##
x2d = seq(-4,6,length.out=50); ## x vector definition
ynoise = runif(n=50, min=-2, max = 2) ## noise
y2d = 1.3 + (x2d-2.1)^2 + ynoise ## y data for fitting
## Fitting with nelder-mead
quadmin <- function(x){ sum( (y2d - x[1] - (x2d - x[2])^2)^2 ) }
x000 <- c(1, 2)
sol2d <- neldermead(x0 = x000, fn = quadmin)
sol2d
yfit = sol2d[[1]][1] + (x2d - sol2d[[1]][2])^2 ## Fitted curve.
plot(x2d, y2d); lines(x2d, yfit) ## Plotting
But I'm looking to do something like:
sol2d <- neldermead(x0 = x000, fn = quadmin, iterations = 200)
^^^ which doesn't work. Neither does putting it into a list:
sol2d <- neldermead(x0 = x000, fn = quadmin, optbase = list(iterations = 200))
This is a basic question about how to use these arguments, so I apologize if this isn't the right title. In advance, thank you for your help.
There are at least two neldermead() functions available in R. One is from the neldermead package, which corresponds to the documentation you linked.
I have not been able to make that one work; it gives me back neither an error nor a solution.
The code:
library(neldermead)
library(nloptr)
## ========= Minimizing the residuals for a 2d quadrature =========== ##
x2d = seq(-4,6,length.out=50); ## x vector definition
ynoise = runif(n=50, min=-2, max = 2) ## noise
y2d = 1.3 + (x2d-2.1)^2 + ynoise ## y data for fitting
## Fitting with nelder-mead
quadmin <- function(x){
  sum((y2d - x[1] - (x2d - x[2])^2)^2) }
x000 <- c(1, 2)
opt <- optimbase(x0 = as.matrix(x000),fx0 = -1000,maxiter = 200,fopt = quadmin,verbose=T)
sol2d <- neldermead::neldermead(opt)
On the other hand, the nloptr package also provides a neldermead() function, whose syntax looks closer to your code, and I have been able to run it:
library(neldermead)
library(nloptr)
## ========= Minimizing the residuals for a 2d quadrature =========== ##
x2d = seq(-4,6,length.out=50); ## x vector definition
ynoise = runif(n=50, min=-2, max = 2) ## noise
y2d = 1.3 + (x2d-2.1)^2 + ynoise ## y data for fitting
## Fitting with nelder-mead
quadmin <- function(x){
  sum((y2d - x[1] - (x2d - x[2])^2)^2) }
x000 <- c(1, 2)
sol2d <- nloptr::neldermead(x0 = x000,fn =quadmin,control =list(maxeval=200))
yfit = sol2d[[1]][1] + (x2d - sol2d[[1]][2])^2 ## Fitted curve.
plot(x2d, y2d); lines(x2d, yfit)
As you can see, the only issue with your code was the control argument.
Best!
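As a side note, the value returned by nloptr::neldermead is a plain list, so (assuming the usual fields of nloptr's wrapper functions) the result can also be read by name instead of by position:
sol2d$par      # estimated parameters (x[1], x[2])
sol2d$value    # objective value at the optimum
sol2d$message  # termination message from the solver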
I am trying to obtain estimated constrained coefficients using RSS. The beta coefficients are constrained between [0,1] and sum to 1. Additionally, my third parameter is constrained between (-1,1). Utilizing the below I can obtain a nice solution using simulated variables but when implementing the methodology on my real data set I keep arriving at a non-unique solution. In turn, I'm wondering if there is a more numerically stable way to obtain my estimated parameters.
set.seed(234)
k = 2
a = diff(c(0, sort(runif(k-1)), 1))
n = 1e4
x = matrix(rnorm(k*n), nc = k)
a2 = -0.5
y = a2 * (x %*% a) + rnorm(n)
f = function(u){sum((y - u[3] * (x %*% u[1:2]))^2)}
g = function(v){
  v1 = v[1]
  v2 = v[2]
  u = vector(mode = "double", length = 3)
  # ensure in (0,1)
  v1 = 1 / (1 + exp(-v1))
  # ensure add up to 1
  u[1:2] = c(v1, 1 - sum(v1))
  # ensure between [-1,1]
  u[3] = (v2^2 - 1) / (v2^2 + 1)
  u
}
res = optim(rnorm(2), function(v) f(g(v)), hessian = TRUE, method = "BFGS")
eigen(res$hessian)$values
res$convergence
rbind(Est = res$par, SE = sqrt(diag(solve(res$hessian))))
rbind(g(res$par),c(a,a2))
Hats off to http://zoonek.free.fr/blosxom/R/2012-06-01_Optimization.html
Since there has been no direct answer to your question so far, I'd like to show a way to implement a parameter-constrained model in Stan/RStan. You should give this a try using your real data.
Doing Bayesian inference has the advantage of giving you posterior probabilities for your (constrained) model parameters. Point estimates including confidence intervals can then be easily calculated.
First off, we load the library and set RStan to store the compiled model and use multiple cores (if available).
library(rstan);
rstan_options(auto_write = TRUE);
options(mc.cores = parallel::detectCores());
We now define our Stan model. In this case, it's very simple, and we can make use of RStan's simplex data type for vectors of non-negative values that sum to one.
model <- "
data {
int<lower=1> n; // number of observations
int<lower=0> k; // number of parameters
matrix[n, k] X; // data
vector[n] y; // response
}
parameters {
real a2; // a2 is a free scaling parameter
simplex[k] a; // a is constrained to sum to 1
real sigma; // residuals
}
model {
// Likelihood
y ~ normal(a2 * (X * a), sigma);
}"
Stan supports various constrained data types; for instance, the (-1, 1) bound on a2 from your question could be declared directly as real<lower=-1, upper=1> a2;. I'd recommend taking a look at the Stan manual for more complex examples.
Using the sample data from your original question, we can run our model:
# Sample data
set.seed(234);
k = 2;
a = diff(c(0, sort(runif(k-1)), 1));
n = 1e4;
x = matrix(rnorm(k * n), nc = k);
a2 = -0.5;
y = a2 * (x %*% a) + rnorm(n);
# Fit stan model
fit <- stan(
model_code = model,
data = list(
n = n,
k = k,
X = x,
y = as.numeric(y)),
iter = 4000,
chains = 4);
Running the model will only take a few seconds (after the parser has internally translated and compiled the model in C++), and the full results (posterior distributions for all parameters conditional on the data) are stored in fit.
We can inspect the contents of fit using summary:
# Extract parameter estimates
pars <- summary(fit)$summary;
pars;
# mean se_mean sd 2.5% 25%
#a2 -0.4915289 1.970327e-04 0.014363398 -0.5194985 -0.5011471
#a[1] 0.7640606 2.273282e-04 0.016348488 0.7327691 0.7527457
#a[2] 0.2359394 2.273282e-04 0.016348488 0.2040952 0.2248482
#sigma 1.0048695 8.746869e-05 0.007048116 0.9909698 1.0001889
#lp__ -5048.4273105 1.881305e-02 1.204892294 -5051.4871931 -5048.9800451
# 50% 75% 97.5% n_eff Rhat
#a2 -0.4916061 -0.4819086 -0.4625947 5314.196 1.0000947
#a[1] 0.7638723 0.7751518 0.7959048 5171.881 0.9997468
#a[2] 0.2361277 0.2472543 0.2672309 5171.881 0.9997468
#sigma 1.0048994 1.0095420 1.0187554 6492.930 0.9998086
#lp__ -5048.1238783 -5047.5409682 -5047.0355381 4101.832 1.0012841
You can see that a[1]+a[2]=1.
Plotting parameter estimates including confidence intervals is also easy:
plot(fit);
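If you need the point estimates and intervals programmatically rather than from the printed summary or the plot, you can also work with the posterior draws directly; a small sketch using rstan's extract():
# Posterior draws as plain R structures
post <- rstan::extract(fit);
# Posterior mean and 95% credible interval for a2
mean(post$a2);
quantile(post$a2, probs = c(0.025, 0.975));
# 95% credible intervals for each component of the simplex a
apply(post$a, 2, quantile, probs = c(0.025, 0.975));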
The simplest way to solve optimization problems with equality and inequality constraints will most likely be through the "augmented Lagrangian" approach. In R this is, for example, realized in the alabama package.
# function and gradient
fn = function(u){sum((y - u[3] * (x %*% u[1:2]))^2)}
gr = function(u) numDeriv::grad(fn, u)
# constraint sum(u) == 1
heq = function(u) sum(u) - 1
# constraints 0 <= u[1],u[2] <= 1; -1 <= u[3] <= 1
hin = function(u) c(u[1], u[2], 1-u[1], 1-u[2], u[3]+1, 1-u[3])
sol_a = alabama::auglag(c(0.5, 0.5, 0), fn, gr, hin=hin, heq=heq)
sol_a
## $par
## [1] 1.0000000 0.3642904 -0.3642904
## $value
## [1] 10094.74
## ...
## $hessian
## [,1] [,2] [,3]
## [1,] 15009565054 9999999977 9999992926
## [2,] 9999999977 10000002578 9999997167
## [3,] 9999992926 9999997167 10000022569
For other packages containing an "augmented Lagrangian" procedure see the CRAN Task View on optimization.
I came across an interesting presentation and started to replicate and understand the code presented on page 32 of it.
The code from the presentation is as follows:
#Unicredit banks code
library(evir)
library(fExtremes)
# Quantile function of lognormal-GPD severity distribution
qlnorm.gpd = function(p, theta, theta.gpd, u)
{
  Fu = plnorm(u, meanlog = theta[1], sdlog = theta[2])
  x = ifelse(p < Fu,
             qlnorm(p = p, meanlog = theta[1], sdlog = theta[2]),
             qgpd(p = (p - Fu) / (1 - Fu), xi = theta.gpd[1], mu = theta.gpd[2], beta = theta.gpd[3]))
  return(x)
}
# Random sampling function of lognormal-GPD severity distribution
rlnorm.gpd = function(n, theta, theta.gpd, u)
{
  r = qlnorm.gpd(runif(n), theta, theta.gpd, u)
}
set.seed(1000)
nSim = 1000000 # Number of simulated annual losses
H = 1500 # Threshold body-tail
lambda = 791.7354 # Parameter of Poisson body
theta1 = 2.5 # Parameter mu of lognormal (body)
theta2 = 2 # Parameter sigma of lognormal (body)
theta1.tail = 0.5 # Shape parameter of GPD (tail)
theta2.tail = H # Location parameter of GPD (tail)
theta3.tail = 1000 # Scale parameter of GPD (tail)
sj = rep(0, nSim) # Annual loss distribution initialization
freq = rpois(nSim, lambda) # Random sampling from Poisson
for (i in 1:nSim)  # Convolution with Monte Carlo method
  sj[i] = sum(rlnorm.gpd(n = freq[i], theta = c(theta1, theta2),
                         theta.gpd = c(theta1.tail, theta2.tail, theta3.tail), u = H))
However I get this error which I cannot resolve:
Error: min(p, na.rm = TRUE) >= 0 is not TRUE
APPENDED Question
Many thanks to Shadow.
I don't know how to change the function reference. Is it as easy as switching from the fExtremes qgpd to the evir qgpd?
Thanks to Shadow again for pointing this out.
For anyone who wishes to reference a function from a different package (in the above example, from fExtremes to evir), it's as simple as prefixing the function with evir:::.
Example:
evir:::qgpd(p = (p - Fu) / (1 - Fu), xi = theta.gpd[1], mu = theta.gpd[2], beta = theta.gpd[3])
The reason you get an error here is that the fExtremes and evir packages both implement a qgpd function, but with different argument checking. Because ifelse() evaluates both branches over the whole vector, qgpd is called with negative values of (p - Fu)/(1 - Fu) wherever p < Fu; the evir version tolerates p < 0, while the fExtremes version requires p >= 0 and throws the error you see.
The easiest solution to this is to change the qgpd function call to evir:::qgpd.
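Applied to the function from the question, that is the only line that changes; a sketch of the patched version:
library(evir)
# Quantile function of the lognormal-GPD severity distribution,
# now calling the evir implementation of qgpd explicitly
qlnorm.gpd = function(p, theta, theta.gpd, u)
{
  Fu = plnorm(u, meanlog = theta[1], sdlog = theta[2])
  x = ifelse(p < Fu,
             qlnorm(p = p, meanlog = theta[1], sdlog = theta[2]),
             # evir's qgpd accepts the negative values that ifelse()
             # produces for the p < Fu branch, so no error is thrown
             evir:::qgpd(p = (p - Fu) / (1 - Fu),
                         xi = theta.gpd[1], mu = theta.gpd[2], beta = theta.gpd[3]))
  return(x)
}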