How to save estimated parameters from nigFit() in a variable

I want to automatically fit time series returns into a NIG distribution.
With nigfit() from the package fBasics I estimate the mu, alpha, beta and delta of the distribution.
> nigFit(histDailyReturns,doplot=FALSE,trace=FALSE)
Title:
Normal Inverse Gaussian Parameter Estimation
Call:
.nigFit.mle(x = x, alpha = alpha, beta = beta, delta = delta,
mu = mu, scale = scale, doplot = doplot, span = span, trace = trace,
title = title, description = description)
Model:
Normal Inverse Gaussian Distribution
Estimated Parameter(s):
alpha beta delta mu
48.379735861 -1.648483055 0.012361539 0.001125734
This works fine: nigFit prints the estimated parameters.
However, I would like to save the estimated parameters in variables so I can use them later.
> variable = nigFit(histDailyReturns, doplot=FALSE, trace=FALSE)
This doesn't work as hoped. 'variable' is an S4 object of class fDISTFIT, and calling it just prints the output of nigFit shown above.
I tried the following notations, to get just one parameter:
> variable$alpha
> variable.alpha
> variable[1]
I couldn't find an answer in the documentation of nigFit.
Is it possible to save the estimated parameters in variables? How does it work?

Access the output components using @. variable has different slots; get their names using slotNames(). Using the example from the documentation:
set.seed(1953)
s <- rnig(n = 1000, alpha = 1.5, beta = 0.3, delta = 0.5, mu = -1.0)
a <- nigFit(s, alpha = 1, beta = 0, delta = 1, mu = mean(s), doplot = TRUE)
slotNames(a)
[1] "call" "model" "data" "fit" "title"
[6] "description"
# `fit` is a list with all the goodies. You're looking for the vector `estimate`:
a@fit$estimate
alpha beta delta mu
1.6959724 0.3597794 0.5601027 -1.0446402

Examine the structure of the output object using str(variable):
> variable@fit$par[["alpha"]]
[1] 48.379735861
> variable@fit$par[["beta"]]
[1] -1.648483055
> variable@fit$par[["delta"]]
[1] 0.012361539
> variable@fit$par[["mu"]]
[1] 0.001125734
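To actually save the values for later use, assign the whole parameter vector once and extract the components by name. A minimal sketch, assuming variable holds the fitted object and the par component shown above:
# Save the full parameter vector, then pull out each piece by name
params <- variable@fit$par
alpha <- params[["alpha"]]
beta  <- params[["beta"]]
delta <- params[["delta"]]
mu    <- params[["mu"]]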

get MLE of gamma distribution parameters (especially location parameter) in R

Hi, I want to estimate gamma distribution parameters by hand! I know a lot of R functions that estimate shape and scale parameters, but it seems hard to find code for estimating the location parameter.
x <- c(108,91,62,59,84,60,71,105,70,69,66,65,78,83,82,68,107,68,68,69,80,
75,89,68,64,68,70,57,62,87,51,55,56,57,75,98,60,68,81,47,76,48,63,
58,40,62,61,58,38,40,45,68,56,64,49,53,50,39,54,47,37,50,54,70,49,
57,52,47,43,52,57,46,63,56,50,51,50,42,46,56,52,59,45,50,59,44,52,
54,53,63,45,56,55,53,56,46,45,49,63,50,41,42,53,50,58,50,37,53,58,
49,53,51,64,44,53,53,55,43,50,60,51,55,56,52,51,45,49,51,63,48,51,
60,45,40,50,66,62,69,53,54,49,47,63,55,62,57,58,51,50,57,62,45,47,
52,35,41,53,48,59,45,41,52,36,84,62,31,41,48,47,50,50,57,53,37,46,
41,56,51,39,59,53,51,49,45,42,32,55,34,43,35,48,33,41,38,57,37,40,
34,44,43,62,36,41,51,48,31,28,33,35,48,31)
# estimate shape and scale parameter
gamma_likelihood <- function(para){
  sum( (para[2] - 1)*log(x) - para[2]*log(para[1]) - log(gamma(para[2])) - x/para[1] + 1/para[1] )
}
MLE = optim(c(10,10),
            fn = gamma_likelihood,
            method = "L-BFGS-B",
            lower = 0.00001,
            control = list(fnscale = -1),
            hessian = T
)
MLE$par
# estimate location, shape and scale parameter
gamma_likelihood <- function(para){
  x = x[x > para[1]]
  sum( (para[3] - 1)*log(x - para[1]) - para[3]*log(para[2]) -
       log(gamma(para[3])) - x/para[2] + para[1]/para[2] )
}
MLE = optim(c(23,6,7),
            fn = gamma_likelihood,
            method = 'L-BFGS-B',
            lower = 0.00000001,
            control = list(fnscale = -1)
)
MLE$par
This is my code; I can estimate the shape and scale parameters.
However, when I add the location parameter to the log-likelihood, the result seems incorrect. The true parameters are c(21.4, 5.47, 6.0).
If any observed value is less than or equal to your location parameter, the whole likelihood for that value of the location parameter must be 0 (remember the likelihood is a function of the parameters, not of the observations).
x = x[x > para[1]] silently drops observations that are inconsistent with a given location parameter, making your function return a valid number when it should return -Inf, since a single "invalid" x means zero likelihood.
Here's a corrected version of your log-likelihood function:
# estimate location, shape and scale parameter
gamma_likelihood <- function(para){
  if(min(x) < para[1]) return(-Inf)
  sum( (para[3] - 1)*log(x - para[1]) - para[3]*log(para[2]) -
       log(gamma(para[3])) - x/para[2] + para[1]/para[2] )
}
MLE = optim(c(23,6,7),
            fn = gamma_likelihood,
            method = 'L-BFGS-B',
            lower = 0.00000001,
            control = list(fnscale = -1)
)
MLE$par
results in:
[1] 21.161109 5.394343 6.136862
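As a quick sanity check (a sketch, not part of the original answer), you can evaluate the corrected log-likelihood at the estimates and at the true parameters; the value at the MLE should be at least as high:
gamma_likelihood(MLE$par)            # log-likelihood at the estimates
gamma_likelihood(c(21.4, 5.47, 6.0)) # at the true parameters; should be no higher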

trying to use exact=TRUE feature in R glmnet

I am trying to use exact=TRUE feature in glmnet. But I am getting an error message.
> fit = glmnet(as.matrix(((x_values))), (as.matrix(y_values)),penalty=variable.list$penalty)
> coef.exact = coef(fit, s = 0.03, exact = TRUE)
Error: used coef.glmnet() or predict.glmnet() with `exact=TRUE` so must in addition supply original argument(s) x and y and penalty.factor in order to safely rerun glmnet
How can I supply penalty.factor to coef.exact?
Options tried:
> coef.exact = coef(as.matrix(((x_values))), (as.matrix(y_values)),penalty=variable.list$penalty, s = 0.03, exact = TRUE)
Error: $ operator is invalid for atomic vectors
>
> coef.exact = coef((as.matrix(((x_values))), (as.matrix(y_values)),penalty=variable.list$penalty), s = 0.03, exact = TRUE)
Error: unexpected ',' in "coef.exact = coef((as.matrix(((x_values))),"
>
> coef.exact = coef((as.matrix(((x_values))) (as.matrix(y_values)) penalty=variable.list$penalty), s = 0.03, exact = TRUE)
Error: unexpected symbol in "coef.exact = coef((as.matrix(((x_values))) (as.matrix(y_values)) penalty"
>
> coef.exact = coef(fit(as.matrix(((x_values))), (as.matrix(y_values)),penalty=variable.list$penalty), s = 0.03, exact = TRUE)
Error in fit(as.matrix(((x_values))), (as.matrix(y_values)), penalty = variable.list$penalty) :
could not find function "fit"
>
> coef.exact = coef(glmnet(as.matrix(((x_values))), (as.matrix(y_values)),penalty=variable.list$penalty), s = 0.03, exact = TRUE)
Error: used coef.glmnet() or predict.glmnet() with `exact=TRUE` so must in addition supply original argument(s) x and y and penalty.factor in order to safely rerun glmnet
>
Here is an example using mtcars as sample data. Note it's always advisable to provide a minimal & reproducible code example including sample data when posting on SO.
# Fit mpg ~ wt + disp
x <- as.matrix(mtcars[c("wt", "disp")]);
y <- mtcars[, "mpg"];
fit <- glmnet(x, y, penalty = 0.1);
# s is our regularisation parameter, and since we want exact results
# for s=0.035, we need to refit the model using the full data (x,y)
coef.exact <- coef(fit, s = 0.035, exact = TRUE, x = x, y = y, penalty.factor = 0.1);
coef.exact;
#3 x 1 sparse Matrix of class "dgCMatrix"
# 1
#(Intercept) 34.40289989
#wt -3.00225110
#disp -0.02016836
The reason why you explicitly need to provide x and y again is given in ?coef.glmnet (also see @FelipeAlvarenga's post).
So in your case, the following should work:
fit = glmnet(x = as.matrix(x_values), y = y_values, penalty = variable.list$penalty)
coef.exact = coef(
  fit,
  s = 0.03,
  exact = TRUE,
  x = as.matrix(x_values),
  y = y_values,
  penalty.factor = variable.list$penalty)
Some comments
Perhaps the confusion arises from the difference between the model's overall regularisation parameter (s or lambda) and the penalty.factors that you can apply to every coefficient. The latter allows for differential regularisation of individual parameters, whereas s controls the effect of the overall L1/L2 regularisation.
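For illustration (a minimal sketch reusing the mtcars x and y from above; the factor values are my own assumptions): penalty.factor takes one multiplier per predictor, while s is a single model-level penalty.
# Differential regularisation: wt is penalised, disp is left unpenalised (factor 0)
fit2 <- glmnet(x, y, penalty.factor = c(1, 0))
coef(fit2, s = 0.035)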
In coef the parameter s corresponds to the penalty parameter. In the help files:
s Value(s) of the penalty parameter lambda at which predictions are
required. Default is the entire sequence used to create the model.
[...]
With exact=TRUE, these different values of s are merged (and sorted)
with object$lambda, and the model is refit before predictions are
made. In this case, it is required to supply the original data x= and
y= as additional named arguments to predict() or coef(). The workhorse
predict.glmnet() needs to update the model, and so needs the data used
to create it. The same is true of weights, offset, penalty.factor,
lower.limits, upper.limits if these were used in the original call.
Failure to do so will result in an error.
Therefore, to use exact = TRUE you must supply your original penalties, x, y and any other arguments you passed in your original model call.

R rugarch simulation

I'd like to know the range of each parameter in the rugarch specification models.
For example, for the error distribution "nig" and the model "apARCH": what is the range of the parameters "skew" and "shape" related to the "nig" distribution, and of the parameters "gamma" and "delta" for the "apARCH" model?
This is my code example:
varianceModel = list(model="apARCH", garchOrder=c(1,1))
meanModel = list(armaOrder=c(1,1))
distributionModel = "nig"
fixedPars = list(mu=0, ar1 = 0.1, ma1= 0.9, omega=0.001, alpha1=0.1, beta1=0.8, gamma1 = 0.01, delta = 2, shape=1.5, skew = 0.2)
spec <- ugarchspec(variance.model = varianceModel,
mean.model= meanModel, distribution.model=distributionModel,
fixed.pars=fixedPars)
path.sgarch <- ugarchpath(spec, n.sim=1000, n.start=1, m.sim=20)
Now, for each of these parameters, how can I get the possible range or the "standard" values?
There doesn't seem to be a list of ranges of possible values of such parameters in the documentation of rugarch, while this introduction provides only some partial information.
Those ranges of possible values, however, are (at least should be) standard in the sense that they provide well-defined distributions and stationary models. Hence, you should be able to find all such ranges in some other sources.
However, regarding the distributions, there actually is a hidden source in rugarch that you can use: the source code of the rugarch:::.DistributionBounds function. For instance, it contains
if (distribution == "nig") {
  skew = 0.2
  skew.LB = -0.99
  skew.UB = 0.99
  shape = 0.4
  shape.LB = 0.01
  shape.UB = 25
}
meaning that the lower and upper bounds for skew are -0.99 and 0.99, respectively. To extract those numbers faster, you may use
rugarch:::.DistributionBounds("nig")[c("skew.LB", "skew.UB")]
# $skew.LB
# [1] -0.99
#
# $skew.UB
# [1] 0.99
Regarding the variance models, "simple" ranges such as -1 < gamma < 1 for APARCH are typically not available/not what you want, because they only allow the model to exist but don't guarantee stationarity. For instance, for GARCH(1,1) to be stationary we need alpha + beta < 1; hence, we actually have higher-dimensional constraints, not just intervals. As I said, you may find those online.
However, ugarchpath also checks those conditions by computing persistence(spec). Now, as you can see in
getMethod("persistence", signature(object = "uGARCHspec", pars = "missing",
distribution = "missing", model = "missing",
submodel="missing"))
there is a different way to compute this persistence for each specification. For instance, for APARCH we look at
rugarch:::.persistaparch1
# function (pars, idx, distribution = "norm")
# {
# alpha = pars[idx["alpha", 1]:idx["alpha", 2]]
# beta = pars[idx["beta", 1]:idx["beta", 2]]
# gamma = pars[idx["gamma", 1]:idx["gamma", 2]]
# delta = pars[idx["delta", 1]:idx["delta", 2]]
# skew = pars[idx["skew", 1]:idx["skew", 2]]
# shape = pars[idx["shape", 1]:idx["shape", 2]]
# ghlambda = pars[idx["ghlambda", 1]:idx["ghlambda", 2]]
# ps = sum(beta) + sum(apply(cbind(gamma, alpha), 1, FUN = function(x) x[2] *
# aparchKappa(x[1], delta, ghlambda, shape, skew, distribution)))
# return(ps)
# }
and the condition is that ps < 1. Notice that
rugarch:::.persistsgarch1
# function (pars, idx, distribution = "norm")
# {
# ps = sum(pars[idx["alpha", 1]:idx["alpha", 2]]) + sum(pars[idx["beta",
# 1]:idx["beta", 2]])
# return(ps)
# }
gives exactly alpha + beta in the case of GARCH(1,1), and ugarchpath then checks the aforementioned stationarity condition. Hence, the most straightforward thing you can do is to check whether persistence(spec) < 1 before simulating. For instance, in your example,
persistence(spec)
# [1] 0.8997927
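Putting this together, a minimal sketch of the guard (the stop message is my own wording): only simulate from stationary specifications.
# Simulate only when the stationarity condition ps < 1 holds
if (persistence(spec) < 1) {
  path.sgarch <- ugarchpath(spec, n.sim = 1000, n.start = 1, m.sim = 20)
} else {
  stop("Non-stationary specification: persistence(spec) >= 1")
}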

non-linear optimization in R using optim

I'm a newbie in R!
I would like to find the best gamma distribution parameters to fit my experimental counts data. The optim function's help file says the first argument of the function should be the parameters to be optimized. So I tried :
x = as.matrix(seq(1,20,0.1))
yexp = dgamma(x,2,1)*100 + rnorm(length(x),0,1)
f = function(p,x,yexp) {sum((p[1]*dgamma(x,p[2],scale=p[3]) - yexp)^2)}
mod = optim(c(50,2,1),f(p,x,yexp))
I get the error message :
Error in f(p, x, yexp) : object 'p' not found
Any hint where I'm wrong?
Supplementary question : Is there any other way to fit counts data with standard distribution (gamma, inverse gaussian, etc?)
optim expects its second argument to be a function. Also, the second and third arguments to f are fixed and need to be specified:
optim(c(50, 1, 2), f, x = x, yexp = yexp)
This would also work:
optim(c(50, 1, 2), function(p) f(p, x, yexp))
You could also use nls, which defaults to the Gauss-Newton algorithm:
nls(yexp ~ a * dgamma(x, sh, scale=sc), start = list(a = 50, sh = 2, sc = 1))
or with plinear in which case no starting value is needed for the first parameter:
nls(c(yexp) ~ dgamma(x, sh, scale=sc), start = list(sh = 2, sc = 1), alg = "plinear")
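For comparison (a sketch, assuming x, yexp and f as defined in the question), both routes expose their fitted parameters:
mod <- optim(c(50, 1, 2), f, x = x, yexp = yexp)
mod$par       # amplitude, shape, scale from optim
fit.nls <- nls(yexp ~ a * dgamma(x, sh, scale = sc), start = list(a = 50, sh = 2, sc = 1))
coef(fit.nls) # the same three parameters from nls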

Fitting a 3 parameter Weibull distribution

I have been doing some data analysis in R and I am trying to figure out how to fit my data to a 3-parameter Weibull distribution. I found out how to do it with a 2-parameter Weibull but have come up short for the 3-parameter case.
Here is how I fit the data using the fitdistr function from the MASS package:
y <- fitdistr(x[[6]], 'weibull')
x[[6]] is a subset of my data and y is where I am storing the result of the fitting.
First, you might want to look at the FAdist package. However, it is not so hard to go from rweibull3 to rweibull:
> rweibull3
function (n, shape, scale = 1, thres = 0)
thres + rweibull(n, shape, scale)
<environment: namespace:FAdist>
and similarly from dweibull3 to dweibull
> dweibull3
function (x, shape, scale = 1, thres = 0, log = FALSE)
dweibull(x - thres, shape, scale, log)
<environment: namespace:FAdist>
so we have this
> x <- rweibull3(200, shape = 3, scale = 1, thres = 100)
> fitdistr(x, function(x, shape, scale, thres)
dweibull(x-thres, shape, scale), list(shape = 0.1, scale = 1, thres = 0))
shape scale thres
2.42498383 0.85074556 100.12372297
( 0.26380861) ( 0.07235804) ( 0.06020083)
Edit: As mentioned in the comments, various warnings and errors can appear when trying to fit the distribution this way:
Error in optim(x = c(60.7075705026659, 60.6300379017397, 60.7669410153573, :
non-finite finite-difference value [3]
There were 20 warnings (use warnings() to see them)
Error in optim(x = c(60.7075705026659, 60.6300379017397, 60.7669410153573, :
L-BFGS-B needs finite values of 'fn'
In dweibull(x, shape, scale, log) : NaNs produced
For me at first it was only NaNs produced, and since this is not the first time I have seen that, I thought it wasn't very meaningful, given that the estimates were good. After some searching it seemed to be quite a common problem, but I could find neither a cause nor a solution. One alternative could be the stats4 package and its mle() function, but it seemed to have some problems too. Instead, I can offer a modified version of the code by danielmedic, which I have checked a few times:
thres <- 60
x <- rweibull(200, 3, 1) + thres
EPS = sqrt(.Machine$double.eps) # "epsilon" for very small numbers
llik.weibull <- function(shape, scale, thres, x)
{
  sum(dweibull(x - thres, shape, scale, log = TRUE))
}
thetahat.weibull <- function(x)
{
  if(any(x <= 0)) stop("x values must be positive")
  toptim <- function(theta) -llik.weibull(theta[1], theta[2], theta[3], x)
  mu = mean(log(x))
  sigma2 = var(log(x))
  shape.guess = 1.2 / sqrt(sigma2)
  scale.guess = exp(mu + (0.572 / shape.guess))
  thres.guess = 1
  res = nlminb(c(shape.guess, scale.guess, thres.guess), toptim, lower = EPS)
  c(shape = res$par[1], scale = res$par[2], thres = res$par[3])
}
thetahat.weibull(x)
shape scale thres
3.325556 1.021171 59.975470
An alternative: the "lmom" package, which estimates by the L-moments technique:
library(lmom)
thres <- 60
x <- rweibull(200, 3, 1) + thres
moments = samlmu(x, sort.data = TRUE)
log.moments <- samlmu( log(x), sort.data = TRUE )
weibull_3parml <- pelwei(moments)
weibull_3parml
zeta beta delta
59.993075 1.015128 3.246453
But I don't know how to compute goodness-of-fit statistics within this package or for the solution above. Other packages let you compute goodness-of-fit statistics easily. Anyway, you can use alternatives like ks.test or chisq.test, for example as sketched below.
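A sketch (my own addition): lmom's cdfwei evaluates the fitted three-parameter Weibull CDF, which ks.test accepts directly. Note the resulting p-value is only approximate, since the parameters were estimated from the same data being tested.
# KS test of x against the fitted 3-parameter Weibull (zeta, beta, delta from pelwei)
ks.test(x, cdfwei, para = weibull_3parml)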
