Fitting Weibull using method of moments in R

I am trying to fit a Weibull distribution to my data using the method of moments in RStudio.
I don't know which commands and packages are needed to fit distributions such as the Weibull or Pareto. Specifically, I am trying to estimate the shape parameter k and the scale parameter λ.
I use this code to generate my data:
a <- rweibull(100, 10, 1)

Here is a function to estimate the Weibull distribution parameters with the method of moments.
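weibull_mom <- function(x, interval){
  # Moment equation in the shape parameter only, written on the log scale:
  # log(E[X^2]) - 2*log(E[X]) for a Weibull, minus the same quantity from the sample.
  mom <- function(shape, x, xbar){
    s2 <- var(x, na.rm = TRUE)
    lgamma(1 + 2/shape) - 2*lgamma(1 + 1/shape) - log(xbar^2 + s2) + 2*log(xbar)
  }
  xbar <- mean(x, na.rm = TRUE)
  # Solve for the shape first, then recover the scale from the sample mean.
  shape <- uniroot(mom, interval = interval, x = x, xbar = xbar)$root
  scale <- xbar/gamma(1 + 1/shape)
  list(shape = shape, scale = scale)
}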
set.seed(2021) # Make the results reproducible
a <- rweibull(100, 10, 1)
weibull_mom(a, interval = c(1, 1e6))
#$shape
#[1] 9.006623
#
#$scale
#[1] 0.9818155
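For reference, the moment conditions solved by weibull_mom can be written out as follows. This is a sketch based on the standard Weibull moment formulas, with \bar{x} and s^2 denoting the sample mean and variance, k the shape and \lambda the scale:

```latex
\begin{aligned}
E[X] &= \lambda\,\Gamma\!\left(1 + \tfrac{1}{k}\right), \qquad
E[X^2] = \lambda^2\,\Gamma\!\left(1 + \tfrac{2}{k}\right), \\
\frac{\Gamma(1 + 2/k)}{\Gamma(1 + 1/k)^2} &= \frac{\bar{x}^2 + s^2}{\bar{x}^2}
\quad\text{(matching } E[X^2]/E[X]^2 \text{ to the sample)}, \\
0 &= \log\Gamma\!\left(1 + \tfrac{2}{k}\right) - 2\log\Gamma\!\left(1 + \tfrac{1}{k}\right)
     - \log\!\left(\bar{x}^2 + s^2\right) + 2\log\bar{x}, \\
\hat{\lambda} &= \frac{\bar{x}}{\Gamma\!\left(1 + 1/\hat{k}\right)}.
\end{aligned}
```

The third line is the function passed to uniroot, and the last line is the scale computed from the resulting shape.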
The maximum likelihood estimates are
MASS::fitdistr(a, "weibull")
#      shape        scale
#   8.89326148   0.98265852
#  (0.69944224) (0.01165359)
#Warning messages:
#1: In densfun(x, parm[1], parm[2], ...) : NaNs produced
#2: In densfun(x, parm[1], parm[2], ...) : NaNs produced
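As a sanity check, the moment estimates should reproduce the sample mean and variance when plugged back into the Weibull moment formulas. The sketch below assumes a compact re-implementation of the same idea; the name weibull_mom_check, the default interval and the tolerance are illustrative choices, not part of the answer above.

```r
# Compact method-of-moments fit (hypothetical helper mirroring weibull_mom).
weibull_mom_check <- function(x, interval = c(1, 1e6)) {
  xbar <- mean(x)
  s2 <- var(x)
  f <- function(k) {
    lgamma(1 + 2/k) - 2*lgamma(1 + 1/k) - log(xbar^2 + s2) + 2*log(xbar)
  }
  k <- uniroot(f, interval = interval, tol = 1e-10)$root
  c(shape = k, scale = xbar / gamma(1 + 1/k))
}

set.seed(2021)
a <- rweibull(100, 10, 1)
est <- weibull_mom_check(a)

# Mean and variance implied by the estimated shape and scale
m_hat <- est[["scale"]] * gamma(1 + 1/est[["shape"]])
v_hat <- est[["scale"]]^2 *
  (gamma(1 + 2/est[["shape"]]) - gamma(1 + 1/est[["shape"]])^2)

# Both gaps should be close to zero, up to the root-finding tolerance
c(mean_gap = m_hat - mean(a), var_gap = v_hat - var(a))
```

If either gap is not close to zero, the search interval probably did not bracket the root.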

Related

FME package: "Error in cov2cor(x$cov.unscaled) : 'V' is not a square numeric matrix" in fitting using modFit()

I'm trying to fit the differential equation using the least squares method (FME package).
However, I keep getting this error that I don't know how to tackle.
The reproducible example:
library(FME) # also attaches deSolve, which provides ode()
times = seq(0, 4, by = 0.5)
dat = data.frame(time = seq(1,4),
                 Tick = c(128, 52.5, 28, 121))
N = 10
tick.model <- function(time, y, params, ...) { # arguments: time, state vector, parameters
  with(as.list(c(y, params)),{
    dTick <- (30 - s.t*Tick)*Tick*0.3*N - delta.t*Tick
    return(list(c(dTick)))
  })
}
y = c(Tick = 82.375)
cost1 <- function(p) {
  out <- ode(y, times, tick.model, p)
  modCost(out, dat, weight = "none")
}
params <- c(s.t=0.1, delta.t = 1)
fit = modFit(f = cost1, p = params, lower = rep(0,2),
             upper = c(10, 5))
summary(fit)
The result comes out like this:
Parameters:
         Estimate Std. Error t value Pr(>|t|)
s.t     0.3641876         NA      NA       NA
delta.t 0.0001417         NA      NA       NA
Residual standard error: 60.92 on 2 degrees of freedom
Error in cov2cor(x$cov.unscaled) : 'V' is not a square numeric matrix
In addition: Warning message:
In summary.modFit(fit) : Cannot estimate covariance; system is singular
Also, the fitted model doesn't look good.
I have no idea what I could have done wrong.

GARCH estimation using maximum likelihood

I'm trying to estimate a GARCH (1,1) model using maximum likelihood with simulated data. This is what I got:
library(fGarch)
set.seed(1)
garch11<-garchSpec(model = list())
x<-garchSim(garch11, n = 1000)
y <- t(x)
r <- y[1, ]
### Calculate Residuals
CalcResiduals <- function(theta, r)
{
  n <- length(r)
  omega <- theta[1]
  alpha11 <- theta[2]
  beta11 <- theta[3]
  sigma.sqs <- vector(length = n)
  sigma.sqs[1] <- 0.02
  for (i in 1:(n-1)){
    sigma.sqs[i+1] <- omega + alpha11*(r[i]^2) + beta11*sigma.sqs[i]
  }
  return(list(et=r, ht=sigma.sqs))
}
### Calculate the log-likelihood
GarchLogl <- function(theta, r){
  res <- CalcResiduals(theta,r)
  sigma.sqs <- res$ht
  r <- res$et
  return(-sum(dnorm(r[-1], mean = 0, sd = sqrt(sigma.sqs[-1]), log = TRUE)))
}
fit2 <- nlm(GarchLogl,       # function to minimize
            p = rep(1,3),    # initial values = 1 for all parameters
            hessian = FALSE, # do not return the Hessian matrix
            r = r,           # data to be used
            iterlim = 500)   # maximum number of iterations
Unfortunately, I get the following warnings and no results:
There were 50 or more warnings (use warnings() to see the first 50)
1: In sqrt(sigma.sqs[-1]) : NaNs produced
2: In nlm(GarchLogl, p = rep(1, 3), hessian = FALSE, data <- r, ... :
NA/Inf replaced by maximum positive value
Do you have any idea what's wrong with my code? Thanks a lot!

Errors while trying to fit gamma distribution with R fitdistr{MASS}

I have a problem with the fitdistr{MASS} function in R. I have this vector:
a <- c(26,73,84,115,123,132,159,207,240,241,254,268,272,282,300,302,329,346,359,367,375,378, 384,452,475,495,503,531,543,563,594,609,671,687,691,716,757,821,829,885,893,968,1053,1081,1083,1150,1205,1262,1270,1351,1385,1498,1546,1565,1635,1671,1706,1820,1829,1855,1873,1914,2030,2066,2240,2413,2421,2521,2586,2727,2797,2850,2989,3110,3166,3383,3443,3512,3515,3531,4068,4527,5006,5065,5481,6046,7003,7245,7477,8738,9197,16370,17605,25318,58524)
and I want to fit a gamma distribution to the data with the command:
fitted.gamma <- fitdistr(a, "gamma")
but I get this error:
Error in optim(x = c(26, 73, 84, 115, 123, 132, 159, 207, 240, 241, 254, :
non-finite finite-difference value [1]
In addition: Warning messages:
1: In densfun(x, parm[1], parm[2], ...) : NaNs produced
2: In densfun(x, parm[1], parm[2], ...) : NaNs produced
3: In densfun(x, parm[1], parm[2], ...) : NaNs produced
4: In densfun(x, parm[1], parm[2], ...) : NaNs produced
So I tried initializing the parameters:
(fitted.gamma <- fitdistr(a, "gamma", start=list(1,1)))
The object fitted.gamma is created, but printing it raises an error:
Error in dn[[2L]] : subscript out of bounds
Do you know what is happening or maybe know some other R functions to fit univariate distributions by MLE?
Thanks in advance for any help or response.
Kuba
Always plot your data first; your scaling is far off.
library(MASS)
a <- c(26,73,84,115,123,132,159,207,240,241,254,268,272,282,300,302,329,346,359,367,375,378, 384,452,475,495,503,531,543,563,594,609,671,687,691,716,757,821,829,885,893,968,1053,1081,1083,1150,1205,1262,1270,1351,1385,1498,1546,1565,1635,1671,1706,1820,1829,1855,1873,1914,2030,2066,2240,2413,2421,2521,2586,2727,2797,2850,2989,3110,3166,3383,3443,3512,3515,3531,4068,4527,5006,5065,5481,6046,7003,7245,7477,8738,9197,16370,17605,25318,58524)
## Oops, rather wide
plot(hist(a))
fitdistr(a/10000,"gamma") # gives warnings
# No warnings
fitted.gamma <- fitdistr(a/10000, dgamma, start=list(shape = 1, rate = 0.1),lower=0.001)
Now you can decide what to do with the scaling
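One way to handle the scaling is to fit on the rescaled data and convert the rate back afterwards: if a/c follows a gamma distribution with a given shape and rate, then a follows a gamma distribution with the same shape and the rate divided by c. A minimal sketch of that identity follows; shape_hat, rate_hat and the evaluation points are made-up values for illustration, not fitted results.

```r
c_scale   <- 10000   # divisor applied to the data before fitting
shape_hat <- 0.6     # hypothetical estimate on the rescaled data
rate_hat  <- 2.0     # hypothetical estimate on the rescaled data

x <- c(26, 500, 2500, 10000, 58524)   # a few values on the original scale

# Density on the original scale via the change of variables x -> x / c_scale
dens_rescaled <- dgamma(x / c_scale, shape = shape_hat, rate = rate_hat) / c_scale

# Same density, parameterised directly on the original scale
dens_original <- dgamma(x, shape = shape_hat, rate = rate_hat / c_scale)

all.equal(dens_rescaled, dens_original)
```

The shape is unaffected by the rescaling, so only the rate (or, equivalently, the scale) needs converting.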
For data that clearly fits the gamma distribution, but is on the wrong scale (i.e., as if it had been multiplied/divided by a large number), here's an alternative approach to fitting the gamma distribution:
fitgamma <- function(x) {
  # Equivalent to `MASS::fitdistr(x, densfun = "gamma")`, where x are first rescaled to
  # the appropriate scale for a gamma distribution. Useful for fitting the gamma distribution to
  # data which, when multiplied by a constant, follows this distribution
  if (!requireNamespace("MASS")) stop("Requires MASS package.")
  fit <- glm(formula = x ~ 1, family = Gamma)
  out <- MASS::fitdistr(x * coef(fit), "gamma")
  out$scaling_multiplier <- unname(coef(fit))
  out
}
Usage:
set.seed(40)
test <- rgamma(n = 100, shape = 2, rate = 2)*50000
MASS::fitdistr(test, "gamma") # fails
dens_fit <- fitgamma(test) # success
curve(dgamma(x, 2, 2), to = 5) # true distribution
curve(dgamma(x, dens_fit$estimate['shape'], dens_fit$estimate['rate']), add=TRUE, col=2) # best guess
lines(density(test * dens_fit$scaling_multiplier), col = 3)
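The scaling multiplier in fitgamma comes from an intercept-only Gamma GLM. With the default inverse link, that intercept should equal 1/mean(x), so multiplying by it rescales the data to mean 1. The rate on the original scale would then be the fitted rate times the multiplier. A quick check of the first claim, where the simulated data and the tolerance are assumptions made for illustration:

```r
set.seed(40)
test <- rgamma(n = 100, shape = 2, rate = 2) * 50000

# Intercept-only Gamma GLM; the default link for the Gamma family is the inverse link
fit <- glm(test ~ 1, family = Gamma)
multiplier <- unname(coef(fit))

# The intercept is the reciprocal of the sample mean (up to the IRLS tolerance)
all.equal(multiplier, 1 / mean(test), tolerance = 1e-6)

# So the rescaled data have a mean of approximately 1
mean(test * multiplier)
```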

Generalised Pareto Distribution MLE R code

I've written a function to calculate the MLE estimates of a Generalised Pareto Distribution. When I use it with any data though I'm getting errors like this
1: In log(beta * ksi) : NaNs produced
2: In nlm(loglik, theta, stepmax = 5000, iterlim = 1000) :
NA/Inf replaced by maximum positive value
I was wondering if anyone could spot any mistakes with my code?
MLGPD <- function(data){
  xi0 <- 1
  beta0 <- 360
  theta <- c(xi0, beta0)
  excess <- data
  assign("tmp", excess)
  loglik <- function(theta){
    ksi <- theta[1]
    beta <- theta[2]
    y <- ((tmp - 0.1)/beta)
    f <- ((1/ksi)+1)*sum(log(1+y)) + length(tmp) * log(beta*ksi)
    f
  }
  fit <- nlm(loglik, theta, stepmax = 5000, iterlim= 1000)
  return(fit)
  par.ests <- fit$x
  return(par.ests)
}
#Checking our MLE algorithm works:
rgpd <- function(n, ksi, beta){
  10000+beta*(((1-runif(n, min=0, max=1))^-ksi)-1)
}
rgpd1 <- rgpd(100, 1, 2.5)
MLGPD(rgpd1)
Thanks!

Fitting a 3 parameter Weibull distribution

I have been doing some data analysis in R and I am trying to figure out how to fit my data to a 3 parameter Weibull distribution. I found how to do it with a 2 parameter Weibull but have come up short in finding how to do it with a 3 parameter.
Here is how I fit the data using the fitdistr function from the MASS package:
y <- fitdistr(x[[6]], 'weibull')
x[[6]] is a subset of my data and y is where I am storing the result of the fitting.
First, you might want to look at the FAdist package. However, it is not hard to go from rweibull3 to rweibull:
> rweibull3
function (n, shape, scale = 1, thres = 0)
thres + rweibull(n, shape, scale)
<environment: namespace:FAdist>
and similarly from dweibull3 to dweibull
> dweibull3
function (x, shape, scale = 1, thres = 0, log = FALSE)
dweibull(x - thres, shape, scale, log)
<environment: namespace:FAdist>
so we have this
> x <- rweibull3(200, shape = 3, scale = 1, thres = 100)
> fitdistr(x, function(x, shape, scale, thres)
    dweibull(x-thres, shape, scale), list(shape = 0.1, scale = 1, thres = 0))
      shape          scale          thres
    2.42498383     0.85074556   100.12372297
 (  0.26380861) (  0.07235804) (  0.06020083)
Edit: As mentioned in the comment, various warnings and errors can appear when fitting the distribution this way:
Error in optim(x = c(60.7075705026659, 60.6300379017397, 60.7669410153573, :
non-finite finite-difference value [3]
There were 20 warnings (use warnings() to see them)
Error in optim(x = c(60.7075705026659, 60.6300379017397, 60.7669410153573, :
L-BFGS-B needs finite values of 'fn'
In dweibull(x, shape, scale, log) : NaNs produced
At first I only got "NaNs produced", and since I had seen that warning before and the estimates were good, I assumed it was not meaningful. After some searching, it appears to be a fairly common problem, and I could find neither a cause nor a solution. One alternative could be the mle() function from the stats4 package, but it seemed to have some problems too. What I can offer is a modified version of code by danielmedic, which I have checked a few times:
thres <- 60
x <- rweibull(200, 3, 1) + thres
EPS = sqrt(.Machine$double.eps) # "epsilon" for very small numbers
# Log-likelihood of the shifted (three-parameter) Weibull
llik.weibull <- function(shape, scale, thres, x)
{
  sum(dweibull(x - thres, shape, scale, log = TRUE))
}
thetahat.weibull <- function(x)
{
  if(any(x <= 0)) stop("x values must be positive")
  # Negative log-likelihood as a function of theta = (shape, scale, thres)
  toptim <- function(theta) -llik.weibull(theta[1], theta[2], theta[3], x)
  # Starting values based on the moments of log(x)
  mu = mean(log(x))
  sigma2 = var(log(x))
  shape.guess = 1.2 / sqrt(sigma2)
  scale.guess = exp(mu + (0.572 / shape.guess))
  thres.guess = 1
  res = nlminb(c(shape.guess, scale.guess, thres.guess), toptim, lower=EPS)
  c(shape=res$par[1], scale=res$par[2], thres=res$par[3])
}
thetahat.weibull(x)
    shape     scale     thres
 3.325556  1.021171 59.975470
An alternative is the "lmom" package, which estimates the parameters by the L-moments technique:
library(lmom)
thres <- 60
x <- rweibull(200, 3, 1) + thres
moments = samlmu(x, sort.data = TRUE)
log.moments <- samlmu( log(x), sort.data = TRUE )
weibull_3parml <- pelwei(moments)
weibull_3parml
     zeta      beta     delta
59.993075  1.015128  3.246453
However, I don't know how to compute goodness-of-fit statistics with this package or with the solution above. Other packages make goodness-of-fit statistics easy. In any case, you can use alternatives such as ks.test or chisq.test.
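As a rough illustration of the ks.test route for a three-parameter Weibull, the threshold can be subtracted first so that the shifted data are compared against an ordinary two-parameter Weibull. The parameter values below are placeholders standing in for estimates from either approach above. Note that the Kolmogorov-Smirnov p-value assumes the parameters were not estimated from the same data, so it tends to be optimistic when they were.

```r
set.seed(123)
thres <- 60
x <- rweibull(200, shape = 3, scale = 1) + thres

# Placeholder "estimates" (here simply the true simulation values)
est <- c(shape = 3, scale = 1, thres = 60)

# Shift by the threshold, then test against a two-parameter Weibull
ks <- ks.test(x - est[["thres"]], "pweibull",
              shape = est[["shape"]], scale = est[["scale"]])
ks$statistic
ks$p.value
```

A large statistic (small p-value) would suggest that the shifted Weibull with these parameters describes the data poorly.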
