How to fit a Gumbel distribution in R?

I want to find an R package to fit the generalized extreme value distribution
(https://en.wikipedia.org/wiki/Generalized_extreme_value_distribution) with three unknown parameters mu, sigma, and xi.
I found two packages that can do inference for these three parameters based on maximum likelihood estimation.
library(ismev)
gev.fit(data)
and
library(extRemes)
fevd(data)
the output is estimates of mu, sigma, and xi.
But what if I just want to fit a distribution with the two parameters mu and sigma (i.e. the Gumbel distribution, where xi = 0)? How can I apply the above two packages? Or are there other packages that can do inference for the Gumbel distribution?

The evd package has 2-parameter [dpqr]gumbel functions that you can combine with any general-purpose optimization method (optim() is one such possibility, as suggested in the comments, but there are some shortcuts as suggested below).
Load packages, simulate example:
library(evd)
library(MASS)   # provides fitdistr()
set.seed(101)
x <- rgumbel(1000, loc = 2, scale = 2.5)
Make a more robust wrapper for dgumbel() that won't throw an error if we hand it a non-positive scale value (there are other ways to deal with this problem, but this one works):
dg <- function(x, loc, scale, log) {
  ## return NA instead of throwing an error for an invalid (non-positive) scale
  r <- try(dgumbel(x, loc, scale, log), silent = TRUE)
  if (inherits(r, "try-error")) return(NA)
  r
}
fitdistr(x, dg, start = list(loc = 1, scale = 1))
Results seem reasonable:
        loc        scale
  2.09220866   2.48122956
 (0.08261121) (0.06102183)
If you want more flexibility I would recommend the bbmle package (for possibly obvious reasons :-) )
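For instance, here is a sketch of the same fit via bbmle (my addition, not part of the answer above; it assumes x and evd::dgumbel() from the example), whose mle2() makes likelihood-profile confidence intervals easy:
library(bbmle)
nll <- function(loc, scale) {
  if (scale <= 0) return(Inf)   # guard against invalid scale values
  -sum(dgumbel(x, loc, scale, log = TRUE))
}
fit_bb <- mle2(nll, start = list(loc = 1, scale = 1), method = "Nelder-Mead")
coef(fit_bb)
confint(fit_bb)   # likelihood-profile confidence intervals
For completeness, a hedged note on the original question: both packages mentioned there also appear to handle the xi = 0 case directly, via ismev::gum.fit() and extRemes::fevd(..., type = "Gumbel").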

Related

How to normalize an lmer model?

lmer:
mixed.lmer6 <- lmer(Size ~ (Time + I(Time^2))*Country*STemperature +
                      (1 | Country:Locality) + (1 | Locality:Individual) +
                      (1 | Batch) + (1 | Egg_masses),
                    REML = FALSE, data = data_NoNA)
residuals:
plot_model(mixed.lmer6, type = "diag")  # plot_model() from the sjPlot package
I tried manual log, power, and sqrt transformations in my formula but saw no improvement, and I cannot find a suitable automatic transformation function in R such as Box-Cox (which does not work for lmer models).
Any help or tips would be appreciated.
This might be better suited for CrossValidated ("what should I do?" is appropriate for CV; "how should I do it?" is best for Stack Overflow), but I'll take a crack.
The Q-Q plot is generally the last/least important diagnostic you should look at (the order should be approximately (1) check for significant bias/missed patterns in the mean [fitted vs. residual, residual vs. covariates]; (2) check for outliers/influential points [leverage, Cook's distance]; (3) check for heteroscedasticity [scale-location plot]; (4) check distributional assumptions [Q-Q plot]). The reason is that any of the "upstream" failures (e.g. missed patterns) will show up in the Q-Q plot as well; resolving them will often resolve the apparent non-Normality.
If you can fix the distributional assumptions by fixing something else about the model (adding covariates/adding interactions/adding polynomial or spline terms/removing outliers), then do that.
You could code your own brute-force Box-Cox, something like:
fitted_model <- lmer(..., data = mydata)
bcfun <- function(lambda, resp = "y") {
  y <- mydata[[resp]]
  ## Box-Cox transform of the response (log when lambda == 0)
  mydata$newy <- if (lambda == 0) log(y) else (y^lambda - 1)/lambda
  ## log-Jacobian of the transformation; see
  ## https://stats.stackexchange.com/questions/261380/how-do-i-get-the-box-cox-log-likelihood-using-the-jacobian
  log_jac <- sum((lambda - 1)*log(y))
  newfit <- update(fitted_model, newy ~ ., data = mydata)
  ## -2 * log-likelihood, comparable across lambda values
  -2*(c(logLik(newfit)) + log_jac)
}
lambdavec <- seq(-2, 2, by = 0.2)
boxcox <- vapply(lambdavec, bcfun, FUN.VALUE = numeric(1))
plot(lambdavec, boxcox - min(boxcox))
(lightly tested! but feel free to let me know if it doesn't work)
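Once the profile is computed, a natural follow-up (my addition, standard Box-Cox practice): take the minimizing lambda, or any convenient round value within qchisq(0.95, 1) ≈ 3.84 of the minimum on the -2 log-likelihood scale:
lambdavec[which.min(boxcox)]                        # profile minimum
lambdavec[boxcox - min(boxcox) < qchisq(0.95, 1)]   # approximate 95% region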
if you do need to fit a mixed model with a heavy-tailed residual distribution (e.g. Student t), the options are fairly limited. The brms package can fit such models (but takes you down the Bayesian/MCMC rabbit hole), and the heavy package (currently archived on CRAN) will work, but doesn't appear to handle crossed random effects.
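For instance, a minimal brms sketch of the model above with Student-t residuals (my addition, assuming the same data and formula as mixed.lmer6; untested):
library(brms)
## Student-t residual distribution; the random-effects syntax carries over from lme4
fit_t <- brm(Size ~ (Time + I(Time^2))*Country*STemperature +
               (1 | Country:Locality) + (1 | Locality:Individual) +
               (1 | Batch) + (1 | Egg_masses),
             family = student(), data = data_NoNA)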

Fitting of Pearson Type III in R using fitdist

I have fitted many distributions to my data, but I am having difficulty fitting a Pearson type III distribution. I used the plotdist function to find starting (initial) values for fitting the distribution iteratively. The plots obtained from plotdist show a good fit to the data at the given starting values, but the fitdist function does not work and gives error code 100. I also studied the questions and answers available on Stack Overflow about fitting the log-Pearson type III distribution and applied that code, but again fitdist fails with error code 100. The data may be downloaded from the link after the code.
Lheadway <- pvr$headway + 0.0000001
m <- mean(Lheadway)
v <- var(Lheadway)
s <- sd(Lheadway)
g <- e1071::skewness(Lheadway, type = 1)
n <- length(Lheadway)
g <- g*(sqrt(n*(n - 1))/(n - 2))*(1 + 8.5/n)
my.shape <- (2/g)^2
my.scale <- sqrt(v)/sqrt(my.shape)*sign(g)  # modified as recommended by Carl Schwarz
my.location <- m - sqrt(v * my.shape)
my.param <- list(shape = my.shape, scale = my.scale, location = my.location)
dPIII <- function(x, shape, location, scale) PearsonDS::dpearsonIII(x, shape, location, scale, log = FALSE)
pPIII <- function(q, shape, location, scale) PearsonDS::ppearsonIII(q, shape, location, scale, lower.tail = TRUE, log.p = FALSE)
qPIII <- function(p, shape, location, scale) PearsonDS::qpearsonIII(p, shape, location, scale, lower.tail = TRUE, log.p = FALSE)
fitPIII <- fitdistrplus::fitdist(Lheadway, distr = "PIII", method = "mle", start = my.param)
plot(fitPIII)
The data is available at https://ptagovsa-my.sharepoint.com/:x:/g/personal/kkhan_tga_gov_sa/EfzCE5h0jexCkVw0Ak2S2_MBWf3WUywMd1izw41r0EsLeQ?e=EiqWDc
The fit works after changing the method from "mle" to "mse" or "mge" and removing the start = my.param argument from the call.
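In other words, something like this (a sketch of the fix just described, keeping the dPIII/pPIII/qPIII definitions above; I have not rerun it against the linked data):
fitPIII <- fitdistrplus::fitdist(Lheadway, distr = "PIII", method = "mge")  # or method = "mse"
plot(fitPIII)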

Parameter estimates using FME ODE model fitting in R

I have a system of ODE equations that I am trying to fit to generated data, synthetic or lab. The final product I am interested in is the parameter and its estimated error. We use the R package FME with modCost and modFit. As an example, a system of ODEs may be defined as:
eqs <- function(time, y, parms, ...) {
  with(as.list(c(parms, y)), {
    dP <- k2*PA - k1*A*P  # concentration of nucleic acid
    dA <- dP              # concentration of free protein
    dPA <- -dP
    list(c(dA, dP, dPA))
  })
}
with parameters k1 and k2 and variables A, P, and PA. I import the data (not shown) and define the cost function used in modFit:
cost <- function(p, data, ...) {
  yy <- p[c("A", "P", "PA")]     # initial state values
  pp <- p[c("k1", "k2")]         # rate parameters
  out <- ode(yy, time, eqs, pp)  # `time` is assumed to exist in the calling scope
  modCost(out, data, ...)
}
I set some initial conditions with a parms vector and then do the fitting with:
fit <- modFit(f = cost, p = parms, data = dat, weight = "std",
              lower = rep(0, 5), upper = c(600, 100, 600, 0.01, 0.01),
              method = "Marq")
I then run a final ode() with the best-fit parameters, Bob's your uncle, and boom: estimated parameters. The input numbers don't matter; I hope my process outline is legible to those who use this package.
My issue and question centers on this: I'm a scientist, a physicist, and the error of the estimated parameters is important to report. Can I generate the estimated error from FME somehow, or is there a separate package for that kind of output?
I don't get your point. You can just use:
summary(fit)
to see the Std. Error.
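For example (a hedged sketch, my addition: FME's summary.modFit() is modeled on summary.nls, and the standard errors come from the Hessian that modFit() returns):
s <- summary(fit)
s$par   # Estimate, Std. Error, t value, Pr(>|t|); component name assumed, check str(s)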

Maximum Likelihood Estimation by hand for normal distribution in R

I am a newbie in R and have searched several forums but haven't got an answer so far. We are asked to do maximum likelihood estimation in R for an AR(1) model without using the arima() command. We should estimate the intercept alpha, the coefficient rho, and the variance sigma2. The data should follow a normal distribution, from which I derived the log-likelihood function. I then tried to program the function with the following code:
Y <- data$V2
nlogL <- function(theta, Y){
  alpha <- theta[1]
  rho <- theta[2]
  sigma2 <- theta[3]
  logl <- -(100/2)*log(2*pi) - (100/2)*log(theta[3]) -
    (0.5*theta[3])*sum(Y - (theta[1]/(1 - theta[2]))**2)
  return(-logl)
}
par0 <- c(0.1, 0.1, 0.1)
opt <- optim(par0, nlogL, hessian = TRUE)
When running this code I always get the error message: Error in Y - (theta[1]/(1 - theta[2]))^2 : 'Y' is missing.
It would be great if you could have a look whether the likelihood function is derived correctly.
Thank you very much in advance for your help!
Your nlogL function should only take a single argument, theta. So you can fix your immediate problem simply by removing the 2nd argument to the function, and the Y variable would be resolved by its definition outside of nlogL. Alternatively, you could keep the signature of nlogL as-is and pass Y as an additional argument through optim like this: optim(par0, nlogL, hessian = TRUE, Y=Y). Also I would second chinsoon12's suggestion to review ?optim.
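As for whether the likelihood is derived correctly: not quite. Here is a hedged sketch (my addition, not from the original answer) of a conditional AR(1) negative log-likelihood, assuming the model Y[t] = alpha + rho*Y[t-1] + eps[t] with eps[t] ~ N(0, sigma2), conditioning on the first observation:
nlogL <- function(theta) {
  alpha  <- theta[1]
  rho    <- theta[2]
  sigma2 <- theta[3]
  if (sigma2 <= 0) return(Inf)            # keep optim() in the valid region
  n <- length(Y)
  resid <- Y[-1] - (alpha + rho*Y[-n])    # one-step-ahead prediction errors
  -(-((n - 1)/2)*log(2*pi*sigma2) - sum(resid^2)/(2*sigma2))
}
opt <- optim(c(0.1, 0.1, 0.1), nlogL, hessian = TRUE)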

Weighted Portmanteau Test for Fitted GARCH process

I have fitted a GARCH process to a time series and analyzed the ACF of the squared and absolute residuals to check the model's goodness of fit. But I also want to run a formal test, and after searching the internet, the Weighted Portmanteau Test (originally by Li and Mak) seems to be the one.
It's from the WeightedPortTest package and is one of the few (perhaps the only one?) that properly tests the GARCH residuals.
While going through the instructions in various documents, I can't wrap my head around what the "h.t" argument wants. The R help says I need to supply "a numeric vector of the conditional variances". This may be simple to an experienced user, but I'm struggling to understand. What do I need to do, and preferably how would I code it in R?
Thankful for any kind of help!
Taken directly from the documentation:
h.t: a numeric vector of the conditional variances
A little toy example using the fGarch package follows:
library(fGarch)
library(WeightedPortTest)
spec <- garchSpec(model = list(alpha = 0.6, beta = 0))
simGarch11 <- garchSim(spec, n = 300)
fit <- garchFit(formula = ~ garch(1, 0), data = simGarch11)
Weighted.LM.test(fit@residuals, fit@h.t, lag = 10)
And using garch() from the tseries package:
library(tseries)
fit2 <- garch(as.numeric(simGarch11), order = c(0, 1))
summary(fit2)
# comparison of fitted values:
tail(fit2$fitted.values[,1]^2)
tail(fit@h.t)
# comparison of residuals after unstandardizing:
unstd <- fit2$residuals*fit2$fitted.values[,1]
tail(unstd)
tail(fit@residuals)
Weighted.LM.test(unstd, fit2$fitted.values[,1]^2, lag = 10)
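A small usage note (my addition): Weighted.LM.test() should return a standard htest object, so the statistic and p-value can be pulled out directly:
lm_test <- Weighted.LM.test(fit@residuals, fit@h.t, lag = 10)
lm_test$statistic
lm_test$p.value   # a small p-value suggests remaining ARCH structure in the residuals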
