optim() not giving correct minima in R

I'm using optim to try and find the critical region in a binomial test; however, after a certain sample size it fails to converge on the correct value.
It seems like the function is well behaved, so I'm not sure why it stops working at this point.
N <- 116
optim(1, function(x) abs(1 - pbinom(x, N, 0.1) - 0.05), method = "Brent", lower = 1, upper = N)
The optim function as above works for N < 116.

You should probably use the built-in qbinom function, which computes the quantile (inverse CDF) function of the binomial distribution; it works fine for any reasonable value of N.
N <- 116
qbinom(0.95, size = N, prob = 0.1)
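As a quick sanity check (a sketch of mine, not from the original answer), the value returned is by definition the smallest x whose cumulative probability reaches 0.95:
N <- 116
x <- qbinom(0.95, size = N, prob = 0.1)
pbinom(x - 1, size = N, prob = 0.1)  # still below 0.95
pbinom(x,     size = N, prob = 0.1)  # at or above 0.95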
The function is not well-behaved from an optimization point of view: it is piecewise constant.

The gradient at your starting point is almost 0 and the algorithm cannot move to the next best solution.
One way is to use another starting point:
optim(0.1*N, function(x) abs(1 - pbinom(x, N, 0.1) - 0.05), method = "Brent", lower = 1, upper = N)
or to use optimize, since the problem is one-dimensional:
optimize(function(x) abs(1 - pbinom(x, N, 0.1) - 0.05), c(1,N))
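To see why Brent stalls at the original starting point, it can help to plot the objective (a quick sketch of mine, not from the answers above): it only changes at integer values of x, so it is flat almost everywhere, including around x = 1.
N <- 116
obj <- function(x) abs(1 - pbinom(x, N, 0.1) - 0.05)
# Step function: flat between integers, with a kink where the tail probability crosses 0.05
curve(obj, from = 1, to = 30, n = 2001, xlab = "x", ylab = "objective")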

Related

Integral and numeric optimization (nlminb) R

I am having issues with an optimization problem involving numerical estimation of an integral which contains an unknown variable.
Numerically estimating an integral is simple enough: just use the integrate function in R. I am trying to estimate a rather unpleasant integral which requires optimization, since it contains an unknown variable and a constraint. I am using the nlminb function but the result is highly incorrect. The idea is to evaluate the integral subject to the constraint that it is smaller than or equal to 1 - l, where l is between 0 and 1.
The code is the following:
integrand <- function(x, p) {
  dnorm(x, 0, 1) * (1 - dnorm((qnorm(p) - sqrt(0.12)*x)/sqrt(1 - 0.12), 0, 1))^800
}
and it is the variable p which is unknown.
The objective function to be minimised is the following:
objective <- function(p){
  PoD <- integrate(integrand, lower = -Inf, upper = Inf, p = p)$value
  PoD - 0.5
}
test <- nlminb(0.015, objective = objective, lower = 0, upper = 1)$par*100
Edited to reflect mistakes in the objective function and the integral.
Same issue still remains.
I think my mistake is not specifying which variable to minimise. The optimisation just gives the starting value in nlminb multiplied by 100.
The authors of the paper used dummy variables and showed that l = 0.5 should give p = 0.15%.
Thank you for your time.
Of course, since your objective function (as originally posted) did not depend on p: p was never passed on to integrate(), so the optimiser just returns the starting value. Pass it through:
integrand <- function(x, p) {
  dnorm(x, 0, 1) * (1 - dnorm((qnorm(p) - sqrt(0.12)*x)/sqrt(1 - 0.12), 0, 1))^800
}
objective <- function(p){
  PoD <- integrate(integrand, lower = -Inf, upper = Inf, p = p)$value
  PoD - 0.5
}
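Not part of the original answer, but once p is passed through, the nlminb call from the question can be reused. One tweak worth considering: since the goal is PoD(p) = 1 - l = 0.5, minimising the squared difference (or finding the root with uniroot) targets that equality directly, whereas minimising PoD - 0.5 itself would simply drive PoD as low as possible. A sketch:
# Squared difference: minimised exactly where PoD(p) = 0.5.
objective_sq <- function(p) {
  PoD <- integrate(integrand, lower = -Inf, upper = Inf, p = p)$value
  (PoD - 0.5)^2
}
# lower bound kept just above 0 so that qnorm(p) stays finite
test <- nlminb(0.015, objective = objective_sq, lower = 1e-6, upper = 1)$par * 100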

Nested integration for incomplete convolution of gauss densities

Let g(x) = 1/sqrt(2*pi) * exp(-x^2/2) be the density of the normal distribution with mean 0 and standard deviation 1. In some calculation on paper, "incomplete" convolutions of g with itself appeared, i.e. integrals of the form
f_1 = g,   f_n(x) = integral from -c to c of g(x - y) * f_{n-1}(y) dy,
where c > 0 is a positive number.
Since I could not evaluate this by hand, I had the idea to approximate it and plot it. I tried this in R, because R provides the dnorm function and a function to do integrals.
You can see that I need to integrate numerically n times, where n shall be chosen by the call of a plot function. My code has a for-loop to create those "incomplete" convolutions iteratively.
For example, even with n = 3 and c = 1 this gives me an error; n = 2 (i.e. a single integration) works.
N <- 3
ngauss <- function(x) dnorm(x, mean = 0, sd = 1)

convoluts <- list()
convoluts[[1]] <- ngauss

for (i in 2:N) {
  h <- function(y) {
    g <- function(z) {ngauss(y - z) * convoluts[[i - 1]](z)}
    return(integrate(g, lower = -1, upper = 1)$value)
  }
  h <- Vectorize(h)
  convoluts[[i]] <- h
}

convoluts[[3]](0)
What I get is:
Error: evaluation nested too deeply: infinite recursion /
options(expressions=)?
I understand that this is a hard computation, but for "small" n something similar should be possible.
Maybe someone can help me fix my code or recommend how I can implement this in a better way. Another language that is more appropriate for this would also be okay.
The issue is in how the closures created inside the loop handle variables from their enclosing environment: every h captures the loop variable i itself rather than its value at that iteration, so convoluts[[i - 1]] ends up referring to the same function and recurses forever. Instead using
h <- evalq(function(y) {
  g <- function(z) {ngauss(y - z) * convoluts[[i - 1]](z)}
  integrate(g, lower = -1, upper = 1)$value
}, list(i = i))
does the job and, say, setting N <- 6 quickly gives
convoluts[[N]](0)
# [1] 0.03423872
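For completeness, a sketch of how this fix slots into the full loop from the question (same setup, same [-1, 1] integration limits):
N <- 6
ngauss <- function(x) dnorm(x, mean = 0, sd = 1)

convoluts <- list()
convoluts[[1]] <- ngauss

for (i in 2:N) {
  # evalq fixes the current value of i in the closure's environment, so each
  # convoluts[[i]] refers to the previous element rather than to itself
  h <- evalq(function(y) {
    g <- function(z) ngauss(y - z) * convoluts[[i - 1]](z)
    integrate(g, lower = -1, upper = 1)$value
  }, list(i = i))
  convoluts[[i]] <- Vectorize(h)
}

convoluts[[N]](0)  # 0.03423872, the value reported above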
As your integration is simply the pdf of a sum of N independent standard normals (which then follows N(0, N)), we may also verify this approach by setting lower = -Inf and upper = Inf. Then with N <- 4 we have
dnorm(0, sd = sqrt(N))
# [1] 0.1994711
convoluts[[N]](0)
# [1] 0.1994711
So, for practical purposes, when c = Inf, you are way better off using dnorm rather than manual computations.

R: conditional expected value

Hello everybody (this is my first post in here)!
I'm having a problem with finding the conditional expected value for a given distribution.
Suppose that we need to find E( x | x > 0.5 ), where x has a GEV (generalised extreme value) distribution with density dgev(x, xi, sigma, mu). What I was trying to do was
library(evir)
func1 <- function(x) {x*dgev(x, xi, sigma, mu)}
integral <- integrate(func1, lower = 0.5, upper = 10000, subdivisions = 10000)
cond.exp.val <- as.numeric(integral[1])/(1-q)
where q is the value that gives qgev(q, xi, sigma, mu) = 0.5, used for normalisation.
The result greatly depends on the 'upper' parameter of the integrate() function, and for higher values of this parameter the integral diverges. As my distribution parameters are
xi <- 0.81
sigma <- 0.0067
mu <- 0.0072
this integration should be feasible and convergent. Do you have any idea what I am doing wrong, or is there a built-in R function that can calculate such a conditional expected value?
Generally, you are advised to use Inf rather than a large number when integrating the right tail of a density. See details in ?integrate. I took your description of q as being a value obtained by iteration and I stopped when I got within 4 decimal places of 0.5 using q <- 0.99315:
qgev(.99315, xi, sigma, mu)
[1] 0.4998413
You also used extraction from your integral variable incorrectly. You should use either "[[" or "$" when working with lists:
func1 <- function(x) {x*dgev(x, xi, sigma, mu)}
integral <- integrate(func1, lower = 0.5, upper = Inf, subdivisions = 10000)
(cond.exp.val <- integral[[1]]/(1-.99315)) # `as.numeric` not needed
#[1] 2.646068
I have concerns that your description of how to get q was misleading, since values above 1 should not be an expectation derived from a statistical PDF.
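One more note, not from the original answer: if q is meant to be P(X <= 0.5), it can be computed directly with pgev instead of iterating on qgev. A minimal sketch, keeping the question's parameter values and argument order:
library(evir)
xi <- 0.81; sigma <- 0.0067; mu <- 0.0072

# Normalising probability taken directly from the GEV CDF: q = P(X <= 0.5).
# Argument order mirrors the question's dgev/qgev calls; check ?pgev for the
# order expected by your version of evir.
q <- pgev(0.5, xi, sigma, mu)

func1 <- function(x) x * dgev(x, xi, sigma, mu)
integral <- integrate(func1, lower = 0.5, upper = Inf, subdivisions = 10000)
cond.exp.val <- integral$value / (1 - q)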

Optimizing the code for error minimization

I have written the code below to minimise the error by changing the value of alpha (using an iteration method).
set.seed(16)
npoints = 10000
Y = round(runif(npoints), 3)
OY = sample(c(0, 1, 0.5), npoints, replace = T)

minimizeAlpha = function(Y, OY, alpha) {
  PY = alpha * Y
  error = OY - PY
  squaredError = sapply(error, function(x) x*x)
  sse = sum(squaredError)
  return(sse)
}
# Iterate over 10000 candidate values of alpha
alphas = seq(0.0001, 1, 0.0001)
sse = sapply(alphas, function(x) minimizeAlpha(Y, OY, x))
print(alphas[sse == min(sse)])
I have used sapply for basic optimization, but if the number of points is more than 10000 this code runs forever. Is there a better way to implement this, or are there standard optimization techniques (like bisection) I could use? If so, can you please help me optimize the code?
Note: I need the value of alpha to at least 4 decimal places.
Any help is appreciated.
Replacing for with sapply isn't more efficient; that's a misconception. It merely often gives simpler code.
However, you can actually take advantage of vectorisation in your code, and that would be faster.
For instance, sapply(error, function(x) x*x) can simply be replaced by error * error. The sum of squared errors in R is thus simply sum((OY - PY) ** 2).
Your whole function thus boils down to:
minimizeAlpha = function(Y, OY, alpha)
sum((OY - alpha * Y) ** 2)
This should be more efficient, but first and foremost it's better code and more readable.
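Not part of the answer above, but since the question asks about standard optimization techniques: the sum of squared errors is a smooth function of alpha, so a one-dimensional optimiser such as optimize() avoids the 10000-point grid search entirely. A minimal sketch, reusing Y and OY from the question:
minimizeAlpha <- function(Y, OY, alpha) sum((OY - alpha * Y)^2)

# One-dimensional minimisation over [0, 1]; tol is well below the 4-decimal
# precision asked for in the question.
opt <- optimize(function(a) minimizeAlpha(Y, OY, a), interval = c(0, 1), tol = 1e-8)
opt$minimum    # estimated alpha
opt$objective  # corresponding sum of squared errors
For this particular objective there is even a closed form: the SSE is quadratic in alpha, so the minimum lies at alpha = sum(Y * OY) / sum(Y^2).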

Floating point comparison with zero

I'm writing a function to calculate the quantile of the GEV distribution. The relevant aspect for this question is that a different form of the function is required when one of the parameters (the shape parameter, or kappa) is zero.
Programmatically, this is commonly addressed as follows (this is a snippet from evd::qgev and is similar in lmomco::quagev):
(Edit: Version 2.2.2 of lmomco has addressed the issue identified in this question)
if (shape == 0)
  return(loc - scale * log(-log(p)))
else
  return(loc + scale * ((-log(p))^(-shape) - 1)/shape)
This works fine if shape/kappa is exactly equal to zero, but there is odd behaviour near zero.
Let's look at an example:
Qgev_zero <- function(shape) {
  # p is an exceedance probability
  p <- 0.01
  location <- 0
  scale <- 1
  if (shape == 0) return(location - scale * log(-log(1 - p)))
  location + (scale/shape) * ((-log(1 - p))^-shape - 1)
}
Qgev_zero(0)
#[1] 4.600149
Qgev_zero(1e-8)
#[1] 4.600149
This looks fine because the same answer is returned near zero and at zero. But look at what happens closer to zero.
k.seq <- seq(from = -4e-16, to = 4e-16, length.out = 1000)
plot(k.seq, sapply(k.seq, Qgev_zero), type = 'l')
The value returned by the function oscillates and is often incorrect.
These problems go away if I replace the direct comparison with zero with all.equal, e.g.
if(isTRUE(all.equal(shape, 0))) return( location - scale*(log(-log(1-p) )))
Looking at the help for all.equal suggests that for default values, anything smaller than 1.5e-8 will be treated as zero.
Of course this odd behaviour near zero is probably not generally an issue, but in my case I'm using optimisation/root finding to determine parameters from known quantiles, so I am concerned that my code needs to be robust.
To the question: is using all.equal(target, 0) an appropriate way to deal with this problem? Why is it that this approach isn't used routinely?
Some functions, when implemented the obvious way with floating point representations, are ill-behaved at certain points. That's especially likely to be the case when the function has to be manually defined at a single point: When things go absolutely undefined at a point, it's likely that they're hanging on for dear life when they get close.
In this case, that's from the kappa denominator fighting the kappa negative exponent. Which one wins the battle is determined on a bit-by-bit basis, each one sometimes winning the "rounding to a stronger magnitude" contest.
There's a variety of approaches to fixing these sorts of problems, all of them designed on a case-by-case basis. One often-flawed but easy-to-implement approach is to switch to a better-behaved representation (say, the Taylor expansion with respect to kappa) near the problematic point. That'll introduce discontinuities at the boundaries; if necessary, you can try interpolating between the two.
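As an illustration of that series-expansion idea (a sketch of mine, using the same parameterisation as the code below): near kappa = 0, (1 - exp(-K*Y))/K equals Y - K*Y^2/2 + K^2*Y^3/6 - ..., so one can switch to the truncated series inside a small window around zero.
Qgev_series <- function(K, f, XI, A, eps = 1e-7) {
  Y <- -log(-log(f))
  if (abs(K) < eps) {
    # truncated Taylor expansion of (1 - exp(-K*Y))/K around K = 0
    XI + A * (Y - K * Y^2 / 2 + K^2 * Y^3 / 6)
  } else {
    XI + A * (1 - exp(-K * Y)) / K
  }
}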
Following Sneftel's suggestion, I calculate the quantile at k = -1e-7 and k = 1e-7 and interpolate if the k argument falls between these limits. This seems to work.
In this code I'm using the parameterisation for the gev quantile function from lmomco::quagev
(Edit: Version 2.2.2 of lmomco has addressed the issues identified in this question)
The function Qgev is the problematic version (black line on the plot), while Qgev_interp interpolates near zero (green line on the plot).
Qgev <- function(K, f, XI, A) {
  # K  = shape
  # f  = probability
  # XI = location
  # A  = scale
  Y <- -log(-log(f))
  Y <- (1 - exp(-K*Y))/K
  x <- XI + A*Y
  return(x)
}

Qgev_interp <- function(K, f, XI, A) {
  .F <- function(K, f, XI, A) {
    Y <- -log(-log(f))
    Y <- (1 - exp(-K*Y))/K
    x <- XI + A*Y
    return(x)
  }
  k1 <- -1e-7
  k2 <- 1e-7
  y1 <- .F(k1, f, XI, A)
  y2 <- .F(k2, f, XI, A)
  F_nearZero <- approxfun(c(k1, k2), c(y1, y2))
  if (K > k1 & K < k2) {
    return(F_nearZero(K))
  } else {
    return(.F(K, f, XI, A))
  }
}
k.seq <- seq(from = -1.1e-7, to = 1.1e-7, length.out = 1000)
plot(k.seq, sapply(k.seq, Qgev, f = 0.01, XI = 0, A = 1), col=1, lwd = 1, type = 'l')
lines(k.seq, sapply(k.seq, Qgev_interp, f = 0.01, XI = 0, A = 1), col=3, lwd = 2)
