Optimizing the code for error minimization - r

I have written the code below to minimize the error by changing the value of alpha (using an iterative method).
set.seed(16)
npoints = 10000
Y = round(runif(npoints), 3)
OY = sample(c(0, 1, 0.5), npoints, replace = T)
minimizeAlpha = function(Y, OY, alpha) {
  PY = alpha * Y
  error = OY - PY
  squaredError = sapply(error, function(x) x*x)
  sse = sum(squaredError)
  return(sse)
}
# Iterate over 10000 values of alpha
alphas = seq(0.0001, 1, 0.0001)
sse = sapply(alphas, function(x) minimizeAlpha(Y, OY, x))
print(alphas[sse == min(sse)])
I have used sapply for basic optimization. But if the number of points is more than 10000, this code runs forever. So, is there a better way to implement this, or any standard optimization technique (like bisection)? If so, can you please help me optimize the code?
Note: I need the value of alpha with at least 4 decimals.
Any help is appreciated.

Replacing a for loop with sapply isn't more efficient; that's a misconception. It merely often makes for simpler code.
However, you can actually take advantage of vectorisation in your code — and that would be faster.
For instance, sapply(error, function(x) x*x) can simply be replaced by error * error. The sum of squared errors is thus simply sum((OY - PY)^2).
Your whole function thus boils down to:
minimizeAlpha = function(Y, OY, alpha)
  sum((OY - alpha * Y)^2)
This should be more efficient — but first and foremost it’s better code and more readable.
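As an aside (this goes beyond the answer above): for this particular objective there is no need to search at all. The SSE is a quadratic in alpha, so setting its derivative to zero gives a closed form:
# d/d_alpha of sum((OY - alpha*Y)^2) is zero at alpha = sum(OY*Y) / sum(Y*Y)
alpha_hat <- sum(OY * Y) / sum(Y * Y)
round(alpha_hat, 4)   # the minimiser to 4 decimal places
This is exact for any number of points and agrees with the grid search up to its 0.0001 spacing.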

Related

Nested integration for incomplete convolution of gauss densities

Let g(x) = 1/sqrt(2*pi) * exp(-x^2/2) be the density of the normal distribution with mean 0 and standard deviation 1. In some calculations on paper, integrals of the form

f_n(x) = integral from -c to c of g(x - z) * f_{n-1}(z) dz,   with f_1 = g,

appeared, where c > 0 is a positive number.
Since I could not evaluate this by hand, I had the idea of approximating and plotting it. I tried this in R, because R provides the dnorm function and the integrate function for numerical integration.
You can see that I need to integrate numerically n times, where n is chosen in the call of a plot function. My code uses a for-loop to create those "incomplete" convolutions iteratively.
For example, even with n = 3 and c = 1 this gives me an error; n = 2 (i.e. a single integration) works.
N = 3
ngauss <- function(x) dnorm(x, mean = 0, sd = 1)
convoluts <- list()
convoluts[[1]] <- ngauss

for (i in 2:N) {
  h <- function(y) {
    g <- function(z) { ngauss(y - z) * convoluts[[i - 1]](z) }
    return(integrate(g, lower = -1, upper = 1)$value)
  }
  h <- Vectorize(h)
  convoluts[[i]] <- h
}

convoluts[[3]](0)
What I get is:
Error: evaluation nested too deeply: infinite recursion /
options(expressions=)?
I understand that this is a hard computation, but for "small" n something like this should be possible.
Maybe someone can help me fix my code or recommend a better way to implement this. Another language that is more appropriate for this would also be okay.
The issue is not really integrate but how the functions created in the loop capture the variable i: they all share the same binding of i, which after the loop equals N, so convoluts[[i - 1]] can end up referring to the very function that is being evaluated, and the recursion never terminates. Instead, using
h <- evalq(function(y) {
  g <- function(z) { ngauss(y - z) * convoluts[[i - 1]](z) }
  integrate(g, lower = -1, upper = 1)$value
}, list(i = i))
does the job and, say, setting N <- 6 quickly gives
convoluts[[N]](0)
# [1] 0.03423872
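For reference, here is the corrected loop assembled in one piece (a sketch combining the original code with the evalq fix above):
N <- 6
ngauss <- function(x) dnorm(x, mean = 0, sd = 1)
convoluts <- list()
convoluts[[1]] <- ngauss
for (i in 2:N) {
  # evalq freezes the current value of i into this function's environment
  h <- evalq(function(y) {
    g <- function(z) ngauss(y - z) * convoluts[[i - 1]](z)
    integrate(g, lower = -1, upper = 1)$value
  }, list(i = i))
  convoluts[[i]] <- Vectorize(h)
}
convoluts[[N]](0)
# [1] 0.03423872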
As your integration is simply the pdf of a sum of N independent standard normals (which then follows N(0, N)), we may also verify this approach by setting lower = -Inf and upper = Inf. Then with N <- 4 we have
dnorm(0, sd = sqrt(N))
# [1] 0.1994711
convoluts[[N]](0)
# [1] 0.1994711
So, for practical purposes, when c = Inf, you are way better off using dnorm rather than manual computations.

Two variable function maximization - R code

So I'm trying to maximize the likelihood function for a gamma-Poisson model and I've programmed it in R as follows:
lik <- function(x, t, a, b){
  n <- length(x)
  like <- numeric(n)
  for (i in 1:n){
    like[i] <- log(gamma(a + x[i])) - log(gamma(a)) - log(gamma(1 + x[i])) +
      x[i]*log(t[i]/b) - (a + x[i])*log(1 + t[i]/b)
  }
  return(sum(like))
}
where x and t are the data, and I have n data rows.
I need a and b to be solved for simultaneously. Does a built-in function exist in R? Or do I need to hand-code an algorithm to solve the system of equations? [I'd rather not.] I know optimize() solves for one variable, and so does fminbnd(). I'm trying to copy the behavior of FindMaximum() in Mathematica. In a perfect world I'd like the code to work something like this:
optimize(f=lik, a>0, b>0, x=x, t=t, maximum=TRUE, iteration=5000)
$maximum
a 150
b 6
Thanks.
optim optimises over a vector of parameters: its first argument is a vector of starting values, and the objective function should take the parameter vector as its first argument. So you could try something like this:
lik <- function(p = c(1, 1), x, t){
  # In the body of the function, replace a by p[1] and b by p[2]
}

optim(c(1, 1), lik, method = "L-BFGS-B", x = x, t = t, control = list(fnscale = -1))
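For concreteness, here is a sketch of what the reparameterised likelihood could look like (my own filling-in, not taken from the answer); using lgamma() instead of log(gamma()) keeps the terms finite for large arguments, which is relevant to the overflow issue mentioned below:
lik <- function(p, x, t){
  a <- p[1]; b <- p[2]
  # gamma-Poisson log-likelihood (sketch); lgamma() stays finite where gamma() overflows
  sum(lgamma(a + x) - lgamma(a) - lgamma(1 + x) +
        x * log(t / b) - (a + x) * log(1 + t / b))
}
fit <- optim(c(1, 1), lik, x = x, t = t, method = "L-BFGS-B",
             lower = c(1e-6, 1e-6), control = list(fnscale = -1))
fit$par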
So the solution that ended up working out is:
attempt2d <- optim(
  par = c(sumx/sumt, 1), fn = lik, data = data11,
  method = "L-BFGS-B", control = list(fnscale = -1, trace = TRUE),
  lower = 0.1, upper = 170
)
However, my parameters run up against the upper bound of 170, essentially meaning that my gamma parameters are Inf, because gamma() hits infinity relatively quickly. In Mathematica the solutions are a = 169 and b = 16505, and R gets nowhere near that, maxing out at 170. The known solutions are beyond 170 in some cases; is there any fix for this anomaly?

Floating point comparison with zero

I'm writing a function to calculate the quantile of the GEV distribution. The relevant aspect for this question is that a different form of the function is required when one of the parameters (the shape parameter, or kappa) is zero.
Programmatically, this is commonly addressed as follows (this is a snippet from evd::qgev and is similar in lmomco::quagev):
(Edit: Version 2.2.2 of lmomco has addressed the issue identified in this question)
if (shape == 0)
  return(loc - scale * log(-log(p)))
else
  return(loc + scale * ((-log(p))^(-shape) - 1)/shape)
This works fine if shape/kappa is exactly equal to zero, but there is odd behaviour near zero.
Let's look at an example:
Qgev_zero <- function(shape){
  # p is an exceedance probability
  p = 0.01
  location = 0
  scale = 1
  if (shape == 0) return(location - scale*(log(-log(1 - p))))
  location + (scale/shape)*((-log(1 - p))^-shape - 1)
}
Qgev_zero(0)
#[1] 4.600149
Qgev_zero(1e-8)
#[1] 4.600149
This looks fine because the same answer is returned near zero and at zero. But look at what happens closer to zero.
k.seq <- seq(from = -4e-16, to = 4e-16, length.out = 1000)
plot(k.seq, sapply(k.seq, Qgev_zero), type = 'l')
The value returned by the function oscillates and is often incorrect.
These problems go away if I replace the direct comparison with zero with all.equal e.g.
if(isTRUE(all.equal(shape, 0))) return( location - scale*(log(-log(1-p) )))
Looking at the help for all.equal suggests that for default values, anything smaller than 1.5e-8 will be treated as zero.
Of course this odd behaviour near zero is probably not generally an issue but in my case, I'm using optimisation/root finding to determine parameters from known quantiles so am concerned that my code needs to be robust.
To the question: is using all.equal(target, 0) an appropriate way to deal with this problem? Why is it that this approach isn't used routinely?
Some functions, when implemented the obvious way with floating point representations, are ill-behaved at certain points. That's especially likely to be the case when the function has to be manually defined at a single point: When things go absolutely undefined at a point, it's likely that they're hanging on for dear life when they get close.
In this case, that's from the kappa denominator fighting the kappa negative exponent. Which one wins the battle is determined on a bit-by-bit basis, each one sometimes winning the "rounding to a stronger magnitude" contest.
There's a variety of approaches to fixing these sorts of problems, all of them designed on a case-by-case basis. One often-flawed but easy-to-implement approach is to switch to a better-behaved representation (say, the Taylor expansion with respect to kappa) near the problematic point. That'll introduce discontinuities at the boundaries; if necessary, you can try interpolating between the two.
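As a concrete illustration of such a better-behaved representation (my own sketch, not part of the original answer): in the lmomco-style parameterisation used below, the troublesome term (1 - exp(-K*Y))/K can be computed with R's expm1(), which evaluates exp(x) - 1 accurately for small x, so only the exact K = 0 case needs special handling:
Qgev_expm1 <- function(K, f, XI, A){   # hypothetical name
  Y <- -log(-log(f))
  if (K == 0) return(XI + A * Y)       # limiting case at K = 0
  XI + A * (-expm1(-K * Y) / K)        # same as (1 - exp(-K*Y))/K, but stable near zero
}
Qgev_expm1(1e-15, 0.99, 0, 1)
# [1] 4.600149   (matches Qgev_zero above)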
Following Sneftel's suggestion, I calculate the quantile at k = -1e-7 and k = 1e-7 and interpolate if the k argument falls between these limits. This seems to work.
In this code I'm using the parameterisation for the gev quantile function from lmomco::quagev
(Edit: Version 2.2.2 of lmomco has addressed the issues identified in this question)
The function Qgev is the problematic version (black line on plot), while Qgev_interp, interpolates near zero (green line on plot).
Qgev <- function(K, f, XI, A){
  # K  = shape
  # f  = probability
  # XI = location
  # A  = scale
  Y <- -log(-log(f))
  Y <- (1 - exp(-K*Y))/K
  x <- XI + A*Y
  return(x)
}
Qgev_interp <- function(K, f, XI, A){
  .F <- function(K, f, XI, A){
    Y <- -log(-log(f))
    Y <- (1 - exp(-K*Y))/K
    x <- XI + A*Y
    return(x)
  }
  k1 <- -1e-7
  k2 <- 1e-7
  y1 <- .F(k1, f, XI, A)
  y2 <- .F(k2, f, XI, A)
  F_nearZero <- approxfun(c(k1, k2), c(y1, y2))
  if (K > k1 & K < k2) {
    return(F_nearZero(K))
  } else {
    return(.F(K, f, XI, A))
  }
}
k.seq <- seq(from = -1.1e-7, to = 1.1e-7, length.out = 1000)
plot(k.seq, sapply(k.seq, Qgev, f = 0.01, XI = 0, A = 1), col=1, lwd = 1, type = 'l')
lines(k.seq, sapply(k.seq, Qgev_interp, f = 0.01, XI = 0, A = 1), col=3, lwd = 2)

Least square minimization

I hope this is the right place for such a basic question. I found this and this solution quite involved, so they did not help me grasp the fundamentals of the procedure.
Consider a random dataset:
x <- c(1.38, -0.24, 1.72, 2.25)
w <- c(3, 2, 4, 2)
How can I find the value of μ that minimizes the least squares expression Sum{ i | w[i]*(x[i] - μ)^2 }?
The manipulate package lets me vary μ by hand with a slider, but I am looking for a more precise procedure than "try manually until you find the best fit".
Note: If the question is not correctly posted, I would welcome constructive criticism.
You could proceed as follows:
optim(mean(x), function(mu) sum(w * (x - mu)^2), method = "BFGS")$par
# [1] 1.367273
Here mean(x) is an initial guess for mu.
I'm not sure if this is what you want, but here's a little algebra:
We want to find mu to minimise
S = Sum{ i | w[i]*(x[i] - mu)*(x[i] - mu) }
Expand the square and rearrange into three summations, bringing things that don't depend on i outside the sums:
S = Sum{i|w[i]*x[i]*x[i]} - 2*mu*Sum{i|w[i]*x[i]} + mu*mu*Sum{i|w[i]}
Define
W = Sum{i|w[i]}
m = Sum{i|w[i]*x[i]} / W
Q = Sum{i|w[i]*x[i]*x[i]}/W
Then
S = W*(Q - 2*mu*m + mu*mu)
  = W*((mu - m)*(mu - m) + Q - m*m)
(The second step is 'completing the square', a simple but very useful technique).
In the final equation, since a square is always non-negative, the value of mu to minimise S is m.
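A quick check of this result against the data and the optim call above:
x <- c(1.38, -0.24, 1.72, 2.25)
w <- c(3, 2, 4, 2)
sum(w * x) / sum(w)   # the weighted mean m
# [1] 1.367273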

Find minimums with R (1 Variable X, n times a fixed parameter U)

I'm trying to minimize a function f(X,U) = (X*log(X)-1/(1-U))^2
where U = (U_1, ..., U_n) ~ U(0,1); that is, I have n fixed values of U and want to find the minimum of each of:
(x_1*ln(x_1)-1/(1-u_1))^2
(x_2*ln(x_2)-1/(1-u_2))^2
......
(x_n*ln(x_n)-1/(1-u_n))^2
For that, I wanted to use the optim function.
I have defined:
n <- 10^3
U <- sort(runif(n, min = 0, max = 1))
X <- c()
Xsolution <- c()

f <- function(X, U){
  return(-(X*log(X) - (1/(1 - U)))^2)
}  # minus sign, because min(f) = max(-f)
Now I have no idea how to do this with optim(). I always get the following error for the code below:
for (i in 1:n){
  Xsolution[i] <- optim(f(X, U[i]))
}
Error in log(X) : non-numeric argument to mathematical function
Sidenote: I would welcome a method without a for-loop, since for large n it will take too long. Maybe you can help me get it to work with sapply? Or an alternative way?
Alternatively, I thought I had it working with optimize(..., maximum = FALSE, ...):
f <- function (X, a) ((X*log(X)-(1/(1-a)))^2)
for (i in 1:n){
  xmin[i] <- optimize(f, c(0, 10000), tol = 0.0001, a = U[i])
}
This doesn't work properly either...
Also, the problem may be that it will take too long: I want to do it with n = 10^6. But I'm quite sure there has to be a way of doing it without a for-loop. I think the for-loop is what makes this take ages. Please help me, I've been sitting on this problem for ages and it's quite frustrating.
Since X * log(X) = 1 / (1 - U[i]) can be solved numerically for any U[i], each term (X*log(X) - 1/(1-U[i]))^2 can be driven to zero, so there is a solution for each distinct U[i]. If the U[i] are all distinct, that means there are length(U) solutions. They are given by (the unique can be omitted if the U[i] are all distinct):
f <- function (X, a) ((X*log(X)-(1/(1-a)))^2)
unique(sapply(U, function(a) optimize(f, c(0, 1000000), a = a)$minimum))
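If calling optimize a million times is still too slow, note that each minimiser is just the root of X*log(X) = 1/(1 - U[i]), and all the roots can be found at once. Here is a vectorised Newton-iteration sketch (my own illustration, with a hypothetical helper name; it relies on 1/(1 - U[i]) being positive, which holds for U in (0, 1)):
# Solve x*log(x) = b for each element of b by Newton's method, vectorised.
# f(x) = x*log(x) - b is increasing and convex for x > 1, with derivative log(x) + 1.
solve_xlogx <- function(b, iter = 50){
  x <- pmax(b, 2)                              # rough starting values (>= 2, so log(x) + 1 > 0)
  for (j in seq_len(iter)){
    x <- x - (x * log(x) - b) / (log(x) + 1)   # Newton step
  }
  x
}

U <- sort(runif(10^6))
Xsolution <- solve_xlogx(1 / (1 - U))
max(abs(Xsolution * log(Xsolution) - 1 / (1 - U)))   # residuals should be near zero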
