Finding the Maximum of a Function with numerical derivatives in R

I wish to numerically find the maximum of the function multiplied by Beta 3 shown on p346 of the following link when tau=30:
http://www.ssc.upenn.edu/~fdiebold/papers/paper49/Diebold-Li.pdf
They give the answer on p347 as 0.0609.
I would like to confirm this numerically in R. I.e. to take the derivative and find the value where it reaches zero.
library(numDeriv)
x <- 30
testh <- function(lambda){ ((1-exp(-lambda*30))/(lambda*30)) - exp(-lambda*30) }
grad_h <- function(lambda){
val <- grad(testh, lambda)
return(val^2)
}
OptLam <- optimize(f=grad_h, interval=c(0.0001,120), tol=0.0000000000001)
I take the square of the gradient as I want the minimum to be at zero.
Unfortunately, the answer comes back as Lambda=120!! With lambda at 120 the value of the objective function is 5.36e-12.
By working by hand I can find a lower value of the numerical derivative that is closer to zero (it is also close to the analytical value given above):
grad_h(0.05977604)
## [1] 4.24494e-12
Why is the optimizer above not finding this lower value? I have set the tolerance very tight, so shouldn't it be able to find this optimal value?
Is it possible to correct the existing method so that it gives the correct answer?
Is there a better way to find the maximum of a function from its gradient numerically in R?
For example, is there an optimizer that looks for a zero rather than trying to find a minimum or maximum?

You can use uniroot to find where the derivative is 0. This might work for you:
grad_h <- function(lambda){
val=grad(testh,lambda)
return(val)
}
## The root
res <- uniroot(grad_h, c(0,120), tol=1e-10)
## see it
ls <- seq(0.001, 1, length=1000)
plot(ls, testh(ls), col="salmon")
abline(v=res$root, col="steelblue", lwd=2, lty=2)
text(x=res$root, y=testh(res$root),
labels=sprintf("(%f, %s)", res$root,
format(testh(res$root), scientific = T)), adj=-0.1)
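For what it's worth, the squared-gradient objective in the question is nearly zero in two places: at the true maximum and far out in the right tail, where testh flattens and its derivative decays toward zero. That would explain why optimize drifts to the upper bound of 120. A minimal sketch that avoids derivatives entirely is to call optimize on the function itself with maximum = TRUE. The search interval c(0.001, 1) is an assumption chosen for illustration.

```r
# Loading on beta_3 from the question, with tau fixed at 30
testh <- function(lambda) {
  ((1 - exp(-lambda * 30)) / (lambda * 30)) - exp(-lambda * 30)
}

# Maximize testh directly; the interval is an assumption for this sketch
opt <- optimize(testh, interval = c(0.001, 1), maximum = TRUE, tol = 1e-10)
opt$maximum    # location of the maximum (lambda)
opt$objective  # value of testh at that lambda
```

This should land near the value found by hand in the question rather than at the edge of the interval.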

Related

Trying to plot loglikelihood of Cauchy distribution for different values of theta in R

I am trying to plot the log-likelihood function of the Cauchy distribution for varying values of theta (location parameter). These are my observations:
obs<-c(1.77,-0.23,2.76,3.80,3.47,56.75,-1.34,4.24,3.29,3.71,-2.40,4.53,-0.07,-1.05,-13.87,-2.53,-1.74,0.27,43.21)
Here is my log-likelihood function:
ll_c<-function(theta,x_values){
n<-length(x_values)
logl<- -n*log(pi)-sum(log(1+(x_values-theta)^2))
return(logl)
}
and I've tried making a plot by using this code:
x<-seq(from=-10,to=10,by=0.1);length(x)
theta_null<-NULL
for (i in x){
theta_log<-ll_c(i,counts)
theta_null<-c(theta_null,theta_log)
}
plot(theta_null)
The graph does not look right and for some reason the length of x and theta_null differs.
I am assuming that theta is your location parameter (the scale is set to 1 in my example). You should obtain the same result using a t-distribution with 1 df and shifting the observations by theta. I left some comments in the code as guidance.
obs = c(1.77,-0.23,2.76,3.80,3.47,56.75,-1.34,4.24,3.29,3.71,-2.40,4.53,-0.07,-1.05,-13.87,-2.53,-1.74,0.27,43.21)
ll_c=function(theta, obs)
{
# Compute the log-likelihood of obs for a given theta (location)
logl= sum(dcauchy(obs, location = theta, scale = 1, log = TRUE))
return(logl)
}
# Loop for possible values of theta(obs given)
x = seq(from=-10,to=10,by=0.1)
ll = NULL
for (i in x)
{
ll = c(ll, ll_c(i, obs))
}
# Plot log-lik vs possible value of theta
plot(x, ll)
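The loop above can also be written without growing a vector. This is a small sketch, assuming (as this answer does) that the question's call ll_c(i, counts) was meant to use obs, and that plotting against the grid of theta values rather than against the index is what was intended. The names theta_grid and ll_vals are introduced here.

```r
obs <- c(1.77, -0.23, 2.76, 3.80, 3.47, 56.75, -1.34, 4.24, 3.29, 3.71,
         -2.40, 4.53, -0.07, -1.05, -13.87, -2.53, -1.74, 0.27, 43.21)

# Cauchy log-likelihood with location theta and scale fixed at 1
ll_c <- function(theta, obs) {
  sum(dcauchy(obs, location = theta, scale = 1, log = TRUE))
}

theta_grid <- seq(from = -10, to = 10, by = 0.1)
# vapply returns exactly one value per theta, so the lengths always match
ll_vals <- vapply(theta_grid, ll_c, numeric(1), obs = obs)

plot(theta_grid, ll_vals, type = "l",
     xlab = "theta", ylab = "log-likelihood")
```

Passing both vectors to plot() puts theta on the horizontal axis, which plot(theta_null) alone does not do.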
It is hard to say exactly what you are experiencing without more info. But I'll make an educated guess.
First of all, we can simplify this a lot by using the *t family of functions for the t distribution, as the Cauchy distribution is just the t distribution with df = 1. So your calculations could have been done using
for(i in ncp)
theta_null <- c(theta_null, sum(dt(values, 1, i, log = TRUE)))
Note that the constant term -n*log(pi) doesn't actually matter for any practical purposes. We are usually interested in minimizing or maximizing the likelihood, in which case additive constants in the log-likelihood are irrelevant.
Now if we use this approach, we can quite quickly notice something by printing the values:
print(head(theta_null))
[1] -Inf -Inf -Inf -Inf -Inf -Inf
So I am assuming what you are experiencing is that many of your values are "almost" negative infinity, and maybe these are not stored correctly in your outcome vector. I can't see that this should be the case from your code, but this would be my initial guess.
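One caveat on the dt() route, offered as a reading of ?dt rather than something stated above: the third argument of dt() is the non-centrality parameter, which is not a location shift. The shifted-Cauchy log-likelihood instead corresponds to evaluating the central t density with 1 df at obs - theta. A small check, with theta_test as a hypothetical test value:

```r
obs <- c(1.77, -0.23, 2.76, 3.80, 3.47, 56.75, -1.34, 4.24, 3.29, 3.71,
         -2.40, 4.53, -0.07, -1.05, -13.87, -2.53, -1.74, 0.27, 43.21)
theta_test <- 1.5  # hypothetical value, for illustration only

# Log-likelihood via the Cauchy density
ll_cauchy <- sum(dcauchy(obs, location = theta_test, scale = 1, log = TRUE))
# Central t with 1 df, evaluated at the shifted observations
ll_t <- sum(dt(obs - theta_test, df = 1, log = TRUE))
# Closed form from the question
ll_closed <- -length(obs) * log(pi) - sum(log(1 + (obs - theta_test)^2))

all.equal(ll_cauchy, ll_t)
all.equal(ll_cauchy, ll_closed)
```

If the three agree, any of them can be used inside the loop over theta.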

Error in optim(): searching for global minimum for a univariate function

I am trying to optimize a function in R.
The function is the likelihood function of the negative binomial when estimating only the mu parameter. This should not be a problem since the function clearly has just one maximum. But I am not able to reach the desired result.
The function to be optimized is:
EMV <- function(data, par) {
Mi <- par
Phi <- 2
N <- NROW(data)
Resultado <- log(Mi/(Mi + Phi))*sum(data) + N*Phi*log(Phi/(Mi + Phi))
return(Resultado)
}
Data is a vector of negative binomial variables with parameters 2 and 2:
library(MASS) # provides rnegbin()
data <- rnegbin(10000, mu = 2, theta = 2)
When I plot the function having mu as variable with the following code:
x <- seq(0.1, 100, 0.02)
z <- EMV(data,0.1)
for (aux in x) {z <- rbind(z, EMV(data,aux))}
z <- z[2:NROW(z)]
plot(x,z)
I get the following curve:
And the maximum of z is attained close to the true parameter value, 2:
x[which.max(z)]
But the optimization is not working with BFGS
Error in optim(par = theta, fn = EMV, data = data, method = "BFGS") :
non-finite finite-difference value [1]
And is not going to right value using SANN, for example:
$par
[1] 5.19767e-05
$value
[1] -211981.8
$counts
function gradient
10000 NA
$convergence
[1] 0
$message
NULL
The questions are:
What am I doing wrong?
Is there a way to tell optim that the param should be bigger than 0?
Is there a way to tell optim that I want to maximize the function? (I am afraid the optim is trying to minimize and is going to a very small value where function returns smallest values)
Minimization or Maximization?
?optim mentions maximization only in parentheses, so minimization is the default:
fn: A function to be minimized (or maximized) ...
Thus, if we want to maximize an objective function, we need to multiply it by -1 and then minimize it. This is quite a common situation. In statistics we often want to find the maximum log-likelihood, so with optim() the usual approach is to minimize the negative log-likelihood.
Which method to use?
If we only do 1D minimization, we should use method "Brent". This method lets us specify a lower bound and an upper bound for the search region, and the search stays inside that interval (optim() hands the problem to optimize(), which combines golden-section search with parabolic interpolation). Such bounds help you constrain your parameters. For example, if you don't want mu to be smaller than 0, just set lower = 0.
When we move to 2D or higher dimensions, we should resort to "BFGS". In this case, if we want to constrain one of our parameters, say a, to be positive, we take the log transform log_a = log(a) and reparameterize our objective function using log_a. Now log_a is free of constraints. The same goes when we want to constrain multiple parameters to be positive.
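To make the reparameterization concrete, here is a minimal sketch that estimates both mu and the dispersion (called size in dnbinom) on the log scale with BFGS. It is not the question's one-parameter problem: the simulated data (rnbinom with a fixed seed), the starting values, the averaging of the log-likelihood, and the names sim and negll are all assumptions for illustration.

```r
set.seed(1)
# Simulated stand-in for the question's data (mu = 2, size = 2)
sim <- rnbinom(10000, size = 2, mu = 2)

# Average negative log-likelihood in terms of log(mu) and log(size),
# so both parameters stay positive without explicit bounds
negll <- function(logpar, x) {
  mu   <- exp(logpar[1])
  size <- exp(logpar[2])
  -mean(dnbinom(x, size = size, mu = mu, log = TRUE))
}

fit <- optim(par = c(0, 0), fn = negll, x = sim, method = "BFGS")
exp(fit$par)  # back-transform to (mu, size)
```

Averaging instead of summing keeps the gradient on a modest scale, which tends to make the first BFGS steps less extreme. It does not change where the optimum is.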
How to change your code?
EMV <- function(data, par) {
Mi <- par
Phi <- 2
N <- NROW(data)
Resultado <- log(Mi/(Mi + Phi))*sum(data) + N*Phi*log(Phi/(Mi + Phi))
return(-1 * Resultado)
}
optim(par = theta, fn = EMV, data = data, method = "Brent", lower = 0, upper = 1E5)
The help file for optim says: "By default optim performs minimization, but it will maximize if control$fnscale is negative." So if you either multiply your function output by -1 or change the control object input, you should get the right answer.
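As a sketch of the fnscale route, the following keeps the log-likelihood in its original (un-negated) form and asks optim to maximize it. The data are simulated with rnbinom as a stand-in for the question's rnegbin call, and the search interval c(1e-6, 100) is an assumption. For this model with Phi fixed, setting the derivative to zero gives the sample mean as the maximizer, which provides a quick sanity check.

```r
set.seed(42)
# Stand-in for rnegbin(10000, mu = 2, theta = 2)
data <- rnbinom(10000, size = 2, mu = 2)

# Log-likelihood in mu (up to an additive constant), Phi fixed at 2
EMV <- function(data, par) {
  Mi <- par
  Phi <- 2
  N <- NROW(data)
  log(Mi / (Mi + Phi)) * sum(data) + N * Phi * log(Phi / (Mi + Phi))
}

fit <- optim(par = 1, fn = EMV, data = data, method = "Brent",
             lower = 1e-6, upper = 100,
             control = list(fnscale = -1))  # negative fnscale => maximize
fit$par      # estimated mu
mean(data)   # closed-form maximizer for comparison
```

The lower bound is set slightly above 0 here because the log-likelihood is not finite at exactly 0.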

How to calculate the area under each end of a sine curve

Given this data set:
y<-c(-13,16,35,40,28,36,43,33,40,33,22,-5,-27,-31,-29,-25,-26,-31,-26,-24,-25,-29,-23,4)
t<-1:24
My goal is to calculate two areas. The first area would integrate only data from the first part of the curve found above the Zero line. The second area would integrate data from the second part of the curve found below the zero line.
First I would like to fit a sine wave to this data. Using this excellent answer:
https://stats.stackexchange.com/questions/60994/fit-a-sinusoidal-term-to-data
I was able to fit a sine wave (I will be using the periodic with second harmonic which looks to have a better fit)
ssp <- spectrum(y)
per <- 1/ssp$freq[ssp$spec==max(ssp$spec)]
reslm <- lm(y ~ sin(2*pi/per*t)+cos(2*pi/per*t))
summary(reslm)
rg <- diff(range(y))
plot(y~t,ylim=c(min(y)-0.1*rg,max(y)+0.1*rg))
lines(fitted(reslm)~t,col=4,lty=2) # dashed blue line is sin fit
# including 2nd harmonic really improves the fit
reslm2 <- lm(y ~ sin(2*pi/per*t)+cos(2*pi/per*t)+sin(4*pi/per*t)+cos(4*pi/per*t))
summary(reslm2)
lines(fitted(reslm2)~t,col=3) # solid green line is periodic with second harmonic
abline(h=0,lty=2)
Next I would like to calculate the area under the curve that is only positive, as well as the area under the curve that is exclusively negative. I've had luck looking at similar answers using the AUC functions in the Bolstad2 and MESS packages. But my data points do not fall neatly on the zero line, and I do not know how to break up the sine function into areas only above the zero line and only below the zero line.
First things first. To get an exact calculation, you will need to work with the exact function of the second-harmonic Fourier fit. Secondly, the beauty of harmonic functions is that they are periodic. So if you want to find where your function reaches 0, you merely need to expand your interval so that you can be sure to find more than 2 roots.
First we get the exact function from the regression model
fourierfnct <- function(t){
fnct <- reslm2$coeff[1]+
reslm2$coeff[2]*sin(2*pi/per*t)+
reslm2$coeff[3]*cos(2*pi/per*t)+
reslm2$coeff[4]*sin(4*pi/per*t)+
reslm2$coeff[5]*cos(4*pi/per*t)
return(fnct)
}
Secondly, you can write a function that finds the roots (where the function is 0). R provides a uniroot function, which you can call in a loop to find multiple roots.
manyroots <- function(f,inter,period){
# one slot per sub-interval; NA marks "no root found here"
# (period is kept so the call below still works, but it is not used)
roots <- rep(NA_real_, length(inter)-1)
for(i in 1:(length(inter)-1)){
roots[i] <- tryCatch({
uniroot(f,c(inter[i],inter[i+1]))$root
}, error = function(err) {
# uniroot fails when f has the same sign at both ends
NA_real_
})
}
retroots <- roots[!is.na(roots)]
return(retroots)
}
Then you simply calculate the roots and use them to integrate the function between those boundaries.
roots <- manyroots(fourierfnct,seq(0,25),per)
integrate(fourierfnct, roots[1],roots[2])
#300.6378 with absolute error < 3.3e-12
integrate(fourierfnct, roots[2],roots[3])
#-284.6378 with absolute error < 3.2e-12
This may not be the solution you are looking for, but you could try this:
# Create a new t vector but with more subdivisions
t2 = seq(1,24,length.out = 10000)
# Evaluate your model on this t2
y2 = predict(reslm2, newdata = data.frame(t = t2))
lines(t2[y2>=0],y2[y2>=0],col="red")
# Estimate the area where the curve is greater than 0
sum(diff(t2)[1]*y2[y2>0])
# Estimate the area where the curve is less than 0
sum(diff(t2)[1]*y2[y2<0])
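As a cross-check on the grid sum above, the positive and negative parts can also be integrated directly. This is a sketch under assumptions: it refits the second-harmonic model from the question's data, integrates over t from 1 to 24 (the same range as t2), and uses pmax/pmin to clip the curve at zero. The names f, area_pos, area_neg and area_all are introduced here.

```r
y <- c(-13,16,35,40,28,36,43,33,40,33,22,-5,-27,-31,-29,-25,
       -26,-31,-26,-24,-25,-29,-23,4)
t <- 1:24

# Dominant period from the periodogram (plot suppressed)
ssp <- spectrum(y, plot = FALSE)
per <- 1 / ssp$freq[which.max(ssp$spec)]

reslm2 <- lm(y ~ sin(2*pi/per*t) + cos(2*pi/per*t) +
                 sin(4*pi/per*t) + cos(4*pi/per*t))

# Fitted curve as a function of continuous time
f <- function(tt) as.numeric(predict(reslm2, newdata = data.frame(t = tt)))

# Clip at zero to keep only the part above (or below) the zero line
area_pos <- integrate(function(tt) pmax(f(tt), 0), 1, 24,
                      subdivisions = 1000L)$value
area_neg <- integrate(function(tt) pmin(f(tt), 0), 1, 24,
                      subdivisions = 1000L)$value
area_all <- integrate(f, 1, 24)$value
c(positive = area_pos, negative = area_neg, net = area_all)
```

The positive and negative areas should add up to the net integral, which is an easy consistency check on either method.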

R: draw from a vector using custom probability function

Forgive me if this has been asked before (I feel it must have, but could not find precisely what I am looking for).
How can I draw one element of a vector of whole numbers (from 1 through, say, 10) using a probability function that specifies different chances for the elements? If I want equal probabilities I use runif() to get a number between 1 and 10:
ceiling(runif(1,0,10)) # lower bound 0 so that 1 can be drawn too
How do I similarly sample from, e.g., the exponential distribution to get a number between 1 and 10 (such that 1 is much more likely than 10), or from a logistic probability function (if I want a sigmoid increasing probability from 1 through 10)?
The only "solution" I can come up with is first to draw, say, 1e6 numbers from the sigmoid distribution and then scale the min and max to 1 and 10, but this looks clumsy.
UPDATE:
This awkward solution (and I don't feel it is very "correct") would go like this:
#Draw enough from a distribution, here exponential
x <- rexp(1e3)
#Scale probs to e.g. 1-10
scaler <- function(vector, min, max){
(((vector - min(vector)) * (max - min))/(max(vector) - min(vector))) + min
}
x_scale <- scaler(x,1,10)
#And sample once (and round it)
round(sample(x_scale,1))
Are there no better solutions around?
I believe sample() is what you are looking for, as @HubertL mentioned in the comments. You can specify an increasing function (e.g. logit()) and evaluate it on the vector v you want to sample from. You can then use the output of that function as a vector of probabilities p (sample() treats prob as weights, so they need not sum to 1). See the code below.
logit <- function(x) {
return(exp(x)/(exp(x)+1))
}
v <- c(seq(1,10,1))
p <- logit(seq(1,10,1))
sample(v, 1, prob = p, replace = TRUE)
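For the exponential case asked about in the question, one possible sketch is to use the exponential density evaluated at 1..10 as sampling weights. The rate of 0.5 is an arbitrary choice for illustration, and the names p_exp and draws are introduced here.

```r
set.seed(123)
v <- 1:10

# Weights that decay exponentially, so 1 is far more likely than 10
p_exp <- dexp(v, rate = 0.5)

# A single draw
sample(v, size = 1, prob = p_exp)

# Many draws, to see the resulting frequencies
draws <- sample(v, size = 10000, replace = TRUE, prob = p_exp)
table(factor(draws, levels = v))
```

Any non-negative function evaluated on v can be swapped in for dexp to get a different shape.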

Calculate the derivative of a data-function in r

Is there an easy way to calculate the derivative of non-linear functions that are given by data?
For example:
x = 1 / c(1000:1)
y = x^-1.5
ycs = cumsum(y)
plot (x, ycs, log="xy")
How can I calculate the derivative function from the function given by `x` and `ycs`?
I was also going to suggest an example of a smoothing spline fit followed by prediction of the derivative. In this case, the results are very similar to the diff calculation described by @dbaupp:
spl <- smooth.spline(x, y=ycs)
pred <- predict(spl)
plot (x, ycs, log="xy")
lines(pred, col=2)
ycs.prime <- diff(ycs)/diff(x)
pred.prime <- predict(spl, deriv=1)
plot(ycs.prime)
lines(pred.prime$y, col=2)
Generating derivatives from raw data is risky unless you are very careful. Not for nothing is this process known as "error multiplier." Unless you know the noise content of your data and take some action (e.g. spline) to remove the noise prior to differentiation, you may well end up with a scary curve indeed.
The derivative of a function is dy/dx, which can be approximated by Δy/Δx, that is, "change in y over change in x". This can be written in R as
ycs.prime <- diff(ycs)/diff(x)
and now ycs.prime contains an approximation to the derivative of the function at each x: however it is a vector of length 999, so you will need to shorten x (i.e. use x[1:999] or x[2:1000]) when doing any analysis or plotting.
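One way to handle the length mismatch, sketched here on a function with a known derivative, is to attach each difference quotient to the midpoint of its interval instead of to x[1:999] or x[2:1000]. The test function x^3 and the names xs, ys, x_mid and d_approx are made up for this check.

```r
xs <- seq(0, 1, length.out = 101)
ys <- xs^3

# Difference quotient on each interval, placed at the interval midpoint
d_approx <- diff(ys) / diff(xs)
x_mid <- (head(xs, -1) + tail(xs, -1)) / 2

# Largest deviation from the exact derivative 3 * x^2 at the midpoints
max(abs(d_approx - 3 * x_mid^2))

plot(x_mid, d_approx, type = "l", xlab = "x", ylab = "dy/dx")
```

Using midpoints makes the quotient a centered difference, which is usually more accurate than assigning it to either endpoint.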
There is also gradient from the pracma package.
grad <- pracma::gradient(ycs, h1 = x)
plot(grad, col = 1)
