How to increase the precision of a solution from the nlm solver in R

Given a function F1:
F1 <- function(C1, C2, C3, ..., x, u_target) {
  # a lot of equations follow
  ...
  u_actual - u_target
}
F1 returns the result of the very last equation
u_actual - u_target
I want to determine the value of the parameter x such that the result of this last expression converges to zero. With
nlm(f=F1,p=c(0),C1=C1,C2=C2,...,stepmax=0.001,ndigit=8)
I get a result, but not a satisfying one:
u_actual = 0.1316566
u_target = 0.1
I played a lot with the arguments of the nlm command (gradtol, stepmax, iterlim, etc.), but I was not able to get a better result. I also tried optim, optimize and uniroot, but was not able to get them to run at all.
u and x show a negative exponential relationship: as x decreases, u increases exponentially, and at x = 0, u takes a finite value. x also has an upper bound, which is unknown. So I guessed it would be promising to start the iteration at the lower bound (zero) and increase x step by step. However, whether I decrease or increase the value of stepmax, the result does not get better.
I would appreciate any hint from the R community.
Thank you very much.
PS: In MATLAB a colleague uses fsolve(@(x) F1(x, u_target, C1, C2, ...), 0), and it works fine.
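Since the goal is to drive u_actual - u_target to zero, the natural R analogue of MATLAB's fsolve for a one-dimensional problem is the root finder uniroot(). Below is a minimal, hedged sketch rather than a drop-in solution: the wrapper resid(), the upper end point upper_x and the tolerance are my own assumptions, and the interval has to be chosen so that the residual changes sign across it.
# wrapper over x only; pass the remaining constants exactly as in the nlm() call
resid <- function(x) F1(C1, C2, C3, x = x, u_target = u_target)
# 0 is the known lower bound for x; upper_x is a hypothetical guess at the
# unknown upper bound -- adjust it until resid(0) and resid(upper_x) differ in sign
upper_x <- 1
uniroot(resid, lower = 0, upper = upper_x, tol = 1e-12)$root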

Related

MLE using nlminb in R - understand/debug certain errors

This is my first question here, so I will try to make it as well written as possible. Please bear with me should I make a silly mistake.
Briefly, I am trying to do a maximum likelihood estimation where I need to estimate 5 parameters. The general form of the problem I want to solve is as follows: A weighted average of three copulas, each with one parameter to be estimated, where the weights are nonnegative and sum to 1 and also need to be estimated.
There are packages in R for doing MLE on single copulas or on a weighted average of copulas with fixed weights. However, to the best of my knowledge, no packages exist to directly solve the problem I outlined above. Therefore I am trying to code the problem myself. There is one particular type of error I am having trouble tracing to its source. Below I have tried to give a minimal reproducible example where only one parameter needs to be estimated.
library(copula)
set.seed(150)
x <- rCopula(100, claytonCopula(250))

# Copula density
clayton_density <- function(x, theta) {
  dCopula(x, claytonCopula(theta))
}

# Negative log-likelihood function
nll.clayton <- function(theta) {
  theta_trans <- -1 + exp(theta)  # admissible theta values for Clayton copula
  nll <- -sum(log(clayton_density(x, theta_trans)))
  return(nll)
}

# Initial guess for optimization
guess <- function(x) {
  init <- rep(NA, 1)
  tau.n <- cor(x[, 1], x[, 2], method = "kendall")
  # Guess using method of moments
  itau <- iTau(claytonCopula(), tau = tau.n)
  # In case itau is non-positive, we need a conditional statement
  # Use log because it is (almost) the inverse of the theta transformation above
  if (itau <= 0) {
    init[1] <- log(0.1)  # ensures an admissible initial guess
  } else {
    init[1] <- log(itau)
  }
  return(init)  # return the guess explicitly
}

estimate <- nlminb(guess(x), nll.clayton)
(parameter <- -1 + exp(estimate$par))  # Retrieve estimated parameter
fitCopula(claytonCopula(), x)          # Compare with fitCopula function
This works great when simulating data with small values of the copula parameter, and gives almost exactly the same answer as fitCopula() every time.
For large values of the copula parameter, such as 250, the following error shows up when I run the line with nlminb():
Error in .local(u, copula, log, ...) : parameter is NA
Called from: .local(u, copula, log, ...)
Error during wrapup: unimplemented type (29) in 'eval'
When I run fitCopula(), the optimization finishes, but this message pops up:
Warning message:
In dlogcdtheta(copula, u) :
  dlogcdtheta() returned NaN in column(s) 1 for this explicit copula; falling back to numeric derivative for those columns
I have been able to find out using debug() that somewhere in the optimization process of nlminb, the parameter of interest is assigned the value NaN, which then yields this error when dCopula() is called. However, I do not know at which iteration it happens, and what nlminb() is doing when it happens. I suspect that perhaps at some iteration, the objective function is evaluated at Inf/-Inf, but I do not know what nlminb() does next. Also, something similar seems to happen with fitCopula(), but the optimization is still carried out to the end, only with the abovementioned warning.
I would really appreciate any help in understanding what is going on, how I might debug it myself and/or how I can deal with the problem. As might be evident from the question, I do not have a strong background in coding. Thank you so much in advance to anyone that takes the time to consider this problem.
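One simple way to narrow this down (a sketch; nll.traced is just a throwaway wrapper name) is to print each parameter value before the objective is evaluated, so that the last line printed before the error shows the value that triggered it:
nll.traced <- function(theta) {
  cat("theta =", theta, " theta_trans =", -1 + exp(theta), "\n")
  nll.clayton(theta)
}
estimate <- nlminb(guess(x), nll.traced)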
Update:
When I run dCopula(x, claytonCopula(-1+exp(guess(x)))) or, equivalently, clayton_density(x, -1+exp(guess(x))), it becomes apparent that the density evaluates to 0 at several data points. Unfortunately, creating pseudo-observations with x <- pobs(x) does not solve the problem, as can be seen by repeating dCopula(x, claytonCopula(-1+exp(guess(x)))). The result is that, when the logarithm is applied, we get several -Inf evaluations, which of course means the whole negative log-likelihood evaluates to Inf, as can be seen by running nll.clayton(guess(x)). Hence, in addition to the above queries, any tips on handling log(0) when doing MLE numerically are welcome and appreciated.
Second update
Editing the second line in nll.clayton as follows seems to work okay:
nll <- -sum(log(clayton_density(x, theta_trans) + 1e-8))
However, I do not know if this is a "good" way to circumvent the problem, in the sense that it does not introduce potential for large errors (though it would surprise me if it did).
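A possible alternative to adding a small constant: dCopula() has a log argument, so the log-density can be requested directly instead of taking log() of densities that may already have underflowed to 0. A hedged sketch (nll.clayton.log is just an illustrative name, and whether this avoids the underflow depends on how the copula package evaluates the Clayton log-density internally):
nll.clayton.log <- function(theta) {
  theta_trans <- -1 + exp(theta)  # admissible theta values, as before
  # request the log-density directly rather than log()-ing the density
  -sum(dCopula(x, claytonCopula(theta_trans), log = TRUE))
}
If the log-density is computed from an explicit formula, this avoids the log(0) = -Inf evaluations entirely; if it is merely log() of the ordinary density, it will not help, in which case the small-constant workaround above remains a pragmatic fallback.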

Simplify the division of normal cumulative distribution functions

I'm struggling with how to simplify the quotient of two normal cumulative probability functions in R. I'm calculating a conditional skew-normal density, and I have the division of these two functions:
pnorm(alpha0+t(alpha2)%*%chol2inv(chol(omega2))%*%t(y2-xi2.1))/pnorm(tau2.1)
where alpha0+t(alpha2)%*%chol2inv(chol(omega2))%*%t(y2-xi2.1) and tau2.1 evaluate to real numbers. For example, sometimes I have pnorm(-50)/pnorm(-40), i.e. an indeterminate 0/0. But these values are not really zero; R is just rounding them to zero (underflow). I tried to use the erf function, but I got the same problem (0/0).
Any hint on how can I overcome this issue?
pnorm has a log.p argument, which makes it return log(p). Change your expression to exp(log(p1) - log(p2)):
exp(pnorm(-50, log.p = TRUE) - pnorm(-40, log.p = TRUE))
#[1] 2.95577e-196
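Applied to the expression from the question, the same pattern becomes (a sketch reusing the question's own objects and assuming, as stated there, that the matrix expression reduces to a single real number):
num <- alpha0 + t(alpha2) %*% chol2inv(chol(omega2)) %*% t(y2 - xi2.1)
exp(pnorm(num, log.p = TRUE) - pnorm(tau2.1, log.p = TRUE))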

Replacing negative values in a model (system of ODEs) with zero

I'm currently working on solving a system of ordinary differential equations using deSolve, and was wondering if there's any way of preventing differential variable values from going below zero. I've seen a few other posts about setting negative values to zero in a vector, data frame, etc., but since this is a biological model (and it doesn't make sense for a T cell count to go negative), I need to stop it from happening to begin with so these values don't skew the results, not just replace the negatives in the final output.
My standard approach is to transform the state variables to an unconstrained scale. The most obvious/standard way to do this for positive variables is to write down equations for the dynamics of log(x) rather than of x.
For example, with the Susceptible-Infected-Recovered (SIR) model for infectious disease epidemics, where the equations are dS/dt = -beta*S*I; dI/dt = beta*S*I-gamma*I; dR/dt = gamma*I we would naively write the gradient function as
gfun <- function(time, y, params) {
  g <- with(as.list(c(y, params)),
            c(-beta*S*I,
              beta*S*I - gamma*I,
              gamma*I))
  return(list(g))
}
If we make log(I) rather than I be the state variable (in principle we could do this with S as well, but in practice S is much less likely to approach the boundary), then we have d(log(I))/dt = (dI/dt)/I = beta*S-gamma; the rest of the equations need to use exp(logI) to refer to I. So:
gfun_log <- function(time, y, params) {
  g <- with(as.list(c(y, params)),
            c(-beta*S*exp(logI),
              beta*S - gamma,
              gamma*exp(logI)))
  return(list(g))
}
(it would be slightly more efficient to compute exp(logI) once and store/re-use it rather than computing it twice ...)
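A hedged sketch of how the log-transformed gradient function could be run with deSolve (the parameter values, initial state and time grid are made up for illustration; the state element has to be named logI so that the with() lookup inside gfun_log works):
library(deSolve)
params <- c(beta = 0.5, gamma = 0.1)
y0 <- c(S = 0.99, logI = log(0.01), R = 0)  # start I on the log scale
out <- ode(y = y0, times = seq(0, 100, by = 1), func = gfun_log, parms = params)
head(exp(out[, "logI"]))  # back-transform to recover I itself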
If a value does not become negative in reality but does in your model, you should change your model or, equivalently, modify your differential equations so that this is not possible. In other words: do not try to constrain your dynamical variables; constrain their derivatives instead. Anything else will only lead to problems with your solver, whereas the solver does not care about a change in the differential equation itself.
For a simple example, suppose that:
you have a one-dimensional differential equation ẏ = f(y),
y shall not become negative,
your initial y is positive.
In this case, y can only become negative if f(0) < 0. Thus, all you have to do is to modify f such that f(0) ≥ 0 (and it is still smooth).
As a proof of principle, you can multiply f by an appropriately modified sigmoid function (sigmoids allow you to compose every logical operation out of smooth functions). This way, nothing changes for most values of y, and you only alter your differential equation when y is close to 0, i.e., where you were going to intervene anyway.
However, I would not really recommend using sigmoids without thinking about your model. If your model is totally wrong near y = 0, it will very likely already be useless for nearby values. If your simulations venture into this terrain and you want the results to be meaningful, you should fix this.
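To make this concrete, here is a toy sketch of the idea (my own example, not taken from the question): a tank with constant inflow a and constant withdrawal d > a, whose naive right-hand side is negative at V = 0. The withdrawal term is multiplied by the smooth switch V^2/(V^2 + eps^2), which plays the role of the appropriately modified sigmoid: it is essentially 1 away from zero and exactly 0 at V = 0.
library(deSolve)
p <- c(a = 0.2, d = 1, eps = 1e-3)
# naive model: dV/dt = a - d is negative at V = 0, so V can be driven below zero
f_naive <- function(t, state, p) with(as.list(c(state, p)), list(a - d))
# modified model: the withdrawal shuts off smoothly as the tank empties
f_safe <- function(t, state, p) {
  with(as.list(c(state, p)), list(a - d * V^2 / (V^2 + eps^2)))
}
out <- ode(y = c(V = 1), times = seq(0, 5, by = 0.01), func = f_safe, parms = p)
min(out[, "V"])  # stays non-negative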

R: one-dimensional optimization

I would like to use optimize(), or something similar, to search for a minimum or maximum of a function. However, I am unsure about the exact range over which the function should be optimized, and this range is a required argument of optimize() (e.g. optimize(f=FUN,interval=c(lowerBound,upperBound))).
In this optimization problem, I am able to estimate a value a that is "close" to the optimal solution, but "closeness" depends on the situation.
Is there a function in R that can start from the initial value a and does not require the interval over which the function is optimized to be specified up front?
When you say you're not sure about the lower limit, I suspect that this means that the parameter you are trying to estimate is not bounded below.
If this is the case, one trick is to transform the function so that there is a lower bound on the parameter.
This trivial function has a minimum at x=4:
fun <- function(x) -exp(-(x - 4)^2) + 8
which we can find via:
optimize(f=fun,interval=c(0,8))
#> $minimum
#> [1] 4
but let's pretend for a moment that we're not sure if there is a lower limit or not, and that we know that the upper limit is 8. R will throw an error if we try:
optimize(f=fun,interval=c(-Inf,8))
because the bounds must be finite. In this case, we can use the exponential transformation (exp()) which maps
the real numbers to the positive numbers, like so:
optimize(f=function(x)fun(log(x)),
interval=exp(c(-Inf,8)))
#> $minimum
#> [1] 54.59815
and then, to recover the location of the minimum on the original scale, you just need to back-transform the solution via:
log(54.59815)
#> 4
If you don't know either the upper or lower bound on the underlying parameter, then you can use the log-odds transformation in place of the log():
function(x) log(x/(1-x))
and it's inverse in place of exp():
function(y) exp(y)/(1 + exp(y))
Note that the log-odds transformation maps the real numbers onto the unit interval, so the interval parameter becomes 0:1.
These solutions do have some numerical limitations (e.g. if we had set interval=exp(c(-Inf,16)) in the first solution, we would have gotten an error). Tip: you can re-scale these transformations to center around a given point a, which can reduce the numerical limitations.
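For completeness, a hedged sketch of the log-odds variant applied to the same toy function, using R's built-in logistic transformation qlogis() and its inverse plogis(), which are equivalent to the hand-written versions above:
# optimize over the unit interval; qlogis() maps (0, 1) back onto the real line
res <- optimize(f = function(p) fun(qlogis(p)), interval = c(0, 1))
res$minimum          # about 0.982, i.e. plogis(4)
qlogis(res$minimum)  # back-transform: about 4, the location of the minimum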

Decimal points - probability value of 0 in R

How should very small p-values be treated in R?
I am expecting very low p-values, like:
1.00E-80
and I need to take -log10 of them:
-log10(1.00E-80)
-log10(0) is Inf, but that Inf is only an artifact of rounding.
It seems that beyond about 1.00E-308, R yields 0:
1/10^308
[1] 1e-308
1/10^309
[1] 0
Is the precision with which lm displays p-values limited by the same cutoff, 1e-308, or is it simply that some cutoff is needed, so that I should pick a different one, such as 1e-100, and replace 0 with <1e-100?
There are a variety of possible answers -- which one is most useful depends on the context:
R is indeed incapable under ordinary circumstances of storing floating-point values closer to zero than .Machine$double.xmin, which varies by platform but is typically (as you discovered) on the order of 1e-308. If you really need to work with numbers this small and can't find a way to work on the log scale directly, you need to search Stack Overflow or the R wiki for methods for dealing with arbitrary/extended precision values (but you probably should try to work on the log scale -- it will be much less of a hassle)
in many circumstances R actually computes p-values on the (natural) log scale internally, and can, if requested, return the log values rather than exponentiating them before giving the answer. For example, dnorm(-100,log=TRUE) gives -5000.919. You can convert directly to the log10 scale (without exponentiating and then using log10) by dividing by log(10): dnorm(-100,log=TRUE)/log(10) is about -2172, a value whose antilog is far too small to represent in floating point. For the p*** (cumulative distribution function) functions, use log.p=TRUE rather than log=TRUE. (This particular point depends heavily on your context. Even if you are not using built-in R functions, you may be able to find a way to extract results on the log scale.)
in some cases R presents p-value results as being <2.2e-16 even when a more precise value is known: (t1 <- t.test(rnorm(10,100),rnorm(10,80)))
prints
....
t = 56.2902, df = 17.904, p-value < 2.2e-16
but you can still extract the precise p-value from the result
> t1$p.value
[1] 1.856174e-18
(in many cases this behaviour is controlled by the format.pval() function)
An illustration of how all this would work with lm:
d <- data.frame(x=rep(1:5,each=10))
set.seed(101)
d$y <- rnorm(50,mean=d$x,sd=0.0001)
lm1 <- lm(y~x,data=d)
summary(lm1) prints the p-value of the slope as <2.2e-16, but if we use coef(summary(lm1)) (which does not use the p-value formatting), we can see that the value is 9.690173e-203.
A more extreme case:
set.seed(101); d$y <- rnorm(50,mean=d$x,sd=1e-7)
lm2 <- lm(y~x,data=d)
coef(summary(lm2))
shows that the p-value has actually underflowed to zero. However, we can still get an answer on the log scale:
tval <- coef(summary(lm2))["x","t value"]
log10(2) + pt(abs(tval), df = 48, lower.tail = FALSE, log.p = TRUE)/log(10)
gives roughly -346, i.e. a two-sided p-value on the order of 1e-346. (Note that the doubling for the two-sided test has to be added on the log scale as log10(2) rather than by multiplying the log-probability by 2. You can check this approach with the previous example, where the p-value does not underflow, and see that you get the same answer as printed in the summary.)
Small numbers are generally hard to deal with.
The point at which R underflows to zero comes from the use of double-precision floating point:
From ?double: All R platforms are required to work with values conforming to the IEC 60559 (also known as IEEE 754) standard. This basically works with a precision of 53 bits, and represents to that precision a range of absolute values from about 2e-308 to 2e+308.
http://en.wikipedia.org/wiki/Double_precision_floating-point_format
You may find the Rmpfr package helpful here as it allows you to create multiple precision numbers.
install.packages("Rmpfr")
library(Rmpfr)
# Note: 1/10^309 already underflows to 0 in double precision before mpfr() ever
# sees it, so the small number has to be constructed as an mpfr object:
log(1/mpfr(10, precBits = 500)^309)
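Tying this back to the -log10 use in the question (a sketch, again assuming the tiny value is built up inside mpfr rather than passed in as an already-underflowed double):
p <- 1/mpfr(10, precBits = 500)^309   # about 1e-309, representable in mpfr
-log(p)/log(10)                       # -log10(p), i.e. roughly 309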
