I'm implementing a function in R that finds the parameters minimizing the quadratic error for an arbitrary cumulative distribution function (e.g., pnorm, punif, pbinom). The design is such that the user provides probabilities, empirical quantiles, a function name (densit), and a list of parameter names (params) of the densit function (e.g., mean and sd for pnorm). For instance, with q1, q2, and q3 equal to -1.644854, 0, and 1.644854, and p1, p2, and p3 equal to 0.05, 0.5, and 0.95, the function should recover mean and sd as approximately 0 and 1, respectively.
The goal is to feed such an expression to optim, so that the latter takes a function (called quadraticFun internally) that evaluates the expression during optimization; however, I can't make optim work with this expression for the estimation. The function is this:
findParams <- function(p1 = 0.05, p2 = 0.50, p3 = 0.95, q1, q2, q3,
                       output = "complete", densit, params) {
  densit <- substitute(densit)
  params <- sapply(params, as.name)
  densiCall1 <- as.call(c(as.list(densit), as.list(q1), params))
  densiCall2 <- as.call(c(as.list(densit), as.list(q2), params))
  densiCall3 <- as.call(c(as.list(densit), as.list(q3), params))
  quadratEq <- substitute((densiCall1 - p1)^2 +
                            (densiCall2 - p2)^2 +
                            (densiCall3 - p3)^2)
  quadraticFun <- function(params) {
    eval(quadratEq)
  }
  initVals <- rep(1, times = length(params))
  res <- optim(initVals, quadraticFun)  # ERROR SOMEWHERE AROUND HERE...
  # if output is "complete", return the whole res object; otherwise return just the parameters
  if (output == "parameters") {
    return(res$par)
  }
  return(res)
}
Debugging with the following example shows that the error occurs during the call optim(initVals, quadraticFun), specifically while evaluating quadraticFun, with this message:
> findParams(q1 = -1.644854, q2 = 0, q3 = 1.644854, densit = pnorm, params = list("mean", "sd"))
Error in pnorm(-1.644854, mean, sd) :
Non-numeric argument to mathematical function
It seems that the resulting expression is taken literally by the interpreter, i.e., mean and sd are treated as symbols to evaluate rather than as arguments of the function being optimized.
Thanks in advance for any hint that helps to solve this problem.
Your function quadraticFun was not working because it did not properly hand the params values over to the densiCall1 expression.
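To see why, note that the bare symbols mean and sd in the built expression resolve to the base functions of those names, not to numeric values; a minimal illustration (mine, not from the original post):
# eval() finds the functions base::mean and stats::sd on the search path,
# and pnorm() rejects them as non-numeric arguments.
e <- quote(pnorm(-1.644854, mean, sd))
eval(e)
# Error in pnorm(-1.644854, mean, sd) : Non-numeric argument to mathematical function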
I changed the code a bit so that there is only one call that needs to be evaluated, which is easier to handle. densit is now a character string; I am sure this could be changed (see the sketch after the output below).
Here is a function I came up with:
findParams <- function(q = c(-1.644854, 0, 1.644854),
                       p = c(0.05, 0.50, 0.95), output = "complete",
                       densit = "pnorm", params = c("mean", "sd")) {
  l <- length(params)
  # Build the call densit(q = q, <param1> = ., <param2> = .) as a list,
  # then convert it into a call object.
  cl <- vector("list", 2 + l)
  cl[[1]] <- as.name(densit)
  cl[[2]] <- q
  names(cl) <- c(NA, "q", params)
  mode(cl) <- "call"
  quadraticFun <- function(x) {
    cl[3:(l + 2)] <- x  # plug the candidate parameter values into the call
    res <- eval(cl)
    sum((res - p)^2)
  }
  initVals <- rep(1, times = l)
  res <- optim(initVals, quadraticFun)
  if (output == "parameters") {
    return(res$par)
  }
  return(res)
}
And it seems to work:
findParams()
$par
[1] 0.0001065349 1.0001052494
$value
[1] 2.682477e-09
$counts
function gradient
67 NA
$convergence
[1] 0
$message
NULL
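As noted above, densit could just as well be passed as a function rather than a character string; a minimal variant sketch (mine, untested beyond the defaults) using match.fun(), which accepts either form:
# Variant sketch: match.fun() resolves both pnorm and "pnorm", and
# do.call() builds the distribution-function call from named arguments.
findParams2 <- function(q = c(-1.644854, 0, 1.644854),
                        p = c(0.05, 0.50, 0.95),
                        densit = pnorm, params = c("mean", "sd")) {
  densit <- match.fun(densit)
  quadraticFun <- function(x) {
    args <- c(list(q), as.list(x))
    names(args) <- c("q", params)
    sum((do.call(densit, args) - p)^2)  # quadratic error at the quantiles
  }
  optim(rep(1, length(params)), quadraticFun)
}
# findParams2()$par should again be approximately c(0, 1)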
I want to write a function that will (hopefully) work in the raster calculator in the raster package. What I'm trying to do is regress each cell value against a vector of Time. There are multiple examples of this, but what I would like is for the method to try one type of regression (gls, controlling for AR1 residual errors) and, if for some reason that regression throws an error (perhaps there is no AR1 structure in the residuals), to fall back to simple OLS regression.
I've written two functions for the regression. One for gls:
# function for calculating the trend, variability, SNR, and residuals for each pixel
## this function will control for AR1 structure in the residuals
funTrAR1 <- function(x, ...) {
  if (sum(is.na(x)) >= 1) { NA } else {
    mod <- nlme::gls(x ~ Year, na = na.omit, method = "REML", verbose = TRUE,
                     correlation = nlme::corAR1(form = ~ Year, fixed = FALSE),
                     control = nlme::glsControl(tolerance = 1e-3, msTol = 1e-3,
                                                opt = c("nlminb", "optim"),
                                                singular.ok = TRUE,
                                                maxIter = 1000, msMaxIter = 1000))
    slope <- mod$coefficients[2]
    names(slope) <- "Trend"
    var <- sd(mod$residuals)
    names(var) <- "Variability"
    snr <- slope / var
    names(snr) <- "SNR"
    residuals <- c(stats::quantile(mod$residuals, probs = seq(0, 1, 0.25),
                                   na.rm = TRUE, names = TRUE, type = 8),
                   base::mean(mod$residuals, na.rm = TRUE))
    names(residuals) <- c("P0", "P25", "P50", "P75", "P100", "AvgResid")
    return(c(slope, var, snr, residuals))
  }
}
and for OLS:
# function for calculating the trend, variability, SNR, and residuals for each pixel
## this function performs simple OLS
funTrOLS <- function(x, ...) {
  if (sum(is.na(x)) >= 1) { NA } else {
    mod <- lm(x ~ Year, na.action = na.omit)
    slope <- mod$coefficients[2]
    names(slope) <- "TrendOLS"
    var <- sd(mod$residuals)
    names(var) <- "VariabilityOLS"
    snr <- slope / var
    names(snr) <- "SNROLS"
    residuals <- c(stats::quantile(mod$residuals, probs = seq(0, 1, 0.25),
                                   na.rm = TRUE, names = TRUE, type = 8),
                   base::mean(mod$residuals, na.rm = TRUE))
    names(residuals) <- c("P0", "P25", "P50", "P75", "P100", "AvgResid")
    return(c(slope, var, snr, residuals))
  }
}
I'm trying to wrap these in a tryCatch expression that can be passed to raster::calc:
xReg <- tryCatch(
  {
    funTrAR1
  },
  error = function(e) {
    ## this should create a text file if a model throws an error
    sink(paste0(inDir, "/Outputs/localOLSErrors.txt"), append = TRUE)
    cat(paste0("Used OLS regression (grid-cell) for model: ", m,
               ". Scenario: ", t, ". Variable: ", v,
               ". Realisation/Ensemble: ", r, ". \n"))
    sink()
    ## run the second regression function
    funTrOLS
  }
)
This function is then passed to raster::calc like so
cellResults <- calc(rasterStack, fun = xReg)
Which, if everything works, will produce a raster stack of the output variables.
However, for some of my datasets the loop that I'm running all of this in stops and I receive the following error:
Error in nlme::gls(x ~ Year, na = na.omit, method = "REML", verbose = TRUE, :
false convergence (8)
This comes directly from nlme::gls and is exactly what I was hoping to avoid. I've never used tryCatch before (so this might be very obvious), but does anyone know how to get tryCatch() to move on to the second regression function when the first (AR1) regression fails?
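First, the reason the wrapper above can't switch models: tryCatch() protects only the evaluation of its first argument, and the expression funTrAR1 merely names the function without calling it, so no error can be raised there and the handler is dead code. A quick illustration (using the functions above):
# Naming a function never signals an error; nothing here is run on data,
# so the error handler can never fire.
xReg <- tryCatch(funTrAR1, error = function(e) funTrOLS)
identical(xReg, funTrAR1)  # TRUE -- the AR1 errors only surface later, inside calc()
The fallback therefore has to happen per cell, inside a function that calc() runs.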
Here is another way to code this; perhaps it helps:
xReg <- function(x, ...) {
  r <- try(funTrAR1(x, ...), silent = TRUE)
  # if (class(r) == 'try-error') {
  if (!is.numeric(r)) {  # perhaps a faster test than the one above
    r <- c(funTrOLS(x, ...), 2)
  } else {
    r <- c(r, 1)
  }
  r
}
This adds a layer that shows which model was used for each cell (1 = AR1, 2 = OLS).
You can also do the reverse: always compute the OLS fit, then overwrite it if the AR1 fit succeeds (at the cost of always paying for the OLS fit):
xReg <- function(x, ...) {
  r <- funTrOLS(x, ...)
  try(r <- funTrAR1(x, ...), silent = TRUE)
  r
}
Or use calc twice and use cover after that
xReg1 <- function(x, ...) {
  r <- rep(NA, 9)  # must match the length of funTrAR1's output (3 stats + 6 residual summaries)
  try(r <- funTrAR1(x, ...), silent = TRUE)
  r
}
xReg2 <- function(x, ...) {
  funTrOLS(x, ...)
}
a <- calc(rasterStack, xReg1)
b <- calc(rasterStack, xReg2)
d <- cover(a, b)
And a will show you (as NA cells) where xReg1 failed.
After doing a bit more reading, and also looking at @RobertH's answer, I wrote a bit of (very) ugly code that checks whether GLS will fail and, if it does, performs OLS instead. I'm positive that there is a nicer way to do this, but it works, maintains the raster layer names as defined in my functions, and exports any errors to a txt file.
for (i in 1) {
  j <- tempCentredRas
  cat(paste("Checking to see if gls(AR1) will work for model", m, r,
            "cell based calculations\n", sep = " "))
  ### This check is particularly annoying, as it has to run for every grid cell:
  ### GLS/OLS is performed on every grid cell twice --
  ### first to check whether GLS will fail, and then again with OLS (if it failed)
  ### or GLS (if it did not).
  possibleLocalError <- tryCatch(
    raster::calc(j, fun = funTrAR1),
    error = function(err) err
  )
  if (inherits(possibleLocalError, "error")) {
    cat(paste("GLS regression failed for model", m, r,
              "using OLS instead for cell based results.", "\n", sep = " "))
    cellResults <- raster::calc(j, fun = funTrOLS)
  } else {
    cellResults <- raster::calc(j, fun = funTrAR1)
  }
}
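For reference, a single-pass sketch (mine, untested) that combines @RobertH's per-cell try() with the named outputs, so that only failing cells fall back to OLS and the whole raster is not fit twice:
# Per-cell fallback: the names set inside funTrAR1/funTrOLS are preserved
# in the output layers, and each pixel is fit at most twice.
xRegNamed <- function(x, ...) {
  r <- try(funTrAR1(x, ...), silent = TRUE)
  if (inherits(r, "try-error")) r <- funTrOLS(x, ...)
  r
}
cellResults <- raster::calc(j, fun = xRegNamed)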
I would like to write an estimation function to estimate model parameters. I wrote my function without any errors; however, optim does not work as expected. When I run my code line by line, my condition returns warning messages ("the condition has length > 1 and only the first element will be used"), so I think this is the problem that makes optim misbehave. When I run the optim call itself, I get this:
Error in optim(par = start.parm, fn = t_LL, method = "L-BFGS-B", lower = low, :
  object 'low' not found
However, I am not sure.
Here is my code:
library(VineCopula)
simdata <- BiCopSim(300, 5, -2)
## I call my function like this:
Myfun <- MLE(simdata, family = 5, par = -2, par2 = 0)
MLE <- function(data, family, par, par2) {
  n <- dim(data)[1]
  start.parm <- c(par, par2)
  if (family %in% c(3, 13)) {
    low <- 1e-04
    up <- 100
  } else if (family %in% c(4, 14)) {
    low <- 1.0001
    up <- 100
  } else if (family %in% c(5)) {
    low <- -100
    up <- 100
  }
  t_LL <- function(param, family, start, start2) {
    start <- param[[1]]
    start2 <- param[[2]]
    ll <- sum(log(BiCopPDF(data[, 1], data[, 2], family, start, start2)))
    return(ll)
  }
  optimout <- optim(par = start.parm,
                    fn = t_LL,
                    family = family,
                    start = start,
                    start2 = start2,
                    method = "L-BFGS-B",
                    lower = low,
                    upper = up,
                    control = list(fnscale = -1, maxit = 500))
  out <- list()
  out$par <- optimout$par[1]
  out$value <- optimout$value
  out
}
Any help, please?
The problem in your real case is that you are: 1) passing a vector argument as family; 2) only its first value gets used in the if block; 3) that value matches none of the values you check for; and 4) as a result, low and up never get assigned.
If you want to be able to pass a vector argument for family, take a look at ifelse to replace your if block. If not, throw an error when length(family) > 1, and add an else clause to your if block that throws an error when the given family doesn't match any of your choices; a sketch of that second option follows.
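A sketch of such a check (the helper name and structure are mine, not from the original code):
# Validate `family` up front and fail loudly instead of letting `low`
# and `up` silently go undefined.
boundsFor <- function(family) {
  if (length(family) != 1) stop("'family' must be a single value")
  switch(as.character(family),
         "3" = , "13" = list(low = 1e-04, up = 100),
         "4" = , "14" = list(low = 1.0001, up = 100),
         "5" = list(low = -100, up = 100),
         stop("no bounds defined for family ", family))
}
# Inside MLE: b <- boundsFor(family), then pass lower = b$low, upper = b$up to optim().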
I would like to compute an integral where the integrand is a function of the solution of an ODE.
In order to solve the integral, R needs to solve an ODE for each value the integration algorithm uses. This is what I have done so far:
require(deSolve)
# Function to be passed to zvode in order to solve the ODE
ODESR <- function(t, state, parameters) {
  with(as.list(c(state, parameters)), {
    dPSI <- -kappa * PSI + 0.5 * sigma^2 * PSI^2
    dPHI <- kappa * theta * PSI
    return(list(c(dPSI, dPHI)))
  })
}
# For a given value of p this code should return the solution of the integral
pdfSRP <- function(p) {
  integrand <- function(u) {
    state <- c(PSI = u * 1i, PHI = 0)
    out <- as.complex(zvode(y = state, times = times, parms = parameters,
                            fun = ODESR)[2, 2:3])
    Re(exp(out[2] + out[1] * x) * exp(-u * 1i * p))
  }
  integrate(f = integrand, lower = -Inf, upper = Inf)$value / (2 * pi)
}
For the following given values:
parameters <- c(kappa = 1, theta = 0.035, sigma = 0.05)
times <- c(0,1)
x <- 0.1
running:
pdfSRP(p = 2)
produces the following error:
Error in eval(expr, envir, enclos) : object 'PSI' not found
I just cannot figure out why. I'm quite sure it is due to a syntax error, because running:
integrand <- function(u) {
  state <- c(PSI = u * 1i, PHI = 0)
  out <- as.complex(zvode(y = state, times = times, parms = parameters,
                          fun = ODESR)[2, 2:3])
  Re(exp(out[2] + out[1] * x) * exp(-u * 1i * p))
}
with p <- 2 and (for example) u <- 3 works.
Can you help me spot the mistake?
It seems to be a vectorization problem with the integrand's input u. If I understand correctly, PSI should be a single number for each calculation, not a vector of numbers (which would create a dimension mismatch between PSI and PHI). Hence
integrand <- Vectorize(integrand)
should resolve your issue. From ?integrate:
f must accept a vector of inputs and produce a vector of function evaluations at those points.
However, this leads to a different error.
pdfSRP(p = 2)
## Error in integrate(f = integrand, lower = -Inf, upper = Inf) :
## the integral is probably divergent
If we plot the integrand, we may spot the divergence problem
p <- 2
par(mfrow = c(1,2))
curve(integrand,-1e3,1e3,n = 100)
curve(integrand,-1e3,1e3,n = 1e3)
Assuming the integrand converges to zero sufficiently fast in both tails, the reported divergence may simply be numerical imprecision. We can increase precision by increasing the number of subintervals for the integral, which does give a result, and one that looks plausible given the plot:
pdfSRP <- function(p) {
  int <- integrate(f = integrand, lower = -Inf, upper = Inf,
                   subdivisions = 1e3)
  int$value / (2 * pi)
}
pdfSRP(p = 2)
## [1] 2.482281e-06
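If more precision is needed, integrate() also exposes rel.tol, abs.tol, and stop.on.error alongside subdivisions; a sketch with illustrative values (not from the original answer):
pdfSRP <- function(p) {
  int <- integrate(f = integrand, lower = -Inf, upper = Inf,
                   subdivisions = 1e3, rel.tol = 1e-10,
                   stop.on.error = FALSE)  # return the estimate even if the
                                           # divergence heuristic triggers
  int$value / (2 * pi)
}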
The example is from the rootSolve package:
We have this function:
gradient(f, x, centered = FALSE, pert = 1e-8, ...)
in which f is a function and x is the data input in the form of a vector.
Now the following is an instance of the code being run:
logistic <- function(x, times) {
  with(as.list(x), {
    N <- K / (1 + (K - N0) / N0 * exp(-r * times))
    return(c(N = N))
  })
}
# parameters for the US population from 1900
x <- c(N0 = 76.1, r = 0.02, K = 500)
# Sensitivity function: SF: dfi/dxj at
# output intervals from 1900 to 1950
SF <- gradient(f = logistic, x, times = 0:50)
My question is: how does the code know to use times in its routine? It's not defined globally, and it is not part of gradient's argument list either. Is it possible to pass inputs to a function when they are not defined in its signature? Does ... play a role here?
... is just a way of accepting extra arguments and passing them on to another function.
Simple example:
power.function <- function(x, power) { x^power }
apply.function <- function(f, data, ...) { f(data, ...) }
sample <- c(1, 2, 3)
apply.function(power.function, sample, power = 3)
# which is the same as
apply.function(power.function, sample, 3)
produces
> apply.function(power.function, sample, 3)
[1] 1 8 27
EDIT
To make it crystal clear, if you look at the source of the rootSolve::gradient you'll see the definition as
function (f, x, centered = FALSE, pert = 1e-08, ...)
and further down the call to
reff <- f(x, ...)
which is the same as described above in the example.
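To make the mechanics concrete, here is a simplified forward-difference sketch (mine, not the actual rootSolve source) showing how times travels through ... into every evaluation of f:
gradient_sketch <- function(f, x, pert = 1e-8, ...) {
  reff <- f(x, ...)  # reference evaluation; `...` carries times = 0:50
  sapply(seq_along(x), function(j) {
    xp <- x
    xp[j] <- xp[j] + pert        # perturb parameter j
    (f(xp, ...) - reff) / pert   # column j approximates df_i/dx_j
  })
}
# gradient_sketch(logistic, x, times = 0:50) approximates
# rootSolve::gradient(f = logistic, x, times = 0:50)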
I am trying to use DEoptim with the fnMap parameter, which is, as stated in the documentation, "an optional function that will be run after each population is created". So I created this simple test case:
fnm <- function(x) round(x, 2)
fn <- function (x) x ^ 2
upper <- 100
lower <- -100
DEoptim(fn=fn, lower=lower, upper=upper, fnMap=fnm)
The problem is that when I use the fnMap parameter it returns the error "mapping function did not return an object with dim NP x length(upper)" for any kind of mapping function.
The expected return value has dim NP x length(upper), and your upper has length 1. You can try to set NP to 1 using control. For example...
DEoptim(fn=fn, lower=lower, upper=upper, fnMap=fnm,control=DEoptim.control(NP=1))
but you'll get this warning...
Warning in DEoptim(fn = fn, lower = lower, upper = upper, fnMap = fnm, control = DEoptim.control(NP = 1)) :
'NP' < 4; set to default value 10*length(lower)
but this tells us that NP defaults to 10 * length(lower), which is useful. You just need to redefine your function to
fnm <- function(x) matrix(round(x, 2), nrow = 10, ncol = 1, byrow = TRUE)
but this assumes you always have a length-1 parameter set. I would probably do something like this
fnm <- function(x, Len) matrix(round(x, 2), nrow = 10 * Len, ncol = Len, byrow = TRUE)
and then call DEoptim like this
DEoptim(fn = fn, lower = lower, upper = upper, fnMap = function(x) fnm(x, length(upper)))
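A minimal end-to-end run of that suggestion (illustrative; it assumes the default NP of 10 * length(lower), i.e. 10 rows here):
library(DEoptim)
fn    <- function(x) x^2
fnm   <- function(x, Len) matrix(round(x, 2), ncol = Len, byrow = TRUE)
lower <- -100
upper <- 100
res <- DEoptim(fn = fn, lower = lower, upper = upper,
               fnMap = function(x) fnm(x, length(upper)),
               control = DEoptim.control(trace = FALSE))
res$optim$bestmem  # near 0, and rounded to 2 decimals by the mapping function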