I am trying to use the optimx function in R, but keep getting the error message:
Error in optimx.check(par, optcfg$ufn, optcfg$ugr, optcfg$uhess, lower, :
Cannot evaluate function at initial parameters
I have looked at another Stack Overflow question, "R - Optimx for exponential function with 2 parameters - cannot evaluate function at initial parameter values", but that solution has not worked for me.
Here is my test data:
t <- seq(from=1,to=60,by=1)
len <- 100*(1-exp(-0.2*(t-0)))
t.data<-data.frame(t,len)
Starting values for par in the optimx function:
p <- as.vector(c(30,110,0.3,1.0))
Ages used in the function below:
Age1 <- 1 #### a young age
Age2 <- 50 #### an old age
The function to be minimized:
schnute_richards <- function(p,data)   # p = c(Len1, Len2, K, R)
{
zero <- p[1]^p[4] # Len1^R
one <- p[2]^p[4]-p[1]^p[4] # (Len2^R-Len1^R)
two <- 1-exp(-p[3]*(data$t-Age1)) # (1-EXP(-K*(ObsAge-Age1)))
three <- 1-exp(-p[3]*(Age2-Age1)) # (1-EXP(-K*(Age2-Age1)))
pred <- (zero + one*(two/three))^(1/p[4]) # final equation
sum((data$len-pred)^2)
}
The optimx call:
temp <-optimx(p,function (x) schnute_richards(x[1],x[2]))
I tried other versions of this code, but I get the same error message. This call is the form that was given as the solution in the other Stack Overflow question I referred to earlier.
Thanks for any help.
Study help(optimx). The function passed as fn must take the full parameter vector as its first argument, and any further arguments (such as your data) are forwarded to it through ..., so you can pass schnute_richards directly and supply the data as a named argument:
library(optimx)
temp <- optimx(p, schnute_richards, data = t.data)
# p1 p2 p3 p4 value fevals gevals niter convcode kkt1 kkt2 xtimes
#Nelder-Mead 18.12639 99.99589 0.1999604 1.0005907 7.282821e-05 475 NA NA 0 FALSE TRUE 0.03
#BFGS 18.12844 99.99493 0.2000565 0.9993415 6.034452e-05 82 20 NA 0 FALSE TRUE 0.01
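If you then want the fitted parameters themselves, they are just columns of the data frame that optimx returns (one row per method, as in the output above); a small follow-up sketch:
# pull the fitted parameters out of one method's row
p_fit <- unlist(temp["BFGS", c("p1", "p2", "p3", "p4")])
p_fit   # fitted Len1, Len2, K, R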
Using R, I am estimating a simple equation by least squares in which the lagged dependent variable appears as an independent (explanatory, right-hand-side) variable. I want to forecast out of sample and use the forecasts of the dependent variable as the lag for each subsequent step ahead. That is, I want to extend forecasts of y beyond the data period:
a <- lm( y ~ x + lag(y,1), data= dset1)
b <- forecast(a,newdata=dset2)
where dset2 has the full period of extra x variables, but not the lagged y.
Here is an example using the AirPassengers data set, where dset2 was created with some missing ap data. The results below show that only row 143 gets filled in, not 144, because forecast did not have the row-143 value available to use as the lag for row 144.
I looked at the dyn, dynlm, and forecast packages, but none seem to work with this type of model. (I do not want to restate it as an ARMA or a VAR.)
What package can easily do this, or am I using forecast incorrectly?
I can loop and step ahead one period at a time (see the sketch at the end of this post), but I would rather not do that.
##Example case using airline data
library(dyn)       ## for dyn()
library(dynlm)     ## for dynlm()
library(forecast)  ## for forecast()
data("AirPassengers", package = "datasets")
ap <- log(AirPassengers)
ap <- as.ts(ap)
d1 <- data.frame(ap, index= as.Date(ap))
m1 <- lm(ap ~ lag(ap,1), data=d1)
m2 <- dynlm(ap ~ lag(ap,1), data=d1)
m3 <- dyn(lm(ap ~ lag(ap,1), data=d1))
summary(m3)
##Neither lm, dyn, nor dynlm objects worked as I want
## Try forecast missing values, 2 steps, rows 143 and 144
d2 <- d1
d2$apx = d2$ap
d2$apx[143:144]= NA
mx <- lm(apx ~ lag(apx,1), data=d2)
b <- forecast(mx,newdata=d2)
Results:
> b
Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
1 NA NA NA NA NA
2 4.756850 4.619213 4.894488 4.545513 4.968188
3 4.807218 4.669783 4.944653 4.596191 5.018245
....
140 6.411559 6.273546 6.549572 6.199644 6.623474
141 6.386407 6.248507 6.524306 6.174667 6.598146
142 6.216154 6.078941 6.353368 6.005467 6.426841
143 6.122453 5.985553 6.259354 5.912247 6.332659
144 NA NA NA NA NA
Other lm-like objects produced errors with forecast:
mx <- dynlm(apx ~ lag(apx,1), data=d2)
b <- forecast(mx,newdata=d2)
Error in forecast.lm(mx, newdata = d2) : invalid type/length
(symbol/0) in vector allocation
mx <- dyn(lm(apx ~ lag(apx,1), data=d2))
b <- forecast(mx,newdata=d2)
Error in predict.lm(object, newdata = newdata, se.fit = TRUE, interval
= "prediction", : formal argument "se.fit" matched by multiple actual arguments
I am trying to compute an integral in R. I have the following values:
nb <- 100
tb <- 25
ns <- 90
ts <- 15
A0 <- 1
S_norm <- 0.4
R <-tb/ts
y_meas <- (ns-nb/R)/A0
sigma_meas = sqrt(ns+(nb+1)/R^2)/A0
I am very confused about how I can integrate L(psi), say from -10 to 10, because I am integrating with respect to log A.
You can substitute a variable x for log A, and for a fixed value of psi you can integrate as follows:
psi <- 5
integrate(function(x) exp(-0.5*(((x/A0)/S_norm)^2 + ((psi-y_meas*A0/exp(x))/sigma_meas)^2)),
-10, 10)
# 0.1775989 with absolute error < 6.6e-05
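If you need the integral for a range of psi values (to trace out L(psi)), one way is to wrap the same call in sapply; a small sketch:
psi_grid <- seq(-10, 10, by = 0.5)
L_psi <- sapply(psi_grid, function(psi)
  integrate(function(x) exp(-0.5*(((x/A0)/S_norm)^2 + ((psi - y_meas*A0/exp(x))/sigma_meas)^2)),
            -10, 10)$value)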
On top of the excellent answer by @SandipanDey: if you can extend the limits to -Infinity...+Infinity, there is a better way to integrate functions with an e^(-x^2) kernel: Gauss-Hermite quadrature, and there is an R package for that.
Simple example:
library(gaussquad)
n.quad <- 128 # integration order
# get the particular (weights,abscissas) as data frame
# with 2 observables and n.quad observations
rule <- ghermite.h.quadrature.rules(n.quad, mu = 0.0)[[n.quad]]
# test function: integrate f(x) = 1 against the exp(-x^2) weight from -Inf to Inf
# should get sqrt(pi) as the answer
f <- function(x) {
1.0
}
q <- ghermite.h.quadrature(f, rule)
print(q - sqrt(pi))
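To connect this back to the question's integral: ghermite.h.quadrature approximates the integral of exp(-x^2)*f(x) over the whole real line, so one way to reuse the rule above is to multiply the original integrand by exp(x^2) to cancel the weight. A sketch, assuming the values (A0, S_norm, y_meas, sigma_meas) from the question are defined:
psi <- 5
g <- function(x) exp(-0.5*(((x/A0)/S_norm)^2 + ((psi - y_meas*A0/exp(x))/sigma_meas)^2))
# sum_i w_i * g(x_i) * exp(x_i^2)  ~  integral of g(x) dx over (-Inf, Inf)
ghermite.h.quadrature(function(x) g(x) * exp(x^2), rule)
# should be close to the integrate() result above, since g is negligible outside (-10, 10)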
I am writing my Master's final project, in which I am deriving the probability of default using the Black-Scholes-Merton model, and I have got stuck in the R code. Mathematically, I want to solve this system of nonlinear equations with the package nleqslv:
library(nleqslv)
T <- 1
D1 <- 20010.75
R <- 0.8516
sigmaS <- .11
SO1 <- 1311.74
fnewton <- function(x){
y <- numeric(2)
d1 <- (log(x[1]/D1)+(R+x[2]^2/2)*T)/x[2]*sqrt(T)
d2 <- d1 - x[2]*sqrt(T)
y[1] <- SO1 - (x[1]*pnorm(d1) - exp(-R*T)*D1*pnorm(d2))
y[2] <- sigmaS*SO1 - pnorm(d1)*x[2]*x[1]
y
}
xstart <- c(1311.74,0.11)
nleqslv(xstart, fnewton, method="Broyden")
# $x
# [1] 1311.74 0.11
# $fvec
# [1] 1311.7400 144.2914
# $termcd
# [1] 6
# $message
# [1] "Jacobian is singular (see allowSingular option)"
# $scalex
# [1] 1 1
# $nfcnt
# [1] 0
# $njcnt
# [1] 1
# $iter
# [1] 1
I have tried this with many values of the 5 inputs (stated above; I have computed them for 2 companies for different years), but I am not getting the final values of S0 and sigma V.
I am getting the message "Jacobian is singular (see allowSingular option)". If I allow a singular Jacobian using control=list(trace=1,allowSingular=TRUE), still no answer is displayed. I do not know how to obtain the solution for these 2 variables now.
I really don't know what I am doing wrong, as I based my model on Tetereva's slides (her model code is on slide no. 5), whose presentation is the first result when googling:
https://www.google.de/search?q=moodys+KMV+in+R&rlz=1C1SVED_enDE401DE401&aq=f&oq=moodys+KMV+in+R&aqs=chrome.0.57.13309j0&sourceid=chrome&ie=UTF-8#q=distance+to+default+in+R
Like me, though more successfully, she calculates the Distance to Default risk measure via the Black-Scholes-Merton approach. In this model, the value of equity (usually represented by the market capitalization, SO1 above) can be written as a European call option.
The other variables are:
x[1]: the variable I want to derive, value of total assets
x[2]: the variable I want to derive, volatility of total assets
D1: the book value of debt (19982009)
R: a riskfree interest rate
T: is set to 1 year (time)
sigmaS: estimated (historical) equity volatility
You should be able to use the initial values of SO1 and sigmaS as starting values for nleqslv.
First of all the R code given by Tetereva doesn't seem quite correct (the variable Z should be D1 as you have named it; similar changes for her S0 and D).
I have modified Tetereva's code into this:
library(nleqslv)
T <- 1
D1 <- 33404048
R <- 2.32
sigmaS <- .02396919
SO1 <- 4740291 # Ve?
fnewton <- function(x){
y <- numeric(2)
d1 <- (log(x[1]/D1)+(R+x[2]^2/2)*T)/x[2]*sqrt(T)
d2 <- d1 - x[2]*sqrt(T)
y[1] <- SO1 - (x[1]*pnorm(d1) - exp(-R*T)*D1*pnorm(d2))
y[2] <- sigmaS*SO1 - pnorm(d1)*x[2]*x[1]
y
}
xstart <- c(SO1,sigmaS)
nleqslv(xstart, fnewton, method="Broyden",control=list(trace=1))
nleqslv(xstart, fnewton, method="Newton",control=list(trace=1))
which will give the solution given by Tetereva. (I use trace=1 here just to check the iteration steps.)
I believe the value you give for R should be 8.516 and not the 0.8516 you used. Using your values for the parameters
T <- 1
D1 <- 20010.75
R <- 8.516 # modified
sigmaS <- .11
SO1 <- 1311.74
like this
xstart <- c(1311.74,0.11)
nleqslv(xstart, fnewton, method="Broyden")
nleqslv(xstart, fnewton, method="Newton")
Then running nleqslv with these values converges very quickly.
If one uses R <- 2.32 (like Tetereva) nleqslv will also converge albeit with more iterations.
I cannot help you with what R should actually be, but from Tetereva's presentation I assume R is in percentages. Since I don't have enough knowledge of the Black-Scholes model, I can't help you determine the correct values for the various parameters. That's up to you.
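For completeness, the two quantities you are after come back in the $x component of the nleqslv result (as in the output shown in the question); a small usage sketch:
sol <- nleqslv(xstart, fnewton, method = "Newton")
sol$x[1]    # solved value of total assets
sol$x[2]    # solved volatility of total assets
sol$termcd  # termination code 1 means a root was found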
I am trying to maximize the following function with optimx in R
#Initial Parameter Values
beta1=0.5
beta2=0.5
lambda1=0.5
lambda2=0.5
delta=5
loglik=function(par) {sum(log(lambda1*PDF1+lambda2*PDF2))+delta*(lambda1+lambda2-1)}
G2=optimx(c(0.5,0.5,0.5,0.5,2),fn=loglik,gr=NULL, lower=-Inf, upper=Inf, hessian=FALSE)
But every time, the optimization just reproduces the initial values that I provide. For example, this is the response I get using the above-mentioned initial values:
p1 p2 p3 p4 p5 value fevals gevals niter convcode kkt1 kkt2
Nelder-Mead 0.5 0.5 0.5 0.5 2 5144.569 6 NA NA 0 TRUE NA
BFGS 0.5 0.5 0.5 0.5 2 5144.569 1 1 NA 0 TRUE NA
xtimes
Nelder-Mead 0
BFGS 0
Can anyone please tell me what's going on?
Note: I switched out optimx for the built-in optim in my answer. This does not change the content. I also switched out your PDF1 and PDF2 in the function body for beta1 and beta2 on a hunch about your intent.
You're misunderstanding how optimx (and, honestly, how functions) work.
Here's your definition of loglik:
loglik <- function(par) {
sum(log(lambda1*beta1 + lambda2*beta2)) + delta*(lambda1 + lambda2 - 1)
}
Now watch
> loglik(1)
[1] -0.6931472
> loglik(2)
[1] -0.6931472
> loglik("I like cats.")
[1] -0.6931472
You haven't so much defined a function as a constant. You can see this by observing that the function you defined makes no reference to its argument par: it ignores par and simply looks up the variables it uses in its enclosing environment.
You most likely meant to do this
loglik <- function(par) {
sum(log(par[3]*par[1] + par[4]*par[2])) + par[5]*(par[3] + par[4] - 1)
}
After which the optimization works as intended
optim(c(0.5,0.5,0.5,0.5,2), fn=loglik, gr=NULL, lower=-Inf, upper=Inf, hessian=FALSE)
$par
[1] 0.6466066 0.8102440 -0.2802594 0.2236580 2.6381565
$value
[1] -40.91057
$counts
function gradient
501 NA
$convergence
[1] 1
A convergence code of 1 does not indicate convergence; per the optim documentation, a code of 1 "indicates that the iteration limit maxit had been reached."
Indeed, there are plenty of warnings
warnings()
1: In log(par[3] * par[1] + par[4] * par[2]) : NaNs produced
2: In log(par[3] * par[1] + par[4] * par[2]) : NaNs produced
You'll have to sort that out; I don't know what you are actually trying to accomplish with this call.
Thank you, but the problem is that PDF1 and PDF2 are themselves functions, defined by
PDF1=function(beta1) {BiCopPDF(u,v,par=abs(beta1),family=5)}
and
PDF2=function(beta2) {BiCopPDF(u,v,par=beta2,family=3)}
How do I manage that?
You will have to call into PDF1 and PDF2 within the function you are optimizing. If I understand correctly, this would result in something like:
loglik <- function(par) {
sum(log( par[3]*PDF1(par[1]) + par[4]*PDF2(par[2]) )) + par[5]*(par[3] + par[4] - 1)
}
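Assuming u and v are defined and the package that provides BiCopPDF (VineCopula) is loaded, the call would then look something like the sketch below; note control = list(maximize = TRUE), since optimx minimizes by default:
library(optimx)
library(VineCopula)   # assumed source of the BiCopPDF used inside PDF1/PDF2
G2 <- optimx(c(0.5, 0.5, 0.5, 0.5, 2), fn = loglik, gr = NULL,
             control = list(maximize = TRUE))
G2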
I am using the optimx package in R to solve an optimization problem. I have a very large data set stored in Hadoop, and I am trying to use Hive streaming with R to solve the problem.
Method 1: using Hive streaming with the R code.
Method 2: running the code from the R console on the same server using R CMD BATCH.
For example, below is a sample of the dataset that is used in both cases.
ITM_NBR FAM_ID OLD_PRICE MIN_COMP MAX_COMP ELAST Units PTC_FAM_ID PTC_ITM_NBR
1 64430 101006 59.98 50.983 68.977 1 230 AB_101006 AB64430
2 16961 101006 59.98 50.983 68.977 1 151 AB_101006 AB16961
Below is the R code I am using to read data from the Hive table or the CSV file and calculate the optimal value. The only difference between the two methods is the way I read the data.
##library required for the program
library(optimx)
##Variables required for this program
lb = numeric(5)
ub = numeric(5)
exprvec <- expression()
o <- el <- u <- numeric(2)
#**Method 1**
f <- file("stdin")
open(f)
e = read.table(f)
#**Method 2**
e <- read.csv('Dataset1.csv')
names(e) <- c("ITM_NBR","FAM_ID","OLD_PRICE","MIN_COMP","MAX_COMP","ELAST","Units","PTC_FAM_ID","PTC_ITM_NBR","MIN_E","MAX_E")
lb <- max(e$MIN_COMP,e$MIN_E)
ub <- min(e$MAX_COMP,e$MAX_E)
x0 <- mean(c(lb,ub), na.rm = TRUE)
for(j in 1:nrow(e)){
o[j] = e$OLD_PRICE[j]
el[j] = e$ELAST[j]
u[j] = e$Units[j]
exprvec[[j]] <- substitute(x*(1-(x-o)*el/o)*u,list(o=o[j],el=el[j],u=u[j]))
}
eval_obj_f <- function(x){
sum(sapply(exprvec, eval, envir=list(x=x)))
}
## Required options for the optimx package and algorithms
res <- optimx(par = x0, fn = eval_obj_f, gr = NULL, lower = lb, upper = ub, method = "L-BFGS-B", control = list(maximize = TRUE))
print(res)
#**Method 1 Invocation**
hive> from(select ITM_NBR,FAM_ID,OLD_PRICE,MIN_COMP,MAX_COMP,ELAST,UNITS,PTC_FAM_ID,PTC_ITM_NBR,MIN_E,MAX_E from rtable distribute by PTC_FAM_ID)t1 insert overwrite directory '/user/rapp/t1' reduce ITM_NBR,FAM_ID,OLD_PRICE,MIN_COMP,MAX_COMP,ELAST,UNITS,PTC_FAM_ID,PTC_ITM_NBR,MIN_E,MAX_E using 'rscript.r';
#**Method 2 Invocation**
[m1.hdp22] R CMD BATCH optim.r
#**Method 1 results**
lb= 50.983
ub= 62.979
x0= 56.981
p1 value fevals gevals niter convcode kkt1 kkt2 xtimes
L-BFGS-B 59.98 2.285238e+04 4 4 NA 0 TRUE TRUE 0.001
#**Method 2 results**
lb= 50.983
ub= 62.979
x0= 56.981
p1 value fevals gevals niter convcode kkt1 kkt2 xtimes
L-BFGS-B 53.05923 23247.9 4 4 NA 0 TRUE TRUE 0.001