I have a wrapper function that combines two functions, each with its own parameter vector. The idea is to pass the parameters (as one vector or as two vectors) to optim and then maximize the sum of the two functions.
My real function is complex, so here is a simplified example that mirrors it:
set.seed(123)
x <- rnorm(10,2,0.5)
ff <- function(x, parOpt){
out <- -sum(log(dnorm(x, parOpt[[1]][1], parOpt[[1]][2]))+log(dnorm(x,parOpt[[2]][1],parOpt[[2]][2])))
return(out)
}
# parameters in mu,sd vectors arranged in list
params <- c(set1 = c(2, 0.2), set2 = c(0.5, 0.3))
xy <- optim(par = params, fn=ff ,x=x)
This returns the following error:
Error in optim(par = params, fn = ff, x = x) :
function cannot be evaluated at initial parameters
As I understand it, I get this error because optim cannot pass the parameters to each part of my function. How can I tell optim that the first vector holds the parameters for the first part of my function and the second vector those for the second part?
Try changing the method argument of optim so that the function can be evaluated at the initial parameters.
Detailed documentation for optim is available through the ?optim command.
For example, the "L-BFGS-B" method lets you set lower and upper bounds on the parameters.
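It may also help to look at what optim actually receives. c(set1 = c(2, 0.2), set2 = c(0.5, 0.3)) does not build a list; it produces one flat numeric vector of length 4, so parOpt[[1]][2] is NA and the objective cannot be evaluated. A minimal sketch, assuming the flat layout c(mu1, sd1, mu2, sd2) and an arbitrary small lower bound on the standard deviations (both are illustrative choices, not part of the original code):

```r
set.seed(123)
x <- rnorm(10, 2, 0.5)

# optim passes ONE numeric vector as the first argument of fn;
# the function itself has to split it into the two parameter sets.
ff <- function(par, x) {
  set1 <- par[1:2]  # assumed layout: mu1, sd1
  set2 <- par[3:4]  # assumed layout: mu2, sd2
  -sum(dnorm(x, set1[1], set1[2], log = TRUE) +
       dnorm(x, set2[1], set2[2], log = TRUE))
}

params <- c(2, 0.2, 0.5, 0.3)

# L-BFGS-B keeps both standard deviations positive (bound is an assumption)
fit <- optim(par = params, fn = ff, x = x,
             method = "L-BFGS-B",
             lower = c(-Inf, 1e-6, -Inf, 1e-6))
fit$par
```

Splitting by position inside the function keeps the call to optim unchanged; a named vector would work as well, as long as the function indexes it consistently.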
I would like to know how to do maximum likelihood estimation in R when the parameters to fit are given in an array. This is needed when the number of parameters is large. Basically, to fit a normal distribution to the data x, I would like to do something like the following:
LL <- function(param_array) {
R = dnorm(x, param_array[1], param_array[2])
-sum(log(R))
}
mle(LL, start = list(param_array = c(1,1)))
(instead of the original code in the first section of http://www.r-bloggers.com/fitting-a-model-by-maximum-likelihood/)
If I run the code above, I get an error:
Error in dnorm(x, param_array[1], param_array[2]) : argument
"param_array" is missing, with no default
Could anyone let me know how to achieve what I want in the correct way?
stats4::mle is not a long function, you can inspect it in your R console:
> stats4::mle
Note how start is handled:
start <- sapply(start, eval.parent)
nm <- names(start)
case 1
If you do:
LL <- function(mu, sigma) {
R = dnorm(x, mu, sigma)
-sum(log(R))
}
mle(LL, start = list(mu = 1, sigma = 1))
you get:
nm
#[1] "mu" "sigma"
Also,
formalArgs(LL)
#[1] "mu" "sigma"
case 2
If you do:
LL <- function(param_array) {
R = dnorm(x, param_array[1], param_array[2])
-sum(log(R))
}
mle(LL, start = list(param_array = c(1,1)))
you get
nm
#[1] NULL
but
formalArgs(LL)
#[1] "param_array"
The problem
The evaluation of function LL inside stats4::mle works by matching nm to the formal arguments of LL. In case 1 the match succeeds, but in case 2 there is no match, so LL cannot be evaluated.
So what do people do if they have like 50 parameters? Do they type them in by hand?
Isn't this a bogus argument, after a careful reflection? If you really have 50 parameters, does using an array really save your effort?
First, inside your function LL, you have to write out param_array[1], param_array[2], ..., param_array[50], i.e., you still need to manually put 50 parameters into the right positions. And when specifying start, you still need to type in a length-50 vector element by element, right? Isn't this the same amount of work as using a list instead of an array?
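If a single parameter vector really is what you want, one option outside mle is to call optim directly, since optim passes the whole vector as the first argument of the objective. A minimal sketch, assuming simulated data, arbitrary starting values, and an arbitrary lower bound on the standard deviation (none of these come from the original question):

```r
set.seed(1)
x <- rnorm(200, mean = 3, sd = 2)  # simulated data, for illustration only

# Negative log-likelihood taking all parameters in one vector
LL <- function(param_array) {
  -sum(dnorm(x, param_array[1], param_array[2], log = TRUE))
}

fit <- optim(par = c(1, 1), fn = LL,
             method = "L-BFGS-B",
             lower = c(-Inf, 1e-6),  # keep sd positive
             hessian = TRUE)

fit$par                               # estimated mean and sd
se <- sqrt(diag(solve(fit$hessian)))  # approximate standard errors
se
```

This gives up the mle object and its methods, so whether it is a good trade depends on what you need from the fit afterwards.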
I'm using the nls.lm function from the minpack.lm package, and something "weird" happens when I change the order of the parameters in the residual function.
This code works:
install.packages('minpack.lm')
library(minpack.lm)
## values over which to simulate data
x <- seq(0,100,length=100)
## model based on a list of parameters
getPrediction <- function(parameters, x)
parameters$A*exp(-parameters$alpha*x) + parameters$B*exp(-parameters$beta*x)
## parameter values used to simulate data
pp <- list(A = 2, B = 0.8, alpha = 0.6, beta = 0.01)
## simulated data, with noise
simDNoisy <- getPrediction(pp,x) + rnorm(length(x),sd=.01)
#simDNoisy[seq(1,10)] = rep(10,11)
simDNoisy[1] = 4
## plot data
plot(x,simDNoisy, main="data")
## residual function
residFun <- function(parameters, observed, xx)
sqrt(abs(observed - getPrediction(parameters, xx)))
## starting values for parameters
parStart <- list(Ar = 3, Br = 2, alphar = 1, betar = 0.05)
## perform fit
rm(nls.out)
nls.out <- nls.lm(par=parStart,
fn = residFun,
observed = simDNoisy,
xx = x,
control = nls.lm.control(nprint=1))
nls.out
It doesn't work if I replace the residual function with this one (only the parameter order changes):
residFun <- function(xx, parameters, observed )
sqrt(abs(observed - getPrediction(xx, parameters)))
Error in parameters$A : $ operator is invalid for atomic vectors
Why does it cause this error?
Arguments should be passed in the order in which the function defines its parameters. The only exception is when you explicitly name them, in which case the order does not matter. Consider this example of a function with two parameters:
theParameters=function(X,Y){
print(paste("I think X is",X))
print(paste("I think Y is",Y))
}
theParameters(X=2,Y=10)
theParameters(Y=10,X=2)
#you can change the parameter order if you identify them with parameter=...
#but if you don't, it assumes it's in the order of how the function is defined.
# which of these is X and which is Y?
theParameters(10,2)
It's preferable to always name the arguments, and it is necessary if they are out of order. (Other languages don't even let you change the order of arguments when you call a function.)
getPrediction(x=xx,parameters=parameters)
In this case, the reason is that getPrediction matches unnamed arguments by position, not by the names of the variables you pass in. Without the names, this line
getPrediction(xx,parameters)
means this to R
getPrediction(parameters=xx,x=parameters)
because that matches the original signature of the function.
So the function's version of parameters is what you pass in as xx, and so on.
Because you gave the parameters the same names as your variables, it can be confusing. It is easier if you vary the names of the formal parameters slightly. Alternatively, if variable scoping allows it, you don't even have to pass in the parameters, but be careful with that practice because it can make bugs hard to trace.
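To make the point concrete, here is a small sketch of the reordered residual function with the inner call written using named arguments. It reuses getPrediction and pp from the question; the noise-free obs vector is an assumption added only so that the result is easy to check:

```r
# model from the question: signature is (parameters, x)
getPrediction <- function(parameters, x)
  parameters$A * exp(-parameters$alpha * x) +
  parameters$B * exp(-parameters$beta * x)

# reordered residual function; naming the arguments in the inner call
# makes the order of residFun's own parameters irrelevant
residFun <- function(xx, parameters, observed)
  sqrt(abs(observed - getPrediction(parameters = parameters, x = xx)))

pp  <- list(A = 2, B = 0.8, alpha = 0.6, beta = 0.01)
x   <- seq(0, 100, length = 100)
obs <- getPrediction(pp, x)  # noise-free data, for illustration only

res <- residFun(xx = x, parameters = pp, observed = obs)
range(res)
```

With the positional call getPrediction(xx, parameters) instead, the numeric vector xx would land in the parameters slot, which is what triggers the "$ operator is invalid for atomic vectors" error.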
I have a system of ordinary differential equations which contains a forcing function R_0. The forcing function is calculated from a formula at each time step. I tried to use approxfun() in R, but when I print R_0_matrix I get (?) in the R_0 column at each time point, and I get this error message when I run R_0_value:
Error in xy.coords(x, y) :
(list) object cannot be coerced to type 'double'
I don't know where the mistake in my code is; any help will be appreciated.
Here is part of my code:
# define model parameters
parameters <- c(N = 3.2*10^6,L =1000/15,dimm = 1/1.07, d_in = 75/365, d_treat0 = 2/52, p1 = 0.87, p2 = 0.08, k = 0.082, eta0 = 0.05, R_m =1.23,amp = 0.67,phi = 3/12)
# R_0 Forcing Function
t <- seq(0, 100, by = 0.1)
R_0 <- function(t, parameters){
with(as.list(parameters),{
R_0=R_m*amp*cos(2*pi*(t-phi))+R_m
return(R_0)
})
}
R_0_matrix <-cbind(t,R_0)
R_0_value<- approxfun(x=R_0_matrix[,1], y= R_0_matrix[,2], method="linear", rule=2)
R_0 is a function. So cbind(t, R_0) is probably not what you intended, since it tries to create a matrix where one column is numeric and the other is a function. See what happens, for example, if you type this into the console: cbind(1:10, function(x) {x}). Probably you meant to do this:
R_0_t = R_0(t, parameters)
R_0_matrix <- cbind(t, R_0_t)
R_0_value <- approxfun(x=R_0_matrix[,1], y= R_0_matrix[,2], method="linear", rule=2)
Naming the function R_0 and then also calling the calculated object within the function environment R_0 may have caused some confusion. Even though the function creates an object called R_0 inside the function environment, the object R_0 that exists in the global environment is the function itself. To avoid confusion it's better to use different names for the function and any objects created inside the function environment.
Furthermore, the function does not return an object called R_0 to the global environment. It just returns the vector of numbers that result from evaluating the function R_0. The name of this returned vector in the global environment is whatever you assign when you call the function (R_0_t in the code sample above).
And just to check the approximation function:
plot(t[1:30], R_0_t[1:30], type="l", lwd=2)
lines(t[1:30], R_0_value(t[1:30]), col="red", lwd=6, lty=3)
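Following the naming advice above, the same steps can be written with a separate name for the function. This is only a sketch: forcing is a hypothetical name chosen for illustration, and the parameter vector is trimmed to the three values the formula actually uses.

```r
# only the parameters used by the forcing formula
parameters <- c(R_m = 1.23, amp = 0.67, phi = 3/12)

# 'forcing' is a hypothetical name, chosen so that it cannot be
# confused with the values it returns
forcing <- function(t, parameters) {
  with(as.list(parameters), R_m * amp * cos(2 * pi * (t - phi)) + R_m)
}

t     <- seq(0, 100, by = 0.1)
R_0_t <- forcing(t, parameters)  # numeric vector, one value per time point

# interpolating function; rule = 2 holds the end values outside [0, 100]
R_0_value <- approxfun(x = t, y = R_0_t, method = "linear", rule = 2)

R_0_value(c(0, 12.34, 200))
```

Because R_0_t is an ordinary numeric vector, cbind(t, R_0_t) would also give a plain numeric matrix if you still need one.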
I'm relatively new to R and I would appreciate it if you could take a look at the following code. I'm trying to estimate the shape parameter of the Frechet distribution (or inverse Weibull) using mmedist (I also tried fitdist, which calls mmedist), but I get the following error:
Error in mmedist(data, distname, start = start, fix.arg = fix.arg, ...) :
the empirical moment function must be defined.
The code that I use is below:
require(actuar)
library(fitdistrplus)
library(MASS)
#values
n=100
scale = 1
shape=3
# simulate a sample
data_fre = rinvweibull(n, shape, scale)
memp=minvweibull(c(1,2), shape=3, rate=1, scale=1)
# estimating the parameters
para_lm = mmedist(data_fre,"invweibull",start=c(shape=3,scale=1),order=c(1,2),memp = "memp")
Please note that I tried changing the code many times to see whether my mistake was in the syntax, but I always get the same error.
I'm aware of the example in the documentation. I've tried that as well, but with no luck. Please note that for the method to work, the order of the moment must be smaller than the shape parameter (i.e. shape).
The example is the following:
require(actuar)
#simulate a sample
x4 <- rpareto(1000, 6, 2)
#empirical raw moment
memp <- function(x, order)
ifelse(order == 1, mean(x), sum(x^order)/length(x))
#fit
mmedist(x4, "pareto", order=c(1, 2), memp="memp",
start=c(shape=10, scale=10), lower=1, upper=Inf)
Thank you in advance for any help.
You will need to make non-trivial changes to the source of mmedist. I recommend that you copy out the code and make your own function, foo_mmedist.
The first change you need to make is on line 94 of mmedist:
if (!exists("memp", mode = "function"))
That line checks whether a function literally named "memp" exists, as opposed to whether the argument that you have actually passed exists as a function. Change it to:
if (!exists(as.character(expression(memp)), mode = "function"))
The second change relates to the fact that the optim routine calls funobj, which calls DIFF2, which in turn calls (see line 112) the user-supplied memp function (minvweibull in your case) with two arguments: obs, which resolves to the data, and order. Since minvweibull does not take the data as its first argument, this fails.
This is expected, as the help page tells you:
memp A function implementing empirical moments, raw or centered but
has to be consistent with distr argument. This function must have
two arguments : as a first one the numeric vector of the data and as a
second the order of the moment returned by the function.
How can you fix this? Pass the function moment from the moments package. Here is the complete code (assuming that you have made the change above and created a new function called foo_mmedist):
library(actuar)   # provides rinvweibull
library(moments)  # provides moment(x, order)
# values
n = 100
scale = 1
shape = 3
# simulate a sample
data_fre = rinvweibull(n, shape, scale)
# estimating the parameters
para_lm = foo_mmedist(data_fre, "invweibull",
start= c(shape=5,scale=2), order=c(1, 2), memp = moment)
You can check that optimization has occurred as expected:
> para_lm$estimate
shape scale
2.490816 1.004128
Note, however, that this actually reduces to a crude way of doing overdetermined method of moments, and I am not sure that this is theoretically appropriate.
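The two-argument requirement quoted from the help page can be illustrated in base R. The sketch below is a hand-written raw empirical moment with the (data, order) signature; my_memp is a hypothetical name, and this is not a claim about how mmedist or the moments package implement it:

```r
# raw empirical moment: first argument is the data, second the order,
# matching the signature described on the mmedist help page
my_memp <- function(x, order) mean(x^order)

x <- c(1, 2, 3)
my_memp(x, 1)  # first raw moment (the mean)
my_memp(x, 2)  # second raw moment
```

A function such as minvweibull has a different signature (order first, then distribution parameters), because it returns theoretical rather than empirical moments.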
I searched the questions here before posting and found only one on this topic, but it doesn't apply to my case.
I have uploaded the data for PRD, INJ, tao and lambda at the links below, which can be used to reproduce the problem:
PRD
INJ
lambda
tao
The code:
PRD=read.csv(file="PRD.csv")
INJ=read.csv(file="INJ.csv")
PRD=do.call(cbind, PRD)
INJ=do.call(cbind, INJ)
tao=do.call(cbind, read.csv(file="tao.csv",header=FALSE))
lambda=do.call(cbind, read.csv(file="lambda.csv",header=FALSE))
fn1 <- function (tao,lambda) {
  # preparing i.dash
  i.dash=matrix(ncol=ncol(INJ), nrow=nrow(INJ))
  for (i in 1:ncol(INJ)){
    for (j in 1:nrow(INJ)){
      temp=0
      for (k in 1:j){
        temp=(1/tao[i])*exp((k-j)/tao[i])*INJ[k,i]+temp
      }
      i.dash[j,i]=temp
    }
  }
  # preparing lambda x i.dash
  lambda.i=matrix(ncol=ncol(INJ),nrow=nrow(INJ))
  for (i in 1:ncol(INJ)){
    lambda.i[,i]=lambda[i+1]*i.dash[,i]
  }
  # calc. q.hat (I need to add the pp term)
  q.hat=matrix(nrow=nrow(INJ), ncol=1)
  for (i in 1:nrow(INJ)){
    q.hat[i,1]=sum(lambda.i[i,1:ncol(INJ)])
  }
  # sum of squared errors to be minimized; returned as the function value
  target=sum((PRD[,1]-q.hat[,1])^2)
  target
}
What I am trying to do is minimize the value of target by optimizing lambda and tao, using the uploaded values as starting values. I've used optim to do so, but I still receive the error: cannot coerce type 'closure' to vector of type double
I've tried many variations of the optim call and still receive the same error.
The last call I used was optim(fn1, tao=tao, lambda=lambda, hessian=T)
Thanks
The calling form of optim is
optim(par, fn, gr = NULL, ...,
method = c("Nelder-Mead", "BFGS", "CG", "L-BFGS-B", "SANN",
"Brent"),
lower = -Inf, upper = Inf,
control = list(), hessian = FALSE)
So, you need to pass the parameters first, not the function. Note that "closure" is another term for "function", which explains the error message: you have passed a function as the first argument, when optim expected initial parameter values.
Note also that optim only optimises over the first argument of the function fn, so you will need to redesign your function fn1 so that it takes a single argument. For example, it could be a single vector of the form c(n, t1, t2, ..., tn, l1, l2, l3, ..., lm), where the ti are the components of tao, the li are the components of lambda, and n tells you how many components tao has.
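The repacking idea can be sketched as follows. Everything here is illustrative: fn_two is a hypothetical toy objective standing in for fn1, and the lengths and starting values are made up. One deviation from the layout suggested above: the number of tao components is kept in a separate variable (n_tao) rather than inside the vector, so that optim does not try to optimise over it.

```r
# hypothetical two-argument objective standing in for fn1
fn_two <- function(tao, lambda) {
  sum((tao - c(1, 2))^2) + sum((lambda - c(3, 4, 5))^2)
}

n_tao <- 2  # number of tao components, fixed outside the parameter vector

# wrapper taking ONE vector: first n_tao entries are tao, the rest lambda
fn_packed <- function(par) {
  tao    <- par[seq_len(n_tao)]
  lambda <- par[-seq_len(n_tao)]
  fn_two(tao, lambda)
}

start <- c(0, 0, 0, 0, 0)  # c(tao, lambda) starting values (made up)

# parameters first, function second
fit <- optim(par = start, fn = fn_packed, method = "BFGS", hessian = TRUE)

fit$par[seq_len(n_tao)]    # estimated tao
fit$par[-seq_len(n_tao)]   # estimated lambda
```

The wrapper leaves the original two-argument function untouched, which makes it easy to call it directly for checking once the optimisation has finished.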