How to calculate DRPS (Discrete Ranked Probability Score) - r

I am working on replicating the scoring rule found in the paper "Forecasting the intermittent demand for slow-moving inventories: A modelling approach".
The paper describes the scoring rule as follows:
This is my attempt
y <- rpois(n = 100, lambda = 10) # forecasted distribution
x <- 10 # actual value
drps_score <- function(x = value, y = q){
# x = actual value (single observation); y = quantile forecasted value (vector)
Fy = ecdf(y) # cdf function
indicator <- ifelse(y - x > 0, 1, 0) # Heaviside
score <- sum((indicator - Fy(y))^2)
return(score)
}
> drps_score(x = x, y = y)
[1] 53.028
This seems to work well until I provide a vector of 0s as follows:
y <- rep(x = 0, 100)
> drps_score(x = x, y = y)
[1] 0
I know that one of their methods used in this paper was a 0s forecast and their results did not show 0 for DRPS. This makes me think that the calculation is off.

I think there are a few issues at play here.
First off, I don't think you are computing the correct sum inside the scoring function. The score asks you to sum over all possible values of y (i.e. over all non-negative integers, truncated at some upper limit in practice), not over the forecasted samples of y.
Second, I don't think the above definition gives the desired result: with \hat F (y) defined to be 0 when y=x, you don't get a zero score for a forecast with a point mass at the true value. (Yes, I'm saying that the source is "wrong", or at least has a definition that doesn't give the desired result.) Here is a reformulated function that I think fixes both issues:
x <- 10 # actual value
drps_score <- function(x = value, y = q, nsum=100){
# x = actual value (single observation); y = quantile forecasted value (vector)
Fy = ecdf(y) # cdf function
ysum <- 0:nsum
indicator <- ifelse(ysum - x >= 0, 1, 0) # Heaviside
score <- sum((indicator - Fy(ysum))^2)
return(score)
}
> drps_score(x = x, y = rpois(n = 1000, lambda = 8))
[1] 1.248676
> drps_score(x = x, y = rpois(n = 1000, lambda = 9))
[1] 0.878183
> drps_score(x = x, y = rpois(n = 1000, lambda = 10))
[1] 0.692667
> drps_score(x = x, y = rep(10, 100))
[1] 0
> drps_score(x = x, y = rpois(n = 1000, lambda = 11))
[1] 0.883333
The above shows that the distribution centered on the true value (lambda=10) has the lowest score among the forecasts that aren't a point mass.
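As a cross-check, the same score can be written against any forecast CDF rather than only an empirical one. The sketch below assumes a hypothetical helper, drps_cdf(), which is not part of the paper or of the code above; like the function above, it truncates the sum at nsum. It shows that an all-zeros forecast no longer scores 0 when the actual value is 10, which was the symptom in the question.

```r
# Hypothetical helper: DRPS for a forecast described by its CDF.
# The infinite sum is truncated at nsum, as in drps_score() above.
drps_cdf <- function(x, cdf, nsum = 100) {
  k <- 0:nsum
  heaviside <- as.numeric(k >= x)   # 1 from the actual value onwards
  sum((cdf(k) - heaviside)^2)
}

# Empirical CDF of an all-zeros forecast, and two exact Poisson CDFs
score_zeros  <- drps_cdf(10, ecdf(rep(0, 100)))
score_pois10 <- drps_cdf(10, function(k) ppois(k, lambda = 10))
score_pois8  <- drps_cdf(10, function(k) ppois(k, lambda = 8))
print(c(zeros = score_zeros, pois10 = score_pois10, pois8 = score_pois8))
```

The all-zeros forecast scores 10 here: its CDF is already 1 at each of the integers 0 to 9, where the indicator is still 0, and each of those ten terms contributes 1.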


"One after the other" realisation of discrete random variables

I'm stuck with the following problem:
There are given n+1 discrete random variables:
X = {1,...,n} with P(x=i) = p_i
Y_i = {1,...,n_i} with P(y_i = j) = p_ij and i = 1,...,n
We do the following:
We draw from X and the result determines which Y_i we choose for the next step: If x = a, we use Y_a.
We draw from this Y_a.
Now my questions to this:
1) How do I get the Expected Value and the Variance of the whole?
2) Can this "process" be defined by a single random variable?
3) Assume we only know the EV and Var of all Y_i, but not all (or even none) of the probabilities. Can we still calculate the EV and Var of the whole process?
4) If 2) can be done, how to do this efficiently in R?
To give you an example of what I've tried:
X = {1,2} with P(x = 1) = 0.3 and P(x = 2) = 0.7
Y_1 = {2,3} with P(y_1 = 2) = 0.5 and P(y_1 = 3) = 0.5
Y_2 = {1,5,20} with P(y_2 = 1) = 0.3, P(y_2 = 5) = 0.6 and P(y_2 = 20) = 0.1
I have tried to combine those to a single random variable Z, but I'm not sure, if that can be done that way:
Z = {2,3,1,5,20} with probabilities (0.5*0.3, 0.5*0.3, 0.3*0.7, 0.6*0.7, 0.1*0.7)
The weighted EV is correct, but the "weighted" Var is different - if it is correct to use the formula for Var of linear combination for independent random variables. (Maybe just the formula for the combined Var is wrong.)
I used R and the package "discreteRV":
install.packages("discreteRV")
library(discreteRV)
#defining the RVs
Y_1 <- RV(outcomes = c(2, 3), probs = c(0.5, 0.5)) #occurs 30% of the time
Y_2 <- RV(outcomes = c(1, 5, 20), probs = c(0.3, 0.6, 0.1)) #occurs 70% of the time
Z <- RV(outcomes = c(2, 3, 1, 5, 20),
probs = c(0.5*0.3, 0.5*0.3, 0.3*0.7, 0.6*0.7, 0.1*0.7))
#calculating the EVs
E(Z)
E(Y_1)*0.3 + E(Y_2)*0.7
#calculating the VARs
V(Z)
V(Y_1)*(0.3)^2 + V(Y_2)*(0.7)^2
Thank you for your help.
Actually, Z is a mixture of Y1 and Y2, not a linear combination of them: its sample space is the union of the sample spaces of Y1 and Y2, with Y1 being used 30% of the time and Y2 70% of the time. In other words, we should write Z as something like Z = [0.3*Y1, 0.7*Y2] rather than Z = 0.3*Y1 + 0.7*Y2.
Since we have
V(Z) = E(Z**2)-E(Z)**2
> E(Z**2) -E(Z)**2
[1] 20.7684
> V(Z)
[1] 20.7684
It is easy to see that the term E(Z)**2 contains cross-product terms between Y1 and Y2, which is why V(Z) != V(Y_1)*(0.3)^2 + V(Y_2)*(0.7)^2.
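The cross terms can be made explicit with the law of total variance, which also bears on the third question: the mean and variance of Z follow from the selection probabilities p_i together with the mean and variance of each Y_i, without needing the individual p_ij. A minimal base-R sketch for the example above (the variable names are illustrative only):

```r
p <- c(0.3, 0.7)   # P(X = 1), P(X = 2)

# Means and variances of Y_1 and Y_2; in the setting of the third question
# these would simply be given.
m1 <- sum(c(2, 3) * c(0.5, 0.5))
v1 <- sum(c(2, 3)^2 * c(0.5, 0.5)) - m1^2
m2 <- sum(c(1, 5, 20) * c(0.3, 0.6, 0.1))
v2 <- sum(c(1, 5, 20)^2 * c(0.3, 0.6, 0.1)) - m2^2
m <- c(m1, m2)
v <- c(v1, v2)

ez <- sum(p * m)                         # E(Z) = sum_i p_i E(Y_i)
vz <- sum(p * v) + sum(p * (m - ez)^2)   # within-part + between-part variance
print(c(EZ = ez, VZ = vz))
```

This gives E(Z) = 4.46 and V(Z) = 20.7684, the same value that V(Z) returns above. The second sum is the spread of the component means around E(Z), which is the part that the formula for independent linear combinations leaves out.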

Non-linear fitting with nls() is giving me singular gradient matrix at initial parameter estimates. Why?

This is my first attempt at fitting a non-linear model in R, so please bear with me.
Problem
I am trying to understand why nls() is giving me this error:
Error in nlsModel(formula, mf, start, wts): singular gradient matrix at initial parameter estimates
Hypotheses
From what I've read in other questions here on SO, it could be because:
my model is discontinuous, or
my model is over-determined, or
I made a bad choice of starting parameter values.
So I am calling for help on how to overcome this error. Can I change the model and still use nls(), or do I need to use nls.lm from the minpack.lm package, as I have read elsewhere?
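The first hypothesis can be checked numerically. nls() builds its gradient matrix from numerical derivatives of the model with respect to each parameter, and a piecewise-constant model does not change when a step boundary moves by a tiny amount unless a data point happens to sit right at that boundary. The sketch below is an illustration under stated assumptions, not a reproduction of what nls() does internally: it uses a hypothetical one-step model, box(), and a plain forward difference.

```r
# Hypothetical one-step model: height dy on the interval (0, dx], 0 elsewhere
box <- function(x, dx, dy) dy * as.numeric(x > 0 & x <= dx)

x <- seq(0.005, 9.995, by = 0.01)   # grid points kept away from the boundary at dx = 1
h <- 1e-7                           # small forward-difference step

grad_dx <- (box(x, 1 + h, 2) - box(x, 1, 2)) / h   # derivative w.r.t. the step width
grad_dy <- (box(x, 1, 2 + h) - box(x, 1, 2)) / h   # derivative w.r.t. the step height

print(range(grad_dx))      # all zero: nudging the boundary changes nothing
print(sum(grad_dy != 0))   # non-zero only for the points inside the step
```

A column of zeros in the gradient matrix makes it rank-deficient, which would be consistent with the "singular gradient matrix" error regardless of how good the starting values are.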
My approach
Here are some details about the model:
the model is a discontinuous function, a kind of staircase type of function (see plot below)
in general, the number of steps in the model can be variable yet they are fixed for a specific fitting event
MWE that shows the problem
Brief explanation of the MWE code
step_fn(x, min = 0, max = 1): function that returns 1 within the interval (min, max] and 0 otherwise; sorry about the name, I realize now it is not really a step function... interval_fn() would be more appropriate I guess.
staircase(x, dx, dy): a summation of step_fn() functions. dx is a vector of widths for the steps, i.e. max - min, and dy is the increment in y for each step.
staircase_formula(n = 1L): generates a formula object that represents the model modeled by the function staircase() (to be used with the nls() function).
please do note that I use the purrr and glue packages in the example below.
Code
step_fn <- function(x, min = 0, max = 1) {
y <- x
y[x > min & x <= max] <- 1
y[x <= min] <- 0
y[x > max] <- 0
return(y)
}
staircase <- function(x, dx, dy) {
max <- cumsum(dx)
min <- c(0, max[1:(length(dx)-1)])
step <- cumsum(dy)
purrr::reduce(purrr::pmap(list(min, max, step), ~ ..3 * step_fn(x, min = ..1, max = ..2)), `+`)
}
staircase_formula <- function(n = 1L) {
i <- seq_len(n)
dx <- sprintf("dx%d", i)
min <-
c('0', purrr::accumulate(dx[-n], .f = ~ paste(.x, .y, sep = " + ")))
max <- purrr::accumulate(dx, .f = ~ paste(.x, .y, sep = " + "))
lhs <- "y"
rhs <-
paste(glue::glue('dy{i} * step_fn(x, min = {min}, max = {max})'),
collapse = " + ")
sc_form <- as.formula(glue::glue("{lhs} ~ {rhs}"))
return(sc_form)
}
x <- seq(0, 10, by = 0.01)
y <- staircase(x, c(1,2,2,5), c(2,5,2,1)) + rnorm(length(x), mean = 0, sd = 0.2)
plot(x = x, y = y)
lines(x = x, y = staircase(x, dx = c(1,2,2,5), dy = c(2,5,2,1)), col="red")
my_data <- data.frame(x = x, y = y)
my_model <- staircase_formula(4)
params <- list(dx1 = 1, dx2 = 2, dx3 = 2, dx4 = 5,
dy1 = 2, dy2 = 5, dy3 = 2, dy4 = 1)
m <- nls(formula = my_model, start = params, data = my_data)
#> Error in nlsModel(formula, mf, start, wts): singular gradient matrix at initial parameter estimates
Any help is greatly appreciated.
I assume you are given a vector of observations of length len as the ones plotted in your example, and you wish to identify k jumps and k jump sizes. (Or maybe I misunderstood you; but you have not really said what you want to achieve.)
Below I will sketch a solution using Local Search. I start with your example data:
x <- seq(0, 10, by = 0.01)
y <- staircase(x,
c(1,2,2,5),
c(2,5,2,1)) + rnorm(length(x), mean = 0, sd = 0.2)
A solution is a list of positions and sizes of the jumps. Note that I use vectors to store these data, as it will become cumbersome to define variables when you have 20 jumps, say.
An example (random) solution:
k <- 5 ## number of jumps
len <- length(x)
sol <- list(position = sample(len, size = k),
size = runif(k))
## $position
## [1] 89 236 859 885 730
##
## $size
## [1] 0.2377453 0.2108495 0.3404345 0.4626004 0.6944078
We need an objective function to compute the quality of the solution. I also define a simple helper function stairs, which is used by the objective function.
The objective function abs_diff computes the average absolute difference between the fitted series (as defined by the solution) and y.
stairs <- function(len, position, size) {
ans <- numeric(len)
ans[position] <- size
cumsum(ans)
}
abs_diff <- function(sol, y, stairs, ...) {
yy <- stairs(length(y), sol$position, sol$size)
sum(abs(y - yy))/length(y)
}
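To make this representation concrete, here is a small usage sketch with made-up numbers (the two functions are repeated so that the snippet runs on its own): a jump of size 1 at position 3 and a jump of size 2 at position 7, on a series of length 10.

```r
stairs <- function(len, position, size) {
  ans <- numeric(len)
  ans[position] <- size   # place each jump at its position
  cumsum(ans)             # accumulate the jumps into a staircase
}
abs_diff <- function(sol, y, stairs, ...) {
  yy <- stairs(length(y), sol$position, sol$size)
  sum(abs(y - yy)) / length(y)
}

toy <- list(position = c(3, 7), size = c(1, 2))
fitted <- stairs(10, toy$position, toy$size)
print(fitted)
# [1] 0 0 1 1 1 1 3 3 3 3

# A series that matches the staircase exactly has objective value 0;
# shifting it by 0.5 everywhere gives an average absolute difference of 0.5.
print(abs_diff(toy, fitted, stairs))
print(abs_diff(toy, fitted + 0.5, stairs))
```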
Now comes the key component for a Local Search: the neighbourhood function that is used to evolve the solution. The neighbourhood function takes a solution and changes it slightly. Here, it will either pick a position or a size and modify it slightly.
neighbour <- function(sol, len, ...) {
p <- sol$position
s <- sol$size
if (runif(1) > 0.5) {
## either move one of the positions ...
i <- sample.int(length(p), size = 1)
p[i] <- p[i] + sample(-25:25, size = 1)
p[i] <- min(max(1, p[i]), len)
} else {
## ... or change a jump size
i <- sample.int(length(s), size = 1)
s[i] <- s[i] + runif(1, min = -s[i], max = 1)
}
list(position = p, size = s)
}
An example call: here the new solution has its first jump size changed.
## > sol
## $position
## [1] 89 236 859 885 730
##
## $size
## [1] 0.2377453 0.2108495 0.3404345 0.4626004 0.6944078
##
## > neighbour(sol, len)
## $position
## [1] 89 236 859 885 730
##
## $size
## [1] 0.2127044 0.2108495 0.3404345 0.4626004 0.6944078
It remains to run the Local Search.
library("NMOF")
sol.ls <- LSopt(abs_diff,
list(x0 = sol, nI = 50000, neighbour = neighbour),
stairs = stairs,
len = len,
y = y)
We can plot the solution: the fitted line is shown in blue.
plot(x, y)
lines(x, stairs(len, sol.ls$xbest$position, sol.ls$xbest$size),
col = "blue", type = "S")
Try DE instead:
library(NMOF)
yf <- function(params, x) {
  dx1 <- params[1]; dx2 <- params[2]; dx3 <- params[3]; dx4 <- params[4]
  dy1 <- params[5]; dy2 <- params[6]; dy3 <- params[7]; dy4 <- params[8]
  dy1 * step_fn(x, min = 0, max = dx1) +
    dy2 * step_fn(x, min = dx1, max = dx1 + dx2) +
    dy3 * step_fn(x, min = dx1 + dx2, max = dx1 + dx2 + dx3) +
    dy4 * step_fn(x, min = dx1 + dx2 + dx3, max = dx1 + dx2 + dx3 + dx4)
}
algo1 <- list(printBar = FALSE,
nP = 200L,
nG = 1000L,
F = 0.50,
CR = 0.99,
min = c(0,1,1,4,1,4,1,0),
max = c(2,3,3,6,3,6,3,2))
OF2 <- function(Param, data) { # Param: parameter vector; data: list with x, y and model
x <- data$x
y <- data$y
ye <- data$model(Param,x)
aux <- y - ye; aux <- sum(aux^2)
if (is.na(aux)) aux <- 1e10
aux
}
data5 <- list(x = x, y = y, model = yf, ww = 1)
system.time(sol5 <- DEopt(OF = OF2, algo = algo1, data = data5))
sol5$xbest
OF2(sol5$xbest,data5)
plot(x,y)
lines(data5$x,data5$model(sol5$xbest, data5$x),col=7,lwd=2)
#> sol5$xbest
#[1] 1.106396 12.719182 -9.574088 18.017527 3.366852 8.721374 -19.879474 1.090023
#> OF2(sol5$xbest,data5)
#[1] 1000.424

Finding the x value of a curve given f(x) in R?

I was wondering why I can't find the other value of x for which f(x) equals f(.6).
In other words, I'm wondering how to find the x value of the point indicated by the red X in the picture below?
Here is what I have tried without success:
source("https://raw.githubusercontent.com/rnorouzian/i/master/ii.r") # source the function
f <- function(x, n.pred = 5, N = 100, conf.level = .95){
ci <- R2.ci(R2 = x, n.pred = n.pred, N = N, conf.level = conf.level) # The objective function
ci$upper - ci$lower
}
curve(f, panel.f = abline(v = .6, h = f(.6), col = 2, lty = c(2, 1))) # curve the function
uniroot(function(x) f(.6) - f(x), c(0, 1))[[1]] # find the requested 'x' value
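`Error: f() values at end points not of opposite sign`
Restricting the search interval to c(0, 0.4), which excludes the known solution x = .6, lets uniroot bracket the other one:
abline(v=uniroot(function(x) f(.6) - f(x), c(0, 0.4))[[1]])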
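Why a narrower interval helps: uniroot() needs the function to have opposite signs at the two ends of the search interval. When the curve has a single peak or trough, f(.6) - f(x) tends to have the same sign at both ends of c(0, 1), so the interval has to stop short of the solution that is already known. A self-contained sketch with a stand-in function g() (an assumption for illustration, not the R2.ci-based f above), whose second solution is known to be 0.4:

```r
g <- function(x) x * (1 - x)      # stand-in curve with a peak at x = 0.5
h <- function(x) g(0.6) - g(x)    # zero wherever g(x) equals g(0.6)

# Over c(0, 1) both end values are positive, so uniroot() stops with an error
full <- tryCatch(uniroot(h, c(0, 1))$root, error = function(e) NA_real_)

# Over c(0, 0.5) the signs differ, and the root on the other side is found
other <- uniroot(h, c(0, 0.5), tol = 1e-9)$root
print(c(full = full, other = other))
```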

Automatically solve an equation of `pt` for `ncp`

I wonder if it is possible to efficiently change ncp in the below code such that x becomes .025 and .975 (within rounding error).
x <- pt(q = 5, df = 19, ncp = ?)
----------
Clarification
q = 5 and df = 19 (above) are just two hypothetical numbers, so q and df could be any other two numbers. What I expect is a function / routine, that takes q and df as input.
What is wrong with uniroot?
f <- function (ncp, alpha) pt(q = 5, df = 19, ncp = ncp) - alpha
par(mfrow = c(1,2))
curve(f(ncp, 0.025), from = 5, to = 10, xname = "ncp", main = "0.025")
abline(h = 0)
curve(f(ncp, 0.975), from = 0, to = 5, xname = "ncp", main = "0.975")
abline(h = 0)
So for 0.025 case, the root lies in (7, 8); for 0.975 case, the root lies in (2, 3).
uniroot(f, c(7, 8), alpha = 0.025)$root
#[1] 7.476482
uniroot(f, c(2, 3), alpha = 0.975)$root
#[1] 2.443316
---------
(After some discussion...)
OK, now I see your ultimate goal. You want to implement this equation solver as a function, with input q and df. So they are unknown, but fixed. They might come out of an experiment.
Ideally if there is an analytical solution, i.e., ncp can be written as a formula in terms of q, df and alpha, that would be so great. However, this is not possible for t-distribution.
A numerical solution is the way to go, but uniroot is not a great option for this purpose, as it relies on a "plot - view - guess - specify" workflow to choose the search interval. The answer by loki is also crude, though it is something of an improvement. It is a grid search with a fixed step size: start from a value near 0, say 0.001, keep increasing this value, and check the approximation error. We stop when this error fails to decrease.
This really motivates the idea of numerical optimization with a Newton or quasi-Newton method. In the 1D case, we can use the function optimize. It uses a variable step size while searching, so it converges faster than a fixed step-size search.
Let's define our function as:
ncp_solver <- function (alpha, q, df) {
## objective function: we minimize squared approximation error
obj_fun <- function (ncp, alpha = alpha, q = q, df = df) {
(pt(q = q, df = df, ncp = ncp) - alpha) ^ 2
}
## now we call `optimize`
oo <- optimize(obj_fun, interval = c(-37.62, 37.62), alpha = alpha, q = q, df = df)
## post processing
oo <- unlist(oo, use.names = FALSE) ## list to numerical vector
oo[2] <- sqrt(oo[2]) ## squared error to absolute error
## return
setNames(oo, c("ncp", "abs.error"))
}
Note, -37.62 / 37.62 is chosen as lower / upper bound for ncp, as it is the maximum supported by t-distribution in R (read ?dt).
For example, let's try this function. If you have, as given in your question, q = 5 and df = 19:
ncp_solver(alpha = 0.025, q = 5, df = 19)
# ncp abs.error
#7.476472e+00 1.251142e-07
The result is a named vector, with ncp and absolute approximation error.
Similarly we can do:
ncp_solver(alpha = 0.975, q = 5, df = 19)
# ncp abs.error
#2.443347e+00 7.221928e-07
----------
Follow up
Is it possible that in the function ncp_solver(), alpha takes a c(.025, .975) together?
Why not wrap it up for a "vectorization":
sapply(c(0.025, 0.975), ncp_solver, q = 5, df = 19)
# [,1] [,2]
#ncp 7.476472e+00 2.443347e+00
#abs.error 1.251142e-07 7.221928e-07
How come 0.025 gives the upper bound of the confidence interval, while 0.975 gives the lower bound? Shouldn't this relationship be reversed?
No surprise. By default pt computes lower tail probability. If you want the "right" relationship, set lower.tail = FALSE in pt:
ncp_solver <- function (alpha, q, df) {
## objective function: we minimize squared approximation error
obj_fun <- function (ncp, alpha = alpha, q = q, df = df) {
(pt(q = q, df = df, ncp = ncp, lower.tail = FALSE) - alpha) ^ 2
}
## now we call `optimize`
oo <- optimize(obj_fun, interval = c(-37.62, 37.62), alpha = alpha, q = q, df = df)
## post processing
oo <- unlist(oo, use.names = FALSE) ## list to numerical vector
oo[2] <- sqrt(oo[2]) ## squared error to absolute error
## return
setNames(oo, c("ncp", "abs.error"))
}
Now you see:
ncp_solver(0.025, 5, 19)[[1]] ## use "[[" not "[" to drop name
#[1] 2.443316
ncp_solver(0.975, 5, 19)[[1]]
#[1] 7.476492
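The swap follows from the upper tail being the complement of the lower tail, so solving for an upper-tail probability of 0.025 is the same as solving for a lower-tail probability of 0.975. A quick check, with an ncp of 2 chosen arbitrarily for illustration:

```r
lower <- pt(q = 5, df = 19, ncp = 2)                       # P(T <= 5)
upper <- pt(q = 5, df = 19, ncp = 2, lower.tail = FALSE)   # P(T > 5)
print(isTRUE(all.equal(lower + upper, 1)))                 # the two tails sum to 1
```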
--------
Bug report and fix
It was reported to me that the above ncp_solver is unstable. For example:
ncp_solver(alpha = 0.025, q = 0, df = 98)
# ncp abs.error
#-8.880922 0.025000
But on the other hand, if we double check with uniroot here:
f <- function (ncp, alpha) pt(q = 0, df = 98, ncp = ncp, lower.tail = FALSE) - alpha
curve(f(ncp, 0.025), from = -3, to = 0, xname = "ncp"); abline(h = 0)
uniroot(f, c(-2, -1.5), 0.025)$root
#[1] -1.959961
So there is clearly something wrong with ncp_solver.
Well, it turns out that we cannot use bounds as wide as c(-37.62, 37.62). If we narrow them to c(-35, 35), it works fine.
Also, to avoid tolerance problems, we can change the objective function from squared error to absolute error:
ncp_solver <- function (alpha, q, df) {
## objective function: we minimize absolute approximation error
obj_fun <- function (ncp, alpha = alpha, q = q, df = df) {
abs(pt(q = q, df = df, ncp = ncp, lower.tail = FALSE) - alpha)
}
## now we call `optimize`
oo <- optimize(obj_fun, interval = c(-35, 35), alpha = alpha, q = q, df = df)
## post processing and return
oo <- unlist(oo, use.names = FALSE) ## list to numerical vector
setNames(oo, c("ncp", "abs.error"))
}
ncp_solver(alpha = 0.025, q = 0, df = 98)
# ncp abs.error
#-1.959980e+00 9.190327e-07
Damn, this is a pretty annoying bug. But relax now.
Report on getting warning messages from pt
I also received a report of annoying warning messages from pt:
ncp_solver(0.025, -5, 19)
# ncp abs.error
#-7.476488e+00 5.760562e-07
#Warning message:
#In pt(q = q, df = df, ncp = ncp, lower.tail = FALSE) :
# full precision may not have been achieved in 'pnt{final}'
I am not too sure what is going on here, but so far I have not observed any misleading results. Therefore, I decided to suppress those warnings from pt, using suppressWarnings:
ncp_solver <- function (alpha, q, df) {
## objective function: we minimize absolute approximation error
obj_fun <- function (ncp, alpha = alpha, q = q, df = df) {
abs(suppressWarnings(pt(q = q, df = df, ncp = ncp, lower.tail = FALSE)) - alpha)
}
## now we call `optimize`
oo <- optimize(obj_fun, interval = c(-35, 35), alpha = alpha, q = q, df = df)
## post processing and return
oo <- unlist(oo, use.names = FALSE) ## list to numerical vector
setNames(oo, c("ncp", "abs.error"))
}
ncp_solver(0.025, -5, 19)
# ncp abs.error
#-7.476488e+00 5.760562e-07
OK, quiet now.
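If manual bracketing is the only objection to uniroot, its extendInt argument may remove it: pt is decreasing in ncp, so uniroot can be told to widen a starting interval until it brackets the root. The following is a sketch under that assumption rather than a replacement for ncp_solver above; the function name ncp_root and the starting interval around q are illustrative choices, and it uses the default lower tail of pt.

```r
ncp_root <- function(alpha, q, df) {
  # pt() decreases as ncp grows, so f crosses zero downwards
  f <- function(ncp) suppressWarnings(pt(q = q, df = df, ncp = ncp)) - alpha
  # start near q and let uniroot() extend the interval until it brackets the root
  uniroot(f, interval = c(q - 1, q + 1), extendInt = "downX", tol = 1e-8)$root
}

root_025 <- ncp_root(alpha = 0.025, q = 5, df = 19)
root_975 <- ncp_root(alpha = 0.975, q = 5, df = 19)
print(c(root_025, root_975))
```

For q = 5 and df = 19 this should land close to the values found earlier (about 7.4765 and 2.4433). Because the search starts near q and only widens as far as needed, it stays away from the extreme ncp values that caused trouble with the wide optimize interval.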
You could use two while loops like this:
i <- 0.001
lowerFound <- FALSE
while(!lowerFound){
x <- pt(q = 5, df = 19, ncp = i)
if (round(x, 3) == 0.025){
lowerFound <- TRUE
print(paste("Lower is", i))
lower <- i
} else {
i <- i + 0.0005
}
}
i <- 0.001
upperFound <- FALSE
while(!upperFound){
x <- pt(q = 5, df = 19, ncp = i)
if (round(x, 3) == 0.975){
upperFound <- TRUE
print(paste("Upper is ", i))
upper <- i
} else {
i <- i + 0.0005
}
}
c(Lower = lower, Upper = upper)
# Lower Upper
# 7.4655 2.4330
Of course, you can adapt the increment in i <- i + .... or change the check if (round(x,...) == ....) to fit this solution to your specific needs of accuracy.
I know this is an old question, but there is now a one-line solution to this problem using the conf.limits.nct() function in the MBESS package.
install.packages("MBESS")
library(MBESS)
result <- conf.limits.nct(t.value = 5, df = 19)
result
$Lower.Limit
[1] 2.443332
$Prob.Less.Lower
[1] 0.025
$Upper.Limit
[1] 7.476475
$Prob.Greater.Upper
[1] 0.025
$Lower.Limit is the ncp at which pt = 0.975, and $Upper.Limit is the ncp at which pt = 0.025:
> pt(q=5,df=19,ncp=result$Lower.Limit)
[1] 0.975
> pt(q=5,df=19,ncp=result$Upper.Limit)
[1] 0.025

Why is gradient of first iteration step singular in nls with biv.norm

I am trying to fit a non-linear regression model where the mean-function is the bivariate normal distribution. The parameter to specify is the correlation rho.
The problem: "gradient of first iteration step is singular". Why?
I have here a little example with simulated data.
library(mnormt) # provides pmnorm()
# given values for independent variables
x1 <- c(rep(0.1,5), rep(0.2,5), rep(0.3,5), rep(0.4,5), rep(0.5,5))
x2 <- c(rep(c(0.1,0.2,0.3,0.4,0.5),5))
## 1 generate values for dependent variable (incl. error term)
# from bivariate normal distribution with assumed correlation rho=0.5
fun <- function(b) pmnorm(x = c(qnorm(x1[b]), qnorm(x2[b])),
mean = c(0, 0),
varcov = matrix(c(1, 0.5, 0.5, 1), nrow = 2))
set.seed(123)
y <- sapply(1:25, function(b) fun(b)) + runif(25)/1000
# put it in data frame
dat <- data.frame(y=y, x1=x1, x2=x2 )
# 2 : calculate non-linear regression from the generated data
# use rho=0.51 as starting value
fun <- function(x1, x2,rho) pmnorm(x = c(qnorm(x1), qnorm(x2)),
mean = c(0, 0),
varcov = matrix(c(1, rho, rho, 1), nrow = 2))
nls(formula= y ~ fun(x1, x2, rho), data= dat, start=list(rho=0.51),
lower=0, upper=1, trace=TRUE)
This yields an error message:
Error in nls(formula = y ~ fun(x1, x2, rho), data = dat, start = list(rho = 0.51), :
singular gradient
In addition: Warning message:
In nls(formula = y ~ fun(x1, x2, rho), data = dat, start = list(rho = 0.51), :
upper and lower bounds ignored unless algorithm = "port"
What I don't understand is
I have only one variable (rho), so there is only one gradient which must be =0 if the matrix of gradients is supposed to be singular. So why should the gradient be =0?
The start value cannot be the problem as I know the true rho=0.5. So the start value =0.51 should be fine, shouldn't it?
The data cannot be completely linear dependent as I added an error term to y.
I would appreciate help very much. Thanks already.
Perhaps "optim" does a better job than "nls":
library(mnormt)
# given values for independent variables
x1 <- c(rep(0.1,5), rep(0.2,5), rep(0.3,5), rep(0.4,5), rep(0.5,5))
x2 <- c(rep(c(0.1,0.2,0.3,0.4,0.5),5))
## 1 generate values for dependent variable (incl. error term)
# from bivariate normal distribution with assumed correlation rho=0.5
fun <- function(b) pmnorm(x = c(qnorm(x1[b]), qnorm(x2[b])),
mean = c(0, 0),
varcov = matrix(c(1, 0.5, 0.5, 1), nrow = 2))
set.seed(123)
y <- sapply(1:25, function(b) fun(b)) + runif(25)/1000
# put it in data frame
dat <- data.frame(y=y, x1=x1, x2=x2 )
# 2 : calculate non-linear regression from the generated data
# use rho=0.51 as starting value
fun <- function(x1, x2,rho) pmnorm(x = c(qnorm(x1), qnorm(x2)),
mean = c(0, 0),
varcov = matrix(c(1, rho, rho, 1), nrow = 2))
f <- function(rho) { sum( sapply( 1:nrow(dat),
function(i){
(fun(dat[i,2],dat[i,3],rho) - dat[i,1])^2
} ) ) }
optim(0.51, f, method="BFGS")
The result is not that bad:
> optim(0.51, f, method="BFGS")
$par
[1] 0.5043406
$value
[1] 3.479377e-06
$counts
function gradient
14 4
$convergence
[1] 0
$message
NULL
Maybe even a little bit better than 0.5:
> f(0.5043406)
[1] 3.479377e-06
> f(0.5)
[1] 1.103484e-05
Let's check another start value:
> optim(0.8, f, method="BFGS")
$par
[1] 0.5043407
$value
[1] 3.479377e-06
$counts
function gradient
28 6
$convergence
[1] 0
$message
NULL
