Here's the code (sorry it's so long, but it was the first example I had); I'm using the CVaR example from the CreditMetrics package by A. Wittmann and the DEoptim solver for the optimization:
library(CreditMetrics)
library(DEoptim)
N <- 3
n <- 100000
r <- 0.003
ead <- rep(1/N,N)
rc <- c("AAA", "AA", "A", "BBB", "BB", "B", "CCC", "D")
lgd <- 0.99
rating <- c("BBB", "AA", "B")
firmnames <- c("firm 1", "firm 2", "firm 3")
alpha <- 0.99
# correlation matrix
rho <- matrix(c( 1, 0.4, 0.6,
0.4, 1, 0.5,
0.6, 0.5, 1), 3, 3, dimnames = list(firmnames, firmnames),
byrow = TRUE)
# one-year empirical migration matrix from the Standard & Poor's website
rc <- c("AAA", "AA", "A", "BBB", "BB", "B", "CCC", "D")
M <- matrix(c(90.81, 8.33, 0.68, 0.06, 0.08, 0.02, 0.01, 0.01,
0.70, 90.65, 7.79, 0.64, 0.06, 0.13, 0.02, 0.01,
0.09, 2.27, 91.05, 5.52, 0.74, 0.26, 0.01, 0.06,
0.02, 0.33, 5.95, 85.93, 5.30, 1.17, 1.12, 0.18,
0.03, 0.14, 0.67, 7.73, 80.53, 8.84, 1.00, 1.06,
0.01, 0.11, 0.24, 0.43, 6.48, 83.46, 4.07, 5.20,
0.21, 0, 0.22, 1.30, 2.38, 11.24, 64.86, 19.79,
0, 0, 0, 0, 0, 0, 0, 100
)/100, 8, 8, dimnames = list(rc, rc), byrow = TRUE)
cm.CVaR(M, lgd, ead, N, n, r, rho, alpha, rating)
y <- cm.cs(M, lgd)[which(names(cm.cs(M, lgd)) == rating)]
Now I write my function...
fun <- function(w) {
# ...
- (t(w) %*% y - r) / cm.CVaR(M, lgd, ead = w, N, n, r,
rho, alpha, rating)
}
...and I want to optimize it:
DEoptim(fn = fun, lower = rep(0, N), upper = rep(1, N),
control = DEoptim.control())
Can you tell me what I have to insert at # ... so that sum(w) = 1 holds during the optimization?
Below I show you optimization results according to flodel's tips:
# The first trick is to include a large penalty constant B in the objective to force the algorithm to put sum(w) = 1
fun <- function(w) {
- (t(w) %*% y - r) / cm.CVaR(M, lgd, ead = w, N, n, r, rho, alpha, rating) +
abs(10000 * (sum(w) - 1))
}
DEoptim(fn = fun, lower = rep(0, N), upper = rep(1, N),
control = DEoptim.control())
$optim$bestval
[1] -0.05326055
$optim$bestmem
par1 par2 par3
0.005046258 0.000201286 0.994752456
parsB <- c(0.005046258, 0.000201286, 0.994752456)
> fun(parsB)
[,1]
[1,] -0.05326089
...and here are the code and results for the second trick:
# The second trick requires you to use w <- w / sum(w) inside the function itself
fun <- function(w) {
w <- w / sum(w)
- (t(w) %*% y - r) / cm.CVaR(M, lgd, ead = w, N, n, r, rho, alpha, rating) #+
#abs(10000 * (sum(w) - 1))
}
DEoptim(fn = fun, lower = rep(0, N), upper = rep(1, N),
control = DEoptim.control())
$optim$bestval
[1] -0.0532794
$optim$bestmem
par1 par2 par3
1.306302e-15 2.586823e-15 9.307001e-01
parsC <- c(1.306302e-15, 2.586823e-15, 9.307001e-01)
parC <- parsC / sum(parsC)
> fun(parC)
[,1]
[1,] -0.0532794
As you can see, the first trick works better in that it finds a result which is smaller than the second one's; unfortunately, it also seems to take longer. Any comments?
Should I increase the number of iterations because the function being optimized is "too stochastic"?
Try:
w <- w / sum(w)
and if DEoptim gives you an optimal solution w* such that sum(w*) != 1 then w*/sum(w*) should be your optimal solution.
Another approach is to solve over all your variables but one. We know the value of the last variable must be 1 - sum(w) so in the body of the function, have:
w <- c(w, 1-sum(w))
and do the same to the optimal solution returned by DEoptim: w* <- c(w*, 1-sum(w*))
Both solutions require that you re-formulate your problem into an unconstrained (not counting for variable bounds) optimization so DEoptim can be used; which forces you to do a little extra work outside of DEoptim to recover the solution to the original problem.
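In case it helps, here is a minimal sketch of that second approach applied to the objective from the question (my own illustration, not code from the answer; fun2, res2 and the infeasibility guard are made-up names and choices):
fun2 <- function(w_short) {
  w <- c(w_short, 1 - sum(w_short))   # last weight implied by sum(w) = 1
  if (tail(w, 1) < 0) return(1e10)    # reject candidates whose implied weight is negative
  - (t(w) %*% y - r) / cm.CVaR(M, lgd, ead = w, N, n, r, rho, alpha, rating)
}
res2 <- DEoptim(fn = fun2, lower = rep(0, N - 1), upper = rep(1, N - 1),
                control = DEoptim.control())
w_opt <- res2$optim$bestmem
w_opt <- c(w_opt, 1 - sum(w_opt))     # recover the full weight vector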
In reply to your comment: if you want DEoptim to give you the correct answer right away (i.e. without the need for a post-transformation), you could also try to include a penalty cost in your objective function, for example add B * abs(sum(w) - 1), where B is some arbitrarily large number, so that sum(w) will be forced towards 1.
I think you should add a penalty for any deviation from one.
Add to your minimizing problem the term +(sum(weights) - 1)^2 * 1e10. You should see that this huge penalty will force the weights to sum to 1!
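Applied to the objective from the question, that suggestion would look roughly like this (just a sketch reusing the variables defined above, with the absolute penalty swapped for the squared one):
fun_pen <- function(w) {
  - (t(w) %*% y - r) / cm.CVaR(M, lgd, ead = w, N, n, r, rho, alpha, rating) +
    (sum(w) - 1)^2 * 1e10   # huge squared penalty, zero only when sum(w) == 1
}
DEoptim(fn = fun_pen, lower = rep(0, N), upper = rep(1, N),
        control = DEoptim.control())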
With the trick you applied:
fun <- function(w) {
w <- w / sum(w)
- (t(w) %*% y - r) / cm.CVaR(M, lgd, ead = w, N, n, r, rho, alpha, rating) #+
#abs(10000 * (sum(w) - 1))
}
Why would you not use optim in this case? I think it will be much faster.
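For what it's worth, a rough sketch of that suggestion (my own illustration, untested against this problem: the starting point and method are arbitrary choices, and a gradient-based optim method may struggle because cm.CVaR is estimated by Monte Carlo simulation):
res_optim <- optim(par = rep(1/N, N),
                   fn  = function(w) as.numeric(fun(w)),  # fun as defined above, coerced to a scalar
                   method = "L-BFGS-B",
                   lower = rep(1e-6, N), upper = rep(1, N))
w_star <- res_optim$par / sum(res_optim$par)              # renormalize, as with DEoptim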
I came across a mathematical expression for the log-likelihood in a CrossValidated.com answer and am unclear how I should implement it in R. I'm not sure if SO can render the math the same way CV does, but this is the first equation in the second (not accepted) answer:
$$
\ell(\mu, \Sigma) = C - \frac{m}{2}\log|\Sigma| - \frac{1}{2} \sum_{i=1}^m \operatorname{tr}\left[(\mathbf{x}^{(i)}-\mu)^T \Sigma^{-1} (\mathbf{x}^{(i)}-\mu)\right]
$$
I'm focusing on the third term in that equation, and according to another answer on that page I do not think the trace operation is necessary there. I suppose I could look at one of the several implementations in the various packages that exist, but I suspect they use more economical approaches that don't clearly follow that equation's procedure, as @onyambu did in the answer here:
I'm ripping out code from an earlier SO example:
library(MASS)
# Make covariance matrix. See note above re the implications of using a correlation matrix.
S = matrix(c(1.0, 0.2, 0.1, 0.35, 0.0,
0.2, 1.0, 0.0, 0.4, 0.0,
0.1, 0.0, 1.0, 0.0, 0.4,
0.35, 0.4, 0.0, 1.0, 0.6,
0.0, 0.0, 0.4, 0.6, 1.0), ncol = 5)
colnames(S) = c("Y1", "X1", "X2", "Z1" ,"Z2")
rownames(S) = colnames(S)
# Make mean vector
mus = c(1, 2, 3, 4, 5); names(mus) = colnames(S)
# Generate 200 observations
obs = mvrnorm(n = 200, mu = mus, Sigma = S)
This effort was in response to a question that has now been answered correctly, though not with a summation of a matrix expression. I think I can do it with a for-loop that builds the individual contribution of each data point:
llmat.term3 <- matrix(NA, 200,1)
for(n in 1:200) {
llmat.term3[n] <- t(obs[n,]-mus) %*% solve(S) %*% (obs[n,]-mus) }
sum(llmat.term3)
#[1] 982.7356
.... but I'm wondering if there is a more compact matrix approach? Or, I suppose, something that fills the gaps in my linear-algebra knowledge and explains why sum(u * solve(sig, u)) is the same as sum_{i=1..N}( t(obs[i,]-mu) %*% S^-1 %*% (obs[i,]-mu) ).
In your code you have:
S = matrix(c(1.0, 0.2, 0.1, 0.35, 0.0,
0.2, 1.0, 0.0, 0.4, 0.0,
0.1, 0.0, 1.0, 0.0, 0.4,
0.35, 0.4, 0.0, 1.0, 0.6,
0.0, 0.0, 0.4, 0.6, 1.0), ncol = 5)
colnames(S) = c("Y1", "X1", "X2", "Z1" ,"Z2")
rownames(S) = colnames(S)
# Make mean vector
mus = c(1, 2, 3, 4, 5); names(mus) = colnames(S)
# Generate 200 observations
set.seed(123)
obs = MASS::mvrnorm(n = 200, mu = mus, Sigma = S)
llmat.term3 <- matrix(NA, 200,1)
for(n in 1:200) {
llmat.term3[n] <- t(obs[n,]-mus) %*% solve(S) %*% (obs[n,]-mus) }
sum(llmat.term3)
#[1] 982.7356
Compare to the more compact approaches:
u <- t(obs) - mus
sum(diag(solve(S, tcrossprod(u))))
#> [1] 982.7356
sum(u * solve(S, u))
#> [1] 982.7356
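For completeness, here is why the compact forms agree with the loop. With U = t(obs) - mus, column i of U is x^(i) - mu, so
$$
\sum_{i=1}^m (\mathbf{x}^{(i)}-\mu)^T \Sigma^{-1} (\mathbf{x}^{(i)}-\mu)
= \operatorname{tr}\left(U^T \Sigma^{-1} U\right)
= \operatorname{tr}\left(\Sigma^{-1} U U^T\right)
= \sum_{j,k} U_{jk}\left(\Sigma^{-1} U\right)_{jk}.
$$
The middle form is sum(diag(solve(S, tcrossprod(u)))) and the last (elementwise) form is sum(u * solve(S, u)).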
Though the two expressions give the same result, the first one seems to be quicker than the second. Note that tcrossprod(u) here is only a 5 x 5 matrix (u has 5 rows and n columns), so the first approach is cheaper than it may look. The for loop takes forever to compute.
Unit: milliseconds
expr min lq mean median uq max neval
a 4532.6753 4679.4043 5470.94765 4815.1294 6061.3284 7789.5116 10
b 2.8991 3.2693 3.73495 3.3675 3.7777 6.9719 10
c 7.8176 8.5473 12.03060 9.2542 16.4089 20.1742 10
set.seed(123)
n <- 200000
obs = MASS::mvrnorm(n = n, mu = mus, Sigma = S)
u <- t(obs) -mus
microbenchmark::microbenchmark(a = {
llmat.term3 <- matrix(NA, n,1)
for(i in seq(n)) {
llmat.term3[i] <- t(obs[i,]-mus) %*% solve(S) %*% (obs[i,]-mus) }
sum(llmat.term3)
},
b = sum(diag(solve(S, tcrossprod(u)))),
c = sum(u * solve(S, u)),
check = 'equal', times = 10)
NB: it took me a while to find the seed you used. Next time, include it in your data-generation code.
The situation is as follows:
I need to create a dataset of triplets where we have a discrete distribution of stock prices S <- c(80,100,120,140,160) with probabilities P <- c(0.2, 0.3, 0.2, 0.2, 0.1), a call option C <- max(S-120,0) = c(0,0,0,20,40), and the liability of an option which pays 30 inside a certain region and zero otherwise, namely L = I{110 \leq S \leq 150} * 30 = c(0,0,30,30,0). It is important to mention that if S[1] = 80 is drawn, then the corresponding C[1] and L[1] must be used, and this holds for every i = 1,2,3,4,5. How do you create a dataset of N = 10000 simulations where each value for index i corresponds to the other two values for the same i?
This is the code I had for now. Note that X_1 = S, X_2 = C and Y = L.
X_1 <- function(n) {
sample(c(80,100,120,140,160), size = n,
prob = c(0.2, 0.3, 0.2, 0.2, 0.1), replace=T)
}
X_2 <- function(n) {
sample(X_1 - 120, size = n,
prob = c(0.2, 0.3, 0.2, 0.2, 0.1), replace=T)
}
Y <- function(n) {
sample(L, size = n,
prob = c(0.2, 0.3, 0.2, 0.2, 0.1), replace=T)
}
##Creating triplets##
df <- data.frame(S_T = X_1(10000), C_T = X_2(10000), L_T =Y(10000))
df
I'm not sure if you want C_T to be dependent on the S_T values. If you do, I think you just want to call X_1, assign the results to an object, then use that as the argument to X_2 (or just subtract 120, which is what X_2 does).
X_1 <- function(n) {
sample(c(80,100,120,140,160), size = n,
prob = c(0.2, 0.3, 0.2, 0.2, 0.1), replace=T)
}
# Call that function
S_T <- X_1(10) # for practice
C_T <- S_T - 120 # that's all you're doing in function X_2, if you want to use S_T
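One caveat (my note, not part of the answer above): the question defines the call payoff as max(S - 120, 0), so if negative values should be floored at zero, pmax keeps it vectorized:
C_T <- pmax(S_T - 120, 0)   # call payoff, floored at zero elementwise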
If you want C_T to contain values independent of S_T, you can call one function within another:
X_1 <- function(n) {
sample(c(80,100,120,140,160), size = n,
prob = c(0.2, 0.3, 0.2, 0.2, 0.1), replace=T)
}
X_2 <- function(n) {
X_1(n) - 120
}
S_T <- X_1(10) # Same as above
C_T <- X_2(10) # Gives values not dependent on S_T
EDIT to address comment below:
It's hard to read the comment, but it looks like you want to create a function that takes the results of function X_1 and returns a result based on a condition. Use ifelse to read each element one at a time. You can create another function and then feed it the results of function X_1:
Y <- function(X_1_func){
ifelse( X_1_func == 80,
return(0),
ifelse(X_1_func == 100,
return(0),
ifelse(X_1_func == 120,
return(30),
return(60) # Default / last possible value if all the conditions above are FALSE
)
)
)
}
sapply(X_1(10), Y) # Use sapply to feed in one element of X_1's output at a time. Assign the results to L or whatever you wish to call it.
If this all works for you, you can accept the answer.
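If it helps, here is a compact alternative sketch (my own, not part of the answer above): sample the outcome index once per simulation, so that S_T, C_T and L_T always come from the same state i.
S_vals <- c(80, 100, 120, 140, 160)
P      <- c(0.2, 0.3, 0.2, 0.2, 0.1)
L_vals <- c(0, 0, 30, 30, 0)                        # liability as given in the question
idx <- sample(seq_along(S_vals), size = 10000, prob = P, replace = TRUE)
df  <- data.frame(S_T = S_vals[idx],
                  C_T = pmax(S_vals[idx] - 120, 0), # call payoff max(S - 120, 0)
                  L_T = L_vals[idx])
head(df)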
I am trying to implement CVaR portfolio optimisation in R. Basically trying to replicate the Matlab approach used in this paper:
https://ethz.ch/content/dam/ethz/special-interest/mtec/chair-of-entrepreneurial-risks-dam/documents/dissertation/master%20thesis/Thesis_Matthias_Kull_2014.pdf
To do this I need to perform nonlinear optimisation with nonlinear constraints.
I have tried to use the nloptr package, but found the derivative calculation for the gradient of matrices beyond me.
Instead I have opted for the NlcOptim package which formulates the constraints in the same way as the Matlab function used in the paper.
library(NlcOptim)
# ====================================================================
# Just generate arbitrary returns data and bootstrap -----------------
asset_returns <- rbind(c(0.1, 0.05, 0.05, 0.01, 0.06),
c(0.05, 0.05, 0.06, -0.01, 0.09),
c(0.025, 0.05, 0.07, 0.02, -0.1),
c(0.01, 0.05, 0.08, -0.02, -0.01),
c(0.01, 0.05, 0.08, 0.00, 0.2),
c(0.005, 0.05, 0.09, 0.005, -0.15),
c(0.01, 0.05, 0.08, 0.01, -0.01),
c(0.012, 0.05, 0.00, -0.01, -0.01),
c(0.015, 0.05, 0.00, 0.03, 0.05),
c(0.02, 0.05, -0.01, 0.04, 0.03))
# Returns for 5 assets over 10 trading periods
nAssets <- ncol(asset_returns)
nReturns <- nrow(asset_returns)
nPeriods <- 4
nSims <- 10
# BOOTSTRAP ---------------------------------------------------------
sim_period_returns <- matrix(nrow = nSims, ncol = nAssets)
for (k in 1:nSims) {# run nSims simulations
sim_returns <- matrix(nrow = nPeriods, ncol = nAssets)
sample_order <- sample(nReturns, nPeriods)
for (i in 1:nPeriods) {
sim_returns[i,] <- asset_returns[sample_order[i],]
}
sim_prices <- rbind(rep(1, nAssets), 1 + sim_returns)
for (j in 1:nAssets) {
sim_period_returns[k, j] <- prod(sim_prices[, j]) - 1
}
}
# ------------------------------------------------------------------------
# ========================================================================
# The important stuff ====================================================
returns <- sim_period_returns
alpha <- 0.95
CVaR_limit <- 0.025
UB <- 0.75
LB <- 0.05
# Inequality constraints
A <- rbind(c(rep(0, nAssets), 1, 1/((1-alpha)*nSims) * rep(1, nSims)),
cbind(- returns, -1, diag(nSims)))
b <- as.matrix(c(-CVaR_limit, rep(0, nSims)), nrow = nSims, ncol = 1)
# Equality constraints
Aeq <- c(rep(1, nAssets), 0, rep(0, nSims))
beq <- 1
# Upper and lower bounds
UB <- c(rep(UB, nAssets), Inf, rep(Inf, nSims))
LB <- c(rep(LB, nAssets), 0, rep(0, nSims))
# Initial portfolio weights
w0 <- rep(1/nAssets, nAssets)
VaR0 <- quantile(returns %*% w0, alpha, names = F)
w0 <- c(w0, VaR0, rep(0, nSims))
objective_function <- function(x) {
# objective function to minimise
return (-colMeans(returns) %*% x[1:nAssets])
}
# **********************************************
# The solnl function giving the error based on the above inputs
solnl(X = w0,
objfun = objective_function,
A = A,
B = b,
Aeq = Aeq,
Beq = beq,
lb = LB,
ub = UB)
# **********************************************
# ===================================================================
I am receiving the following error:
Error in if (eq > 0 & ineq > 0) { : argument is of length zero
I have read the package source code and tried to figure out what is causing this error, but am still at a loss.
Checking the source code and the input data, I think that the error originates at line 319 of NlcOptim, where the following code is called: nLineareq = nrow(Aeq). With Aeq defined the way you have it, nrow(Aeq) returns NULL, and a few lines later the expression if (eq > 0 & ineq > 0) is evaluated, which produces the error. For an explanation of that error, see "Argument is of length zero in if statement".
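To see why nrow(Aeq) is NULL here: Aeq was built with c(), so it is a plain vector with no dim attribute. A quick illustration (with made-up sizes):
aeq_vec <- c(rep(1, 5), 0, rep(0, 10))   # plain numeric vector, as in the question
nrow(aeq_vec)                            # NULL -- a vector has no dim attribute
nrow(t(aeq_vec))                         # 1    -- t() promotes it to a 1 x n matrix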
A quick fix could be to change the shape on Aeq by using
Aeq <- t(array(c(rep(1, nAssets), 0, rep(0, nSims))))
However, after changing that I get a different error when I try to run the code:
Error: object 'lambda' not found
I'm not sure whether the R implementation needs different initial conditions or the method is simply not converging, since the paper used an interior-point method for the optimization rather than the SQP implemented in NlcOptim.
I have the following script called test.R where I input variable values and then I call functions I wrote in SABR.R to calculate certain values.
However, when I run the code, I keep on getting errors such as
Error: could not find function SABR.BSIV
Error: could not find function SABR.calibration
What am I doing wrong here? It also says "object k not found" even though I very clearly declared it in my test.R code.
library(testthat)
source("SABR.R")
test_that("SABR Model Test", {
iv <- c(0.346, 0.280, 0.243, 0.208, 0.203, 0.192, 0.192, 0.201, 0.205, 0.223, 0.228, 0.247, 0.252, 0.271, 0.275, 0.293, 0.313)
k <- c(12.0, 15.0, 17.0, 19.5, 20.0, 22.0, 22.5, 24.5, 25.0, 27.0, 27.5, 29.5, 30.0, 32.0, 32.5, 34.5, 37.0)
f <- 22.724
t <- 0.583
a <- 0.317
b <- 0.823
r <- 0.111
n <- 1.050
iv.model <- SABR.BSIV(t, f, k, a, b, r, n)
params <- SABR.calibration(t, f, k, iv)
iv.calibrated <- SABR.BSIV(t, f, k, params[1], params[2], params[3], params[4])
# Check whether initial model can produce market IV or not
for(i in seq_along(k)){expect_equal(iv.model[i], iv[i], tolerance = 0.01*iv[i])}
# Check whether calibrated parameters can produce market IV or not
for(i in seq_along(k)){expect_equal(iv.calibrated[i], iv[i], tolerance = 0.01*iv[i])}
})
Here is the SABR.R code:
EPS <- 10^(-8)
# Sub function for SABR BS-IV (Black-Scholes IV?)
.x <- function(z, r){log((sqrt(1-2*r*z+z^2)+z-r)/(1-r))}
.z <- function(f, k, a, b, nu){nu/a*(f*k)^(0.5*(1-b))*log(f/k)}
# Variable transformation function
.t1 <- function(x){1/(1+exp(x))}
.t2 <- function(x){2/(1+exp(x)) -1}
# Black-Scholes IV approximation formula by Hagan
SABR.BSIV <- function(t, f, k, a, b, r, n)
{
z <- .z(f, k, a, b, n)
x <- .x(z, r)
numerator <- 1 + ((1-b)^2/24*a^2/(f*k)^(1-b) + 0.25*r*b*n*a/(f*k)^(0.5*(1-b)) + (2-3*r^2)*n^2/24)*t
denominator <- x*(f*k)^(0.5*(1-b))*(1 + (1-b)^2/24*(log(f/k))^2 + (1-b)^4/1920*(log(f/k))^4)
ifelse(abs((f-k)/f) < EPS, a*numerator/f^(1-b), z*a*numerator/denominator)
}
# Parameter calibration function for SABR
SABR.calibration <- function(t, f, k, iv)
{
# Objective function for the optimization; variables are transformed so that the constraint conditions are satisfied
objective <- function(x){sum( (iv - SABR.BSIV(t, f, k, exp(x[1]), .t1(x[2]), .t2(x[3]), exp(x[4])))^2) }
x <- nlm(objective, c(0.25, 0.5, 0.5, 0.5))
# Return optimized parameters
parameter <- x$estimate
parameter <- c(exp(parameter[1]), .t1(parameter[2]), .t2(parameter[3]), exp(parameter[4]))
names(parameter) <- c("Alpha", "Beta", "Rho", "Nu")
parameter
}
I'm using R 3.3.1 (64-bit) on Windows 10. I have an x-y dataset that I've fit with a 2nd order polynomial. I'd like to solve that best-fit polynomial for x at y=4, and plot drop-down lines from y=4 to the x-axis.
This will generate the data in a dataframe v1:
v1 <- structure(list(x = c(-5.2549, -3.4893, -3.5909, -2.5546, -3.7247,
-5.1733, -3.3451, -2.8993, -2.6835, -3.9495, -4.9649, -2.8438,
-4.6926, -3.4768, -3.1221, -4.8175, -4.5641, -3.549, -3.08, -2.4153,
-2.9882, -3.4045, -4.6394, -3.3404, -2.6728, -3.3517, -2.6098,
-3.7733, -4.051, -2.9385, -4.5024, -4.59, -4.5617, -4.0658, -2.4986,
-3.7559, -4.245, -4.8045, -4.6615, -4.0696, -4.6638, -4.6505,
-3.7978, -4.5649, -5.7669, -4.519, -3.8561, -3.779, -3.0549,
-3.1241, -2.1423, -3.2759, -4.224, -4.028, -3.3412, -2.8832,
-3.3866, -0.1852, -3.3763, -4.317, -5.3607, -3.3398, -1.9087,
-4.431, -3.7535, -3.2545, -0.806, -3.1419, -3.7269, -3.4853,
-4.3129, -2.8891, -3.0572, -5.3309, -2.5837, -4.1128, -4.6631,
-3.4695, -4.1045, -7.064, -5.1681, -6.4866, -2.7522, -4.6305,
-4.2957, -3.7552, -4.9482, -5.6452, -6.0302, -5.3244, -3.9819,
-3.8123, -5.3085, -5.6096, -6.4557), y = c(0.99, 0.56, 0.43,
2.31, 0.31, 0.59, 0.62, 1.65, 2.12, 0.1, 0.24, 1.68, 0.09, 0.59,
1.23, 0.4, 0.36, 0.49, 1.41, 3.29, 1.22, 0.56, 0.1, 0.67, 2.38,
0.43, 1.56, 0.07, 0.08, 1.53, -0.01, 0.12, 0.1, 0.04, 3.42, 0.23,
0, 0.34, 0.15, 0.03, 0.19, 0.17, 0.2, 0.09, 2.3, 0.07, 0.15,
0.18, 1.07, 1.21, 3.4, 0.8, -0.04, 0.02, 0.74, 1.59, 0.71, 10.64,
0.64, -0.01, 1.06, 0.81, 4.58, 0.01, 0.14, 0.59, 7.35, 0.63,
0.17, 0.38, -0.08, 1.1, 0.89, 0.94, 1.52, 0.01, 0.1, 0.38, 0.02,
7.76, 0.72, 4.1, 1.36, 0.13, -0.02, 0.13, 0.42, 1.49, 2.64, 1.01,
0.08, 0.22, 1.01, 1.53, 4.39)), .Names = c("x", "y"), class = "data.frame", row.names = c(NA,
-95L))
Here's the code to plot y vs x, plot the best fit polynomial, and draw a line at y=4.
> attach(v1)
> # simple x-y plot of the data
> plot(x,y, pch=16)
> # 2nd order polynomial fit
> fit2 <- lm(y~poly(x,2,raw=TRUE))
> summary(fit2)
> # generate range of numbers for plotting polynomial
> xx <- seq(-8,0, length=50)
> # overlay best fit polynomial
> lines(xx, predict(fit2, data.frame(x=xx)), col="blue")
> # add horizontal line at y=4
> abline(h=4, col="red")
It's obvious from the plot that y=4 at x of around -2 and -6.5, but I'd like to actually solve the regression polynomial for those values.
Ideally, I'd like lines that drop down from the red-blue line intersections to the x-axis (i.e. plot vertical ablines that terminate at the two y=4 solutions). If that's not possible, I'd be happy with good old vertical ablines that go all the way up the plot, so long as they are at the proper x solution values.
This graph represents parts that will be out-of-spec when y>4, so I want to use the drop-down lines to highlight the range of x values that will produce in-spec parts.
You can use the quadratic formula to calculate the values:
betas <- coef(fit2) # get coefficients
betas[1] <- betas[1] - 4 # adjust intercept to look for values where y = 4
# note degree increases, so betas[1] is c, etc.
betas
## (Intercept) poly(x, 2, raw = TRUE)1 poly(x, 2, raw = TRUE)2
## 8.7555833 6.0807302 0.7319848
solns <- c((-betas[2] + sqrt(betas[2]^2 - 4 * betas[3] * betas[1])) / (2 * betas[3]),
(-betas[2] - sqrt(betas[2]^2 - 4 * betas[3] * betas[1])) / (2 * betas[3]))
solns
## poly(x, 2, raw = TRUE)1 poly(x, 2, raw = TRUE)1
## -1.853398 -6.453783
segments(solns, -1, solns, 4, col = 'green') # add segments to graph
Much simpler (if you can find it) is polyroot:
polyroot(betas)
## [1] -1.853398+0i -6.453783+0i
Since it returns a complex vector, you'll need to wrap it in as.numeric if you want to pass it to segments.
I absolutely understand that there is an analytical solution for this simple quadratic polynomial. The reason I show you a numerical solution is that you asked this question in a regression setting. A numerical solution may always be your fallback in general, when you have a more complicated regression curve.
In the following I will use the uniroot function. If you are not familiar with it, read this short answer first: Uniroot solution in R.
This is the plot produced with your code. You are almost there. This is a root-finding problem, and you can solve it numerically with uniroot. Let's define a function:
f <- function (x) {
## subtract 4
predict(fit2, newdata = data.frame(x = x)) - 4
}
From the figure, it is clear that there are two roots, one inside [-7, -6], the other inside [-3, -1]. We use uniroot to find both:
x1 <- uniroot(f, c(-7, -6))$root
#[1] -6.453769
x2 <- uniroot(f, c(-3, -1))$root
#[1] -1.853406
Now you can drop a vertical line from these points down to x-axis:
y1 <- f(x1) + 4 ## add 4 back
y2 <- f(x2) + 4
abline(h = 0, col = 4) ## x-axis
segments(x1, 0, x1, y1, lty = 2)
segments(x2, 0, x2, y2, lty = 2)
You have a quadratic equation
0.73198 * x^2 + 6.08073 * x + 12.75558 = 4
OR
0.73198 * x^2 + 6.08073 * x + 8.75558 = 0
You can just use the quadratic formula to solve this analytically. R gives the two roots:
(-6.08073 + sqrt(6.08073^2 -4*0.73198 * 8.75558)) / (2 * 0.73198)
[1] -1.853392
(-6.08073 - sqrt(6.08073^2 -4*0.73198 * 8.75558)) / (2 * 0.73198)
[1] -6.453843
abline(v=c(-1.853392, -6.453843))
Here is one more solution, based on this
attach(v1)
fit2 = lm(y~poly(x,2,raw=TRUE))
xx = seq(-8,0, length=50)
vector1 = predict(fit2, data.frame(x=xx))
vector2= replicate(length(vector1),4)
# Find points where vector1 is above vector2.
above = vector1 > vector2
# Intersections occur where 'above' changes from TRUE to FALSE or vice versa
intersect.points = which(diff(above)!=0)
# Find the slopes for each line segment.
vector1.slopes = vector1[intersect.points+1] - vector1[intersect.points]
vector2.slopes = vector2[intersect.points+1] - vector2[intersect.points]
# Find the intersection for each segment.
x.points = intersect.points + ((vector2[intersect.points] - vector1[intersect.points]) / (vector1.slopes-vector2.slopes))
y.points = vector1[intersect.points] + (vector1.slopes*(x.points-intersect.points))
#Scale x.points to the axis value of xx
x.points = xx[1] + ((x.points - 1)/(49))*(xx[50]-xx[1])
plot(xx, y = vector1, type= "l", col = "blue")
points(x,y,pch = 20)
lines(x = c(x.points[1],x.points[1]), y = c(0,y.points[1]), col='red')
lines(x = c(x.points[2],x.points[2]), y = c(0,y.points[2]), col='red')
Many solutions are already proposed, here is another one.
Obviously, we are interested in finding the x values that satisfy the polynomial (quadratic) equation a_0 + a_1*x + a_2*x^2 = 4, where a_0, a_1, a_2 are the coefficients of the fitted polynomial. We can rewrite this as a standard quadratic equation ax^2 + bx + c = 0 and find the roots with Sridhar's formula, using the coefficients obtained from the polynomial regression, as follows:
a <- fit2$coefficients[3]
b <- fit2$coefficients[2]
c <- fit2$coefficients[1] - 4
as.numeric((-b + sqrt(b^2-4*a*c)) / (2*a))
#[1] -1.853398
as.numeric((-b - sqrt(b^2-4*a*c)) / (2*a))
#[1] -6.453783
We can also use a numerical method such as Newton-Raphson to find the roots (there are faster numerical methods, but this serves our purpose and is quite fast too; it takes about 0.016 s on my machine). As we can see from the following code, the numerical and the theoretical solutions agree.
a <- fit2$coefficients # fitted quadratic polynomial coefficients
f <- function(x) {
as.numeric(a[1] + a[2]*x + a[3]*x^2-4)
}
df <- function(x) {
as.numeric(a[2] + 2*a[3]*x)
}
Newton.Raphson <- function(x0) {
eps <- 1e-6
x <- x0
while(TRUE) {
x <- x0 - f(x0) / df(x0)
if (abs(x - x0) < eps) {
return(x0)
}
x0 <- x
}
}
t1 <- Sys.time()
x1 <- Newton.Raphson(-10)
x2 <- Newton.Raphson(10)
x1
#[1] -6.453783
x2
#[1] -1.853398
print(paste('time taken to compute the roots:' ,Sys.time() - t1))
#[1] "time taken to compute the roots: 0.0160109996795654"
points(x1, 4, pch=19, col='green')
points(x2, 4, pch=19, col='green')
abline(v=x1, col='green')
abline(v=x2, col='green')