Setting up a Horner polynomial in R

I am trying to set up a function in R that computes a polynomial
P(x) = c1 + c2*x + c3*x^2 + ... + c(n-1)*x^(n-2) + cn*x^(n-1)
for various values of x and a fixed set of coefficients c.
Horner's method is to
Set bn = cn
For i = n-1, n-2, ..., 2, 1, set bi = b(i+1)*x + ci
Return b1
What I have so far:
hornerpoly1 <- function(x, coef, output = tail(coef,n=1), exp = seq_along(coef)-1) {
for(i in 1:tail(exp,n=1)) {
(output*x)+head(tail(coef,n=i),n=1)
}
}
hornerpoly <- function(x, coef) {
exp<-seq_along(coef)-1
output<-tail(coef,n=1)
if(length(coef)<2) {
stop("Must be more than one coefficient")
}
sapply(x, hornerpoly1, coef, output,exp)
}
I also need to error check on the length of coef, that's what the if statement is for but I am not struggling with that part. When I try to compute this function for x = 1:3 and coef = c(4,16,-1), I get three NULL statements, and I can't figure out why. Any help on how to better construct this function or remedy the null output is appreciated. Let me know if I can make anything more clear.

How about the following:
Define a function that takes x as the argument at which to evaluate the polynomial, and coef as the vector of coefficients in increasing order of degree (constant term first, as in your definition of P). So the vector coef = c(-1, 16, 4) corresponds to P(x) = -1 + 16 * x + 4 * x^2.
The Horner algorithm is implemented in the following function:
f.horner <- function(x, coef) {
  n <- length(coef)
  b <- rep(0, n)
  b[n] <- coef[n]   # start from the highest-degree coefficient
  while (n > 1) {   # stop once b[1] has been computed
    n <- n - 1
    b[n] <- coef[n] + b[n + 1] * x
  }
  return(b[1])
}
We evaluate the polynomial at x = 1:3 for coef = c(-1, 16, 4):
sapply(1:3, f.horner, c(-1, 16, 4))
#[1] 19 47 83
Some final comments:
Note that the length of coef is handled by the condition while (n > 1) {...}, i.e. we go through the coefficients starting from the last and stop when we reach the first coefficient.
You don't need to save the intermediate b values as a vector in the function. This is purely for (my) educational/trouble-shooting purposes. It's easy to rewrite the code to store only b's last value, and then update b every iteration. You could then also vectorise f.horner to take a vector of x values instead of only a scalar.
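The last point can be sketched as follows. This is a minimal version, assuming coef is ordered from the constant term upwards; the name horner_vec is made up for illustration. It keeps only the latest b and works on a whole vector of x values at once, so sapply is no longer needed. (Incidentally, the NULLs in the question most likely come from the for loop: a for loop in R evaluates to NULL, and its body never assigns the updated value back to output.)

```r
# Vectorised Horner evaluation (illustrative sketch, not the original answer's code).
# Assumption: coef[1] is the constant term, coef[length(coef)] the leading coefficient.
horner_vec <- function(x, coef) {
  stopifnot(length(coef) >= 1)
  n <- length(coef)
  b <- rep(coef[n], length(x))      # b_n = c_n, one copy per x value
  for (ci in rev(coef[-n])) {       # c_(n-1), ..., c_1
    b <- b * x + ci                 # b_i = b_(i+1) * x + c_i
  }
  b
}

horner_vec(1:3, c(-1, 16, 4))
# [1] 19 47 83
```

With the coefficients from the question, horner_vec(1:3, c(4, 16, -1)) evaluates 4 + 16*x - x^2.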

Related

Integration in R language

I'm trying to compute the integral between 1 and some cutoff 'cut' of the function given in the R-code below as 'int'. It depends on 2 parameters dM[i] and dLambda[j] defined before I make the integration and for each pair I save the results in vector 'vec':
vec = c() #vector for INT values: this is our goal
dM = seq(from = 0, to = 3, by = 0.01) #vector for mass density parameter
dLambda = seq(from = -1.5, to = 3, by = 0.01) #vector for vacuum energy density parameter
for (i in 1:length(dM)) {
for (j in 1:length(dLambda)) {
int = function(x) ((dM[i]*x^4*(x - 1) + dLambda[j]*x^2*(1 - x^2) + x^4)^(-1/2))
cut = 30
INT_data = integrate(int, 1, cut)
INT = INT_data$value
vec = c(vec, INT)
}
}
But when I run the script I get the error: "Error in integrate(int, 1, cut) : non-finite function value". Nonetheless, if I try the following code
int = function(x) ((0*x^4*(x - 1) -1.5*x^2*(1 - x^2) + x^4)^(-1/2))
cut = 30
INT_data = integrate(int, 1, cut)
INT = INT_data$value
vec = c(vec, INT)
I get the correct result without any error. So the error above is not true, it can calculate the integral but it seems that R cannot work it out if I use the 2 'for'-loops. How can I re-write the code so I can compute all the different values for dM[i] and dLambda[j] I want?
Your function is only defined for some values of dM and dLambda. You can use the try() function to attempt evaluation, but not stop in case an error occurs.
It's also a lot more efficient to pre-allocate the object to hold the results; running vec = c(vec, INT) gradually grows it, and that's very slow, because R needs to keep creating new vectors just one element longer than the last one.
This code fixes both issues, and then plots the result:
dM <- seq(from = 0, to = 3, by = 0.01) #vector for mass density parameter
dLambda <- seq(from = -1.5, to = 3, by = 0.01) #vector for vacuum energy density parameter
result <- matrix(NA, length(dM), length(dLambda))
for (i in 1:length(dM)) {
for (j in 1:length(dLambda)) {
int <- function(x) ((dM[i]*x^4*(x - 1) + dLambda[j]*x^2*(1 - x^2) + x^4)^(-1/2))
cut <- 30
INT_data <- try(integrate(int, 1, cut), silent = TRUE)
if (!inherits(INT_data, "try-error"))
result[i, j] <- INT_data$value
}
}
image(dM, dLambda, result)
Edited to add: Here's how this works. If integrate signals an error in your original code, the loop will stop. try() prevents that. If there's no error, it returns the result of the integrate call. If there is an error, it returns an object with information about the error. That object has class "try-error", so the check if (!inherits(INT_data, "try-error")) is basically asking "Was there an error?" If there was an error, nothing happens, and that entry of the result is left as NA, as it was initialized. The loop then goes on to try the next dM, dLambda pair.
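The try() pattern described above can be seen in isolation in a small sketch. The helper name safe_integral is made up for illustration; the second integrand is the question's expression with dM = 0 and dLambda = 1.01, which is not defined over the whole range.

```r
# Hypothetical helper: return the integral, or NA if integrate() signals an error.
safe_integral <- function(f, lower, upper) {
  res <- try(integrate(f, lower, upper), silent = TRUE)
  if (inherits(res, "try-error")) NA_real_ else res$value
}

# A well-behaved integrand: the integral of x^(-2) on [1, 30] is 1 - 1/30.
ok <- safe_integral(function(x) x^(-2), 1, 30)

# An integrand that becomes NaN inside the range, so integrate() stops with an error.
bad <- safe_integral(function(x) (1.01 * x^2 - 0.01 * x^4)^(-1/2), 1, 30)

ok   # about 0.9667
bad  # NA
```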
The problem is mathematical rather than being related to coding. The function is not defined for the whole domain you are integrating. With dM[1] = 0 and dLambda > 1, your expression
(dM[i]*x^4*(x - 1) + dLambda[j]*x^2*(1 - x^2) + x^4)^(-1/2)
simplifies to
(dLambda[j] * x^2 * (1 - x^2) + x^4)^(-1/2)
so let's take dLambda[j] at 1.01, which is where your calculation stops:
(1.01 * x^2 * (1 - x^2) + x^4)^(-1/2)
which is
(1.01 * x^2 - 1.01 * x^4 + x^4)^(-1/2)
or
(1.01 * x^2 - 0.01 * x^4)^(-1/2)
Now, you are evaluating x between 1 and 30. So what happens when x = 11?
(1.01 * 121 - 0.01 * 14641)^(-1/2)
This leaves you
(122.21 - 146.41)^(-1/2)
which is equivalent to
1/sqrt(-24.2)
So the reason for the error is that you are integrating a function in a domain in which it is undefined.
The function is badly behaved for other values of dM too, with infinite peaks in the midst of the range, so even using the integrate(..., stop.on.error = FALSE) option won't allow you to keep calculating, because you will get an infinite sum.
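The arithmetic above can be checked numerically. The sketch below evaluates only the expression under the square root on a grid of x values; the function name radicand and the grid spacing are choices made here for illustration.

```r
# Expression under the (-1/2) power in the question's integrand.
radicand <- function(x, dM, dLambda) {
  dM * x^4 * (x - 1) + dLambda * x^2 * (1 - x^2) + x^4
}

radicand(11, dM = 0, dLambda = 1.01)   # about -24.2, as computed by hand above

# Where on [1, 30] is the radicand non-positive for this parameter pair?
x <- seq(1, 30, by = 0.01)
bad <- x[radicand(x, dM = 0, dLambda = 1.01) <= 0]
range(bad)   # starts just past sqrt(101), roughly 10.05, and runs to 30
```

The same check can be run for any (dM, dLambda) pair before calling integrate, to decide whether the integral is worth attempting.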

Sequential Quadratic Programming in R to find optimal weights of an Equally-Weighted Risk Contribution Portfolio

Introduction to the problem
I am trying to write R code to obtain the weights of an Equally-Weighted Risk Contribution (ERC) portfolio. As some of you may know, this portfolio construction was presented by Maillard, Roncalli and Teiletche.
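Skipping technicalities, in order to find the optimal weights of an ERC portfolio one needs to solve the following minimisation problem by Sequential Quadratic Programming:
minimise over x: sum_{i=1..N} sum_{j=1..N} ( x_i*(Σx)_i - x_j*(Σx)_j )^2
with:
sum_{i=1..N} x_i = 1 and 0 <= x_i <= 1 for i = 1, ..., N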
Suppose we are analysing N assets. In the above formulas, we have that x is a (N x 1) vector of portfolio weights and Σ is the (N x N) variance-covariance matrix of asset returns.
What I have done so far
Using the function slsqp of the package nloptr which solves SQP problems, I would like to solve the above minimisation problem. Here is my code. Firstly, the objective function to be minimised:
ObjFuncERC <- function (x, Sigma) {
sum <- 0
R <- Sigma %*% x
for (i in 1:N) {
for (j in 1:N) {
sum <- sum + (x[i]*R[i] - x[j]*R[j])^2
}
}
}
Secondly, the starting point (we start by an equally-weighted portfolio):
x0 <- matrix(1/N, nrow = N, ncol = 1)
Then, the equality constraint (weights must sum to one, that is: sum of the weights minus one equal zero):
heqERC <- function (x) {
h <- numeric(1)
h[1] <- (t(matrix(1, nrow = N, ncol = 1)) %*% x) - 1
return(h)
}
Finally, the lower and upper bounds constraints (weights cannot exceed one and cannot be lower than zero):
lowerERC <- matrix(0, nrow = N, ncol = 1)
upperERC <- matrix(1, nrow = N, ncol = 1)
So that the function which should output optimal weights is:
slsqp(x0 = x0, fn = ObjFuncERC, Sigma = Sigma, lower = lowerERC, upper = upperERC, heq = heqERC)
Unfortunately, I do not know how to share my variance-covariance matrix with you (it is named Sigma and is a (29 x 29) matrix, so N = 29) so that you can reproduce my result, but you can simulate one.
The output error
Running the above code yields the following error:
Error in nl.grad(x, fn) :
Function 'f' must be a univariate function of 2 variables.
I have no idea what to do guys. Probably, I have misunderstood how things must be written down in order for the function slsqp to understand what to do. Can someone help me understand how to fix the problem and get the result I want?
UPDATE ONE: as pointed out by #jogo in the comments, I have updated the code, but it still produces an error. The code and the error above are now updated.
UPDATE 2: as requested by #jaySf, here is the full code that allows you to reproduce my error.
## ERC Portfolio Test
# Preliminary Operations
rm(list=ls())
require(quantmod)
require(nloptr)
# Load Stock Data in R through Yahoo! Finance
stockData <- new.env()
start <- as.Date('2014-12-31')
end <- as.Date('2017-12-31')
tickers <-c('AAPL','AXP','BA','CAT','CSCO','CVX','DIS','GE','GS','HD','IBM','INTC','JNJ','JPM','KO','MCD','MMM','MRK','MSFT','NKE','PFE','PG','TRV','UNH','UTX','V','VZ','WMT','XOM')
getSymbols.yahoo(tickers, env = stockData, from = start, to = end, periodicity = 'monthly')
# Create a matrix containing the price of all assets
prices <- do.call(cbind,eapply(stockData, Op))
prices <- prices[-1, order(colnames(prices))]
colnames(prices) <- tickers
# Compute Returns
returns <- diff(prices)/lag(prices)[-1,]
# Compute variance-covariance matrix
Sigma <- var(returns)
N <- 29
# Set up the minimization problem
ObjFuncERC <- function (x, Sigma) {
sum <- 0
R <- Sigma %*% x
for (i in 1:N) {
for (j in 1:N) {
sum <- sum + (x[i]*R[i] - x[j]*R[j])^2
}
}
}
x0 <- matrix(1/N, nrow = N, ncol = 1)
heqERC <- function (x) {
h <- numeric(1)
h[1] <- t(matrix(1, nrow = N, ncol = 1)) %*% x - 1
}
lowerERC <- matrix(0, nrow = N, ncol = 1)
upperERC <- matrix(1, nrow = N, ncol = 1)
slsqp(x0 = x0, fn = ObjFuncERC, Sigma = Sigma, lower = lowerERC, upper = upperERC, heq = heqERC)
I spotted several mistakes in your code. For instance, ObjFuncERC is not returning any value. You should use the following instead:
# Set up the minimization problem
ObjFuncERC <- function (x, Sigma) {
sum <- 0
R <- Sigma %*% x
for (i in 1:N) {
for (j in 1:N) {
sum <- sum + (x[i]*R[i] - x[j]*R[j])^2
}
}
sum
}
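heqERC doesn't explicitly return anything either (in your full script its last expression is an assignment, so the value is only returned invisibly). I also simplified the function a bit: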
heqERC <- function (x) {
sum(x) - 1
}
I made those changes and tried slsqp without lower and upper and it worked. Still, another thing to consider is that you set lowerERC and upperERC as matrices. Use the following instead:
lowerERC <- rep(0,N)
upperERC <- rep(1,N)
Hope this helps.
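For anyone who wants to try this without downloading the stock data, here is a minimal sketch on a simulated covariance matrix. The simulated Sigma, the small N and the name erc_objective are assumptions made for illustration. The objective is a vectorised equivalent of the double loop, and Sigma is passed through a closure instead of through the extra arguments of slsqp. The optimiser call only runs if the nloptr package is installed.

```r
set.seed(1)
N <- 5                                   # small illustrative size (the question uses 29)
A <- matrix(rnorm(N * N), N, N)
Sigma <- crossprod(A) / N                # simulated positive semi-definite covariance matrix

# Sum over all pairs (i, j) of squared differences in x_i * (Sigma x)_i.
erc_objective <- function(x, Sigma) {
  rc <- as.numeric(x * (Sigma %*% x))    # per-asset terms x_i * (Sigma x)_i
  sum(outer(rc, rc, "-")^2)              # same value as the double for loop
}

x0 <- rep(1 / N, N)                      # equally-weighted starting point
erc_objective(x0, Sigma)

if (requireNamespace("nloptr", quietly = TRUE)) {
  fit <- nloptr::slsqp(x0 = x0,
                       fn = function(x) erc_objective(x, Sigma),
                       lower = rep(0, N), upper = rep(1, N),
                       heq = function(x) sum(x) - 1)
  print(round(fit$par, 4))               # candidate ERC weights
}
```

Returning a single number from the objective is what matters here: the error in the question appears when the function handed to slsqp does not return a value of length one.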

How to loop all values in a dataframe as the start value in maxLik

I'm doing Maximum Likelihood Estimation using maxLik, which requires specifying starting values. Instead of specifying a single value, is there any way that allows me to use all the values from a matrix as the start value?
My current code of maxLik is:
f12 <- function(param){
alpha <- param[1]
rho <- param[2]
lambda <- param[3]
u <- 0.5*(p12$v_50_1)^alpha + 0.5*lambda*(p12$v_50_2)^alpha
p <- 1/(1 + exp(-rho*u))
f <- sum(p12$gamble*log(p) + (1-p12$gamble)*log(1-p))}
ml <- maxLik(f12, start = c(alpha = 1, rho=2, lambda = 1), method = "NM")
I create a dataframe with the upper and lower bounds of potential start values:
st <- expand.grid(alpha = seq(0, 2, len = 100), rho = seq(0, 1, len = 100), lambda = seq(0, 2, len = 100))
There are 3 parameters in my function, and my goal is to loop all the values in the above dataframe st and select the best vector of start values after running the model from a variety of starting parameters.
Thanks!
Consider Map (a wrapper for mapply) to pass the st columns elementwise through your method. Here, Map will return a list of maxLik objects (these inherit from the maxim class and are themselves lists of components). The number of items in this list will equal the number of rows of st.
Notice that the input parameters a, r, and l are passed into the start argument of maxLik() in place of the hard-coded values. The body of f12 is left untouched.
maxLik_run <- function(a, r, l) {
tryCatch({
f12 <- function(param){
alpha <- param[1]
rho <- param[2]
lambda <- param[3]
u <- 0.5*(p12$v_50_1)^alpha + 0.5*lambda*(p12$v_50_2)^alpha
p <- 1/(1 + exp(-rho*u))
f <- sum(p12$gamble*log(p) + (1-p12$gamble)*log(1-p))
}
return(maxLik(f12, start = c(alpha = a, rho = r, lambda = l), method = "NM"))
}, error = function(e) return(NA))
}
# 100 grid points per parameter gives 100^3 start vectors; reduce len for a quicker run
st <- expand.grid(alpha = seq(0, 2, len = 100),
rho = seq(0, 1, len = 100),
lambda = seq(0, 2, len = 100))
maxLik_list <- Map(maxLik_run, st$alpha, st$rho, st$lambda)
As for the question of the "best vector of start values after running the model from a variety of starting parameters": this requires a particular definition of "best". Once you define it, you can use Filter() on the returned list of objects to select the one or more elements that yield this "best".
Below is a demonstration that finds the highest log-likelihood value across all runs (the maximum component of each fit). Use the estimate component if you need the fitted parameters instead. Note that the returned list can contain more than one item if the highest value is shared by several fits:
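# drop failed runs (tryCatch returned NA for those) before comparing maxima
ok_fits <- Filter(is.list, maxLik_list)
highest_value <- max(sapply(ok_fits, function(item) item$maximum))
maxLik_item_list <- Filter(function(i) i$maximum == highest_value, ok_fits)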
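The same multi-start idea can be tried end to end without the original data. The sketch below is an illustration under stated assumptions: p12 is simulated, the grid is deliberately tiny, and base R's optim (Nelder-Mead, with fnscale = -1 to maximise) stands in for maxLik so that the example needs no extra packages. The names loglik, fit_one and starts are made up here.

```r
set.seed(42)
n <- 200
# Simulated stand-in for p12; the column names follow the question.
p12 <- data.frame(v_50_1 = runif(n, 1, 10), v_50_2 = runif(n, 1, 10))
true_u <- 0.5 * p12$v_50_1^0.8 + 0.5 * 1.0 * p12$v_50_2^0.8
p12$gamble <- rbinom(n, 1, 1 / (1 + exp(-0.2 * true_u)))

# Log-likelihood with the same structure as f12 in the question.
loglik <- function(param, data) {
  alpha <- param[1]; rho <- param[2]; lambda <- param[3]
  u <- 0.5 * data$v_50_1^alpha + 0.5 * lambda * data$v_50_2^alpha
  p <- 1 / (1 + exp(-rho * u))
  p <- pmin(pmax(p, 1e-12), 1 - 1e-12)   # keep log() finite
  sum(data$gamble * log(p) + (1 - data$gamble) * log(1 - p))
}

# A small grid of start values (27 rows) instead of the full 100^3 grid.
starts <- expand.grid(alpha = c(0.5, 1, 1.5),
                      rho = c(0.1, 0.5, 1),
                      lambda = c(0.5, 1, 2))

# One fit per start vector; a failed fit is recorded as NULL.
fit_one <- function(a, r, l) {
  tryCatch(
    optim(c(alpha = a, rho = r, lambda = l), loglik, data = p12,
          method = "Nelder-Mead", control = list(fnscale = -1)),
    error = function(e) NULL
  )
}
fits <- Map(fit_one, starts$alpha, starts$rho, starts$lambda)

# Maximised log-likelihood per start; NA where the fit failed.
logliks <- sapply(fits, function(f) if (is.null(f)) NA_real_ else f$value)
best <- which.max(logliks)
starts[best, ]       # start vector that led to the highest log-likelihood
fits[[best]]$par     # parameter estimates from that run
```

Here "best" is defined as the highest maximised log-likelihood; any other criterion can be swapped in where logliks is computed.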
What you are doing in your logLik function is calculating alpha, lambda and rho, whereas your data already has them. Those are the lines with u, p and f12 (which is also your function name!). Also, it is possible to calculate the log likelihood for one row, as your log likelihood function uses single indices. So you can run the code using apply like this:
#create a function to find mle estimate for first row
maxlike <- function(a) {
f12 <- function(param){
alpha <- param[1]
rho <- param[2]
lambda <- param[3]
#u <- 0.5*(p12$v_50_1)^alpha + 0.5*lambda*(p12$v_50_2)^alpha
#p <- 1/(1 + exp(-rho*u))
#f12 <- sum(p12$gamble*log(p) + (1-p12$gamble)*log(1-p))
}
ml <- maxLik(f12, start = c(alpha = 1, rho=2, lambda = 1), method = "NM")
}
#then use apply over the rows of st (MARGIN = 1 means rows, 2 means columns) with your mle function
mle <- apply(st,1,maxlike)
mle

the integrand has the sum of a vector

I need to integrate a function integrand. The function integrand is a product of A and B. A = 2/(upper-lower), and B is the sum of a vector depending on the input parameter.
if I have
X = 7,
N = 50,
Ck # a vector of N elements,
uk # a vector of N elements,
upper = 10,
lower = -10
and my R-code is as follow:
integrand<-function(y)
{
df<-matrix(,nrow = N,ncol = 1);
res<-NA;
for(k in 1:N)
df[k]<-Ck[k]*cos(y-lower)*uk[k]
res<-2/(upper-lower)*sum(df);
return(res)
}
integrate(function(x){integrand(x)},upper=X,lower = lower)$value
I got an error message after running the code:
Error in integrate(function(x) { :
evaluation of function gave a result of wrong length
what is my mistake?
Additionally, if df[k]<-Ck[k]*(cos(y-lower)*uk[k]), may I write the code as:
integrand<-function(y)
{
df <-Ck*cos((y - lower)*uk)
2 * sum(df) / (upper - lower)
}
integrate(Vectorize(integrand),upper=X,lower = lower)$value
THANKS!
Use
integrand <- function(y) {
mat <- tcrossprod(Ck * uk, cos(y - lower))
2 * colSums(mat) / (upper - lower)
}
Explanation:
If you read the documentation of function integrate, you see that f must be a vectorized function (i.e. you give it a vector argument and it returns a vector of the same length).
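A quick way to see the difference is to run both versions on made-up inputs. In the sketch below Ck and uk are random placeholders (the question does not give their values), and the names integrand_scalar and integrand_vec are chosen here for illustration. For this particular integrand the result can also be checked against a closed form, since the sum factors out of the cosine.

```r
set.seed(1)
N <- 50
Ck <- rnorm(N)          # placeholder coefficients (assumed values)
uk <- runif(N)          # placeholder coefficients (assumed values)
upper <- 10; lower <- -10; X <- 7

# Scalar version: correct for a single y, but sum() collapses a vector y to one number.
integrand_scalar <- function(y) 2 / (upper - lower) * sum(Ck * cos(y - lower) * uk)

# Vectorised version from the answer: one column per element of y.
integrand_vec <- function(y) {
  mat <- tcrossprod(Ck * uk, cos(y - lower))   # N x length(y) matrix
  2 * colSums(mat) / (upper - lower)
}

v_wrapped <- integrate(Vectorize(integrand_scalar), lower = lower, upper = X)$value
v_vector  <- integrate(integrand_vec, lower = lower, upper = X)$value

# Closed form: the integral of cos(y - lower) from lower to X is sin(X - lower).
v_exact <- 2 / (upper - lower) * sum(Ck * uk) * sin(X - lower)

c(v_wrapped, v_vector, v_exact)   # the three values should agree closely
```

Wrapping the scalar function in Vectorize also works, as the comparison shows; the matrix version just avoids calling the function once per point.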

fminsearch in R is worse than in Matlab

There is my data (x and y columns are relevant):
https://www.dropbox.com/s/b61a7enhoa0p57p/Simple1.csv
What I need is to fit the data with the polyline. Matlab code that does this is:
spline_fit.m:
function [score, params] = spline_fit (points, x, y)
min_f = min(x)-1;
max_f = max(x);
points = [min_f points max_f];
params = zeros(length(points)-1, 2);
score = 0;
for i = 1:length(points)-1
in = (x > points(i)) & (x <= points(i+1));
if sum(in) > 2
p = polyfit(x(in), y(in), 1);
pred = p(1)*x(in) + p(2);
score = score + norm(pred - y(in));
params(i, :) = p;
else
params(i, :) = nan;
end
end
test.m:
%Find the parameters
r = [100,250,400];
p = fminsearch('spline_fit', r, [], x, y)
[score, param] = spline_fit(p, x, y)
%Plot the result
y1 = zeros(size(x));
p1 = [-inf, p, inf];
for i = 1:size(param, 1)
in = (x > p1(i)) & (x <= p1(i+1));
y1(in) = x(in)*param(i,1) + param(i,2);
end
[x1, I] = sort(x);
y1 = y1(I);
plot(x,y,'x',x1,y1,'k','LineWidth', 2)
And this does work fine, producing following optimization: [102.9842, 191.0006, 421.9912]
I've implemented the same idea in R:
library(pracma);
spline_fit <- function(x, xx, yy) {
min_f = min(xx)-1;
max_f = max(xx);
points = c(min_f, x, max_f)
params = array(0, c(length(points)-1, 2));
score = 0;
for( i in 1:length(points)-1)
{
inn <- (xx > points[i]) & (xx <= points[i+1]);
if (sum(inn) > 2)
{
p <- polyfit(xx[inn], yy[inn], 1);
pred <- p[1]*xx[inn] + p[2];
score <- score + norm(as.matrix(pred - yy[inn]),"F");
params[i,] <- p;
}
else
params[i,] <- NA;
}
score
}
But I get very bad results:
> fminsearch(spline_fit,c(100,250,400), xx = Simple1$x, yy = Simple1$y)
$xval
[1] 100.1667 250.0000 400.0000
$fval
[1] 4452.761
$niter
[1] 2
As you can see, it stops after 2 iterations and doesn't produce good points.
I'll be very glad for any help in resolving this issue.
Also, if anyone knows how to implement this in C# using any free library, it will be even better. I know where to get polyfit, but not fminsearch.
The problem here is that the objective surface is very badly behaved (there are both multiple minima and discontinuous jumps), which will make the results you get with different optimizers almost arbitrary. I will admit that MATLAB's optimizers are remarkably robust, but I would say that it's pretty much a matter of chance (and where you start) whether an optimizer will get to the global minimum for this case, unless you use some form of stochastic global optimization such as simulated annealing.
I chose to use R's built-in optimizer, optim() (which uses Nelder-Mead by default), rather than fminsearch from the pracma package.
spline_fit <- function(x, xx = Simple1$x, yy=Simple1$y) {
min_f = min(xx)-1
max_f = max(xx)
points = c(min_f, x, max_f)
params = array(0, c(length(points)-1, 2))
score = 0
for( i in 1:(length(points)-1))
{
inn <- (xx > points[i]) & (xx <= points[i+1]);
if (sum(inn) > 2)
{
p <- polyfit(xx[inn], yy[inn], 1);
pred <- p[1]*xx[inn] + p[2];
score <- score + norm(as.matrix(pred - yy[inn]),"F");
params[i,] <- p;
}
else
params[i,] <- NA;
}
score
}
library(pracma) ## for polyfit
Simple1 <- read.csv("Simple1.csv")
opt1 <- optim(fn=spline_fit,c(100,250,400), xx = Simple1$x, yy = Simple1$y)
opt1$par
## [1] 102.4365 201.5835 422.2503
This is better than the fminsearch results, but still different from the MATLAB results, and worse than them:
## Matlab results:
matlab_fit <- c(102.9842, 191.0006, 421.9912)
spline_fit(matlab_fit, xx = Simple1$x, yy = Simple1$y)
## 3724.3
opt1$value
## 3755.5 (worse)
The bbmle package offers an experimental/not very well documented set of tools for exploring optimization surfaces:
library(bbmle)
ss <- slice2D(fun=spline_fit,opt1$par,nt=51)
library(lattice)
A 2D "slice" around the optim-estimated parameters. The circles show the optim fit (solid) and the minimum value within each slice (open).
png("splom1.png")
print(splom(ss))
dev.off()
A 'slice' between the matlab and optim fits shows that the surface is quite rugged:
ss2 <- bbmle:::slicetrans(matlab_fit,opt1$par,spline_fit)
png("slice1.png")
print(plot(ss2))
dev.off()
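Because the outcome depends so much on the starting point, one cheap check is to restart Nelder-Mead from several starting vectors and keep the best result. The sketch below does this on simulated data, since Simple1.csv is not reproduced here. The data, the list of starts and the name segment_score are assumptions made for illustration; lm.fit from base R replaces pracma's polyfit. This is a multi-start heuristic, not simulated annealing, and it does not guarantee the global minimum.

```r
set.seed(123)
# Simulated piecewise-linear data with noise (a stand-in for Simple1).
xx <- sort(runif(300, 0, 500))
yy <- approx(c(0, 100, 200, 420, 500), c(0, 50, 10, 80, 20), xout = xx)$y +
  rnorm(300, sd = 2)

# Same scoring idea as spline_fit: sum of residual norms of per-segment line fits.
segment_score <- function(breaks, xx, yy) {
  pts <- c(min(xx) - 1, breaks, max(xx))
  score <- 0
  for (i in seq_len(length(pts) - 1)) {
    inn <- xx > pts[i] & xx <= pts[i + 1]
    if (sum(inn) > 2) {
      res <- lm.fit(cbind(1, xx[inn]), yy[inn])$residuals   # straight-line fit
      score <- score + sqrt(sum(res^2))
    }
  }
  score
}

# Several hand-picked starting vectors (arbitrary choices).
starts <- list(c(100, 250, 400), c(50, 150, 300), c(120, 220, 450), c(80, 300, 350))
fits <- lapply(starts, function(s) optim(s, segment_score, xx = xx, yy = yy))
scores <- sapply(fits, function(f) f$value)

best <- fits[[which.min(scores)]]
scores       # one score per start; differences show how much the start matters
best$par     # breakpoints from the best run
```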
