Problems with Newton's Method for finding coefficient and Hessian - r

I am trying to write a function that uses Newton's method (new coefficients = coefficients + (inverse Hessian) %*% gradient) to iteratively find the coefficients of a log-linear model.
I am using the following code:
## reading in the data; the file contains yi and xi
dat <- read.csv('hw8.csv')
summary(dat)
attach(dat)
## creating the design matrix: a column of 1's and the xi column
x <- cbind(1, xi)
mle <- function(c) {
  gi <- 1 - yi * exp(c[1] + c[2] * xi)
  hi <- gi - 1
  H  <- -1 * (t(x) %*% hi %*% x)
  g  <- t(x) %*% gi
  c  <- c + solve(H) %*% g
  return(c)
}
optim(c(0, 1), mle, hessian = TRUE)
When I run the code, I get the following error:
Error in t(x) %*% hi %*% x : non-conformable arguments
RMate stopped at line 29
Given that the formula is drawn from Bill Greene's problem set, I don't think it is a formula problem. I think I am doing something wrong in passing my function.
How can I fix this?
Any help with this function would be much appreciated.

As Jonathan said in the comments, you need proper dimensions:
R> X <- matrix(1:4, ncol=2)
R> t(X) %*% X
     [,1] [,2]
[1,]    5   11
[2,]   11   25
R>
But you should also use the proper tools, so maybe look at the loglin function in the stats package and/or the loglm function in the MASS package. Both are installed by default with R.
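For what it's worth, here is a minimal sketch (my addition, not part of the original answer) of one way to make the products conformable while keeping the poster's gi and hi formulas exactly as given: hi is an n x 1 vector, so t(x) %*% hi %*% x cannot work, but the weighted cross-product sum of hi * x_i x_i' can be written with elementwise multiplication. Note also that mle() performs a single Newton update and returns a vector, which is not the scalar objective optim() expects, so the sketch simply iterates the update (convergence checking omitted):
x    <- cbind(1, xi)            # design matrix, as in the question
beta <- c(0, 1)                 # starting values from the optim() call
for (iter in 1:25) {            # a fixed number of Newton steps, for illustration
  eta  <- as.vector(x %*% beta)
  gi   <- 1 - yi * exp(eta)               # poster's gradient term, unchanged
  hi   <- gi - 1                          # poster's Hessian term, unchanged
  H    <- -t(x) %*% (hi * x)              # sum of hi * x_i x_i', now conformable
  g    <- t(x) %*% gi
  beta <- beta + as.vector(solve(H, g))   # Newton update
}
beta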

Related

Running multiple OLS regressions with matrix algebra in R

I am having trouble running multiple regressions in R. I have a matrix of returns which has to be regressed against a vector. To be clear, I have returns for 1794 assets, each of which needs to be regressed individually against the SPX. I then need to roll these regressions through time, so I am trying to do this with matrix algebra to get the best speed possible.
I have code like the following:
for (i in 1:(nrow(fundReturns) - 59)) {
  tempMatrix <- fundReturns[i:(i + 59), ]
  tempSPX    <- spxReturns[i:(i + 59)]
  theSubset  <- sapply(tempMatrix, function(x) sum(is.na(x))) == 0
  tempMatrix <- tempMatrix[, theSubset]
  theBeta    <- solve(crossprod(tempMatrix), crossprod(tempMatrix, tempSPX))
  theBetas[index(fundReturns)[(i + 59)], colnames(theBeta)] <- theBeta
}
This breaks on the first iteration, at the 'theBeta <- ...' line. I am getting the following error:
Error in solve.default(t(tempMatrix) %*% tempMatrix) :
Lapack routine dgesv: system is exactly singular: U[273,273] = 0
I assume this means I have some singular matrix somewhere, which does not make sense to me. (If someone could explain what this error means exactly, that would be worth bonus points.) I do not see where the error could be coming from, and to boot, I attempted to check with the following code:
for (i in 1:1794) {
  tempMatrix2 <- tempMatrix[, i]
  theBeta <- solve(t(tempMatrix2) %*% tempMatrix2) %*% (t(tempMatrix2) %*% tempSPX)
}
This code runs fine, and produces a beta for each individual column. I assume there is something I am missing in my code, or have some calculation backwards, but I cannot find it. Please SO gods, help me.
If it helps, here is the following:
> dim(tempSPX)
[1] 60 1
> dim(tempMatrix)
[1] 60 1794
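As an aside (my addition, not from the original thread): the "exactly singular" error is structural rather than a data problem. crossprod(tempMatrix) is a 1794 x 1794 matrix built from only 60 rows, so its rank is at most 60 and it cannot be inverted; the per-column loop above works because each single-column regression only needs a 1 x 1 cross-product. A tiny sketch of the same effect:
set.seed(1)
X <- matrix(rnorm(5 * 8), nrow = 5)   # more columns (8) than rows (5)
qr(crossprod(X))$rank                 # 5: the 8 x 8 cross-product is rank-deficient
# solve(crossprod(X))                 # fails with the same "system is exactly singular" error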

Calculating the Gradient and the Hessian in R

As you know, the gradient of a function f(x, y) is the vector of its first partial derivatives, (∂f/∂x, ∂f/∂y), and the Hessian is the matrix of its second partial derivatives, with entries ∂²f/∂x², ∂²f/∂x∂y, ∂²f/∂y∂x and ∂²f/∂y².
Now, I wonder, is there any way to calculate these in R for a user defined function at a given point?
First, I've found a package named numDeriv, which seems to have the necessary functions grad and hessian but now I can't get the correct results... Thus, here's my workflow:
Let's say that we are given the function f(x,y) = x^2 * y^3, and we need to calculate the gradient and the Hessian at the point (x=1, y=2).
That being said, I define this function in R:
dummy <- function(x, y) {
  rez <- (x^2) * (y^3)
  rez
}
and then use grad the following way:
grad(func=dummy, x=1, y=2)
which gives me the result 16 -- but the problem is that this is only the first value of the gradient vector, the correct version of which is
[16, 12]
Same goes with the hessian:
hessian(func=dummy, x=1, y=2)
which gives me a 1x1 matrix with the value 16, instead of the 2x2 matrix
     [,1] [,2]
[1,]   16   24
[2,]   24   12
So, the question is what am I doing wrong?
Thank you.
You can use the pracma library, for example:
library(pracma)

dummy <- function(x) {
  z <- x[1]; y <- x[2]
  rez <- (z^2) * (y^3)
  rez
}

grad(dummy, c(1, 2))
[1] 16 12

hessian(dummy, c(1, 2))
     [,1] [,2]
[1,]   16   24
[2,]   24   12
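For what it's worth (my addition, not part of the original answer), the same reshaping makes the poster's original numDeriv calls work too, since numDeriv::grad() and numDeriv::hessian() likewise expect a function of a single vector argument:
library(numDeriv)
numDeriv::grad(dummy, c(1, 2))      # 16 12
numDeriv::hessian(dummy, c(1, 2))   # the 2 x 2 matrix shown above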
The following code extends the answer above. It handles the case where you have sampled values of the function rather than the function itself. Here the function has one parameter and the gradient is calculated at a single point; if you have three parameters, you need to provide them to x0 as c(x1, x2, x3).
# i is an index; s_carvone$val contains the sampled values of the function
dummy <- function(i) {
  s_carvone$val[i]
}

# function that calculates the gradient at a specific point i
calc_grad <- function(i) {
  pracma::grad(dummy, x0 = i, heps = 1)
}

# calculates the derivative from point 2 to 61
first_derivative <- unlist(purrr::map(2:61, calc_grad))
plot(first_derivative)
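s_carvone is the answerer's own data set; to try the snippet without it, any numeric vector can stand in (hypothetical example, my addition):
s_carvone <- data.frame(val = sin(seq(0, 2 * pi, length.out = 100)))
first_derivative <- unlist(purrr::map(2:61, calc_grad))
plot(first_derivative)   # traces a cosine shape, the discrete derivative of sin()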

R minimize absolute error

Here's my setup
obs1<-c(1,1,1)
obs2<-c(0,1,2)
obs3<-c(0,0,3)
absoluteError <- function(obs, x) {
  sum(abs(obs - x))
}
Example:
> absoluteError(obs2,1)
[1] 2
For a random vector of observations, I'd like to find a minimizer, x, which minimizes the absolute error between the observation values and a vector of all x. For instance, clearly the argument that minimizes absoluteError(obs1,x) is x=1 because this results in an error of 0. How do I find a minimizer for a random vector of observations? I'd imagine this is a linear programming problem, but I've never implemented one in R before.
The median of obs is a minimizer for the absolute error. The following is a sketch of how one might try proving this:
Let the median of a set of n observations, obs, be m. Call the absolute error between obs and m f(obs,m).
Case n is odd:
Consider f(obs, m+delta) where delta is some non-zero number. Suppose delta is positive; then there are (n-1)/2 + 1 observations whose individual error grows by delta, while each of the remaining (n-1)/2 observations' error shrinks by at most delta. So f(obs, m+delta) - f(obs, m) >= delta > 0. (The same argument works if delta is negative.) Thus f(obs, m+delta) > f(obs, m) for any non-zero delta, so m is the unique minimizer in this case.
Case n is even:
Basically the same logic as above applies, except in this case any number between the two innermost numbers in the set is a minimizer.
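As a quick numerical sanity check (my addition, reusing the absoluteError() function and observation vectors from the question):
sapply(list(obs1, obs2, obs3), median)            # 1 1 0
optimize(function(x) absoluteError(obs2, x),
         interval = c(-5, 5))$minimum             # about 1, matching median(obs2)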
I am not sure this answer is correct, and even if it is I am not sure this is what you want. Nevertheless, I am taking a stab at it.
I think you are talking about 'Least absolute deviations', a form of regression that differs from 'Least Squares'.
If so, I found this R code for solving Least absolute deviations regression:
fabs <- function(beta0, x, y) {
  b0 <- beta0[1]
  b1 <- beta0[2]
  n  <- length(x)
  llh <- 0
  for (i in 1:n) {
    r2  <- y[i] - b0 - b1 * x[i]   # residual for observation i
    llh <- llh + abs(r2)           # accumulate the absolute residuals
  }
  llh
}
g <- optim(c(1, 1), fabs, x = x, y = y)
I found the code here:
http://www.stat.colostate.edu/~meyer/hw12ans.pdf
Assuming you are talking about Least absolute deviations, you might not be interested in the above code if you want a solution in R from scratch rather than a solution that uses optim.
The above code is for a regression line with an intercept and one slope. I modified the code as follows to handle a regression with just an intercept:
y <- c(1,1,1)
x <- 1:length(y)

fabs <- function(beta0, x, y) {
  b0 <- beta0[1]
  b1 <- 0                        # slope fixed at zero: intercept-only model
  n  <- length(x)
  llh <- 0
  for (i in 1:n) {
    r2  <- y[i] - b0 - b1 * x[i]
    llh <- llh + abs(r2)
  }
  llh
}

# The commands to get the estimator
g <- optim(c(1), fabs, x = x, y = y, method = 'Brent',
           lower = (min(y) - 5), upper = (max(y) + 5))
g
I was not familiar with (i.e., had not heard of) Least absolute deviations until tonight. So, hopefully my modifications are fairly reasonable.
With y <- c(1,1,1) the parameter estimate is 1 (which I think you said is the correct answer):
$par
[1] 1
$value
[1] 1.332268e-15
$counts
function gradient
NA NA
$convergence
[1] 0
$message
NULL
With y <- c(0,1,2) the parameter estimate is 1:
$par
[1] 1
$value
[1] 2
$counts
function gradient
NA NA
$convergence
[1] 0
$message
NULL
With y <- c(0,0,3) the parameter estimate is 0 (which you said is the correct answer):
$par
[1] 8.613159e-10
$value
[1] 3
$counts
function gradient
NA NA
$convergence
[1] 0
$message
NULL
If you want R code from scratch, there is additional R code in the file at the link above which might be helpful.
Alternatively, perhaps it might be possible to extract the relevant code from the source file.
Alternatively, perhaps someone else can provide the desired code (and correct any errors on my part) in the next 24 hours.
If you come up with code from scratch please post it as an answer as I would love to see it myself.
lad <- function(x, y) {
  # objective: sum of absolute deviations from the line beta[1] + beta[2] * x
  SAD <- function(beta, x, y) {
    sum(abs(y - (beta[1] + beta[2] * x)))
  }
  d <- lm(y ~ x)                  # OLS fit supplies the starting values
  ans1 <- optim(par = c(d$coefficients[1], d$coefficients[2]),
                method = "Nelder-Mead", fn = SAD, x = x, y = y)
  coe    <- setNames(ans1$par, c("(Intercept)", deparse(substitute(x))))
  fitted <- setNames(ans1$par[1] + ans1$par[2] * x, seq_along(x))
  res    <- setNames(y - fitted, seq_along(x))
  results <- list(coefficients = coe, fitted.values = fitted, residuals = res)
  class(results) <- "lad"
  results
}
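A hypothetical usage example (my addition, not from the original answer), reusing one of the observation vectors from the question as y and an index vector as x:
y <- c(0, 0, 3)
x <- seq_along(y)
fit <- lad(x, y)
fit$coefficients    # LAD intercept and slope
fit$fitted.values   # fitted values of the least-absolute-deviations line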

A function for calculating the eigenvalues of a matrix in R

I want to write a function like eigen() to calculate the eigenvalues and eigenvectors of an arbitrary matrix. I wrote the following code for calculating the eigenvalues, and I need a function or method to solve the resulting equation.
eig <- function(x) {
  if (nrow(x) != ncol(x)) stop("dimension error")
  ff <- function(lambda) {
    for (i in 1:nrow(x)) x[i, i] <- x[i, i] - lambda
    det(x)    # det(x - lambda * I); the eigenvalues are the roots of ff(lambda) = 0
  }
  ff
}
I need to solve det(x - lambda*I) = 0, which is a polynomial equation in lambda, to find the eigenvalues. Is there any way to do this?
Here is one solution using uniroot.all:
library(rootSolve)

myeig <- function(mat) {
  myeig1 <- function(lambda) {
    y <- mat
    diag(y) <- diag(mat) - lambda
    det(y)
  }
  myeig2 <- function(lambda) {
    sapply(lambda, myeig1)
  }
  uniroot.all(myeig2, c(-10, 10))
}
R > x <- matrix(rnorm(9), 3)
R > eigen(x)$values
[1] -1.77461906 -1.21589769 -0.01010515
R > myeig(x)
[1] -1.77462211 -1.21589767 -0.01009019
Computing the determinant is a bad idea because it is not numerically stable; you can easily get Inf even for a moderately large matrix. I suggest reading the following answers first (otherwise you will have no idea what my code is doing):
Are eigenvectors returned by R function eigen() wrong?
eigenvectors when A-lx is singular with no solution
Then use either of the following:
NullSpace(A - diag(lambda, nrow(A)))
nullspace(A - diag(lambda, nrow(A)))
The solution from #liuminzhao won't work if there are repeated eigenvalues. The function will fail to find those roots because the characteristic polynomial of the matrix does not change sign there (it touches zero without crossing it), and a sign change is what rootSolve::uniroot.all() looks for; you would need another way to find such a local minimum (like optim()). Moreover, it cannot determine the multiplicity of a repeated eigenvalue.
A better way is to find the characteristic polynomial, which is easily done with pracma::charpoly(), and then use polyroot().
par   <- pracma::charpoly(M)   # coefficients of the characteristic polynomial of M
par   <- par[length(par):1]    # reverse to ascending order for polyroot()
roots <- Re(polyroot(par))     # keep the real part of each root
pracma::charpoly() is not too complicated in itself; see its source code, starting at the line a1 <- a.
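Putting those three lines into a small helper and checking it against eigen(), including a repeated eigenvalue (a sketch under the same pracma/polyroot assumptions, my addition):
library(pracma)
eig_poly <- function(M) {
  p <- charpoly(M)          # coefficients, highest power first
  Re(polyroot(rev(p)))      # polyroot() wants ascending order
}
M <- matrix(c(2, 0, 1, 2), nrow = 2)   # eigenvalue 2 with multiplicity 2
eig_poly(M)                            # 2 2
eigen(M)$values                        # 2 2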

Simple Markov Chain in R (visualization)

I'd like to do a simple first-order Markov chain in R. I know there are packages like MCMC, but I couldn't find one to display it graphically. Is this even possible? It would be nice if, given a transition matrix and an initial state, one could visually see the path through the Markov chain (maybe I have to do this by hand...).
Thanks.
This shows how to apply a random transition matrix to a particular starting vector: c(1,0,0,0):
set.seed(123)
tmat <- matrix(rnorm(16)^2,ncol=4)
# need entries to be positive, could have used abs()
tmat <- tmat/rowSums(tmat) # need the rows to sum to 1
tmat
            [,1]       [,2]       [,3]        [,4]
[1,] 0.326123580 0.01735335 0.48977444 0.166748625
[2,] 0.016529424 0.91768404 0.06196453 0.003822008
[3,] 0.546050789 0.04774713 0.33676288 0.069439199
[4,] 0.001008839 0.32476060 0.02627217 0.647958394
require(expm) # for the %^% function
matplot( t( # need to transpose to get arguments to matplot correctly
sapply(1:20, function(x) matrix(c(1,0,0,0), ncol=4) %*% (tmat %^% x) ) ) )
You can see it approaching equilibrium in the resulting plot.
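One way (my addition, not part of the original answer) to compute that equilibrium directly is via the left eigenvector of tmat for eigenvalue 1:
pi_vec <- Re(eigen(t(tmat))$vectors[, 1])   # for a stochastic matrix the eigenvalue-1 vector comes first
pi_vec <- pi_vec / sum(pi_vec)              # normalise to a probability vector
pi_vec                                      # the distribution the matplot curves approach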
The package coda (http://cran.r-project.org/web/packages/coda/index.html) has tools for analyzing MCMC results, including some plotting functionality.
Perhaps this query on Biostar can help you: Visualizing HMM files of HMMER3. It points to two external applications, LogoMat-M and HMMeditor, for visualizing profile Hidden Markov Models (pHMMs).
You can use the markovchain R package, which models discrete-time Markov chains and contains a plotting facility based on the igraph package.
library(markovchain)   # load the package
myMatr <- matrix(c(0, .2, .8, .1, .8, .1, .3, 0, .7), byrow = TRUE, nrow = 3)   # transition matrix
rownames(myMatr) <- colnames(myMatr) <- c("a", "b", "c")
myMc <- as(myMatr, "markovchain")
plot(myMc)
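To actually trace a path through the chain (my addition; this assumes rmarkovchain() and states() from the same markovchain package), one could simulate a sequence of states and plot it as a step function:
set.seed(1)
path <- rmarkovchain(n = 20, object = myMc, t0 = "a")   # 20 simulated steps starting from "a"
plot(match(path, states(myMc)), type = "s",
     xlab = "step", ylab = "state", yaxt = "n")
axis(2, at = seq_along(states(myMc)), labels = states(myMc))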
