Newton-Raphson for logistic regression in R

I wrote code for Newton-Raphson for logistic regression. Unfortunately, it fails to converge on every data set I have tried; there is a mistake somewhere, but I cannot find it. Can anyone help me figure out what the problem is?
First, the data are as follows: y is the response (0/1), and Z is a 115 x 30 matrix of explanatory variables. I need to estimate the 30 parameters.
y = c(rep(0,60),rep(1,55))
X = sample(c(0,1),size=3450,replace=T)
Z = t(matrix(X,ncol=115))
The code is:
B = matrix(rep(0, 30*10), ncol = 10)
B[,1] = matrix(rep(0, 30), ncol = 1)
for (i in 2:10) {
  print(i)
  p <- exp(Z %*% as.matrix(B[,i])) / (1 + exp(Z %*% as.matrix(B[,i])))
  v.2 <- diag(as.vector(1 * p * (1 - p)))
  score.2 <- t(Z) %*% (y - p)  # score function
  increm <- solve(t(Z) %*% v.2 %*% Z)
  B[,i] = as.matrix(B[,i-1]) + increm %*% score.2
  if (B[,i] - B[i-1] == matrix(rep(0.0001, 30), ncol = 1)) {
    return(B)
  }
}

Found it! You're updating p based on B[,i]; you should be using B[,i-1] ...
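That is, inside the loop the probabilities should come from the previous iterate, e.g.:
p <- exp(Z %*% B[,i-1]) / (1 + exp(Z %*% B[,i-1]))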
While I was finding the answer, I cleaned up your code and incorporated the result in a function. R's built-in glm seems to work (see below). One note: this approach is likely to be unstable. Fitting a binary model with 30 predictors and only 115 binary responses, without any penalization or shrinkage, is extremely optimistic ...
set.seed(101)
n.obs <- 115
n.zero <- 60
n.pred <- 30
y <- c(rep(0,n.zero),rep(1,n.obs-n.zero))
X <- sample(c(0,1),size=n.pred*n.obs,replace=TRUE)
Z <- t(matrix(X,ncol=n.obs))
R's built-in glm fitter does work (it uses iteratively reweighted least squares, not N-R):
g1 <- glm(y~.-1,data.frame(y,Z),family="binomial")
(If you want to view the results, library("arm"); coefplot(g1).)
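As one hedge against the instability noted above, a suggestion of mine (not part of the original approach): arm's bayesglm keeps the glm interface but adds mild shrinkage through a weakly informative prior:
library("arm")
g2 <- bayesglm(y ~ . - 1, data = data.frame(y, Z), family = "binomial")  # shrunken fit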
## B_{m+1} = B_m + (X^T V_m X)^{-1} X^T (Y-P_m)
NRfit function:
NRfit <- function(y, X, start, n.iter = 100, tol = 1e-4, verbose = TRUE) {
  ## used X rather than Z just because it's more standard notation
  n.pred <- ncol(X)
  B <- matrix(NA, ncol = n.iter, nrow = n.pred)
  B[,1] <- start
  for (i in 2:n.iter) {
    if (verbose) cat(i, "\n")
    p <- plogis(X %*% B[,i-1])
    v.2 <- diag(c(p * (1 - p)))
    score.2 <- t(X) %*% (y - p)  # score function
    increm <- solve(t(X) %*% v.2 %*% X)
    B[,i] <- B[,i-1] + increm %*% score.2
    if (all(abs(B[,i] - B[,i-1]) < tol)) return(B)
  }
  B
}
matplot(res1 <- t(NRfit(y,Z,start=coef(g1))))
matplot(res2 <- t(NRfit(y,Z,start=rep(0,ncol(Z)))))
all.equal(res2[6,],unname(coef(g1))) ## TRUE

Related

What is going on inside the varimax function in R?

I have been trying to figure out the core part of the varimax function in R. I found a wiki link that writes out the algorithm. But why is B <- t(x) %*% (z^3 - z %*% diag(drop(rep(1, p) %*% z^2))/p) computed? I am also not sure why the SVD of the matrix B is computed. The iteration step is presumably maximizing the variance, and the singular values would then really be variances of principal components, but I am unsure about that too. I am pasting the whole code of varimax for convenience, but the relevant part, and therefore my question about what is actually happening under the hood, is the for loop.
function (x, normalize = TRUE, eps = 1e-05)
{
    nc <- ncol(x)
    if (nc < 2)
        return(x)
    if (normalize) {
        sc <- sqrt(drop(apply(x, 1L, function(x) sum(x^2))))
        x <- x/sc
    }
    p <- nrow(x)
    TT <- diag(nc)
    d <- 0
    for (i in 1L:1000L) {
        z <- x %*% TT
        B <- t(x) %*% (z^3 - z %*% diag(drop(rep(1, p) %*% z^2))/p)
        sB <- La.svd(B)
        TT <- sB$u %*% sB$vt
        dpast <- d
        d <- sum(sB$d)
        if (d < dpast * (1 + eps))
            break
    }
    z <- x %*% TT
    if (normalize)
        z <- z * sc
    dimnames(z) <- dimnames(x)
    class(z) <- "loadings"
    list(loadings = z, rotmat = TT)
}
Edit: The algorithm is described in the book "Factor Analysis of Data Matrices" by Paul Horst (published by Holt, Rinehart and Winston), and the original sources can be found therein. This book is also cited in R's documentation for the varimax function.
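As a sketch of what the loop is doing (my reading of the standard algorithm, not taken from the cited book): varimax searches for an orthogonal rotation TT that maximizes the sum, over columns of the rotated loadings z = x %*% TT, of the variance of the squared loadings,
V(TT) = sum_j [ (1/p) sum_i z[i,j]^4 - ((1/p) sum_i z[i,j]^2)^2 ].
The matrix B is proportional to the gradient of this criterion with respect to TT, and replacing TT by sB$u %*% sB$vt projects that gradient back onto the orthogonal matrices (the SVD step is the orthogonal Procrustes solution, i.e. the polar factor of B); sum(sB$d) is then used to monitor convergence of the criterion.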

Why do I get the error "argument is of length 0" for dffits?

I have a problem when I try to run the dffits() function on an object from my own logistic regression function.
When I run dffits(log) I get the error message:
Error in if (model$rank == 0) { : argument is of length 0
However, when I use the built-in glm function (family = binomial), dffits(glm) works just fine.
Here is my function for the logistic regression and a short example of my problem:
mydata <- read.csv("https://stats.idre.ucla.edu/stat/data/binary.csv")
mydata$rank <- factor(mydata$rank)
mydata$admit <- factor(mydata$admit)
logRegEst <- function(x, y, threshold = 1e-10, maxIter = 100)
{
  calcPi <- function(x, beta)
  {
    beta <- as.vector(beta)
    return(exp(x %*% beta) / (1 + exp(x %*% beta)))
  }
  beta <- rep(0, ncol(x))  # initial guess for beta
  diff <- 1000
  # initial value bigger than threshold so that we can enter our while loop
  iterCount = 0
  # counter to ensure we're not stuck in an infinite loop
  while (diff > threshold)  # tests for convergence
  {
    pi <- as.vector(calcPi(x, beta))
    # calculate pi by using the current estimate of beta
    W <- diag(pi * (1 - pi))  # calculate matrix of weights W
    beta_change <- solve(t(x) %*% W %*% x) %*% t(x) %*% (y - pi)
    # calculate the change in beta
    beta <- beta + beta_change  # new beta
    diff <- sum(beta_change^2)
    # calculate how much we changed beta by in this iteration;
    # if this is less than threshold, we'll break the while loop
    iterCount <- iterCount + 1
    # stop if we have hit the maximum number of iterations
    if (iterCount > maxIter) {
      stop("This isn't converging.")
    }
  }
  df <- length(y) - ncol(x)
  # degrees of freedom: the length of y minus the number of x columns
  vcov <- solve(t(x) %*% W %*% x)
  list(coefficients = beta, vcov = vcov, df = df)
  # returning results
}
logReg <- function(formula, data)
{
  mf <- model.frame(formula = formula, data = data)
  # model.frame() returns a data.frame with the variables needed for the formula
  x <- model.matrix(attr(mf, "terms"), data = mf)
  # model.matrix() creates a design matrix; e.g. a factor is expanded
  # into dummy variables of ones and zeros
  y <- as.numeric(model.response(mf)) - 1
  # model.response gives us the response variable
  est <- logRegEst(x, y)
  # now we have the starting position to apply our function from above
  est$formula <- formula
  est$call <- match.call()
  est$data <- data
  # we add the formula, the call, and the data to the list
  est$x <- x
  est$y <- y
  # we add x and y to the list
  class(est) <- "logReg"
  # defining the class
  est
}
log <- logReg(admit ~ gre + gpa, data= mydata)
glm <- glm(admit ~ gre + gpa, data= mydata, family = binomial)
dffits(glm)
dffits(log)
log$data
glm$data
I don't understand why mydata$rank == 0 should be a problem, because when I look at log$data I see that rank is defined just as it is in glm$data.
I really appreciate your help!
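A note on the error message itself: the check that fails is if (model$rank == 0) inside R's influence machinery, and the list returned by logRegEst contains no rank element, so model$rank is NULL, i.e. of length 0 (it is unrelated to the rank column of mydata). A minimal sketch of a fix for that specific error, which does not guarantee that every other field dffits needs is present:
## inside logRegEst, when assembling the result:
list(coefficients = beta, vcov = vcov, df = df,
     rank = ncol(x))  # assumes the design matrix has full column rank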

How to calculate standardized Pearson residuals by hand in R?

I am trying to calculate the standardized Pearson Residuals by hand in R. However, I am struggling when it comes to calculating the hat matrix.
I have built my own logistic regression and I am trying to calculate the standardized Pearson residuals in the logReg function.
logRegEst <- function(x, y, threshold = 1e-10, maxIter = 100)
{
  calcPi <- function(x, beta)
  {
    beta <- as.vector(beta)
    return(exp(x %*% beta) / (1 + exp(x %*% beta)))
  }
  beta <- rep(0, ncol(x))  # initial guess for beta
  diff <- 1000
  # initial value bigger than threshold so that we can enter our while loop
  iterCount = 0
  # counter for the iterations to ensure we're not stuck in an infinite loop
  while (diff > threshold)  # tests for convergence
  {
    pi <- as.vector(calcPi(x, beta))
    # calculate pi by using the current estimate of beta
    W <- diag(pi * (1 - pi))
    # calculate the matrix of weights W as defined in the Fisher scoring algorithm
    beta_change <- solve(t(x) %*% W %*% x) %*% t(x) %*% (y - pi)
    # calculate the change in beta
    beta <- beta + beta_change  # new beta
    diff <- sum(beta_change^2)
    # calculate how much we changed beta by in this iteration;
    # if this is less than threshold, we'll break the while loop
    iterCount <- iterCount + 1
    # stop if we have hit the maximum number of iterations
    if (iterCount > maxIter) {
      stop("This isn't converging.")
    }
  }
  n <- length(y)
  df <- length(y) - ncol(x)
  # degrees of freedom: the length of y minus the number of x columns
  vcov <- solve(t(x) %*% W %*% x)
  logLik <- sum(y * log(pi / (1 - pi)) + log(1 - pi))
  deviance <- -2 * logLik
  AIC <- -2 * logLik + 2 * ncol(x)
  rank <- ncol(x)
  list(coefficients = beta, vcov = vcov, df = df, deviance = deviance,
       AIC = AIC, iter = iterCount - 1, x = x, y = y, n = n, rank = rank)
  # returning results
}
logReg <- function(formula, data)
{
  if (sum(is.na(data)) > 0) {
    print("missing values in data")
  } else {
    mf <- model.frame(formula = formula, data = data)
    # model.frame() returns a data.frame with the variables needed for the formula
    x <- model.matrix(attr(mf, "terms"), data = mf)
    # model.matrix() creates a design matrix; e.g. a factor is expanded
    # into dummy variables of ones and zeros
    y <- as.numeric(model.response(mf)) - 1
    # model.response gives us the response variable
    est <- logRegEst(x, y)
    # now we have the starting position to apply our function from above
    est$formula <- formula
    est$call <- match.call()
    # we add the formula and the call to the list
    nullModel <- logRegEst(x = as.matrix(rep(1, length(y))), y)
    est$nullDeviance <- nullModel$deviance
    est$nullDf <- nullModel$df
    mu <- exp(as.vector(est$x %*% est$coefficients)) /
      (1 + exp(as.vector(est$x %*% est$coefficients)))
    # computing the fitted values
    est$residuals <- (est$y - mu) / sqrt(mu * (1 - mu))
    est$mu <- mu
    est$x <- x
    est$y <- y
    est$data <- data
    hat <- (t(mu))^(1/2) %*% x %*% (t(x) %*% mu %*% x)^(-1) %*% t(x) %*% mu^(1/2)
    est$stdresiduals <- est$residuals / (sqrt(1 - hat))
    class(est) <- "logReg"
    # defining the class
    est
  }
}
I am struggling when it comes to calculating H = Vhat^(1/2) X (X^T Vhat X)^(-1) X^T Vhat^(1/2). This is called hat in my code.
If I try to calculate the hat matrix (hat), I get an error saying that I cannot multiply the vector mu and the matrix x in t(x) %*% mu %*% x.
I can see that the dimensions of the objects do not conform, and therefore I can't multiply them.
Can anyone see where my mistake is? Help is much appreciated. Thanks!
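A sketch of how the hat line could be written to match the formula above (my reading: Vhat is the diagonal matrix diag(mu * (1 - mu)) rather than the raw vector mu, the inverse needs solve() rather than the elementwise ^(-1), and the standardization uses the leverages on the diagonal of H):
V <- diag(mu * (1 - mu))            # Vhat: diagonal variance matrix of the fits
Vhalf <- diag(sqrt(mu * (1 - mu)))  # Vhat^(1/2)
H <- Vhalf %*% x %*% solve(t(x) %*% V %*% x) %*% t(x) %*% Vhalf
est$stdresiduals <- est$residuals / sqrt(1 - diag(H))  # leverages h_ii, not the full matrix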

Use the markovchain package to compare two empirically estimated Markov chains

I need to compare two transition probability matrices to assess the degree of proximity of the chains, so I would use the resulting p-value of the test.
I tried to use the markovchain R package, more specifically the divergenceTest function. The problem is that the function is not properly implemented. It is based on the test in the book "Statistical Inference Based on Divergence Measures", page 139. I contacted the package developers, but they have not corrected it yet, so I tried to implement it myself; I'm having trouble, though. Could anyone help me find the error?
Parameters: freq_matrix is a frequency matrix used to estimate the probability matrix; hypothetic is the matrix to compare against the estimated one.
divergenceTest3 <- function(freq_matrix, hypothetic) {
  n <- sum(freq_matrix)
  empirical <- freq_matrix
  for (i in 1:length(hypothetic)) {
    empirical[i, ] <- freq_matrix[i, ] / rowSums(freq_matrix)[i]
  }
  M <- nrow(empirical)
  v <- numeric()
  out <- 2 * n / .phi2(1)
  sum <- 0
  c <- 0
  for (i in 1:M) {
    sum2 <- 0
    sum3 <- 0
    for (j in 1:M) {
      if (hypothetic[i, j] > 0) {
        c <- c + 1
      }
      sum2 <- sum2 + hypothetic[i, j] * .phi(empirical[i, j] / hypothetic[i, j])
    }
    v[i] <- rowSums(freq_matrix)[i]
    sum <- sum + ((v[i] / n) * sum2)
  }
  TStat <- out * sum
  pvalue <- 1 - pchisq(TStat, c - M)
  cat("The Divergence test statistic is: ", TStat, " the Chi-Square d.f. are: ",
      c - M, " the p-value is: ", pvalue, "\n")
  out <- list(statistic = TStat, p.value = pvalue)
  return(out)
}
# phi function for the divergence test
.phi <- function(x) {
  out <- x * log(x) - x + 1
  return(out)
}
# another phi function for the divergence test
.phi2 <- function(x) {
  out <- 1 / x
  return(out)
}
The divergence test has been replaced by the verifyHomogeneity function. It requires an input list of elements that can be coerced to a raw transition matrix (as by createSequenceMatrix). It then tests whether they belong to the same unknown DTMC.
See the example below:
myMatr1 <- matrix(c(0.2, 0.8, 0.5, 0.5), byrow = TRUE, nrow = 2)
myMatr2 <- matrix(c(0.5, 0.5, 0.4, 0.6), byrow = TRUE, nrow = 2)
mc1 <- as(myMatr1, "markovchain")
mc2 <- as(myMatr2, "markovchain")
mc1
mc2
sample1 <- rmarkovchain(n = 100, object = mc1)
sample2 <- rmarkovchain(n = 200, object = mc2)
# should reject
verifyHomogeneity(inputList = list(sample1, sample2))
# should accept
sample2 <- rmarkovchain(n = 200, object = mc1)
verifyHomogeneity(inputList = list(sample1, sample2))
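As an aside, one concrete bug in the posted divergenceTest3: length(hypothetic) on an M x M matrix is M^2, not M, so the row-normalization loop indexes rows past the end of empirical. A sketch of a vectorized replacement (prop.table is base R):
empirical <- prop.table(freq_matrix, margin = 1)  # divide each row by its row sum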

Least square optimization (of matrices) in R

Yesterday I asked a question about least squares optimization in R, and it turned out that the lm function was what I was looking for.
Now I have another least squares optimization question, and I am wondering whether lm can also solve this problem or, if not, how it can be handled in R.
I have fixed matrices B (of dimension n x m) and V (of dimension n x n), and I am looking for an m-long vector u such that
sum( ( V - ( B %*% diag(u) %*% t(B)) )^2 )
is minimized.
1) lm.fit: Use the fact that
vec(A X A') = (A ⊗ A) vec(X)
(writing A for your B). Since X = diag(u) is nonzero only on its diagonal, we keep just the columns of A ⊗ A that correspond to the diagonal positions of vec(X), selected by c(diag(k)) == 1, which reduces the problem to an ordinary least-squares fit:
k <- ncol(A)
AA1 <- kronecker(A, A)[, c(diag(k)) == 1]
lm.fit(AA1, c(V))
Here is a self-contained example:
# test data
set.seed(123)
A <- as.matrix(BOD)
u <- 1:2
V <- A %*% diag(u) %*% t(A) + rnorm(36)
# solve
k <- ncol(A)
AA1 <- kronecker(A, A)[, c(diag(k)) == 1]
fm1 <- lm.fit(AA1, c(V))
giving roughly the original coefficients 1:2 :
> coef(fm1)
x1 x2
1.011206 1.999575
2) nls: We can alternatively use nls like this:
fm2 <- nls(c(V) ~ c(A %*% diag(x) %*% t(A)), start = list(x = numeric(k)))
giving the following for the above example:
> fm2
Nonlinear regression model
model: c(V) ~ c(A %*% diag(x) %*% t(A))
data: parent.frame()
x1 x2
1.011 2.000
residual sum-of-squares: 30.52
Number of iterations to convergence: 1
Achieved convergence tolerance: 1.741e-09
Update: Corrections and second solution.
