I'm trying to implement gradient descent manually and am running into the following error:
Error in beta - ((alpha/6132) * sum(yhat - y) * x) :
non-conformable arrays
My code is
while(itr < 20){
beta <- beta - ((alpha/6132)*sum(yhat-y)*x)
t <- t(beta)
yhat <- t(t%*%x)
itr <- itr+1
}
Here y and yhat are both 6132 x 1 matrices, beta holds 12 arbitrary starting values to be optimized iteratively, and x is a matrix containing 12 predictor variables.
Here is the code detailing this:
beta <- as.matrix(1:12)
t <- t(beta)
x <- as.matrix(t(bike))
x <- rbind(1, x)
yhat <- t(t%*%x)
y <- t(y)
alpha <- 0.01
x is a matrix with 12 columns and 6132 rows.
Any help or advice is appreciated.
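One way to read the error: `sum(yhat - y)` is a single number, so `sum(yhat - y) * x` has the same shape as `x` and cannot be subtracted from the 12 x 1 `beta`. Below is a minimal sketch of a conformable update, assuming a least-squares loss and an n x p design matrix `X` with observations in rows (the transpose of the layout used in the question). The simulated data, the names `X`, `n`, `p`, `grad`, and the step size are illustrative assumptions, not the original `bike` data.

```r
# Simulated stand-in for the real data (assumption): n observations, p columns
set.seed(1)
n <- 200
p <- 3
X <- cbind(1, matrix(rnorm(n * (p - 1)), n, p - 1))  # n x p, first column is the intercept
y <- X %*% c(2, -1, 0.5) + rnorm(n, sd = 0.1)        # n x 1 response

beta  <- matrix(0, p, 1)   # p x 1 starting values
alpha <- 0.1               # step size (assumed; may need tuning on real data)

for (itr in 1:500) {
  yhat <- X %*% beta                 # n x 1 fitted values
  grad <- t(X) %*% (yhat - y) / n    # p x 1 gradient, same shape as beta
  beta <- beta - alpha * grad        # conformable update
}

# Closed-form least-squares solution for comparison
beta_ols <- solve(crossprod(X), crossprod(X, y))
```

The key point is that the residuals are multiplied by `t(X)` rather than summed first, so each coefficient gets its own gradient component.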
I have a data frame of values and want to add a column based on an inequality condition that involves matrix multiplication.
The data frame looks like this
# Set possible values for variables
b0 <- b1 <- b2 <- seq(0, 2, by=0.1)
# Create data frame of all the different combos of these variables
df <- setNames(expand.grid(b0, b1, b2), c("b0", "b1", "b2"))
There are a lot of precursor objects I have to define before adding this column:
##### Set n
n = 100
#### Generate (x1i, x2i)
# Install and load the 'MASS' package
#install.packages("MASS")
library("MASS")
# Input univariate parameters
rho <- 0.5
mu1 <- 0; s1 <- 1
mu2 <- 0; s2 <- 1
# Generate parameters for bivariate normal distribution
mu <- c(mu1, mu2)
sigma <- matrix(c(s1^2, s1*s2*rho, s1*s2*rho, s2^2), nrow=2, ncol=2)
# Generate draws from bivariate normal distribution
bvn <- mvrnorm(n, mu=mu, Sigma=sigma ) # from MASS package
x1 <- bvn[, 1]
x2 <- bvn[, 2]
##### Generate error
error <- rnorm(n)
##### Generate dependent variable
y <- 0.5 + x1 + x2 + error
##### Create the model
lm <- lm(y ~ x1 + x2)
# Setup parameters
n <- 100
K <- 3
c <- qf(.95, K, n - K)
# Define necessary objects
sigma_hat_sq <- 1
b0_hat <- summary(lm)$coefficients[1, 1]
b1_hat <- summary(lm)$coefficients[2, 1]
b2_hat <- summary(lm)$coefficients[3, 1]
x <- cbind(1, x1, x2)
I am trying to add this conditional column like this:
# Add a column to the data frame that says whether the condition holds
df <- transform(df, ueq = (
(1/(K*sigma_hat_sq))*
t(matrix(c(b0_hat-b0, b1_hat-b1, b2_hat-b2)))%*%
t(x)%*%x%*%
matrix(c(b0_hat-b0, b1_hat-b1, b2_hat-b2))
<= c
))
...but doing so generates the error message
Error in t(matrix(c(b0_hat - b0, b1_hat - b1, b2_hat - b2))) %*% t(x) :
non-conformable arguments
Mathematically, the condition is $\frac{1}{K\hat{\sigma}^2}(\hat{B}-B)'X'X(\hat{B}-B) \le c$, where, for each triple $(b_0, b_1, b_2)$, $(\hat{B}-B)$ is a 3x1 matrix with elements $\{\hat{B}_0-b_0, \hat{B}_1-b_1, \hat{B}_2-b_2\}$. I'm just not sure how to write this condition in R.
Any help would be greatly appreciated!
To work with one row of df at a time (and get a separate answer for each 1 x 3 matrix), you need a loop.
A simple way to do this in R is mapply.
df <- transform(df, ueq = mapply(\(b0, b1, b2)
(1/(K*sigma_hat_sq)) *
t(c(b0_hat-b0, b1_hat-b1, b2_hat-b2)) %*%
t(x) %*% x %*%
c(b0_hat-b0, b1_hat-b1, b2_hat-b2)
<= c,
b0 = b0, b1 = b1, b2 = b2
))
This leads to 91 TRUE rows.
sum(df$ueq)
[1] 91
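As an alternative to looping, the quadratic form can be evaluated for all rows at once. The sketch below is self-contained and uses simulated data in the spirit of the question; the names `grid`, `D`, `XtX`, `q`, and `crit` are introduced here for illustration, and the number of TRUE rows depends on the random draw, so it will not necessarily equal 91.

```r
set.seed(42)
n <- 100
K <- 3
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 0.5 + x1 + x2 + rnorm(n)

fit   <- lm(y ~ x1 + x2)
X     <- cbind(1, x1, x2)
b_hat <- unname(coef(fit))
sigma_hat_sq <- 1
crit  <- qf(0.95, K, n - K)   # named crit to avoid masking base::c

vals <- seq(0, 2, by = 0.1)
grid <- expand.grid(b0 = vals, b1 = vals, b2 = vals)

# Row i of D holds (b_hat - b) for the i-th candidate triple
D   <- matrix(b_hat, nrow = nrow(grid), ncol = K, byrow = TRUE) - as.matrix(grid)
XtX <- crossprod(X)

# d' X'X d for every row d of D, without an explicit loop
q <- rowSums((D %*% XtX) * D) / (K * sigma_hat_sq)
grid$ueq <- q <= crit
```

Here `rowSums((D %*% XtX) * D)` computes the same scalar per row that the `mapply` version obtains with `t(d) %*% t(x) %*% x %*% d`.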
I have a problem when I try to run the dffits() function for an object of my own logistic regression.
When I'm running dffits(log) I get the error message:
error in if (model$rank == 0) { : Argument is of length 0
However, when I use the built-in glm function (family = binomial), dffits(glm) works just fine.
Here is my function for the logistic regression and a short example of my problem:
mydata <- read.csv("https://stats.idre.ucla.edu/stat/data/binary.csv")
mydata$rank <- factor(mydata$rank)
mydata$admit <- factor(mydata$admit)
logRegEst <- function(x, y, threshold = 1e-10, maxIter = 100)
{
calcPi <- function(x, beta)
{
beta <- as.vector(beta)
return(exp(x %*% beta) / (1 + exp(x %*% beta)))
}
beta <- rep(0, ncol(x)) # initial guess for beta
diff <- 1000
# initial value bigger than threshold so that we can enter our while loop
iterCount = 0
# counter to ensure we're not stuck in an infinite loop
while(diff > threshold) # tests for convergence
{
pi <- as.vector(calcPi(x, beta))
# calculate pi by using the current estimate of beta
W <- diag(pi * (1 - pi)) # calculate matrix of weights W
beta_change <- solve(t(x) %*% W %*% x) %*% t(x) %*% (y - pi)
# calculate the change in beta
beta <- beta + beta_change # new beta
diff <- sum(beta_change^2)
# calculate how much we changed beta by in this iteration
# if this is less than threshold, we'll break the while loop
iterCount <- iterCount + 1
# see if we've hit the maximum number of iterations
if(iterCount > maxIter){
stop("This isn't converging.")
}
# stop if we have hit the maximum number of iterations
}
df <- length(y) - ncol(x)
# calculating the degrees of freedom by taking the length of y minus
# the number of x columns
vcov <- solve(t(x) %*% W %*% x)
list(coefficients = beta, vcov = vcov, df = df)
# returning results
}
logReg <- function(formula, data)
{
mf <- model.frame(formula = formula, data = data)
# model.frame() returns us a data.frame with the variables needed to use the
# formula.
x <- model.matrix(attr(mf, "terms"), data = mf)
# model.matrix() creates a design matrix. That means that for example the
#"Sex"-variable is given as a dummy variable with ones and zeros.
y <- as.numeric(model.response(mf)) - 1
# model.response gives us the response variable.
est <- logRegEst(x, y)
# Now we have the starting position to apply our function from above.
est$formula <- formula
est$call <- match.call()
est$data <- data
# We add the formula, the call and the data to the list.
est$x <- x
est$y <- y
# We add x and y to the list.
class(est) <- "logReg"
# defining the class
est
}
log <- logReg(admit ~ gre + gpa, data= mydata)
glm <- glm(admit ~ gre + gpa, data= mydata, family = binomial)
dffits(glm)
dffits(log)
log$data
glm$data
I don't understand why mydata$rank == 0, because when I look at log$data I see that the rank is just defined as in glm$data.
I really appreciate your help!
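A likely reading of the error: the `rank` in the message is a component of the model object (`model$rank`), not the `rank` column of `mydata`. The list returned by `logReg()` above has no `rank` element, so `model$rank` is `NULL`, and `NULL == 0` has length zero, which `if` rejects. The sketch below only reproduces that mechanism with a hypothetical stand-in object `fit`; it does not call `dffits()` itself.

```r
# Hypothetical stand-in for a custom model object without a `rank` element
fit <- list(coefficients = c(1, 2))
class(fit) <- "logReg"

missing_rank <- is.null(fit$rank)   # TRUE: the element does not exist
cond_length  <- length(fit$rank == 0)   # 0: comparing NULL yields logical(0)

# `if` needs a condition of length one, so this raises an error
res <- tryCatch(
  if (fit$rank == 0) "rank zero" else "ok",
  error = function(e) conditionMessage(e)
)
res
```

Adding a `rank` element may not be enough on its own, since `dffits()` is written for `lm` and `glm` objects and may rely on other components of those objects as well.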
I am trying to calculate the standardized Pearson Residuals by hand in R. However, I am struggling when it comes to calculating the hat matrix.
I have built my own logistic regression and I am trying to calculate the standardized Pearson residuals in the logReg function.
logRegEst <- function(x, y, threshold = 1e-10, maxIter = 100)
{
calcPi <- function(x, beta)
{
beta <- as.vector(beta)
return(exp(x %*% beta) / (1 + exp(x %*% beta)))
}
beta <- rep(0, ncol(x)) # initial guess for beta
diff <- 1000
# initial value bigger than threshold so that we can enter our while loop
iterCount = 0
# counter for the iterations to ensure we're not stuck in an infinite loop
while(diff > threshold) # tests for convergence
{
pi <- as.vector(calcPi(x, beta))
# calculate pi by using the current estimate of beta
W <- diag(pi * (1 - pi))
# calculate matrix of weights W as defined in the Fisher scoring algorithm
beta_change <- solve(t(x) %*% W %*% x) %*% t(x) %*% (y - pi)
# calculate the change in beta
beta <- beta + beta_change # new beta
diff <- sum(beta_change^2)
# calculate how much we changed beta by in this iteration
# if this is less than threshold, we'll break the while loop
iterCount <- iterCount + 1
# see if we've hit the maximum number of iterations
if(iterCount > maxIter){
stop("This isn't converging.")
}
# stop if we have hit the maximum number of iterations
}
n <- length(y)
df <- length(y) - ncol(x)
# calculating the degrees of freedom by taking the length of y minus
# the number of x columns
vcov <- solve(t(x) %*% W %*% x)
logLik <- sum(y * log(pi / (1 - pi)) + log(1 - pi))
deviance <- -2 * logLik
AIC <- -2 * logLik + 2 * ncol(x)
rank <- ncol(x)
list(coefficients = beta, vcov = vcov, df = df, deviance = deviance,
AIC = AIC, iter = iterCount - 1, x = x, y = y, n = n, rank = rank)
# returning results
}
logReg <- function(formula, data)
{
if (sum(is.na(data)) > 0) {
print("missing values in data")
} else {
mf <- model.frame(formula = formula, data = data)
# model.frame() returns us a data.frame with the variables needed to use the
# formula.
x <- model.matrix(attr(mf, "terms"), data = mf)
# model.matrix() creates a design matrix. That means that for example the
#"Sex"-variable is given as a dummy variable with ones and zeros.
y <- as.numeric(model.response(mf)) - 1
# model.response gives us the response variable.
est <- logRegEst(x, y)
# Now we have the starting position to apply our function from above.
est$formula <- formula
est$call <- match.call()
# We add the formula and the call to the list.
nullModel <- logRegEst(x = as.matrix(rep(1, length(y))), y)
est$nullDeviance <- nullModel$deviance
est$nullDf <- nullModel$df
mu <- exp(as.vector(est$x %*% est$coefficients)) /
(1 + exp(as.vector(est$x %*% est$coefficients)))
# computing the fitted values
est$residuals <- (est$y - mu) / sqrt(mu * (1 - mu))
est$mu <- mu
est$x <- x
est$y <- y
est$data <- data
hat <- (t(mu))^(1/2)%*%x%*%(t(x)%*%mu%*%x)^(-1)%*%t(x)%*%mu^(1/2)
est$stdresiduals <- est$residuals/(sqrt(1-hat))
class(est) <- "logReg"
# defining the class
est
}
}
I am struggling when it comes to calculating $H = \hat{V}^{1/2} X (X^T \hat{V} X)^{-1} X^T \hat{V}^{1/2}$. This is called hat in my code.
If I try to calculate the hat matrix (hat), I get an error saying that I cannot multiply the vector mu and the matrix x, in this part: t(x)%*%mu%*%x.
I can see that the dimensions of the matrices do not match, and therefore I can't multiply them.
Can anyone see where my mistake is? Any help is much appreciated. Thanks!
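Two things in the `hat` line look problematic: `^(-1)` on a matrix is element-wise in R (a matrix inverse needs `solve()`), and the formula calls for a diagonal matrix rather than the vector `mu`. The sketch below assumes that V-hat denotes diag(mu_i (1 - mu_i)), the usual choice for logistic regression. The simulated data and the names `v`, `XtVX_inv`, `H`, `h`, and `std_pearson` are illustrative, and `glm()` is used only to obtain fitted probabilities.

```r
set.seed(1)
n <- 200
X <- cbind(1, rnorm(n), rnorm(n))                  # n x 3 design matrix with intercept
y <- rbinom(n, 1, plogis(X %*% c(-0.5, 1, 0.5)))   # simulated binary response

fit <- glm(y ~ X - 1, family = binomial)
mu  <- as.vector(fitted(fit))     # fitted probabilities
v   <- mu * (1 - mu)              # diagonal of V-hat (assumed definition)

# v * X scales row i of X by v[i], which equals diag(v) %*% X
XtVX_inv <- solve(t(X) %*% (v * X))
H <- (sqrt(v) * X) %*% XtVX_inv %*% t(sqrt(v) * X)  # n x n hat matrix
h <- diag(H)                                        # leverages

pearson     <- (y - mu) / sqrt(v)        # Pearson residuals
std_pearson <- pearson / sqrt(1 - h)     # standardized Pearson residuals
```

Only the diagonal of `H` enters the standardization, so dividing by `sqrt(1 - hat)` with the full matrix would not give one value per observation. The leverages `h` should be close to `hatvalues(fit)`, up to the convergence tolerance of `glm()`.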
This is my code:
#define likelihood function (including an intercept/constant in the function.)
lltobit <- function(b,x,y) {
sigma <- b[3]
y <- as.matrix(y)
x <- as.matrix(x)
vecones <- rep(1,nrow(x))
x <- cbind(vecones,x)
bx <- x %*% b[1:2]
d <- y != 0
llik <- sum(d * ((-1/2)*(log(2*pi) + log(sigma^2) + ((y - bx)/sigma)^2))
+ (1-d) * (log(1 - pnorm(bx/sigma))))
return(-llik)
}
n <- nrow(censored) #define number of observations
y <- censored$y #define y and x for easier use
x1 <- as.matrix(censored$x)
x <- cbind(rep(1,n),x1) #include constant/intercept
bols <- (solve(t(x) %*% x)) %*% (t(x) %*% y) #compute OLS estimator (X'X)^-1 X'y
init <- rbind(as.matrix(bols[1:nrow(bols)]),1) #initial values
init
tobit1 <- optim(init, lltobit, x=x, y=y, hessian=TRUE, method="BFGS")
Here censored is my data table, containing 200 (censored) values of y and 200 values of x.
Everything works until I run the optim command, which gives the following error:
tobit1 <- optim(init, lltobit, x=x, y=y, hessian=TRUE, method="BFGS")
Error in x %*% b[1:2] : non-conformable arguments
I know what it means, but since x is a 200 by 2 matrix, and b[1:2] a vector of 2 by 1, what goes wrong? I tried transposing both, and also the initial values vector, but nothing works. Can anyone help me?
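Reading the code as posted, one plausible cause is that the intercept column is added twice: `x` already contains a column of ones when it is passed to `optim`, and `lltobit` then binds another one, so inside the function `x` is 200 x 3 while `b[1:2]` has length 2. The sketch below reproduces only the dimension problem with simulated numbers; `x_twice` and the values in `b` are hypothetical.

```r
set.seed(1)
x1 <- matrix(rnorm(200), ncol = 1)
x  <- cbind(1, x1)        # 200 x 2: intercept already included

# What happens inside lltobit when it binds a column of ones again
x_twice <- cbind(1, x)    # 200 x 3

b <- c(0.5, 1, 1)         # two regression coefficients plus sigma

ok  <- x %*% b[1:2]       # 200 x 1, conformable
err <- tryCatch(x_twice %*% b[1:2],
                error = function(e) conditionMessage(e))
dim(x_twice)
err
```

If this is indeed the cause, either passing `x1` without the intercept or removing the `cbind(vecones, x)` line inside the likelihood function would make the dimensions agree.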
I stumbled upon a similar problem today (a "non-conformable arguments" error even though everything seemed OK). The solution in my case came down to the basic rule of matrix multiplication: the number of columns of the left matrix must equal the number of rows of the right matrix, so I had to switch the order of the operands in the multiplication.
In other words, in matrix multiplication (unlike ordinary multiplication), A %*% B is not the same as B %*% A.
Here is one case from Principal Component Regression (PCR) in R. I ran into this problem today when trying to predict on test data with a fitted model. It returned an error:
> pcr.pred = predict(pcr.fit, test.data, ncomp=6)
Error in newX %*% B[-1, , i] : non-conformable arguments
In addition: Warning message:
The problem was that the test data had a factor level that was not present in the training data. To find which column has the problem:
cols = colnames(train)
for (col in cols){
if(is.factor(train[[col]])){
print(col)
print(summary(train[[col]]))
print(summary(test[[col]]))
}
}
Once you know which columns contain the new level, you can replace that 'new' value with a more common one, save the data with write.csv, reload it, and then run the PCR prediction.
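Instead of comparing the printed summaries by eye, the unseen levels can be listed directly. This is a small sketch on made-up data frames; `train`, `test`, and the column names here are toy stand-ins, and `new_levels` is a name introduced for illustration.

```r
# Toy data: the level "green" appears only in the test set
train <- data.frame(color = factor(c("red", "blue", "red")), size = c(1, 2, 3))
test  <- data.frame(color = factor(c("red", "green")),       size = c(4, 5))

# Names of the factor columns in the training data
factor_cols <- names(train)[sapply(train, is.factor)]

# For each factor column, values seen in test but not among the training levels
new_levels <- lapply(factor_cols, function(col) {
  setdiff(unique(as.character(test[[col]])), levels(train[[col]]))
})
names(new_levels) <- factor_cols

# Keep only the columns that actually have unseen levels
new_levels <- Filter(length, new_levels)
new_levels
```

Columns that remain in `new_levels` are the ones whose test values need to be recoded or dropped before calling `predict`.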
I wrote Newton-Raphson code for logistic regression. Unfortunately, it has not converged on any of the data sets I have tried. There is a mistake somewhere, but I do not know where. Can anyone help me figure out what the problem is?
First, the data are as follows: y is the response (0 or 1) and Z is a 115 x 30 matrix of explanatory variables. I need to estimate the 30 parameters.
y = c(rep(0,60),rep(1,55))
X = sample(c(0,1),size=3450,replace=T)
Z = t(matrix(X,ncol=115))
#The code is ;
B = matrix(rep(0,30*10),ncol=10)
B[,1] = matrix(rep(0,30),ncol=1)
for(i in 2 : 10){
print(i)
p <- exp(Z %*%as.matrix(B[,i])) / (1 + exp(Z %*% as.matrix(B[,i])))
v.2 <- diag(as.vector(1 * p*(1-p)))
score.2 <- t(Z) %*% (y - p) # score function
increm <- solve(t(Z) %*% v.2 %*% Z)
B[,i] = as.matrix(B[,i-1])+increm%*%score.2
if(B[,i]-B[i-1]==matrix(rep(0.0001,30),ncol=1)){
return(B)
}
}
Found it! You're updating p based on B[,i], you should be using B[,i-1] ...
While I was finding the answer, I cleaned up your code and incorporated the results in a function. R's built-in glm seems to work (see below). One note is that this approach is likely to be unstable: fitting a binary model with 30 predictors and only 115 binary responses, and without any penalization or shrinkage, is extremely optimistic ...
set.seed(101)
n.obs <- 115
n.zero <- 60
n.pred <- 30
y <- c(rep(0,n.zero),rep(1,n.obs-n.zero))
X <- sample(c(0,1),size=n.pred*n.obs,replace=TRUE)
Z <- t(matrix(X,ncol=n.obs))
R's built-in glm fitter does work (it uses iteratively reweighted least squares, not N-R):
g1 <- glm(y~.-1,data.frame(y,Z),family="binomial")
(If you want to view the results, library("arm"); coefplot(g1).)
## B_{m+1} = B_m + (X^T V_m X)^{-1} X^T (Y-P_m)
NRfit function:
NRfit <- function(y,X,start,n.iter=100,tol=1e-4,verbose=TRUE) {
## used X rather than Z just because it's more standard notation
n.pred <- ncol(X)
B <- matrix(NA,ncol=n.iter,
nrow=n.pred)
B[,1] <- start
for (i in 2:n.iter) {
if (verbose) cat(i,"\n")
p <- plogis(X %*% B[,i-1])
v.2 <- diag(c(p*(1-p)))
score.2 <- t(X) %*% (y - p) # score function
increm <- solve(t(X) %*% v.2 %*% X)
B[,i] <- B[,i-1]+increm%*%score.2
if (all(abs(B[,i]-B[,i-1]) < tol)) return(B)
}
B
}
matplot(res1 <- t(NRfit(y,Z,start=coef(g1))))
matplot(res2 <- t(NRfit(y,Z,start=rep(0,ncol(Z)))))
all.equal(res2[6,],unname(coef(g1))) ## TRUE
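A side note on the stopping rule: the loop in the question compares the change in the coefficients to a matrix of 0.0001 using `==`, which is almost never exactly true, whereas `NRfit` tests `all(abs(...) < tol)`. A tiny illustration with made-up numbers (the vectors `prev` and `curr` are hypothetical):

```r
prev <- c(0.50000, 1.20000)   # coefficients from the previous iteration
curr <- c(0.50003, 1.20002)   # coefficients from the current iteration
tol  <- 1e-4

# Exact comparison: the change would have to equal tol exactly
exact_match <- all(curr - prev == tol)

# Tolerance-based test: every change is smaller than tol in absolute value
converged <- all(abs(curr - prev) < tol)

c(exact_match = exact_match, converged = converged)
```

Wrapping the comparison in `all()` also gives `if` a single TRUE or FALSE rather than a vector of 30 values.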