I am trying to run a simple optimization problem and I always get the 'non-conformable arguments' error. However, the error occurs only within optim, so I am at the end of my understanding. Here is my code:
f <- function(W, x, y) {
  temp <- W %*% x
  temp1 <- t(temp) %*% temp
  temp2 <- t(temp) %*% y
  temp3 <- t(y) %*% y
  (temp1 - 2 * temp2 + temp3)[1]
}
x <- matrix(c(1,2), ncol=1)
y <- matrix(c(5,3), ncol=1)
W <- matrix(c(1,2,3,4), ncol=2)
f(W,x=x,y=y)
[1] 53
So far so good, the function works just fine. However, once I put the values into optim:
optim(par=W, fn=f, x=x, y=y)
I get:
Error in W %*% x : non-conformable arguments
How can I fix this?
If you add print(W) at the start of f, you will see that W is passed in as a vector (optim converts par to a vector). You should transform W back into a matrix with adequate dimensions at the start of f.
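A minimal sketch of that fix (dimensions taken from the example above, where W is 2 x 2):

f <- function(W, x, y) {
  W <- matrix(W, nrow = 2, ncol = 2)  # optim() flattens par, so restore the shape
  temp <- W %*% x
  temp1 <- t(temp) %*% temp
  temp2 <- t(temp) %*% y
  temp3 <- t(y) %*% y
  (temp1 - 2 * temp2 + temp3)[1]
}
optim(par = W, fn = f, x = x, y = y)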
I have been trying to figure out the core part of the varimax function in R. I found a wiki link that writes out the algorithm. But why is B <- t(x) %*% (z^3 - z %*% diag(drop(rep(1, p) %*% z^2))/p) computed? I am also not sure why the SVD of the matrix B is taken. The iteration step is probably there to maximize/minimize the variance, and the singular values would really be variances of principal components, but I am unsure about that too. I am pasting the whole code of varimax for convenience, but the relevant part, and therefore my question about what is actually happening under the hood, is the for loop.
function (x, normalize = TRUE, eps = 1e-05)
{
    nc <- ncol(x)
    if (nc < 2)
        return(x)
    if (normalize) {
        sc <- sqrt(drop(apply(x, 1L, function(x) sum(x^2))))
        x <- x/sc
    }
    p <- nrow(x)
    TT <- diag(nc)
    d <- 0
    for (i in 1L:1000L) {
        z <- x %*% TT
        B <- t(x) %*% (z^3 - z %*% diag(drop(rep(1, p) %*% z^2))/p)
        sB <- La.svd(B)
        TT <- sB$u %*% sB$vt
        dpast <- d
        d <- sum(sB$d)
        if (d < dpast * (1 + eps))
            break
    }
    z <- x %*% TT
    if (normalize)
        z <- z * sc
    dimnames(z) <- dimnames(x)
    class(z) <- "loadings"
    list(loadings = z, rotmat = TT)
}
Edit: The algorithm is available in the book "Factor Analysis of Data Matrices" by Paul Horst (published by Holt, Rinehart and Winston), and the actual sources can be found therein. This book is also cited in the documentation of the varimax function in R.
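Not an authoritative answer, but a hedged reading of the loop: varimax maximizes the sum, over columns of the rotated loadings z, of the variance of the squared loadings. B is proportional to the gradient of that criterion with respect to the rotation matrix (rep(1, p) %*% z^2 is just colSums(z^2)), and the SVD step TT <- sB$u %*% sB$vt replaces that gradient with the nearest orthogonal matrix, i.e. the orthogonal Procrustes solution. The quantity d, the sum of the singular values, then tracks the criterion for the convergence test; the singular values are not variances of principal components. A small sketch of the quantity the loop increases:

# My gloss, not from the R sources: the varimax criterion, up to
# constant factors, for a loadings matrix x and rotation TT
varimax.crit <- function(x, TT) {
  z <- x %*% TT
  p <- nrow(z)
  sum(colSums(z^4)/p - (colSums(z^2)/p)^2)
}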
This is my code:
# define likelihood function (including an intercept/constant in the function)
lltobit <- function(b, x, y) {
  sigma <- b[3]
  y <- as.matrix(y)
  x <- as.matrix(x)
  vecones <- rep(1, nrow(x))
  x <- cbind(vecones, x)
  bx <- x %*% b[1:2]
  d <- y != 0
  llik <- sum(d * ((-1/2) * (log(2*pi) + log(sigma^2) + ((y - bx)/sigma)^2))
              + (1 - d) * (log(1 - pnorm(bx/sigma))))
  return(-llik)
}
n <- nrow(censored)  # number of observations
y <- censored$y      # define y and x for easier use
x1 <- as.matrix(censored$x)
x <- cbind(rep(1,n), x1)  # include constant/intercept
bols <- solve(t(x) %*% x) %*% (t(x) %*% y)  # compute OLS estimator (X'X)^-1 X'Y
init <- rbind(as.matrix(bols[1:nrow(bols)]), 1)  # initial values
init
tobit1 <- optim(init, lltobit, x=x, y=y, hessian=TRUE, method="BFGS")
where censored is my data table, including 200 (censored) values of y and 200 values of x.
Everything works, but when running the optim command, I get the following error:
tobit1 <- optim(init, lltobit, x=x, y=y, hessian=TRUE, method="BFGS")
Error in x %*% b[1:2] : non-conformable arguments
I know what it means, but since x is a 200 by 2 matrix and b[1:2] a 2 by 1 vector, what goes wrong? I tried transposing both, and also the initial values vector, but nothing works. Can anyone help me?
I stumbled upon a similar problem today (a "non-conformable arguments" error, even though everything seemed OK), and the solution in my case was the basic rule of matrix multiplication: the number of columns of the left matrix must equal the number of rows of the right matrix, so I had to switch the order in my multiplication.
In other words, in matrix multiplication (unlike ordinary multiplication), A %*% B is not the same as B %*% A.
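A minimal illustration of the rule (these matrices are made up for the example):

A <- matrix(1:6, nrow = 2)   # 2 x 3
B <- matrix(1:6, nrow = 3)   # 3 x 2
dim(A %*% B)   # 2 x 2: columns of A match rows of B
dim(B %*% A)   # 3 x 3: also conformable, but a different result
C <- matrix(1:4, nrow = 2)   # 2 x 2
dim(C %*% A)   # 2 x 3: conformable in this order only
## A %*% C     # Error in A %*% C : non-conformable arguments (3 columns vs 2 rows)

This is also what bites the tobit code above: x already contains a constant column when it is passed to optim, and lltobit cbinds another one, so inside the function x is 200 by 3 while b[1:2] has length 2.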
I offer one case from Principal Component Regression (PCR) in R: I met this problem today when trying to fit test data with a model. It returned an error:
> pcr.pred = predict(pcr.fit, test.data, ncomp=6)
Error in newX %*% B[-1, , i] : non-conformable arguments
In addition: Warning message:
The problem was that the test data had a new factor level that was not present in the training data. To find which column has the problem:
cols <- colnames(train)
for (col in cols) {
  if (is.factor(train[[col]])) {
    print(col)
    print(summary(train[[col]]))
    print(summary(test[[col]]))
  }
}
You can check which column has this new level, then replace the 'new' level with another common value, save the data with write.csv and reload it, and then run the PCR prediction.
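A hedged in-memory alternative to the write.csv round trip (the column name group is made up for the example):

# Restrict the test factor to the training levels; unseen levels become NA
test$group <- factor(test$group, levels = levels(train$group))
# Replace the NAs with the most common level from the training data
test$group[is.na(test$group)] <- names(which.max(table(train$group)))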
In the following code I'm trying to use the function norm.den to generate a contour plot of the points given by x and y. I can't seem to figure out how to do this; I've tried everything I could think of. For some reason the function isn't letting me put in the sequence values. Shouldn't the outer function give me a matrix of values?
x=seq(-10,10,length=1000)
y=seq(-10,10,length=1000)
sigma <- matrix(c(10,-5,-5,20), ncol=2)
sigma
norm.den <- function(x, y, sigma, mu) {
  j <- c(x, y)
  k <- j - mu
  t <- t(k)
  s <- solve(sigma)
  d <- det(sigma)
  ((2.718)^(-t %*% s %*% k/2)) / (2 * (3.14) * sqrt(d))
}
z=outer(x,y,norm.den,sigma=sigma,mu=c(0,0))
Forcibly using for() loops works.
normden <- function(x, y, solvsig = ss, detsig = ds, mu = c(0, 0)) {
  k <- c(x, y) - mu
  tk <- t(k)
  (exp(-tk %*% solvsig %*% k / 2)) / (2 * pi * sqrt(detsig))
}
sigma <- matrix(c(10, -5, -5, 20), ncol=2)
ss <- solve(sigma)
ds <- det(sigma)
x <- seq(-10, 10, length=10)
y <- seq(-10, 10, length=10)
z <- array(dim = c(length(x), length(y)))
for (i in seq_along(x)) {
  for (j in seq_along(y)) {
    z[i, j] <- normden(x = x[i], y = y[j])
  }
}
z
I couldn't get outer() to work with the normden() function. Don't know why.
outer(x, y, "normden")
I get the same error that the OP mentioned, Error in -tk %*% ss : non-conformable arguments.
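The likely reason (my reading, not something traced in the sources): outer() calls the function once with two full-length vectors and expects a vectorized function, while normden() builds a single length-2 vector via c(x, y), so the matrix products no longer conform. Wrapping it with Vectorize() makes outer() work:

# Vectorize() loops over each (x[i], y[j]) pair, which is what outer() needs
z2 <- outer(x, y, Vectorize(normden, c("x", "y")))
all.equal(z, z2)  # should be TRUE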
I wrote code for Newton-Raphson for logistic regression. Unfortunately, for all the data I tried there is no convergence; there is a mistake but I cannot find it. Can anyone help figure out what the problem is?
First, the data are as follows: y indicates the response (0,1), and Z is a 115*30 matrix of explanatory variables. I need to estimate the 30 parameters.
y = c(rep(0,60),rep(1,55))
X = sample(c(0,1),size=3450,replace=T)
Z = t(matrix(X,ncol=115))
# The code is:
B = matrix(rep(0, 30*10), ncol = 10)
B[,1] = matrix(rep(0, 30), ncol = 1)
for (i in 2:10) {
  print(i)
  p <- exp(Z %*% as.matrix(B[,i])) / (1 + exp(Z %*% as.matrix(B[,i])))
  v.2 <- diag(as.vector(1 * p * (1 - p)))
  score.2 <- t(Z) %*% (y - p)  # score function
  increm <- solve(t(Z) %*% v.2 %*% Z)
  B[,i] = as.matrix(B[,i-1]) + increm %*% score.2
  if (B[,i] - B[,i-1] == matrix(rep(0.0001, 30), ncol = 1)) {
    return(B)
  }
}
Found it! You're updating p based on B[,i], you should be using B[,i-1] ...
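That is, inside the loop the line computing p should use the previous column, e.g.:

p <- exp(Z %*% B[,i-1]) / (1 + exp(Z %*% B[,i-1]))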
While I was finding the answer, I cleaned up your code and incorporated the results in a function. R's built-in glm seems to work (see below). One note is that this approach is likely to be unstable: fitting a binary model with 30 predictors and only 115 binary responses, and without any penalization or shrinkage, is extremely optimistic ...
set.seed(101)
n.obs <- 115
n.zero <- 60
n.pred <- 30
y <- c(rep(0,n.zero),rep(1,n.obs-n.zero))
X <- sample(c(0,1),size=n.pred*n.obs,replace=TRUE)
Z <- t(matrix(X,ncol=n.obs))
R's built-in glm fitter does work (it uses iteratively reweighted least squares, not N-R):
g1 <- glm(y~.-1,data.frame(y,Z),family="binomial")
(If you want to view the results, library("arm"); coefplot(g1).)
The Newton-Raphson update being implemented is B_{m+1} = B_m + (X^T V_m X)^{-1} X^T (Y - P_m). The NRfit function:
NRfit <- function(y, X, start, n.iter = 100, tol = 1e-4, verbose = TRUE) {
  ## used X rather than Z just because it's more standard notation
  n.pred <- ncol(X)
  B <- matrix(NA, ncol = n.iter, nrow = n.pred)
  B[,1] <- start
  for (i in 2:n.iter) {
    if (verbose) cat(i, "\n")
    p <- plogis(X %*% B[,i-1])
    v.2 <- diag(c(p * (1 - p)))
    score.2 <- t(X) %*% (y - p)  # score function
    increm <- solve(t(X) %*% v.2 %*% X)
    B[,i] <- B[,i-1] + increm %*% score.2
    if (all(abs(B[,i] - B[,i-1]) < tol)) return(B)
  }
  B
}
matplot(res1 <- t(NRfit(y,Z,start=coef(g1))))
matplot(res2 <- t(NRfit(y,Z,start=rep(0,ncol(Z)))))
all.equal(res2[6,],unname(coef(g1))) ## TRUE
I am working on ridge regression and want to write my own function. I tried the following. It works for an individual value of k but not for a sequence of values.
dt <- longley
attach(dt)
library(MASS)
X <- cbind(X1, X2, X3, X4, X5, X6)
X <- as.matrix(X)
Y <- as.matrix(Y)
sx <- scale(X)/sqrt(nrow(X) - 1)
sy <- scale(Y)/sqrt(nrow(Y) - 1)
rxx <- cor(sx)
rxy <- cor(sx, sy)
for (k in 0:1) {
  res <- solve(rxx + k*diag(rxx)) %*% rxy
  k = k + 0.01
}
I need help optimizing the code too.
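Before the kernel approach below, a minimal sketch of the sequence-of-k part (assuming the intended ridge term is k times the identity; note that diag(rxx) extracts a vector of ones from a correlation matrix rather than building an identity matrix):

ks <- seq(0, 1, by = 0.01)
# one column of coefficients per value of k
res <- sapply(ks, function(k) solve(rxx + k * diag(nrow(rxx))) %*% rxy)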
poly.kernel <- function(v1, v2 = v1, p = 1) {
  ((as.matrix(v1) %*% t(v2)) + 1)^p
}

KernelRidgeReg <- function(TrainObjects, TrainLabels, TestObjects, TestLabels, lambda) {
  X <- TrainObjects
  y <- TrainLabels
  kernel <- poly.kernel(X)
  design.mat <- cbind(1, kernel)
  I <- rbind(0, cbind(0, kernel))
  M <- crossprod(design.mat) + lambda*I
  # crossprod(x) is just t(x) %*% x; it just looks neater in my opinion
  M.inv <- solve(M)
  # inverse of M
  k <- as.matrix(diag(poly.kernel(cbind(TrainObjects, TrainLabels))))
  # Removing diag still gives the same MSE, but will output a vector of predictions.
  Labels <- rbind(0, as.matrix(TrainLabels))
  y.hat <- t(Labels) %*% M.inv %*% rbind(0, k)
  y.true <- TestLabels  # test labels passed in as an argument (was an undefined global Y.test)
  MSE <- mean((y.hat - y.true)^2)
  return(list(MSE = MSE, y.hat = y.hat))
}
A kernel with p=1 will give you ridge regression.
The built-in solve function in R sometimes fails when the matrix is (near-)singular. You may want to write your own function to avoid that.
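A hedged example of that alternative, using the Moore-Penrose pseudo-inverse from MASS in place of solve():

library(MASS)
M <- matrix(c(1, 2, 2, 4), ncol = 2)  # singular: solve(M) throws an error
ginv(M)  # the Moore-Penrose pseudo-inverse is still defined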