I want to compute eigenvectors and eigenvalues without using the built-in eigen() function.
Hand calculation is feasible for small matrices, but I find it hard to see how to implement the computation in R.
I want code that works for any data matrix x (n x p).
I would be really grateful if you could provide me with an idea or code.
A <- t(x) %*% x
eigen(A)$vectors # I don't want to use 'eigen'
eigen(A)$values
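For what it's worth, here is a minimal sketch of the unshifted QR algorithm, one standard way to do this without eigen(). It assumes A is symmetric (which t(x) %*% x always is), has no shifts or deflation, and the eigenvector columns may differ from eigen() by sign, so treat it as a starting point rather than a drop-in replacement.
# Minimal QR-algorithm sketch for a symmetric matrix A = t(x) %*% x.
my_eigen <- function(A, iter = 1000, tol = 1e-10) {
  V <- diag(nrow(A))              # accumulates the orthogonal similarity transforms
  for (i in seq_len(iter)) {
    qrA <- qr(A)
    Q   <- qr.Q(qrA)
    A   <- qr.R(qrA) %*% Q        # A_{k+1} = R_k Q_k is similar to A_k
    V   <- V %*% Q
    if (max(abs(A[lower.tri(A)])) < tol) break
  }
  ord <- order(diag(A), decreasing = TRUE)
  list(values = diag(A)[ord], vectors = V[, ord])
}
x <- matrix(rnorm(50), 10, 5)
A <- t(x) %*% x
my_eigen(A)$values    # compare with eigen(A)$values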
You have a set of N=400 objects, each having its own coordinates in a, say, 19-dimensional space.
You calculate the (Euclidean) distance matrix (all pairwise distances).
Now you want to select n=50 objects, such that the sum of all pairwise distances between the selected objects is maximal.
I devised a way to solve this by linear programming (code below, for a smaller example), but it seems inefficient to me, because I am using N*(N-1)/2 binary variables, corresponding to all the non-redundant elements of the distance matrix, and then a lot of constraints to ensure self-consistency of the solution vector.
I suspect there must be a simpler approach, where only N variables are used, but I can't immediately think of one.
This post briefly mentions some 'Bron–Kerbosch' algorithm, which apparently addresses the distance sum part.
But in that example the sum of distances is a specific number, so I don't see a direct application to my case.
I had a brief look at quadratic programming, but again I could not see an immediate parallel with my case, although the b %*% t(b) matrix, where b is the (column) binary solution vector, could in theory be used to multiply the distance matrix, etc.; I'm really not familiar with this technique.
Could anyone please advise (/point me to other posts explaining) if and how this kind of problem can be solved by linear programming using only N binary variables?
Or provide any other advice on how to tackle the problem more efficiently?
Thanks!
PS: here's the code I referred to above.
require(Matrix)
#distmat defined manually for this example as a sparseMatrix
distmat <- sparseMatrix(i=c(rep(1,4),rep(2,3),rep(3,2),rep(4,1)),j=c(2:5,3:5,4:5,5:5),x=c(0.3,0.2,0.9,0.5,0.1,0.8,0.75,0.6,0.6,0.15))
N = 5
n = 3
distmat_summary <- summary(distmat)               # i, j, x triplets, one row per pair
distmat_summary["ID"] <- 1:NROW(distmat_summary)  # one binary variable per pair
# indicator matrices mapping each pair variable to the two objects it involves
i.mat <- xtabs(~i+ID,distmat_summary,sparse=T)
j.mat <- xtabs(~j+ID,distmat_summary,sparse=T)
ij.mat <- rbind(i.mat,"5"=rep(0,10))+rbind("1"=rep(0,10),j.mat)
ij.mat.rowSums <- rowSums(ij.mat)
ij.diag.mat <- .sparseDiagonal(n=length(ij.mat.rowSums),-ij.mat.rowSums)
colnames(ij.diag.mat) <- dimnames(ij.mat)[[1]]
# constraint matrix: self-consistency between pair variables and object variables,
# plus the requirement that exactly n objects are selected
mat <- rbind(cbind(ij.mat,ij.diag.mat),cbind(ij.mat,ij.diag.mat),c(rep(0,NCOL(ij.mat)),rep(1,NROW(ij.mat)) ))
dir <- c(rep("<=",NROW(ij.mat)),rep(">=",NROW(ij.mat)),"==")
rhs <- c(rep(0,NROW(ij.mat)),1-unname(ij.mat.rowSums),n)
# objective: pair distances for the pair variables, zeros for the object variables
obj <- xtabs(x~ID,distmat_summary)
obj <- c(obj,setNames(rep(0, NROW(ij.mat)), dimnames(ij.mat)[[1]]))
if (length(find.package(package="Rsymphony",quiet=TRUE))==0) install.packages("Rsymphony")
require(Rsymphony)
LP.sol <- Rsymphony_solve_LP(obj,mat,dir,rhs,types="B",max=TRUE)
items.sol <- (names(obj)[(1+NCOL(ij.mat)):(NCOL(ij.mat)+NROW(ij.mat))])[as.logical(LP.sol$solution[(1+NCOL(ij.mat)):(NCOL(ij.mat)+NROW(ij.mat))])]
items.sol   # the selected objects
ID.sol <- names(obj)[1:NCOL(ij.mat)][as.logical(LP.sol$solution[1:NCOL(ij.mat)])]
as.data.frame(distmat_summary[distmat_summary$ID %in% ID.sol,])   # the selected pairs and their distances
This problem is called the p-dispersion-sum problem. It can be formulated using N binary variables, but using quadratic terms. As far as I know, it is not possible to formulate it with only N binary variables in a linear program.
This paper by Pisinger gives the quadratic formulation and discusses bounds and a branch-and-bound algorithm.
Hope this helps.
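To make the quadratic objective concrete on the toy data above (a sketch only; the brute-force check is obviously not viable for N = 400, it just shows what the N-variable quadratic objective looks like): with a full symmetric distance matrix D and a binary selection vector x with sum(x) == n, the objective is 0.5 * t(x) %*% D %*% x.
ds <- summary(distmat)                 # i, j, x triplets of the upper triangle from the question
D  <- matrix(0, 5, 5)
D[cbind(ds$i, ds$j)] <- ds$x
D  <- D + t(D)                         # full symmetric distance matrix
obj_quad <- function(x) 0.5 * drop(t(x) %*% D %*% x)   # sum of selected pairwise distances
combs  <- combn(5, 3)                  # all subsets of size n = 3
scores <- apply(combs, 2, function(idx) obj_quad(replace(numeric(5), idx, 1)))
combs[, which.max(scores)]             # best subset of the toy example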
So I want to ask whether there's any way to define and solve a system of differential equations in R using matrix notation.
I know usually you do something like
lotka_volterra <- function(t, a, b, c, d, x, y){
  dx <- a*x + b*x*y
  dy <- d*x*y - c*y
  return(list(c(dx, dy)))
}
But I want to do
lotka_volterra <- function(t, M, v, x){
  dx <- x * (M %*% x) + v * x
  return(list(dx))
}
where x is a vector of length 2, M is a 2*2 matrix and v is a vector of length 2. I.e. I want to define the system of differential equations using matrix/vector notation.
This is important because my actual system is significantly more complex: I would rather define one differential equation with one matrix of interaction parameters and one vector of growth parameters than 11 separate differential equations with 100+ scalar parameters.
I can define the function as above, but when it comes to using the ode function from deSolve, it expects a parms argument passed as a named vector of parameters, which of course cannot hold non-scalar values.
Is this at all possible in R with deSolve, or another package? If not I'll look into perhaps using MATLAB or Python, though I don't know how it's done in either of those languages either at present.
Many thanks,
H
With my low reputation (points), I apologize for posting this as an answer which supposedly should be just a comment. Going back, have you tried this link? In addition, in an attempt to find an alternative solution to your problem, have you tried MANOPT, a toolbox of MATLAB? It's actually open source just like R. I encountered MANOPT on a paper whose problem boils down to solving a system of ODEs involving purely matrices.
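For what it's worth, here is a minimal sketch of one way to keep the matrix notation directly in deSolve: the parms argument can be any R object, for example a list carrying the interaction matrix and growth vector. The numbers below are made up purely for illustration.
library(deSolve)
# state x, interaction matrix M and growth vector v passed together as a list
lv_matrix <- function(t, x, parms) {
  with(parms, {
    dx <- x * (M %*% x) + v * x      # element-wise product with the matrix product
    list(as.vector(dx))
  })
}
M  <- matrix(c(0, 0.02, -0.02, 0), 2, 2)   # made-up interaction parameters
v  <- c(1.0, -0.5)                          # made-up growth/death rates
x0 <- c(prey = 10, pred = 5)
out <- ode(y = x0, times = seq(0, 50, by = 0.1),
           func = lv_matrix, parms = list(M = M, v = v))
head(out)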
I need to take the inner product of every pair of columns in a particular matrix, which I achieve by calculating
t(M) %*% M
However this naturally produces a symmetrical result, doing just over twice the necessary work (I don't need the diagonal either). Obviously I could break the multiply down into individual inner product operations, but is there a better way to calculate just the upper triangular part of the product?
From the description in help("crossprod"):
Given matrices x and y as arguments, return a matrix cross-product.
This is formally equivalent to (but usually slightly faster than) the
call t(x) %*% y (crossprod) or x %*% t(y) (tcrossprod).
Thus, use crossprod(M).
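A small usage example: note that crossprod() still computes the full symmetric matrix, it just avoids forming t(M) explicitly; the strictly upper-triangular entries can then be pulled out with upper.tri().
M   <- matrix(rnorm(20), 5, 4)
XtX <- crossprod(M)          # same result as t(M) %*% M
XtX[upper.tri(XtX)]          # the strictly upper-triangular inner products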
I want to minimize the function FLogV (working with a multivariate normal distribution; Z is the N x C data matrix, SIGMA is the C x C variance-covariance matrix of the data, and R is a vector of length C).
FLogV <- function(P){
  # here I define the parameters P, and build R and SIGMA from them
  logC <- (C/2)*N*log(2*pi) + (1/2)*N*log(det(SIGMA))
  SOMA.t <- 0
  for (j in 1:N){
    SOMA.t <- SOMA.t + sum(t(Z[j,]-R) %*% solve(SIGMA) %*% (Z[j,]-R))
  }
  MlogV <- logC + (1/2)*SOMA.t
  return(MlogV)
}
minLogV <- optim(P,FLogV)
All this is part of an extended piece of code which was already tested and works well, except for the most important part: I can't optimize, because I get this error:
“Error in solve.default(SIGMA) :
system is computationally singular: reciprocal condition number = 3.57726e-55”
If I use ginv() or pseudoinverse() or qr.solve() I get:
“Error in svd(X) : infinite or missing values in 'x'”
The thing is: if I take the SIGMA matrix after the error message, I can run solve(SIGMA) fine, the eigenvalues are all positive, and the determinant is very small but positive:
det(SIGMA)
[1] 3.384674e-76
eigen(SIGMA)$values
[1] 0.066490265 0.024034173 0.018738777 0.015718562 0.013568884 0.013086845
….
[31] 0.002414433 0.002061556 0.001795105 0.001607811
I have already read several papers about handling matrices like SIGMA (which are close to singular), and tried several transformations of the data's scale and form, but I realized that for a 34x34 matrix like this example, once det(SIGMA) gets close to 1e-40, R treats it as 0 and the calculation fails. I also can't reduce the matrix dimensions, and I can't insert correction algorithms for near-singular matrices into my function, because R can't evaluate them while working inside optimization functions like optim. I really appreciate any suggestion for this problem.
Thanks in advance,
Maria D.
It isn't clear from your post whether the failure is coming from det() or solve().
If it's just the solve() in the quadratic term, you may want to try the two-argument version of solve(); it can be a bit more stable. solve(X, Y) is the same as solve(X) %*% Y.
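Applied to the loop in the question, that would look something like:
d <- Z[j,] - R
SOMA.t <- SOMA.t + sum(d * solve(SIGMA, d))   # d' SIGMA^{-1} d without forming the inverse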
If you can factor SIGMA using chol(), you get an upper-triangular matrix U such that t(U) %*% U == SIGMA. The determinant is then the squared product of the diagonal of U, so the log-determinant is 2*sum(log(diag(U))) (which avoids the underflow you see with det()), and you might try this for the quadratic term:
crossprod(backsolve(U, Z[j,]-R, transpose = TRUE))
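Putting both ideas together, a sketch of the objective rewritten around chol() might look like this (same variable names as in the question; the parameter-definition step is left as a placeholder). Note chol() will still complain if SIGMA is not numerically positive definite, but the log-determinant no longer underflows.
FLogV.chol <- function(P){
  # ... define R and SIGMA from P, as in the original function ...
  U <- chol(SIGMA)                       # upper triangular, t(U) %*% U == SIGMA
  logdet <- 2 * sum(log(diag(U)))        # log(det(SIGMA)) without underflow
  logC <- (C/2)*N*log(2*pi) + (1/2)*N*logdet
  SOMA.t <- 0
  for (j in 1:N){
    y <- backsolve(U, Z[j,] - R, transpose = TRUE)
    SOMA.t <- SOMA.t + sum(y^2)          # equals t(Z[j,]-R) %*% solve(SIGMA) %*% (Z[j,]-R)
  }
  logC + (1/2)*SOMA.t
}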
I've got a sparse Matrix in R that's apparently too big for me to run as.matrix() on (though it's not super-huge either). The as.matrix() call in question is inside the svd() function, so I'm wondering if anyone knows a different implementation of SVD that doesn't require first converting to a dense matrix.
The irlba package has a very fast SVD implementation for sparse matrices.
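Minimal usage, assuming your sparse matrix is called M and you want the leading 5 singular triplets:
library(irlba)
s <- irlba(M, nv = 5)    # s$d, s$u, s$v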
You can do a very impressive bit of sparse SVD in R using random projection as described in http://arxiv.org/abs/0909.4061
Here is some sample code:
# computes first k singular values of A with corresponding singular vectors
incore_stoch_svd = function(A, k) {
p = 10 # may need a larger value here
n = dim(A)[1]
m = dim(A)[2]
# random projection of A
Y = (A %*% matrix(rnorm((k+p) * m), ncol=k+p))
# the left part of the decomposition works for A (approximately)
Q = qr.Q(qr(Y))
# taking that off gives us something small to decompose
B = t(Q) %*% A
# decomposing B gives us singular values and right vectors for A
s = svd(B)
U = Q %*% s$u
# and then we can put it all together for a complete result
return (list(u=U, v=s$v, d=s$d))
}
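A quick sanity check of the sketch above against base svd(), using a dense low-rank-plus-noise matrix so the randomized approximation has an easy spectrum to capture (expect close but not identical values):
set.seed(1)
A <- tcrossprod(matrix(rnorm(200 * 5), 200, 5),
                matrix(rnorm(50 * 5),  50, 5)) +
     0.01 * matrix(rnorm(200 * 50), 200, 50)
approx <- incore_stoch_svd(A, k = 5)
exact  <- svd(A)
round(cbind(randomized = approx$d[1:5], exact = exact$d[1:5]), 3)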
So here's what I ended up doing. It's relatively straightforward to write a routine that dumps a sparse matrix (class dgCMatrix) to a text file in SVDLIBC's "sparse text" format, then call the svd executable, and read the three resultant text files back into R.
The catch is that it's pretty inefficient - it takes me about 10 seconds to read & write the files, but the actual SVD calculation takes only about 0.2 seconds or so. Still, this is of course way better than not being able to perform the calculation at all, so I'm happy. =)
rARPACK is the package you need. It works like a charm and is very fast, since the heavy lifting is done in compiled C/C++ code.
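Minimal usage, again assuming your sparse matrix is called M and you want the leading 5 singular triplets:
library(rARPACK)
s <- svds(M, k = 5)      # s$d, s$u, s$v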