Simple Orthographic Structure from Motion using R -- Determining Metric Constraints

I would like to build a simple structure from motion program according to Tomasi and Kanade [1992]. The article can be found below:
https://people.eecs.berkeley.edu/~yang/courses/cs294-6/papers/TomasiC_Shape%20and%20motion%20from%20image%20streams%20under%20orthography.pdf
This method seems elegant and simple; however, I am having trouble calculating the metric constraints outlined in equation (16) of the above reference.
I am using R and have outlined my work thus far below:
Given a set of images of a scene, I want to track the corners of the three cabinet doors and the one picture (the black points on the images). First we read in the tracked points as a 2FxP matrix w, where F is the number of frames and P the number of tracked points.
Ultimately, we want to factorize w into a rotation matrix R and shape matrix S that describe the 3 dimensional points. I will spare as many details as I can but a complete description of the maths can be gleaned from the Tomasi and Kanade [1992] paper.
I supply w below:
w.vector=c(0.2076,0.1369,0.1918,0.1862,0.1741,0.1434,0.176,0.1723,0.2047,0.233,0.3593,0.3668,0.3744,0.3593,0.3876,0.3574,0.3639,0.3062,0.3295,0.3267,0.3128,0.2811,0.2979,0.2876,0.2782,0.2876,0.3838,0.3819,0.3819,0.3649,0.3913,0.3555,0.3593,0.2997,0.3202,0.3137,0.31,0.2718,0.2895,0.2867,0.825,0.7703,0.742,0.7251,0.7232,0.7138,0.7345,0.6911,0.1937,0.1248,0.1723,0.1741,0.1657,0.1313,0.162,0.1657,0.8834,0.8118,0.7552,0.727,0.7364,0.7232,0.7288,0.6892,0.4309,0.3798,0.4021,0.3965,0.3844,0.3546,0.3695,0.3583,0.314,0.3065,0.3989,0.3876,0.3857,0.3781,0.3989,0.3593,0.5184,0.4849,0.5147,0.5193,0.5109,0.4812,0.4979,0.4849,0.3536,0.3517,0.4121,0.3951,0.3951,0.3781,0.397,0.348,0.5175,0.484,0.5091,0.5147,0.5128,0.4784,0.4905,0.4821,0.7722,0.7326,0.7326,0.7232,0.7232,0.7119,0.7402,0.7006,0.4281,0.3779,0.3918,0.3863,0.3825,0.3472,0.3611,0.3537,0.8043,0.7628,0.7458,0.7288,0.727,0.7213,0.7364,0.6949,0.5789,0.5491,0.5761,0.5817,0.5733,0.5444,0.5537,0.5379,0.3649,0.3536,0.4177,0.3951,0.3857,0.3819,0.397,0.3461,0.697,0.671,0.6821,0.6821,0.6719,0.6412,0.6468,0.6235,0.3744,0.3649,0.4159,0.3819,0.3781,0.3612,0.3763,0.314,0.7008,0.6691,0.6794,0.6812,0.6747,0.6393,0.6412,0.6235,0.7571,0.7345,0.7439,0.7496,0.7402,0.742,0.7647,0.7213,0.5817,0.5463,0.5696,0.5779,0.5761,0.5398,0.551,0.5398,0.7665,0.7326,0.7439,0.7345,0.7288,0.727,0.7515,0.7062,0.8301,0.818,0.8571,0.8878,0.8766,0.8561,0.858,0.8394,0.4121,0.3876,0.4347,0.397,0.38,0.3631,0.3668,0.2971,0.912,0.8962,0.9185,0.939,0.9259,0.898,0.8887,0.8571,0.3989,0.3781,0.4215,0.3725,0.3612,0.3461,0.3423,0.2782,0.9092,0.8952,0.9176,0.9399,0.925,0.8971,0.8887,0.8571,0.4743,0.4536,0.4894,0.4517,0.446,0.4328,0.4385,0.3706,0.8273,0.8171,0.8571,0.8878,0.8766,0.8543,0.8561,0.8394,0.4743,0.4554,0.4969,0.4668,0.4536,0.4404,0.4536,0.3857)
w <- matrix(w.vector, ncol = 16, nrow = 16, byrow = FALSE)
Then create the registered measurement matrix wm according to equation (2) by subtracting the row means:
wm <- w - rowMeans(w)
We can decompose wm into a 2FxP matrix o1, a diagonal PxP matrix e, and a PxP matrix o2 by using a singular value decomposition:
svdwm <- svd(wm)
o1 <- svdwm$u
e <- diag(svdwm$d)
o2 <- t(svdwm$v) ## don't forget the transpose!
However, because of noise, we keep only the first 3 columns of o1, the first 3 singular values of e, and the first 3 rows of o2:
o1p <- svdwm$u[,1:3]
ep <- diag(svdwm$d[1:3])
o2p <- t(svdwm$v)[1:3,] ## don't forget the transpose!
Now we can solve for our rhat and shat of equation (14) by:
rhat <- o1p %*% ep^(1/2) ## element-wise power works here because ep is diagonal
shat <- ep^(1/2) %*% o2p
However, these results are not unique, and we still need to solve for the true R and S of equation (15) by using the metric constraints of equation (16).
Now I need to find Q. I believe there are two potential methods, but I am unclear how to employ either.
Method 1 involves solving for B, where B = Q %*% t(Q), then using a Cholesky decomposition to recover Q. Method 1 appears to be the common choice in the literature; however, little detail is given as to how to actually solve the linear system. It is apparent that B is a 3x3 symmetric matrix of 6 unknowns. However, given the metric constraints (equation 16), I don't know how to solve for 6 unknowns given 3 equations. Am I forgetting a property of symmetric matrices?
Method 2 involves using non-linear methods to estimate Q and is less commonly used in the structure-from-motion literature.
Can anyone offer some advice as to how to go about solving this problem? Thanks in advance and let me know if I need to be more clear in my question.

Write B = Q Q^T, a symmetric 3x3 matrix with six distinct entries. The metric constraints of equation (16) then read, for every frame f:

$$\hat{i}_f^T B \hat{i}_f = 1, \qquad \hat{j}_f^T B \hat{j}_f = 1, \qquad \hat{i}_f^T B \hat{j}_f = 0$$

For any vectors a and b, the quadratic form a^T B b can be written as

$$a^T B b = a_1 b_1 B_{11} + (a_1 b_2 + a_2 b_1) B_{12} + (a_1 b_3 + a_3 b_1) B_{13} + a_2 b_2 B_{22} + (a_2 b_3 + a_3 b_2) B_{23} + a_3 b_3 B_{33}$$

which is linear in the six unknown entries of B. To keep it short, we define now the row vector

$$g(a, b) = \big( a_1 b_1,\ a_1 b_2 + a_2 b_1,\ a_1 b_3 + a_3 b_1,\ a_2 b_2,\ a_2 b_3 + a_3 b_2,\ a_3 b_3 \big)$$

and the vector of unknowns l = (B_11, B_12, B_13, B_22, B_23, B_33)^T, so that a^T B b = g(a, b) l. So for all three equations in all different frames f, we can write one big linear system:

$$G l = c, \qquad G = \begin{pmatrix} g(\hat{i}_1, \hat{i}_1) \\ \vdots \\ g(\hat{i}_F, \hat{i}_F) \\ g(\hat{j}_1, \hat{j}_1) \\ \vdots \\ g(\hat{j}_F, \hat{j}_F) \\ g(\hat{i}_1, \hat{j}_1) \\ \vdots \\ g(\hat{i}_F, \hat{j}_F) \end{pmatrix}, \qquad c = (\underbrace{1, \dots, 1}_{2F},\ \underbrace{0, \dots, 0}_{F})^T$$

with G of size 3F x 6. This resolves the "6 unknowns, 3 equations" worry in the question: there are 3 equations per frame, i.e. 3F equations for 6 unknowns, so for F >= 2 the system is (over)determined and can be solved in the least-squares sense, l = (G^T G)^{-1} G^T c. Now you just need to rebuild B from l and factor it with a Cholesky decomposition to get Q (this requires B to be positive definite; with noisy data it may not be, which is where the nonlinear Method 2 comes in).
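In R, a minimal sketch of that solve might look like this (assuming, per the paper's convention, that the first F rows of rhat hold the i-hat vectors and the last F rows the j-hat vectors):

g <- function(a, b) {
  c(a[1]*b[1], a[1]*b[2] + a[2]*b[1], a[1]*b[3] + a[3]*b[1],
    a[2]*b[2], a[2]*b[3] + a[3]*b[2], a[3]*b[3])
}
nF <- nrow(rhat) / 2
ihat <- rhat[1:nF, ]
jhat <- rhat[(nF + 1):(2 * nF), ]
G <- rbind(t(sapply(1:nF, function(f) g(ihat[f, ], ihat[f, ]))),
           t(sapply(1:nF, function(f) g(jhat[f, ], jhat[f, ]))),
           t(sapply(1:nF, function(f) g(ihat[f, ], jhat[f, ]))))
cvec <- c(rep(1, 2 * nF), rep(0, nF))
l <- solve(t(G) %*% G, t(G) %*% cvec)  ## least-squares solution of G l = c
B <- matrix(c(l[1], l[2], l[3],
              l[2], l[4], l[5],
              l[3], l[5], l[6]), nrow = 3)
Q <- t(chol(B))  ## chol() returns U with B = U'U, so Q = t(U);
                 ## this fails if noise makes B non-positive-definite
R <- rhat %*% Q
S <- solve(Q) %*% shat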

Related

Solving system of ODEs in vector/matrix form in R (with deSolve?)

So I want to ask whether there's any way to define and solve a system of differential equations in R using matrix notation.
I know usually you do something like the following (note that a hyphenated name such as lotka-volterra is not a valid R identifier, and deSolve expects a function of the form func(t, state, parms)):
lotka_volterra <- function(t, y, parms) {
  with(as.list(c(y, parms)), {
    dx <- a*x + b*x*y
    dy <- d*x*y - c*y
    list(c(dx, dy))
  })
}
But I want to do
lotka_volterra <- function(t, x, parms) {
  with(parms, {
    dx <- x * (M %*% x) + v * x
    list(as.vector(dx))
  })
}
where x is a vector of length 2, M is a 2*2 matrix and v is a vector of length 2. I.e. I want to define the system of differential equations using matrix/vector notation.
This is important because my system is significantly more complex; I don't want to define 11 separate differential equations with 100+ parameters when I can instead define 1 differential equation with 1 matrix of interaction parameters and 1 vector of growth parameters.
I can define the function as above, but when it comes to using the ode() function from deSolve, it expects a parms argument passed as a named vector of parameters, which of course does not accept non-scalar values.
Is this at all possible in R with deSolve, or another package? If not, I'll look into using MATLAB or Python, though I don't know at present how it's done in either of those languages.
Many thanks,
H
With my low reputation (points), I apologize for posting this as an answer when it supposedly should be just a comment. Going back, have you tried this link? In addition, in an attempt to find an alternative solution to your problem, have you tried MANOPT, a MATLAB toolbox? It's actually open source, just like R. I encountered MANOPT in a paper whose problem boils down to solving a system of ODEs involving purely matrices.
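For what it's worth, deSolve does not actually require parms to be a named numeric vector: the parms object is passed through to your function unchanged, so a list holding a matrix and a vector appears to work. A minimal sketch with made-up parameter values:

library(deSolve)

## matrix form of the system: dx_i = x_i * (M x)_i + v_i * x_i
lv_matrix <- function(t, x, parms) {
  with(parms, list(as.vector(x * (M %*% x) + v * x)))
}

parms <- list(M = matrix(c(0, -0.1, 0.1, 0), 2, 2),  ## illustrative values only
              v = c(0.5, -0.5))
out <- ode(y = c(prey = 1, pred = 1), times = seq(0, 50, by = 0.1),
           func = lv_matrix, parms = parms)
head(out)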

Solve quadratic optimization with nonlinear constraints

I am trying to solve the following optimization problem with a norm inequality constraint:
Given time-series data for N stocks, I am trying to construct a portfolio weight vector to minimize the variance of the returns.
the objective function:

$$\min_w \; w^T \Sigma w \qquad \text{s.t.} \quad e_n^T w = 1, \quad \|w\| \le C$$

where $w$ is the vector of weights, $\Sigma$ is the covariance matrix, $e_n$ is a vector of ones, and $C$ is a constant. The second constraint, $\|w\| \le C$, is an inequality constraint on the 2-norm of the weights.
I tried using the nloptr() function, but it gives me an error: "Incorrect algorithm supplied". I'm not sure how to select the correct algorithm, and I'm also not sure whether this is the right way to handle this inequality constraint.
I am also open to using other functions as long as they solve this constraint.
Here is my attempted solution:
library(nloptr)

data <- replicate(4, rnorm(100))
N <- 4
C <- 1.5

## objective: portfolio variance w' Sigma w
fn <- function(x) {
  cov.Rt <- cov(data)
  as.numeric(t(x) %*% cov.Rt %*% x)
}

## equality constraint (= 0): weights sum to 1
eqn <- function(x) sum(x) - 1

## inequality constraint (<= 0): ||w|| <= C, i.e. w'w - C^2 <= 0
ineq <- function(x) sum(x^2) - C^2

lb <- rep(0, N)
ub <- rep(C, N)
x0 <- rep(1 / N, N)

## the original "NLOPT_LN_AUGLAG," (note the trailing comma inside the
## string) is what triggers the "incorrect algorithm" error; AUGLAG also
## needs a real subsidiary algorithm in local_opts
local_opts <- list(algorithm = "NLOPT_LN_COBYLA", xtol_rel = 1.0e-7)
opts <- list(algorithm = "NLOPT_LN_AUGLAG", xtol_rel = 1.0e-8,
             local_opts = local_opts)

sol1 <- nloptr(x0, eval_f = fn, eval_g_eq = eqn, eval_g_ineq = ineq,
               lb = lb, ub = ub, opts = opts)
This looks like a simple QP (Quadratic Programming) problem. It may be easier to use a QP solver instead of a general-purpose NLP (NonLinear Programming) solver (no need for derivatives, extra functions, etc.). R has a QP solver called quadprog. It is not totally trivial to set up a problem for quadprog, but here is a very similar portfolio example with complete R code showing how to solve it. It has the same objective (minimize risk), the same budget constraint, and the lower and upper bounds. The example just has an extra constraint that specifies a minimum required portfolio return.
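As a rough illustration of the quadprog setup (only the linear constraints; the norm constraint is quadratic and cannot be expressed in quadprog):

library(quadprog)

## min w' Sigma w  s.t.  sum(w) = 1, w >= 0
Sigma <- cov(replicate(4, rnorm(100)))
N <- ncol(Sigma)
Dmat <- 2 * Sigma                  ## quadprog minimizes (1/2) w'D w - d'w
dvec <- rep(0, N)
Amat <- cbind(rep(1, N), diag(N))  ## columns are constraints: t(Amat) %*% w >= bvec
bvec <- c(1, rep(0, N))
sol <- solve.QP(Dmat, dvec, Amat, bvec, meq = 1)  ## first constraint is an equality
sol$solution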
Actually I misread the question: the second constraint is $\|w\| \le C$. I think we can express the whole model as:

$$\min_w \; w^T \Sigma w \qquad \text{s.t.} \quad e^T w = 1, \quad w^T w \le C^2, \quad 0 \le w \le C$$

This actually looks like a convex model. I could solve it with "big" solvers like Cplex, Gurobi, and Mosek; these solvers support convex quadratically constrained problems. I also believe this can be formulated as a cone programming problem, opening up more possibilities.
Here is an example where I use the package cccp in R. cccp stands for Cone Constrained Convex Problems and is a port of CVXOPT.
The 2-norm of the weights doesn't make sense here; it has to be the 1-norm. This is essentially a constraint on the leverage of the portfolio: 1-norm(w) <= 1.6 implies that the portfolio is at most 130/30 (sorry for using finance language here). You want to read about quadratic cones, though: w'Σw = w'L'Lw (with Σ = L'L by Cholesky decomposition), and hence w'Σw = ||Lw||². So you can introduce the linear constraint y - Lw = 0 and the condition t >= ||y|| (this defines a quadratic cone), and then minimize t. The 1-norm can also be replaced by cones, since abs(x_i) = sqrt(x_i²) = 2-norm(x_i); so introduce a quadratic cone for each element of the vector x.
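Spelled out, the cone formulation of the variance term sketched above (with L the Cholesky factor of the covariance matrix, so that $\Sigma = L^T L$) would be:

$$\min_{w,\,y,\,t} \; t \qquad \text{s.t.} \quad e^T w = 1, \quad y = L w, \quad \|y\|_2 \le t$$

with the leverage restriction added as further cone constraints on the elements of $w$, as described.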


How do I find the matrix of the linear transformation?

Going through the text on Linear Algebra by A. O. Morris (2nd edition) I am trying to understand something on linear transformations.
There is a problem where the R-bases of U and V are given as $\{u_1, u_2\}$ and $\{v_1, v_2, v_3\}$ respectively, and the linear transformation T from U to V is given by

$$T u_1 = v_1 + 2 v_2 - v_3, \qquad T u_2 = v_1 - v_2.$$

The problem is to
a) find the matrix of T relative to these bases,
b) find the matrix relative to the R-bases $\{-u_1 + u_2,\; 2u_1 - u_2\}$ and $\{v_1,\; v_1 + v_2,\; v_1 + v_2 + v_3\}$,
and c) give the relationship between the two matrices.
From the very good treatment of the subject here https://math.stackexchange.com/questions/12383/determine-the-matrix-relative-to-a-given-basis I figured out that the first matrix, call it C, has columns (1, 2, -1) and (1, -1, 0).
Then for part b) I figure out a matrix A, which I take to be the transform from the new ordered basis of U to the standard basis of U, with rows (-1, 2) and (2, 1), and a matrix B, which I take to be the transform from the new ordered basis of V to the standard basis of V, with rows (1, 1, 1), (0, 1, 1), (0, 0, 1).
I then find the inverse of B and form the product $B^{-1} C A$ as the answer to part c).
Somehow I do not seem to get it; I am completely at sea. I would appreciate help understanding this.
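For reference, here is a worked sketch (assuming, as usual, that a change-of-basis matrix has the new basis vectors' coordinates as its columns, so that A has columns (-1, 1) and (2, -1), which differs from the A in the attempt above):

$$A = \begin{pmatrix} -1 & 2 \\ 1 & -1 \end{pmatrix}, \quad B = \begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{pmatrix}, \quad C = \begin{pmatrix} 1 & 1 \\ 2 & -1 \\ -1 & 0 \end{pmatrix},$$

$$B^{-1} C A = \begin{pmatrix} 1 & -1 & 0 \\ 0 & 1 & -1 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ -3 & 5 \\ 1 & -2 \end{pmatrix} = \begin{pmatrix} 3 & -4 \\ -4 & 7 \\ 1 & -2 \end{pmatrix},$$

which can be checked directly: $T(-u_1 + u_2) = -3 v_2 + v_3 = 3 v_1 - 4 (v_1 + v_2) + (v_1 + v_2 + v_3)$.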

higher order linear regression

I have the matrix system:
A x B = C
where A is a-by-n and B is n-by-b. Both A and B are unknown, but I have partial information about C (I have some values in it, but not all), and n is picked to be small enough that the system is expected to be overconstrained. It is not required that all rows in A or columns in B are overconstrained.
I'm looking for something like least-squares linear regression to find a best fit for this system. (Note: I know there will not be a single unique solution; all I want is one of the best solutions.)
To make a concrete example: all the a's and b's are unknown, all the c's are known, and the ?'s are ignored. I want to find a least-squares solution taking only the known c's into account.
[ a11, a12 ]                                    [ c11, c12, c13, c14,  ?  ]
[ a21, a22 ]   [ b11, b12, b13, b14, b15 ]      [ c21, c22, c23, c24, c25 ]
[ a31, a32 ] x [ b21, b22, b23, b24, b25 ]  ~=  [ c31, c32, c33,  ? , c35 ]
[ a41, a42 ]                                    [  ? ,  ? , c43, c44, c45 ]
[ a51, a52 ]                                    [ c51, c52, c53, c54, c55 ]
Note that if B is trimmed to b11 and b21 only, and the row with unknowns is dropped, then this is almost a standard least-squares linear regression problem.
This problem is ill-posed as described.
Let A and B be scalars and C = 5. You are then asking to solve
a * b = 5,
which has an infinite number of solutions.
One approach, based on the information provided above, is to minimize the function g defined as

$$g(A, B) = \|AB - C\|_F^2 = \operatorname{trace}\big( (AB - C)^{*} (AB - C) \big)$$

(where $M^{*}$ is the transpose of M, multiplication is implicit, and the norm is the Frobenius norm) using Newton's method or a quasi-Newton approach such as BFGS. You can easily compute the gradient here. As this is an inherently nonlinear problem, standard linear algebra approaches do not apply.
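For reference (this derivation is added here, not part of the original answer), the gradients work out to:

$$\nabla_A g = 2 (AB - C) B^{*}, \qquad \nabla_B g = 2 A^{*} (AB - C).$$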
If you provide more information, I may be able to help more. I think the issue here is that without more information there is no "best solution"; we need a more concrete idea of what we are looking for. One idea could be the "sparsest" solution. This is a hot area of research, with some of the best minds in the world working on it (see the work of Terry Tao et al. on the nuclear norm). This problem, although tractable, is still hard.
Unfortunately, I am not yet able to comment, so I will add my comments here. As said below, LM (Levenberg-Marquardt) is a great approach to solving this and is just one approach along the lines of the Newton-type approaches to either the optimization problem or the nonlinear solving problem.
Here is an idea, using the example you gave above. Define two new vectors, V and U, each with 21 elements (exactly the number of defined elements in C). V is precisely the known elements of C, column-ordered, so (in MATLAB notation)
V = [C11; C21; C31; C51; C12; .... ; C55]
and U is a column ordering of the product AB, leaving out the elements corresponding to '?' in matrix C. Collecting all the variables into x, we have
x = [a11, a21, ..., a52, b11, b21, ..., b25]
and f(x) = U as defined above. We can now try to solve f(x) = V with your favorite nonlinear least-squares method.
As an aside, although a poster below recommended simulated annealing, I recommend against it. There are some problems where it works, but it is a heuristic. When you have powerful analytic methods such as Gauss-Newton or LM, I say use them (in my own experience, that is).
A wild guess: A singular value decomposition might do the trick?
I have no idea how to deal with your missing values, so I'm going to ignore that problem.
There are no unique solutions. To find a best solution, you need some sort of metric to judge them by. I'm going to suppose you want to use a least-squares metric, i.e. the best guess values of A and B are those that minimize the sum of the terms $[C_{ij} - (AB)_{ij}]^2$.
One thing you didn't mention is how to determine the value you are going to use for n. In short, we can come up with 'good' solutions if 1 <= n <= rank(C), where rank(C) is the dimension of the column space of C. Assuming a >= b, this gives 1 <= rank(C) <= b; more generally, 1 <= rank(C) <= min(a, b).
Now, supposing that you have chosen such an n, you will minimize the residual sum of squares if you choose the columns of A such that span(A) = span(first n left singular vectors of C). If you don't have any other good reasons, just choose the columns of A to be the first n left singular vectors of C. Once you have chosen A, you can get the values of B in the usual linear regression way: B = (A'A)^(-1) A'C.
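A minimal R sketch of that recipe (assuming, as this answer does, a fully known C and a pre-chosen n):

## rank-n least-squares fit C ~ A %*% B via the SVD
sv <- svd(C)
A <- sv$u[, 1:n, drop = FALSE]      ## first n left singular vectors
B <- solve(t(A) %*% A, t(A) %*% C)  ## usual regression step; t(A) %*% A is
                                    ## the identity here, so B = t(A) %*% C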
You have a couple of options. The Levenberg-Marquardt algorithm is generally recognized as the best LS method. A free implementation is available here. However, if the calculation is fast and you have a decent number of parameters, I would strongly suggest a Monte Carlo method such as simulated annealing.
You start with some set of parameters in the answer, and then you increase one of them by a random percentage up to a maximum. You then calculate the fitness function for your system. Now, here's the trick: you don't throw away the bad answers. You accept them with a Boltzmann probability distribution,
P = exp(-(E - E0)/T)
where T is a temperature parameter and E - E0 is the current fitness value minus the previous one. After a number of iterations, you decrease T by a fixed amount (this is called the cooling schedule) and repeat the process for another random parameter. As T decreases, fewer poor solutions are accepted, and eventually the procedure becomes a "greedy search" accepting only the solutions that improve the fit. If your system has many free parameters (> 10 or so), this is really the only way to go where you will have any chance of getting to a global minimum. This fitting method takes about 20 minutes to write in code, and a couple of hours to tweak. Hope this helps.
FYI, Wolfram has a nice discussion of this in the context of the traveling salesman problem, and I've been using it very successfully to solve some very difficult global minimization problems. It is slower than LM methods, but much better in most difficult/relatively large cases.
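A tiny R sketch of the acceptance step described above (anneal_step and the 10% tweak size are illustrative choices, not from the answer):

## one simulated-annealing move: tweak one random parameter and accept
## with Boltzmann probability (improvements, dE < 0, are always accepted)
anneal_step <- function(x, fitness, temp) {
  x_new <- x
  i <- sample(length(x), 1)
  x_new[i] <- x_new[i] * (1 + runif(1, -0.1, 0.1))
  dE <- fitness(x_new) - fitness(x)
  if (dE < 0 || runif(1) < exp(-dE / temp)) x_new else x
}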
Based on the realization that cutting B to a single column and then removing the rows with unknowns converts this to very nearly a known problem, one approach would be to (a sketch follows below):
1. seed A with random values;
2. solve for each column of B independently;
3. rework the problem to allow solving for each row of A given the B values from step 2;
4. repeat at step 2 until things settle out.
I have no clue if that is even stable.
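A rough R sketch of that loop (als_complete is a hypothetical helper; it assumes every row and every column of C keeps at least n known entries, so each small solve stays overconstrained):

## alternating least squares on the known entries of C
## C: matrix with NA for each '?', n: chosen inner dimension
als_complete <- function(C, n, iters = 100) {
  mask <- !is.na(C)
  a <- nrow(C); b <- ncol(C)
  A <- matrix(rnorm(a * n), a, n)    ## step 1: seed A with random values
  B <- matrix(0, n, b)
  for (it in seq_len(iters)) {
    for (j in seq_len(b)) {          ## step 2: each column of B independently
      r <- which(mask[, j])
      B[, j] <- qr.solve(A[r, , drop = FALSE], C[r, j])
    }
    for (i in seq_len(a)) {          ## step 3: each row of A given B
      k <- which(mask[i, ])
      A[i, ] <- qr.solve(t(B[, k, drop = FALSE]), C[i, k])
    }
  }                                  ## step 4: repeat until things settle out
  list(A = A, B = B)
}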
