How do I find the matrix of the linear transformation? - linear-algebra

Going through the text on Linear Algebra by A. O. Morris (2nd edition) I am trying to understand something on linear transformations.
There is a problem where the R-bases of U and V are given as
{u1, u2} and {v1,v2,v3} respectively and the linear transformation from U to V is given by
Tu1=v1+2v2-v3
Tu2=v1-v2
The problem is to
a) find the matrix of T relative to these bases,
b) the matrix relative to the R-bases
{-u1+u2,2u1-u2} and {v1,v1+v2,v1+v2+v3},
and c) the relationship between the two matrices.
From the very good treatment of the subject here https://math.stackexchange.com/questions/12383/determine-the-matrix-relative-to-a-given-basis I figured out that the first matrix T has columns
(1,2,-1),(1,-1,0).
then for part b) I figure out Matrix A which I take to be the transform of ordered basis of U to standard basis of U as
((-1,2),(2,1)}
Matrix B which I take to be the transform of ordered basis of V to standard basis of V as
{(1,1,1),(0,1,1),(0,0,1)}
I then find inverse of B and form the product
[B]inv.C.A as the answer to part c).
Somehow I do not seem to get it. I am completely at sea. I would appreciate help to understand this.
My sincere apologies for not being able to use Latex.

Related

Questions about SVD, Singular Value Decomposition

I am not a mathematician, so I need to understand what SVD does and WHY more than how it works exactly from the math perspective. (I understand at least what is the decomposition though).
This guy on youtube gave the only human explanation of SVD saying, that the U matrix maps "user to concept correlation" Sigma matrix defines the strength of each concept, and V maps "movie to concept correlation" given that initial matrix M has users in the rows, and movie (ratings) in the columns.
He also mentioned two concept specifically "sci fi" and "romance" movies. See the picture below.
My questions are:
How SVD knows the number of concepts. He as human mentioned two - sci fi, and romance, but in reality in resulting matrices are 3 concepts. (for example matrix U - that one with blue titles - has 3 columns not 2).
How SVD knows what is the concept after all. I mean, what If i shuffle the columns randomly how SVD then knows what is sci fi, what is romance. I mean, I suppose there is no rule, group the concepts together in the column order. What if scifi movie is the first and last one? and not first 3 columns in the initial matrix M?
What is the practical usage of either U, Sigma or V matrices? (Except that you can multiply them to get the initial matrix M)
Is there also any other possible human explanation of SVD than the guy up provided, or it is the only one possible function? Matrices of correlations.
As was pointed out in the comments you may well get better explanations elsewhere. However since the question is still open, here is my tuppence worth.
Throughout I'll suppose that A is mxn where m>=n, ie that A has more rows than columns.
First of all there are many forms of the SVD, differing in the sizes of the matrices. They all share the fundamental properties that
A = U*S*V'
S is diagonal
U and V have orthogonal columns (ie U'*U = I, V'*V = I)
Perhaps the most useful from a theoretical point of view is the 'full fat' svd where we have that U is mxm, S is mxn and V is nxn. However this has rather a lot of elements that don't really contribute to A. For example S being diagonal we can write
S = ( S1 ) (where S1 is nxn )
( 0 )
If we divide up U into
U = ( U1 U2) (where U1 is mxn and U2 is (mx(m-n)))
Then its straightforward to calculate that
U*S = U1*S1
and so we can throw away the last m-n columns of U and the last m-n rows of S, and still recover A.
Moreover some of the diagonal elements of S1 may be 0; suppose in fact that p<n of them are non zero. Then we can write
S1 = ( S2 0)
( 0 0)
And arguing as above for U and analogously for V' we can in fact throw away all but the first p columns of U and all of S but S2, and all but the first p rows of V, and still recover A.
This latter is the form of SVD ('thin') in your question:
U is mxp
S is pxp
V' is pxm
where p is the number of non-zero singular values of A. This is my answer to your 1.
By convention the elements of S decrease as you move down the diagonal. To achieve this the routine that calculates the svd in effect works with a version of A with shuffled columns. This shuffling is undone by incorporating the shuffle in the U and V' output. This is my answer to your 2: however you shuffle A, it will be in effect shuffled again to ensure that the singular values decrease down the diagonal.
I struggle to answer 3, because I suspect that our ideas of 'practical' are rather different.
One thing that I think practical is to find simpler approximations to A. The reconstruction of A can be written
A = Sum{ 1<=i<=p | U[i]*S[i]*V[i]' }
where the S[i] are the diagonal elements of S, U[i] are the columns of U and V[i] those of V
We might want to use a simpler model for A, for example want to simplify it down to just one term. That is, we might wonder how much we would lose by using fewer 'concepts'. The 'thin' svd above has already done this in the sense that it has thrown away all the coluns that make no contribution to A. In an extreme case, we might wonder what we would get if we reduced to just one concept. This approximation is found by taking just the first term of the sum above. This extends to however many terms -- q say -- we want to allow: we just take the first q terms of the sum above.
I'm sorry, I can't answer 4.

eigen() and the correct eigenvectors

My problem is the following:
I'm trying to use R in order to compute numerically this problem.
So I've correctly setup the problem in my console, and then I tried to compute the eigenvectors.
But I expect that the eigenvector associated with lambda = 1 is (1,2,1) instead of what I've got here. So, the scaling is correct (0.4082483 is effectively half of 0.8164966), but I would like to obtain a consistent result.
My original problem is to find a stationary distribution for a Markov Chain using R instead of doing it on paper. So from a probabilistic point of view, my stationary distribution is a vector whose sum of the components is equal to 1. For that reason I was trying to change the scale in order to obtain what I've defined "a consistent result".
How can I do that ?
The eigen vectors returned by R are normalized (for the square-norm). If V is a eigen vector then s * V is a eigen vector as well for any non-zero scalar s. If you want the stationary distribution as in your link, divide by the sum:
V / sum(V)
and you will get (1/4, 1/2, 1/4).
So:
ev <- eigen(t(C))$vectors
ev / colSums(ev)
to get all the solutions in one shot.
C <- matrix(c(0.5,0.25,0,0.5,0.5,0.5,0,0.25,0.5),
nrow=3)
ee <- eigen(t(C))$vectors
As suggested by #Stéphane Laurent in the comments, the scaling of eigenvectors is arbitrary; only the relative value is specified. The default in R is that the sum of squares of the eigenvectors (their norms) are equal to 1; colSums(ee^2) is a vector of 1s.
Following the link, we can see that you want each eigenvector to sum to 1.
ee2 <- sweep(ee,MARGIN=2,STATS=colSums(ee),FUN=`/`)
(i.e., divide each eigenvector by its sum).
(This is a good general solution, but in this case the sum of the second and third eigenvectors are both approximately zero [theoretically, they are exactly zero], so this only really makes sense for the first eigenvector.)

Solve matrix equation with eigen

I want to solve the equation A*B=C for A, with Eigen. A is a (3x3) matrix, B and C are (3x25) matrices.
Currently I would compute the pseudoinverse of B and multiply it from right to C. Is there a better way?
If found an easy solution myself: just transpose the whole equation (B^T*A^T=C^T) and solve it with an ordinary linear solver!
edit: it actually does not perfectly do the job, see this link

Simple Orthographic Structure from Motion using R -- Determining Metric Constraints

I would like to build a simple structure from motion program according to Tomasi and Kanade [1992]. The article can be found below:
https://people.eecs.berkeley.edu/~yang/courses/cs294-6/papers/TomasiC_Shape%20and%20motion%20from%20image%20streams%20under%20orthography.pdf
This method seems elegant and simple, however, I am having trouble calculating the metric constraints outlined in equation 16 of the above reference.
I am using R and have outlined my work thus far below:
Given a set of images
I want to track the corners of the three cabinet doors and the one picture (black points on images). First we read in the points as a matrix w where
Ultimately, we want to factorize w into a rotation matrix R and shape matrix S that describe the 3 dimensional points. I will spare as many details as I can but a complete description of the maths can be gleaned from the Tomasi and Kanade [1992] paper.
I supply w below:
w.vector=c(0.2076,0.1369,0.1918,0.1862,0.1741,0.1434,0.176,0.1723,0.2047,0.233,0.3593,0.3668,0.3744,0.3593,0.3876,0.3574,0.3639,0.3062,0.3295,0.3267,0.3128,0.2811,0.2979,0.2876,0.2782,0.2876,0.3838,0.3819,0.3819,0.3649,0.3913,0.3555,0.3593,0.2997,0.3202,0.3137,0.31,0.2718,0.2895,0.2867,0.825,0.7703,0.742,0.7251,0.7232,0.7138,0.7345,0.6911,0.1937,0.1248,0.1723,0.1741,0.1657,0.1313,0.162,0.1657,0.8834,0.8118,0.7552,0.727,0.7364,0.7232,0.7288,0.6892,0.4309,0.3798,0.4021,0.3965,0.3844,0.3546,0.3695,0.3583,0.314,0.3065,0.3989,0.3876,0.3857,0.3781,0.3989,0.3593,0.5184,0.4849,0.5147,0.5193,0.5109,0.4812,0.4979,0.4849,0.3536,0.3517,0.4121,0.3951,0.3951,0.3781,0.397,0.348,0.5175,0.484,0.5091,0.5147,0.5128,0.4784,0.4905,0.4821,0.7722,0.7326,0.7326,0.7232,0.7232,0.7119,0.7402,0.7006,0.4281,0.3779,0.3918,0.3863,0.3825,0.3472,0.3611,0.3537,0.8043,0.7628,0.7458,0.7288,0.727,0.7213,0.7364,0.6949,0.5789,0.5491,0.5761,0.5817,0.5733,0.5444,0.5537,0.5379,0.3649,0.3536,0.4177,0.3951,0.3857,0.3819,0.397,0.3461,0.697,0.671,0.6821,0.6821,0.6719,0.6412,0.6468,0.6235,0.3744,0.3649,0.4159,0.3819,0.3781,0.3612,0.3763,0.314,0.7008,0.6691,0.6794,0.6812,0.6747,0.6393,0.6412,0.6235,0.7571,0.7345,0.7439,0.7496,0.7402,0.742,0.7647,0.7213,0.5817,0.5463,0.5696,0.5779,0.5761,0.5398,0.551,0.5398,0.7665,0.7326,0.7439,0.7345,0.7288,0.727,0.7515,0.7062,0.8301,0.818,0.8571,0.8878,0.8766,0.8561,0.858,0.8394,0.4121,0.3876,0.4347,0.397,0.38,0.3631,0.3668,0.2971,0.912,0.8962,0.9185,0.939,0.9259,0.898,0.8887,0.8571,0.3989,0.3781,0.4215,0.3725,0.3612,0.3461,0.3423,0.2782,0.9092,0.8952,0.9176,0.9399,0.925,0.8971,0.8887,0.8571,0.4743,0.4536,0.4894,0.4517,0.446,0.4328,0.4385,0.3706,0.8273,0.8171,0.8571,0.8878,0.8766,0.8543,0.8561,0.8394,0.4743,0.4554,0.4969,0.4668,0.4536,0.4404,0.4536,0.3857)
w=matrix(w.vector,ncol=16,nrow=16,byrow=FALSE)
Then create registered measurement matrix wm according to equation 2 as
by
wm = w - rowMeans(w)
We can decompose wm into a '2FxP' matrix o1 a diagonal 'PxP' matrix e and 'PxP' matrix o2 by using a singular value decomposition.
svdwm <- svd(wm)
o1 <- svdwm$u
e <- diag(svdwm$d)
o2 <- t(svdwm$v) ## dont forget the transpose!
However, because of noise, we only pay attention to the first 3 columns of o1, first 3 values of e and the first 3 rows of o2 by:
o1p <- svdwm$u[,1:3]
ep <- diag(svdwm$d[1:3])
o2p <- t(svdwm$v)[1:3,] ## dont forget the transpose!
Now we can solve for our rhat and shat in equation (14)
by
rhat <- o1p%*%ep^(1/2)
shat <- ep^(1/2) %*% o2p
However, these results are not unique and we still need to solve for R and S by equation (15)
by using the metric constraints of equation (16)
Now I need to find Q. I believe there are two potential methods but am unclear how to employ either.
Method 1 involves solving for B where B=Q%*%solve(Q) then using Cholesky decomposition to find Q. Method 1 appears to be the common choice in literature, however, little detail is given as to how to actually solve the linear system. It is apparent that B is a '3x3' symmetric matrix of 6 unknowns. However, given the metric constraints (equations 16), I don't know how to solve for 6 unknowns given 3 equations. Am I forgetting a property of symmetric matrices?
Method II involves using non-linear methods to estimate Q and is less commonly used in structure from motion literature.
Can anyone offer some advice as to how to go about solving this problem? Thanks in advance and let me know if I need to be more clear in my question.
can be written as .
can be written as .
can be written as .
so our equations are:
So the first equation can be written as:
which is equivalent to
To keep it short we define now:
(I know the spacings are terrably small, but yes, this is a Vector...)
So for all equations in all different Frames f, we can write one big equation:
(sorry for the ugly formulas...)
Now you just need to solve the -Matrix using Cholesky decomposition or whatever...

Decompose complex matrix transformation into a series of simple transformations?

I wonder if it is possible (and if it is then how) to re-present an arbitrary M3 matrix transformation as a sequence of simpler transformations (such as translate, scale, skew, rotate)
In other words: how to calculate MTranslate, MScale, MRotate, MSkew matrices from the MComplex so that the following equation would be true:
MComplex = MTranslate * MScale * MRotate * MSkew (or in an other order)
Singular Value Decomposition (see also this blog and this PDF). It turns an arbitrary matrix into a composition of 3 matrices: orthogonal + diagonal + orthogonal. The orthogonal matrices are rotation matrices; the diagonal matrix represents skewing along the primary axes = scaling.
The translation throws a monkey wrench into the game, but what you should do is take out the translation part of the matrix so you have a 3x3 matrix, run SVD on that to give you the rotation+skewing, then add the translation part back in. That way you'll have a rotation + scale + rotation + translate composition of 4 matrices. It's probably possible to do this in 3 matrices (rotation + scaling along some set of axes + translation) but I'm not sure exactly how... maybe a QR decomposition (Q = orthogonal = rotation, but I'm not sure if the R is skew-only or has a rotational part.)
Yes, but the solution will not be unique. Also you should rather put translation at the end (the order of the rest doesn't matter)
For any given square matrix A there exists infinitely many matrices B and C so that A = B*C. Choose any invertible matrix B (which means that B^-1 exists or det(B) != 0) and now C = B^-1*A.
So for your solution first decompose MC into MT and MS*MR*MSk*I, choosing MT to be some invertible transposition matrix. Then decompose the rest into MS and MR*MSk*I so that MS is arbitrary scaling matrix. And so on...
Now if at the end of the fun I is an identity matrix (with 1 on diagonal, 0 elsewhere) you're good. If it is not, start over, but choose different matrices ;-)
In fact, using the method above symbolically you can create set of equations that will yield you a parametrized formulas for all of these matrices.
How useful these decompositions would be for you, well - that's another story.
If you type this into Mathematica or Maxima they'll compute this for you in no time.

Resources