I have two tensors of rank 3 each, in other words two 3D matrices. I want to take the dot product of these two matrices, but I am confused about how to proceed. Help me out with a formula to do so.
A 3-way tensor (equivalently, a 3D array or 3rd-order array) is not necessarily of rank 3. Here, the "rank" of a tensor means the minimum number of rank-1 tensors (i.e. outer products of vectors; for an N-way tensor, the outer product of N vectors) needed to reconstruct the original tensor. This is the idea behind the so-called CP decomposition.
In CP decomposition, the original tensor $\mathcal{X}$ is written as a sum of R rank-1 tensors, where R is a positive integer, and we aim to find the minimum R that yields the original tensor. This minimum R is called the rank of the tensor.
For a 3-way tensor, it is the minimum number of vector triples (a1, b1, c1), ..., (aR, bR, cR) required to obtain the original tensor. The tensor can then be written as a sum of outer products of these vectors:

$$\mathcal{X} \approx \sum_{r=1}^{R} a_r \circ b_r \circ c_r$$
Element-wise, we can write the 3-way tensor as:

$$x_{ijk} \approx \sum_{r=1}^{R} a_{ir}\, b_{jr}\, c_{kr}$$
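If it helps to see this concretely, here is a minimal NumPy sketch that rebuilds a 3-way tensor from R such vector triples (the sizes and the rank R are purely illustrative, not from the question):

    import numpy as np

    I, J, K, R = 3, 4, 5, 2               # illustrative sizes; R plays the role of the rank
    A = np.random.rand(I, R)              # columns are the vectors a_1 ... a_R
    B = np.random.rand(J, R)              # columns are the vectors b_1 ... b_R
    C = np.random.rand(K, R)              # columns are the vectors c_1 ... c_R

    # x_ijk = sum_r a_ir * b_jr * c_kr  (the element-wise CP formula above)
    X = np.einsum('ir,jr,kr->ijk', A, B, C)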
Now, with that background, to answer your specific question: to take the dot product (also called the tensor inner product), both tensors must be of the same shape (e.g. 3x2x5 and 3x2x5). The inner product is then defined as the sum of the element-wise products of their values:

$$\langle \mathcal{X}, \mathcal{Y} \rangle = \sum_{i}\sum_{j}\sum_{k} x_{ijk}\, y_{ijk}$$

where the script $\mathcal{X}$ and $\mathcal{Y}$ are the same-shape tensors.
P.S.: The $\approx$ in the above formulae should not be interpreted as an approximation.
The vector inner product sums the element-wise products, and the tensor inner product follows the same idea: match the elements, multiply them, and add them all.
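For reference, a minimal NumPy sketch of the tensor inner product described above (the shapes are just the 3x2x5 example from the answer):

    import numpy as np

    X = np.random.rand(3, 2, 5)
    Y = np.random.rand(3, 2, 5)           # must have the same shape as X

    # Match the elements, multiply them, and add them all
    inner = np.sum(X * Y)

    # Equivalent formulations
    assert np.isclose(inner, np.tensordot(X, Y, axes=3))
    assert np.isclose(inner, np.einsum('ijk,ijk->', X, Y))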
Say I have two matrices A and B. I want to compute the diagonal elements of the matrix product A * B and place them in a pre-allocated vector result.
Is there a BLAS (or similar) routine to do this as fast as possible?
There is no specific routine for that. However, you can use the following definition of matrix multiplication.
Consider C = AB, and let $a_{ij}$, $b_{ij}$, $c_{ij}$ denote the (i,j)-th elements of the corresponding matrices. Without loss of generality, I will assume that A, B, and C are all N x N dense matrices.
Then,

$$c_{ij} = \sum_{k=0}^{N-1} a_{ik}\, b_{kj}$$
Since you are interested only in the diagonal entries:

$$c_{ii} = \sum_{k=0}^{N-1} a_{ik}\, b_{ki}, \qquad i = 1, \dots, N$$
In other words, to calculate the i-th diagonal entry of matrix C you need to take the dot product between the i-th row of matrix A and the i-th column of matrix B. That can be achieved using the BLAS level-1 dot-product function ?dot.
    res = ?dot(n, x, incx, y, incy)
Let's assume that matrices A and B are stored column-wise and are accessible via pointers *A and *B (each holding N*N values), while *C is preallocated storage for the diagonal entries of matrix C (holding N values).
The following loop should give you the diagonal:
    for (int i = 0; i < N; i++)
    {
        /* i-th row of A (stride N) dotted with the i-th column of B (stride 1) */
        C[i] = ?dot(N, &A[i], N, &B[i*N], 1);
    }
Notice that we access the i-th row of matrix A by passing the address of the first element of that row, &A[i], with an increment (incx) of N. In contrast, to access the i-th column of matrix B we pass the address of the first element of that column, &B[i*N], with an increment of 1.
Notes:

- If A, B, and C have different (but consistent with matrix multiplication) dimensions, only slight modifications are needed.
- If the matrices are stored row-wise, the call to ?dot has to be changed slightly.
- The pseudocode above uses a generic ?dot function. In practice it will be sdot or ddot for single- and double-precision real numbers, and cdotu or zdotu for single- and double-precision complex numbers, respectively.
- Is this the most efficient, most cache-friendly implementation? Probably not, but it would surprise me if it became a bottleneck in an algorithm where the N x N matrices A and B have been explicitly computed anyway.
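For comparison, if you happen to be working in NumPy rather than calling BLAS directly, the same row-times-column idea can be expressed in one line (a sketch, not a drop-in replacement for the loop above):

    import numpy as np

    N = 4
    A = np.random.rand(N, N)
    B = np.random.rand(N, N)

    # Diagonal of A @ B without forming the full product:
    # entry i is the dot product of row i of A with column i of B
    diag = np.einsum('ik,ki->i', A, B)

    assert np.allclose(diag, np.diag(A @ B))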
I have a double vector:
r = -50 + (50+50)*rand(10,1)
Now I want all the numbers in the vector to be equal, ideally up to a tolerance of say 1e-4. I want to represent each r by a scalar, say s(r), whose value gives an idea of the quality of the vector: the vector is high quality if all of its elements are nearly equal. I can easily run a for loop like
    for i = 1:10
        for j = i+1:10
            % check equality up to the tolerance
        end
    end
But even then I cannot figure out what computation to do inside the nested for loops to assign a scalar representing the quality. Is there a better way such that, given any vector r of length n, I can quickly calculate a scalar representing the quality of the vector?
Your double-loop algorithm is somewhat slow, of order O(n^2), where n is the length of the vector. Here is a quick way to measure the closeness of the vector elements that runs in O(n), i.e. a single pass through the elements.
Find the maximum and the minimum of the vector elements. Just use two variables to store the maximum and minimum so far and run once through all the elements. The difference between the maximum and the minimum is called the range of the values, a commonly accepted measure of dispersion. If the values are exactly equal, the range is zero, which indicates perfect quality. If the range is below 1e-4, the vector is of acceptable quality. The bigger the range, the lower the quality.
The code is obvious for just about any given language, so I'll leave that to you. If the fact that the range only really considers the two extreme values of the vector bothers you, you could use other measures of variation such as the interquartile range, variance, or standard deviation. But the range seems to best fit what you request.
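In case a concrete reference is useful anyway, here is a minimal NumPy sketch of the range-based quality check (the function name and return values are my own choices):

    import numpy as np

    def quality(r, tol=1e-4):
        # Single pass: the range (max minus min) is zero when all elements are equal
        spread = np.max(r) - np.min(r)
        return spread, spread <= tol

    r = -50 + 100 * np.random.rand(10)    # analogous to the MATLAB vector in the question
    spread, acceptable = quality(r)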
My prof introduced a concept that required the use of a vector, which he represented as follows:
$$v = \begin{bmatrix} -1/2 \\ 1/2 \end{bmatrix}$$
One of my personal weaknesses is a lack of familiarity with mathematical notation. Is there an accepted way of interpreting this kind of notation? Does it vary by discipline, or is this something generalizable that I really should know? Is there something intrinsic about this notation that would lead one to interpret it differently than if it were written v = [-1/4, 1/4]?
Thanks for the help!
A vector is a one-dimensional matrix, but it is a matrix nonetheless. Writing it horizontally instead of vertically (or vice versa) changes its shape, a 1 x n row versus an n x 1 column, and therefore changes how it combines with the rest of the equations.
Very often you will "transform" a vector by multiplying it by a matrix. For instance, to rotate a vector you multiply it by a rotation matrix. If your vectors are written as columns, a matrix M acts from the left, M * v, because of the way multiplication works (each row of M times the column vector v). Alternatively, if your vectors are written as rows (v = [-1/4, 1/4]), the matrix acts from the right, v * M, again because of the row-by-column definition of matrix multiplication.
So, it is up to you to represent vectors as rows or columns provided your convention is consistent with the way you multiply them by matrices.
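A small NumPy sketch of the two conventions (the rotation matrix is just an illustrative transform, not from the question):

    import numpy as np

    M = np.array([[0.0, -1.0],
                  [1.0,  0.0]])           # example transform: 90-degree rotation

    v_col = np.array([[-0.5],
                      [ 0.5]])            # 2x1 column vector, as in the question
    v_row = np.array([[-0.25, 0.25]])     # 1x2 row vector, as in v = [-1/4, 1/4]

    Mv = M @ v_col                        # column convention: M acts from the left
    vM = v_row @ M                        # row convention: M acts from the right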
How can I turn a regular matrix into a full-rank matrix in R? Is there an available method for that?
I have a matrix that may have linearly dependent columns and I need to pass it to a function that requires its argument to be a matrix with full rank. Since linearly dependent columns are not of interest anyway, I am looking for a function that removes such columns until the matrix is full rank. There may be several solutions of course, but any one of them should be fine.
Right now I am just constructing the matrix column by column and only adding a column if the resulting matrix is still full rank, but it feels like there should be a better way to do this.
Another approach is to minimize $\|y - Ax\|^2 + c\,\|x\|^2$ by tacking a scaled identity matrix ($\sqrt{c}\,I$) onto A and zeros onto y. The parameter c (a.k.a. λ) trades off fitting y - Ax against keeping $\|x\|$ small. Then run a second fit with the r largest components of x, where r = rank(A) (or any number you please).
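A minimal NumPy sketch of that augmentation trick (the question is about R, but the idea is language-agnostic; the function name is my own):

    import numpy as np

    def ridge_fit(A, y, c):
        # min ||y - A x||^2 + c ||x||^2  via an ordinary least-squares solve
        n = A.shape[1]
        A_aug = np.vstack([A, np.sqrt(c) * np.eye(n)])    # tack sqrt(c) * I onto A
        y_aug = np.concatenate([y, np.zeros(n)])          # tack zeros onto y
        x, *_ = np.linalg.lstsq(A_aug, y_aug, rcond=None)
        return x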
How can I get the non-normalized refracted output vector when the incident vector is also non-normalized?
I'm following these formulas; they work with normalized input, but if I pass a non-normalized vector they don't. I tried dividing the dot product by the input vector's length, but that didn't work either.
Wikipedia: Snell's law, vector form
If you divide the dot product by the incident vector's length, then your thetas will be correct. After that, if you multiply n by the incident vector's length, then your v_reflected and v_refracted vectors will be correct (they will have the same length as the incident vector).
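A minimal NumPy sketch of that adjustment, assuming the surface normal is unit length and n1, n2 are the refractive indices (the function name is my own):

    import numpy as np

    def refract(incident, normal, n1, n2):
        L = np.linalg.norm(incident)              # length of the (non-normalized) incident vector
        r = n1 / n2
        cos_i = -np.dot(normal, incident) / L     # divide the dot product by the incident length
        sin_t2 = r * r * (1.0 - cos_i * cos_i)
        if sin_t2 > 1.0:
            return None                           # total internal reflection
        cos_t = np.sqrt(1.0 - sin_t2)
        # scale the normal term by L so the output keeps the incident vector's length
        return r * incident + (r * cos_i - cos_t) * (L * normal)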