Isabelle: difference between A * 1 and A ** mat 1

What is the difference between * and ** for matrices
and also between A * 1 and A ** mat 1?
Example:
lemma myexample:
fixes A :: "('a::comm_ring_1)^'n::finite^'n::finite"
shows "(A * 1 = A) ∧ (A ** (mat 1) = A)"
by (metis comm_semiring_1_class.normalizing_semiring_rules(12) matrix_mul_rid)

Matrices in Isabelle are defined simply as vectors of vectors, so the * on matrices is inherited from vectors, and * on vectors is just componentwise multiplication. Therefore, you have (A*B) $ i $ j = A $ i $ j * B $ i $ j, i.e. * is entry-by-entry multiplication of matrices. Whether this is actually useful anywhere, I do not know – I don't think so. It's probably just an artifact of defining matrices as vectors of vectors. It might have been better to do a proper typedef for matrices and define * as proper matrix multiplication on them, but there must have been some reason why that was not done – maybe just because it's more work and a lot of copy-pasted code.
** is the proper matrix multiplication. mat x is simply the matrix that has x on its diagonal and 0 everywhere else, so of course, mat 1 is the identity matrix and A ** mat 1 = A.
The matrix 1 however is, again, an artifact from the vector definition; the n-dimensional vector 1 is simply defined as the vector that has n components, all of which are 1. Consequently, the matrix 1 is the matrix whose entries are all 1, and then of course, A * 1 = A. This does not seem useful to me in any way.
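For readers more familiar with Julia than Isabelle, here is a small analogue of the same distinction (the matrix and its values are purely illustrative, not from the original question): the entrywise product with the all-ones matrix leaves A unchanged, while genuine matrix multiplication needs the identity matrix.

using LinearAlgebra

A = [1 2; 3 4]
A .* ones(Int, 2, 2) == A    # true:  entrywise product, the analogue of A * 1
A * Matrix(I, 2, 2) == A     # true:  matrix product with the identity, the analogue of A ** mat 1
A * ones(Int, 2, 2) == A     # false: matrix product with the all-ones matrix changes A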

Related

How to calculate matrix multiplication in which matrix is saved as vector

I have two symmetric matrices A, B and a vector X. The dimension of A is n by n, the dimension of B is n by n, the dimension of X is n by 1. Let the element at ith row and jth column of matrix A denoted by A[i,j].
Since A is symmetric, only each column of the upper triangular matrix of A is saved. The matrix A is saved as an array:
Vector_A = [A[1,1],
A[1,2], A[2,2],
A[1,3], A[2,3], A[3,3],
A[1,4], A[2,4], A[3,4], A[4,4],
...,
A[1,n], A[2,n], ..., A[n,n]]
The matrix B is saved in the same format as matrix A. Now I would like to calculate ABA without transforming Vector_A, Vector_B back to matrix A, B. Since ABA is also symmetric, I would like to save the ABA in the same way as an array. How can I do it in Julia?
I would also like to calculate X'AX without transforming Vector_A back to matrix A where X' denotes transpose(X). How can I do it in Julia?
You need to implement your own data structure that is a subtype of the AbstractMatrix type.
For example this could be done as:
struct SymmetricM{T} <: AbstractMatrix{T}
    data::Vector{T}
end
So we have a symmetric matrix that is using only a vector for its data storage.
Now you need to implement functions so it actually behaves like a matrix so you can let the Julia magic work.
We start by providing the size of our new matrix datatype.
function Base.size(m::SymmetricM)
    n = ((8*length(m.data)+1)^0.5-1)/2
    nr = round(Int, n)
    @assert n ≈ nr "The vector length must match the number of triangular matrix elements"
    (nr, nr)
end
In this code, nr will be calculated every time checkbounds is done on the matrix. Perhaps in your production implementation you might want to move it to be a field of SymmetricM. You would sacrifice some flexibility and store 8 more bytes, but would gain speed.
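As a quick sanity check of the formula above, a vector of 10 stored entries corresponds to a 4×4 matrix:

julia> size(SymmetricM(collect(1:10)))
(4, 4)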
Next we need a function to calculate the position in the data vector from a pair of matrix indices. Here is one possible implementation.
function getix(idx)::Int
    row, col = idx
    # assume access via the lower triangle; swap indices if an upper-triangle entry was requested
    if col > row
        row, col = col, row
    end
    (row-1)*row ÷ 2 + col
end
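As a quick check of the indexing (not part of the original answer), both (2,3) and (3,2) map to the fifth stored element, which is A[2,3] in the layout from the question:

julia> getix((2, 3)), getix((3, 2))
(5, 5)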
Having that, we can now implement the getindex and setindex! functions:
@inline function Base.getindex(m::SymmetricM, idx::Vararg{Int,2})
    @boundscheck checkbounds(m, idx...)
    m.data[getix(idx)]
end

@inline function Base.setindex!(m::SymmetricM{T}, v::T, idx::Vararg{Int,2}) where T
    @boundscheck checkbounds(m, idx...)
    m.data[getix(idx)] = v
end
Now let us test this thing:
julia> m = SymmetricM(collect(1:10))
4×4 SymmetricM{Int64}:
1 2 4 7
2 3 5 8
4 5 6 9
7 8 9 10
You can see that we have provided the elements of only one triangle (be it the lower or the upper one, they are the same), and we got the full matrix!
This is indeed a fully valid Julia matrix so all matrix algebra should work on it:
julia> m * SymmetricM(collect(10:10:100))
4×4 Array{Int64,2}:
700 840 1010 1290
840 1020 1250 1630
1010 1250 1580 2120
1290 1630 2120 2940
Note that the result of the multiplication is a Matrix rather than a SymmetricM - to get a SymmetricM back you would need to overload the * operator to accept two SymmetricM arguments. For illustrative purposes, let us show custom operator overloading with the minus sign -:
import Base.-
-(m1::SymmetricM, m2::SymmetricM) = SymmetricM(m1.data .- m2.data)
And now you will see that subtraction of SymmetricM is going to return another SymmetricM:
julia> m-m
4×4 SymmetricM{Int64}:
0 0 0 0
0 0 0 0
0 0 0 0
0 0 0 0
In this way you can build a full triangular matrix algebra system in Julia.
Note, however, that the getix function has an if statement, so access to SymmetricM elements without going through the data field directly will be slower than for a regular matrix; hence you should perhaps overload as many operators as your project requires.
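Finally, a hedged usage sketch for the two computations asked about in the question, assuming the definitions above and a reasonably recent Julia (the three-argument dot needs Julia 1.4 or newer); the variable names are illustrative:

using LinearAlgebra

A = SymmetricM(collect(1:10))        # plays the role of Vector_A
B = SymmetricM(collect(10:10:100))   # plays the role of Vector_B
X = [1, 2, 3, 4]

ABA = A * B * A      # a plain Matrix; wrapping it back into SymmetricM would need an
                     # extra constructor that re-extracts one triangle (not shown here)
xAx = dot(X, A, X)   # computes X' * A * X through getindex, without rebuilding A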

How to assign off-diagonal entries in SymTridiagonal matrix in julia?

The SymTridiagonal data type in Julia is not letting me assign non-diagonal values to anything other than zero. I get this error: ArgumentError: cannot set off-diagonal entry (2, 1).
I need to assign non-diagonal values because I am trying to implement the ImplicitSymmetricQRStep algorithm which needs to do that in the process.
It is indeed not possible to set the off-diagonal values of a SymTridiagonal matrix - why this decision was taken I cannot say.
I see now two alternatives:
1) In Julia the fields of a structure are not hidden, so it is possible to change the value that way. This is dangerous though, as the internal structure of that matrix might change in future versions without any warnings. Here is an example of how you would do that:
using LinearAlgebra: SymTridiagonal
a = SymTridiagonal([1 2 0; 2 1 2; 0 2 1]) # 1 on the diagonal, 2 on the off-diagonals
a.ev[1] = 4 # a[1, 2] == 4 and a[2, 1] == 4
2) You could also use the Tridiagonal matrix type, which is also in the LinearAlgebra package; this type allows one to set the off-diagonal entries. Then you just have to make sure yourself that you don't violate the symmetry of the matrix, i.e. if you set a[i, j] then you also have to set a[j, i] to the same value.
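A minimal sketch of alternative 2, with illustrative values:

using LinearAlgebra: Tridiagonal

a = Tridiagonal([2.0, 2.0], [1.0, 1.0, 1.0], [2.0, 2.0])  # sub-, main and super-diagonal
a[2, 1] = 4.0   # allowed, unlike with SymTridiagonal...
a[1, 2] = 4.0   # ...but you must mirror it yourself to keep the matrix symmetric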

BLAS routine to compute diagonal elements only of a matrix product?

Say I have two matrices A and B. I want to compute the diagonal elements of the matrix product A * B and place them in a pre-allocated vector result.
Is there a BLAS (or similar) routine to do this as fast as possible?
There is no specific routine for that. However, you can use the following definition of matrix multiplication.
Consider C = AB, and let a_ij, b_ij, c_ij denote the (i,j)-th elements of the corresponding matrices. Without loss of generality, I will assume that A, B, C are all N x N dense matrices.
Then,
c_ij = sum_{k=1}^{N} a_ik * b_kj
Since you are interested only in the diagonal entries:
c_ii = sum_{k=1}^{N} a_ik * b_ki,   for i = 1, ..., N
In other words, to calculate the ith diagonal element of matrix C you need the dot product of the ith row of matrix A with the ith column of matrix B. That can be achieved by using the dot product BLAS level-1 function ?dot.
res = ?dot(n, x, incx, y, incy)
Let's assume that matrices A and B are stored column-wise and are accessible via pointers *A and *B (which hold N*N values), while *C is a preallocated storage for diagonal entries of matrix C (which holds N values).
The following loop should give you the diagonal:
for (int i = 0; i < N; i++)
{
    C[i] = ?dot(N, &A[i], N, &B[i*N], 1);
}
Notice that we access the ith row of matrix A by passing a pointer to the first element of the ith row, &A[i], and using an increment (incx) of N. In contrast, to access the ith column of matrix B we pass a pointer to the first element of the ith column, &B[i*N], and use an increment of 1.
Notes:
if A,B, and C have different (but consistent with matrix multiplication) dimensions, only slight modifications will have to be applied.
if matrices are stored row-wise, the call to ?dot should be slightly changed
the pseudocode above uses a general ?dot function. In practice, it will be sdot or ddot for single- or double precision real numbers, and versions of ?dotu: cdotu and zdotu for complex numbers of single and double precision, respectively.
Is it the most efficient, cache-friendly, etc. implementation? Probably not, but it would surprise me if this became a bottleneck in an algorithm where the N x N matrices A and B have been explicitly calculated anyway.
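For reference, here is a short Julia sketch of the same idea (it is not part of the original answer; for Float64 data Julia dispatches dot on strided views to the underlying BLAS ddot):

using LinearAlgebra

N = 4
A = rand(N, N)
B = rand(N, N)
result = Vector{Float64}(undef, N)   # pre-allocated output vector

for i in 1:N
    # diagonal entry i of A*B is (row i of A) . (column i of B)
    result[i] = dot(view(A, i, :), view(B, :, i))
end

@assert result ≈ diag(A * B)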

Find paths of length = 4, starting by an adjacency matrix of a directed graph, considering only distinct edges?

Given an EREW-PRAM model, that allows me to use an arbitrary number of processors in parallel without them conflicting nor in read, nor in write access, I need to find the number of paths of length 4, considering that I have an input node-node adjacency matrix A representing a directed graph and that I need to exclude paths that don't use distinct edges (e.g.: (a,b),(b,a),(a,b),(b,a) is not a valid path).
I have a function that uses n^3 processors and calculates the matrix multiplication of two given matrices in time O(logn):
mult-matrix(A, A, n) => B --> gives me the paths of length 2.
mult-matrix(B, B, n) => C --> gives me the paths of length 4, but I think it considers paths that run across the same edges.
I tried subtracting 1 from elements of C that have a node u communicating with a node v in both directions, but I'm not sure it works.
How could I solve the problem considering that I just need to exclude some paths from the resulting matrix C?
Any working solution is appreciated, considering that the number of processors is constrained to n^3 and time must be O(logn) in the worst case. The exercises must be solved using a pseudo-pascal language, but given a working solution, I should be able to write the pseudocode by myself.
I think I found a solution in https://www.perlmonks.org/?node_id=522270
Given an input matrix A, I am able to calculate the adjacency matrix for paths of length 2, 3 and 4 with the provided function.
A2 is the adjacency matrix obtained by multiplying A*A and contains paths of length 2
A3 is obtained by multiplying A2*A and contains paths of length 3
A4 is obtained by multiplying A3*A and contains paths of length 4
In order to exclude the repeated edges, I have to compute the matrix C, obtained by doing an element-wise subtraction among the calculated matrices.
C[i,j] = A4[i,j] - A3[i,j] - A2[i,j] - A[i,j]
C contains the final result.
The following pseudocode solves the problem with an EREW-PRAM using O(n^3) processors and in time O(logn).
procedure paths_length_4(A, n)                      // Work = O(n^3 logn)
begin
    A2 := mult_matrix(A, A, n)                      // T=O(logn), P=O(n^3)
    A3 := mult_matrix(A2, A, n)                     // T=O(logn), P=O(n^3)
    A4 := mult_matrix(A3, A, n)                     // T=O(logn), P=O(n^3)
    for all i,j where 1 ≤ i ≤ n, 1 ≤ j ≤ n pardo    // P=O(n^2)
        C[i,j] := A4[i,j] - A3[i,j] - A2[i,j] - A[i,j]
end
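For a quick sequential sanity check of the element-wise correction (ignoring the PRAM cost model), here is a Julia sketch with a small, purely illustrative adjacency matrix:

A  = [0 1 0; 1 0 1; 0 1 0]   # hypothetical directed-graph adjacency matrix
A2 = A * A                   # walks of length 2
A3 = A2 * A                  # walks of length 3
A4 = A3 * A                  # walks of length 4
C  = A4 .- A3 .- A2 .- A     # the proposed correction, applied element-wise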

Are these functions column-major or row-major?

I'm comparing two different linear math libraries for 3D graphics using matrices. Here are two similar Translate functions from the two libraries:
static Matrix4<T> Translate(T x, T y, T z)
{
    Matrix4 m;
    m.x.x = 1; m.x.y = 0; m.x.z = 0; m.x.w = 0;
    m.y.x = 0; m.y.y = 1; m.y.z = 0; m.y.w = 0;
    m.z.x = 0; m.z.y = 0; m.z.z = 1; m.z.w = 0;
    m.w.x = x; m.w.y = y; m.w.z = z; m.w.w = 1;
    return m;
}
(c++ library from SO user prideout)
static inline void mat4x4_translate(mat4x4 T, float x, float y, float z)
{
    mat4x4_identity(T);
    T[3][0] = x;
    T[3][1] = y;
    T[3][2] = z;
}
(linmath c library from SO user datenwolf)
I'm new to this stuff but I know that the order of matrix multiplication depends a lot on whether you are using a column-major or row-major format.
To my eyes, these two are using the same format, in that in both the first index is treated as the row, the second index is the column. That is, in both the x y z are applied to the same first index. This would imply to me row-major, and thus matrix multiplication is left associative (for example, you'd typically do a rotate * translate in that order).
I have used the first example many times in a left associative context and it has been working as expected. While I have not used the second, the author says it is right-associative, yet I'm having trouble seeing the difference between the formats of the two.
To my eyes, these two are using the same format, in that in both the first index is treated as the row, the second index is the column.
The looks may be deceiving, but in fact the first index in linmath.h is the column. C and C++ specify that in a multidimensional array defined like this
sometype a[n][m];
there are n times m elements of sometype in succession. Whether this is row- or column-major order solely depends on how you interpret the indices. Now OpenGL defines 4×4 matrices to be indexed in the following linear scheme
0 4 8 c
1 5 9 d
2 6 a e
3 7 b f
If you apply the rules of C++ multidimensional arrays you'd add the following column row designation
----> n
| 0 4 8 c
| 1 5 9 d
V 2 6 a e
m 3 7 b f
Which remaps the linear indices into 2-tuples of
0 -> 0,0
1 -> 0,1
2 -> 0,2
3 -> 0,3
4 -> 1,0
5 -> 1,1
6 -> 1,2
7 -> 1,3
8 -> 2,0
9 -> 2,1
a -> 2,2
b -> 2,3
c -> 3,0
d -> 3,1
e -> 3,2
f -> 3,3
Okay, OpenGL and some math libraries use column-major ordering, fine. But why do it this way and break with the usual mathematical convention that in M_ij the index i designates the row and j the column? Because it makes things look nicer. You see, a matrix is just a bunch of vectors. Vectors that can, and usually do, form a coordinate base system.
Consider the axes of a coordinate system: X, Y and Z are essentially vectors. They are defined as
X = (1,0,0)
Y = (0,1,0)
Z = (0,0,1)
Wait, doesn't that up there look like an identity matrix? Indeed it does, and in fact it is!
However, written as it is, the matrix has been formed by stacking row vectors. And the rules of matrix multiplication essentially tell us that a matrix formed by row vectors transforms row vectors into row vectors by left-associative multiplication, while a matrix formed by column vectors transforms column vectors into column vectors by right-associative multiplication.
Now this is not really a problem, because left-associative can do the same stuff as right-associative; you just have to swap rows for columns (i.e. transpose everything) and reverse the order of operands. However, left/right and row/column are just notational conventions in which we write things.
And the typical mathematical notation is (for example)
v_clip = P · V · M · v_local
This notation makes it intuitively visible what's going on. Furthermore, in programming the character = usually designates assignment from right to left. Some programming languages are more mathematically influenced, like Pascal or Delphi, and write it :=. Anyway, with row-major ordering we'd have to write it
v_clip = v_local · M · V · P
and to the majority of mathematical folks this looks unnatural. Because, technically, M, V and P are in fact linear operators (yes, they're also matrices and linear transforms), and operators always go between the equality/assignment and the variable.
So that's why we use the column-major format: it looks nicer. Technically it could be done using the row-major format as well. And what does this have to do with the memory layout of matrices? Well, when you want to use a column-major notation, you want direct access to the base vectors of the transformation matrices, without having to extract them element by element. When the numbers are stored in column-major format, all it takes to access a certain base vector of a matrix is a simple offset in linear memory.
I can't speak for the code example of the other library, but I'd strongly assume that it treats the first index as the slower-incrementing index as well, which makes it work as column-major if subjected to the notation of OpenGL. Remember: column major & right associativity == row major & left associativity.
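As a small side illustration (not from the original answer): Julia arrays are column-major, like the OpenGL convention, so filling a 4×4 matrix from the linear indices 0..15 reproduces the scheme above, and the translation part of such a matrix occupies the contiguous offsets 12..14 of the last column.

M = reshape(collect(0:15), 4, 4)   # Julia fills column by column (column-major)
M[:, 1] == [0, 1, 2, 3]            # a whole base vector is contiguous in memory
M[:, 4] == [12, 13, 14, 15]        # the c, d, e, f slots of the scheme above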
The fragments posted are not enough to answer the question. They could be row-major matrices stored in row order, or column-major matrices stored in column order.
It may be more obvious if you look at how a vector is treated when multiplied with an appropriate matrix. In a row-major system, you would expect the vector to be treated as a single row matrix, whereas in a column-major system it would similarly be a single column matrix. That then dictates how a vector and a matrix may be multiplied. You can only multiply a vector with a matrix as either a single column on the right, or a single row on the left.
The GL convention is column-major, so a vector is multiplied to the right.
D3D is row-major, so vectors are rows and are multiplied to the left.
This needs to be taken into account when concatenating transforms, so that they are applied in the correct order.
i.e:
GL:
V' = CAMERA * WORLD * LOCAL * V
D3D:
V' = V * LOCAL * WORLD * CAMERA
However, both choose to store their matrices such that the in-memory representations are actually the same (until we get into shaders and some representations need to be transposed...).

Resources