matrix multiplication order PVM vs MVP in graphics programming - math

hi there I was wondering why most tutorials and programming code use MVP to describe the Model-View-Projection matrix. Instead of PVM which is the actual order of implementation in the code:
mat4 MVP = ProjectionMatrix * ViewMatrix * ModelMatrix;
gl_Position = MVP * VertexInModelSpace;
seems much more understandable to me to write PVM instead of MVP.

Matrices don't actually have a fixed meaning, just relations between rows and columns. The meaning is freely definable by the developers. The MVP order follows from standard mathematical conventions. But since nothing says you can not define the vectors as columns instead of rows nothing precludes this ordering.
Clarification: Since changing notation transposes the meaning. Then following applies:
MmvpT = Mpvm
Due to the definition of matrix multiplication following rule kicks in:
(AB)T = BTAT
Since B can be recursively another matrix multiplication a infinite chain of these are possible. Which means essentially that you have swapped the multiplication order, by changing notation.
Its a bit like looking at the problem from the outside or the problem from the inside. In this case your thinking as a outside observer. Whereas the other way around one would observe the thing from the standpoint of the first operator in the chain. Personally I think the notation you use may be more intuitive for this specific task, the other is just way more common. Mainly due to the fact that all mathematics books I have ever seen use this convention, so blame the mathematicians.
So better stick with the more common way, makes things more generally understandable. For example: Nothing stops me from typing the answer in Finnish but the convention of stackoverflow is to answer in English, making answers more understandable to most users. Use the more common form since others may not grasp the difference, and this leads to errors.

The other problem is that matrix multiplication is not necessarily commutative:
AB != BA
So it's a good idea to stick with the convention.

Related

What is the point of the Symmetric type in Julia?

What is the point of the Symmetric type in the LinearAlgebra package of Julia? It seems like it is equivalent to the type Hermitian for real matrices (although: is this true?). If that is true, then the only case for which Symmetric is not redundant with Hermitian is for complex matrices, and it would be surprising to want to have a symmetric as opposed to Hermitian complex matrix (maybe I am mistaken on that though).
I ask this question in part because I sometimes find myself doing casework like this: if I have a real matrix, then use Symmetric; if complex, then Hermitian. It seems though that I could save work by just always using Hermitian. Will I be missing out on performance or otherwise if I do this?
(Also, bonus question that may be related: why is there no HermTridiagonal type in addition to SymTridiagonal? I could use the former. Plus, it seems more useful than SymTridiagonal in consideration of the above.)
To copy the answer from the linked discourse thread (via #stevengj):
Always use Hermitian. For real elements, there is no penalty compared to Symmetric.
There aren’t any specialized routines for complex Symmetric matrices that I know of. My feeling is that it was probably a mistake to have a separate Symmetric type in LinearAlgebra, but it is hard to remove at this point.

Struggling with building an intuition for recursion

Though I have studied and able am able to understand some programs in recursion, I am still not able to intuitively obtain a solution using recursion as I do easily using Iteration. Is there any course or track available in order to build an intuition for recursion? How can one master the concept of recursion?
if you want to gain a thorough understanding of how recursion works, I highly recommend that you start with understanding mathematical induction, as the two are very closely related, if not arguably identical.
Recursion is a way of breaking down seemingly complicated problems into smaller bits. Consider the trivial example of the factorial function.
def factorial(n):
if n < 2:
return 1
return n * factorial(n - 1)
To calculate factorial(100), for example, all you need is to calculate factorial(99) and multiply 100. This follows from the familiar definition of the factorial.
Here are some tips for coming up with a recursive solution:
Assume you know the result returned by the immediately preceding recursive call (e.g. in calculating factorial(100), assume you already know the value of factorial(99). How do you go from there?)
Consider the base case (i.e. when should the recursion come to a halt?)
The first bullet point might seem rather abstract, but all it means is this: a large portion of the work has already been done. How do you go from there to complete the task? In the case of the factorial, factorial(99) constituted this large portion of work. In many cases, you will find that identifying this portion of work simply amounts to examining the argument to the function (e.g. n in factorial), and assuming that you already have the answer to func(n - 1).
Here's another example for concreteness. Let's say we want to reverse a string without using in-built functions. In using recursion, we might assume that string[:-1], or the substring until the very last character, has already been reversed. Then, all that is needed is to put the last remaining character in the front. Using this inspiration, we might come up with the following recursive solution:
def my_reverse(string):
if not string: # base case: empty string
return string # return empty string, nothing to reverse
return string[-1] + my_reverse(string[:-1])
With all of this said, recursion is built on mathematical induction, and these two are inseparable ideas. In fact, one can easily prove that recursive algorithms work using induction. I highly recommend that you checkout this lecture.

How to speed up writing to a matrix in a reference class in R

Here is a piece of R code that writes to each element of a matrix in a reference class. It runs incredibly slowly, and I’m wondering if I’ve missed a simple trick that will speed this up.
nx = 2000
ny = 10
ref_matrix <- setRefClass(
"ref_matrix",fields = list(data = "matrix"),
)
out <- ref_matrix(data = matrix(0.0,nx,ny))
#tracemem(out$data)
for (iy in 1:ny) {
for (ix in 1:nx) {
out$data[ix,iy] <- ix + iy
}
}
It seems that each write to an element of the matrix triggers a check that involves a copy of the entire matrix. (Uncommenting the tracemen() call shows this.) Now, I’ve found a discussion that seems to confirm this:
https://r-devel.r-project.narkive.com/8KtYICjV/rd-copy-on-assignment-to-large-field-of-reference-class
and this also seems to be covered by Speeding up field access in R reference classes
but in both of these this behaviour can be bypassed by not declaring a class for the field, and this works for the example in the first link which uses a 1D vector, b, which can just be set as b <<- 1:10000. But I’ve not found an equivalent way of creating a 2D array without using a explicit “matrix” instance.
Am I just missing something simple, or is this actually not possible?
Let me add a couple of things. First, I’m very new to R, so could easily have missed something. Second, I’m really just curious about the way reference classes work in this case and whether there’s a simple way to use them efficiently; I’m not looking for a really fast way to set the elements of a matrix - I can do that by not having the matrix in a reference class at all, and if I really care about speed I can write a C routine to do it and call it from R.
Here’s some background that might explain why I’m interested in this, which you’re welcome to ignore.
I got here by wanting to see how different languages, and even different compiler options and different ways of coding the same operation, compared for efficiency when accessing 2D rectangular arrays. I’ve been playing with a test program that creates two 2D arrays of the same size, and calls a subroutine that sets the first to the elements of the second plus their index values. (Almost any operation would do, but this one isn’t completely trivial to optimise.) I have this in a number of languages now, C, C++, Julia, Tcl, Fortran, Swift, etc., even hand-coded assembler (spoiler alert: assembler isn’t worth the effort any more) and thought I’d try R. The obvious implementation in R passes the two arrays to a subroutine that does the work, but because R doesn’t normally pass by reference, that routine has to make a copy of the modified array and return that as the function value. I thought using a reference class would avoid the relatively minor overhead of that copy, so I tried that and was surprised to discover that, far from speeding things up, it slowed them down enormously.
Use outer:
out$data <- outer(1:ny, 1:nx, `+`)
Also, don't use reference classes (or R6 classes) unless you actually need reference semantics. KISS and all that.

Coding mathematical algorithms - should I use variables in the book or more descriptive ones?

I'm maintaining code for a mathematical algorithm that came from a book, with references in the comments. Is it better to have variable names that are descriptive of what the variables represent, or should the variables match what is in the book?
For a simple example, I may see this code, which reflects the variable in the book.
A_c = v*v/r
I could rewrite it as
centripetal_acceleration = velocity*velocity/radius
The advantage of the latter is that anyone looking at the code could understand it. However, the advantage of the former is that it is easier to compare the code with what is in the book. I may do this in order to double check the implementation of the algorithms, or I may want to add additional calculations.
Perhaps I am over-thinking this, and should simply use comments to describe what the variables are. I tend to favor self-documenting code however (use descriptive variable names instead of adding comments to describe what they are), but maybe this is a case where comments would be very helpful.
I know this question can be subjective, but I wondered if anyone had any guiding principles in order to make a decision, or had links to guidelines for coding math algorithms.
I would prefer to use the more descriptive variable names. You can't guarantee everyone that is going to look at the code has access to "the book". You may leave and take your copy, it may go out of print, etc. In my opinion it's better to be descriptive.
We use a lot of mathematical reference books in our work, and we reference them in comments, but we rarely use the same mathematically abbreviated variable names.
A common practise is to summarise all your variables, indexes and descriptions in a comment header before starting the code proper. eg.
// A_c = Centripetal Acceleration
// v = Velocity
// r = Radius
A_c = (v^2)/r
I write a lot of mathematical software. IF I can insert in the comments a very specific reference to a book or a paper or (best) web site that explains the algorithm and defines the variable names, then I will use the SHORT names like a = v * v / r because it makes the formulas easier to read and write and verify visually.
IF not, then I will write very verbose code with lots of comments and long descriptive variable names. Essentially, my code becomes a paper that describes the algorithm (anyone remember Knuth's "Literate Programming" efforts, years ago? Though the technology for it never took off, I emulate the spirit of that effort). I use a LOT of ascii art in my comments, with box-and-arrow diagrams and other descriptive graphics. I use Jave.de -- the Java Ascii Vmumble Editor.
I will sometimes write my math with short, angry little variable names, easier to read and write for ME because I know the math, then use REFACTOR to replace the names with longer, more descriptive ones at the end, but only for code that is much more informal.
I think it depends almost entirely upon the audience for whom you're writing -- and don't ever mistake the compiler for the audience either. If your code is likely to be maintained by more or less "general purpose" programmers who may not/probably won't know much about physics so they won't recognize what v and r mean, then it's probably better to expand them to be recognizable for non-physicists. If they're going to be physicists (or, for another example, game programmers) for whom the textbook abbreviations are clear and obvious, then use the abbreviations. If you don't know/can't guess which, it's probably safer to err on the side of the names being longer and more descriptive.
I vote for the "book" version. 'v' and 'r' etc are pretty well understood as acronymns for velocity and radius and is more compact.
How far would you take it?
Most (non-greek :-)) keyboards don't provide easy access to Δ, but it's valid as part of an identifier in some languages (e.g. C#):
int Δv;
int Δx;
Anyone coming afterwards and maintaining the code may curse you every day. Similarly for a lot of other symbols used in maths. So if you're not going to use those actual symbols (and I'd encourage you not to), I'd argue you ought to translate the rest, where it doesn't make for code that's too verbose.
In addition, what if you need to combine algorithms, and those algorithms have conflicting usage of variables?
A compromise could be to code and debug as contained in the book, and then perform a global search and replace for all of your variables towards the end of your development, so that it is easier to read. If you do this I would change the names of the variables slightly so that it is easier to change them later.
e.g A_c# = v#*v#/r#

Multidimensional vectors in Scheme?

I earlier asked a question about arrays in scheme (turns out they're called vectors but are basically otherwise the same as you'd expect).
Is there an easy way to do multidimensional arrays vectors in PLT Scheme though? For my purposes I'd like to have a procedure called make-multid-vector or something.
By the way if this doesn't already exist, I don't need a full code example of how to implement it. If I have to roll this myself I'd appreciate some general direction though. The way I'd probably do it is to just iterate through each element of the currently highest dimension of the vector to add another dimension, but I can see that being a bit ugly using scheme's recursive setup.
Also, this seems like something I should have been able to find myself so please know that I did actually google it and nothing came up.
The two common approaches are the same as in many languages, either use a vector of vectors, or (more efficiently) use a single vector of X*Y and compute the location of each reference. But there is a library that does that -- look in the docs for srfi/25, which you can get with (require srfi/25).

Resources