Floating point error in successive coordinate rotations - math

I have code (Python) that must perform some operations regarding distances between reflected segments of a curve.
In order to make the thinking and the code clearer, I apply two rotations (using matrix multiplication) before performing the actual calculation. I suppose it would be possible to perform the calculation without any rotation at all, but the code and the thinking would be much more awkward.
What I have to ask is: are the three rotations too much of a penalty in terms of lost precision because of rounding-off floating point errors? Is there a way to estimate the magnitude of this error?
Thanks for reading

As a rule of thumb in numerical calculations -- only take the first 12 digits seriously :)
Now, assuming 3D rotations, and that outcomes of trig functions are infinitely precise, a matrix multiplication will involve 3 multiplications and 2 additions per element in the rotated vector. Since you do two rotations, this amounts to 6 multiplications and 4 additions per element.
If you read this (which you should read front to back one day), or this, or this, you'll find that individual arithmetic operations of IEEE 754 are guaranteed to be accurate to within half a ULP (=last decimal place).
Applied to your problem, that means that the 10 operations per element in the result vector will be accurate to within 5 ULPs.
In other words -- suppose you're rotating a unit vector. The elements of the rotated vector will be accurate to 0.000000000000005 -- I'd say that's nothing to worry about.
Including the errors in the trig functions, well, that's a bit more complicated...that really depends on your programming language and/or version of your compiler etc. But I guarantee it'll be comparable to the 5 ULPs.
If you do think this accuracy is not going to be enough, then I'd suggest you perform the two rotations in one go. Work out the matrix multiplication analytically, and implement the rotation as a single matrix multiplication. Alternatively: have a look at quaternions (although I suspect that's a bit overkill for your situation).

What you need to do is compute the condition number of your operations and determine whether it may incur loss of significance. That should allow you to estimate the error that could be introduced.

Related

Iterative eigensolver allowing initial guess and complex eigenvalues?

I have a time-dependent complex matrix A(t), and I want to follow its eigenvalues over time. In other words, in the time-dependent list of eigenvalues a[1](t), ..., a[n](t), I want each entry to change continuously over time.
One approach is to find the eigendecomposition of A(t+ε) iteratively, using the eigendecomposition of A(t) as an initial guess. Since the guess is almost correct, the iteration should only change it slightly, giving the desired continuity.
I think the LOBPCG and SVD solvers in IterativeSolvers.jl can do this, because they let you store the iterator state. Unfortunately, they only work for matrices with real eigenvalues. (The SVG solver also requires real entries.) The solvers in ArnoldiMethod.jl can handle complex eigenvalues, but doesn't seem to allow an initial guess. Is there any available eigensolver that has both the features I need?

Why do we use log probability in deep learning?

I got curious while reading the paper 'Sequence to Sequence Learning with Neural Networks'.
In fact, not only this paper but also many other papers use log probabilities, is there a reason for that?
Please check the attached photo.
Two reasons -
Theoretical - Probabilities of two independent events A and B co-occurring together is given by P(A).P(B). This easily gets mapped to a sum if we use log, i.e. log(P(A)) + log(P(B)). It is thus easier to address the neuron firing 'events' as a linear function.
Practical - The probability values are in [0, 1]. Hence multiplying two or more such small numbers could easily lead to an underflow in a floating point precision arithmetic (e.g. consider multiplying 0.0001*0.00001). A practical solution is to use the logs to get rid of the underflow.
For any given problem we need to optimise the likelihood of parameters. But optimising the product require all data at once and requires huge computation.
We know that a sum is a lot easier to optimise as the derivative of a sum is the sum of derivatives. So, taking log convert it to sum and makes computation faster.
Refer this

8 point algorithm for estimating Fundamental Matrix

I'm watching a lecture about estimating the fundamental matrix for use in stereo vision using the 8 point algorithm. I understand that once we recover the fundamental matrix between two cameras we can compute the epipolar line on one camera given a point on the other. To my understanding this epipolar line (after it's been rectified) makes it easy to find feature correspondences, because we are simply matching features along a 1D line.
The confusion comes from the fact that 8-point algorithm itself requires at least 8 feature correspondences to estimate the Fundamental Matrix.
So, we are finding point correspondences to recover a matrix that is used to find point correspondences?
This seems like a chicken-egg paradox so I guess I'm misunderstanding something.
The fundamental matrix can be precomputed. This leads to two advantages:
You can use a nice environment in which features can be matched easily (like using a chessboard) to compute the fundamental matrix.
You can use more computationally expensive operations like a sequence of SIFT, FLANN and RANSAC across the entire image since you only need to do that once.
After getting the fundamental matrix, you can find correspondences in a noisy environment more efficiently than using the same method when you compute the fundamental matrix.

Filtering methods of the complex oscilliations

If I have a system of a springs, not one, but for example 3 degree of freedom system of the springs connected in some with each other. I can make a system of differential equations for but it is impossible to solve it in a general way. The question is, are there any papers or methods for filtering such a complex oscilliations, in order to get rid of the oscilliations and get a real signal as much as possible? For example if I connect 3 springs in some way, and push them to start the vibrations, or put some weight on them, and then take the vibrations from each spring, are there any filtering methods to make it easy to determine the weight (in case if some mass is put above) of each mass? I am interested in filtering complex spring like systems.
Three springs, six degrees of freedom? This is a trivial solution using finite element methods and numerical integration. It's a system of six coupled ODEs. You can apply any form of numerical integration, such as 5th order Runge-Kutta.
I'd recommend doing an eigenvalue analysis of the system first to find out something about its frequency characteristics and normal modes. I'd also do an FFT of the dynamic forces you apply to the system. You don't mention any damping, so if you happen to excite your system at a natural frequency that's close to a resonance you might have some interesting behavior.
If the dynamic equation has this general form (sorry, I don't have LaTeX here to make it look nice):
Ma + Kx = F
where M is the mass matrix (diagonal), a is the acceleration (2nd derivative of displacements w.r.t. time), K is the stiffness matrix, and F is the forcing function.
If you're saying you know the response, you'll have to pre-multiply by the transpose of the response function and try to solve for M. It's diagonal, so you have a shot at it.
Are you connecting the springs in such a way that the behavior of the system is approximately linear? (e.g. at least as close to linear as are musical instrument springs/strings?) Is this behavior consistant over time? (e.g. the springs don't melt or break.) If so, LTI (linear time invariant) systems theory might be applicable. Given enough measurements versus the numbers of degrees of freedom in the LTI system, one might be able to estimate a pole-zero plot of the system response, and go from there. Or something like a linear predictor might be useful.
Actually it is possible to solve the resulting system of differential equations as long as you know the masses, etc.
The standard approach is to use a Laplace Transform. In particular you start with a set of linear differential equations. Add variables until you have a set of first order linear differential equations. (So if you have y'' in your equation, you'd add the equation z = y' and replace y'' with z'.) Rewrite this in the form:
v' = Av + w
where v is a vector of variable, A is a matrix, and w is a scalar vector. (An example of something that winds up in w is gravity.)
Now apply a Laplace transform to get
s L(v) - v(0) = AL(v) + s w
Solve it to get
L(v) = inv(A - I s)(s w + v(0))
where inv inverts a matrix and I is the identity matrix. Apply the inverse Laplace transform (if you read up on Laplace transforms you can find tables of inverse of common types of functions - getting a complete list of the functions you actually encounter shouldn't be that hard), and you have your solution. (Be warned, these computations quickly get very complex.)
Now you have the ability to take a particular setup and solve for the future behavior. You also have the ability to (if you do things really carefully) figure out how the model responds to a small perturbation in parameters. But your problem is that you don't know the parameters to use. However you do have the ability to measure the positions in the system at repeated times.
If you put this together, what you can do is this. Measure your position at a number of points. First estimate all of the initial values of the parameters, and then all of the values a second later. You can adjust your parameters (using Newton's method) to come close enough to the values a second later. Take the measurements from 5 seconds later and use that initial estimate as your starting point to refine your calculations for what is happening 5 seconds later. Repeat with longer intervals to get all of your answers.
Writing and debugging this should take you some time. :-) I would strongly recommend investigating how much of this Mathematica knows how to do for you already...

Where are "Special Numbers" mentioned in Concrete Maths used?

I was glancing through the contents of Concrete Maths online. I had at least heard most of the functions and tricks mentioned but there is a whole section on Special Numbers. These numbers include Stirling Numbers, Eulerian Numbers, Harmonic Numbers so on. Now I have never encountered any of these weird numbers. How do they aid in computational problems? Where are they generally used?
Harmonic Numbers appear almost everywhere! Musical Harmonies, analysis of Quicksort...
Stirling Numbers (first and second kind) arise in a variety of combinatorics and partitioning problems.
Eulerian Numbers also occur several places, most notably in permutations and coefficients of polylogarithm functions.
A lot of the numbers you mentioned are used in the analysis of algorithms. You may not have these numbers in your code, but you'll need them if you want to estimate how long it will take for your code to run. You might see them in your code too. Some of these numbers are related to combinatorics, counting how many ways something can happen.
Sometimes it's not enough to know how many possibilities there are because you need to enumerate over the possibilities. Volume 4 of Knuth's TAOCP, in progress, gives the algorithms you need.
Here's an example of using Fibonacci numbers as part of a numerical integration problem.
Harmonic numbers are a discrete analog of logarithms and so they come up in difference equations just like logs come up in differential equations. Here's an example of physical applications of harmonic means, related to harmonic numbers. See the book Gamma for many examples of harmonic numbers in action, especially the chapter "It's a harmonic world."
These special numbers can help out in computational problems in many ways. For example:
You want to find out when your program to compute the GCD of 2 numbers is going to take the longest amount of time: Try 2 consecutive Fibonacci Numbers.
You want to have a rough estimate of the factorial of a large number, but your factorial program is taking too long: Use Stirling's Approximation.
You're testing for prime numbers, but for some numbers you always get the wrong answer: It could be you're using Fermat's Prime test, in which case the Carmicheal numbers are your culprits.
The most common general case I can think of is in looping. Most of the time you specify a loop using a (start;stop;step) type of syntax, in which case it may be possible to reduce the execution time by using properties of the numbers involved.
For example, summing up all the numbers from 1 to n when n is large in a loop is definitely slower than using the identity sum = n*(n + 1)/2.
There are a large number of examples like these. Many of them are in cryptography, where the security of information systems sometimes depends on tricks like these. They can also help you with performance issues, memory issues, because when you know the formula, you may find a faster/more efficient way to compute other things -- things that you actually care about.
For more information, check out wikipedia, or simply try out Project Euler. You'll start finding patterns pretty fast.
Most of these numbers count certain kinds of discrete structures (for instance, Stirling Numbers count Subsets and Cycles). Such structures, and hence these sequences, implicitly arise in the analysis of algorithms.
There is an extensive list at OEIS that lists almost all sequences that appear in Concrete Math. A short summary from that list:
Golomb's Sequence
Binomial Coefficients
Rencontres Numbers
Stirling Numbers
Eulerian Numbers
Hyperfactorials
Genocchi Numbers
You can browse the OEIS pages for the respective sequences to get detailed information about the "properties" of these sequences (though not exactly applications, if that's what you're most interested in).
Also, if you want to see real-life uses of these sequences in analysis of algorithms, flip through the index of Knuth's Art of Computer Programming, and you'll find many references to "applications" of these sequences. John D. Cook already mentioned applications of Fibonacci & Harmonic numbers; here are some more examples:
Stirling Cycle Numbers arise in the analysis of the standard algorithm that finds the maximum element of an array (TAOCP Sec. 1.2.10): How many times must the current maximum value be updated when finding the maximum value? It turns out that the probability that the maximum will need to be updated k times when finding a maximum in an array of n elements is p[n][k] = StirlingCycle[n, k+1]/n!. From this, we can derive that on the average, approximately Log(n) updates will be necessary.
Genocchi Numbers arise in connection with counting the number of BDDs that are "thin" (TAOCP 7.1.4 Exercise 174).
Not necessarily a magic number from the reference you mentioned, but nonetheless --
0x5f3759df
-- the notorious magic number used to calculate inverse square root of a number by giving a good first estimate to Newton's Approximation of Roots, often attributed to the work of John Carmack - more info here.
Not programming related, huh? :)
Is this directly programming related? Surely related, but I don't know how closely.
Special numbers, such as e, pi, etc., come up all over the place. I don't think that anyone would argue about these two. The Golden_ratio also appears with amazing frequency, in everything from art to other special numbers themselves (look at the ratio between successive Fibonacci numbers.)
Various sequences and families of numbers also appear in many places in mathematics and therefore, in programming too. A beautiful place to look is the Encyclopedia of integer sequences.
I'll suggest this is an experience thing. For example, when I took linear algebra, many, many years ago, I learned about the eigenvalues and eigenvectors of a matrix. I'll admit that I did not at all appreciate the significance of eigenvalues/eigenvectors until I saw them in use in a variety of places. In statistics, in terms of what they tell you about uncertainty of an estimate from a covariance matrix, the size and shape of a confidence ellipse, in terms of principal component analysis, or the long term state of a Markov process. In numerical methods, where they tell you about convergence of a method, be it in optimization or an ODE solver. In mechanical engineering, where you see them as principal stresses and strains.
Discussion in Reddit

Resources