Normalize vector by zero - math

I am working on designing a new sensor, and so I have a vector of measured values and a vector of truth values. To represent error, it's simply measured - truth. Since there's a lot of variation in the truth, I would like to represent the normalized error. My initial thought would be error./truth to get percent error, but there are many cases where my truth value is zero! Can anyone think of a better way to represent the normalized data while avoiding the divide-by-zero? I'm working in Matlab, though the question is a bit language-agnostic as well.
PS, feel free to push this to another stackexchange if you think it's better suited

Try error = (measured - truth)/norm2(truth) for each vector,
where norm2() is the Euclidean (Frobenius) norm:
norm2(x) = SQRT( SUM( x[i]^2, i=1..N ) )
This can only fail if all the values of truth are zero. You can mitigate that by adding a small positive number like 1e-12 to the norm, or by skipping the division when the norm is below a threshold.
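For example, a minimal sketch of this in Python/NumPy (the question is about Matlab, but the idea carries over directly; eps is an assumed small threshold):

import numpy as np

def normalized_error(measured, truth, eps=1e-12):
    # Normalize the element-wise error by the Euclidean norm of the truth vector,
    # guarding against the all-zero case mentioned above.
    measured = np.asarray(measured, dtype=float)
    truth = np.asarray(truth, dtype=float)
    scale = np.linalg.norm(truth)
    if scale < eps:
        return measured - truth  # fall back to the raw error when the norm is (near) zero
    return (measured - truth) / scale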

I'd suggest separating the results with a zero (or smaller than 10e-6, for example) truth vector from those with a non-zero truth vector. You can't treat them by the same means (since you can't normalize a zero truth vector), so you should define what to do in that case.
I can't suggest anything specific because I don't know the problem statement, but you will have to decide for yourself how to deal with it. If you post your problem here, I hope we can help you.

Related

Looking for good scale factor for converting log to 8.8 fixed point

I have a range of numbers in (0, 1]
I would like to take the natural log of these numbers, and then store as 8.8
fixed point.
My formula is K*ln(x) + (1<<16),
but I am not sure what the best value is for K.
My thinking is that if x doubles, then ln(x) increases by ln(2), so the fixed point value should increase by 1 in fixed point (i.e. 256)
So, this would mean K = 256/ln(2)
Does this make sense?
As x approaches 0, ln(x) will diverge to negative infinity. So you are essentially trying to map an infinite domain to a finite range.
If you do so in a linear way, you have to cut off at some point. If you choose your cut-off at too low a value, you'll be wasting precision for the numbers you represent. If you choose too high a cut-off, too many values will be clamped to the minimal element of the range. Without knowledge of the distribution of the points, it will be very hard to guess a suitable balance here.
So perhaps you could apply a non-linear map instead of the linear one you proposed. Something like the exponential function? Which would mean you'd actually store x instead of ln(x). So I'd say if you want to store values from [0,1) in 16 bits without too much loss of information, you'd just use Q0.16, i.e. all the bits in the fractional part. For (0,1] you can either store 1 − x or special-case x = 1 so that it is encoded as 0 instead. If you have Q8.8 numbers, you'd multiply your numbers by 2^8 = 256 first, but if you have access to the bit representation that multiplication would be a waste of time.
I guess you had a reason you'd want to store logarithms, so this answer may not be what you were hoping for. I don't see an easier way around the underlying problem, though, so you may have to reconsider some of your ideas.
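If it helps, here is a minimal Python sketch of the 1 − x / Q0.16 encoding described above (the function names are just illustrative):

def encode_q016(x):
    # Store x from (0, 1] in 16 bits by encoding 1 - x (which lies in [0, 1)) as Q0.16;
    # x == 1 maps to 0, and values are clamped to the largest representable code.
    assert 0.0 < x <= 1.0
    return min(int((1.0 - x) * (1 << 16)), (1 << 16) - 1)

def decode_q016(bits):
    # Inverse mapping, up to a quantization error of 2**-16.
    return 1.0 - bits / float(1 << 16)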

Series of numbers with minimized risk of collision

I want to generate some numbers which should share as few common bit patterns as possible, so that collisions happen as rarely as possible. So far it's "simple" hashing with a given number of output bits. However, there is another 'constraint'. I want to minimize the risk that, if you take one number and change it by toggling a small number of bits, you end up with another number you've just generated. Note: I don't want it to be impossible or something, I want to minimize the risk!
How to calculate the probability for a list with n numbers, where each number has m bits? And, of course, what would be a suitable method to generate those numbers? Any good articles about this?
To answer this question precisely, you need to say what exactly you mean by "collision", and what you mean by "generate". If you just want the strings to be far apart from each other in hamming distance, you could hope to make an optimal, deterministic set of such strings. It is true that random strings will have this property with high probability, so you could use random strings instead.
When you say
Note: I don't want it to be impossible or something, I want to minimize the risk!
this sounds like an XY problem. If some outcome is the "bad thing" then why do you want it to be possible, but just low probability? Shouldn't you want it not to happen at all?
In short I think you should look up the term "error correcting code". The codewords of any good error correcting code, with any parameters that you feel like, will have the minimal risk of collision in the presence of random noise, for that number of code words of that length, and they can typically be generated very easily using matrix multiplication.
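To make that concrete, here is a small Python sketch using the standard [7,4] Hamming code: every pair of its codewords differs in at least 3 bit positions, so toggling a single bit can never turn one generated number into another (a hedged illustration, not the only possible choice of code):

import itertools
import numpy as np

# Generator matrix of the [7,4] Hamming code in systematic form.
G = np.array([
    [1, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
])

# Each 4-bit message times G (mod 2) gives a 7-bit codeword.
messages = np.array(list(itertools.product([0, 1], repeat=4)))
codewords = messages.dot(G) % 2

# Verify the minimum pairwise Hamming distance.
min_dist = min(int(np.sum(a != b)) for a, b in itertools.combinations(codewords, 2))
print(min_dist)  # -> 3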

Trying to simplify expression with factorials

First off, apologies if there is a better way to format math equations, I could not find anything, but alas, the expressions are pretty short.
As part of an assigned problem I have to produce some code in C that will evaluate x^n/n! for an arbitrary x, and n = { 1-10 , 50, 100}
I can always brute force it with a large-number library, but I am wondering if someone with better math skills than mine can suggest a better algorithm than something that is O(n!)...
I understand that I can split the numerator into x^(n/2)*x^(n/2) for even values of n, and x*x^((n-1)/2)*x^((n-1)/2) for odd values of n. And that I can further change that into a logarithm base x of n/2.
But I am stuck for multiple reasons:
1 - I do not think that computationally any of these changes actually make a lot of difference since they are not really helping me reduce the large number multiplications I have to perform, or their overall number.
2 - Even as I think of n! as 1*2*3*...*(n-1)*n, I still cannot rationalize a good way to simplify the overall equation.
3 - I have looked at Karatsuba's algorithm for multiplications, and although it is a possibility, it seems a bit complex for an intro to programming problem.
So I am wondering if you guys can think of any middle ground. I prefer explanations to straight answers if you have the time :)
Cheers,
My advice is to compute all the terms of the summation (put them in an array), and then sum them up in reverse order (i.e., smallest to largest) -- that reduces rounding error a little bit.
Note that you can compute the k-th term from the preceding one by multiplying by x/k -- you do not need to ever compute x^n or n! directly (this is important).
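A short Python sketch of that recurrence, assuming (as this answer does) that the terms x^k/k! are ultimately summed, e.g. to approximate e^x:

def exp_terms(x, n):
    # Build the terms x^k / k! for k = 0..n via term_k = term_{k-1} * x / k,
    # so neither x^k nor k! is ever formed explicitly.
    terms = [1.0]
    for k in range(1, n + 1):
        terms.append(terms[-1] * x / k)
    return terms

def exp_approx(x, n=100):
    # Sum the terms in reverse order, as suggested above.
    return sum(reversed(exp_terms(x, n)))

print(exp_approx(1.0))  # ~2.718281828...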

log-sum-exp trick why not recursive

I have been researching the log-sum-exp problem. I have a list of numbers stored as logarithms which I would like to sum and store in a logarithm.
the naive algorithm is
import math

def naive(listOfLogs):
    return math.log10(sum(10**x for x in listOfLogs))
many websites including:
logsumexp implementation in C?
and
http://machineintelligence.tumblr.com/post/4998477107/
recommend using
def recommend(listOfLogs):
    maxLog = max(listOfLogs)
    return maxLog + math.log10(sum(10**(x-maxLog) for x in listOfLogs))
aka
def recommend(listOfLogs):
    maxLog = max(listOfLogs)
    return maxLog + naive((x-maxLog) for x in listOfLogs)
What I don't understand is: if the recommended algorithm is better, why not call it recursively?
Would that provide even more benefit?
def recursive(listOfLogs):
    maxLog = max(listOfLogs)
    return maxLog + recursive((x-maxLog) for x in listOfLogs)
While I'm asking: are there other tricks to make this calculation more numerically stable?
Some background for others: when you're computing an expression of the following type directly
ln( exp(x_1) + exp(x_2) + ... )
you can run into two kinds of problems:
exp(x_i) can overflow (x_i is too big), resulting in numbers that you can't add together
exp(x_i) can underflow (x_i is too small), resulting in a bunch of zeroes
If all the values are big, or all are small, we can divide by some exp(const) and add const to the outside of the ln to get the same value. Thus if we can pick the right const, we can shift the values into some range to prevent overflow/underflow.
The OP's question is, why do we pick max(x_i) for this const instead of any other value? Why don't we recursively do this calculation, picking the max out of each subset and computing the logarithm repeatedly?
The answer: because it doesn't matter.
The reason? Let's say x_1 = 10 is big, and x_2 = -10 is small. (These numbers aren't even very large in magnitude, right?) The expression
ln( exp(10) + exp(-10) )
will give you a value very close to 10. If you don't believe me, go try it. In fact, in general, ln( exp(x_1) + exp(x_2) + ... ) will be very close to max(x_i) if some particular x_i is much bigger than all the others. (As an aside, this functional form, asymptotically, actually lets you mathematically pick the maximum from a set of numbers.)
Hence, the reason we pick the max instead of any other value is because the smaller values will hardly affect the result. If they underflow, they would have been too small to affect the sum anyway, because it would be dominated by the largest number and anything close to it. In computing terms, the contribution of the small numbers will be less than an ulp after computing the ln. So there's no reason to waste time computing the expression for the smaller values recursively if they will be lost in your final result anyway.
If you wanted to be really persnickety about implementing this, you'd divide by exp(max(x_i) - some_constant) or so to 'center' the resulting values around 1 to avoid both overflow and underflow, and that might give you a few extra digits of precision in the result. But avoiding overflow is much more important than avoiding underflow, because the former determines the result and the latter doesn't, so it's much simpler just to do it this way.
Not really any better to do it recursively. The problem's just that you want to make sure your finite-precision arithmetic doesn't swamp the answer in noise. By dealing with the max on its own, you ensure that any junk is kept small in the final answer because the most significant component of it is guaranteed to get through.
Apologies for the waffly explanation. Try it with some numbers yourself (a sensible list to start with might be [1E-5,1E25,1E-5]) and see what happens to get a feel for it.
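For instance, here is a small Python sketch showing the underflow side (the overflow side fails analogously); the list holds the logarithms themselves, and all of them are very negative:

import math

def naive(listOfLogs):
    return math.log10(sum(10**x for x in listOfLogs))

def recommend(listOfLogs):
    maxLog = max(listOfLogs)
    return maxLog + math.log10(sum(10**(x - maxLog) for x in listOfLogs))

logs = [-1000.0, -1000.0, -999.0]   # 10**x underflows to 0.0 for every entry

print(recommend(logs))              # ~ -998.92, the correct log10 of the sum
try:
    print(naive(logs))
except ValueError:
    print("naive fails: every 10**x underflowed to 0, so log10(0) blows up")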
As you have defined it, your recursive function will never terminate. That's because ((x-maxLog) for x in listOfLogs) still has the same number of elements as listOfLogs.
I don't think that this is easily fixable either, without significantly impacting either the performance or the precision (compared to the non-recursive version).

How to check if m n-sized vectors are linearly independent?

Disclaimer
This is not strictly a programming question, but most programmers sooner or later have to deal with math (especially algebra), so I think that the answer could turn out to be useful to someone else in the future.
Now the problem
I'm trying to check if m vectors of dimension n are linearly independent. If m == n you can just build a matrix using the vectors and check if the determinant is != 0. But what if m < n?
Any hints?
See also this video lecture.
Construct a matrix of the vectors (one row per vector), and perform a Gaussian elimination on this matrix. If any of the matrix rows cancels out, they are not linearly independent.
The trivial case is when m > n; in this case they cannot be linearly independent.
Construct a matrix M whose rows are the vectors and determine the rank of M. If the rank of M is less than m (the number of vectors) then there is a linear dependence. In the algorithm to determine the rank of M you can stop the procedure as soon as you obtain one row of zeros, but running the algorithm to completion has the added bonanza of providing the dimension of the spanning set of the vectors. Oh, and the algorithm to determine the rank of M is merely Gaussian elimination.
Take care for numerical instability. See the warning at the beginning of chapter two in Numerical Recipes.
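A hedged NumPy sketch of this rank-based check; numpy.linalg.matrix_rank uses the SVD internally, which also helps with the numerical-stability concern:

import numpy as np

def are_independent(vectors, tol=None):
    # Put the vectors in the rows of M; they are linearly independent
    # exactly when rank(M) equals the number of vectors.
    M = np.asarray(vectors, dtype=float)
    return np.linalg.matrix_rank(M, tol=tol) == M.shape[0]

print(are_independent([[1, 0, 0, 2], [0, 1, 0, 3]]))   # True
print(are_independent([[1, 2, 3, 4], [2, 4, 6, 8]]))   # False (second row = 2 * first)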
If m < n, you will have to do some operation on them (there are multiple possibilities: Gaussian elimination, orthogonalization, etc.; almost any transformation which can be used for solving equations will do) and check the result (e.g. Gaussian elimination => zero row or column, orthogonalization => zero vector, SVD => zero singular value).
However, note that this question is a bad question for a programmer to ask, and this problem is a bad problem for a program to solve. That's because every linearly dependent set of m ≤ n vectors has a linearly independent set of vectors arbitrarily nearby (i.e. the problem is numerically unstable).
I have been working on this problem these days.
Previously, I found some algorithms regarding Gaussian or Gauss-Jordan elimination, but most of those algorithms only apply to square matrices, not general matrices.
To apply for general matrix, one of the best answers might be this:
http://rosettacode.org/wiki/Reduced_row_echelon_form#MATLAB
You can find both pseudo-code and source code in various languages.
As for me, I translated the Python source code to C++, because the C++ code provided in the above link is somewhat complex and inappropriate to implement in my simulation.
Hope this will help you, and good luck ^^
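As an aside (not from the answer above): if you just want to experiment with reduced row echelon form from Python without porting anything, SymPy exposes it directly and works in exact rational arithmetic:

from sympy import Matrix

M = Matrix([[30, 10, 20, 0],
            [60, 20, 40, 0]])

rref_M, pivot_cols = M.rref()
print(rref_M)                      # the second row reduces to zeros -> rows are dependent
print(len(pivot_cols) == M.rows)   # False: rank is less than the number of rows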
If computing power is not a problem, probably the best way is to find singular values of the matrix. Basically you need to find eigenvalues of M'*M and look at the ratio of the largest to the smallest. If the ratio is not very big, the vectors are independent.
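A sketch of that test with NumPy; the cut-off of 1e12 on the largest-to-smallest ratio is an arbitrary assumption, so pick it to suit your data:

import numpy as np

def independent_by_svd(vectors, max_ratio=1e12):
    M = np.asarray(vectors, dtype=float)
    s = np.linalg.svd(M, compute_uv=False)   # singular values, largest first
    # A zero smallest singular value, or a huge ratio, signals (near) dependence.
    return s[-1] > 0 and s[0] / s[-1] < max_ratio

print(independent_by_svd([[1, 0, 0], [0, 1, 0]]))   # True
print(independent_by_svd([[1, 2, 3], [2, 4, 6]]))   # False (exact multiple)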
Another way to check that m row vectors are linearly independent, when put in a matrix M of size mxn, is to compute
det(M * M^T)
i.e. the determinant of an mxm square matrix. It will be zero if and only if M has some dependent rows. However, Gaussian elimination should in general be faster.
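And the same Gram-determinant test in NumPy; a sketch, where in floating point you compare against a small tolerance rather than exact zero:

import numpy as np

def independent_by_gram(vectors, tol=1e-12):
    M = np.asarray(vectors, dtype=float)
    gram = M @ M.T                    # the m x m Gram matrix M * M^T
    return abs(np.linalg.det(gram)) > tol

print(independent_by_gram([[1, 0, 0, 1], [0, 1, 0, 1]]))   # True (det = 3)
print(independent_by_gram([[1, 0, 0, 1], [2, 0, 0, 2]]))   # False (det = 0)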
Sorry man, my mistake...
The source code provided in the above link turns out to be incorrect; at least the Python code I have tested, and the C++ code I transformed from it, do not generate the right answer all the time (although for the example in the above link, the result is correct :) -- ).
To test the python code, simply replace the mtx with
[30,10,20,0],[60,20,40,0]
and the returned result would be like:
[1,0,0,0],[0,1,2,0]
Nevertheless, I have found a way out of this. This time I translated the Matlab source code of the rref function to C++. You can run Matlab and use the "type rref" command to get the source code of rref.
Just note that if you are working with some really large or really small values, make sure to use the long double datatype in C++. Otherwise, the result will be truncated and inconsistent with the Matlab result.
I have been conducting large simulations in ns2, and all the observed results are sound.
Hope this will help you and anyone else who has encountered the problem...
A very simple way, that is not the most computationally efficient, is to simply remove random rows until m=n and then apply the determinant trick.
m < n: remove rows (make the vectors shorter) until the matrix is square, and then
m = n: check if the determinant is 0 (as you said)
m > n (the number of vectors is greater than their length): they are linearly dependent (always).
The reason, in short, is that any solution to the system of m x n equations is also a solution to the n x n system of equations (you're trying to solve Av=0). For a better explanation, see Wikipedia, which explains it better than I can.
