Normalize to scale - math

I have a 2-D array of data (C), where C(:,1) has values corresponding to C(:,2). C(:,2) varies from 0.0001:0.0001:1, i.e. 10,000 values. I need to calculate d(log(C(i,1))) / d(log(C(i,2))), which I do by simply calculating log(C(i,1)) / log(C(i,2)). But as C(i,2) approaches 1, the denominator approaches zero and the quotient shoots up. One way to keep this in check would be to normalize it using a parameter, but I'm not sure how to do that. Does anyone have an idea about this?

Since this is discrete differentiation, the answer is bound to be a little inelegant.
You're interested in the derivative d(log(C(i,1))) / d(log(C(i,2)))
=∆(log(C(i,1))) / ∆(log(C(i,2)))
=(log(C(i+1,1)) - log(C(i,1))) / (log(C(i+1,2)) - log(C(i,2)))
which is tractable. The denominator does not go to zero; as C(i,2) approaches 1 it approaches the step size (0.0001), because log(1) - log(0.9999) ≈ 0.0001.
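The difference quotient can be sketched in a few lines. This is a minimal illustration in Python rather than the asker's MATLAB; the function name loglog_slope and the test data y = x**3 are assumptions made for the example, chosen because the exact log-log slope of a cubic is 3.

```python
import math

def loglog_slope(x, y):
    """Forward-difference estimate of d(log y)/d(log x) between neighbouring samples."""
    slopes = []
    for i in range(len(x) - 1):
        d_log_y = math.log(y[i + 1]) - math.log(y[i])
        d_log_x = math.log(x[i + 1]) - math.log(x[i])  # difference of logs, not log itself
        slopes.append(d_log_y / d_log_x)
    return slopes

# Hypothetical data on the grid 0.0001:0.0001:1 from the question, with y = x**3
x = [k * 0.0001 for k in range(1, 10001)]
y = [v ** 3 for v in x]
slopes = loglog_slope(x, y)
```

With N samples this yields N - 1 slope estimates, and none of them blows up near x = 1 because the denominator is a difference of logs.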

Related

Looking for good scale factor for converting log to 8.8 fixed point

I have a range of numbers in (0, 1]
I would like to take the natural log of these numbers, and then store as 8.8
fixed point.
My formula is K*ln(x) + (1<<16)
but I am not sure what the best value is for K.
My thinking is that if x doubles, then ln(x) increases by ln(2), so the fixed point value should increase by 1 in fixed point (i.e. 256)
So, this would mean K = 256/ln(2)
Does this make sense?
As x approaches 0, ln(x) will diverge to negative infinity. So you are essentially trying to map an infinite domain to a finite range.
If you do so in a linear way, you have to cut off at some point. If you choose your cut-off at too low a value, you'll be wasting precision for the numbers you represent. If you choose too high a cut-off, too many values will be clamped to the minimal element of the range. Without knowledge about the distribution of the points, it will be very hard to guess a suitable balance here.
So perhaps you could apply a non-linear map instead of the linear one you proposed. Something like the exponential function? Which would mean you'd actually store x instead of ln(x). So I'd say if you want to store values from [0,1) in 16 bit without too much loss of information, you'd just use Q0.16, i.e. all the digits in the fractional part. For (0,1] you can either store 1 − x or do a special case for x = 1 so that you encode that as 0 instead. If you have Q8.8 numbers, you'd multiply your numbers by 2^8 = 256 first, but if you have access to the bit representation that multiplication would be a waste of time.
I guess you had a reason you'd want to store logarithms, so this answer may not be what you were hoping for. I don't see an easier way around the underlying problem, though, so you may have to reconsider some of your ideas.
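To make the cut-off trade-off concrete, here is a rough Python sketch of the linear mapping with K = 256/ln(2) from the question. The offset 0xFFFF (instead of the question's 1<<16, which does not fit in 16 bits), the clamping, and the name encode_log_q88 are assumptions made for illustration only.

```python
import math

K = 256 / math.log(2)  # scale from the question: doubling x adds 256 counts (1.0 in Q8.8)
OFFSET = 0xFFFF        # assumption: x = 1 maps to the largest unsigned 16-bit code

def encode_log_q88(x):
    """Map x in (0, 1] to an unsigned 16-bit code for K*ln(x), clamping small x to 0."""
    code = round(K * math.log(x)) + OFFSET
    return max(0, min(OFFSET, code))
```

Under these assumptions every x below roughly 2^-256 is clamped to code 0, which is the cut-off discussed above.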

Calculate derivative of an array with apache-commons-math

Good Morning,
I have an array with about 3000 double values and I need to find all local minima and maxima. For this I'm interested in the first and second derivatives. What's the best way to achieve this with Apache Commons Math? My trouble is that I'm starting directly from the array, not from a function like sin(x).
Thanks
With just an array you won't be able to find a min/max.
If the array was calculated from a known function, then you could differentiate it numerically (just calculate at X and X + epsilon, and divide the difference by epsilon, assuming that there's a single parameter that you're differentiating with respect to).
Alternatively, is the array actually the list of coefficients of a big polynomial? If so, then the same approach might work.
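The "calculate at X and X + epsilon" idea is a forward difference. A minimal sketch follows, written in Python rather than Java and not using Apache Commons Math; the name forward_difference and the default step 1e-6 are assumptions for illustration.

```python
import math

def forward_difference(f, x, eps=1e-6):
    """Approximate f'(x) as (f(x + eps) - f(x)) / eps."""
    return (f(x + eps) - f(x)) / eps

# Example with the known function sin(x), whose derivative at 0 is cos(0) = 1
slope_at_zero = forward_difference(math.sin, 0.0)
```

A smaller eps reduces truncation error but amplifies floating-point rounding error, so the step size is a compromise.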

Big O confusion: log2(N) vs log3(N)

Why is O(log2N) = O(log3N) ?
I don't understand this. Does big O not mean upper bound of something?
Isn't log2N bigger than log3N ? When I graph them, log2N is above log3N .
Big O doesn't deal with constant factors, and the difference between Logx(n) and Logy(n) is a constant factor.
To put it a little differently, the base of the logarithm basically just modifies the slope of a line/curve on the graph. Big-O isn't concerned with the slope of the curve on the graph, only with the shape of the curve. If you can get one curve to match another by shifting its slope up or down, then as far as Big-O notation cares, they're the same function and the same curve.
To try to put this in perspective, perhaps a drawing of some of the more common curve shapes would be useful:
As noted above, only the shape of a line matters though, not its slope. In the following figure:
...all the lines are straight, so even though their slopes differ radically, they're still all identical as far as big-O cares--they're all just O(N), regardless of the slope. With logarithms, we get roughly the same effect--each line will be curved like the O(log N) line in the previous picture, but changing the base of the logarithm will rotate that curve around the origin so you'll (again) have the same shape of line, but at different slopes (so, again, as far as big-O cares, they're all identical). So, getting to the original question, if we change bases of logarithms, we get curves that look something like this:
Here it may be a little less obvious that all that's happening is a constant change in the slope, but that's exactly the difference here, just like with the straight lines above.
It is because changing base of logarithms is equal to multiplying it by a constant. And big O does not care about constants.
log_a(b) = log_c(b) / log_c(a)
So to get from log2(n) to log3(n) you need to multiply it by 1 / log2(3), which equals log3(2).
In other words log2(n) = log3(n) / log3(2).
log3(2) is a constant and O(cn) = O(n), thus O (log2(n)) = O (log3(n))
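A quick numerical check of the change-of-base argument, as a small Python sketch (the helper name log_ratio and the sample values of n are assumptions for illustration): the ratio log2(n) / log3(n) should be the same constant, log2(3), for every n > 1.

```python
import math

def log_ratio(n):
    """Ratio log2(n) / log3(n) for n > 1."""
    return math.log(n, 2) / math.log(n, 3)

ratios = [log_ratio(n) for n in (10, 1_000, 1_000_000)]
constant = math.log(3, 2)  # log2(3) = 1 / log3(2)
```

Each entry of ratios matches constant up to floating-point rounding, no matter how large n gets.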
There are some good answers here already, so please read them too.
To understand why log2(n) is O(log3(n)) you need to understand two things.
1) What is meant by Big O notation. I suggest reading this: http://en.wikipedia.org/wiki/Big_O_notation If you understand this, you will know 2n and 16n+5 are both O(N).
2) How logarithms work. The difference between log2(N) and log10(N) will be a simple ratio, easily calculated if you want it as per luk32's answer.
Since logs at different bases differ only by a constant ratio, and Big O is indifferent to minor things like constant multiplying factors, you will often find O(logN) actually omits the base, because the choice of any constant base (e.g. 2, 3, 10, e) makes no difference in this context.
It depends on the context in which O notation is used. When you are using it in algorithmic complexity reasoning you are interested in the asymptotic behaviour of a function, ie how it grows/decreases when it tends to (plus or minus) infinity (or another point of accumulation).
Therefore whereas f(n) = 3n is always less than g(n) = 1000n they both appear in O(n) since they grow linearly (according to their expressions) asymptotically.
The same reasoning applies to the logarithm case that you posted, since logarithms with different bases differ by a constant factor but share the same asymptotic behaviour.
Changing context, if you were interested in computing the exact performance of an algorithm, given that your estimates are exact and not approximate, you would of course prefer the lower one. In general, all computational complexity comparisons are approximations and are thus done via asymptotic reasoning.

Normalize vector by zero

I am working on designing a new sensor, and so I have a vector of measured values and a vector of truth values. To represent error, it's simply measured - truth. Since there's a lot of variation in the truth, I would like to represent the normalized error. My initial thought would be error./truth to get percent error, but there are many cases where my truth value is zero! Can anyone think of a better way to represent the normalized data while avoiding the divide-by-zero? I'm working in Matlab, though the question is a bit language-agnostic as well.
PS, feel free to push this to another stackexchange if you think it's better suited
Try error = (measured-truth)/norm2(truth) for each vector,
where norm2() is the Frobenius norm (for a vector, the Euclidean norm):
norm2(x) = SQRT( SUM( x[i]^2, i=1..N ) )
This can only fail if all the values of truth are zero. You can mitigate this by adding a small positive number like 1e-12 to the norm, or by skipping the division when the norm is less than a threshold number.
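As a rough sketch of this suggestion (in Python rather than MATLAB; the function name normalized_error and the default eps = 1e-12 are assumptions for illustration), the error vector is divided by the Euclidean norm of truth plus a small guard term:

```python
import math

def normalized_error(measured, truth, eps=1e-12):
    """Return (measured - truth) / (norm2(truth) + eps), element by element."""
    norm = math.sqrt(sum(t * t for t in truth))
    # eps keeps the division defined even when every truth value is zero
    return [(m - t) / (norm + eps) for m, t in zip(measured, truth)]
```

For example, normalized_error([4.0, 4.0], [3.0, 4.0]) divides the error [1.0, 0.0] by a norm of 5. When truth is all zeros the result is finite but very large, so a threshold test may be preferable in that case.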
I'd suggest separating the results whose truth vector is zero (or smaller than 10e-6, for example) from those whose truth vector is non-zero. You can't treat both by the same means (since you can't normalize by a zero truth vector), so you should define what to do in that case.
I can't suggest anything specific because I don't know the problem statement; you will have to define for yourself how to deal with it. If you post your problem here, I hope we can help you.

svmlib scaling vs. pyml normalization, scaling, and translation

What is the proper way to normalize feature vectors for use in a linear-kernel SVM?
Looking at LIBSVM, it looks like it's done by just rescaling each feature to a single standard upper/lower range. However, it doesn't seem like PyML provides a way to scale the data this way. Instead, there are options to normalize the vectors by their length, shift each feature value by its mean while rescaling by the standard deviation, etc.
I am dealing with a case when most features are binary, except a few that are numeric.
I am not an expert in this, but I believe centering and scaling each feature vector by subtracting its mean and dividing thereafter by the standard deviation is a typical way to normalize feature vectors for use with SVMs. In R, this can be done with the scale function.
Another way is to transform each feature vector to the [0,1] range:
(x - min(x)) / (max(x) - min(x))
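Both rescalings can be sketched in Python as follows. This is an illustration, not PyML or LIBSVM code; the function names are assumptions, the sample standard deviation is used on the assumption that this matches R's scale, and each function assumes the feature is not constant (otherwise the divisor would be zero).

```python
import statistics

def standardize(values):
    """Center a feature on its mean and divide by its sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # assumes at least two distinct values
    return [(v - mean) / sd for v in values]

def min_max(values):
    """Rescale a feature to the [0, 1] range."""
    lo, hi = min(values), max(values)  # assumes hi > lo
    return [(v - lo) / (hi - lo) for v in values]
```

Either function is applied per feature (per column of the data matrix), not per sample.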
Maybe some features could benefit from a log-transformation if the distribution is very skewed, but this would change the shape of the distribution as well and not only "move" it.
I am not sure what you gain in an SVM-setting by normalizing the vectors by their L1 or L2 norm like PyML does with its normalize method. I guess binary features (0 or 1) don't need to be normalized.
