In Python, the function math.log(1000, 10) returns 2.9999999998 or some such approximate value (nearly every third integer does that).
Which, firstly, is kind of messed up, even though I imagine there's not much to be done about it (except divisibility tests).
And secondly, it's not the value I want, of course. How should I proceed? Casting to int will clearly return 2 and not 3... So what method is used to round to the nearest int? In this case and in general, please.
Someone removed his/her answer before I could accept it, so I'm writing my own, which is no more than a summary.
Two options that I liked:
In this particular case, since the operation was math.log(1000, 10), it could be replaced with math.log10(1000) which shows much greater precision.
In a more general case, round(math.log(1000, 10)) will round 2.999... to the integer 3, which is closer to what was asked.
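For example, a quick check of both options (the exact digits printed may vary slightly):

import math

print(math.log(1000, 10))         # slightly under 3, e.g. 2.9999999999999996, since it is computed as log(1000)/log(10)
print(math.log10(1000))           # 3.0 - the dedicated base-10 log is exact here
print(round(math.log(1000, 10)))  # 3 - round() gives the nearest integer in the general case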
Problem
I want to find
The first root
The first local minimum/maximum
of a black-box function in a given range.
The function has the following properties:
It's continuous and differentiable.
It's a combination of constant and periodic functions. All periods are known.
(It's better if it can be done with weaker assumptions.)
What is the fastest way to get the root and the extremum?
Do I need more assumptions or bounds of the function?
What I've tried
I know I can use a root-finding algorithm. What I don't know is how to find the first root efficiently.
It needs to be fast enough to run within a few milliseconds with a precision of 1.0 over a range of 1.0e+8, which is the problem.
Since the range could be quite large and it should be precise enough, I can't brute-force it by checking all the possible subranges.
I considered the bisection method, but it's too slow for finding the first root if the function has only one root in a large range, since every subrange would have to be checked.
It's preferable if the solution is in Java, but any similar language is fine.
Background
I want to calculate when an arbitrary celestial object reaches a certain height.
It's a configuration-defined virtual object, so I can't assume anything about the object.
It's not easy to get either an analytical solution or a simple approximation because various coordinates are involved.
I decided to find a numerical solution for this.
For a general black box function, this can't really be done. Any root finding algorithm on a black box function can't guarantee that it has found all the roots or any particular root, even if the function is continuous and differentiable.
The property of being periodic gives a bit more hope, but you can still have periodic functions with infinitely many roots in a bounded domain. Given that your function relates to celestial objects, this isn't likely to happen. Assuming your periodic functions are sinusoidal, I believe you can get away with checking subranges on the order of one-quarter of the shortest period (out of all the periodic components).
Maybe try Brent's Method on the shortest quarter period subranges?
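A rough sketch of that idea in Python (SciPy's brentq does the bracketing step; the function, range, and period below are made up for illustration):

import math
from scipy.optimize import brentq

def first_root(f, a, b, shortest_period):
    # Scan quarter-period subranges left to right; refine the first sign change with Brent's method.
    step = shortest_period / 4.0
    x0, f0 = a, f(a)
    while x0 < b:
        x1 = min(x0 + step, b)
        f1 = f(x1)
        if f0 == 0.0:
            return x0
        if f0 * f1 < 0.0:              # sign change, so a root lies in [x0, x1]
            return brentq(f, x0, x1)
        x0, f0 = x1, f1
    return None                        # no sign change found in [a, b]

# Example: first root of sin(x) - 0.5 in [0, 20], shortest period 2*pi
print(first_root(lambda x: math.sin(x) - 0.5, 0.0, 20.0, 2.0 * math.pi))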
Another approach would be to apply your root finding algorithm iteratively. If your range is (a, b), then apply your algorithm to that range to find a root at say c < b. Then apply your algorithm to the range (a, c) to find a root in that range. Continue until no more roots are found. The last root you found is a good candidate for your minimum root.
A black-box function over an arbitrary range? You cannot even be sure it has a continuous domain over that range. What kind of solutions are you looking for: natural numbers, integers, real numbers, complex? These are all questions that greatly impact the answer.
So the first thing should be determining what kind of number you accept as the result.
Second is having some kind of protection against limits of the function that will try to blow up your calculations as the function heads towards plus or minus infinity.
Since we are touching on limits, the function could also edge towards zero and look like a solution without ever actually touching 0 and becoming one. Whether that counts depends on your margin of error: how close something has to be to be considered good enough.
I think your simplest-to-implement bet for real-number solutions (I assume those) is to take an interval and apply this divide-and-conquer algorithm (a rough sketch follows the list):
Take the lower and upper bounds and the middle value (or an approximate middle value if the bounds have non-terminating decimals).
Try to calculate the function at all 3 points, with some kind of protection against infinities.
Remember all 3 values in an array together with their results (3 value/result pairs).
Remember the current best value (the one closest to a solution) in a separate variable (a pair of value and result).
STEP FORWARD: repeat the above for the range from the 1st to the 2nd value and for the range from the 2nd to the 3rd value.
Pick the new value/result pair that is closest to a solution.
Clear the old value/result pairs and replace them with the new ones from this iteration, while keeping the overall best value/result pair.
Repeat the above to whatever precision you want, and watch the memory explode with each iteration; keep in mind you are going to have exponential growth of values there. It can be improved if you, say, take one interval, go as deep as you want, remember the best value/result pair, then delete all other memory and dig into the next interval.
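A minimal sketch of one simplified variant of this subdivision idea (it keeps only one half per step instead of all of them, so memory stays constant; names and the example function are illustrative only):

def closest_to_zero(f, lo, hi, depth=40):
    # Repeatedly split [lo, hi] at its midpoint, descend into the half whose own midpoint
    # sample has the smaller |f|, and remember the best (x, f(x)) pair seen so far.
    # This only homes in on a root if the samples happen to lead toward one.
    best_x = min((lo, (lo + hi) / 2.0, hi), key=lambda x: abs(f(x)))
    for _ in range(depth):
        mid = (lo + hi) / 2.0
        left_mid = (lo + mid) / 2.0
        right_mid = (mid + hi) / 2.0
        if abs(f(left_mid)) <= abs(f(right_mid)):
            hi, candidate = mid, left_mid
        else:
            lo, candidate = mid, right_mid
        if abs(f(candidate)) < abs(f(best_x)):
            best_x = candidate
    return best_x, f(best_x)

print(closest_to_zero(lambda x: x * x - 2.0, 0.0, 10.0))  # approaches sqrt(2)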
I have been researching the log-sum-exp problem. I have a list of numbers stored as logarithms which I would like to sum and store in a logarithm.
the naive algorithm is
import math

def naive(listOfLogs):
    return math.log10(sum(10**x for x in listOfLogs))
many websites including:
logsumexp implementation in C?
and
http://machineintelligence.tumblr.com/post/4998477107/
recommend using
def recommend(listOfLogs):
    maxLog = max(listOfLogs)
    return maxLog + math.log10(sum(10**(x-maxLog) for x in listOfLogs))
aka
def recommend(listOfLogs):
    maxLog = max(listOfLogs)
    return maxLog + naive((x-maxLog) for x in listOfLogs)
What I don't understand is: if the recommended algorithm is better, why shouldn't we call it recursively?
Would that provide even more benefit?
def recursive(listOfLogs):
    maxLog = max(listOfLogs)
    return maxLog + recursive((x-maxLog) for x in listOfLogs)
While I'm asking: are there other tricks to make this calculation more numerically stable?
Some background for others: when you're computing an expression of the following type directly
ln( exp(x_1) + exp(x_2) + ... )
you can run into two kinds of problems:
exp(x_i) can overflow (x_i is too big), resulting in numbers that you can't add together
exp(x_i) can underflow (x_i is too small), resulting in a bunch of zeroes
If all the values are big, or all are small, we can divide by some exp(const) and add const to the outside of the ln to get the same value. Thus if we can pick the right const, we can shift the values into some range to prevent overflow/underflow.
The OP's question is, why do we pick max(x_i) for this const instead of any other value? Why don't we recursively do this calculation, picking the max out of each subset and computing the logarithm repeatedly?
The answer: because it doesn't matter.
The reason? Let's say x_1 = 10 is big, and x_2 = -10 is small. (These numbers aren't even very large in magnitude, right?) The expression
ln( exp(10) + exp(-10) )
will give you a value very close to 10. If you don't believe me, go try it. In fact, in general, ln( exp(x_1) + exp(x_2) + ... ) will be very close to max(x_i) if some particular x_i is much bigger than all the others. (As an aside, this functional form, asymptotically, actually lets you mathematically pick the maximum from a set of numbers.)
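For instance, a quick check (with natural log, matching the expression above):

import math

print(math.log(math.exp(10) + math.exp(-10)))  # roughly 10.000000002: the exp(-10) term barely registers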
Hence, the reason we pick the max instead of any other value is because the smaller values will hardly affect the result. If they underflow, they would have been too small to affect the sum anyway, because it would be dominated by the largest number and anything close to it. In computing terms, the contribution of the small numbers will be less than an ulp after computing the ln. So there's no reason to waste time computing the expression for the smaller values recursively if they will be lost in your final result anyway.
If you wanted to be really persnickety about implementing this, you'd divide by exp(max(x_i) - some_constant) or so to 'center' the resulting values around 1 to avoid both overflow and underflow, and that might give you a few extra digits of precision in the result. But avoiding overflow is much more important than avoiding underflow, because the former determines the result and the latter doesn't, so it's much simpler just to do it this way.
Not really any better to do it recursively. The problem's just that you want to make sure your finite-precision arithmetic doesn't swamp the answer in noise. By dealing with the max on its own, you ensure that any junk is kept small in the final answer because the most significant component of it is guaranteed to get through.
Apologies for the waffly explanation. Try it with some numbers yourself (a sensible list to start with might be [1E-5,1E25,1E-5]) and see what happens to get a feel for it.
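For example, with the naive and recommended implementations from the question:

import math

def naive(listOfLogs):
    return math.log10(sum(10**x for x in listOfLogs))

def recommend(listOfLogs):
    maxLog = max(listOfLogs)
    return maxLog + math.log10(sum(10**(x - maxLog) for x in listOfLogs))

logs = [1E-5, 1E25, 1E-5]  # these are the stored logarithms (base 10)
# naive(logs) raises OverflowError: 10**1E25 is far too large for a float
print(recommend(logs))     # 1e+25: the largest term completely dominates the sum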
As you have defined it, your recursive function will never terminate. That's because ((x-maxLog) for x in listOfLogs) still has the same number of elements as listOfLogs.
I don't think that this is easily fixable either, without significantly impacting either the performance or the precision (compared to the non-recursive version).
When I started reading about QR codes, every article I browsed showed a QR Code "exponents of α^x" table giving the values of specific powers of α. I am not sure how this table is created. Can anybody explain the logic behind it?
For reference the table can be found at http://www.matchadesign.com/_blog/Matcha_Design_Blog/post/QR_Code_Demystified_-_Part_4/#
(The zxing source code for this might help you.)
It would take a lot to explain all the math here. For the Reed-Solomon error correction, you need a Galois field of 256 elements (nothing fancy -- just a set of 256 things that have addition and exponentiation and such defined.)
This is defined not in terms of numbers, but in terms of polynomials whose coefficients are all 0 or 1. We work with polynomials with 8 coefficients -- conveniently these map to 8-bit values. While it's tempting to think of those values as numbers, they're really something different.
In fact, for addition and such to make sense, so that every operation lands you back on a value in the Galois field, all results are computed modulo an irreducible polynomial. (Skip what that means for now.)
To make operations faster, it helps to pre-compute what the powers of the polynomial "x" are in the field. This is alpha. You can think of this as "2", since the polynomial "x" is 00000010, though that's not entirely accurate.
So then you just compute the powers of x in the field. Because it's a field, you'll hit every non-zero element of the field this way. The sequence looks like the powers of two, which it happens to match for a short while, until the first "modulo" of the primitive polynomial takes effect. Multiplying by x is indeed still something like multiplying by 2, but that's a bit of a coincidence in this field, really.
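A small sketch of how such a table can be generated; QR codes use the primitive polynomial x^8 + x^4 + x^3 + x^2 + 1, i.e. 0x11D:

def build_qr_alpha_table():
    # Powers of alpha in GF(256), reduced modulo the QR primitive polynomial 0x11D.
    table = []
    value = 1                  # alpha^0
    for _ in range(256):
        table.append(value)
        value <<= 1            # multiply by x (i.e. by alpha)
        if value & 0x100:      # a degree-8 term appeared, so reduce it
            value ^= 0x11D
    return table

print(build_qr_alpha_table()[:10])  # [1, 2, 4, 8, 16, 32, 64, 128, 29, 58] - doubling until the first reduction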
So I'm just going to dive into this issue... I've got a heavily used web application that, for the first time in 2 years, failed doing an equality check on two doubles using the equality function a colleague said he'd also been using for years.
The goal of the function I'm about to paste in here is to compare two double values to 4 digits of precision and return the comparison results. For the sake of illustration, my values are:
Dim double1 As Double = 0.14625000000000002 ' The result of a calculation
Dim double2 As Double = 0.14625 ' A value that was looked up in a DB
If I pass them into this function:
Public Shared Function AreEqual(ByVal double1 As Double, ByVal double2 As Double) As Boolean
    Return (CType(double1 * 10000, Long) = CType(double2 * 10000, Long))
End Function
the comparison fails. After the multiplication and cast to Long, the comparison ends up being:
Return 1463 = 1462
I'm kind of answering my own question here, but I can see that double1 is within the precision of a double (17 digits) and the cast is working correctly.
My first real question is: If I change the line above to the following, why does it work correctly (returns True)?
Return (CType(CType(double1, Decimal) * 10000, Long) = _
CType(CType(double2, Decimal) * 10000, Long))
Doesn't Decimal have even more precision, thus the cast to Long should still be 1463, and the comparison return False? I think I'm having a brain fart on this stuff...
Secondly, if one were to change this function to make the comparison I'm looking for more accurate or less error prone, would you recommend changing it to something much simpler? For example:
Return (Math.Abs(double1 - double2) < 0.0001)
Would I be crazy to try something like:
Return (double1.ToString("N5").Equals(double2.ToString("N5")))
(I would never do the above, I'm just curious about your reactions. It would be horribly inefficient in my application.)
Anyway, if someone could shed some light on the difference I'm seeing between casting Doubles and Decimals to Long, that would be great.
Thanks!
What Every Computer Scientist Should Know About Floating-Point Arithmetic
Relying on a cast in this situation is error prone, as you have discovered - depending upon the rules used when casting, you may not get the number you expect.
I would strongly advise you to write the comparison code without a cast. Your Math.Abs line is perfectly fine.
Regarding your first question:
My first real question is: If I change the line above to the following, why does it work correctly (returns True)?
The reason is that the cast from Double to Decimal is losing precision, resulting in a comparison of 0.14625 to 0.14625.
When you use CType, you're telling your program "I don't care how you round the numbers; just make sure the result is this other type". That's not exactly what you want to say to your program when comparing numbers.
Comparing floating-point numbers is a pain, and I wouldn't ever trust a Round function in any language unless you know exactly how it behaves (e.g. sometimes it rounds .5 up and sometimes down, depending on the digit before it... it's a mess).
In .NET, I might actually use Math.Truncate() after multiplying out my double value. So, Math.Truncate(.14625 * 10000) (which is Math.Truncate(1462.5)) is going to equal 1462 because it gets rid of all decimal values. Using Truncate() with the data from your example, both values would end up being equal because 1) they remain doubles and 2) you made sure the decimal was removed from each.
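To illustrate both suggestions (shown here in Python since the effect is language-independent; math.trunc plays the role of Math.Truncate):

import math

double1 = 0.14625000000000002   # the result of a calculation
double2 = 0.14625               # the value looked up in a DB

# Tolerance comparison (the Math.Abs approach from the question):
print(abs(double1 - double2) < 0.0001)                               # True

# Truncation after scaling (the Math.Truncate approach): both scaled values
# lie between 1462 and 1463, so truncating makes them compare equal.
print(math.trunc(double1 * 10000) == math.trunc(double2 * 10000))    # True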
I actually don't think String comparison is very bad in this situation since floating point comparison is pretty nasty in itself. Granted, if you're comparing numbers, it's probably better to stick with numeric types, but using string comparison is another option.
I'm trying to write a program that will help someone study for the GRE math. As many of you may know, fractions are a big part of the test, and calculators aren't allowed. Basically what I want to do is generate four random numbers (say, 1-50) and either +-/* them and then accept an answer in fraction format. The random number thing is easy. The problem is, how can I 1) accept a fractional answer and 2) ensure that the answer is reduced all the way?
I am writing in ASP.NET (or jQuery, if that will suffice). I was pretty much wondering if there's some library or something that handles this kind of thing...
Thanks!
have a look at
http://www.geekpedia.com/code73_Get-the-greatest-common-divisor.html
http://javascript.internet.com/math-related/gcd-lcm-calculator.html
Since fractions are essentially divisions, you can check whether the answer is numerically correct by performing the division on the fraction entries that you're given.
[pseudocode]
if (answer.contains("/"))
    int a = parseInt(answer.substring(0, answer.indexOf("/")))
    int b = parseInt(answer.substring(answer.indexOf("/") + 1))
    if (a / (double) b == expectedAnswer)
        if (gcd(a, b) == 1)
            GOOD!
        else
            Not sufficiently reduced
    else
        WRONG!
To find out whether it's reduced all the way, write a GCD function and apply it to the numerator and denominator that the user supplied; the fraction is fully reduced exactly when that GCD is 1.
Learn Python and try the fractions module.
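For example, a small sketch of that approach (the function name and messages are just illustrative):

import math
from fractions import Fraction

def check_answer(text, expected):
    # Parse an "a/b" answer, check it equals the expected value, and check that it is fully reduced.
    num_str, _, den_str = text.partition("/")
    a, b = int(num_str), int(den_str)
    if Fraction(a, b) != expected:
        return "wrong"
    if math.gcd(a, b) != 1:
        return "correct, but not fully reduced"
    return "good"

print(check_answer("6/8", Fraction(3, 4)))   # correct, but not fully reduced
print(check_answer("3/4", Fraction(3, 4)))   # good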