Just learning AS3 for Flex. I am trying to do this:
var someNumber:String = "10150125903517628"; // this is the actual number I noticed the issue with
var result:String = String(Number(someNumber) + 1);
I've tried different ways of putting the expression together, and no matter what I seem to do the result is always equal to 10150125903517628 rather than 10150125903517629.
Anyone have any ideas? Thanks!
All numbers in JavaScript/ActionScript are effectively double-precision IEEE-754 floats. These use a 64-bit binary value to represent your decimal, and have a precision of roughly 16 or 17 significant decimal digits.
You've run up against the limit of that format with your 17-digit number. The internal binary representation of 10150125903517628 is no different from that of 10150125903517629, which is why you're not seeing any difference when you add 1.
If, however, you add 2, then you will (should?) see the result as 10150125903517630, because that's enough of a "step" that the internal binary representation will change.
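You can reproduce this in any language that uses IEEE-754 doubles; here is a minimal C++ sketch of the same arithmetic (AS3's Number is the same 64-bit format):

#include <cstdio>

int main() {
    double n = 10150125903517628.0; // above 2^53, so adjacent doubles are 2 apart here
    printf("%.0f\n", n + 1);        // prints 10150125903517628: the +1 is lost to rounding
    printf("%.0f\n", n + 2);        // prints 10150125903517630: a big enough step to change the bits
    return 0;
}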
There are a lot of questions on rounding that I have looked at, but they all involve rounding a number to its nearest whole, or to a certain number of decimal places. What I want to do is simply convert a string to a double without any added digits to the right of the decimal point. Here is my code and result as of now:
Convert the string 0.78240 to a double, which should be 0.78240 but instead is 0.78239999999999998 when I look at it in the debugger.
The string value is a QString and is converted to a double simply using the toDouble() function.
I don't understand how or where these extra digits are coming from, but any help on converting from QString to double directly would be greatly appreciated!
The extra digits are there because you are converting a decimal real number to binary floating point.
Unlike real numbers, floating-point representations have finite resolution and finite range, and binary floating-point values do not exactly coincide with all (or even most) decimal real values.
The simple fact is that binary floating point cannot exactly represent the decimal value 0.78240; your debugger is showing you all the available digits after round-tripping the binary value back to decimal.
It is not necessarily a problem, because the error is vanishingly small compared to the magnitude of the value, and in any event the original 0.78240 was no doubt itself an approximation of some real-world value. Both are approximations, one binary and one decimal.
The issue is normally dealt with at presentation rather than representation. Your debugger necessarily shows the full precision of the internal representation (you would not want it any other way in a debugger), but the standard means of presenting such a value limits itself to a small or caller-defined number of decimal places. Presented to even 15 decimal places, this value will correctly appear as 0.782400000000000 (by default, standard output methods will show just 0.7824).
Any double value presented at 15 significant decimal figures or fewer will display as expected; for a float this reduces to just 6 significant figures. I imagine your debugger is displaying more digits than can accurately be represented in an IEEE 754 64-bit FP (double) value because internally the x86 FPU uses an 80-bit representation.
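To illustrate the presentation point with a minimal C++ sketch (Qt gives you the same control via QString::number's format and precision arguments):

#include <cstdio>

int main() {
    double d = 0.78240;
    printf("%.17f\n", d); // 0.78239999999999998: full round-trip precision, what the debugger shows
    printf("%.15f\n", d); // 0.782400000000000: within double's 15 reliable digits
    printf("%.4f\n", d);  // 0.7824: a typical presentation choice
    return 0;
}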
You are quite literally sweating the small stuff.
One place where this difference in representation does matter is in financial applications. For those, it is common to use decimal floating point, normally with many more significant figures than double can provide. However, decimal floating point is not normally implemented in hardware, so it is much slower. Moreover, decimal floating point is not directly supported in most programming languages and requires library support. C# is an example of a language with built-in support for decimal floating point; its decimal type is good for 28 significant figures.
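In languages without a built-in decimal type (C++ among them), a common workaround for money is a scaled integer: store the smallest currency unit, so values and sums stay exact. A minimal sketch of the idea:

#include <cstdio>
#include <cstdint>

int main() {
    int64_t priceCents = 1999;           // $19.99, stored exactly as an integer
    int64_t totalCents = priceCents * 3; // exact integer arithmetic, no rounding error
    printf("$%lld.%02lld\n", (long long)(totalCents / 100), (long long)(totalCents % 100)); // $59.97
    return 0;
}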
I'm working with Scilab 5.5.2 and when using the format command I can display at most 25 digits of a number. Is there a way to somehow display more than this number?
Scilab operates with double-precision floating-point numbers; it does not support variable-precision arithmetic. Double precision means a relative error of %eps, which is 2^-52, approximately 2e-16.
This means you can't even get 25 correct decimal digits: when using format(25) you get garbage at the end. For example,
format(25); sqrt(3)
returns 1.732050807568877 1931766
I separated the last 7 digits here because they are wrong; the correct value of sqrt(3) begins with
1.732050807568877 2935274
Of course, if you don't mind the digits being wrong, you can have as many as you want:
strcat([sprintf('%.15f', sqrt(3)), "1111111111111111111111111111111"])
returns 1.7320508075688771111111111111111111111111111111.
But if you want arbitrary precision for real numbers, Scilab is not the right tool for the job (correction: phuclv pointed out the Multiple Precision Arithmetic Toolbox, which might work for you). Among free software packages, the mpmath Python library implements arbitrary-precision real arithmetic: it can be used directly or via SageMath or SymPy. Commercial packages (Matlab, Maple, Mathematica) support variable precision too.
As for Scilab, I recommend using formatted print commands such as fprintf or sprintf, because they actually care about the output being meaningful. Example: printf('%.25f', sqrt(3)) returns
1.7320508075688772000000000
with garbage replaced by zeros. The last nonzero digit is still off by 1, but at least it's not meaningless.
Scilab uses the double-precision floating-point type, which has 53 bits of mantissa and can only be precise to ~15-17 digits. There's no point in printing digits beyond that.
If 25 digits of accuracy are needed, then you can use a quadruple-precision or double-double arithmetic library like the ATOMS Multiple Precision Arithmetic Toolbox.
If you need even more precision, then the only way is to use an arbitrary-precision library like mpscilab, Xnum, etc...
I do a simple:
latitude:String = String.fromCString(UnsafePointer(sqlite3_column_text(statement, 11)))!
The value in the Database is "real".
In the database I have
51.234183426424316 (verified using Firefox's SQLite Manager)
With the above I get in my String only:
51.2341834264243
(the last two digits are missing, which is not acceptable when working with coordinates)
Any explanations? Solutions?
SQLite stores such numbers as 64-bit IEEE floating-point numbers, which have a significand precision of 53 bits, corresponding to about 15-17 decimal digits.
How to format such a number for display is a different question.
If you want to have control over it, get the original value with sqlite3_column_double(), and convert it to a string yourself.
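A minimal C++ sketch of that approach, assuming (as in the question) that statement is a prepared sqlite3_stmt* and column 11 holds the latitude:

#include <cstdio>
#include <sqlite3.h>

void printLatitude(sqlite3_stmt *statement) {
    // Read the raw double instead of letting SQLite render it as text.
    double lat = sqlite3_column_double(statement, 11);
    char buf[32];
    snprintf(buf, sizeof buf, "%.17g", lat); // 17 significant digits round-trip a double exactly
    printf("%s\n", buf);                     // e.g. 51.234183426424316
}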
(And you are complaining about a difference that is smaller than the wavelength of visible light ...)
Can anyone please explain why this gives different outputs?
round(1.49999999999999)
1
round(1.4999999999999999)
2
I have read the round documentation, but it does not mention anything about this there.
I know that R represents numbers in binary form, but why does adding two extra 9's change the result?
Thanks.
1.4999999999999999 can't be represented exactly as a double, so it gets rounded to the nearest representable value, which is 1.5.
Now, when you apply round(), the result is 2.
Put those two numbers into variables and then print them; you'll see they are different.
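R's numeric type is an IEEE-754 double, so the same parse-then-round effect can be shown in a minimal C++ sketch:

#include <cstdio>
#include <cmath>

int main() {
    double a = 1.49999999999999;   // nearest double is still strictly below 1.5
    double b = 1.4999999999999999; // nearest double is exactly 1.5
    printf("%.17g %.17g\n", a, b); // 1.4999999999999900 1.5
    printf("%g %g\n", std::round(a), std::round(b)); // 1 2
    return 0;
}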
Computers don't store these kinds of numbers with this exact value (they don't use decimal numbers internally).
I have never used R, so I don't know if this is the issue, but in other languages such as C/C++ a number like 1.4999999999999999 is represented by a float or a double.
Since these have finite precision, you cannot represent something like 1.4999999999999999 exactly. It might be the case that 1.4999999999999999 actually gets stored as 1.5 instead, due to the limitations of floating-point precision.
I have a method that deals with some geographic coordinates in .NET, and I have a struct that stores a coordinate pair such that if 256 is passed in for one of the coordinates, it becomes 0. However, in one particular instance a value of approximately 255.99999998 is calculated, and thus stored in the struct. When it's printed by ToString(), it becomes 256, which should not happen: 256 should be 0. I wouldn't mind if it printed 255.99999998, but the fact that it prints 256 when the debugger shows 255.99999998 is a problem. Having it both store and display 0 would be even better.
Specifically, there's an issue with comparison: 255.99999998 is sufficiently close to 256 that it should be treated as equal to it. What should I do when comparing doubles? Use some sort of epsilon value?
EDIT: Specifically, my problem is that I take a value, perform some calculations, then perform the opposite calculations on that number, and I need to get back the original value exactly.
This sounds like a problem with how the number is printed, not how it is stored. A double has about 15 significant figures, so it can tell 255.99999998 from 256 with precision to spare.
You could use the epsilon approach, but the epsilon is typically a fudge to get around the fact that floating-point arithmetic is lossy.
You might consider avoiding binary floating-point altogether and using a nice Rational class.
The calculation above would probably have come out to exactly 256 if you were doing lossless arithmetic, as you would with a Rational type.
Rational types often go by the name of a Ratio or Fraction class, and are fairly simple to write.
Here's one example.
Here's another
Edit....
To understand your problem, consider that when the decimal value 0.01 is converted to a binary representation it cannot be stored exactly in finite memory. The hexadecimal representation of this value is 0.028F5C28F5C... where "28F5C" repeats infinitely. So even before doing any calculations, you lose exactness just by storing 0.01 in binary format.
Rational and Decimal classes are used to overcome this problem, albeit with a performance cost. Rational types avoid it by storing a numerator and a denominator to represent your value. Decimal types use a binary-coded decimal format, which can be lossy in division but can store common decimal values exactly.
For your purpose I still suggest a Rational type.
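A minimal sketch of such a Rational type, in C++ for concreteness (the same shape ports directly to C#):

#include <numeric> // std::gcd (C++17)

// Exact arithmetic on a numerator/denominator pair.
// Assumes den > 0 and ignores overflow for brevity.
struct Rational {
    long long num, den;

    Rational(long long n, long long d) : num(n), den(d) {
        long long g = std::gcd(n, d); // reduce to canonical form
        num /= g;
        den /= g;
    }

    Rational operator+(const Rational &o) const {
        return Rational(num * o.den + o.num * den, den * o.den);
    }
    Rational operator*(const Rational &o) const {
        return Rational(num * o.num, den * o.den);
    }
    bool operator==(const Rational &o) const {
        return num == o.num && den == o.den; // exact: no epsilon needed
    }
};

Rational(1, 100) represents 0.01 exactly, which no binary double can do.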
You can choose format strings which should let you display as much of the number as you like.
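For example, in C++ terms (a .NET format string gives the same control):

#include <cstdio>

int main() {
    double d = 255.99999998;
    printf("%g\n", d);   // 256: the default 6 significant digits round up
    printf("%.8f\n", d); // 255.99999998: enough places to show the stored value
    return 0;
}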
The usual way to compare doubles for equality is to subtract them and see if the absolute value is less than some predefined epsilon, maybe 0.000001.
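A minimal sketch of that comparison in C++ (the 0.000001 tolerance is just the example value above; choose it to suit your domain):

#include <cmath>

// True when a and b differ by less than the given tolerance.
inline bool nearlyEqual(double a, double b, double eps = 1e-6) {
    return std::fabs(a - b) < eps;
}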
You have to decide for yourself on a threshold under which two values are equal. This amounts to using so-called fixed-point numbers (as opposed to floating point). Then you have to perform the rounding manually.
I would go with some unsigned type of known size (e.g. uint32 or uint64 if they're available; I don't know .NET) and treat it as a fixed-point number type mod 256.
E.g.:
#include <math.h>   // fmod
#include <stdint.h> // uint32_t

typedef uint32_t fixed;

// Wrap into [0, 256) and scale so the low 24 bits hold the fraction.
inline fixed to_fixed(double d)
{
    return (fixed)(fmod(d, 256.) * (double)(1 << 24));
}

inline double to_double(fixed f)
{
    return (double)f / (double)(1 << 24);
}
or something more elaborate to suit a rounding convention (to nearest, to lower, to higher, to odd, to even). The highest 8 bits of fixed hold the integer part, and the 24 lower bits hold the fractional part. Absolute precision is 2^-24.
Note that adding and subtracting such numbers naturally wraps around at 256. For multiplication, you need to be careful, as sketched below.
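A sketch of what that multiplication might look like, reusing the fixed typedef above:

#include <stdint.h>

// The product of two 8.24 values is a 16.48 value, so use a 64-bit
// intermediate; shifting right by 24 returns to 8.24, and the cast back
// to 32 bits wraps the integer part mod 256, consistent with the type.
inline fixed fixed_mul(fixed a, fixed b)
{
    return (fixed)(((uint64_t)a * (uint64_t)b) >> 24);
}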