What is the canonical way of converting a string storing a number in scientific notation into an integer?
from
"1e6"
to
1000000
As for the reverse process, converting an integer to a string in scientific notation, I understand I can use the @sprintf macro. If anyone knows the exact format that achieves exactly the reverse - a lowercase e, no extra trailing zeros (like 1.00e6), and no leading zeros in the exponent (like 1e08) - I would appreciate it being included for completeness.
The conversion from string to integer can be achieved via floats like this:
julia> Int(parse(Float64, "1e6"))
1000000
if you know that the number will fit into Int64, or like this
julia> BigInt(parse(BigFloat, "1e6"))
1000000
for larger numbers.
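If you want a single entry point that handles both cases, here is a hypothetical helper (my own sketch, not a standard function): it parses through BigFloat, checks that the result really is an integer, and narrows to Int only when it fits. For very large exponents you would need to raise the BigFloat precision first.
function sci_to_int(s::AbstractString)   # sketch, not a standard function
    x = parse(BigFloat, s)               # exact while the digits fit in BigFloat's precision (256 bits by default)
    isinteger(x) || throw(ArgumentError("not an integer: $s"))
    n = BigInt(x)
    return typemin(Int) <= n <= typemax(Int) ? Int(n) : n   # Int if it fits, else BigInt
end
sci_to_int("1e6")     # 1000000 (an Int)
sci_to_int("1e30")    # returned as a BigInt, too large for Int64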
For the reverse process, the default with @sprintf (from the Printf standard library) would be the following:
julia> @sprintf("%.0e", 1_000_000)
"1e+06"
However, you get a + after the e, and at least two digits are displayed in the exponent (both are standard behavior for this kind of conversion across languages). Also note that this process will lead to rounding, e.g.:
julia> @sprintf("%.0e", 1_000_001)
"1e+06"
I deal with lots of mathematical expressions in a certain Julia script and would like to know if storing such a formula as a String is ok, or whether using the Symbol data type is better. Thinking about scalability and keeping memory requirements to a minimum. Thanks!
Update: the application involves a machine learning model. Ideally, it should be applicable to big data too, hence the need for scalability.
In a string, each character is stored using its number of code units, e.g. 1 byte per character for ASCII. The same is true for the characters of a Symbol. So that is a wash; do what fits your use best, probably Symbols, since you are manipulating expressions.
An expression like :(x + y) is stored as a list of Any, with space allocated according to the sizeof each item in the expression.
In an expression like :(7 + 4 * 9) versus a string like "7 + 4 * 9" there are two conflicting issues. First, 7 is stored as 1 byte in the string but 8 bytes in the expression, since 64-bit Ints are in play. On the other hand, each space of whitespace takes up 1 byte in the string but uses no memory in the expression. And a number like 123.123456789 takes up 13 bytes in the string and 8 in the expression (64-bit floats).
I think that, again, this is close to being even, and depends on the specific strings you are parsing. You could, as you work with the program, store both, compare memory usage of the resulting arrays, and drop one type of storage if you feel you should.
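If you want to act on that suggestion, one quick and purely illustrative way to compare the two representations is Base.summarysize (the sample formula here is made up):
s  = "7 + 4 * 9"
ex = Meta.parse(s)            # :(7 + 4 * 9), an Expr
Base.summarysize(s)           # total bytes used by the String
Base.summarysize(ex)          # total bytes used by the Expr, including its arguments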
I am trying to read a .tif file in Julia as a floating-point array. With the FileIO & ImageMagick packages I am able to do this, but the array that I get is of type Array{ColorTypes.Gray{FixedPointNumbers.Normed{UInt8,8}},2}.
I can convert this FixedPoint array to a Float32 array by multiplying it by 255 (because of the UInt8), but I am looking for a function that does this for any type of FixedPointNumber (i.e. reinterpret() or convert()).
using FileIO
# Load the tif
obj = load("test.tif");
typeof(obj)
# Convert to Float32-Array
objNew = real.(obj) .* 255
typeof(objNew)
The output is
julia> using FileIO
julia> obj = load("test.tif");
julia> typeof(obj)
Array{ColorTypes.Gray{FixedPointNumbers.Normed{UInt8,8}},2}
julia> objNew = real.(obj) .* 255;
julia> typeof(objNew)
Array{Float32,2}
I have been looking in the docs for quite a while and have not found a function to convert a given FixedPoint array to a floating-point array without multiplying it by the maximum value of the underlying integer type.
Thanks for any help.
Edit:
I made a small gist to see if the solution by Michael works, and it does. Thanks!
Note: I don't know why, but the real.(obj) .* 255 code does not work (see the gist).
Why not just Float32.()?
using ColorTypes, FixedPointNumbers  # Normed lives in FixedPointNumbers
a = Gray.(convert.(Normed{UInt8,8}, rand(5,6)));
typeof(a)
#Array{ColorTypes.Gray{FixedPointNumbers.Normed{UInt8,8}},2}
Float32.(a)
The short answer is indeed the one given by Michael, just use Float32.(a) (for grayscale). Another alternative is channelview(a), which performs channel separation and thus also strips the color information from the array. In the latter case you won't get a Float32 array, because your image is stored with 8 bits per pixel; instead you'll get an N0f8 array (N0f8 = FixedPointNumbers.Normed{UInt8,8}). You can read about those numbers here.
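To make the two options concrete, here is a small sketch; it assumes the ImageCore package (which provides channelview) is available alongside ColorTypes and FixedPointNumbers:
using ColorTypes, FixedPointNumbers, ImageCore   # channelview comes from ImageCore
a = Gray.(N0f8.(rand(5, 6)))   # a 5x6 grayscale image stored as N0f8
Float32.(a)                    # Array{Float32,2} with values in [0, 1]
channelview(a)                 # drops the Gray wrapper but keeps the N0f8 storage (a view, no copy)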
Your instinct to multiply by 255 is natural, given how other image-processing frameworks work, but Julia has made some effort to be consistent about "meaning" in ways that are worth taking a moment to think about. For example, in another programming language just changing the numerical precision of an array:
img = uint8(255*rand(10, 10, 3)); % an 8-bit per color channel image
figure; image(img)
imgd = double(img); % convert to double-precision, but don't change the values
figure; image(imgd)
produces a surprising result: the second figure displays as an all-white image.
That second, all-white image represents saturation. In this other language, "5" means two completely different things depending on whether it's stored in memory as a UInt8 or a Float64. I think it's fair to say that under any normal circumstances a user of a numerical library would call this a bug, and a very serious one at that, yet somehow many of us have grown to accept it in the context of image processing.
These new types arise because in Julia we've gone to the effort to implement new numerical types (FixedPointNumbers) that act like fractional values (e.g., between 0 and 1) but are stored internally with the same bit pattern as the "corresponding" UInt8 (the one you get by multiplying by 255). This allows us to work with 8-bit data and yet allow values to always be interpreted on a consistent scale (0.0=black, 1.0=white).
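A quick way to see the "same bit pattern, different meaning" point, sketched directly with FixedPointNumbers (to the best of my knowledge, reinterpret is how the package moves between a Normed value and its raw integer):
using FixedPointNumbers
x = N0f8(0.5)              # a fractional value in [0, 1]
reinterpret(x)             # 0x80 -- the raw UInt8 backing it (128, i.e. round(0.5 * 255))
Float32(x)                 # 0.5019608f0, i.e. 128/255
reinterpret(N0f8, 0xff)    # 1.0N0f8 -- the same bits a plain UInt8 255 would have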
I'm working with Scilab 5.5.2 and, when using the format command, I can display at most 25 digits of a number. Is there a way to display more digits than that?
Scilab operates with double-precision floating-point numbers; it does not support variable-precision arithmetic. Double precision means a relative error of %eps, which is 2^(-52), approximately 2e-16.
This means you can't even get 25 correct decimal digits: when using format(25) you get garbage at the end. For example,
format(25); sqrt(3)
returns 1.732050807568877 1931766
I separated the last 7 digits here because they are wrong; the correct value of sqrt(3) begins with
1.732050807568877 2935274
Of course, if you don't mind the digits being wrong, you can have as many as you want:
strcat([sprintf('%.15f', sqrt(3)), "1111111111111111111111111111111"])
returns 1.7320508075688771111111111111111111111111111111.
But if you want arbitrary precision for real numbers, Scilab is not the right tool for the job (correction: phuclv pointed out the Multiple Precision Arithmetic Toolbox, which might work for you). Among free software packages, the mpmath Python library implements arbitrary-precision real numbers: it can be used directly or via SageMath or SymPy. Commercial packages (Matlab, Maple, Mathematica) support variable precision too.
As for Scilab, I recommend using formatted print commands such as fprintf or sprintf, because they actually care about the output being meaningful. Example: printf('%.25f', sqrt(3)) returns
1.7320508075688772000000000
with garbage replaced by zeros. The last nonzero digit is still off by 1, but at least it's not meaningless.
Scilab uses the double-precision floating-point type, which has 53 bits of mantissa and is therefore only precise to about 15-17 significant decimal digits (53 * log10(2) ≈ 15.95). There is no point in printing digits beyond that.
If 25 digits of accuracy are needed, you can use a quadruple-precision or double-double arithmetic library such as the ATOMS Multiple Precision Arithmetic Toolbox.
If you need even more precision, the only way is to use an arbitrary-precision library like mpscilab, Xnum, etc.
When I subtract certain numbers whose difference is rather small, zsh doesn't output a floating point number like I expect. Instead, it outputs the difference like this:
% echo $((-78.44335 - -78.4433))
-5.0000000001659828e-05
This is causing unexpected behavior in a script that deals with arbitrary numbers; except when the difference is very small, there is no problem.
Why is zsh doing this? How can I make it always output a normal floating point number instead?
Edit:
My actual application is closer to this:
var=$((-78.44335 - -78.4433))
var2=$var
var=$((var * var3))
etc.
Concerning the scientific notation, this is normal when the exponent is ≤ −5, and it is often the preferred way to represent floating-point numbers. If you don't like that, you can use printf with %f; for instance:
$ printf "%.24f\n" $((-78.44335 - -78.4433))
-0.000050000000001659827831
Alternatively, to assign the result to a variable without having to use a command substitution (thus a subshell):
$ ((var = -78.44335 - -78.4433))
$ echo $var
-0.0000500000
But only 10 digits are output after the decimal point (like printf "%.10f"). This may not be sufficient for all applications.
An additional note about the trailing digits: floating-point numbers are represented in base 2. This means that when converting a decimal number such as -78.44335 or -78.4433 to base 2, a rounding error generally occurs (because the decimal number cannot be represented exactly in the destination format, generally double precision). The effect of these rounding errors is what you see in the output. In particular, when you subtract two inexact numbers that are very close to each other, catastrophic cancellation occurs, so the relative error of the result is quite large.
Note: this is not specific to zsh. You'll have similar problems with all software that uses base 2 internally.
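Since, as noted above, this is not specific to zsh, the same effect can be reproduced in any language that uses binary doubles; the sketch below happens to use Julia, purely because it makes the rounded values easy to inspect:
big(-78.44335)          # widens the stored double to show its exact value -- not exactly -78.44335
big(-78.4433)           # likewise not exactly -78.4433
-78.44335 - -78.4433    # -5.0000000001659828e-5, the same value zsh computed above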
Just learning AS3 for Flex. I am trying to do this:
var someNumber:String = "10150125903517628"; // this is the actual number I noticed the issue with
var result:String = String(Number(someNumber) + 1);
I've tried different ways of putting the expression together, and no matter what I do, the result is always 10150125903517628 rather than 10150125903517629.
Anyone have any ideas? Thanks!
All numbers in JavaScript/ActionScript are effectively double-precision IEEE-754 floats. These use a 64-bit binary representation of your decimal number and have a precision of roughly 15 to 17 significant decimal digits.
You've run up against the limit of that format with your 17-digit number. The internal binary representation of 10150125903517628 is no different to that of 10150125903517629 which is why you're not seeing any difference when you add 1.
If, however, you add 2 then you will (should?) see the result as 10150125903517630 because that's enough of a "step" that the internal binary representation will change.
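Because the limit comes from the IEEE-754 double format rather than from ActionScript itself, the gap is easy to illustrate in any language with 64-bit floats; here is a sketch in Julia (an illustration only, not ActionScript code):
2.0^53                        # 9.007199254740992e15 -- every integer is representable only up to here
n = 1.0150125903517628e16     # the number from the question, stored as a double
eps(n)                        # 2.0 -- neighbouring representable values are 2 apart at this magnitude
n + 1 == n                    # true:  adding 1 rounds back to the same double
n + 2 == n                    # false: adding 2 reaches the next double, ...630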