Order of magnitude of a float in Julia 1.6 - julia

Is there a way to determine the order of magnitude of a float in Julia 1.6?
For instance, a function such that OrderOfMagnitude(1000) = 3.

There are various definitions of order of magnitude; some of them are (assuming x is positive):
floor(Int, log10(x))
floor(Int, log10(2*x))
floor(Int, log10(sqrt(10)*x))
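For example, a minimal sketch of the first definition (the name order_of_magnitude is purely illustrative, not something provided by Base):
order_of_magnitude(x) = floor(Int, log10(x))   # first definition above

order_of_magnitude(1234.0)  # 3
order_of_magnitude(999.0)   # 2
order_of_magnitude(0.05)    # -2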

Related

Converting a Gray-Scale Array to a FloatingPoint-Array

I am trying to read a .tif file in Julia as a floating-point array. With the FileIO & ImageMagick packages I am able to do this, but the array that I get is of the type Array{ColorTypes.Gray{FixedPointNumbers.Normed{UInt8,8}},2}.
I can convert this FixedPoint array to a Float32 array by multiplying it by 255 (because UInt8), but I am looking for a function that does this for any type of FixedPointNumber (e.g. reinterpret() or convert()).
using FileIO
# Load the tif
obj = load("test.tif");
typeof(obj)
# Convert to Float32-Array
objNew = real.(obj) .* 255
typeof(objNew)
The output is
julia> using FileIO
julia> obj = load("test.tif");
julia> typeof(obj)
Array{ColorTypes.Gray{FixedPointNumbers.Normed{UInt8,8}},2}
julia> objNew = real.(obj) .* 255;
julia> typeof(objNew)
Array{Float32,2}
I have been looking in the docs for quite a while and have not found a function to convert a given FixedPoint array to a floating-point array without multiplying by the maximum value of the integer type.
Thanks for any help.
edit:
I made a small gist to see if the solution by Michael works, and it does. Thanks!
Note: I don't know why, but the real.(obj) .* 255 code does not work (see the gist).
Why not just Float32.()?
using ColorTypes
a = Gray.(convert.(Normed{UInt8,8}, rand(5,6)));
typeof(a)
#Array{ColorTypes.Gray{FixedPointNumbers.Normed{UInt8,8}},2}
Float32.(a)
The short answer is indeed the one given by Michael: just use Float32.(a) (for grayscale). Another alternative is channelview(a), which generally performs channel separation and thus also strips the color information from the array. In the latter case you won't get a Float32 array, because your image is stored with 8 bits per pixel; instead you'll get an array of N0f8 (= FixedPointNumbers.Normed{UInt8,8}). You can read about those numbers here.
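A small sketch of the two routes, assuming ImageCore (which provides channelview) is loaded alongside the packages above:
using ColorTypes, FixedPointNumbers, ImageCore
a = Gray.(N0f8.(rand(3, 3)))
f = Float32.(a)       # plain Float32 matrix, values still on the 0-to-1 scale
c = channelview(a)    # drops the Gray wrapper but keeps the N0f8 element type
eltype(c)             # N0f8 (= Normed{UInt8,8})
Float32.(c)           # convert afterwards if you really need Float32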
Your instinct to multiply by 255 is natural, given how other image-processing frameworks work, but Julia has made some effort to be consistent about "meaning" in ways that are worth taking a moment to think about. For example, in another programming language just changing the numerical precision of an array:
img = uint8(255*rand(10, 10, 3)); % an 8-bit per color channel image
figure; image(img)
imgd = double(img); % convert to double-precision, but don't change the values
figure; image(imgd)
produces a surprising result: the second image displays as all white.
That "all white" image represents saturation. In this other language, "5" means two completely different things depending on whether it's stored in memory as a UInt8 vs a Float64. I think it's fair to say that under any normal circumstances, a user of a numerical library would call this a bug, and a very serious one at that, yet somehow many of us have grown to accept this in the context of image processing.
These new types arise because in Julia we've gone to the effort to implement new numerical types (FixedPointNumbers) that act like fractional values (e.g., between 0 and 1) but are stored internally with the same bit pattern as the "corresponding" UInt8 (the one you get by multiplying by 255). This allows us to work with 8-bit data and yet allow values to always be interpreted on a consistent scale (0.0=black, 1.0=white).
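A small sketch of what "same bit pattern, consistent meaning" looks like with these types (the values are purely illustrative):
using FixedPointNumbers
x = N0f8(0.5)                    # stored as the nearest multiple of 1/255, here 128/255
reinterpret(x)                   # 0x80 -- the raw byte, i.e. the "multiply by 255" view
Float32(x)                       # 0.5019608f0, still on the 0.0-to-1.0 scale
reinterpret(N0f8, 0xff) == 1.0   # true: the byte 255 means white, i.e. 1.0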

Is there a way to use 32-bit float instead of 64-bit in R dataframes?

The issue I'm having is related to memory. I'm doing financial calculations on dollar amounts and currently my float precision is 64-bits. I'd like to reduce the precision down to at least 32-bits but so far have not found a way to specify this in R. Ideally, this would just be applied to a dataframe that has a number of columns some are ints and some are floats.
No, there is not -- at least not in 'base R' which has only one integer and numeric (floating point) type each, and their sizes are fixed.
You can inspect them (and more) via .Machine -- see help(".Machine").
Now, for your dollar amounts you could of course resort to expressing things in cents instead, in which case you could use integer -- which is generally half the size of numeric.
The following converts all numeric columns in numeric_df to float32
library(float)
library(purrr)
float32_df <- purrr::modify_if(numeric_df, is.numeric, as.float)

ATMega performance for different operations

Does anyone have experience replacing floating-point operations on ATMega (2560) based systems? There are a couple of very common situations that come up every day.
For example:
Are comparisons faster than divisions/multiplications?
Is a float-to-int type cast followed by a multiplication/division faster than a pure floating-point operation without the cast?
I hope I don't have to make a benchmark just for me.
Example one:
int iPartialRes = (int)fArg1 * (int)fArg2;
iPartialRes *= iFoo;
Is this faster than:
float fPartialRes = fArg1 * fArg2;
fPartialRes *= iFoo;
And example two:
iSign = fVal < 0 ? -1 : 1;
Is this faster than:
iSign = fVal / fabs(fVal);
The questions could be answered just by thinking about them for a moment.
AVRs do not have an FPU, so all floating-point work is done in software --> an fp multiplication involves much more than a simple int multiplication.
Since AVRs also do not have an integer division unit, a simple branch is also much faster than a software division; dividing floating-point values is the worst case of all. :)
But please note that your first two examples produce very different results.
This is an old question, but I will submit this elaborated answer for the curious.
Just typecasting a float to an int will truncate it, i.e. 3.7 will become 3; there is no rounding.
The fastest math on a 2560 will be (+, -, *), with division being the slowest due to the lack of a hardware divider. Typecasting to an unsigned long int after multiplying all operands by a pseudo decimal point that suits the fractional-number(1) range your floats are expected to see, and tracking the sign as a bool, will give the best range/accuracy compromise.
If your loop needs to be as fast as possible, avoid even integer division, multiplying by a pseudo fraction instead, and only do the typecast back into a float with myFloat = float(myPseudoFloat) / myPseudoDecimalConstant; (myFloat defined elsewhere) at the end.
Not sure if you came across the ShowInfo page in the Arduino playground. It's basically a sketch that runs a benchmark on your (insert Arduino model here) and shows the actual compute times for various operations. The Mega 2560 will be very close to an ATmega328 as far as FLOPs go, up to 12.5K/s (80 µs per float divide). Typecasting would likely handicap the CPU more, as it introduces extra overhead and might even give erroneous results due to rounding errors and lack of precision.
(1) i.e. 543.509291 * 1000000 = 543,509,291 moves the decimal six places, to the maximum precision of a float on an 8-bit AVR. If you first multiply all values by the same constant, like 1000 or 100000, etc., then the decimal point is preserved, and you cast back to a float by dividing by your decimal constant when you are ready to print or store it.
float f = 3.1428;
int x;
x = f * 10000;
x now contains 31428

How does one compute the sum of a 1D array with BLAS?

In BLAS level 1 there are *ASUM and *NRM2 that compute the L1 and L2 norms of vectors, but how does one compute the (signed) sum of a vector? There's got to be something better than filling another vector full of ones and doing a *DOT...
BLAS does not provide a horizontal sum operation like you are seeking, because it's not an operation that is frequently needed by linear algebra libraries.
Many DSP libraries do provide this operation; for example, on OS X and iOS you would use the vDSP_sve( ) function provided by the Accelerate framework. Unfortunately, available DSP libraries tend to vary a lot from platform to platform, so we would need to know more about what platform[s] you're targeting.
You can do a dot product where the second vector has an increment of zero, so a single 1.0 is reused for every element. Using C it would be like this:
extern double ddot_(const int *, const double *, const int *, const double *, const int *);

int n = 100;        /* length of x */
double x[100];      /* the vector whose elements you want to sum */
int ix = 1;         /* stride through x */
double one = 1.0;   /* a single 1.0, reused because its increment is zero */
int iy = 0;
double sum = ddot_(&n, x, &ix, &one, &iy);
One way is to use a dot product with a vector of ones, e.g. via the cblas_ddot function.
As seen in http://www.netlib.org/blas/blasqr.pdf, xAXPY supports vector summation.

Why are the arguments to atan2 Y,X rather than X,Y?

In C the atan2 function has the following signature:
double atan2( double y, double x );
Other languages do this as well. This is the only function I know of that takes its arguments in Y,X order rather than X,Y order, and it screws me up regularly because when I think coordinates, I think (X,Y).
Does anyone know why atan2's argument order convention is this way?
Because I believe it is related to arctan(y/x), so y appears on top.
Here's a nice link talking about it a bit: Angles and Directions
My assumption has always been that this is because of the trig definition, ie that
tan(theta) = opposite / adjacent
When working with the canonical angle from the origin, opposite is always Y and adjacent is always X, so:
atan2(opposite, adjacent) = theta
I.e., it was done that way so there's no ordering confusion with respect to the mathematical definition.
Suppose a right triangle with its opposite side called y and its adjacent side called x:
tan(angle) = y/x
arctan(tan(angle)) = arctan(y/x)
It's because in school, the mnemonic for calculating the gradient is rise over run, or in other words dy/dx, or more briefly y/x. And this order has snuck into the arguments of arctangent functions. So it's a historical artefact. For me it depends on what I'm thinking about when I use atan2: if I'm thinking about differentials, I get it right, and if I'm thinking about coordinate pairs, I get it wrong.
The order is atan2(X,Y) in Excel, so I think the reverse order is a programming thing. atan(Y/X) can easily be changed to atan2(Y,X) by putting a '2' between the 'n' and the '(' and replacing the '/' with a ',': only two operations. The opposite order would take four operations, and some of the operations would be more complex (cut and paste).
I often work out my math in Excel and then port it to .NET, so I sometimes get hung up on atan2. It would be best if atan2 were standardized one way or the other.
It would be more convenient if atan2 had its arguments reversed. Then you wouldn't need to worry about flipping the arguments when computing polar angles. The Mathematica equivalent does just that: https://reference.wolfram.com/language/ref/ArcTan.html
Way back in the dawn of time, FORTRAN had an ATAN2 function with the less convenient argument order that, in this reference manual, is (somewhat inaccurately) described as arctan(arg1 / arg2).
It is plausible that the initial creator was fixated on atan2(arg1, arg2) being (more or less) arctan(arg1 / arg2), and that the decision was blindly copied from FORTRAN to C to C++ and Python and Java and JavaScript.
