OpenCL convert_long from NaN Producing Incorrect Result on RTX 2060 - opencl

OpenCL's convert_T function (OpenCL 1.2; other versions are similar) for long is producing an odd result.
In particular, the function definition states:
Conversions to integer type may opt to convert using the optional
saturated mode by appending the _sat modifier to the conversion
function name. When in saturated mode, values that are outside the
representable range shall clamp to the nearest representable value in
the destination format. (NaN should be converted to 0).
However, on an NVIDIA RTX 2060 I am getting the most negative integral value for NaN inputs. For instance, consider the following kernel given NaN inputs such as 0x7FC00001 and 0xFFC00001.
kernel void test(
    const global uint *srcF)
{
    uint id = get_global_id(0);
    float x = ((const global float *)srcF)[id];
    long x_rte = convert_long_sat_rte(x);
    if (isnan(x) && x_rte == 0x8000000000000000) {
        // OpenCL C printf uses the 'l' length modifier for 64-bit long values
        printf("0x%016lx: oops! 0x%08X fails to generate zero\n", x_rte, srcF[id]);
    }
}
On an NVIDIA RTX 2060 I see:
0x8000000000000000: oops! 0x7FC00001 fails to generate zero
0x8000000000000000: oops! 0xFFC00001 fails to generate zero
It seems to generate 0x8000000000000000 (the most negative long) instead of the expected value of 0. On Intel HD 630 I get 0's as expected. I also noticed that some double-to-integer conversions with convert_T_sat fail in the same way (returning the most negative integral value).
My question: am I missing something here? Am I misunderstanding the above spec? I know a typical conversion has ill-defined behavior outside the representable range, but this explicit conversion seems to clearly say NaNs must be converted to 0. Still, this seems like an obvious conformance test that the driver must have gone through, so I suspect I am the one screwing up here.
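For reference, a small host-side C++ check (assuming a C++20 compiler for std::bit_cast) confirming that those two bit patterns are indeed quiet NaNs:

#include <bit>
#include <cmath>
#include <cstdint>
#include <iostream>

int main() {
    // Reinterpret the two test inputs from the kernel as 32-bit floats.
    for (std::uint32_t bits : {0x7FC00001u, 0xFFC00001u}) {
        float f = std::bit_cast<float>(bits);
        std::cout << std::hex << "0x" << bits << " isnan: "
                  << std::boolalpha << std::isnan(f) << "\n";
    }
}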

Related

I was trying the binomial coefficient problem using DP. I know it's not an efficient approach, but I want to know what this SIGFPE error is.

Here is the program I wrote; I ran it on the GFG practice environment:
class Solution{
public:
    int nCr(int n, int r){
        // code here
        const unsigned int M = 1000000007;
        long long dp[n+1]={0},i,ans;
        if(n<r)
            return 0;
        dp[0]=1;
        dp[1]=1;
        for(i=2;i<=n;i++){
            dp[i]=i*dp[i-1];
        }
        ans=(dp[n])/(dp[n-r]*dp[r]);
        ans=ans%M;
        return ans;
    }
};
I don't really understand what is going on. The division seems to be well defined.
You are right to suspect the division as the origin of the SIGFPE. As you know, division is well defined as long as the divisor is not zero. At first glance, one wouldn't expect dp[n-r]*dp[r] to ever become zero. But the elements of dp can only hold a limited range of values. With a 64-bit long long, the maximum representable value is 2^63 - 1 = 9223372036854775807. This means dp[i] has already overflowed for i > 20, though on common processors this overflow is silently ignored. As the factorial computation keeps multiplying by ever larger i, more and more zero bits are "shifted in" from the right, until eventually all 64 bits are zero; on common processors this happens at i = 66, so the exception occurs when n-r or r is equal to or greater than 66.
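To see the wrap-around concretely, here is a minimal C++ sketch (an illustration, not a fix; it uses uint64_t so the overflow wraps modulo 2^64 by definition, unlike the signed long long in the question) that prints the first i at which the running factorial becomes exactly zero:

#include <cstdint>
#include <iostream>

int main() {
    // Multiply up the factorial in 64-bit unsigned arithmetic and report
    // the first i at which all 64 low-order bits are zero.
    std::uint64_t fact = 1;
    for (int i = 2; i <= 70; ++i) {
        fact *= static_cast<std::uint64_t>(i);
        if (fact == 0) {
            std::cout << "factorial wraps to 0 at i = " << i << "\n";
            break;
        }
    }
}

It reports i = 66, matching the threshold described above: 66! contains 64 factors of 2, so its low 64 bits are all zero.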

Inaccurate results with OpenCL Reduction example

I am working with the OpenCL reduction example provided by Apple here
After a few days of dissecting it, I understand the basics; I've converted it to a version that runs more or less reliably in C++ (openFrameworks) and finds the largest number in the input set.
However, in doing so, a few questions have arisen as follows:
Why are multiple passes used? The most I have been able to make the reduction require is two, and the second pass only processes a very small number of elements, which makes it poorly suited to an OpenCL dispatch (i.e. wouldn't it be better to stick to a single pass and then process its results on the CPU?)
When I set the 'count' number of elements to a very high number (24M and up) and the type to float4, I get inaccurate (or totally wrong) results. Why is this?
In the OpenCL kernels, can anyone explain what is being done here:
while (i < n){
    int a = LOAD_GLOBAL_I1(input, i);
    int b = LOAD_GLOBAL_I1(input, i + group_size);
    int s = LOAD_LOCAL_I1(shared, local_id);
    STORE_LOCAL_I1(shared, local_id, (a + b + s));
    i += local_stride;
}
as opposed to what is being done here?
#define ACCUM_LOCAL_I1(s, i, j) \
{ \
    int x = ((__local int*)(s))[(size_t)(i)]; \
    int y = ((__local int*)(s))[(size_t)(j)]; \
    ((__local int*)(s))[(size_t)(i)] = (x + y); \
}
Thanks!
S
To answer the first two questions:
why are multiple passes used?
Reducing millions of elements to a few thousand can be done in parallel with a device utilization of almost 100%. But the final step is quite tricky. So, instead of doing everything in one shot and leaving many threads idle, Apple's implementation performs a first-pass reduction, then adapts the work items to the new, smaller reduction problem, and finally completes it.
It is a very specific optimization for OpenCL, and it may not apply to C++.
when I set the 'count' number of elements to a very high number (24M
and up) and the type to a float4, I get inaccurate (or totally wrong)
results. Why is this?
A float32 has a 23-bit mantissa, so its precision is about 1 part in 2^23; the remaining bits hold the sign and exponent. Values above 24M = 1.43 x 2^24 (in float representation) therefore have a spacing of 2^24/2^23 = 2, i.e. a rounding error in the range +/-(2^24/2^23)/2 ~= 1.
That means, if you do:
float A=24000000;
float B= A + 1; //~1 error here
The error of a single operation is on the order of the value being added, so you accumulate big errors if you repeat that in a loop!
This will not happen on 64-bit CPUs, because there the 32-bit float math is internally carried out with 48 bits of precision, which avoids these errors. However, if you get the float close to 2^48 they will happen as well. But that is not the typical case for normal "counting" integers.
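As a concrete illustration of the accumulation effect (a minimal C++ sketch, not part of the original answer): once the running sum reaches 2^24 = 16777216, the spacing between adjacent float values is 2, so adding 1.0f no longer changes the sum at all.

#include <iostream>

int main() {
    // Add 1.0f twenty-four million times; the sum stalls at 2^24 because
    // sum + 1.0f rounds back to sum once the spacing between floats is 2.
    float sum = 0.0f;
    for (int i = 0; i < 24000000; ++i) {
        sum += 1.0f;
    }
    std::cout << "expected 24000000, got " << static_cast<long long>(sum) << "\n";  // 16777216
}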
The problem is with the precision of 32-bit floats. You're not the first person to ask about this either: OpenCL reduction result wrong with large floats

Rounding error with TDateTime on iOS

When calculating a 32-bit ID from a timestamp (TDateTime), I get a strange error: in certain situations, the value is different on different processors.
The fTimeStamp field is read from a Double field in a SQLite database.
The code below calculates a 32-bit ID (lIntStamp) from fTimeStamp, but in some (rare) situations the value differs between computers even though the source database file is exactly the same (i.e. the Double stored in the file is the same).
...
fTimeStamp: TDateTime
...
var
  lIntStamp: Int64;
begin
  lIntStamp := Round(fTimeStamp * 864000); //864000 = 24*60*60*10 = steps of 1/10th of a second
  lIntStamp := lIntStamp and $FFFFFFFF;
  ...
end;
The precision of TDateTime (Double) is 15 digits, but the rounded value in the code uses only 11 digits, so there should be enough information to round correctly.
To mention an example of values: in a specific test run the value of lIntStamp was $74AE699B on a Windows computer and $74AE699A on an iPad (= only last bit is different).
Is the Round function implemented differently on each platform?
PS. Our target platforms are currently Windows, MacOS and iOS.
Edit:
I made a small test program based on the comments:
var
  d: Double;
  id: int64 absolute d;
  lDouble: Double;
begin
  id := $40E4863E234B78FC;
  lDouble := d*864000;
  Label1.text := inttostr(Round(d*864000))+' '+floattostr(lDouble)+' '+inttostr(Round(lDouble));
end;
The output on Windows is:
36317325723 36317325722.5 36317325722
On the iPad the output is:
36317325722 36317325722.5 36317325722
The difference is in the first number, which shows the rounding of the intermediate calculation, so the problem happens because x86 uses a higher internal precision (80-bit) than ARM (64-bit).
Assuming that all the processors are IEEE 754 compliant and that you are using the same rounding mode on all of them, you will be able to get the same results from all the different processors.
However, there may be compiled code differences, or implementation differences with your code as it stands.
Consider how
fTimeStamp * 24 * 60 * 60 * 10
is evaluated. Some compilers may perform
fTimeStamp * 24
and then store the intermediate result in a FP register. Then multiply that by 60, and store to a FP register. And so on.
Now, under x86 the floating-point registers are 80-bit extended and, by default, those intermediate registers will hold the results to 80 bits.
On the other hand, ARM processors don't have 80-bit registers. The intermediate values are held at 64-bit double precision.
So that's a machine implementation difference that would explain your observed behaviour.
Another possibility is that the ARM compiler spots the constant in the expression and evaluates it at compile time, reducing the above to
fTimeStamp * 864000
I've never seen an x86 or x64 compiler that does that, but perhaps the ARM compiler does. That's a difference in the compiled code. I'm not saying that it happens, I don't know the mobile compilers. But there's no reason why it could not happen.
However, here is your salvation. Re-write your expression as above with that single multiplication. That way you get rid of any scope for intermediate values being stored to different precision. Then, so long as Round means the same thing on all processors, the results will be identical.
Personally I'd avoid questions over rounding mode and instead of Round would use Trunc. I know it has a different meaning, but for your purposes it is an arbitrary choice.
You'd then be left with:
lIntStamp := Trunc(fTimeStamp * 864000); //steps of 1/10th second
lIntStamp := lIntStamp and $FFFFFFFF;
If Round is behaving differently on the different platforms then you may need to implement it yourself on ARM. On x86 the default rounding mode is banker's rounding. That only matters when the value is exactly half way between two integers. So check whether Frac(...) = 0.5 and round accordingly. That check is safe because 0.5 is exactly representable.
On the other hand, you seem to be claiming that
Round(36317325722.5000008) = 36317325722
on ARM. If so, that would be a bug, which I find hard to believe. I believe that the value actually passed to Round on ARM is in fact 36317325722.5. That is the only explanation that makes sense to me; I cannot believe Round is defective.
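The same intermediate-precision effect can be reproduced outside Delphi. A minimal C++ sketch, assuming an x86 toolchain where long double is the 80-bit extended format and the default round-to-nearest-even mode (on compilers where long double is just a double, both lines come out the same, mirroring the iPad result):

#include <cmath>
#include <cstdint>
#include <cstring>
#include <iostream>

int main() {
    // The double from the question, reconstructed from its bit pattern.
    std::uint64_t bits = 0x40E4863E234B78FCULL;
    double d;
    std::memcpy(&d, &bits, sizeof d);

    // Rounded to double precision the product is exactly 36317325722.5;
    // carried out in 80-bit extended it lands slightly above the halfway point.
    double prodDouble = d * 864000.0;
    long double prodExtended = static_cast<long double>(d) * 864000.0L;

    std::cout.precision(21);
    std::cout << prodDouble   << " -> " << std::rint(prodDouble)   << "\n";  // ...722.5 -> ...722
    std::cout << prodExtended << " -> " << std::rint(prodExtended) << "\n";  // ...722.50000108 -> ...723
}

rint rounds the exact .5 double product to the even neighbour 36317325722, while the extended product, being slightly above the halfway point, rounds up to 36317325723.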
Just to be complete, here is what is going on:
On an x86 environment, a call to Round(d*n), where d is a double and n is a number, will carry out the multiplication at extended precision before calling the Round function. On x64, or on macOS, iOS and Android platforms, there is no promotion to an 80-bit extended value.
Analysing the extended values can be tricky, since the RTL has no function to write the full precision of an extended value.
John Herbster wrote such a library http://cc.embarcadero.com/Item/19421. (Add FormatSettings in two places to make it compile on a modern Delphi version).
Here is a small test that writes the results of extended and double values in steps of 1 bit change in the input double value.
program TestRound;
{$APPTYPE CONSOLE}
uses
  System.SysUtils,
  ExactFloatToStr_JH0 in 'ExactFloatToStr_JH0.pas';
var
  // Three consecutive double values (binary representation)
  id1 : Int64 = $40E4863E234B78FB;
  id2 : Int64 = $40E4863E234B78FC; // <-- the fTimeStamp value
  id3 : Int64 = $40E4863E234B78FD;
  // Access the values as double
  d1 : double absolute id1;
  d2 : double absolute id2;
  d3 : double absolute id3;
  e : Extended;
  d : Double;
begin
  WriteLn('Extended precision');
  e := d1*864000;
  WriteLn(e:0:8, ' ', Round(e), ' ', ExactFloatToStrEx(e,'.',#0));
  e := d2*864000;
  WriteLn(e:0:8, ' ', Round(e), ' ', ExactFloatToStrEx(e,'.',#0));
  e := d3*864000;
  WriteLn(e:0:8, ' ', Round(e), ' ', ExactFloatToStrEx(e,'.',#0));
  WriteLn('Double precision');
  d := d1*864000;
  WriteLn(d:0:8, ' ', Round(d), ' ', ExactFloatToStrEx(d,'.',#0));
  d := d2*864000;
  WriteLn(d:0:8, ' ', Round(d), ' ', ExactFloatToStrEx(d,'.',#0));
  d := d3*864000;
  WriteLn(d:0:8, ' ', Round(d), ' ', ExactFloatToStrEx(d,'.',#0));
  ReadLn;
end.
Extended precision
36317325722.49999480 36317325722 +36317325722.499994792044162750244140625
36317325722.50000110 36317325723 +36317325722.500001080334186553955078125
36317325722.50000740 36317325723 +36317325722.500007368624210357666015625
Double precision
36317325722.49999240 36317325722 +36317325722.49999237060546875
36317325722.50000000 36317325722 +36317325722.5
36317325722.50000760 36317325723 +36317325722.50000762939453125
Note that when the multiplication is carried out in double precision, the fTimeStamp value from the question yields a product that is exactly representable (it ends in .5), while the extended-precision calculation gives a value that is a tiny bit higher. This explains the different rounding results on the two platforms.
As noted in the comments, the solution would be to store the result of the calculation in a Double before rounding. That still would not solve the backward-compatibility problem, which is not easy to address.
Perhaps this is a good opportunity to store the time in another format.

Positive vs negative NaNs

I have some numerical code that was developed on AMD64 Linux (using LLVM 3.2).
I have recently ported it to OSX 10.9 with XCode. It runs fine, but it fails a lot of the unit tests: it seems that some calculations which on Linux return NaN (or -NaN) now return, on OSX, -NaN (or NaN).
Can I safely assume that positive and negative NaNs are equivalent and adjust my unit tests to accept either as a success, or is this a sign of something more serious going wrong?
There is no notion of a "negative NaN" in IEEE-754 arithmetic. The NaN encoding still has a sign bit, and there is a notion of a "sign bit" operation which uses or affects this bit (copysign, abs, a few others), but it does not have any meaning when the NaN encoding is interpreted as a value. Many print routines happen to print the bit as a negative sign, but it is formally meaningless, and therefore there isn't much in the standard to govern what its value should be (except w.r.t. the aforementioned functions).
Here's the relevant section of IEEE-754 (2008):
Conversion of a quiet NaN in a supported format to an external character sequence shall produce a language-defined one of “nan” or a sequence that is equivalent except for case (e.g., “NaN”), with an optional preceding sign. (This standard does not interpret the sign of a NaN.)
So your platform's conversion functions may print the "sign" of NaN values, but it has no meaning, and you shouldn't consider it for the purposes of testing.
Edited to be a bit stronger: it is almost always a bug to attach meaning to the "sign bit" of a NaN datum.
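For illustration, here is a small C++ sketch (the exact text printed for a NaN with the sign bit set is implementation-defined; glibc, for example, prints "-nan"):

#include <cmath>
#include <iostream>
#include <limits>

int main() {
    // The sign bit of a NaN can be set with copysign and inspected with
    // signbit, but it has no numeric meaning: both values are still NaN
    // and neither compares equal to anything, including itself.
    double n = std::numeric_limits<double>::quiet_NaN();
    double m = std::copysign(n, -1.0);
    std::cout << n << " " << m << "\n";                      // e.g. "nan -nan"
    std::cout << std::boolalpha
              << std::signbit(n) << " " << std::signbit(m)   // typically "false true"
              << " " << (n == m) << " " << std::isnan(m)     // "false true"
              << "\n";
}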
It depends entirely on what your unit tests are testing.
Most likely you'll be able to treat them as equivalent unless the testing you're doing is actually of the IEEE754 floating point software itself, or the C runtime code that prints them. Otherwise, you should treat them as identical if the code that uses what you're testing treats them as identical.
That's because the tests should echo your real usage, in every circumstance. An (admittedly contrived) example is if you're testing the function doCalc() which returns a double. If it's only ever used thus:
x = doCalc()
if x is any sort of Nan:
doSomethingWithNan()
then your test should treat all NaN values as equivalent. However, if you use it thus:
x = doCalc()
if x is +Nan:
doSomethingForPositive()
else:
if x is -Nan:
doSomethingForNegative()
then you'll want to treat them as distinct.
Similarly, if your implementation creates a useful payload in the fractional bits (see below), and your real code uses that, it should be checked by the unit tests as well.
Since a NaN is simply all 1-bits in the exponent and something other than all zero bits in the fraction, the sign bit may be positive or negative, and the fractional bits may be a wide variety of values. However, it's still a value or result that was outside the representation of the data type so, if you were expecting just that, it probably makes little difference what the sign or payload contain.
In terms of checking the textual output of NaN values, the Wikipedia page on NaN indicates that different implementations may give you widely varying outputs, among them:
nan
NaN
NaN%
NAN
NaNQ
NaNS
qNaN
sNaN
1.#SNAN
1.#QNAN
-1.#IND
and even variants showing the varying sign and payload that have no effect on its NaN-ness:
-NaN
NaN12345
-sNaN12300
-NaN(s1234)
So, if you want to be massively portable in your unit test, you'll notice that all the output representations bar one have some variant of the string nan in them. So a case-insensitive search through the value for the string nan or ind would pick them all up. That may not work in all environments but it has a very large coverage.
For what it's worth, the C standard has this to say about outputting floating point values with %f (%F uses uppercase letters):
A double argument representing a NaN is converted in one of the styles [-]nan or [-]nan(n-char-sequence) - which style, and the meaning of any n-char-sequence, is implementation-defined.
So it would suffice there to simply check if the value had nan somewhere inside it.
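If you go that route, a tiny helper for a unit test might look like the following C++ sketch (looks_like_nan is an invented name; it just performs the case-insensitive substring search described above):

#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical test helper: true if the printed value matches any of the
// NaN spellings listed above (i.e. it contains "nan" or "ind", ignoring case).
bool looks_like_nan(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s.find("nan") != std::string::npos || s.find("ind") != std::string::npos;
}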

Negative Exponents throwing NaN in Fortran

Very basic Fortran question. The following function returns a NaN and I can't seem to figure out why:
F_diameter = 1. - (2.71828**(-1.0*((-1. / 30.)**1.4)))
I've fed 2.71... in rather than using exp(), but both fail the same way. I've noticed that I only get a NaN when the quantity being raised, (-1. / 30.), is negative. Positive values evaluate OK.
Thanks a lot
The problem is that you are taking a root of a negative number, which would give you a complex answer. This is more obvious if you imagine e.g.
(-1) ** (3/2)
which is equivalent to
(sqrt(-1))**3
In other words, your fractional exponent can't trivially operate on a negative number.
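The same behaviour is easy to reproduce outside Fortran; for instance, C's pow (shown here from C++) reports the domain error by returning NaN for a negative base raised to a non-integer power:

#include <cmath>
#include <iostream>

int main() {
    // A negative base with a non-integer exponent has no real result,
    // so pow returns NaN; the same exponent on a positive base is fine.
    double bad  = std::pow(-1.0 / 30.0, 1.4);
    double good = std::pow( 1.0 / 30.0, 1.4);
    std::cout << "(-1/30)**1.4 = " << bad  << "\n";  // nan
    std::cout << "( 1/30)**1.4 = " << good << "\n";  // ~0.0086
}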
There is another interesting point here I learned today and want to add to ire_and_curses' answer: the Fortran compiler seems to compute integer powers by successive multiplication.
For example
PROGRAM Test
  PRINT *, (-23) ** 6
END PROGRAM
works fine and gives 148035889 as the answer.
But for REAL exponents, the compiler uses logarithms: y**x = 10**(x * log(y)) (maybe compilers today do it differently, but my book says so). Since logarithms of negative numbers give a complex result, this does not work:
PROGRAM Test
  PRINT *, (-23) ** 6.1
END PROGRAM
and even gives a compiler error:
Error: Raising a negative REAL at (1) to a REAL power is prohibited
From a mathematical point of view, this problem also seems quite interesting: https://math.stackexchange.com/questions/1211/non-integer-powers-of-negative-numbers
