What does the OpenCL compiler option -cl-fast-relaxed-math do?
From reading the documentation, it looks like -cl-fast-relaxed-math allows a kernel to do floating-point math on any variables, even if those variables point to the wrong data type, cause division by zero, or involve some other illegal behavior.
Is this correct? In what situation would this compiler option be useful?
From comments:
Enables -cl-finite-math-only and -cl-unsafe-math-optimizations. These two options provide additional speed by removing some checks on the input values, e.g. not checking for NaN numbers. However, if the input values happen to BE non-normal numbers, the results are unknown. – DarkZeros
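For what it's worth, -cl-fast-relaxed-math is just a build option string passed when the program is compiled; it does not let a kernel do anything new, it only lets the compiler drop NaN/Inf handling and reorder floating-point math. A minimal host-side sketch (the program and device handles are assumed to have been created earlier):
/* Sketch: enabling -cl-fast-relaxed-math when building an OpenCL program.
   `program` and `device` are assumed to exist already. */
cl_int err = clBuildProgram(program, 1, &device,
                            "-cl-fast-relaxed-math",  /* build options string */
                            NULL, NULL);              /* no callback */
if (err != CL_SUCCESS) {
    /* inspect the build log via clGetProgramBuildInfo here */
}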
Related
I'm doing some materials work right now using Open Shading Language (OSL), and it has a convenient function, isinf(), which will determine whether a floating-point value is infinite or not...
However, I can't find anything in the documentation about actually setting a variable to infinite. I'm instead going to be setting it to "irrationally large", which will certainly work well enough for my purposes (effectively cell noise generation), but I'm curious whether there's a built-in way to express infinity in OSL?
The problem is that OSL tries very, very hard not to let you generate non-finite numbers, and there is no call that will intentionally give you an infinity value. You could use what would in C be FLT_MAX: 3.402823466e+38
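This is OSL-specific, but just to illustrate the difference between "irrationally large" and a true infinity, here is a comparison in plain C++ (not OSL, shown only for illustration):
// C++ illustration (not OSL): largest finite float vs. an actual infinity.
#include <cfloat>
#include <cmath>
#include <cstdio>
#include <limits>

int main() {
    float huge = FLT_MAX;                                  // 3.402823466e+38f, largest finite float
    float inf  = std::numeric_limits<float>::infinity();   // true IEEE 754 infinity
    std::printf("%g %g\n", huge, inf);                     // prints the huge value, then "inf"
    std::printf("%d %d\n", (int)std::isinf(huge), (int)std::isinf(inf));  // 0 1
}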
This will be a strange question: I know what to do, and I am actually doing it, and it works, but I don't know how to write about it. Looking for solutions to a homogeneous matrix equation, say AX=0, I use the kernel of the parameter matrix A. But, the world being imperfect as it is, the matrix does not have a "perfect" kernel; it does have an "imperfect" one if you set a nonzero "tolerance" parameter. FWIW I'm using Scilab, the function is kernel(A,tol).
Now what are the correct terms for "imperfect kernel", or "tolerance" (of what?), how should this whole process be described in correct English and maths terminology? Should I say something like a "least-squares kernel"? "Approximate kernel"? Is tol the "tolerance of kernel-determination algorithm"? Sounds lame to me...
Depending on the method used (QR or SVD; a third flag lets you choose this in the Scilab implementation), the tolerance is used to determine when pivots (QR case) or singular values (SVD case) are considered to be zero. The kernel is then taken to be the associated subspace.
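To make that concrete, here is a rough sketch of the SVD variant, written in C++ with the Eigen library rather than Scilab (purely for illustration; the function name is made up): columns of V whose singular values fall below the tolerance are treated as spanning the approximate kernel (numerical null space).
// Sketch (C++/Eigen, not Scilab): approximate kernel via SVD and a tolerance.
#include <Eigen/Dense>

Eigen::MatrixXd approximateKernel(const Eigen::MatrixXd& A, double tol) {
    Eigen::JacobiSVD<Eigen::MatrixXd> svd(A, Eigen::ComputeFullV);
    const auto& sv = svd.singularValues();        // sorted in decreasing order
    Eigen::Index rank = 0;
    while (rank < sv.size() && sv(rank) > tol) ++rank;
    // Right singular vectors associated with the "numerically zero" singular values:
    return svd.matrixV().rightCols(A.cols() - rank);
}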
I noticed that the C++ standard library has separate functions for round and lround rather than just having you use long(round(x)) for the latter.
Looking into the implementation in glibc, I find that indeed, for platforms using IEEE754 floating point, the version that returns an integer will directly manipulate the bits from within the floating point representation, and not do the rounding using floating point operations (e.g. adding ±0.5).
What is the benefit of having a distinct implementation when you want the result as an integer type? Is this supposed to be faster, or more accurate? If it is better to use integer math on the underlying representation, why not just always do it that way even if returning the result as a double?
One reason is that adding .5 is insufficient. Let’s say you add .5 and then truncate to an integer. (How? Is there an instruction for that? Or are you doing more work?) If x is ½−2⁻⁵⁴ (the greatest representable value less than ½), adding .5 yields 1, because the mathematical sum, 1−2⁻⁵⁴, is exactly halfway between the nearest two representable values, 1−2⁻⁵³ and 1, and the common default rounding mode, round-to-nearest-ties-to-even, rounds that to 1. But the correct result for lround(x) is 0.
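A minimal C++ sketch of exactly this case (nextafter(0.5, 0.0) produces that greatest value below ½):
// Sketch: why "add 0.5 and truncate" is not a correct lround.
#include <cmath>
#include <cstdio>

int main() {
    double x = std::nextafter(0.5, 0.0);   // greatest double below 0.5, i.e. 0.5 - 2^-54
    long naive = (long)(x + 0.5);          // the addition rounds up to exactly 1.0, so this is 1
    long good  = std::lround(x);           // correct result: 0
    std::printf("%ld %ld\n", naive, good); // prints "1 0"
}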
And, of course, lround is specified to round ties away from zero, regardless of the current rounding mode. You could set the rounding mode, do some arithmetic, and restore the rounding mode, but there are problems with this.
One is that changing the rounding mode is typically a time-consuming operation. The rounding mode is a global state that affects most floating-point instructions. So the processor has to ensure all pending instructions complete with the prior mode, change the global state, and ensure all later instructions start after that change.
If you are lucky, you might have a processor with per-instruction rounding modes or something similar, and then you can use any rounding mode you like without time penalty. Hewlett Packard has some processors like that. However, “round away from zero” is an uncommon mode. Most processors have round-to-nearest-ties-to-even, round toward zero, round down (toward −∞), and round up (toward +∞), and round-to-odd is becoming popular for its value in avoiding double-rounding errors. But round away from zero is rare.
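For reference, the save/set/restore approach from the previous paragraphs looks roughly like this in C++ (assuming the platform defines FE_TOWARDZERO; <cfenv> only standardizes to-nearest, toward-zero, downward, and upward, so there is no away-from-zero mode to select here):
// Sketch: change the rounding mode, round, restore the mode.
#include <cfenv>
#include <cmath>
// #pragma STDC FENV_ACCESS ON   // required in principle; compiler support varies

double round_toward_zero(double x) {
    int old_mode = std::fegetround();   // save the current global rounding mode
    std::fesetround(FE_TOWARDZERO);     // change global state (potentially slow, see above)
    double r = std::nearbyint(x);       // rounds according to the current mode
    std::fesetround(old_mode);          // restore
    return r;
}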
Another reason is that doing floating-point instructions alters the floating-point status flags and may generate traps, but it is desired that library routines behave as single operations. For example, if we add .5 and rounding occurs, the inexact flag will be raised, since the floating-point addition with .5 produced a result different from the mathematical sum. But to the user of lround, no inexact condition ever occurs; lround is defined to return a value rounded to an integer, and it always does so—within the long range, it never returns a computed result different from its ideal mathematical definition. So if lround(x) raised the inexact flag, that would be incorrect behavior. To avoid it, an implementation that used floating-point instructions would have to save the current floating-point flags, do its work, and restore the flags before returning.
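To see the flag pollution directly, here is a hedged C++ sketch (whether a particular libm's lround leaves FE_INEXACT clear is a quality-of-implementation question, but the naive addition reliably raises it):
// Sketch: the naive x + 0.5 approach raises FE_INEXACT as a side effect.
#include <cfenv>
#include <cmath>
#include <cstdio>
// #pragma STDC FENV_ACCESS ON   // required in principle; compiler support varies

int main() {
    double x = 0.1;                        // 0.1 + 0.5 is not exactly representable
    std::feclearexcept(FE_ALL_EXCEPT);
    volatile double y = x + 0.5;           // volatile keeps the addition at run time
    (void)y;
    std::printf("naive add raised inexact: %d\n",
                (int)(std::fetestexcept(FE_INEXACT) != 0));   // expect 1 on IEEE 754 systems
}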
What happens if you divide by Zero on a Computer?
In any programming language (that I have worked with, at least) this raises an error.
But why? Is it built into the language that this is prohibited? Or will it compile, and the hardware will figure out that an error must be returned?
I guess handling this in the language can only be done if it is hard-coded, e.g. if there is a line like double z = 5.0/0.0; If it is a function call and the divisor is given from outside, the language could not even know that this is a division by zero (at least at compile time), for example:
double divideByZero(double divisor) {
    return 5.0 / divisor;
}
where the function is then called with divisor = 0.0.
Update:
According to the comments/answers it makes a difference whether you divide by int 0 or double 0.0.
I was not aware of that. This is interesting in itself and I'm interested in both cases.
Also, one answer is that the CPU throws an error. Now, how is this done? In software (which doesn't make sense on a CPU), or are there some circuits which recognize this? I guess this happens in the Arithmetic Logic Unit (ALU).
When an integer is divided by 0 in the CPU, this causes an interrupt.¹ A programming language implementation can then handle that interrupt by throwing an exception or employing whichever other error-handling mechanisms the language has.
When a floating point number is divided by 0, the result is positive infinity, negative infinity, or NaN (NaN when the dividend is also zero, as in 0.0/0.0); these are special floating point values. That's mandated by the IEEE floating point standard, which any modern CPU will adhere to. Programming languages generally do as well. If a programming language wanted to handle it as an error instead, it could just check for NaN or infinite results after every floating point operation and cause an error in that case. But, as I said, that's generally not done.
¹ On x86 at least. But I imagine it's the same on most other architectures as well.
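A short C++ sketch of both cases (floating-point division by zero is well defined on IEEE 754 implementations, while integer division by zero is undefined behavior in C and C++ and typically traps on x86):
// Sketch: floating-point vs. integer division by zero in C++.
#include <cstdio>

int main() {
    volatile double dzero = 0.0;          // volatile so the compiler can't fold the divisions away
    std::printf("%f\n",  5.0 / dzero);    // inf
    std::printf("%f\n", -5.0 / dzero);    // -inf
    std::printf("%f\n",  0.0 / dzero);    // nan

    // volatile int izero = 0;
    // int boom = 5 / izero;              // undefined behavior; on x86 this raises #DE -> SIGFPE
    return 0;
}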
Can anyone please explain why this gives different outputs?
round(1.49999999999999)
1
round(1.4999999999999999)
2
I have read the round documentation but it does not mention anything about it there.
I know that R represents numbers in binary form, but why does adding two extra 9's change the result?
Thanks.
1.4999999999999999 can't be represented internally, so it gets rounded to 1.5.
Now, when you apply round(), the result is 2.
Put those two numbers into variables and then print them (with enough digits); you'll see they are different.
Computers don't store these kinds of numbers with this exact value (they don't use decimal numbers internally).
I have never used R, so I don't know if this is the issue, but in other languages such as C/C++ a number like 1.4999999999999999 is represented by a float or a double.
Since these have finite precision, you cannot represent something like 1.4999999999999999 exactly. It might be the case that 1.4999999999999999 actually gets stored as 1.50000000000000 instead due to limitations on floating point precision.
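You can check this outside R as well; the sketch below is C++ rather than R, but R stores its numbers as the same IEEE 754 doubles, so the two literals behave the same way: 1.4999999999999999 is closer to 1.5 than to any other double and is therefore stored as exactly 1.5, while 1.49999999999999 is stored as a value just below 1.5. (C++'s std::round breaks ties away from zero while R's round uses round-half-to-even, but both give 2 for an argument of exactly 1.5.)
// Sketch (C++, not R): the literal with the extra 9's is stored as exactly 1.5.
#include <cmath>
#include <cstdio>

int main() {
    double a = 1.49999999999999;     // nearest double is a little below 1.5
    double b = 1.4999999999999999;   // nearest double is exactly 1.5
    std::printf("a = %.17g  round(a) = %g\n", a, std::round(a));   // round(a) = 1
    std::printf("b = %.17g  round(b) = %g\n", b, std::round(b));   // b = 1.5, round(b) = 2
}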