Pattern matching with reals (Standard ML) - functional-programming

Doing this:
fun test a 0.0 = "good"
| test a b = "bad";
results in an error, but if I change the 0.0 the error goes away. However, I need to match 0.0 and I'm wondering if and how that can be accomplished.

You can just use an if-statement instead of pattern-matching.
Note that floating point arithmetic is prone to rounding errors, so you should check that the absolute value of b is smaller than some delta rather than that it's equal to 0.0. I assume that's exactly why pattern matching reals is not allowed.
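For example, a minimal sketch of that approach in Standard ML (the tolerance value here is an arbitrary choice; pick one that suits your data):

(* compare against a small tolerance instead of matching 0.0 exactly *)
val delta = 1.0E~9

fun test a b =
    if Real.abs b < delta
    then "good"
    else "bad";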

Supposedly it's because real is not an eqtype in SML 97: http://www.smlnj.org/doc/Conversion/types.html#Real-equality

Related

Why isless(0 , missing) returns true but 0<missing returns missing in Julia?

I encountered some strange behavior in Julia today when I tried to compare missing with 0. If you try 0 < missing in the REPL:
julia> 0<missing
missing
But if you try to use pre-built isless function like this:
julia> isless(0 , missing)
true
It's strange, because 0 < missing should mean "0 is less than missing", the same as isless(0, missing). According to the isless documentation:
isless(t1::Tuple, t2::Tuple)
Returns true when t1 is less than t2 in lexicographic order.
Shouldn't 0<missing return true as well?
missing < 0 produces missing because < is a "standard" comparison operator.
Now consider a question: is missing less than 0? The answer is: we do not know. missing stands for some unknown value - it can be greater or less than 0. Therefore missing < 0 produces missing.
As you can see in the documentation of <:
this operator implements a partial order.
This means that it does not guarantee that all values are comparable. A similar (but not identical) situation is with NaN:
julia> NaN < 0.0
false
julia> NaN > 0.0
false
julia> NaN == 0.0
false
Also notice that although -0.0 and 0.0 are different floating point values we have:
julia> -0.0 < 0.0
false
Now let us discuss isless. It is needed because sometimes you want an operator that defines total order, as you can see in its documentation:
Test whether x is less than y, according to a fixed total order (defined together with isequal).
This is useful in cases you want to make sure that you can safely assume that every value is comparable. For example sort uses isless.
Now in order to make sure isless defines total order we need to define how missing is positioned in this total order. It was decided that missing is larger than any other value. Therefore isless(missing, 0) is false. Of course this is an arbitrary decision. It was made so that e.g. when you sort
a vector missing is put at the end:
julia> sort([1, missing, 2])
3-element Vector{Union{Missing, Int64}}:
1
2
missing
which is what people usually want.
Now notice the behavior of NaN and -0.0:
julia> isless(NaN, missing)
true
julia> isless(Inf, NaN)
true
julia> isless(-0.0, 0.0)
true
So as you can see, under isless NaN is greater than Inf but less than missing, and -0.0 is less than 0.0. Again, since isless defines a total order, distinct values have to occupy distinct places in that order.
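Putting the pieces together, sorting a vector that mixes ordinary floats, -0.0, Inf, NaN and missing shows the whole total order at once (output as printed by a recent Julia; the container header may differ between versions):
julia> sort([missing, NaN, Inf, 0.0, -0.0, 1.0])
6-element Vector{Union{Missing, Float64}}:
-0.0
0.0
1.0
Inf
NaN
missing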
From the doc:
Standard equality and comparison operators follow the propagation rule
presented above: if any of the operands is missing, the result is
missing.
julia> missing < 1
missing
julia> 2 >= missing
missing
And then, a bit below:
The isless operator is another exception: missing is considered as
greater than any other value. This operator is used by sort, which
therefore places missing values after all other values.
julia> isless(1, missing)
true
julia> isless(missing, Inf)
false
julia> isless(missing, missing)
false
So this seems to be by design:
https://docs.julialang.org/en/v1/manual/missing/#Propagation-of-Missing-Values
Now if your question is why this has been designed like this, then perhaps Julia's designers will stop by to enlighten us :)

Why floating point precision error occurs in data table [duplicate]

I know that floating point math can be ugly at best, but I am wondering if somebody can explain the following quirk. In most of the programming languages I tested, adding 0.4 to 0.2 gave a slight error, whereas 0.4 + 0.1 + 0.1 gave none.
What is the reason for the inequality of the two calculations, and what measures can one take in the respective programming languages to obtain correct results?
In python2/3
.4 + .2
0.6000000000000001
.4 + .1 + .1
0.6
The same happens in Julia 0.3
julia> .4 + .2
0.6000000000000001
julia> .4 + .1 + .1
0.6
and Scala:
scala> 0.4 + 0.2
res0: Double = 0.6000000000000001
scala> 0.4 + 0.1 + 0.1
res1: Double = 0.6
and Haskell:
Prelude> 0.4 + 0.2
0.6000000000000001
Prelude> 0.4 + 0.1 + 0.1
0.6
but R v3 gets it right:
> .4 + .2
[1] 0.6
> .4 + .1 + .1
[1] 0.6
All these languages are using the system-provided floating-point format, which represents values in binary rather than in decimal. Values like 0.2 and 0.4 can't be represented exactly in that format, so instead the closest representable value is stored, resulting in a small error. For example, the numeric literal 0.2 results in a floating-point number whose exact value is 0.200000000000000011102230246251565404236316680908203125. Similarly, any given arithmetic operation on floating-point numbers may result in a value that's not exactly representable, so the true mathematical result is replaced with the closest representable value. These are the fundamental reasons for the errors you're seeing.
However, this doesn't explain the differences between languages: in all of your examples, the exact same computations are being made and the exact same results are being arrived at. The difference then lies in the way that the various languages choose to display the results.
Strictly speaking, none of the answers you show is correct. Making the (fairly safe) assumption of IEEE 754 binary 64 arithmetic with a round-to-nearest rounding mode, the exact value of the first sum is:
0.600000000000000088817841970012523233890533447265625
while the exact value of the second sum is:
0.59999999999999997779553950749686919152736663818359375
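You can check these exact values yourself; for instance, Python's decimal module displays the stored binary64 value exactly, since converting a float to Decimal involves no rounding (a quick sketch):
from decimal import Decimal

# Decimal(x) converts the binary64 value exactly, so it reveals what is stored.
print(Decimal(0.2))              # 0.200000000000000011102230246251565404236316680908203125
print(Decimal(0.4 + 0.2))        # 0.600000000000000088817841970012523233890533447265625
print(Decimal(0.4 + 0.1 + 0.1))  # 0.59999999999999997779553950749686919152736663818359375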
However, neither of those outputs is particularly user-friendly, and clearly all of the languages you tested made the sensible decision to abbreviate the output when printing. But they don't all adopt the same strategy for formatting the output, which is why you're seeing differences.
There are many possible strategies for formatting, but three particularly common ones are:
Compute and display 17 correctly-rounded significant digits, possibly stripping trailing zeros where they appear. The output of 17 digits guarantees that distinct binary64 floats will have distinct representations, so that a floating-point value can be unambiguously recovered from its representation; 17 is the smallest integer with this property. This is the strategy that Python 2.6 uses, for example.
Compute and display the shortest decimal string that rounds back to the given binary64 value under the usual round-ties-to-even rounding mode. This is rather more complicated to implement than strategy 1, but preserves the property that distinct floats have distinct representations, and tends to make for pleasanter output. This appears to be the strategy that all of the languages you tested (besides R) are using.
Compute and display 15 (or fewer) correctly-rounded significant digits. This has the effect of hiding the errors involved in the decimal-to-binary conversions, giving the illusion of exact decimal arithmetic. It has the drawback that distinct floats can have the same representation. This appears to be what R is doing. (Thanks to @hadley for pointing out in the comments that there's an R setting which controls the number of digits used for display; the default is to use 7 significant digits.)
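For a side-by-side look at the three strategies, here is the first sum formatted each way in Python 3 (repr gives the shortest round-tripping string; the 'g' formats request a given number of significant digits and strip trailing zeros):
x = 0.4 + 0.2

print(format(x, '.17g'))   # 0.60000000000000009  (strategy 1: 17 significant digits)
print(repr(x))             # 0.6000000000000001   (strategy 2: shortest round-tripping string)
print(format(x, '.15g'))   # 0.6                  (strategy 3: 15 significant digits hide the error)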
You should be aware that 0.6 cannot be exactly represented in IEEE floating point, and neither can 0.4, 0.2, and 0.1. This is because the ratio 1/5 is an infinitely repeating fraction in binary, just like ratios such as 1/3 and 1/7 are in decimal. Since none of your initial constants is exact, it is not surprising that your results are not exact, either. (Note: if you want to get a better handle on this lack of exactness, try subtracting the value you expect from your computed results...)
There are a number of other potential gotchas in the same vein. For instance, floating point arithmetic is only approximately associative: adding the same set of numbers together in different orders will usually give you slightly different results (and occasionally can give you very different results). So, in cases where precision is important, you should be careful about how you accumulate floating point values.
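For example, regrouping the same three constants changes the intermediate rounding and therefore the result (shown here in Python, but any language using IEEE 754 binary64 behaves the same way):
print((0.1 + 0.2) + 0.3)   # 0.6000000000000001
print(0.1 + (0.2 + 0.3))   # 0.6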
The usual advice for this situation is to read "What Every Computer Scientist Should Know About Floating Point Arithmetic", by David Goldberg. The gist: floating point is not exact, and naive assumptions about its behavior may not be supported.
The reason is that the results are rounded according to the IEEE Standard for Floating-Point Arithmetic:
http://en.wikipedia.org/wiki/IEEE_754
According to the standard, addition, multiplication, and division must be correctly rounded, i.e. correct all the way up to the last bit of the result. This is because a computer has only a finite amount of space to represent these values and cannot carry unlimited precision.

About behaviour of / by vector in Julia

3/[2;2] gives
1×2 LinearAlgebra.Transpose{Float64,Array{Float64,1}}:
0.75 0.75
while 3 ./[2;2] gives
2-element Array{Float64,1}:
1.5
1.5
The second one is easy to comprehend. It broadcasts 3 and performs element wise division. But what is the reasoning behind having the first operation behave as it did? I assume it took the sum of the vector, which was 2x1, performed division of 3 by 4 and broadcast it to a 1x2 transposed vector. I can accept taking the sum of the vector to perform division, but why the transpose? Or why not just return a scalar?
It simply multiplies the left operand by the pseudo-inverse of the right-hand-side operand.
julia> ?/
...
Right division operator: multiplication of x by the inverse of y on the right.
Although it seems surprising at first sight, it is actually the natural behavior. A rowvector*columnvector gives a scalar and hence a scalar divided by a column vector should give a row vector, which is the case. Note that RowVector has been removed in 1.0 and what you get is actually a row vector represented with Transpose.
You can write @less 1 / [2;2] to see what actually happens.
Also take a look at this GitHub issue to understand the behaviour a bit more and this discourse topic for some use cases.
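A quick way to check that 3 / [2; 2] really is 3 times the pseudo-inverse, and that the resulting row times the original column gives the scalar back (a sketch, assuming Julia ≥ 1.0; pinv lives in the LinearAlgebra stdlib):
julia> using LinearAlgebra

julia> 3 / [2; 2] == 3 * pinv([2; 2])   # right division multiplies by the pseudo-inverse
true

julia> (3 / [2; 2]) * [2; 2]            # row result times the column vector recovers the scalar
3.0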
It seems it is calculating the pseudoinverse of the vector and then multiplying by 3.
Using @which 3/[2;2] and so on to see what actually happens, I found that it is eventually calling the following method in stdlib/LinearAlgebra/generic.jl:
function _vectorpinv(dualfn::Tf, v::AbstractVector{Tv}, tol) where {Tv,Tf}
    res = dualfn(similar(v, typeof(zero(Tv) / (abs2(one(Tv)) + abs2(one(Tv))))))
    den = sum(abs2, v)
    # as tol is the threshold relative to the maximum singular value, for a vector with
    # single singular value σ=√den, σ ≦ tol*σ is equivalent to den=0 ∨ tol≥1
    if iszero(den) || tol >= one(tol)
        fill!(res, zero(eltype(res)))
    else
        res .= dualfn(v) ./ den
    end
    return res
end
which in the given case effectively becomes transpose([2;2])/sum(abs2, [2;2]) which is the pseudoinverse.
However, this is a bit above my head. So someone more qualified might prove me wrong.

After rounding a float variable, there is still a number like `0.80000001`

I am using MT4, but this might be a general floating point question.
I am using the NormalizeDouble function, which rounds a number to a given number of digits, like this:
double x = 1.33242;
double y = NormalizeDouble(x, 2); // y is 1.33
However, in some cases, even after rounding with NormalizeDouble, a number such as 0.800000001 shows up.
I have no idea why it happens or how to fix it.
It might be a basic mathematical thing.
You are rounding to powers of 10, but the fractional part of a float/double can exactly express only powers of 2, like
0.5, 0.25, 0.125, ...
and numbers decomposable into them. Hence your case:
0.8 = 1/2 + 1/4 + 1/32 + 1/64 + 1/512 + 1/1024 + 1/8192 + 1/16384 + ...
    = 0.5 + 0.25 + 0.03125 + 0.015625 + 0.001953125 + 0.0009765625 + 0.0001220703125 + 0.00006103515625 + ...
    = 0.11001100110011... [bin]
Since 0.8 is a periodic number in binary, it will always cause some noise in the lower bits of the mantissa. The FPU tries to find the closest representable number to your desired value, hence the 0.800000001.
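As for the "how to fix it" part: the usual workaround is to compare against a small tolerance instead of expecting an exact decimal after NormalizeDouble. A minimal MQL-style sketch (the tolerance value here is an arbitrary choice):
void OnStart()
  {
   double y = NormalizeDouble(0.8000000001, 2);
   // don't rely on y being exactly 0.80; treat "close enough" as equal
   if(MathAbs(y - 0.8) < 0.0000001)
      Print("y is 0.8 for all practical purposes: ", y);
  }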

F#: integer (%) integer - Is Calculated How?

So in my textbook there is this example of a recursive function in F#:
let rec gcd = function
    | (0, n) -> n
    | (m, n) -> gcd (n % m, m);;
With this function, my textbook gives an example by executing:
gcd(36,116);;
and since m = 36 and not 0, it of course goes to the second clause, like this:
gcd(116 % 36,36)
gcd(8,36)
gcd(36 % 8,8)
gcd(4,8)
gcd(8 % 4,4)
gcd(0,4)
and now it hits the first clause, so this entire thing equals 4.
What I don't get is this (%) percentage sign/operator, or whatever it is called in this context. For instance, I don't get how
116 % 36 = 8
I have turned this over in my head so many times and I can't figure out how it turns into 8.
I know this is probably a silly question for those of you who know this, but I would very much appreciate your help all the same.
% is a questionable version of modulo, which is the remainder of an integer division.
For positive operands, you can think of % as the remainder of the division. See for example Wikipedia on Euclidean Division. Consider 9 % 4: 4 fits into 9 twice. But two times four is only eight. Thus, there is a remainder of one.
If there are negative operands, % effectively ignores the signs to calculate the remainder and then uses the sign of the dividend as the sign of the result. This corresponds to the remainder of an integer division that rounds toward zero, i.e. -2 / 3 = 0.
This is a mathematically unusual definition of division and remainder that has some bad properties. Normally, when calculating modulo n, adding or subtracting n to the input has no effect. Not so for this operator: 2 % 3 is not equal to (2 - 3) % 3.
I usually have the following defined to get useful remainders when there are negative operands:
/// Euclidean remainder, the proper modulo operation
let inline (%!) a b = (a % b + b) % b
So far, this operator was valid for all cases I have encountered where a modulo was needed, while the raw % repeatedly wasn't. For example:
When filling rows and columns from a single index, you could calculate rowNumber = index / nCols and colNumber = index % nCols. But if index and colNumber can be negative, this mapping becomes invalid, while Euclidean division and remainder remain valid.
If you want to normalize an angle to (0, 2pi), angle %! (2. * System.Math.PI) does the job, while the "normal" % might give you a headache.
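To see the difference concretely in F# Interactive (the %! operator is the one defined above, repeated here so the snippet stands alone):
let inline (%!) a b = (a % b + b) % b

let x = -2
printfn "%d" (x % 3)    // -2 : the built-in remainder takes the sign of the dividend
printfn "%d" (x %! 3)   //  1 : the Euclidean-style remainder stays in [0, b) for b > 0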
Because
116 / 36 = 3
116 - (3*36) = 8
Basically, the % operator, known as the modulo operator, divides one number by another and gives the remainder once it can't divide any further. Usually, the first place you would use it is to check whether a number is even or odd, by doing something like this in F#:
let firstUsageModulo = 55 % 2 = 0 // false, because the division leaves 1, not 0
When it leaves 8 the first time, that means it divided 116 by 36 and 8 was the remainder left over.
Just to help you in future with similar problems: in IDEs such as Xamarin Studio and Visual Studio, if you hover the mouse cursor over an operator such as % you should get a tooltip, thus:
[Image: modulo operator tooltip]
Even if you don't understand the tool tip directly, it'll give you something to google.
