R numeric value: exact or rounded value

As the R FAQ states, the only numbers that can be represented exactly in R's numeric type are
- integers
- fractions whose denominator is a power of 2
All other numbers are internally rounded to (typically) 53 binary digits of accuracy.
Given that, how can I tell whether a given number can be stored exactly in R's numeric type or has to be rounded? Stated differently, how can I tell whether a number is a fraction whose denominator is a power of 2?
For example, 0.03125 can be stored exactly, since it is 1/32 (i.e., its denominator is a power of 2):
sprintf("%.60f", 1/32)
#[1] "0.031250000000000000000000000000000000000000000000000000000000"


Binary 2's Complement

I'm facing a problem. When we want to subtract one number from another using 2's complement, we can do that, but I don't know how to subtract fractional numbers using 2's complement.
5 in binary is 101 and 2 is 10. If we want to subtract 2 from 5, we need to find the 2's complement of 2:
2's complement of 2 -> 11111110
If we now add this to the binary form of 5, we get the result of the subtraction. If I want to get the result of 5.5 - 2.125, what would be the procedure?
Fixed-point numbers can be used, and they are still common in embedded code and hardware.
Their use is identical to integers, but you need to specify where your "point" is. For instance, assume that you want 3 bits after the point and that your data is 8 bits wide: bits 7..3 are the integer part (left of the "point") and bits 2..0 the fractional part. The interpretation of the integer part is, as usual, the binary decomposition of that integer: bit 3 corresponds to 2^0, bit 4 to 2^1, etc.
For the fractional part, the decomposition is in negative powers of two: bit 2 corresponds to 2^-1, bit 1 to 2^-2 and bit 0 to 2^-3.
So for your problem, 5.5 = 4 + 1 + 1/2 = 2^2 + 2^0 + 2^-1, and its code is 00101(.)100. Similarly, 2.125 = 2 + 1/8, and its code is 00010(.)001 (the (.) is just an aid to reading the coding).
Indeed, these are just integers, but you must take into account that all your numbers are multiplied by 2^-3. This has no impact on addition, but the results of multiplication and division must be adjusted. Taking into account the position of the point and managing overflows and underflows is the difficulty of fixed-point arithmetic, but it allows fractional computations even if your hardware does not provide floating-point support (for instance on low-end microcontrollers or FPGA systems).
Two's complement works just as it does for integers, and its computation is identical. If the code of 2.125 is 00010(.)001, then -2.125 == 11101(.)111. Operations are as usual.
 5.5     00101(.)100
-2.125   11101(.)111
 3.375   00011(.)011
and 00011(.)011 = 2 + 1 + 1/4 + 1/8 = 3.375
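To make the procedure concrete, here is a small Python sketch of the same computation, assuming the 8-bit, 3-fractional-bit format described above; the helper names to_q53 and from_q53 are invented for this example:
def to_q53(x):
    # Scale by 2^3 and wrap to 8 bits (two's complement encoding).
    return round(x * 8) & 0xFF

def from_q53(bits):
    # Undo the 8-bit two's complement wrap, then unscale.
    if bits & 0x80:
        bits -= 0x100
    return bits / 8.0

a = to_q53(5.5)                      # 0b00101100
b_neg = (-to_q53(2.125)) & 0xFF      # two's complement of 0b00010001 -> 0b11101111
print(from_q53((a + b_neg) & 0xFF))  # 3.375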
For the record, the first use of two's complement was for fixed-point fractional numbers, and the name comes from that. If a fractional number is represented by, say, 0(.)1100000 (0.75), its negative counterpart will be 1(.)0100000 (-0.75, or 1.25 if interpreted as unsigned), and we always have x + (unsigned)(-x) = 2. For this coding, the negative of a fractional number x is the number y that must be added to x to get 2, hence the name: y is the 2's complement of x.

Representing decimal numbers in binary

How do I represent integers, for example 23647, in two bytes, where one byte contains the last two digits (47) and the other contains the rest of the digits (236)?
There are several ways to do this.
One way is to use Binary Coded Decimal (BCD). This codes the decimal digits, rather than the number as a whole, into binary. The packed form puts two decimal digits into each byte. However, your example value 23647 has five decimal digits and will not fit into two bytes in BCD; this method only fits values up to 9999.
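For values that do fit, packed BCD is easy to produce; here is a short Python sketch, with the hypothetical helper name to_packed_bcd:
def to_packed_bcd(n):
    # Split 0 <= n <= 9999 into four decimal digits.
    d1, d2, d3, d4 = (n // 1000) % 10, (n // 100) % 10, (n // 10) % 10, n % 10
    # Pack two digits per byte, one digit per nibble.
    return bytes([(d1 << 4) | d2, (d3 << 4) | d4])

print(to_packed_bcd(4723).hex())   # '4723': each hex nibble is one decimal digit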
Another way is to put each of your two parts in binary and place each part into a byte. You can do integer division by 100 to get the upper part, so in Python you could use
upperbyte = 23647 // 100
Then the lower part can be obtained with the modulus operation:
lowerbyte = 23647 % 100
Python will directly convert the results into binary and store them that way. You can do all this in one step in Python and many other languages:
upperbyte, lowerbyte = divmod(23647, 100)
You are guaranteed that the lowerbyte value fits, but if the given value is too large (above 255 * 100 + 99 = 25599), the upperbyte value may not actually fit into a byte. All this assumes that the value is positive; negative values would complicate things.
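A short Python round trip showing the pack and unpack steps, assuming a positive value of at most 25599 so both parts fit in a byte:
value = 23647
upperbyte, lowerbyte = divmod(value, 100)
packed = bytes([upperbyte, lowerbyte])     # two bytes: 236 and 47
restored = packed[0] * 100 + packed[1]
print(restored == value)                   # True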
(The following answer was for a previous version of the question, which asked to fit a floating-point number like 36.47 into two bytes, one byte for the integer part and another for the fractional part.)
One way to do that is to "shift" the number so you consider those two bytes to be a single integer.
Take your value (36.47), multiply it by 256 (the number of values that fit into one byte), round to the nearest integer, and convert that to binary. The bottom 8 bits of that value are the "decimal" part and the next 8 bits are the "integer" part. If any bits remain beyond those, your number was too large and there is an overflow condition.
This assumes you want to handle only non-negative values. Handling negatives complicates things somewhat. The final result is only an approximation to your starting value, but that is the best you can do.
Doing those calculations on 36.47 gives the binary integer
10010001111000
So the "decimal byte" is 01111000 and the "integer byte" is 100100 or 00100100 when filled out to 8 bits. This represents the float number 36.46875 exactly and your desired value 36.47 approximately.

R largest/smallest representable numbers

I'm trying to get the largest/smallest representable number in R.
After typing ".Machine"
I got:
$double.xmin
[1] 2.225074e-308
$double.xmax
[1] 1.797693e+308
However, even if I type 2.225074e-309 at the R command prompt, I get 2.225074e-309 back instead of the expected 0.
How can I find the largest/smallest number for which adding or subtracting 1 would lead to Inf (adding 1 to the largest number) or 0 (subtracting 1 from the smallest number)?
.Machine$double.xmin gives the value of the smallest positive number whose representation meets the requirements of IEEE 754 technical standard for floating point computation. As is mentioned in the Wikipedia article on double-precision floating point numbers, that standard requires that:
If a decimal string with at most 15 significant digits is converted to IEEE 754 double precision representation and then converted back to a string with the same number of significant digits, then the final string should match the original. If an IEEE 754 double precision is converted to a decimal string with at least 17 significant digits and then converted back to double, then the final number must match the original.
The same article goes on to note that, by compromising precision, even smaller positive numbers (which do not meet the standards' precision requirements) can be represented:
The 11-bit width of the exponent allows the representation of numbers between 10^-308 and 10^308, with full 15–17 decimal digits of precision. By compromising precision, the subnormal representation allows even smaller values up to about 5 × 10^-324.
R's doubles behave in exactly this way, as is noted in the Details section of ?.Machine:
Note that on most platforms smaller positive values than
‘.Machine$double.xmin’ can occur. On a typical R platform the
smallest positive double is about ‘5e-324’.
To confirm that that is the smallest positive value that can be represented using R's doubles and to see the cost in loss of precision, try out a few operations like this:
5e-324
# [1] 4.940656e-324
2e-324
# [1] 0
1.4 * 5e-324
# [1] 4.940656e-324
1.6 * 5e-324
# [1] 9.881313e-324
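For comparison, the same IEEE 754 double limits are visible from Python's standard library (math.ulp requires Python 3.9 or later):
import sys, math

print(sys.float_info.max)   # 1.7976931348623157e+308, like .Machine$double.xmax
print(sys.float_info.min)   # 2.2250738585072014e-308, like .Machine$double.xmin
print(math.ulp(0.0))        # 5e-324, the smallest positive subnormal
print(5e-324 / 2)           # 0.0: halving it underflows to zero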
Here are some representations using SAS (IEEE 754, big endian):
data _null_;
y=constant('big');
put y hex16.;
put y E21.3;
run;quit;
Biggest
7FEFFFFFFFFFFFFF
1.79769313486230E+308
data _null_;
y=constant('small');
put y hex16.;
put y E21.3;
run;quit;
Smallest
0010000000000000
2.22507385850720E-308
I am not sure about the smallest, because SAS may set aside some of those values for missing values.

Performance of string operations in Dyalog

I have 2 questions related to comparing character vectors in Dyalog APL.
The following code will compare character vectors one-by-one:
a←'ATCG'
b←'GTCA'
a=b
In order to speed up (in the case of 2 vectors, as well as when comparing many vectors to a single vector), should I convert the character vectors to numeric vectors, or does it not matter in APL (similar to comparing chars in C)?
I am comparing DNA sequences (which may consist of letters from the ATCG alphabet only). Is there anything I can do to speed up various operations on such vectors?
Interestingly, on my (old) version of Dyalog APL, converting characters to small integers actually runs some 25% faster. This may have been sped up in more recent versions.
Try
a ← ⎕AV⍳'ATCG'
b ← ⎕AV⍳'GTCA'
a = b
Be sure that the largest value is less than 128.
To check that you have the smallest possible representation of integers, use the ⎕DR function. ⎕DR a should return 82 for an integer -128 <= x <= 127.
Dyalog APL will automagically convert to the lowest possible integer width.
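For readers without APL at hand, the same narrow-integer idea can be sketched with numpy (an illustration of the representation, not a Dyalog benchmark):
import numpy as np

a = np.frombuffer(b"ATCG", dtype=np.uint8)   # characters viewed as 1-byte integers
b = np.frombuffer(b"GTCA", dtype=np.uint8)
print(a == b)                                # [False  True  True False]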

What number in binary can only be represented as an approximation?

In decimal (base 10), 1/3 can only be approximated to 0.33333 repeating.
What number is the equivalent in binary that can only be represented as an approximation?
0.1 is one such example, as well as 0.2
This question is also similar to this other SO question, which already has very good answers.
A better question is to ask what numbers can be represented exactly in binary. Everything else can only be approximated or not represented at all.
See What every computer scientist should know about floating point arithmetic.
Well, there are infinite numbers that can't be precisely represented in that notation, but here's one: 1/10.
I am assuming that you mean to ask which rational numbers can be expressed in binary using a finite representation. I am deducing this from your example of 1/3 in decimal. The fact is that every rational number can be expressed in binary if you allow infinite representations. But this question is only interesting from a computer science perspective if you only permit finite representations. I am further assuming that you are not asking about specific computer representations (say, IEEE 754) but rather merely asking about general positional representations.
A rational number p/q with (p, q) = 1 can be expressed as a finite representation in base b if and only if every prime factor of q divides b. No irrational number has a finite representation in any base.
In particular, a rational number p/q with (p, q) = 1 can be expressed as a finite representation in binary if and only if every prime factor of q divides 2. That is, the only rational numbers p/q with (p, q) = 1 that have a finite representation in binary are those where q = 2^k for some nonnegative integer k. Moreover, all such rational numbers can be expressed in a finite representation in binary. These numbers are known as dyadic rationals.
The numbers that can be exactly represented in base 2 are the dyadic rationals. These are numbers that can be written in the form k/2^n for some integer k and whole number n. Any number that cannot be written in that form will have a non-terminating representation in base 2.
However, you seem to be asking not about what numbers are representable in base 2, but rather what numbers are representable in some fixed floating-point type, such as float or double. This is a more subtle question; any number that is not a dyadic rational cannot be represented, but not all dyadic rationals can be represented either.
It's every number that can't be expressed as k/2^n for integer k and whole number n.
The easy way to find all these numbers is to write down some primes other than 2: 3, 5, 7, 11, 13, 17 and 19 are good examples.
Start multiplying: 1/3, 2/3, 1/5, 2/5, 3/5, 4/5, 1/6, 5/6, 1/7, 2/7, etc.
If you do this, and you avoid numbers of the form k/2^n, you'll enumerate every possible fraction that cannot be exactly represented in binary.
You should probably stop enumerating when you get to numbers for which the left-most 64-bits are all identical.
In Python 2.4:
>>> 1.0 / 5.0
0.20000000000000001
That indicates that base 2 has a hard time representing it exactly.
binary(.00011001100110011...) == decimal(.1)
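A tiny Python sketch that generates those binary digits of 1/10 by repeated doubling, making the repeating 0011 block visible:
from fractions import Fraction

x = Fraction(1, 10)
digits = []
for _ in range(20):
    x *= 2                       # shift one binary place to the left
    digits.append(int(x >= 1))   # the digit that crossed the point
    x -= digits[-1]              # keep only the fractional part
print("0." + "".join(map(str, digits)))   # 0.00011001100110011001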
I'm going to take a stab at infinity
The same set of numbers that can't be exactly represented by base 10 can't exactly be represented by base 2. There shouldn't be a difference there.
