Number of combinations of bytes ordered by value - math

I can't wrap my head around a mathematical question regarding combinations/permutations:
1 byte can store 256 possible combinations (2^8).
2 bytes can store 65536 possible combinations (2^16).
3 bytes can store 16777216 possible combinations (2^24).
This only works because I have all possible combinations for each byte.
Given the requirement that the bytes must be ordered by value – meaning byte1 >= byte2 >= ... >= byteN – how many combinations are there for 2, 3 or more bytes?
I'm pretty sure that for 2 bytes there are 32,896 combinations, which I arrived at with the following formula:
(combinations * (combinations + 1)) / 2
I currently have no idea how to adapt this to 3 or more bytes.

Note that whether you impose byte1 ≥ byte2 ≥ ... or byte1 > byte2 > ... will change the result.
For byte1 > byte2 > ..., with 256 different possible bytes, there are (256 choose k) different possible ordered sequences of k distinct bytes.
For byte1 ≥ byte2 ≥ ..., with 256 different possible bytes, there are (256 + k - 1 choose k) different possible ordered sequences of k bytes.
Here (n choose k) is the binomial coefficient.
The result for byte1 ≥ byte2 ≥ ... can be proven using the stars and bars method.
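As a quick check, here is a small C sketch of mine (the helper name choose is not from the answer) that evaluates (256 + k - 1 choose k) for a few values of k:

#include <stdio.h>

/* Binomial coefficient C(n, k); exact because each intermediate division is exact. */
static unsigned long long choose(unsigned long long n, unsigned long long k)
{
    unsigned long long result = 1;
    for (unsigned long long i = 1; i <= k; i++)
        result = result * (n - k + i) / i;
    return result;
}

int main(void)
{
    for (int k = 1; k <= 4; k++) {
        printf("k = %d: %llu non-increasing byte sequences\n",
               k, choose(256 + k - 1, k));
    }
    /* k = 2 prints 32896, matching the (256 * 257) / 2 result from the question. */
    return 0;
}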

Related

Give a big-O bound on the number of bits needed to represent a number N

N is a random number, and I am confused about the bound.
Any help is appreciated.
Well, for a random number n, there exist a and b such that 2^a <= n <= 2^b, or simply a k such that 2^(k-1) <= n <= 2^k - 1 (1). We know that any number less than 2^k can be represented with log(2^k) = k bits (2). For example:
5: 4 = 2^2 <= 5 < 8 = 2^3; so we need 3 bits to represent 5
23: 16 = 2^4 <= 23 < 32 = 2^5; so we need 5 bits to represent 23
In conclusion, for an exact number of bits b for a random number n, we can use the formula:
b = floor(log(n)) + 1
So, the big-O bound we get is O(floor(log(n)) + 1) = O(log n).
1) I assumed it is a random, positive integer (although it's easy to generalize for negative numbers too), which I suppose is your problem case; for fractional numbers it's a bit harder to generalize this formula.
2) The log notation refers to the logarithm in base 2.
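As a sanity check, here is a small C sketch of mine (not from the answer; compile with -lm) comparing floor(log2(n)) + 1 against a simple shift loop:

#include <math.h>
#include <stdio.h>

/* Count bits by shifting: the loop body runs floor(log2(n)) + 1 times for n >= 1. */
static int bits_by_shifting(unsigned int n)
{
    int bits = 0;
    while (n > 0) {
        n >>= 1;
        bits++;
    }
    return bits;
}

int main(void)
{
    unsigned int samples[] = { 1, 5, 23, 255, 256, 19444 };
    for (int i = 0; i < (int)(sizeof samples / sizeof samples[0]); i++) {
        unsigned int n = samples[i];
        int by_formula = (int)floor(log2((double)n)) + 1;   /* b = floor(log(n)) + 1 */
        printf("n = %u: formula gives %d bits, shift loop gives %d bits\n",
               n, by_formula, bits_by_shifting(n));
    }
    return 0;
}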

Byte representation of float as integer equation

In this article an equation is used I don't understand:
I = (e + B) * L + m * L
I is the byte representation of a float interpreted as an integer.
Here is an example:
float x = 3.5f;
unsigned int i = *((unsigned int *)&x);
e is the exponent of the float.
B is the bias (127).
L is a constant (1 << 23).
m is the mantissa.
Now my question is:
Why is the equation correct and where can I read more about this equation?
As you know, floating-point numbers are stored according to the IEEE 754 standard. The bit pattern of a single-precision float is 1 sign bit, 8 exponent bits, and 23 mantissa bits, from most to least significant.
For normal numbers the value is calculated as (-1)^s * 2^(E - 127) * (1 + M / 2^23), where E is the stored exponent field and M is the mantissa field taken as an integer.
Hence, interpreted as a 32-bit integer, the same bits are E * L + M, because the exponent field starts at bit 23 and the low 23 bits hold M.
Since the exponent is stored with a bias of 127, E = e + B, which transforms the expression into (e + B) * L + M.
As for the L after m in the article, there is presumably an unstated assumption that m is the fractional mantissa, so m * L is the integer contents of the mantissa field (that is, M).
Moreover, the sign bit is not considered in this formula.
Floating-Point Encoding
A floating-point number is represented with a sign s, an exponent e, and a significand f. (Some people use the term “mantissa,” but that is a legacy from the days of paper tables of logarithms. “Significand” is preferred for the fraction portion of a floating-point value. Mantissas are logarithmic. Significands are linear.) In binary floating-point, the value represented is +2^e • f or −2^e • f, according to the sign s.
Commonly for binary floating-point, the significand is required to be in [1, 2), at least for numbers in the normal range of the format. For encoding, the first bit is separated from the rest, so we may write f = 1 + r, where 0 ≤ r < 1.
In the IEEE 754 basic binary formats, the floating-point number is encoded as a sign bit, some number of exponent bits, and a significand field:
The sign s is encoded with a 0 bit for positive, 1 for negative. Since we are taking a logarithm, the number is presumably positive, and we may ignore the sign bit for current purposes.
The exponent bits are the actual exponent plus some bias B. (For 32-bit format, B is 127. For 64-bit, it is 1023.)
The significand field contains the bits of r. Since r is a fraction, the significand field contains the bits of r represented in binary starting after the “binary point.” For example, if r is 5/16, it is “.0101000…” in binary, so the significand field contains 0101000… (For 32-bit format, the significand field contains 23 bits. For 64-bit, 52 bits.)
Let b be the number of bits in the significand field (23 or 52). Let L be 2^b.
Then the product of r and L, r • L, is an integer equal to the contents of the significand field. In our example, r is 5/16, L is 2^23 = 8,388,608, and r • L = 2,621,440. So the significand field contains 2,621,440, which is 0x280000.
The equation I = (e + B) • L + m • L attempts to capture this. First, the sign is ignored, since it is zero. Then e + B is the exponent plus the bias. Multiplying that by L shifts it left b bits, which puts it in the position of the exponent field of the floating-point encoding. Then adding r • L adds the value of the significand field (for which I use r for “rest of the significand” instead of m for “mantissa”).
Thus, the bits that encode 2^e • (1 + r) as a floating-point number are, when interpreted as a binary integer, (e + B) • L + r • L.
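To make this concrete, here is a small C sketch of mine (it assumes float and unsigned int are both 32 bits, and uses frexpf to pull out e and r) that builds (e + B) • L + r • L for 3.5 and compares it with the actual bits obtained via memcpy:

#include <math.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    const unsigned B = 127;          /* exponent bias for the 32-bit format */
    const unsigned L = 1u << 23;     /* 2^23, size of the significand field */

    float x = 3.5f;                  /* 3.5 = 2^1 * 1.75, so e = 1 and r = 0.75 */
    int e;
    float f = frexpf(x, &e);         /* x = f * 2^e with f in [0.5, 1) */
    e -= 1;                          /* renormalize so the significand is in [1, 2) */
    float r = f * 2.0f - 1.0f;       /* significand = 1 + r, with 0 <= r < 1 */

    unsigned I_formula = (e + B) * L + (unsigned)(r * L);

    unsigned I_bits;
    memcpy(&I_bits, &x, sizeof I_bits);   /* well-defined bit reinterpretation */

    printf("formula: 0x%08X\n", I_formula);  /* prints 0x40600000 */
    printf("bits:    0x%08X\n", I_bits);     /* prints 0x40600000 */
    return 0;
}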
More Information
Information about IEEE 754 is in Wikipedia and the IEEE 754 standard. Some previous Stack Overflow answers describing the encoding format are here and here.
Aliasing / Reinterpreting Bits
Regarding the code in your question:
float x = 3.5f;
unsigned int i = *((unsigned int *)&x);
Do not use this code, because its behavior is not defined by the C or C++ standards.
In C, use:
#include <string.h>
...
unsigned int i; memcpy(&i, &x, sizeof i);
or:
unsigned int i = (union { float f; unsigned u; }) { x } .u;
In C++, use:
#include <cstring>
...
unsigned int i; std::memcpy(&i, &x, sizeof i);
These ways are defined to reinterpret the bits of the floating-point encoding as an unsigned int. (Of course, they require that a float and an unsigned int be the same size in the C or C++ implementation you are using.)

Convert floating point number from binary to a decimal number

I have to convert floating point number from binary to usable decimal number.
Of course my floating point number has been separated into bytes, so 4 bytes total.
1 2 3 4
[xxxxxxxx][xxxxxxxx][xxxxxxxx][xxxxxxxx]
These 4 bytes are already converted to decimal, so I have e.g.
1 2 3 4
[0][10][104][79]
Now, the mantissa is held in three parts: the two rightmost bytes (3 & 4) and byte 2 without its MSB bit (that one is easy to mask out, so let's assume we have a nice decimal number there as well). So, three decimal numbers.
Is there a straightforward mathematical conversion from these three decimal numbers to a floating-point mantissa?
This is along the lines of: if I needed to get an integer, the formula would be
10 * 65536 + 104 * 256 + 79.
Call these bytes a, b, and c. I assume a has already been masked, so it contains only the bits of the significand and none of the exponent, and that the number is IEEE-754 32-bit binary floating-point, with bytes taken with the appropriate endianness.
If the raw exponent field is 1 to 254 (thus, not 0 or 255), then the significand is:
1 + a*0x1p-7 + b*0x1p-15 + c*0x1p-23
or, equivalently:
(65536*a + 256*b + c) * 0x1p-23 + 1.
If the raw exponent field is 0, then remove the 1 from the sum (the number is subnormal or zero). If the raw exponent field is 255, then the floating-point value is infinity (if a, b, and c are all 0) or a NaN (otherwise).
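Putting that into code, here is a minimal C sketch of mine (the function name decode is hypothetical; it assumes the big-endian byte layout shown in the question, with byte 1 holding the sign and the high exponent bits):

#include <math.h>
#include <stdio.h>

/* Rebuild the value from the four byte values, following the answer's formula. */
static double decode(unsigned byte1, unsigned byte2, unsigned byte3, unsigned byte4)
{
    unsigned sign = byte1 >> 7;
    unsigned exponent = ((byte1 & 0x7F) << 1) | (byte2 >> 7);   /* raw 8-bit exponent field */
    unsigned a = byte2 & 0x7F;                                  /* a, b, c as in the answer */
    unsigned b = byte3;
    unsigned c = byte4;
    double fraction = (65536.0 * a + 256.0 * b + c) * 0x1p-23;

    double magnitude;
    if (exponent == 0) {
        magnitude = ldexp(fraction, -126);             /* subnormal or zero: no implicit 1 */
    } else if (exponent == 255) {
        magnitude = (fraction == 0.0) ? INFINITY : NAN;
    } else {
        magnitude = ldexp(1.0 + fraction, (int)exponent - 127);
    }
    return sign ? -magnitude : magnitude;
}

int main(void)
{
    printf("%g\n", decode(0, 10, 104, 79));          /* the question's example bytes */
    printf("%g\n", decode(0x40, 0x60, 0x00, 0x00));  /* 3.5, for comparison */
    return 0;
}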
I cannot be of much help, since it has been a while since I did conversions, but I hope you find this tutorial useful.

Hexadecimal calculation for a checksum

I'm not understanding how this result can be zero. This was presented to me as an example of how to validate the checksum of a message.
ED(12+01+ED=0)
How can this result be zero?
"1201 is the message" ED is the checksum, my question is more on, how can I determine the checksum?
Thank you for any help.
Best regards,
FR
How can this result be zero?
The checksum is presumably represented by a byte.
A byte can store 256 different values, so the calculation is probably done modulo 256.
Since 0x12 + 0x01 + 0xED = 256, the result becomes 0.
how can I determine the checksum?
The checksum is the specific byte value B that makes the sum of the bytes in the message + B = 0 (modulo 256).
So, as #LanceH says in the comment, to figure out the checksum B, you...
add up the values of the bytes in the message (say it adds up to M)
compute M' = M % 256
Now, the checksum B is computed as 256 - M' (reduced modulo 256, so that M' = 0 gives B = 0).
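In code, such a mod-256 checksum could look like this (a sketch of mine; the function name checksum is not from any standard):

#include <stdio.h>

/* Checksum byte B such that (sum of message bytes + B) % 256 == 0. */
static unsigned char checksum(const unsigned char *msg, size_t len)
{
    unsigned sum = 0;
    for (size_t i = 0; i < len; i++)
        sum += msg[i];
    /* The cast reduces 256 - sum % 256 modulo 256, so a sum of 0 gives 0. */
    return (unsigned char)(256 - sum % 256);
}

int main(void)
{
    unsigned char msg[] = { 0x12, 0x01 };
    unsigned char b = checksum(msg, sizeof msg);
    printf("checksum = 0x%02X\n", (unsigned)b);                        /* 0xED */
    printf("verify   = 0x%02X\n", (unsigned)((0x12 + 0x01 + b) & 0xFF)); /* 0x00 */
    return 0;
}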
I'm not sure about your checksum details but in base-16 arithmetic (and in base-10):
base-16 base-10
-----------------------
12 18
01 1
+ ED 237
------------------------
100 256
If your checksum is modulo-256 (16^2), you only keep the last 2 base-16 digits, so you have 00.
Well, obviously, when you add up 12 + 01 + ED the result overflows 1 byte, and it's actually the hex number 100. So, if you only take the final byte of 0x0100, you get 0.

My bit logic is too outdated. Refresher needed

It's been a while since my assembly class in college (20 years to be exact).
When someone gives you a number, say 19444, and says that X is bits 15 through 8 and Y are bits 7 through 0... how do I calculate values of X and Y?
I promise this is not homework, just a software guy unwisely trying to do some firmware programming.
First of all convert the input number to hexadecimal:
19444 => 0x4BF4
Hex is convenient because every 4 binary bits correspond to one hex digit. Hence, every 2 hex digits are 8 bits, or one byte. Now, using the usual convention that bit 0 is the least significant bit, bits 7 down to 0 are the low byte and bits 15 down to 8 are the high byte:
[7:0] => 0xF4
[15:8] => 0x4B
Using your preferred language, you can get the least significant byte by using a bitwise AND:
Y = 19444 & 0xff
or, the more mathematical:
Y = 19444 % 256
Now, for the most significant byte you can use bit shifts (if the number is larger than two bytes, apply the first stage again):
X = 19444 >> 8
(The following assumes C notation). In general, to access the value in bits N through M, where N is the smaller value and the bits are numbered from 0, use:
(value >> N) & (1U << (M - N + 1)) - 1;
So for bits 0..7, use:
(value >> 0) & (1U << 8) - 1
and for bits 8..15, use:
(value >> 8) & (1U << 8) - 1
Note that for the case where "N through M" is the entire width of the type, you can't use the shift as written.
Also, mind the byte order (whether the most significant byte comes first).
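Wrapped up as a function, the general expression might look like this (my own sketch, with extra parentheses for readability; it assumes M >= N and that M - N + 1 is smaller than the width of unsigned int):

#include <stdio.h>

/* Extract bits N..M (inclusive, bit 0 = least significant) from value. */
static unsigned extract_bits(unsigned value, unsigned N, unsigned M)
{
    unsigned mask = (1U << (M - N + 1)) - 1;
    return (value >> N) & mask;
}

int main(void)
{
    unsigned value = 19444;                                            /* 0x4BF4 */
    printf("Y (bits 7..0)  = 0x%02X\n", extract_bits(value, 0, 7));    /* 0xF4 */
    printf("X (bits 15..8) = 0x%02X\n", extract_bits(value, 8, 15));   /* 0x4B */
    return 0;
}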
When given bit positions (like "15 through 8"), by convention bit 0 is the least significant bit of the binary number. If you're dealing with a 16-bit number, then bit 15 is the most significant bit.
One hexadecimal digit corresponds to 4 binary digits. So hex FF is 11111111 in binary. Bitwise AND is often used to "mask out" a certain collection of bits.
Nearly all processors provide some form of bitwise shifting. For example, shifting 1010001 right by 4 bits gives you 101.
Combining all this, in C you would typically do something like this:
unsigned short int num;
unsigned char x, y;
num = 19444;
y = num & 0xff; //use bitwise AND to get 8 least-sig bits
x = num >> 8; //right-shift by 8 bits to get 8 most-sig bits
