My bit logic is too outdated. Refresher needed - math

It's been a while since my assembly class in college (20 years to be exact).
When someone gives you a number, say 19444, and says that X is bits 15 through 8 and Y is bits 7 through 0... how do I calculate the values of X and Y?
I promise this is not homework, just a software guy unwisely trying to do some firmware programming.

First of all convert the input number to hexadecimal:
19444 => 0x4BF4
Hex is convenient because every 4 binary bits are one hex digit. Hence, every 2 hex digits are 8 bits, or a byte. Now, using the usual convention that bit 0 is the least significant bit, bits 7 down to 0 are the low byte and bits 15 down to 8 are the high byte:
[7:0] => 0xF4
[15:8] => 0x4B

Using your preferred language, you can get the least significant byte by using a bitwise AND:
Y = 19444 & 0xff
or, the more mathematical:
Y = 19444 % 256
Now, for the most significant byte you can use a bit shift (if the number is larger than two bytes, apply the first stage again to the shifted value):
X = 19444 >> 8
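Numerically: 19444 = 75 × 256 + 244, so X = 75 (0x4B) and Y = 244 (0xF4), matching the hex breakdown above.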

(The following assumes C notation). In general, to access the value in bits N through M, where N is the smaller value and the bits are numbered from 0, use:
(value >> N) & ((1U << (M - N + 1)) - 1)
So for bits 0..7, use:
(value >> 0) & ((1U << 8) - 1)
and for bits 8..15, use:
(value >> 8) & ((1U << 8) - 1)
Note that for the case where "N through M" is the entire width of the type, you can't use the shift as written.
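As a small self-contained sketch of that formula (the function name extract_bits is only illustrative, not something from the answers above):

#include <stdio.h>

/* Extract bits N..M (inclusive, bit 0 = least significant) from value.
   As noted above, this breaks down when M - N + 1 equals the full width
   of unsigned int, because the shift amount would be out of range. */
static unsigned extract_bits(unsigned value, unsigned N, unsigned M)
{
    return (value >> N) & ((1U << (M - N + 1)) - 1U);
}

int main(void)
{
    unsigned value = 19444;                              /* 0x4BF4 */
    printf("Y = 0x%02X\n", extract_bits(value, 0, 7));   /* 0xF4 */
    printf("X = 0x%02X\n", extract_bits(value, 8, 15));  /* 0x4B */
    return 0;
}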

Also, mind the byte order (whether the most significant byte comes first).

When given bit positions (like "15 through 8"), by convention bit 0 is the least significant bit of the binary number. If you're dealing with a 16-bit number, then bit 15 is the most significant bit.
One hexadecimal digit corresponds to 4 binary digits. So hex FF is 11111111 in binary. Bitwise AND is often used to "mask out" a certain collection of bits.
Nearly all processors provide some form of bitwise shifting. For example, shifting 1010001 right by 4 bits gives you 101.
Combining all this, in C you would typically do something like this:
unsigned short int num;
unsigned char x, y;
num = 19444;
y = num & 0xff; //use bitwise AND to get 8 least-sig bits
x = num >> 8; //right-shift by 8 bits to get 8 most-sig bits
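If you print the results (a quick check, assuming the snippet above sits inside main and <stdio.h> is included):
printf("x = 0x%X (%d), y = 0x%X (%d)\n", (unsigned)x, x, (unsigned)y, y); // prints x = 0x4B (75), y = 0xF4 (244)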

Related

How does the modulus shorten a number to only 16 digits in this example?

I am learning Solidity by following various tutorials online. One of these tutorials is Cryptozombies. In this tutorial we create a zombie that has "dna" (a number). These numbers must be only 16 digits long to work with the code. We do this by defining
uint dnaDigits = 16;
uint dnaModulus = 10 ** dnaDigits;
At some point in the lesson, we define a function that generates "random" dna by passing a string to the keccak256 hash function.
function _generateRandomDna(string memory _str) private view returns (uint) {
    uint rand = uint(keccak256(abi.encodePacked(_str)));
    return rand % dnaModulus;
}
I am trying to figure out how the output of keccak256 % 10^16 is always a 16 digit integer. I guess part of the problem is that I don't exactly understand what keccak256 outputs. What I (think) I know is that it outputs a 256 bit number. A bit is either 0 or 1, so there are 2^256 possible outputs from this function?
If you need more information please let me know. I think I included everything relevant from the code.
Whatever keccak256(abi.encodePacked(_str)) returns, it is converted into a uint because of the cast uint(...).
uint rand = uint(keccak256(abi.encodePacked(_str)));
Once it is a uint, it is simple math, because of the modulus:
xxxxx % 100 is always < 100
I see now that this code is meant to produce a number that has no more than 16 digits, not exactly 16 digits.
Take x % 10^16:
If x < 10^16, then the remainder is x, and x is by definition less than 10^16, which is the smallest possible number with 17 digits. I.e., every number less than 10^16 has 16 or fewer digits.
If x = 10^16, then the remainder is 0, which has fewer than 16 digits.
If x > 10^16, then either 10^16 divides x exactly or it does not. If it does, the remainder is zero, which has fewer than 16 digits; if it does not, the remainder is less than 10^16, which always has 16 or fewer digits.
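The same truncation can be seen with ordinary integers in C (a minimal sketch; the 17-digit value is just an illustrative constant, and unsigned long long stands in for Solidity's much wider uint):

#include <stdio.h>

int main(void)
{
    unsigned long long dnaModulus = 10000000000000000ULL; /* 10^16 */
    unsigned long long randValue  = 98765432109876543ULL; /* a 17-digit number */
    printf("%llu\n", randValue % dnaModulus);  /* 8765432109876543: 16 digits or fewer */
    return 0;
}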

Understanding & identifying signed numbers

I'm new to this subject and I'm struggling to comprehend how 0xFFFFFFF and 0x00000001 can have the same sign, yet 0x0000001 and 0x12345678 have different signs. Based on my understanding thus far, numbers whose most significant hex digit is between 0-7 are positive and 8-F are negative.
For further context, here is the thing I was trying to understand:
Question: Complete the C function that performs the operations and meets the requirements indicated in the comments.
Comments:
/*
* diffSign – return 1 if x and y have different signs
* Examples: diffSign(0xFFFFFFF, 0x00000001) = 0
* diffSign(0x0000001, 0x12345678) = 1
* Legal ops: & | ^ ~ << >>
* 1-byte const (0x00 to 0xFF)
*/
Answer:
int diffSign(int x, int y) {
    return ((x >> 31) & 0x1) ^ ((y >> 31) & 0x1);
}
If possible, I would also greatly appreciate some clarification on how & 0x1 would help me to identify the sign! It seems rather redundant and I'm not too sure about the significance of that in the equation.
If you look closely, it makes perfect sense; you are just not seeing that the most significant hex digit of 0xFFFFFFF is actually 0, because there are only 7 F's.
0xFFFFFFF = 0x0FFF FFFF
which for a 32 bit integer represents a positive number.
However, 0x0000001 and 0x12345678 do also have the same sign, because what makes the difference is the most significant BIT. You are right that numbers whose most significant hex digit is between 0-7 are positive, and 8-F negative. The comment in the function is wrong.
The code, however, is right: it shifts each argument 31 bits to the right, leaving only the most significant BIT of each argument (its sign bit), and XORs them, which yields 1 only if the two sign bits differ. The & 0x1 is not redundant: right-shifting a negative signed int usually fills with 1s (an arithmetic shift), so x >> 31 would be all ones; masking with 0x1 keeps just the sign bit as 0 or 1.
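A quick test harness for the function (just a sketch; it assumes a 32-bit int and the common arithmetic right shift for negative values):

#include <stdio.h>

int diffSign(int x, int y) {
    return ((x >> 31) & 0x1) ^ ((y >> 31) & 0x1);
}

int main(void) {
    printf("%d\n", diffSign(0xFFFFFFF, 0x00000001)); /* 0: both positive */
    printf("%d\n", diffSign(0x0000001, 0x12345678)); /* 0: both positive */
    printf("%d\n", diffSign(-5, 7));                 /* 1: signs differ  */
    return 0;
}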

Byte representation of float as integer equation

In this article an equation is used that I don't understand:
I = (e + B) * L + m * L
I is the byte representation of a float interpreted as an integer.
Here is an example:
float x = 3.5f;
unsigned int i = *((unsigned int *)&x);
e is the exponent of the float.
B is the bias (127).
L is a constant (1 << 23).
m is the mantissa.
Now my question is:
Why is the equation correct and where can I read more about this equation?
As you know, floating point numbers are stored in the IEEE 754 standard. The bit pattern of a single-precision float is, from the most significant bit down: 1 sign bit, 8 exponent bits, and 23 mantissa (fraction) bits (see details here).
The value of the number is calculated as (-1)^sign * (1 + fraction) * 2^(exponent field - 127).
Hence, for a 32-bit value, the equivalent integer would be E * L + M, where E is the exponent field and M is the integer contents of the mantissa field,
because the exponent field starts at the 23rd bit and the low part is M.
Since the stored exponent field is the actual exponent plus 127, the expression is transformed to (e + B) * L + M.
As for the L after m, the article presumably treats m as a fraction (0 ≤ m < 1), so that m * L is the integer contents of the mantissa field; that assumption might not be mentioned in the article.
Moreover, the sign bit is not considered in this formula.
Floating-Point Encoding
A floating-point number is represented with a sign s, an exponent e, and a significand f. (Some people use the term “mantissa,” but that is a legacy from the days of paper tables of logarithms. “Significand” is preferred for the fraction portion of a floating-point value. Mantissas are logarithmic. Significands are linear.) In binary floating-point, the value represented is +2^e • f or −2^e • f, according to the sign s.
Commonly for binary floating-point, the significand is required to be in [1, 2), at least for numbers in the normal range of the format. For encoding, the first bit is separated from the rest, so we may write f = 1 + r, where 0 ≤ r < 1.
In the IEEE 754 basic binary formats, the floating-point number is encoded as a sign bit, some number of exponent bits, and a significand field:
The sign s is encoded with a 0 bit for positive, 1 for negative. Since we are taking a logarithm, the number is presumably positive, and we may ignore the sign bit for current purposes.
The exponent bits are the actual exponent plus some bias B. (For 32-bit format, B is 127. For 64-bit, it is 1023.)
The significand field contains the bits of r. Since r is a fraction, the significand field contains the bits of r represented in binary starting after the “binary point.” For example, if r is 5/16, it is “.0101000…” in binary, so the significand field contains 0101000… (For 32-bit format, the significand field contains 23 bits. For 64-bit, 52 bits.)
Let b be the number of bits in the significand field (23 or 52). Let L be 2^b.
Then the product of r and L, r • L, is an integer equal to the contents of the significand field. In our example, r is 5/16, L is 2^23 = 8,388,608, and r • L = 2,621,440. So the significand field contains 2,621,440, which is 0x280000.
The equation I = (e + B) • L + m • L attempts to capture this. First, the sign is ignored, since it is zero. Then e + B is the exponent plus the bias. Multiplying that by L shifts it left b bits, which puts it in the position of the exponent field of the floating-point encoding. Then adding r • L adds the value of the significand field (for which I use r for “rest of the significand” instead of m for “mantissa”).
Thus, the bits that encode 2^e • (1 + r) as a floating-point number are, when interpreted as a binary integer, (e + B) • L + r • L.
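A small numeric check of that identity for the x = 3.5f example from the question (only a sketch; it assumes float is the 32-bit IEEE 754 format and uses memcpy, as recommended below, to read the bits):

#include <stdio.h>
#include <string.h>

int main(void)
{
    /* 3.5 = 1.75 * 2^1, so e = 1, r = 0.75, B = 127, L = 2^23. */
    float x = 3.5f;
    unsigned int bits;
    memcpy(&bits, &x, sizeof bits);                        /* the actual encoding */

    unsigned int L = 1u << 23;
    unsigned int formula = (1 + 127) * L + (unsigned int)(0.75 * L);

    printf("bits    = 0x%08X\n", bits);                    /* 0x40600000 */
    printf("formula = 0x%08X\n", formula);                 /* 0x40600000 */
    return 0;
}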
More Information
Information about IEEE 754 is in Wikipedia and the IEEE 754 standard. Some previous Stack Overflow answers describing the encoding format are here and here.
Aliasing / Reinterpreting Bits
Regarding the code in your question:
float x = 3.5f;
unsigned int i = *((unsigned int *)&x);
Do not use this code, because its behavior is not defined by the C or C++ standards.
In C, use:
#include <string.h>
...
unsigned int i; memcpy(&i, &x, sizeof i);
or:
unsigned int i = (union { float f; unsigned u; }) { x } .u;
In C++, use:
#include <cstring>
...
unsigned int i; std::memcpy(&i, &x, sizeof i);
These ways are defined to reinterpret the bits of the floating-point encoding as an unsigned int. (Of course, they require that a float and an unsigned int be the same size in the C or C++ implementation you are using.)

bit-shift operation in accelerometer code

I'm programming my Arduino microcontroller and I found some code for accepting accelerometer sensor data for later use. I can understand all but the following code. I'd like to have some intuition as to what is happening, but after all my searching and reading I can't wrap my head around what is going on and truly understand.
I have taken a class in C++ and we did very little with bitwise operations or bit shifting or whatever you'd like to call it. Let me try to explain what I think I understand and you can correct me where it is needed.
So:
I think we are storing a value in x, pretty sure in fact.
It appears that the data in array "buff", slot number 1, is being set to the datatype of integer.
The value in slot 1 is being bit shifted 8 places to the left. (Does this point to buff slot 0?)
This new value is being compared to the data in buff slot 0, and if either bit is true then the bit in the data stored in x will also be true, so 0 and 1 = 1, 0 and 0 = 0, and 1 and 0 = 1 in the final stored value.
The code does this for all three axes: x, y, z, but I'm not sure why... I need help. I want full understanding before I progress.
//each axis reading comes in 10 bit resolution, ie 2 bytes.
// Least Significant Byte first!!
//thus we are converting both bytes in to one int
x = (((int)buff[1]) << 8) | buff[0];
y = (((int)buff[3]) << 8) | buff[2];
z = (((int)buff[5]) << 8) | buff[4];
This code is being used to convert the raw accelerometer data (in an array of 6 bytes) into three 10-bit integer values. As the comment says, the data is LSB first. That is:
buff[0] // least significant 8 bits of x data
buff[1] // most significant 2 bits of x data
buff[2] // least significant 8 bits of y data
buff[3] // most significant 2 bits of y data
buff[4] // least significant 8 bits of z data
buff[5] // most significant 2 bits of z data
It's using bitwise operators to put the two parts together into a single variable. The (int) typecasts are unnecessary and (IMHO) confusing. This simplified expression:
x = (buff[1] << 8) | buff[0];
Takes the data in buff[1], and shifts it left 8 bits, and then puts the 8 bits from buff[0] in the space so created. Let's label the 10 bits a through j for example's sake:
buff[0] = cdefghij
buff[1] = 000000ab
Then:
buff[1] << 8 = ab00000000
And:
buff[1] << 8 | buff[0] = abcdefghij
The value in slot 1 is being bit shifted 8 places to the left. (Does this point to buff slot 0?)
Nah. Bitwise operators ain't pointer arithmetic, don't confuse the two. Shifting by N places to the left is (roughly) equivalent to multiplying by 2 to the Nth power (except for some corner cases in C, but let's not talk about those yet).
This new value is being compared to the data in buff slot 0, and if either bit is true then the bit in the data stored in x will also be true
No. | is not the logical OR operator (that would be ||) but the bitwise OR. All the code does is combine the two bytes in buff[0] and buff[1] into a single 2-byte integer, where buff[1] denotes the MSB of the number.
The device result is in 6 bytes and the bytes need to be rearranged into 3 integers (having values that can only take up 10 bits at most).
So the first two bytes look like this:
00: xxxx xxxx <- binary value
01: ???? ??xx
The ??? part isn't part of the result because the xxx part comprises the 10 bits. I guess the hardware is built in such a way that the ??? part is all zero bits.
To get this into a single integer variable, we need all 8 of the low bits plus the upper-order 2 bits, shifted left by 8 positions so they don't interfere with the low-order 8 bits. The bitwise OR (|, vertical bar) will join those two parts into a single integer that looks like this:
x: ???? ??xx xxxx xxxx <- binary value of a single 16 bit integer
Actually it doesn't matter how big the 'int' is (in bits) as the remaining bits (beyond that 16) will be zero in this case.
To expand and clarify the reply by Carl Norum:
The (int) typecast is required because the source is a byte. The bit shift is performed on the source datatype before the result is saved into x. Therefore it must be cast to at least 16 bits (an int) in order to shift by 8 bits and retain all the data before the OR operation is executed and the result saved.
What the code is not telling you is whether this should be an unsigned int or whether there is a sign in the bit data. I'd expect negative values are possible with an accelerometer.
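For completeness, here is the same reassembly as a self-contained sketch that runs on a desktop machine (the buff contents are made-up example values, not real sensor data):

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    /* Hypothetical 6-byte reading, least significant byte first for each axis. */
    uint8_t buff[6] = { 0xF4, 0x01,    /* x: 0x01F4 = 500 */
                        0x34, 0x00,    /* y: 0x0034 = 52  */
                        0x10, 0x03 };  /* z: 0x0310 = 784 */

    int x = (((int)buff[1]) << 8) | buff[0];
    int y = (((int)buff[3]) << 8) | buff[2];
    int z = (((int)buff[5]) << 8) | buff[4];

    printf("x=%d y=%d z=%d\n", x, y, z);   /* x=500 y=52 z=784 */
    return 0;
}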

How to calculate the parity bit of the following bit sequence?

The sequence is:
00111011
How do I calculate the parity bit for the above sequence? This question is from Databases: The Complete Book by Jeffrey Ullman (Exercise 13.4.1 a).
I am not sure what the answer to this question should be.
Is it as simple as :
i) Even parity: the number of 1s is 5 (odd), so just append a 1 and the answer is: 001110111
ii) Odd parity: likewise, just append a 0: 001110110
OR:
Am I on a totally wrong path here? I looked it up on the net but could not find anything concrete. Also, the text for the above question in the textbook is not clear.
Yes, your answers are correct. For the given sequence,
00111011
Odd Parity is 001110110, the parity bit is zero so that the total number of 1's in the code is 5, which is an Odd number.
The Even Parity is 001110111, the parity bit is one so that the total number of 1's in the code is 6, which is an Even number.
You can also use XOR, i.e.:
00111011
0 XOR 0 = 0
0 XOR 0 = 0
0 XOR 1 = 1
1 XOR 1 = 0
0 XOR 1 = 1
1 XOR 0 = 1
1 XOR 1 = 0
0 XOR 1 = 1
Starting from 0 and XORing in each bit in turn, the final result (here 1) is the even-parity bit; for odd parity use its inverse (0). You should make this bit the LSB of the original number (00111011), giving 001110111 for even parity.
unsigned char CalEvenParity(unsigned char data)
{
    unsigned char parity = 0;
    while (data) {
        parity ^= (data & 1);
        data >>= 1;
    }
    return parity;
}
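For the sequence in the question, a quick check (assuming the function above is in the same file, with <stdio.h> included and the call placed inside main):
printf("%d\n", CalEvenParity(0x3B)); /* 0x3B = 00111011; prints 1, the bit to append for even parity */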
Alternate implementation of parity:
This involves XORing the bits of the integer together, one at a time.
The x >> 1 right-shifts the value by 1 bit, and the & 1 gets us the value of the last (least significant) bit of the number.
Parity of the entire sequence can be built up step by step due to the properties of XOR:
1 ^ 0 ^ 1 is the same as (1 ^ 0) ^ 1, and we extend the same idea to all the bits.
def parity_val(x):
    parity = 0
    while x:
        parity ^= x & 1   # fold the least significant bit into the running parity
        x = x >> 1        # then drop that bit
    return parity
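For the sequence above, parity_val(0b00111011) returns 1, matching the even-parity bit computed earlier.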
