From Hacker's Delight: 2nd Edition:
The formula here seems a little bit awkward here. How is some x vector is subtracted from 1 vector (presumbly 0x1111 1111) when x is smaller than 1? (Like: (as given in example) 0x0101 1000 - 0x0000 0000 doesn't make any sense to me) The former is a smaller number than the first one and the words aren't storing signed vectors either. Is that something related to RISC specific here or what?
As specified in the notation section of the book. A bold letter corresponds a vector for word like x = 00000000. And a bold one differs from a light face 1. As bold 1 = 11111111 which is an 8 bit word.
Edit2: Special Thanks to Paul Hankin to figure out the unconventional notations used here. A bold faced one refers to 32 bit size word which is [00000001] and a light faced 1 refers to a number one as in C.
Subtracting 1
Since we're more familiar with decimal than with binary, it sometimes helps to look at what happens in decimal.
What happens when subtracting 1 in decimal? Take for example 1786000 - 1 = 1785999.
If you subtract 1 from a positive number x in decimal:
all the zeroes at the right of x become 9;
the rightmost nonzero digit of x becomes 1 less;
other digits are unaffected.
Now, in binary, it works exactly the same, except we only have 0 1 instead of 0 123456789.
If you subtract 1 from a number x in binary:
all the zeroes at the right of x become 1;
the rightmost nonzero bit of x becomes 0;
other bits are unaffected.
What about negative numbers? Happily, representation using 2's complement is such that negative numbers behave exactly like positive numbers. In fact, when looking at the bits of x, you can subtract 1 from x without needing to know whether x is a signed int or an unsigned int.
x & (x-1)
Let's start with an example: x = 01011000. We can subtract 1 the way I just explained:
x = 01011000
x-1 = 01010111
Now what's the result of the bitwise-and operation x & (x-1)? We take the two bits in each column; if they are both 1, we write 1; if at least one of them is 0, we write 0.
x = 01011000
x-1 = 01010111
x&(x-1) = 01010000
What happened?
all the zeroes at the right of x remain zero;
the rightmost 1 of x becomes a 0 because of x-1;
all other bits are unaffected, because they are the same in x and x-1.
Conclusion: we have zeroed the rightmost 1 of x, and left all other bits unaffected.
Let's take a look at an what x-1 does.
Assume x is a value '???? 1000 (? are 0 or 1)
=> x-1 = ???? 0111
=> x & (x-1) = ???? 0000
It's very similar no matter where the right most 1 is placed within x.
Requested example:
x=00001111
=> x-1=00001110
=> x & (x-1) = 00001110
P.s. x-1 = 00001110 - 00000001 (<=> 00001110 + 11111111)
I'm new to this subject and I'm struggling to comprehend how 0xFFFFFFF & 0x00000001 can have the same sign, yet 0x0000001 and 0x12345678 have different signs. Based on my understanding thus far, hex digits that have the most significant bit between 0-7 are positive & 8-F are negative.
For further context, here is the thing I was trying to understand:
Question: Complete the C function that performs the operations and meets the requirements indicated in the comments.
Comments:
/*
* diffSign – return 1 if x and y have different signs
* Examples: diffSign(0xFFFFFFF, 0x00000001) = 0
* diffSign(0x0000001, 0x12345678) = 1
* Legal ops: & | ^ ~ << >>
* 1-byte const (0x00 to 0xFF)
*/
Answer:
int diffSign(int x, int y) {
return ((x >> 31) & 0x1) ^ ((y >> 31) & 0x1);
}
If possible, I would also greatly appreciate some clarification on how & 0x1 would help me to identify the sign! It seems rather redundant and I'm not too sure about the significance of that in the equation.
If you look closely, it makes perfect sense, just that you are not seeing that the most significant byte of 0xFFFFFFF is actually 0 because there are 7 F's.
0xFFFFFFF = 0x0FFF FFFF
which for a 32 bit integer represents a positive number.
However 0x0000001 and 0x12345678 do also have the same sign. Because what makes the difference is the most significant BIT. You are right that numbers with the most significant BYTE between 0-7 are positive, and 8-F negative. The comment in the function is wrong.
The code however is right, because it does 31 shifts to the right, leaving only the most significant BIT on each of the arguments (the sign bit of each argument) and does an XOR, which returns true only if both are not the same.
I was looking at some code today for integrating a real time clock with an arduino and it had some binary to decimal (and vice versa) that I don't fully understand.
The code in question is below:
byte decToBcd(byte val)
{
return ( (val/10*16) + (val%10) );
}
byte bcdToDec(byte val)
{
return ( (val/16*10) + (val%16) );
}
ex: decToBcd(12);
I really fail to grasp how this works. I am not sure I understand the math, or if some sort of assumptions are being taken advantage of.
Would someone mind explaining how exactly the math and data types below are supposed to work? If possible touching on why the value "16" is used in the conversions instead of "8" when we are supposed to be working with a byte value.
For context, the full code can be found here: http://www.codingcolor.com/microcontrollers/an-arduino-lcd-clock-using-a-chronodot-rtc/
The key hint here is BCD - Binary-coded decimal - in the function name. In BCD each decimal digit is represented by four bits (half of a byte). As a result the maximum (decimal) number you can store using BCD notation is 99 - 9 in the upper nibble (half of the byte) and 9 in the lower nibble.
Let's take a look at number 12 as an example. Number 12 looks as follows in the binary notation:
12 = %00001010
However in BCD it looks as follows:
12 = %00010010
because
0001 0010
1 2
Now if you look at the decToBcd function val%10 is responsible for calculating the value of the ones place (i.e. the last digit). Since this goes to the lower part of the byte we don't need to do anything special here. val/10*16 first calculates the value of the tens place - val/10. However since the value has to go to the upper half of the byte it needs to be shifted up by four bits - hence *16. Another (in my opinion more readable) way of writing this function would be:
((val / 10) << 4) | (val % 10)
The bcdToDec does the reverse conversion.
RTC usually stores Year in 1 byte as 2 digits only, i.e: 2014 is 14.
And some of them stores it as a number from the year 1970 so 2014 = 44.
So maximum it can hold is 99 in both cases.
I'm programming my Arduino micro controller and I found some code for accepting accelerometer sensor data for later use. I can understand all but the following code. I'd like to have some intuition as to what is happening but after all my searching and reading I can't wrap my head around what is going on and truly understand.
I have taken a class in C++ and we did very little with bitwise operations or bit shifting or whatever you'd like to call it. Let me try to explain what I think I understand and you can correct me where it is needed.
So:
I think we are storing a value in x, pretty sure in fact.
It appears that the data in array "buff", slot number 1, is being set to the datatype of integer.
The value in slot 1 is being bit shifted 8 places to the left.(does this point to buff slot 0?)
This new value is being compared to the data in buff slot 0 and if either bits are true then the bit in the data stored in x will also be true so, 0 and 1 = 1, 0 and 0 = 0 and 1 and 0 = 1 in the end stored value.
The code does this for all three axis: x, y, z but I'm not sure why...I need help. I want full understanding before I progress.
//each axis reading comes in 10 bit resolution, ie 2 bytes.
// Least Significant Byte first!!
//thus we are converting both bytes in to one int
x = (((int)buff[1]) << 8) | buff[0];
y = (((int)buff[3]) << 8) | buff[2];
z = (((int)buff[5]) << 8) | buff[4];
This code is being used to convert the raw accelerometer data (in an array of 6 bytes) into three 10-bit integer values. As the comment says, the data is LSB first. That is:
buff[0] // least significant 8 bits of x data
buff[1] // most significant 2 bits of x data
buff[2] // least significant 8 bits of y data
buff[3] // most significant 2 bits of y data
buff[4] // least significant 8 bits of z data
buff[5] // most significant 2 bits of z data
It's using bitwise operators two put the two parts together into a single variable. The (int) typecasts are unnecessary and (IMHO) confusing. This simplified expression:
x = (buff[1] << 8) | buff[0];
Takes the data in buff[1], and shifts it left 8 bits, and then puts the 8 bits from buff[0] in the space so created. Let's label the 10 bits a through j for example's sake:
buff[0] = cdefghij
buff[1] = 000000ab
Then:
buff[1] << 8 = ab00000000
And:
buff[1] << 8 | buff[0] = abcdefghij
The value in slot 1 is being bit shifted 8 places to the left.(does this point to buff slot 0?)
Nah. Bitwise operators ain't pointer arithmetic, don't confuse the two. Shifting by N places to the left is (roughly) equivalent with multiplying by 2 to the Nth power (except some corner cases in C, but let's not talk about those yet).
This new value is being compared to the data in buff slot 0 and if either bits are true then the bit in the data stored in x will also be true
No. | is not the logical OR operator (that would be ||) but the bitwise OR one. All the code does is combining the two bytes in buff[0] and buff[1] into a single 2-byte integer, where buff[1] denotes the MSB of the number.
The device result is in 6 bytes and the bytes need to be rearranged into 3 integers (having values that can only take up 10 bits at most).
So the first two bytes look like this:
00: xxxx xxxx <- binary value
01: ???? ??xx
The ??? part isn't part of the result because the xxx part comprise the 10 bits. I guess the hardware is built in such a way that the ??? part is all zero bits.
To get this into a single integer variable, we need all 8 of the low bits plus the upper-order 2 bits, shifted left by 8 position so they don't interfere with the low order 8 bits. The logical OR (| - vertical bar) will join those two parts into a single integer that looks like this:
x: ???? ??xx xxxx xxxx <- binary value of a single 16 bit integer
Actually it doesn't matter how big the 'int' is (in bits) as the remaining bits (beyond that 16) will be zero in this case.
to expand and clarify the reply by Carl Norum.
The (int) typecast is required because the source is a byte. The bitshift is performed on the source datatype before the result is saved into X. Therefore it must be cast to at least 16 bits (an int) in order to bitshift 8 bits and retain all the data before the OR operation is executed and the result saved.
What the code is not telling you is if this should be an unsigned int or if there is a sign in the bit data. I'd expect -ve data is possible with an Accelerometer.
Lets say I have guessed a lottery number of:
1689
And the way the lottery works is, the order of the digits don't matter as long as the digits match up 1:1 with the digits in the actual winning lottery number.
So, the number 1689 would be a winning lottery number with:
1896, 1698, 9816, etc..
As long as each digit in your guess was present in the target number, then you win the lottery.
Is there a mathematical way I can do this?
I've solved this problem with a O(N^2) looping checking each digit against each digit of the winning lottery number (separating them with modulus). Which is fine, it works but I want to know if there are any neat math tricks I can do.
For example, at first... I thought I could be tricky and just take the sum and product of each digit in both numbers and if they matched then you won.
^ Do you think that would work?
However, I quickly disproved this when I found that lottery guess: 222, and 124 have the different digits but the same product and sum.
Anyone know any math tricks I can use to quickly determine if the digits in num1 match the digits in num2 regardless of order?
How about going through each number, and counting up the number of appearances of each digit (into two different 10 element arrays)? After you'd done the totaling, compare the counts of each digit. Since you only look at each digit once, that's O(N).
Code would look something like:
for(int i=0; i<digit_count; i++)
{
guessCounts[guessDigits[i] - '0']++;
actualCounts[actualDigits[i] - '0']++;
}
bool winner = true;
for(int i=0; i<10 && winner; i++)
{
winner &= guessCounts[i] == actualCounts[i];
}
Above code makes the assumption that guessDigits and actualDigits are both char strings; if they held the actual digits then you can just skip the - '0' business.
There are probably optimizations that would make this take less space or terminate sooner, but it's a pretty straightforward example of an O(N) approach.
By the way, as I mentioned in a comment, the multiplication/sum comparison will definitely not work because of zeros. Consider 0123 and 0222. Product is 0, sum is 6 in both cases.
Split into array, sort array, join into string, compare strings.
(Not a math trick, I know)
You can place the digits into an array, sort the array, then compare the arrays element by element. This will give you O( NlogN ) complexity which is better than O( N^2 ).
If N can become large, sorting the digits is the answer.
Because digits are 0..9 you can count the number of occurrences of each digit of the lottery answer in an array [0..9].
To compare you can subtract 1 for each digit you encounter in the guess. When you encounter a digit where the count is already 0, you know the guess is different. When you get through all the digits, the guess is the same (as long as the guess has as many digits as the lottery answer).
For each digit d multiply with the (d+1)-th prime number.
This is more mathematical but less efficient than the sorting or bucket methods. Actually it is the bucket method in disguise.
I'd sort both number's digits and compare them.
If you are only dealing with 4 digits I dont think you have to put much thought into which algorithm you use. They will all perform roughly the same.
Also 222 and 124 dont have the same sum
You have to consider that when n is small, the order of efficiency is irrelevant, and the constants start to matter more. How big can your numbers actually get? Can you get up to 10 digits? 20? 100? If your numbers have just a few digits, n^2 really isn't that bad. If you have strings of thousands of digits, then you might actually need to do something more clever like sorting or bucketing. (i.e. count the 0s, count the 1s, etc.)
I'm stealing the answer from Yuliy, and starblue (upvote them)
Bucketing is the fastest aside from the O(1)
lottonumbers == mynumbers;
Sorting is O(nlog2n)
Bucketsort is an O(n) algorithm.
So all you need to do is do it twice (once for your numbers, once for the target-set), and if the numbers of the digits add up, then they match.
Any kind of sorting is an added overhead that is unnecessary in this case.
array[10] digits;
while(targetnum > 0)
{
short currDig = targetnum % 10;
digits[currDig]++;
targetnum = targetnum / 10;
}
while(mynum > 0)
{
short myDig = mynum % 10;
digits[myDig]--;
mynum = mynum / 10;
}
for(int i = 0; i < 10; i++)
{
if(digits[i] == 0)
continue;
else
//FAIL TO MATCH
}
Not the prettiest code, I'll admit.
Create an array of 10 integers subscripted [0 .. 9].
Initialize each element to a different prime number
Set product to 1.
Use each digit from the number, to subscript into the array,
pull out the prime number, and multiply the product by it.
That gives you a unique representation which is digit order independent.
Do the same procedure for the other number.
If the unique representations match, then the original numbers match.
If there are no repeating digits allowed (not sure if this is the case though) then use a 10-bit binary number. The most significant bit represents the digit 9 and the LSB represents the digit 0. Work through each number in turn and flip the appropriate bit for each digit that you find
So 1689 would be: 1101000010
and 9816 would also be: 1101000010
then a XOR or a subtract will leave 0 if you are a winner
This is just a simple form of bucketing
Just for fun, and thinking outside of the normal, instead of sorting and other ways, do the deletion-thing. If resultstring is empty, you have a winner!
Dim ticket As String = "1324"
Dim WinningNumber As String = "4321"
For Each s As String In WinningNumber.ToCharArray
ticket = Replace(ticket, s, "", 1, 1)
Next
If ticket = "" Then MsgBox("WINNER!")
If ticket.Length=1 then msgbox "Close but no cigar!"
This works with repeating numbers too..
Sort digits before storing a number. After that, your numbers will be equal.
One cute solution is to use a variant of Zobrist hashing. (Yes, I know it's overkill, as well as probabilistic, but hey, it's "clever".)
Initialize a ten-element array a[0..9] to random integers. Then, for each number d[], compute the sum of a[d[i]]. If the numbers contained the same digits, the resulting numbers will be equal; with high probability (~ 1 in how many possible ints there are), the opposite is true as well.
(If you know that there will be at most 10 digits total, then you can use the fixed numbers 1, 10, 100, ... instead of random numbers for guaranteed success. This is bucket sorting in not-too-much disguise.)