Converting to little endian with fewer digits - math

Little endian is pretty simple when you look at something like: 0x8048cc54 -> \x54\xcc\x48\x80. What about 0x8048cc4 or 0x8048cc? If you want to convert these to 8 bytes, how would you do it? If this is the wrong forum for this, just let me know and I'll move it.

That depends on the container (memory size) of your values. If (like in the first example) the value is treated as 32 bit, you only need to look at its "full scope":
0x8048cc4 == 0x08048cc4
0x8048cc == 0x008048cc
From here the answer to convert endianness is simple...

"Endianness" is just the order you write your digits.
Most of the world writes things in big endian; the most significant digit comes first, continuing to the least significant last:
123 = 100 + 20 + 3 (= 1×10² + 2×10¹ + 3×10⁰)
From a computer's perspective, however, it is often useful to look at it from the other end:
321 = 3 + 02 + 001 (= 3×10⁰ + 2×10¹ + 1×10²)
The 'digits' of a number in a computer are composed of the values 0..255 -- that is, each digit is a single byte. We humans don't have 256 different symbols to write that byte value, so we convert it from base 256 to base 16.
To write it in little endian, start at the least significant byte and peel off until you run out of bytes/digits:
0x8048cc4 --> \xc4\x8c\x04\x08
0x8048cc --> \xcc\x48\x80\x00
Hope this helps.
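The "peel off the least significant byte" step can be sketched in C as follows. This is only an illustration, assuming the value lives in a 32-bit container; the function name to_little_endian32 is made up for this example.

```c
#include <stdint.h>

/* Store a 32-bit value into out[0..3], least significant byte first.
 * (Hypothetical helper, not from the original thread.) */
void to_little_endian32(uint32_t value, uint8_t out[4])
{
    for (int i = 0; i < 4; i++) {
        out[i] = (uint8_t)(value & 0xFFu);  /* peel off the lowest byte */
        value >>= 8;                        /* move the next byte down */
    }
}
```

Calling it with 0x8048cc4 should fill the buffer with c4 8c 04 08, because the missing leading digit is simply a zero in the 32-bit container.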

Related

Understanding Arduino Binary to Decimal Conversions

I was looking at some code today for integrating a real time clock with an arduino and it had some binary to decimal (and vice versa) that I don't fully understand.
The code in question is below:
byte decToBcd(byte val)
{
  return ( (val/10*16) + (val%10) );
}
byte bcdToDec(byte val)
{
  return ( (val/16*10) + (val%16) );
}
ex: decToBcd(12);
I really fail to grasp how this works. I am not sure I understand the math, or if some sort of assumptions are being taken advantage of.
Would someone mind explaining how exactly the math and data types below are supposed to work? If possible touching on why the value "16" is used in the conversions instead of "8" when we are supposed to be working with a byte value.
For context, the full code can be found here: http://www.codingcolor.com/microcontrollers/an-arduino-lcd-clock-using-a-chronodot-rtc/
The key hint here is BCD - Binary-coded decimal - in the function name. In BCD each decimal digit is represented by four bits (half of a byte). As a result the maximum (decimal) number you can store in one byte using BCD notation is 99: 9 in the upper nibble (half of the byte) and 9 in the lower nibble.
Let's take a look at number 12 as an example. Number 12 looks as follows in the binary notation:
12 = %00001100
However in BCD it looks as follows:
12 = %00010010
because
0001 0010
   1    2
Now if you look at the decToBcd function val%10 is responsible for calculating the value of the ones place (i.e. the last digit). Since this goes to the lower part of the byte we don't need to do anything special here. val/10*16 first calculates the value of the tens place - val/10. However since the value has to go to the upper half of the byte it needs to be shifted up by four bits - hence *16. Another (in my opinion more readable) way of writing this function would be:
((val / 10) << 4) | (val % 10)
The bcdToDec does the reverse conversion.
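Both directions can be written with shifts and masks. The sketch below is one possible rewrite, assuming the input to dec_to_bcd is in the range 0..99; the snake_case names are invented here to keep it apart from the original decToBcd and bcdToDec.

```c
#include <stdint.h>

/* Pack a value 0..99 into BCD: tens digit in the upper nibble,
 * ones digit in the lower nibble. */
uint8_t dec_to_bcd(uint8_t val)
{
    return (uint8_t)(((val / 10) << 4) | (val % 10));
}

/* Unpack BCD: the upper nibble is the tens digit, the lower nibble
 * is the ones digit. */
uint8_t bcd_to_dec(uint8_t val)
{
    return (uint8_t)(((val >> 4) * 10) + (val & 0x0F));
}
```

With these definitions dec_to_bcd(12) gives 0x12 (binary 0001 0010), and bcd_to_dec(0x12) gives 12 again.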
An RTC usually stores the year in 1 byte as 2 digits only, i.e. 2014 is 14.
And some of them store it as the number of years since 1970, so 2014 = 44.
So the maximum it can hold is 99 in both cases.

bit-shift operation in accelerometer code

I'm programming my Arduino micro controller and I found some code for accepting accelerometer sensor data for later use. I can understand all but the following code. I'd like to have some intuition as to what is happening but after all my searching and reading I can't wrap my head around what is going on and truly understand.
I have taken a class in C++ and we did very little with bitwise operations or bit shifting or whatever you'd like to call it. Let me try to explain what I think I understand and you can correct me where it is needed.
So:
I think we are storing a value in x, pretty sure in fact.
It appears that the data in array "buff", slot number 1, is being set to the datatype of integer.
The value in slot 1 is being bit shifted 8 places to the left.(does this point to buff slot 0?)
This new value is being compared to the data in buff slot 0 and if either bits are true then the bit in the data stored in x will also be true so, 0 and 1 = 1, 0 and 0 = 0 and 1 and 0 = 1 in the end stored value.
The code does this for all three axes: x, y, z but I'm not sure why...I need help. I want full understanding before I progress.
//each axis reading comes in 10 bit resolution, ie 2 bytes.
// Least Significant Byte first!!
//thus we are converting both bytes in to one int
x = (((int)buff[1]) << 8) | buff[0];
y = (((int)buff[3]) << 8) | buff[2];
z = (((int)buff[5]) << 8) | buff[4];
This code is being used to convert the raw accelerometer data (in an array of 6 bytes) into three 10-bit integer values. As the comment says, the data is LSB first. That is:
buff[0] // least significant 8 bits of x data
buff[1] // most significant 2 bits of x data
buff[2] // least significant 8 bits of y data
buff[3] // most significant 2 bits of y data
buff[4] // least significant 8 bits of z data
buff[5] // most significant 2 bits of z data
It's using bitwise operators to put the two parts together into a single variable. The (int) typecasts are unnecessary and (IMHO) confusing. This simplified expression:
x = (buff[1] << 8) | buff[0];
Takes the data in buff[1], and shifts it left 8 bits, and then puts the 8 bits from buff[0] in the space so created. Let's label the 10 bits a through j for example's sake:
buff[0] = cdefghij
buff[1] = 000000ab
Then:
buff[1] << 8 = ab00000000
And:
buff[1] << 8 | buff[0] = abcdefghij
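The same combination can be wrapped in a small helper. This is a sketch only: the names combine_le16 and decode_axes are invented for illustration, and the unsigned 16-bit result type is an assumption, since the thread does not say whether the sensor data is signed.

```c
#include <stdint.h>

/* Combine a low byte and a high byte into one 16-bit value.
 * The widening cast is written out only to make the intent explicit. */
uint16_t combine_le16(uint8_t low, uint8_t high)
{
    return (uint16_t)(((uint16_t)high << 8) | low);
}

/* Decode three readings (x, y, z) from a 6-byte buffer in which each
 * reading is stored least significant byte first. */
void decode_axes(const uint8_t buff[6], uint16_t axes[3])
{
    for (int i = 0; i < 3; i++) {
        axes[i] = combine_le16(buff[2 * i], buff[2 * i + 1]);
    }
}
```

For example, a buffer starting with the bytes 0x34, 0x02 yields 0x0234 for the first axis.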
The value in slot 1 is being bit shifted 8 places to the left.(does this point to buff slot 0?)
Nah. Bitwise operators ain't pointer arithmetic, don't confuse the two. Shifting by N places to the left is (roughly) equivalent to multiplying by 2 to the Nth power (except some corner cases in C, but let's not talk about those yet).
This new value is being compared to the data in buff slot 0 and if either bits are true then the bit in the data stored in x will also be true
No. | is not the logical OR operator (that would be ||) but the bitwise OR one. All the code does is combine the two bytes in buff[0] and buff[1] into a single 2-byte integer, where buff[1] denotes the MSB of the number.
The device result is in 6 bytes and the bytes need to be rearranged into 3 integers (having values that can only take up 10 bits at most).
So the first two bytes look like this:
00: xxxx xxxx <- binary value
01: ???? ??xx
The ??? part isn't part of the result because the xxx part comprises the 10 bits. I guess the hardware is built in such a way that the ??? part is all zero bits.
To get this into a single integer variable, we need all 8 of the low bits plus the upper-order 2 bits, shifted left by 8 positions so they don't interfere with the low-order 8 bits. The bitwise OR (| - vertical bar) will join those two parts into a single integer that looks like this:
x: ???? ??xx xxxx xxxx <- binary value of a single 16 bit integer
Actually it doesn't matter how big the 'int' is (in bits) as the remaining bits (beyond that 16) will be zero in this case.
To expand on and clarify the reply by Carl Norum:
The source is a byte, so the shift needs an operand at least 16 bits wide (an int) in order to shift left by 8 bits and retain all the data before the OR operation is executed and the result saved into x. In C and C++ the integer promotions already widen a byte to int before the shift is performed, so the (int) typecast is not strictly required, but it makes that widening explicit.
What the code is not telling you is whether this should be an unsigned int or whether there is a sign in the bit data. I'd expect negative data to be possible with an accelerometer.

Addition in hexadecimal

I may have formulated the question a bit wrong. I need to calculate the IPv4 header checksum in hexadecimal with paper and pen. At this link http://en.wikipedia.org/wiki/IPv4_header_checksum
they do it in the last example.
I have a bit of a problem understanding how they count directly in hexadecimal. When doing it on paper, what if I get a number over 15, for example 48? What remainder will I use and what will I write down?
Anyone that can explain how to handle this?
Thank you and sorry for formulating the question wrong but I have changed it now:)
See http://www.youtube.com/watch?v=UGK8VyV1gLE which describes the process very well.
Counting in HEX (base 16) is just like counting in decimal (base 10) except that you only start carrying remainders when you count past F.
So in your example from a comment, it's just like counting in decimal with no remainders:
15
24
---
39
A simple true HEX addition is:
 11
  F
---
 20
1 + F = 10: write 0 and carry 1; then 1 + 1 (the carry) = 2, giving 20
15 plus 48 is simple too:
15
48
---
5D
8 + 5 = D no remainder, 1 + 4 = 5 no remainder
Hexadecimal is just a representation of numbers. In order to have the computer helping you with the addition you will have to convert the hexadecimal represented numbers to a number itself then do the addition and then convert it back. This is not a conversion to binary as binary is also only a different representation.
If you do not want the conversion from hexadecimal you will have to explain why you do not want to have this conversion.
I suppose this may sound like a dumb answer, but it's the best I can give with the way you wrote the question.
Addition in hex works exactly the same as in decimal, except with 16 instead of 10 digits. So in effect, what you're asking is how to do addition in general (including in decimal.) In dec, 9 + 1 = 10. In hex, F + 1 = 10. Obviously, the same rules of addition apply in both.
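The paper method (add a column, write the sum modulo 16, carry the sum divided by 16) can be sketched in C. The helper names hex_value and hex_add are invented for this illustration, and the caller is assumed to supply an output buffer of at least max(strlen(a), strlen(b)) + 2 characters.

```c
#include <stddef.h>
#include <string.h>

/* Value of one hex digit character, or -1 if it is not a hex digit. */
static int hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Add two hex strings column by column, right to left, as on paper.
 * Returns 0 on success, -1 if either string contains a bad digit. */
int hex_add(const char *a, const char *b, char *out)
{
    static const char digits[] = "0123456789ABCDEF";
    size_t la = strlen(a), lb = strlen(b);
    size_t n = (la > lb) ? la : lb;
    size_t pos = n + 1;          /* index just past the last digit */
    int carry = 0;

    out[pos] = '\0';
    for (size_t i = 0; i < n; i++) {
        int da = (i < la) ? hex_value(a[la - 1 - i]) : 0;
        int db = (i < lb) ? hex_value(b[lb - 1 - i]) : 0;
        if (da < 0 || db < 0) return -1;
        int sum = da + db + carry;
        out[--pos] = digits[sum % 16];  /* digit written in this column */
        carry = sum / 16;               /* carried into the next column */
    }
    out[--pos] = digits[carry];         /* final carry goes in front */
    if (out[0] == '0' && n > 0) {
        memmove(out, out + 1, n + 1);   /* drop an unused leading zero */
    }
    return 0;
}
```

The same rule answers the question about a column total of 48 when many rows are added: 48 = 3 x 16 + 0, so you write 0 and carry 3 into the next column.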

Y = base64(X) where X is integer - is Y alphanumeric?

Additional details:
X is any positive integer 6 digits or less.
X is left-padded with zeros to maintain a width of 6.
Please explain your answer :)
(This might be better in the Math site, but figured it involves programming functions)
The diagram in the German Wikipedia article on Base64 is very helpful.
You see that 6 consecutive bits from the original bytes generate a Base64 value. To generate + or / (codes 62 and 63), you'd need the bitstrings 111110 and 111111, so at least 5 consecutive bits set.
However, look at the ASCII codes for 0...9:
00110000
00110001
00110010
00110011
00110100
00110101
00110110
00110111
00111000
00111001
No matter how you concatenate six of those, there won't be more than 3 consecutive bits set. So it's not possible to generate a Base64 string that contains + or / this way, Y will always be alphanumeric.
EDIT: In fact, you can even rule other Base64 values out like 000010 (C), so this leads to nice follow-up questions/puzzles like "How many of the 64 values are possible at all?".
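The claim can also be checked by brute force. A 6-character input splits into two independent 3-byte groups, so it is enough to encode all 1000 triples of ASCII digits. The sketch below uses the standard Base64 alphabet; the function names are invented for this example.

```c
#include <stdint.h>

static const char B64[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Encode exactly three input bytes into four Base64 characters. */
void base64_group(const uint8_t in[3], char out[4])
{
    uint32_t bits = ((uint32_t)in[0] << 16) | ((uint32_t)in[1] << 8) | in[2];
    out[0] = B64[(bits >> 18) & 0x3F];
    out[1] = B64[(bits >> 12) & 0x3F];
    out[2] = B64[(bits >> 6) & 0x3F];
    out[3] = B64[bits & 0x3F];
}

/* Return 1 if no triple of ASCII digits encodes to '+' or '/', else 0. */
int digit_triples_are_alphanumeric(void)
{
    for (int n = 0; n < 1000; n++) {
        uint8_t in[3] = {
            (uint8_t)('0' + n / 100),
            (uint8_t)('0' + (n / 10) % 10),
            (uint8_t)('0' + n % 10)
        };
        char out[4];
        base64_group(in, out);
        for (int i = 0; i < 4; i++) {
            if (out[i] == '+' || out[i] == '/') return 0;
        }
    }
    return 1;
}
```

The check returns 1, which agrees with the bit-pattern argument: every group of three digits encodes to letters and digits only.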

Arbitrary-precision arithmetic Explanation

I'm trying to learn C and have come across the inability to work with REALLY big numbers (i.e., 100 digits, 1000 digits, etc.). I am aware that there exist libraries to do this, but I want to attempt to implement it myself.
I just want to know if anyone has or can provide a very detailed, dumbed down explanation of arbitrary-precision arithmetic.
It's all a matter of adequate storage and algorithms to treat numbers as smaller parts. Let's assume you have a compiler in which an int can only be 0 through 99 and you want to handle numbers up to 999999 (we'll only worry about positive numbers here to keep it simple).
You do that by giving each number three ints and using the same rules you (should have) learned back in primary school for addition, subtraction and the other basic operations.
In an arbitrary precision library, there's no fixed limit on the number of base types used to represent our numbers, just whatever memory can hold.
Addition for example: 123456 + 78:
12 34 56
      78
-- -- --
12 35 34
Working from the least significant end:
initial carry = 0.
56 + 78 + 0 carry = 134 = 34 with 1 carry
34 + 00 + 1 carry = 35 = 35 with 0 carry
12 + 00 + 0 carry = 12 = 12 with 0 carry
This is, in fact, how addition generally works at the bit level inside your CPU.
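A minimal sketch of that carry loop, assuming each number is stored as an array of base-100 "digits" with the least significant one first. The name big_add and the fixed-length interface are choices made for this illustration, not part of any library.

```c
#include <stddef.h>

#define BASE 100  /* each element holds 0..99, as in the example */

/* Add two numbers stored as base-100 digits, least significant first.
 * a and b have n digits each; result must have room for n + 1. */
void big_add(const int *a, const int *b, size_t n, int *result)
{
    int carry = 0;
    for (size_t i = 0; i < n; i++) {
        int sum = a[i] + b[i] + carry;
        result[i] = sum % BASE;   /* digit kept in this position */
        carry = sum / BASE;       /* 0 or 1, passed to the next position */
    }
    result[n] = carry;            /* possible extra leading digit */
}
```

With a = {56, 34, 12} and b = {78, 0, 0} the result is {34, 35, 12, 0}, which reads as 12 35 34 from the most significant end.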
Subtraction is similar (using subtraction of the base type and borrow instead of carry), multiplication can be done with repeated additions (very slow) or cross-products (faster) and division is trickier but can be done by shifting and subtraction of the numbers involved (the long division you would have learned as a kid).
I've actually written libraries to do this sort of stuff using the maximum power of ten whose square still fits into an integer. That prevents overflow when multiplying two ints together: a 16-bit int is limited to 0 through 99, giving at most 9,801 (<32,768) when squared, and a 32-bit int uses 0 through 9,999, giving at most 99,980,001 (<2,147,483,648). This greatly eased the algorithms.
Some tricks to watch out for.
1/ When adding or multiplying numbers, pre-allocate the maximum space needed then reduce later if you find it's too much. For example, adding two 100-"digit" (where digit is an int) numbers will never give you more than 101 digits. Multiplying a 12-digit number by a 3-digit number will never generate more than 15 digits (add the digit counts).
2/ For added speed, normalise (reduce the storage required for) the numbers only if absolutely necessary - my library had this as a separate call so the user can decide between speed and storage concerns.
3/ Addition of a positive and negative number is subtraction, and subtracting a negative number is the same as adding the equivalent positive. You can save quite a bit of code by having the add and subtract methods call each other after adjusting signs.
4/ Avoid subtracting big numbers from small ones since you invariably end up with numbers like:
         10
         11-
-- -- -- --
99 99 99 99 (and you still have a borrow).
Instead, subtract 10 from 11, then negate it:
11
10-
--
 1 (then negate to get -1).
Here are the comments (turned into text) from one of the libraries I had to do this for. The code itself is, unfortunately, copyrighted, but you may be able to pick out enough information to handle the four basic operations. Assume in the following that -a and -b represent negative numbers and a and b are zero or positive numbers.
For addition, if signs are different, use subtraction of the negation:
-a + b becomes b - a
a + -b becomes a - b
For subtraction, if signs are different, use addition of the negation:
a - -b becomes a + b
-a - b becomes -(a + b)
Also special handling to ensure we're subtracting small numbers from large:
small - big becomes -(big - small)
Multiplication uses entry-level math as follows:
475(a) x 32(b) = 475 x (30 + 2)
               = 475 x 30 + 475 x 2
               = 4750 x 3 + 475 x 2
               = 4750 + 4750 + 4750 + 475 + 475
The way in which this is achieved involves extracting each of the digits of 32 one at a time (backwards) then using add to calculate a value to be added to the result (initially zero).
ShiftLeft and ShiftRight operations are used to quickly multiply or divide a LongInt by the wrap value (10 for "real" math). In the example above, we add 475 to zero 2 times (the last digit of 32) to get 950 (result = 0 + 950 = 950).
Then we left shift 475 to get 4750 and right shift 32 to get 3. Add 4750 to zero 3 times to get 14250 then add to result of 950 to get 15200.
Left shift 4750 to get 47500, right shift 3 to get 0. Since the right shifted 32 is now zero, we're finished and, in fact 475 x 32 does equal 15200.
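The walk-through above can be sketched on ordinary machine integers to show the control flow. In a real big-number library the additions and shifts would operate on the multi-digit type; the function name below is invented, and the base of 10 is taken from the example.

```c
/* Multiply a by b using only addition plus "shifts" by the base (10).
 * Illustration only: it uses plain machine integers, so it overflows
 * for large inputs. */
unsigned long shift_add_multiply(unsigned long a, unsigned long b)
{
    unsigned long result = 0;
    while (b != 0) {
        unsigned long digit = b % 10;   /* last digit of b */
        for (unsigned long i = 0; i < digit; i++) {
            result += a;                /* add a, digit times */
        }
        a *= 10;                        /* left shift a */
        b /= 10;                        /* right shift b */
    }
    return result;
}
```

For 475 and 32 the loop adds 475 twice (950), then 4750 three times (14250), and stops with 15200 once b has been shifted down to zero.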
Division is also tricky but based on early arithmetic (the "gazinta" method for "goes into"). Consider the following long division for 12345 / 27:
       457
     +-------
  27 | 12345    27 is larger than 1 or 12 so we first use 123.
       108      27 goes into 123 4 times, 4 x 27 = 108, 123 - 108 = 15.
       ---
        154     Bring down 4.
        135     27 goes into 154 5 times, 5 x 27 = 135, 154 - 135 = 19.
        ---
         195    Bring down 5.
         189    27 goes into 195 7 times, 7 x 27 = 189, 195 - 189 = 6.
         ---
           6    Nothing more to bring down, so stop.
Therefore 12345 / 27 is 457 with remainder 6. Verify:
457 x 27 + 6
= 12339 + 6
= 12345
This is implemented by using a draw-down variable (initially zero) to bring down the segments of 12345 one at a time until it's greater or equal to 27.
Then we simply subtract 27 from that until we get below 27 - the number of subtractions is the segment added to the top line.
When there are no more segments to bring down, we have our result.
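Here is a sketch of the draw-down method on plain machine integers, bringing down one decimal digit at a time. The function name long_divide is invented for this example, and it assumes a non-zero divisor small enough that draw_down * 10 cannot overflow.

```c
/* Divide by bringing down one decimal digit at a time and subtracting
 * the divisor repeatedly. Assumes divisor != 0 and
 * divisor <= ULONG_MAX / 10 (illustration only). */
unsigned long long_divide(unsigned long dividend, unsigned long divisor,
                          unsigned long *remainder)
{
    /* Find the place value of the leading digit of the dividend. */
    unsigned long place = 1;
    while (dividend / place >= 10) {
        place *= 10;
    }

    unsigned long quotient = 0;
    unsigned long draw_down = 0;
    for (; place > 0; place /= 10) {
        unsigned long digit = (dividend / place) % 10;
        draw_down = draw_down * 10 + digit;   /* bring down next digit */

        unsigned long count = 0;
        while (draw_down >= divisor) {        /* "goes into" by subtraction */
            draw_down -= divisor;
            count++;
        }
        quotient = quotient * 10 + count;     /* next digit of the top line */
    }
    *remainder = draw_down;
    return quotient;
}
```

Dividing 12345 by 27 this way produces the quotient digits 0, 0, 4, 5, 7 in turn, so the result is 457 with remainder 6, matching the worked example.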
Keep in mind these are pretty basic algorithms. There are far better ways to do complex arithmetic if your numbers are going to be particularly large. You can look into something like GNU Multiple Precision Arithmetic Library - it's substantially better and faster than my own libraries.
It does have the rather unfortunate misfeature in that it will simply exit if it runs out of memory (a rather fatal flaw for a general purpose library in my opinion) but, if you can look past that, it's pretty good at what it does.
If you cannot use it for licensing reasons (or because you don't want your application just exiting for no apparent reason), you could at least get the algorithms from there for integrating into your own code.
I've also found that the bods over at MPIR (a fork of GMP) are more amenable to discussions on potential changes - they seem a more developer-friendly bunch.
While re-inventing the wheel is extremely good for your personal edification and learning, it's also an extremely large task. I don't want to dissuade you as it's an important exercise and one that I've done myself, but you should be aware that there are subtle and complex issues at work that larger packages address.
For example, multiplication. Naively, you might think of the 'schoolboy' method, i.e. write one number above the other, then do long multiplication as you learned in school. example:
      123
    x  34
    -----
      492
+    3690
---------
     4182
but this method is extremely slow (O(n^2), n being the number of digits). Instead, modern bignum packages use a fast Fourier transform or a number-theoretic transform for very large operands, which turns this into an essentially O(n ln(n)) operation.
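For reference, the schoolbook method itself is short. The sketch below assumes decimal digits stored one per array element, least significant first; the function name is made up here. The two nested loops are where the O(n^2) cost comes from.

```c
#include <stddef.h>

/* Schoolbook multiplication of decimal digit arrays, least significant
 * digit first. result must have room for na + nb digits. */
void schoolbook_multiply(const int *a, size_t na,
                         const int *b, size_t nb, int *result)
{
    for (size_t k = 0; k < na + nb; k++) {
        result[k] = 0;
    }
    for (size_t i = 0; i < nb; i++) {       /* one row per digit of b */
        int carry = 0;
        for (size_t j = 0; j < na; j++) {
            int t = result[i + j] + a[j] * b[i] + carry;
            result[i + j] = t % 10;
            carry = t / 10;
        }
        result[i + na] = carry;             /* still untouched by earlier rows */
    }
}
```

With a = {3, 2, 1} (123) and b = {4, 3} (34) the result array is {2, 8, 1, 4, 0}, i.e. 4182 with one unused leading digit.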
And this is just for integers. When you get into more complicated functions on some type of real representation of number (log, sqrt, exp, etc.) things get even more complicated.
If you'd like some theoretical background, I highly recommend reading the first chapter of Yap's book, "Fundamental Problems of Algorithmic Algebra". As already mentioned, the gmp bignum library is an excellent library. For real numbers, I've used MPFR and liked it.
Don't reinvent the wheel: it might turn out to be square!
Use a third party library, such as GNU MP, that is tried and tested.
You do it in basically the same way you do with pencil and paper:
The number is represented in a buffer (array) able to take on an arbitrary size (which means using malloc and realloc) as needed.
You implement basic arithmetic as much as possible using language-supported structures, and deal with carries and moving the radix point manually.
You scour numerical analysis texts to find efficient algorithms for dealing with more complex functions.
You only implement as much as you need.
Typically you will use as your basic unit of computation
bytes containing 0-99 or 0-255,
16-bit words containing either 0-9999 or 0-65535,
32-bit words containing...
...
as dictated by your architecture.
The choice of binary or decimal base depends on your desire for maximum space efficiency, human readability, and the presence or absence of Binary Coded Decimal (BCD) math support on your chip.
You can do it with high school level of mathematics. Though more advanced algorithms are used in reality. So for example to add two 1024-byte numbers :
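unsigned char first[1024], second[1024], result[1025];
unsigned char carry = 0;
unsigned int sum = 0;
/* assumption: index 0 holds the least significant byte */
for (size_t i = 0; i < 1024; i++)
{
    sum = first[i] + second[i] + carry;
    result[i] = sum & 0xFF;   /* low 8 bits stay in this position */
    carry = sum >> 8;         /* anything above 255 carries over */
}
result[1024] = carry;         /* final carry fills the extra place */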
The result has to be one place bigger than the inputs to hold the carry out of the top position. Look at this:
   9
 + 9
----
  18
TTMath is a great library if you want to learn. It is built using C++. The above example was a silly one, but this is how addition and subtraction are done in general!
A good reference about the subject is Computational complexity of mathematical operations. It tells you how much space is required for each operation you want to implement. For example, if you have two N-digit numbers, then you need 2N digits to store the result of multiplication.
As Mitch said, it is by far not an easy task to implement! I recommend you take a look at TTMath if you know C++.
One of the ultimate references (IMHO) is Knuth's TAOCP Volume II. It explains lots of algorithms for representing numbers and arithmetic operations on these representations.
@Book{Knuth:taocp:2,
  author    = {Knuth, Donald E.},
  title     = {The Art of Computer Programming},
  volume    = {2: Seminumerical Algorithms, second edition},
  year      = {1981},
  publisher = {Addison-Wesley},
  isbn      = {0-201-03822-6},
}
Assuming that you wish to write a big integer code yourself, this can be surprisingly simple to do, spoken as someone who did it recently (though in MATLAB.) Here are a few of the tricks I used:
I stored each individual decimal digit as a double number. This makes many operations simple, especially output. While it does take up more storage than you might wish, memory is cheap here, and it makes multiplication very efficient if you can convolve a pair of vectors efficiently. Alternatively, you can store several decimal digits in a double, but beware then that convolution to do the multiplication can cause numerical problems on very large numbers.
Store a sign bit separately.
Addition of two numbers is mainly a matter of adding the digits, then check for a carry at each step.
Multiplication of a pair of numbers is best done as convolution followed by a carry step, at least if you have a fast convolution code on tap.
Even when you store the numbers as a string of individual decimal digits, division (also mod/rem ops) can be done to gain roughly 13 decimal digits at a time in the result. This is much more efficient than a divide that works on only 1 decimal digit at a time.
To compute an integer power of an integer, compute the binary representation of the exponent. Then use repeated squaring operations to compute the powers as needed.
Many operations (factoring, primality tests, etc.) will benefit from a powermod operation. That is, when you compute mod(a^p,N), reduce the result mod N at each step of the exponentiation where p has been expressed in a binary form. Do not compute a^p first, and then try to reduce it mod N.
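The powermod idea can be sketched with fixed-width integers. The name pow_mod is invented for this illustration, and the modulus is assumed to be non-zero and below 2^32 so that the intermediate products fit in 64 bits; a big-integer version would follow the same loop with its own multiply and reduce routines.

```c
#include <stdint.h>

/* Compute (base^exponent) mod modulus by repeated squaring, reducing
 * after every multiplication. Assumes 0 < modulus < 2^32. */
uint64_t pow_mod(uint64_t base, uint64_t exponent, uint64_t modulus)
{
    uint64_t result = 1 % modulus;
    base %= modulus;
    while (exponent > 0) {
        if (exponent & 1) {                  /* this binary digit of p is 1 */
            result = (result * base) % modulus;
        }
        base = (base * base) % modulus;      /* square for the next bit */
        exponent >>= 1;
    }
    return result;
}
```

Because every intermediate value is reduced mod N, nothing ever grows beyond twice the size of N, whereas computing a^p first would need a number with roughly p times as many digits as a.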
Here's a simple (naive) example I did in PHP.
I implemented "Add" and "Multiply" and used that for an exponent example.
http://adevsoft.com/simple-php-arbitrary-precision-integer-big-num-example/
Code snip
// Add two big integers
function ba($a, $b)
{
    if( $a === "0" ) return $b;
    else if( $b === "0") return $a;
    $aa = str_split(strrev(strlen($a)>1?ltrim($a,"0"):$a), 9);
    $bb = str_split(strrev(strlen($b)>1?ltrim($b,"0"):$b), 9);
    $rr = Array();
    $maxC = max(Array(count($aa), count($bb)));
    $aa = array_pad(array_map("strrev", $aa),$maxC+1,"0");
    $bb = array_pad(array_map("strrev", $bb),$maxC+1,"0");
    for( $i=0; $i<=$maxC; $i++ )
    {
        $t = str_pad((string) ($aa[$i] + $bb[$i]), 9, "0", STR_PAD_LEFT);
        if( strlen($t) > 9 )
        {
            $aa[$i+1] = ba($aa[$i+1], substr($t,0,1));
            $t = substr($t, 1);
        }
        array_unshift($rr, $t);
    }
    return implode($rr);
}
