Generate PCR from PTS - MPEG-2

I am trying to create PCR from PTS as follows.
S64 nPcr = nPts * 9 / 100;
pTsBuf[4] = 7 + nStuffyingBytes;
pTsBuf[5] = 0x10; /* flags */
pTsBuf[6] = ( nPcr >> 25 )&0xff;
pTsBuf[7] = ( nPcr >> 17 )&0xff;
pTsBuf[8] = ( nPcr >> 9 )&0xff;
pTsBuf[9] = ( nPcr >> 1 )&0xff;
pTsBuf[10]= ( nPcr << 7 )&0x80;
pTsBuf[11]= 0;
The problem is that VLC plays only the first frame and none of the others, and I get the warning "early picture skipped".
Could anyone help me convert PTS to PCR?

First, the PCR has 33+9 bits and the PTS 33 bits. The 33-bit portion (called PCR_base) runs at 90 kHz, as does the PTS. The remaining 9 bits are called PCR_ext and run at 27 MHz.
Thus, this is how you could calculate the PCR:
S64 nPcr = (S64)nPts << 9;
Note that there should be a time offset between the PTSs of the multiplexed streams and the PCR; it's usually in the range of a few hundred ms, depending on the stream.
The respective decoder needs some time to decode the data and get it ready for presentation at the time given by the respective PTS, that's why the PTSs are always "ahead" of the PCR. ISO-13818 and some DVB specs give specifics about buffering and (de)multiplexing.
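For illustration, a minimal sketch of that idea (the function name and the 400 ms delay are my own choices, not a fixed rule): subtract a mux delay from the PTS and use the result as PCR_base, with PCR_ext left at zero, in the same 42-bit layout the snippet below expects.
#include <stdint.h>

/* Hedged sketch: build a 42-bit PCR (33-bit base + 9-bit extension) from a
   PTS while keeping the PCR a few hundred ms behind the PTS. */
static int64_t pcr_from_pts(int64_t pts90k)
{
    const int64_t mux_delay = 36000;        /* 400 ms in 90 kHz ticks, illustrative */
    int64_t pcr_base = pts90k - mux_delay;  /* 33-bit base, 90 kHz */
    int64_t pcr_ext  = 0;                   /* 9-bit extension, 27 MHz */
    return (pcr_base << 9) | pcr_ext;       /* layout used by the code below */
}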
I'm not sure about your bit shifting; here is a code snippet of mine. The comments may help in shifting the bits to the right places (R stands for a reserved bit).
data[4] = 7;
data[5] = 1 << 4; // PCR_flag
// pcr has 33+9 = 42 bits: PCR_base in bits 41..9, PCR_ext in bits 8..0
// data[6..9] : PCR_base bits 32..1
// data[10]   : PCR_base bit 0, six reserved bits (R), PCR_ext bit 8
// data[11]   : PCR_ext bits 7..0
data[ 6] = (pcr >> 34) & 0xff;
data[ 7] = (pcr >> 26) & 0xff;
data[ 8] = (pcr >> 18) & 0xff;
data[ 9] = (pcr >> 10) & 0xff;
data[10] = 0x7e | ((pcr & (1 << 9)) >> 2) | ((pcr & (1 << 8)) >> 8);
data[11] = pcr & 0xff;

schieferstapel's answer is correct. I am only adding one more note here, which refers to an exception.
B frames sometimes arrive after (i.e. have a smaller PTS than) the P frame that follows them, so the PTS can be non-monotonic if every picture is stamped with a PTS value, whereas the PCR must increase monotonically.
So in that situation, you must either omit B frames or make the relevant adjustment when inserting PCR values. Also, if this is a hardware playout, it is advisable to keep the PCR slightly ahead of (smaller by roughly 400 ms than) the PTS of the corresponding I frames.
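One common way to keep the PCR monotonic is to derive it from the mux position and bitrate rather than from per-picture PTS values. A hedged sketch (all names and the constant-bitrate assumption are mine):
#include <stdint.h>

/* Hedged sketch: compute the PCR from the byte position in a constant-bitrate
   mux, then convert to the base<<9 | ext layout used above. */
static int64_t pcr_from_position(int64_t bytes_written, int64_t mux_bitrate_bps,
                                 int64_t first_pcr_27mhz)
{
    int64_t ticks = bytes_written * 8 * 27000000 / mux_bitrate_bps; /* 27 MHz ticks */
    int64_t pcr27 = first_pcr_27mhz + ticks;                        /* full 27 MHz PCR */
    return ((pcr27 / 300) << 9) | (pcr27 % 300);                    /* base<<9 | ext */
}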

The PCR contains 33 (PCR_Base) + 6 (PCR_Const, reserved) + 9 (PCR_Ext) bits; the first 33 bits are based on a 90 kHz clock while the last 9 are based on a 27 MHz clock. Here PCR_Const = 0x3F, PCR_Ext = 0 and PCR_Base = PTS/DTS.
The code below should be easy to understand.
int64_t PCR_Ext   = 0;
int64_t PCR_Const = 0x3F;
int64_t PCR_Base  = pts;                       // 90 kHz units (PTS or DTS)

int64_t pcrv = PCR_Ext & 0x1FF;                // 9-bit extension in bits 0..8
pcrv |= (PCR_Const << 9) & 0x7E00;             // 6 reserved bits in bits 9..14
pcrv |= (PCR_Base << 15) & 0xFFFFFFFF8000LL;   // 33-bit base in bits 15..47

char *pp = (char*)&pcrv;                       // note: assumes a little-endian host
data[ 6] = pp[5];
data[ 7] = pp[4];
data[ 8] = pp[3];
data[ 9] = pp[2];
data[10] = pp[1];
data[11] = pp[0];

Related

How to encode 27 Vector3's into a 0-256 value?

I have 27 combinations of 3 values from -1 to 1 of type:
Vector3(0,0,0);
Vector3(-1,0,0);
Vector3(0,-1,0);
Vector3(0,0,-1);
Vector3(-1,-1,0);
... up to
Vector3(0,1,1);
Vector3(1,1,1);
I need to convert them to and from an 8-bit sbyte / byte array.
One solution is to treat the value as three digits, where the first digit encodes X, the second Y and the third Z (each component shifted from -1..1 to 0..2), so
Vector3(-1,1,1) becomes 022,
Vector3(1,-1,-1) becomes 200,
Vector3(1,0,1) becomes 212...
I'd prefer to encode it in a more compact way, perhaps using bytes (which I am clueless about), because the above solution uses a lot of multiplications and round functions to decode. Do you have any suggestions? The other option is to write 27 if conditions to map each Vector3 combination to an array entry, which seems inefficient.
Thanks to Evil Tak for the guidance. I changed the code a bit to pack a fourth value into the previously unused top two bits, and to adapt it for Unity3D:
function Pack4(x:int, y:int, z:int, w:int) : sbyte {
    var b : sbyte = 0;
    b |= (x + 1) << 6;
    b |= (y + 1) << 4;
    b |= (z + 1) << 2;
    b |= (w + 1);
    return b;
}
function unPack4(b:sbyte) : Vector4 {
    var v : Vector4;
    v.x = ((b & 0xC0) >> 6) - 1; // 0xC0 == 1100 0000
    v.y = ((b & 0x30) >> 4) - 1; // 0x30 == 0011 0000
    v.z = ((b & 0xC) >> 2) - 1;  // 0xC  == 0000 1100
    v.w = (b & 0x3) - 1;         // 0x3  == 0000 0011
    return v;
}
I assume your values are float, not integer, so bit operations will not improve speed much compared with converting to an integer type. So my bet is that using the full range will be better. I would do this for the 3D case:
8 bit -> 256 values
3D -> pow(256,1/3) = ~ 6.349 values per dimension
6^3 = 216 < 256
So packing of (x,y,z) looks like this:
BYTE p;
p = floor((x+1.0)*3.0);        // digit 0..5 for x
p+= floor((y+1.0)*3.0)*6.0;    // digit 0..5 for y, times 6
p+= floor((z+1.0)*3.0)*36.0;   // digit 0..5 for z, times 36
The idea is to convert each component from <-1,+1> to a digit 0..5 (hence the +1.0 and the *3.0 rather than *6.0) and then multiply each digit into its place in the final BYTE.
and unpacking of p looks like this:
x=p%6; x=(x/3.0)-1.0; p/=6;
y=p%6; y=(y/3.0)-1.0; p/=6;
z=p%6; z=(z/3.0)-1.0;
This way you use 216 of the 256 values, which is much better than just 2 bits (4 values) per dimension. Your 4D case would look similar; instead of 3.0 and 6.0 use the constants from floor(pow(256,1/4)) = 4, i.e. 2.0 and 4.0, but beware of the edge case p = 256, or use 2 bits per dimension and a bit approach like the accepted answer does.
If you need real speed you can optimize this by forcing the float that holds the packed result to a specific exponent and extracting the mantissa bits as your packed BYTE directly. As the result is in <0,216> you can add any sufficiently large number to it; see IEEE 754-1985 for details. You want the mantissa to align with your BYTE, so if you add a number like 2^23 to p then the lowest 8 bits of the float are your packed value directly (the leading 1 is not stored in the mantissa), and no expensive conversion is needed.
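A hedged sketch of the bit-extraction part of that trick (names are mine; it assumes IEEE 754 single precision and that p is already an integral value in <0,216>):
#include <stdint.h>
#include <string.h>

/* Anchor the exponent at 2^23 so the low mantissa bits hold the integer part,
   then read the low byte of the float's bit pattern. */
static uint8_t pack_via_mantissa(float p)
{
    float anchored = p + 8388608.0f;        /* 2^23 */
    uint32_t bits;
    memcpy(&bits, &anchored, sizeof bits);  /* type-pun safely via memcpy */
    return (uint8_t)(bits & 0xFF);          /* low 8 mantissa bits == (int)p */
}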
In case you have just {-1,0,+1} instead of <-1,+1>,
then of course you should use an integer approach, like bit packing with 2 bits per dimension, or use a LUT of all 3^3 = 27 possibilities and pack the entire vector in 5 bits.
The encoding would look like this:
int enc[3][3][3] = { 0,1,2, ... 24,25,26 };
p=enc[x+1][y+1][z+1];
And decoding:
int dec[27][3] = { {-1,-1,-1},.....,{+1,+1,+1} };
x=dec[p][0];
y=dec[p][1];
z=dec[p][2];
This should be fast enough, and if you have many vectors you can pack each p into 5 bits ... to save even more memory space.
One way is to store the component of each vector in every 2 bits of a byte.
Converting a vector component value to and from the 2 bit stored form is as simple as adding and subtracting one, respectively.
-1 (1111 1111 as a signed byte) <-> 00 (in binary)
0 (0000 0000 in binary) <-> 01 (in binary)
1 (0000 0001 in binary) <-> 10 (in binary)
The packed 2 bit values can be stored in a byte in any order of your preference. I will use the following format: 00XXYYZZ where XX is the converted (packed) value of the X component, and so on. The 0s at the start aren't going to be used.
A vector will then be packed in a byte as follows:
byte Pack(Vector3<int> vector) {
byte b = 0;
b |= (vector.x + 1) << 4;
b |= (vector.y + 1) << 2;
b |= (vector.z + 1);
return b;
}
Unpacking a vector from its byte form will be as follows:
Vector3<int> Unpack(byte b) {
Vector3<int> v = new Vector3<int>();
v.x = ((b & 0x30) >> 4) - 1; // 0x30 == 0011 0000
v.y = ((b & 0xC) >> 2) - 1; // 0xC == 0000 1100
v.z = (b & 0x3) - 1; // 0x3 == 0000 0011
return v;
}
Both the above methods assume that the input is valid, i.e. all components of vector in Pack are either -1, 0 or 1, and all two-bit sections of b in Unpack have a (binary) value of either 00, 01 or 10.
Since this method uses bitwise operators, it is fast and efficient. If you wish to compress the data further, you could try using the 2 unused bits too, and convert every 3 two-bit elements processed to a vector.
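As a hedged illustration of using those two spare bits (the function name and layout are mine): pack four 2-bit components per byte, so three bytes hold the twelve components of four full vectors.
#include <stdint.h>

/* Components are -1, 0 or +1; each is stored as component+1 in 2 bits. */
static void pack4vectors(const int8_t comp[12], uint8_t out[3])
{
    for (int i = 0; i < 3; ++i)
        out[i] = (uint8_t)(((comp[4*i]     + 1) << 6) |
                           ((comp[4*i + 1] + 1) << 4) |
                           ((comp[4*i + 2] + 1) << 2) |
                            (comp[4*i + 3] + 1));
}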
The most compact way is to write a 27-digit number in base 3 (using a shift -1 -> 0, 0 -> 1, 1 -> 2).
The value of this number will range from 0 to 3^27-1 = 7625597484987, which takes 43 bits to be encoded, i.e. 6 bytes (and 5 spare bits).
This is a little saving compared to a packed representation with 4 two-bit numbers packed in a byte (hence 7 bytes/56 bits in total).
An interesting variant is to group the base 3 digits five by five in bytes (hence numbers 0 to 242). You will still require 6 bytes (and no spare bits), but the decoding of the bytes can easily be hard-coded as a table of 243 entries.
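A hedged sketch of that last variant (the helper names are mine): shift each component to a trit 0..2, then store five trits per byte, giving byte values 0..242 and 6 bytes for 27 trits.
#include <stdint.h>

static void encode27(const int8_t trits[27], uint8_t out[6]) /* trits already in 0..2 */
{
    for (int b = 0; b < 6; ++b) {
        uint8_t v = 0;
        for (int d = 4; d >= 0; --d) {            /* up to 5 trits per byte */
            int idx = b * 5 + d;
            v = (uint8_t)(v * 3 + (idx < 27 ? trits[idx] : 0));
        }
        out[b] = v;
    }
}

static void decode27(const uint8_t in[6], int8_t trits[27])
{
    for (int b = 0; b < 6; ++b) {
        uint8_t v = in[b];
        for (int d = 0; d < 5; ++d) {
            int idx = b * 5 + d;
            if (idx < 27) trits[idx] = (int8_t)(v % 3);
            v /= 3;
        }
    }
}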

Convert temperature data from sensor to celsius degree

I have a sensor named LSM303DLHC; it has 2 temperature registers, but I can't figure out how to convert them to degrees Celsius.
The 2 registers are:
TEMP_OUT_H_M register // high reg
TEMP11 | TEMP10 | TEMP9 | TEMP8 | TEMP7 | TEMP6 | TEMP5 | TEMP4
TEMP_OUT_L_M register //low reg
TEMP3 | TEMP2 | TEMP1 | TEMP0 | 0 | 0 | 0 | 0
In datasheet say: "TEMP[11:0] Temperature data (8 LSB/deg - 12-bit resolution)"
My current code is
uint8_t hig_reg = read(TEMP_OUT_H_M); // value = 0x03
uint8_t low_reg = read(TEMP_OUT_L_M); // value = 0x40
int16_t temp = ((uint16_t)hig_reg << 8) | (uint16_t)low_reg; // temp = 0x0340 = 832
float mTemp = temp / 256.0f; // = 3.25
mTemp = mTemp + 20; // = 23.25 (°C), I add 20 more
But I don't understand where the 20 °C offset comes from? Datasheet never mentions it.
Thanks for your answers. It turns out that the temperature sensor only measures relative temperature, for tracking variation; it is not meant for absolute temperature. They should add that information to the datasheet. I just wasted 2 days of my life on that.
My try...
First, I note that you are taking the whole 8-bit TEMP_OUT_L_M register, while, as you described, only the top 4 bits of it are used.
So build the 12-bit value first. I use Python and the SMBus library:
temph = i2cbus.read_byte_data(i2caddress, TEMP_OUT_H_M) << 4
templ = i2cbus.read_byte_data(i2caddress, TEMP_OUT_L_M) >> 4
tempread = temph + templ # it is all ready converted to Decimal
Then you can go ahead with the transformation: see page 11, section 2.2, "Temperature sensor characteristics": 8 LSB/ºC, 12-bit resolution and 2.5 V Vdd.
Then it is clear that:
ºC = (read_value * VDD * 10^(log2(LSB/ºC))) / ((2^resolution - 1) * (10 * (ºC/LSB)))
For the LSM303, following the Python code:
# temperature = (tempread * 2.5 * 1000) / ((2**12 - 1) * (10/8)), better written as:
temperature = (tempread * 2500) / (4095 * 1.25)
In your case you read 0x0340; as a 12-bit value that is 0x034, i.e. 52 in decimal:
temperature = (52 * 2500) / (4095 * 1.25) = 25.39...
I also noticed that:
The maximum safe value to read is 85 ºC = 0x55, so we'd better build the value from the 4 LSBs of TEMP_OUT_H_M and the 4 MSBs of TEMP_OUT_L_M.
In further tests the LSM303 withstood close to 125 ºC for a while without permanent damage, but it is good practice to put the magnetometer and accelerometer into sleep mode when the temperature reaches 80.
My opinion is that TEMP is 10 bits plus one for the sign (the maximum value you can read is 0x3FF), so:
0x03FF - 0x0340 = 0x0BF
0x0BF / 8 = 0x17 (23.875 in decimal).
As said, don't forget the two's complement in your computation.

Fit three numbers inside one number

I am trying to fit 3 numbers inside 1 number. The numbers will only be between 0 and 11, so their base is 12. For example, I have the numbers 7, 5, 2. I came up with something like this:
Three numbers into One number :
7x12=84
84x5=420
420+2=422
Now getting back Three numbers from One number :
422 MOD 12 = 2 (the third number)
422 - 2 = 420
420 / 12 = 35
And I understood that 35 is the product of the first and the second number (i.e. 7 and 5).
Now I can't get the 7 and the 5 back. Does anyone know how I could?
(I started typing this answer before the other one got posted, but this one is more specific to Arduino than the other one, so I'm leaving it.)
The code
You can use bit shifting to get multiple small numbers into one big number, in code it would look like this:
int a, b, c;
//putting them together
int big = (a << 8) + (b << 4) + c;
//separating them again
a = (big >> 8) & 15;
b = (big >> 4) & 15;
c = big & 15;
This code only works when a, b and c are all in the range [0, 15], which appears to be enough for your case.
How it works
The >> and << operators are the bit shift operators; in short, a << n shifts every bit in a by n places to the left, which is equivalent to multiplying by 2^n. Similarly, a >> n shifts to the right. An example:
11 << 3 == 88 //0000 1011 -> 0101 1000
The & operator performs a bitwise and on the two operands:
6 & 5 == 4 // 0110
// & 0101
//-> 0100
These two operators are combined to "pack" and "unpack" the three numbers. For the packing every small number is shifted a bit to the left and they are all added together. This is how the bits of big now look (there are 16 of them because ints in Arduino are 16 bits wide):
0000aaaabbbbcccc
When unpacking, the bits are shifted to the right again, and they are bitwise anded together with 15 to filter out any excess bits. This is what that last operation looks like to get b out again:
00000000aaaabbbb //big shifted 4 bits to the right
& 0000000000001111 //anded together with 15
-> 000000000000bbbb //gives the original number b
Everything works exactly like in base 10 (or 16). Here is your corrected example.
Three numbers into One number :
7x12^2=1008
5*12^1=60
2*12^0=2
1008+60+2=1070
Now getting back Three numbers from One number :
1070 MOD 12 = 2 (the third number)
1070/12 = 89 (integer division) => 89 MOD 12 = 5
89 / 12 = 7
Note also that the maximum value will be 11*12*12+11*12+11=1727.
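As a hedged C sketch of this base-12 packing (the function names are mine):
#include <stdint.h>

/* Pack three values in 0..11 into one number and unpack them again. */
static uint16_t pack12(uint8_t a, uint8_t b, uint8_t c)
{
    return (uint16_t)(a * 144 + b * 12 + c);   /* a*12^2 + b*12 + c */
}

static void unpack12(uint16_t n, uint8_t *a, uint8_t *b, uint8_t *c)
{
    *c = n % 12;  n /= 12;
    *b = n % 12;  n /= 12;
    *a = (uint8_t)n;
}

/* pack12(7, 5, 2) == 1070, and unpack12(1070, ...) returns 7, 5, 2. */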
If this is really programming related, you will be using 16 bits instead of 3*8 bits, saving one byte. An easier method, not using base 12, would be to fit each number into half a byte (better code efficiency and the same transmission length):
(7<<(4+4)) + (5<<4) + 2 = 1874
1874 & 0x000F = 2
1874>>4 & 0x000F = 5
1874>>8 & 0x0F = 7
This is because MOD(12) and division by 12 are much less efficient than working with powers of 2.
You can use the principle of positional notation to convert back and forth in any base.
Treat your numbers (n0, n1, ..., nm) as the digits of a big number in a base B of your choosing, so the new number is
N = n0*B^0 + n1*B^1 + ... + nm*B^m
Reverting the process is also simple: while your number is greater than 0, take its modulo with respect to the base to get the first digit, then subtract that digit and divide by the base; repeat until finished, saving each digit along the way.
digit_list = []
while N > 0 do:
    d = N mod B
    N = (N - d) / B
    digit_list.append( d )
Then, if N = n0*B^0 + n1*B^1 + ... + nm*B^m, doing N mod B gives you n0; subtracting it leaves n1*B^1 + ... + nm*B^m, and dividing by B reduces all the exponents, so the new N is n1*B^0 + ... + nm*B^(m-1). Repeating this gives you back all the digits you started with.
Here is a working example in Python:
def compact_num( num_list, base=12 ):
    return sum( n*pow(base,i) for i,n in enumerate(num_list) )

def decompact_num( n, base=12):
    if n==0:
        return [0]
    result = []
    while n:
        n,d = divmod(n,base)
        result.append(d)
    return result
Example:
>>> compact_num([2,5,7])
1070
>>> decompact_num(1070)
[2, 5, 7]
>>> compact_num([10,2],16)
42
>>> decompact_num(42,16)
[10, 2]
>>>

How to find x mod 15 without using any Arithmetic Operations?

Suppose we are given an unsigned integer x. Without using any arithmetic operators, i.e. + - / * or %, we are to find x mod 15. We may use binary bit manipulations.
This is as far as I could go, based on 2 points:
a = a mod 15 = a mod 16 for a<15
Let a = x mod 15
then a = x - 15k (for some non-negative k).
ie a = x - 16k + k...
ie a mod 16 = ( x mod 16 + k mod 16 ) mod 16
ie a mod 15 = ( x mod 16 + k mod 16 ) mod 16
ie a = ( x mod 16 + k mod 16 ) mod 16
OK. Now to implement this: a mod 16 operation is basically & 0xF, and k is basically x>>4.
So a = ( x & 0xF + (x>>4) & 0xF ) & 0xF.
It boils down to adding two 4-bit numbers, which can be done with bit expressions.
sum[0] = a[0] ^ b[0]
sum[1] = a[1] ^ b[1] ^ (a[0] & b[0])
...
and so on
This seems like cheating to me. I'm hoping for a more elegant solution
This reminds me of an old trick from base 10 called "casting out the 9s". This was used for checking the result of large sums performed by hand.
In this case 123 mod 9 = 1 + 2 + 3 mod 9 = 6.
This happens because 9 is one less than the base of the digits (10). (Proof omitted ;) )
So, considering the number in base 16 (hex), you should be able to do:
0xABCDE123 mod 0xF = (0xA + 0xB + 0xC + 0xD + 0xE + 0x1 + 0x2 + 0x3) mod 0xF
= 0x42 mod 0xF
= 0x6
Now you'll still need to do some magic to make the additions disappear. But it gives the right answer.
UPDATE:
Here's a complete implementation in C++. The f lookup table takes pairs of digits to their sum mod 15 (which is the same as the byte mod 15). We then repack these results and reapply the table on half as much data each round.
#include <iostream>
#include <cstdint>
uint8_t f[256]={
0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,0,
1,2,3,4,5,6,7,8,9,10,11,12,13,14,0,1,
2,3,4,5,6,7,8,9,10,11,12,13,14,0,1,2,
3,4,5,6,7,8,9,10,11,12,13,14,0,1,2,3,
4,5,6,7,8,9,10,11,12,13,14,0,1,2,3,4,
5,6,7,8,9,10,11,12,13,14,0,1,2,3,4,5,
6,7,8,9,10,11,12,13,14,0,1,2,3,4,5,6,
7,8,9,10,11,12,13,14,0,1,2,3,4,5,6,7,
8,9,10,11,12,13,14,0,1,2,3,4,5,6,7,8,
9,10,11,12,13,14,0,1,2,3,4,5,6,7,8,9,
10,11,12,13,14,0,1,2,3,4,5,6,7,8,9,10,
11,12,13,14,0,1,2,3,4,5,6,7,8,9,10,11,
12,13,14,0,1,2,3,4,5,6,7,8,9,10,11,12,
13,14,0,1,2,3,4,5,6,7,8,9,10,11,12,13,
14,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,
0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,0};
uint64_t mod15( uint64_t in_v )
{
    uint8_t * in = (uint8_t*)&in_v;
    // 12 34 56 78 12 34 56 78 => aa bb cc dd
    in[0] = f[in[0]] | (f[in[1]]<<4);
    in[1] = f[in[2]] | (f[in[3]]<<4);
    in[2] = f[in[4]] | (f[in[5]]<<4);
    in[3] = f[in[6]] | (f[in[7]]<<4);
    // aa bb cc dd => AA BB
    in[0] = f[in[0]] | (f[in[1]]<<4);
    in[1] = f[in[2]] | (f[in[3]]<<4);
    // AA BB => DD
    in[0] = f[in[0]] | (f[in[1]]<<4);
    // DD => D
    return f[in[0]];
}

int main()
{
    uint64_t x = 12313231;
    std::cout << mod15(x) << " " << (x%15) << std::endl;
}
Your logic is flawed somewhere, but I can't quite put my finger on it. Think about it yourself: your final formula operates on the first 8 bits and ignores the rest. That could only be valid if the part you throw away (bit 9 and above) were always a multiple of 15. In reality (in binary numbers) that part is always a multiple of 16, not of 15. For example, try putting 1 0000 0000 and 11 0000 0000 into your formula. It will give 0 as the result in both cases, while in reality the answers are 1 and 3.
In essence I'm almost sure that your task cannot be solved without loops. And if you are allowed to use loops, then nothing is easier than implementing a bitwiseAdd function and doing whatever you like with it.
Added:
Found your problem. Here it is:
... a = x - 15k (for some non-negative k).
... and k is basically x>>4
It equals x>>4 only by pure coincidence for some numbers. Take any big example, for instance x = 1111 0000 (binary). By your calculation k = 15, while in reality it is k = 16: 16*15 = 1111 0000.
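For reference, a hedged sketch of the bitwiseAdd idea mentioned above (the name bitwise_add is mine): addition built only from AND, XOR and shifts, which can then be used to fold the hex digits together.
#include <stdint.h>

/* Add two numbers using only bitwise operations: XOR gives the partial sum,
   AND + shift gives the carries, repeated until no carry remains. */
static unsigned bitwise_add(unsigned a, unsigned b)
{
    while (b != 0) {
        unsigned carry = (a & b) << 1;
        a = a ^ b;
        b = carry;
    }
    return a;
}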

Divide by 10 using bit shifts?

Is it possible to divide an unsigned integer by 10 using pure bit shifts, addition, subtraction and maybe multiplication? I am using a processor with very limited resources and a slow divide.
Editor's note: this is not actually what compilers do, and gives the wrong answer for large positive integers ending with 9, starting with div10(1073741829) = 107374183 not 107374182. It is exact for smaller inputs, though, which may be sufficient for some uses.
Compilers (including MSVC) do use fixed-point multiplicative inverses for constant divisors, but they use a different magic constant and shift on the high-half result to get an exact result for all possible inputs, matching what the C abstract machine requires. See Granlund & Montgomery's paper on the algorithm.
See Why does GCC use multiplication by a strange number in implementing integer division? for examples of the actual x86 asm gcc, clang, MSVC, ICC, and other modern compilers make.
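For comparison, a hedged sketch of the exact unsigned variant compilers typically generate for /10 on a 32-bit value (assuming a 64-bit intermediate is available):
#include <stdint.h>

/* Multiply by the magic constant ceil(2^35 / 10) and take the high bits.
   This is exact for every 32-bit unsigned input. */
static uint32_t div10_exact(uint32_t n)
{
    return (uint32_t)(((uint64_t)n * 0xCCCCCCCDull) >> 35);
}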
This is a fast approximation that's inexact for large inputs
It's even faster than the exact division via multiply + right-shift that compilers use.
You can use the high half of a multiply result for divisions by small integral constants. Assume a 32-bit machine (code can be adjusted accordingly):
int32_t div10(int32_t dividend)
{
int64_t invDivisor = 0x1999999A;
return (int32_t) ((invDivisor * dividend) >> 32);
}
What's going on here is that we're multiplying by a close approximation of 1/10 * 2^32 and then removing the 2^32. This approach can be adapted to different divisors and different bit widths.
This works great for the ia32 architecture, since its IMUL instruction will put the 64-bit product into edx:eax, and the edx value will be the wanted value. Viz (assuming dividend is passed in eax and quotient returned in eax)
div10 proc
mov edx,1999999Ah ; load 1/10 * 2^32
imul eax ; edx:eax = dividend * (2^32 / 10)
mov eax,edx ; eax = dividend / 10
ret
endp
Even on a machine with a slow multiply instruction, this will be faster than a software or even hardware divide.
Though the answers given so far match the actual question, they do not match the title. So here's a solution heavily inspired by Hacker's Delight that really uses only bit shifts.
unsigned divu10(unsigned n) {
    unsigned q, r;
    q = (n >> 1) + (n >> 2);
    q = q + (q >> 4);
    q = q + (q >> 8);
    q = q + (q >> 16);
    q = q >> 3;
    r = n - (((q << 2) + q) << 1);
    return q + (r > 9);
}
I think that this is the best solution for architectures that lack a multiply instruction.
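A quick, hedged way to sanity-check the routine above against ordinary division (this small test harness is mine, not part of the original answer):
#include <stdio.h>

/* assumes divu10() from the snippet above is in scope */
int main(void)
{
    unsigned n;
    for (n = 0; n < 1000000u; ++n) {
        if (divu10(n) != n / 10) {
            printf("mismatch at %u\n", n);
            return 1;
        }
    }
    printf("ok\n");
    return 0;
}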
Of course you can, if you can live with some loss of precision. If you know the value range of your inputs, you can come up with a bit shift and a multiplication which is exact.
Here are some examples of how you can divide by 10, 60, ..., as described in this blog about formatting time the fastest way possible.
temp = (ms * 205) >> 11; // 205/2048 is nearly the same as /10
To expand Alois's answer a bit, we can extend the suggested y = (x * 205) >> 11 to a few more multipliers/shifts:
y = (ms * 1) >> 3 // first error 8
y = (ms * 2) >> 4 // 8
y = (ms * 4) >> 5 // 8
y = (ms * 7) >> 6 // 19
y = (ms * 13) >> 7 // 69
y = (ms * 26) >> 8 // 69
y = (ms * 52) >> 9 // 69
y = (ms * 103) >> 10 // 179
y = (ms * 205) >> 11 // 1029
y = (ms * 410) >> 12 // 1029
y = (ms * 820) >> 13 // 1029
y = (ms * 1639) >> 14 // 2739
y = (ms * 3277) >> 15 // 16389
y = (ms * 6554) >> 16 // 16389
y = (ms * 13108) >> 17 // 16389
y = (ms * 26215) >> 18 // 43699
y = (ms * 52429) >> 19 // 262149
y = (ms * 104858) >> 20 // 262149
y = (ms * 209716) >> 21 // 262149
y = (ms * 419431) >> 22 // 699059
y = (ms * 838861) >> 23 // 4194309
y = (ms * 1677722) >> 24 // 4194309
y = (ms * 3355444) >> 25 // 4194309
y = (ms * 6710887) >> 26 // 11184819
y = (ms * 13421773) >> 27 // 67108869
Each line is a single, independent calculation, and you'll see your first "error"/incorrect result at the value shown in the comment. You're generally better off taking the smallest shift for a given error value, as this minimises the extra bits needed to store the intermediate value in the calculation, e.g. (x * 13) >> 7 is "better" than (x * 52) >> 9 as it needs two fewer bits of overhead, while both start to give wrong answers above 68.
If you want to calculate more of these, the following (Python) code can be used:
def mul_from_shift(shift):
    mid = 2**shift + 5.
    return int(round(mid / 10.))
And I did the obvious thing to calculate when this approximation starts to go wrong:
def first_err(mul, shift):
    i = 1
    while True:
        y = (i * mul) >> shift
        if y != i // 10:
            return i
        i += 1
(note that // is used for "integer" division, i.e. it truncates/rounds towards zero)
The reason for the "3/1" pattern in the errors (i.e. 8 repeating three times followed by 9) seems to be due to the change in bases, i.e. log2(10) is ~3.32. Plotting the errors makes this visible (plot not reproduced here), where the relative error is given by: mul_from_shift(shift) / (1<<shift) - 0.1
Considering Kuba Ober's response, here is another one in the same vein.
It uses iterative approximation of the result, but I wouldn't expect any surprising performance.
Let's say we have to find x where x = v / 10.
We'll use the inverse operation v = x * 10, because it has the nice property that when x = a + b, then x * 10 = a * 10 + b * 10.
Let's use x as the variable holding the best approximation of the result so far. When the search ends, x will hold the result. We set each bit b of x from the most significant to the least significant, one by one, and compare (x + b) * 10 with v. If it is smaller than or equal to v, then the bit b is set in x. To test the next bit, we simply shift b one position to the right (divide by two).
We can avoid the multiplication by 10 by holding x * 10 and b * 10 in other variables.
This yields the following algorithm to divide v by 10.
uint16_t x = 0, x10 = 0, b = 0x1000, b10 = 0xA000;
while (b != 0) {
    uint16_t t = x10 + b10;
    if (t <= v) {
        x10 = t;
        x |= b;
    }
    b10 >>= 1;
    b >>= 1;
}
// x = v / 10
Edit: to get Kuba Ober's algorithm, which avoids the need for the variable x10, we can subtract b10 from v instead. In this case x10 isn't needed anymore. The algorithm becomes:
uint16_t x = 0, b = 0x1000, b10 = 0xA000;
while (b != 0) {
    if (b10 <= v) {
        v -= b10;
        x |= b;
    }
    b10 >>= 1;
    b >>= 1;
}
// x = v / 10
The loop may be unrolled, and the different values of b and b10 may be precomputed as constants.
On architectures that can only shift one place at a time, a series of explicit comparisons against decreasing powers of two multiplied by 10 might work better than the solution from Hacker's Delight. Assuming a 16 bit dividend:
uint16_t div10(uint16_t dividend) {
    uint16_t quotient = 0;
#define div10_step(n) \
    do { if (dividend >= (n*10)) { quotient += n; dividend -= n*10; } } while (0)
    div10_step(0x1000);
    div10_step(0x0800);
    div10_step(0x0400);
    div10_step(0x0200);
    div10_step(0x0100);
    div10_step(0x0080);
    div10_step(0x0040);
    div10_step(0x0020);
    div10_step(0x0010);
    div10_step(0x0008);
    div10_step(0x0004);
    div10_step(0x0002);
    div10_step(0x0001);
#undef div10_step
    if (dividend >= 5) ++quotient; // round the result (optional)
    return quotient;
}
Well, division is just repeated subtraction, so yes. Shift right by 1 (divide by 2). Now subtract 5 from the result, counting the number of times you do the subtraction until the value is less than 5. The result is the number of subtractions you did. Oh, and ordinary division is probably still going to be faster.
A hybrid strategy of shift right then divide by 5 using the normal division might get you a performance improvement if the logic in the divider doesn't already do this for you.
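As a hedged one-liner for that hybrid idea (the function name is mine): halving with a shift and then dividing by 5 gives the same result as dividing by 10 for unsigned values.
#include <stdint.h>

/* (n >> 1) / 5 == n / 10 for unsigned n, since floor(floor(n/2)/5) == floor(n/10). */
static uint32_t div10_hybrid(uint32_t n)
{
    return (n >> 1) / 5;
}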
I've designed a new method in AVR assembly, with lsr/ror and sub/sbc only. It divides by 8, then subtracts the number divided by 64 and 128, then subtracts the 1,024th and the 2,048th, and so on and so on. It works very reliably (including exact rounding) and quickly (370 microseconds at 1 MHz).
The source code is here for 16-bit-numbers:
http://www.avr-asm-tutorial.net/avr_en/beginner/DIV10/div10_16rd.asm
The page that comments this source code is here:
http://www.avr-asm-tutorial.net/avr_en/beginner/DIV10/DIV10.html
I hope that it helps, even though the question is ten years old.
brgs, gsc
The code from elemakil's comments can be found here: https://doc.lagout.org/security/Hackers%20Delight.pdf
page 233. "Unsigned divide by 10 [and 11.]"
