Subtraction assembly with "base 10" EMU8086

Subtraction assembly with "base 10" EMU8086 - math

Hello I'm making a base 10 calculator in assembler that can take number with max length of 5 dig... so there is two numbers after the input was taken one of the five dig number is stored in ax and bl for example
AX - 23 45
BX - 00 01
So the value of the input is 12345 And the other is for example is 23243 and it's stored on CX and DX with the same idea of the first number (that stored in AX and BX...) Now, I have made the addition code, but I can't figure out how making the Subtraction code with all the neg problem...
So what I thought to do is to, for example, take bh (that I'm not using because the number can't be longer than 6 digs...) and if the number is negative Ill put 1 and if its positive I'll put 0 so this problem is solved, Now the problem is that I dont know how to make the code work like with all the sub part and the carry and every thing ...(in the addition i used commands like adc,daa...)
last example:
value is: 12345 and its positive
AX - 23 45
BX - 00 01
(if Bh is 0 the number is positive if 1 its negative...)
Now the value is : 23243 and its positive
CX - 32 43
DX - 00 02
Calculation
12345-23243(= -10898)
lets say the answer goes to CX AND DX
so it will look like that:
CX - 08 98
DX - 01 01
answer: (-10898)
Can someone please help me/give me an example code that I'll know how to do it ?
Sorry if I'm little bit Confused...
Thx.
EDIT:
here is the addition code that you ask for:
proc Add_two_numbers;2 values useing stack...
pop [150]
pop dx
pop cx
pop bx
pop ax
add al,cl
daa
mov cl,al
mov al,ah
adc al,ch
daa
mov ch,al
mov al,bl
adc al,dl
daa
mov dl,al
push cx
push dx
push [150]
ret
endp Add_two_numbers
2nd edit:
I figure out how making it Negative so I just need algorithms that sub 2 number it does not need to work with numbers like 1000-2000 please make it work only on positive values like 2000-1000

Answering your comment, this is one way you can convert from decimal and back using C as an example. I leave you to code it in asm!
#include <conio.h>
#define MAX 100000000
// input a signed decimal number
int inp_num(void) {
int number=0, neg=0, key;
while (number < MAX) {
key = _getche();
if (key == '-') {
if (number==0)
neg = 1; // else ignore
}
else if (key >= '0' && key <= '9')
number = number * 10 + key - '0';
else
break;
}
if (neg)
number = -number;
_putch('\n');
return number;
}
// output a signed number as decimal
void out_num(int number) {
int digit, suppress0, d;
suppress0 = 1; // zero-suppression on
if (number < 0) {
_putch('-');
number =-number;
}
for (d=MAX; d>0; d/=10) {
digit = number / d;
if (digit) // if non-0
suppress0 = 0; // cancel zero-suppression
if (!suppress0)
_putch('0' + digit);
number -= digit * d;
}
}
int main(void) {
int number;
number = inp_num();
out_num(number);
return 0;
}

Related

Floating point data format sign+exponent

I am receiving data over UART from a heat meter, but I need some help to understand how i should deal with the data.
I have the documentation but that is not enough for me, I have to little experience with this kind of calculations.
Maybe someone with the right skill could explain to me how it should be done with a better example that I have from the documentation.
One value consists of the following bytes:
[number of bytes][sign+exponent] (integer)
(integer) is the register data value. The length of the integer value is
specified by [number of bytes]. [sign+exponent] is an 8-bit value that
specifies the sign of the data value and sign and value of the exponent. The
meaning of the individual bits in the [sign+exponent] byte is shown below:
Examples:
-123.45 = 04h, C2h, 0h, 0h, 30h, 39h
87654321*103 = 04h, 03h , 05h, 39h, 7Fh, B1h
255*103 = 01h, 03h , FFh
And now to one more example with actual data.
This is the information that I have from the documentation about this.
This is some data that I have received from my heat meter
10 00 56 25 04 42 00 00 1B E4
So in my example then 04 is the [number of bytes], 42 is the [sign+exponent] and 00 00 1B E4 is the (integer).
But I do not know how I should make the calculation to receive the actual value.
Any help?

Your data appears to be big-endian, according to your example. So here's how you break those bytes into the fields you need using bit shifting and masking.
n = b[0]
SI = (b[1] & 0x80) >> 7
SE = (b[1] & 0x40) >> 6
exponent = b[1] & 0x3f
integer = 0
for i = 0 to n-1:
integer = (integer << 8) + b[2+i]

The sign of the mantissa is obtained from the MSb of the Sign+exponent byte, by masking (byte & 80h != 0 => SI = -1).
The sign of the exponent is similarly obtained by byte & 40h != 0 => SE = -1.
The exponent value is EXP = byte & 3Fh.
The mantissa INT is the binary number formed by the four other bytes, which can be read as a single integer (but mind the indianness).
Finally, compute SI * INT * pow(10, SE * EXP).
In your example, SI = 1, SE = -1, EXP = 2, INT = 7140, hence
1 * 7140 * pow(10, -1 * 2) = +71.4
It is not in the scope of this answer to explain how to implement this efficiently.

how to encode 27 vector3's into a 0-256 value?

I have 27 combinations of 3 values from -1 to 1 of type:
Vector3(0,0,0);
Vector3(-1,0,0);
Vector3(0,-1,0);
Vector3(0,0,-1);
Vector3(-1,-1,0);
... up to
Vector3(0,1,1);
Vector3(1,1,1);
I need to convert them to and from a 8-bit sbyte / byte array.
One solution is to say the first digit, of the 256 = X the second digit is Y and the third is Z...
so
Vector3(-1,1,1) becomes 022,
Vector3(1,-1,-1) becomes 200,
Vector3(1,0,1) becomes 212...
I'd prefer to encode it in a more compact way, perhaps using bytes (which I am clueless about), because the above solution uses a lot of multiplications and round functions to decode, do you have some suggestions please? the other option is to write 27 if conditions to write the Vector3 combination to an array, it seems inefficient.
Thanks to Evil Tak for the guidance, i changed the code a bit to add 0-1 values to the first bit, and to adapt it for unity3d:
function Pack4(x:int,y:int,z:int,w:int):sbyte {
var b: sbyte = 0;
b |= (x + 1) << 6;
b |= (y + 1) << 4;
b |= (z + 1) << 2;
b |= (w + 1);
return b;
}
function unPack4(b:sbyte):Vector4 {
var v : Vector4;
v.x = ((b & 0xC0) >> 6) - 1; //0xC0 == 1100 0000
v.y = ((b & 0x30) >> 4) - 1; // 0x30 == 0011 0000
v.z = ((b & 0xC) >> 2) - 1; // 0xC == 0000 1100
v.w = (b & 0x3) - 1; // 0x3 == 0000 0011
return v;
}

I assume your values are float not integer
so bit operations will not improve speed too much in comparison to conversion to integer type. So my bet using full range will be better. I would do this for 3D case:
8 bit -> 256 values
3D -> pow(256,1/3) = ~ 6.349 values per dimension
6^3 = 216 < 256
So packing of (x,y,z) looks like this:
BYTE p;
p =floor((x+1.0)*3.0);
p+=floor((y+1.0)*3.0*6.0);
p+=floor((y+1.0)*3.0*6.0*6.0);
The idea is convert <-1,+1> to range <0,1> hence the +1.0 and *3.0 instead of *6.0 and then just multiply to the correct place in final BYTE.
and unpacking of p looks like this:
x=p%6; x=(x/3.0)-1.0; p/=6;
y=p%6; y=(y/3.0)-1.0; p/=6;
z=p%6; z=(z/3.0)-1.0;
This way you use 216 from 256 values which is much better then just 2 bits (4 values). Your 4D case would look similar just use instead 3.0,6.0 different constant floor(pow(256,1/4))=4 so use 2.0,4.0 but beware case when p=256 or use 2 bits per dimension and bit approach like the accepted answer does.
If you need real speed you can optimize this to force float representation holding result of packet BYTE to specific exponent and extract mantissa bits as your packed BYTE directly. As the result will be <0,216> you can add any bigger number to it. see IEEE 754-1985 for details but you want the mantissa to align with your BYTE so if you add to p number like 2^23 then the lowest 8 bit of float should be your packed value directly (as MSB 1 is not present in mantissa) so no expensive conversion is needed.
In case you got just {-1,0,+1} instead of <-1,+1>
then of coarse you should use integer approach like bit packing with 2 bits per dimension or use LUT table of all 3^3 = 27 possibilities and pack entire vector in 5 bits.
The encoding would look like this:
int enc[3][3][3] = { 0,1,2, ... 24,25,26 };
p=enc[x+1][y+1][z+1];
And decoding:
int dec[27][3] = { {-1,-1,-1},.....,{+1,+1,+1} };
x=dec[p][0];
y=dec[p][1];
z=dec[p][2];
Which should be fast enough and if you got many vectors you can pack the p into each 5 bits ... to save even more memory space

One way is to store the component of each vector in every 2 bits of a byte.
Converting a vector component value to and from the 2 bit stored form is as simple as adding and subtracting one, respectively.
-1 (1111 1111 as a signed byte) <-> 00 (in binary)
0 (0000 0000 in binary) <-> 01 (in binary)
1 (0000 0001 in binary) <-> 10 (in binary)
The packed 2 bit values can be stored in a byte in any order of your preference. I will use the following format: 00XXYYZZ where XX is the converted (packed) value of the X component, and so on. The 0s at the start aren't going to be used.
A vector will then be packed in a byte as follows:
byte Pack(Vector3<int> vector) {
byte b = 0;
b |= (vector.x + 1) << 4;
b |= (vector.y + 1) << 2;
b |= (vector.z + 1);
return b;
}
Unpacking a vector from its byte form will be as follows:
Vector3<int> Unpack(byte b) {
Vector3<int> v = new Vector<int>();
v.x = ((b & 0x30) >> 4) - 1; // 0x30 == 0011 0000
v.y = ((b & 0xC) >> 2) - 1; // 0xC == 0000 1100
v.z = (b & 0x3) - 1; // 0x3 == 0000 0011
return v;
}
Both the above methods assume that the input is valid, i.e. All components of vector in Pack are either -1, 0 or 1 and that all two-bit sections of b in Unpack have a (binary) value of either 00, 01 or 10.
Since this method uses bitwise operators, it is fast and efficient. If you wish to compress the data further, you could try using the 2 unused bits too, and convert every 3 two-bit elements processed to a vector.

The most compact way is by writing a 27 digits number in base 3 (using a shift -1 -> 0, 0 -> 1, 1 -> 2).
The value of this number will range from 0 to 3^27-1 = 7625597484987, which takes 43 bits to be encoded, i.e. 6 bytes (and 5 spare bits).
This is a little saving compared to a packed representation with 4 two-bit numbers packed in a byte (hence 7 bytes/56 bits in total).
An interesting variant is to group the base 3 digits five by five in bytes (hence numbers 0 to 242). You will still require 6 bytes (and no spare bits), but the decoding of the bytes can easily be hard-coded as a table of 243 entries.

Arduino: Formula to convert byte

Im looking for a way to modify a binary byte value on Arduino.
Because of the Hardware, its neccesarry, to split a two digit number into 2 4-bit.
the code to set output is wire.write(byte, 0xFF) which sets all outputs on High.
0xFF = binary 1111 1111
the formula should be convert a value like this:
e.g nr 35 is binary 0010 0011
but for my use it should displayed as 0011 0101 which would be refer to 53 in reality.
The first 4 bits are for a BCD-Input IC which displays the 5 from 35, the second 4 bits are for a BCD-Input IC which displays the 3 from 35.
Does anybody has a idea how to convert this by code, or like a mathematical formula?
Possible numbers are from 00 to 59.
Thank you for your help

To convert a value n between 0 and 99 to BCD:
((n / 10) * 16) + (n % 10)
assuming n is an integer and thus / is doing integer division; also assumes this will be stored in an unsigned byte.
(If this is not producing the desired result, please either explain how it is incorrect for the example given, or provide a different example for which it is incorrect.)

#include <string.h>
int num = // Any number from 0 to 59
int tens = num/10;
int units = num-(tens*10);
// Make string array for binary
string tensbinary;
int quotient = tens;
char buffer[1];
// Convert numbers
for (int i = 0; i < 4; i++)
{
quotientint = quotientint % 2;
sprintf(buffer, 1, "%d", quotientint);
binary.append(buffer);
}
// Repeat above for the units
// Now join the two together
binarytens.append(binaryunits);
I don't know if this will work, but still, you might be able to extrapolate based on the available information in my code.
The last thing you need to do is convert the string to binary.

Divide by 10 using bit shifts?

Is it possible to divide an unsigned integer by 10 by using pure bit shifts, addition, subtraction and maybe multiply? Using a processor with very limited resources and slow divide.

Editor's note: this is not actually what compilers do, and gives the wrong answer for large positive integers ending with 9, starting with div10(1073741829) = 107374183 not 107374182. It is exact for smaller inputs, though, which may be sufficient for some uses.
Compilers (including MSVC) do use fixed-point multiplicative inverses for constant divisors, but they use a different magic constant and shift on the high-half result to get an exact result for all possible inputs, matching what the C abstract machine requires. See Granlund & Montgomery's paper on the algorithm.
See Why does GCC use multiplication by a strange number in implementing integer division? for examples of the actual x86 asm gcc, clang, MSVC, ICC, and other modern compilers make.
This is a fast approximation that's inexact for large inputs
It's even faster than the exact division via multiply + right-shift that compilers use.
You can use the high half of a multiply result for divisions by small integral constants. Assume a 32-bit machine (code can be adjusted accordingly):
int32_t div10(int32_t dividend)
{
int64_t invDivisor = 0x1999999A;
return (int32_t) ((invDivisor * dividend) >> 32);
}
What's going here is that we're multiplying by a close approximation of 1/10 * 2^32 and then removing the 2^32. This approach can be adapted to different divisors and different bit widths.
This works great for the ia32 architecture, since its IMUL instruction will put the 64-bit product into edx:eax, and the edx value will be the wanted value. Viz (assuming dividend is passed in eax and quotient returned in eax)
div10 proc
mov edx,1999999Ah ; load 1/10 * 2^32
imul eax ; edx:eax = dividend / 10 * 2 ^32
mov eax,edx ; eax = dividend / 10
ret
endp
Even on a machine with a slow multiply instruction, this will be faster than a software or even hardware divide.

Though the answers given so far match the actual question, they do not match the title. So here's a solution heavily inspired by Hacker's Delight that really uses only bit shifts.
unsigned divu10(unsigned n) {
unsigned q, r;
q = (n >> 1) + (n >> 2);
q = q + (q >> 4);
q = q + (q >> 8);
q = q + (q >> 16);
q = q >> 3;
r = n - (((q << 2) + q) << 1);
return q + (r > 9);
}
I think that this is the best solution for architectures that lack a multiply instruction.

Of course you can if you can live with some loss in precision. If you know the value range of your input values you can come up with a bit shift and a multiplication which is exact.
Some examples how you can divide by 10, 60, ... like it is described in this blog to format time the fastest way possible.
temp = (ms * 205) >> 11; // 205/2048 is nearly the same as /10

to expand Alois's answer a bit, we can expand the suggested y = (x * 205) >> 11 for a few more multiples/shifts:
y = (ms * 1) >> 3 // first error 8
y = (ms * 2) >> 4 // 8
y = (ms * 4) >> 5 // 8
y = (ms * 7) >> 6 // 19
y = (ms * 13) >> 7 // 69
y = (ms * 26) >> 8 // 69
y = (ms * 52) >> 9 // 69
y = (ms * 103) >> 10 // 179
y = (ms * 205) >> 11 // 1029
y = (ms * 410) >> 12 // 1029
y = (ms * 820) >> 13 // 1029
y = (ms * 1639) >> 14 // 2739
y = (ms * 3277) >> 15 // 16389
y = (ms * 6554) >> 16 // 16389
y = (ms * 13108) >> 17 // 16389
y = (ms * 26215) >> 18 // 43699
y = (ms * 52429) >> 19 // 262149
y = (ms * 104858) >> 20 // 262149
y = (ms * 209716) >> 21 // 262149
y = (ms * 419431) >> 22 // 699059
y = (ms * 838861) >> 23 // 4194309
y = (ms * 1677722) >> 24 // 4194309
y = (ms * 3355444) >> 25 // 4194309
y = (ms * 6710887) >> 26 // 11184819
y = (ms * 13421773) >> 27 // 67108869
each line is a single, independent, calculation, and you'll see your first "error"/incorrect result at the value shown in the comment. you're generally better off taking the smallest shift for a given error value as this will minimise the extra bits needed to store the intermediate value in the calculation, e.g. (x * 13) >> 7 is "better" than (x * 52) >> 9 as it needs two less bits of overhead, while both start to give wrong answers above 68.
if you want to calculate more of these, the following (Python) code can be used:
def mul_from_shift(shift):
mid = 2**shift + 5.
return int(round(mid / 10.))
and I did the obvious thing for calculating when this approximation starts to go wrong with:
def first_err(mul, shift):
i = 1
while True:
y = (i * mul) >> shift
if y != i // 10:
return i
i += 1
(note that // is used for "integer" division, i.e. it truncates/rounds towards zero)
the reason for the "3/1" pattern in errors (i.e. 8 repeats 3 times followed by 9) seems to be due to the change in bases, i.e. log2(10) is ~3.32. if we plot the errors we get the following:
where the relative error is given by: mul_from_shift(shift) / (1<<shift) - 0.1

Considering Kuba Ober’s response, there is another one in the same vein.
It uses iterative approximation of the result, but I wouldn’t expect any surprising performances.
Let say we have to find x where x = v / 10.
We’ll use the inverse operation v = x * 10 because it has the nice property that when x = a + b, then x * 10 = a * 10 + b * 10.
Let use x as variable holding the best approximation of result so far. When the search ends, x Will hold the result. We’ll set each bit b of x from the most significant to the less significant, one by one, end compare (x + b) * 10 with v. If its smaller or equal to v, then the bit b is set in x. To test the next bit, we simply shift b one position to the right (divide by two).
We can avoid the multiplication by 10 by holding x * 10 and b * 10 in other variables.
This yields the following algorithm to divide v by 10.
uin16_t x = 0, x10 = 0, b = 0x1000, b10 = 0xA000;
while (b != 0) {
uint16_t t = x10 + b10;
if (t <= v) {
x10 = t;
x |= b;
}
b10 >>= 1;
b >>= 1;
}
// x = v / 10
Edit: to get the algorithm of Kuba Ober which avoids the need of variable x10 , we can subtract b10 from v and v10 instead. In this case x10 isn’t needed anymore. The algorithm becomes
uin16_t x = 0, b = 0x1000, b10 = 0xA000;
while (b != 0) {
if (b10 <= v) {
v -= b10;
x |= b;
}
b10 >>= 1;
b >>= 1;
}
// x = v / 10
The loop may be unwinded and the different values of b and b10 may be precomputed as constants.

On architectures that can only shift one place at a time, a series of explicit comparisons against decreasing powers of two multiplied by 10 might work better than the solution form hacker's delight. Assuming a 16 bit dividend:
uint16_t div10(uint16_t dividend) {
uint16_t quotient = 0;
#define div10_step(n) \
do { if (dividend >= (n*10)) { quotient += n; dividend -= n*10; } } while (0)
div10_step(0x1000);
div10_step(0x0800);
div10_step(0x0400);
div10_step(0x0200);
div10_step(0x0100);
div10_step(0x0080);
div10_step(0x0040);
div10_step(0x0020);
div10_step(0x0010);
div10_step(0x0008);
div10_step(0x0004);
div10_step(0x0002);
div10_step(0x0001);
#undef div10_step
if (dividend >= 5) ++quotient; // round the result (optional)
return quotient;
}

Well division is subtraction, so yes. Shift right by 1 (divide by 2). Now subtract 5 from the result, counting the number of times you do the subtraction until the value is less than 5. The result is number of subtractions you did. Oh, and dividing is probably going to be faster.
A hybrid strategy of shift right then divide by 5 using the normal division might get you a performance improvement if the logic in the divider doesn't already do this for you.

I've designed a new method in AVR assembly, with lsr/ror and sub/sbc only. It divides by 8, then sutracts the number divided by 64 and 128, then subtracts the 1,024th and the 2,048th, and so on and so on. Works very reliable (includes exact rounding) and quick (370 microseconds at 1 MHz).
The source code is here for 16-bit-numbers:
http://www.avr-asm-tutorial.net/avr_en/beginner/DIV10/div10_16rd.asm
The page that comments this source code is here:
http://www.avr-asm-tutorial.net/avr_en/beginner/DIV10/DIV10.html
I hope that it helps, even though the question is ten years old.
brgs, gsc

elemakil's comments' code can be found here: https://doc.lagout.org/security/Hackers%20Delight.pdf
page 233. "Unsigned divide by 10 [and 11.]"

Designing function f(f(n)) == -n

A question I got on my last interview:
Design a function f, such that:
f(f(n)) == -n
Where n is a 32 bit signed integer; you can't use complex numbers arithmetic.
If you can't design such a function for the whole range of numbers, design it for the largest range possible.
Any ideas?

You didn't say what kind of language they expected... Here's a static solution (Haskell). It's basically messing with the 2 most significant bits:
f :: Int -> Int
f x | (testBit x 30 /= testBit x 31) = negate $ complementBit x 30
| otherwise = complementBit x 30
It's much easier in a dynamic language (Python). Just check if the argument is a number X and return a lambda that returns -X:
def f(x):
if isinstance(x,int):
return (lambda: -x)
else:
return x()

How about:
f(n) = sign(n) - (-1)ⁿ * n
In Python:
def f(n):
if n == 0: return 0
if n >= 0:
if n % 2 == 1:
return n + 1
else:
return -1 * (n - 1)
else:
if n % 2 == 1:
return n - 1
else:
return -1 * (n + 1)
Python automatically promotes integers to arbitrary length longs. In other languages the largest positive integer will overflow, so it will work for all integers except that one.
To make it work for real numbers you need to replace the n in (-1)ⁿ with { ceiling(n) if n>0; floor(n) if n<0 }.
In C# (works for any double, except in overflow situations):
static double F(double n)
{
if (n == 0) return 0;
if (n < 0)
return ((long)Math.Ceiling(n) % 2 == 0) ? (n + 1) : (-1 * (n - 1));
else
return ((long)Math.Floor(n) % 2 == 0) ? (n - 1) : (-1 * (n + 1));
}

Here's a proof of why such a function can't exist, for all numbers, if it doesn't use extra information(except 32bits of int):
We must have f(0) = 0. (Proof: Suppose f(0) = x. Then f(x) = f(f(0)) = -0 = 0. Now, -x = f(f(x)) = f(0) = x, which means that x = 0.)
Further, for any x and y, suppose f(x) = y. We want f(y) = -x then. And f(f(y)) = -y => f(-x) = -y. To summarize: if f(x) = y, then f(-x) = -y, and f(y) = -x, and f(-y) = x.
So, we need to divide all integers except 0 into sets of 4, but we have an odd number of such integers; not only that, if we remove the integer that doesn't have a positive counterpart, we still have 2(mod4) numbers.
If we remove the 2 maximal numbers left (by abs value), we can get the function:
int sign(int n)
{
if(n>0)
return 1;
else
return -1;
}
int f(int n)
{
if(n==0) return 0;
switch(abs(n)%2)
{
case 1:
return sign(n)*(abs(n)+1);
case 0:
return -sign(n)*(abs(n)-1);
}
}
Of course another option, is to not comply for 0, and get the 2 numbers we removed as a bonus. (But that's just a silly if.)

Thanks to overloading in C++:
double f(int var)
{
return double(var);
}
int f(double var)
{
return -int(var);
}
int main(){
int n(42);
std::cout<<f(f(n));
}

Or, you could abuse the preprocessor:
#define f(n) (f##n)
#define ff(n) -n
int main()
{
int n = -42;
cout << "f(f(" << n << ")) = " << f(f(n)) << endl;
}

This is true for all negative numbers.
f(n) = abs(n)
Because there is one more negative number than there are positive numbers for twos complement integers, f(n) = abs(n) is valid for one more case than f(n) = n > 0 ? -n : n solution that is the same same as f(n) = -abs(n). Got you by one ... :D
UPDATE
No, it is not valid for one case more as I just recognized by litb's comment ... abs(Int.Min) will just overflow ...
I thought about using mod 2 information, too, but concluded, it does not work ... to early. If done right, it will work for all numbers except Int.Min because this will overflow.
UPDATE
I played with it for a while, looking for a nice bit manipulation trick, but I could not find a nice one-liner, while the mod 2 solution fits in one.
f(n) = 2n(abs(n) % 2) - n + sgn(n)
In C#, this becomes the following:
public static Int32 f(Int32 n)
{
return 2 * n * (Math.Abs(n) % 2) - n + Math.Sign(n);
}
To get it working for all values, you have to replace Math.Abs() with (n > 0) ? +n : -n and include the calculation in an unchecked block. Then you get even Int.Min mapped to itself as unchecked negation does.
UPDATE
Inspired by another answer I am going to explain how the function works and how to construct such a function.
Lets start at the very beginning. The function f is repeatedly applied to a given value n yielding a sequence of values.
n => f(n) => f(f(n)) => f(f(f(n))) => f(f(f(f(n)))) => ...
The question demands f(f(n)) = -n, that is two successive applications of f negate the argument. Two further applications of f - four in total - negate the argument again yielding n again.
n => f(n) => -n => f(f(f(n))) => n => f(n) => ...
Now there is a obvious cycle of length four. Substituting x = f(n) and noting that the obtained equation f(f(f(n))) = f(f(x)) = -x holds, yields the following.
n => x => -n => -x => n => ...
So we get a cycle of length four with two numbers and the two numbers negated. If you imagine the cycle as a rectangle, negated values are located at opposite corners.
One of many solution to construct such a cycle is the following starting from n.
n => negate and subtract one
-n - 1 = -(n + 1) => add one
-n => negate and add one
n + 1 => subtract one
n
A concrete example is of such an cycle is +1 => -2 => -1 => +2 => +1. We are almost done. Noting that the constructed cycle contains an odd positive number, its even successor, and both numbers negate, we can easily partition the integers into many such cycles (2^32 is a multiple of four) and have found a function that satisfies the conditions.
But we have a problem with zero. The cycle must contain 0 => x => 0 because zero is negated to itself. And because the cycle states already 0 => x it follows 0 => x => 0 => x. This is only a cycle of length two and x is turned into itself after two applications, not into -x. Luckily there is one case that solves the problem. If X equals zero we obtain a cycle of length one containing only zero and we solved that problem concluding that zero is a fixed point of f.
Done? Almost. We have 2^32 numbers, zero is a fixed point leaving 2^32 - 1 numbers, and we must partition that number into cycles of four numbers. Bad that 2^32 - 1 is not a multiple of four - there will remain three numbers not in any cycle of length four.
I will explain the remaining part of the solution using the smaller set of 3 bit signed itegers ranging from -4 to +3. We are done with zero. We have one complete cycle +1 => -2 => -1 => +2 => +1. Now let us construct the cycle starting at +3.
+3 => -4 => -3 => +4 => +3
The problem that arises is that +4 is not representable as 3 bit integer. We would obtain +4 by negating -3 to +3 - what is still a valid 3 bit integer - but then adding one to +3 (binary 011) yields 100 binary. Interpreted as unsigned integer it is +4 but we have to interpret it as signed integer -4. So actually -4 for this example or Int.MinValue in the general case is a second fixed point of integer arithmetic negation - 0 and Int.MinValue are mapped to themselve. So the cycle is actually as follows.
+3 => -4 => -3 => -4 => -3
It is a cycle of length two and additionally +3 enters the cycle via -4. In consequence -4 is correctly mapped to itself after two function applications, +3 is correctly mapped to -3 after two function applications, but -3 is erroneously mapped to itself after two function applications.
So we constructed a function that works for all integers but one. Can we do better? No, we cannot. Why? We have to construct cycles of length four and are able to cover the whole integer range up to four values. The remaining values are the two fixed points 0 and Int.MinValue that must be mapped to themselves and two arbitrary integers x and -x that must be mapped to each other by two function applications.
To map x to -x and vice versa they must form a four cycle and they must be located at opposite corners of that cycle. In consequence 0 and Int.MinValue have to be at opposite corners, too. This will correctly map x and -x but swap the two fixed points 0 and Int.MinValue after two function applications and leave us with two failing inputs. So it is not possible to construct a function that works for all values, but we have one that works for all values except one and this is the best we can achieve.

Using complex numbers, you can effectively divide the task of negating a number into two steps:
multiply n by i, and you get n*i, which is n rotated 90° counter-clockwise
multiply again by i, and you get -n
The great thing is that you don't need any special handling code. Just multiplying by i does the job.
But you're not allowed to use complex numbers. So you have to somehow create your own imaginary axis, using part of your data range. Since you need exactly as much imaginary (intermediate) values as initial values, you are left with only half the data range.
I tried to visualize this on the following figure, assuming signed 8-bit data. You would have to scale this for 32-bit integers. The allowed range for initial n is -64 to +63.
Here's what the function does for positive n:
If n is in 0..63 (initial range), the function call adds 64, mapping n to the range 64..127 (intermediate range)
If n is in 64..127 (intermediate range), the function subtracts n from 64, mapping n to the range 0..-63
For negative n, the function uses the intermediate range -65..-128.

Works except int.MaxValue and int.MinValue
public static int f(int x)
{
if (x == 0) return 0;
if ((x % 2) != 0)
return x * -1 + (-1 *x) / (Math.Abs(x));
else
return x - x / (Math.Abs(x));
}

The question doesn't say anything about what the input type and return value of the function f have to be (at least not the way you've presented it)...
...just that when n is a 32-bit integer then f(f(n)) = -n
So, how about something like
Int64 f(Int64 n)
{
return(n > Int32.MaxValue ?
-(n - 4L * Int32.MaxValue):
n + 4L * Int32.MaxValue);
}
If n is a 32-bit integer then the statement f(f(n)) == -n will be true.
Obviously, this approach could be extended to work for an even wider range of numbers...

for javascript (or other dynamically typed languages) you can have the function accept either an int or an object and return the other. i.e.
function f(n) {
if (n.passed) {
return -n.val;
} else {
return {val:n, passed:1};
}
}
giving
js> f(f(10))
-10
js> f(f(-10))
10
alternatively you could use overloading in a strongly typed language although that may break the rules ie
int f(long n) {
return n;
}
long f(int n) {
return -n;
}

Depending on your platform, some languages allow you to keep state in the function. VB.Net, for example:
Function f(ByVal n As Integer) As Integer
Static flag As Integer = -1
flag *= -1
Return n * flag
End Function
IIRC, C++ allowed this as well. I suspect they're looking for a different solution though.
Another idea is that since they didn't define the result of the first call to the function you could use odd/evenness to control whether to invert the sign:
int f(int n)
{
int sign = n>=0?1:-1;
if (abs(n)%2 == 0)
return ((abs(n)+1)*sign * -1;
else
return (abs(n)-1)*sign;
}
Add one to the magnitude of all even numbers, subtract one from the magnitude of all odd numbers. The result of two calls has the same magnitude, but the one call where it's even we swap the sign. There are some cases where this won't work (-1, max or min int), but it works a lot better than anything else suggested so far.

Exploiting JavaScript exceptions.
function f(n) {
try {
return n();
}
catch(e) {
return function() { return -n; };
}
}
f(f(0)) => 0
f(f(1)) => -1

For all 32-bit values (with the caveat that -0 is -2147483648)
int rotate(int x)
{
static const int split = INT_MAX / 2 + 1;
static const int negativeSplit = INT_MIN / 2 + 1;
if (x == INT_MAX)
return INT_MIN;
if (x == INT_MIN)
return x + 1;
if (x >= split)
return x + 1 - INT_MIN;
if (x >= 0)
return INT_MAX - x;
if (x >= negativeSplit)
return INT_MIN - x + 1;
return split -(negativeSplit - x);
}
You basically need to pair each -x => x => -x loop with a y => -y => y loop. So I paired up opposite sides of the split.
e.g. For 4 bit integers:
0 => 7 => -8 => -7 => 0
1 => 6 => -1 => -6 => 1
2 => 5 => -2 => -5 => 2
3 => 4 => -3 => -4 => 3

A C++ version, probably bending the rules somewhat but works for all numeric types (floats, ints, doubles) and even class types that overload the unary minus:
template <class T>
struct f_result
{
T value;
};
template <class T>
f_result <T> f (T n)
{
f_result <T> result = {n};
return result;
}
template <class T>
T f (f_result <T> n)
{
return -n.value;
}
void main (void)
{
int n = 45;
cout << "f(f(" << n << ")) = " << f(f(n)) << endl;
float p = 3.14f;
cout << "f(f(" << p << ")) = " << f(f(p)) << endl;
}

x86 asm (AT&T style):
; input %edi
; output %eax
; clobbered regs: %ecx, %edx
f:
testl %edi, %edi
je .zero
movl %edi, %eax
movl $1, %ecx
movl %edi, %edx
andl $1, %eax
addl %eax, %eax
subl %eax, %ecx
xorl %eax, %eax
testl %edi, %edi
setg %al
shrl $31, %edx
subl %edx, %eax
imull %ecx, %eax
subl %eax, %edi
movl %edi, %eax
imull %ecx, %eax
.zero:
xorl %eax, %eax
ret
Code checked, all possible 32bit integers passed, error with -2147483647 (underflow).

Uses globals...but so?
bool done = false
f(int n)
{
int out = n;
if(!done)
{
out = n * -1;
done = true;
}
return out;
}

This Perl solution works for integers, floats, and strings.
sub f {
my $n = shift;
return ref($n) ? -$$n : \$n;
}
Try some test data.
print $_, ' ', f(f($_)), "\n" for -2, 0, 1, 1.1, -3.3, 'foo' '-bar';
Output:
-2 2
0 0
1 -1
1.1 -1.1
-3.3 3.3
foo -foo
-bar +bar

Nobody ever said f(x) had to be the same type.
def f(x):
if type(x) == list:
return -x[0]
return [x]
f(2) => [2]
f(f(2)) => -2

I'm not actually trying to give a solution to the problem itself, but do have a couple of comments, as the question states this problem was posed was part of a (job?) interview:
I would first ask "Why would such a function be needed? What is the bigger problem this is part of?" instead of trying to solve the actual posed problem on the spot. This shows how I think and how I tackle problems like this. Who know? That might even be the actual reason the question is asked in an interview in the first place. If the answer is "Never you mind, assume it's needed, and show me how you would design this function." I would then continue to do so.
Then, I would write the C# test case code I would use (the obvious: loop from int.MinValue to int.MaxValue, and for each n in that range call f(f(n)) and checking the result is -n), telling I would then use Test Driven Development to get to such a function.
Only if the interviewer continues asking for me to solve the posed problem would I actually start to try and scribble pseudocode during the interview itself to try and get to some sort of an answer. However, I don't really think I would be jumping to take the job if the interviewer would be any indication of what the company is like...
Oh, this answer assumes the interview was for a C# programming related position. Would of course be a silly answer if the interview was for a math related position. ;-)

I would you change the 2 most significant bits.
00.... => 01.... => 10.....
01.... => 10.... => 11.....
10.... => 11.... => 00.....
11.... => 00.... => 01.....
As you can see, it's just an addition, leaving out the carried bit.
How did I got to the answer? My first thought was just a need for symmetry. 4 turns to get back where I started. At first I thought, that's 2bits Gray code. Then I thought actually standard binary is enough.

Here is a solution that is inspired by the requirement or claim that complex numbers can not be used to solve this problem.
Multiplying by the square root of -1 is an idea, that only seems to fail because -1 does not have a square root over the integers. But playing around with a program like mathematica gives for example the equation
(18494364652+1) mod (232-3) = 0.
and this is almost as good as having a square root of -1. The result of the function needs to be a signed integer. Hence I'm going to use a modified modulo operation mods(x,n) that returns the integer y congruent to x modulo n that is closest to 0. Only very few programming languages have suc a modulo operation, but it can easily be defined. E.g. in python it is:
def mods(x, n):
y = x % n
if y > n/2: y-= n
return y
Using the equation above, the problem can now be solved as
def f(x):
return mods(x*1849436465, 2**32-3)
This satisfies f(f(x)) = -x for all integers in the range [-231-2, 231-2]. The results of f(x) are also in this range, but of course the computation would need 64-bit integers.

C# for a range of 2^32 - 1 numbers, all int32 numbers except (Int32.MinValue)
Func<int, int> f = n =>
n < 0
? (n & (1 << 30)) == (1 << 30) ? (n ^ (1 << 30)) : - (n | (1 << 30))
: (n & (1 << 30)) == (1 << 30) ? -(n ^ (1 << 30)) : (n | (1 << 30));
Console.WriteLine(f(f(Int32.MinValue + 1))); // -2147483648 + 1
for (int i = -3; i <= 3 ; i++)
Console.WriteLine(f(f(i)));
Console.WriteLine(f(f(Int32.MaxValue))); // 2147483647
prints:
2147483647
3
2
1
0
-1
-2
-3
-2147483647

Essentially the function has to divide the available range into cycles of size 4, with -n at the opposite end of n's cycle. However, 0 must be part of a cycle of size 1, because otherwise 0->x->0->x != -x. Because of 0 being alone, there must be 3 other values in our range (whose size is a multiple of 4) not in a proper cycle with 4 elements.
I chose these extra weird values to be MIN_INT, MAX_INT, and MIN_INT+1. Furthermore, MIN_INT+1 will map to MAX_INT correctly, but get stuck there and not map back. I think this is the best compromise, because it has the nice property of only the extreme values not working correctly. Also, it means it would work for all BigInts.
int f(int n):
if n == 0 or n == MIN_INT or n == MAX_INT: return n
return ((Math.abs(n) mod 2) * 2 - 1) * n + Math.sign(n)

Nobody said it had to be stateless.
int32 f(int32 x) {
static bool idempotent = false;
if (!idempotent) {
idempotent = true;
return -x;
} else {
return x;
}
}
Cheating, but not as much as a lot of the examples. Even more evil would be to peek up the stack to see if your caller's address is &f, but this is going to be more portable (although not thread safe... the thread-safe version would use TLS). Even more evil:
int32 f (int32 x) {
static int32 answer = -x;
return answer;
}
Of course, neither of these works too well for the case of MIN_INT32, but there is precious little you can do about that unless you are allowed to return a wider type.

I could imagine using the 31st bit as an imaginary (i) bit would be an approach that would support half the total range.

works for n= [0 .. 2^31-1]
int f(int n) {
if (n & (1 << 31)) // highest bit set?
return -(n & ~(1 << 31)); // return negative of original n
else
return n | (1 << 31); // return n with highest bit set
}

The problem states "32-bit signed integers" but doesn't specify whether they are twos-complement or ones-complement.
If you use ones-complement then all 2^32 values occur in cycles of length four - you don't need a special case for zero, and you also don't need conditionals.
In C:
int32_t f(int32_t x)
{
return (((x & 0xFFFFU) << 16) | ((x & 0xFFFF0000U) >> 16)) ^ 0xFFFFU;
}
This works by
Exchanging the high and low 16-bit blocks
Inverting one of the blocks
After two passes we have the bitwise inverse of the original value. Which in ones-complement representation is equivalent to negation.
Examples:
Pass | x
-----+-------------------
0 | 00000001 (+1)
1 | 0001FFFF (+131071)
2 | FFFFFFFE (-1)
3 | FFFE0000 (-131071)
4 | 00000001 (+1)
Pass | x
-----+-------------------
0 | 00000000 (+0)
1 | 0000FFFF (+65535)
2 | FFFFFFFF (-0)
3 | FFFF0000 (-65535)
4 | 00000000 (+0)

:D
boolean inner = true;
int f(int input) {
if(inner) {
inner = false;
return input;
} else {
inner = true;
return -input;
}
}

return x ^ ((x%2) ? 1 : -INT_MAX);

I'd like to share my point of view on this interesting problem as a mathematician. I think I have the most efficient solution.
If I remember correctly, you negate a signed 32-bit integer by just flipping the first bit. For example, if n = 1001 1101 1110 1011 1110 0000 1110 1010, then -n = 0001 1101 1110 1011 1110 0000 1110 1010.
So how do we define a function f that takes a signed 32-bit integer and returns another signed 32-bit integer with the property that taking f twice is the same as flipping the first bit?
Let me rephrase the question without mentioning arithmetic concepts like integers.
How do we define a function f that takes a sequence of zeros and ones of length 32 and returns a sequence of zeros and ones of the same length, with the property that taking f twice is the same as flipping the first bit?
Observation: If you can answer the above question for 32 bit case, then you can also answer for 64 bit case, 100 bit case, etc. You just apply f to the first 32 bit.
Now if you can answer the question for 2 bit case, Voila!
And yes it turns out that changing the first 2 bits is enough.
Here's the pseudo-code
1. take n, which is a signed 32-bit integer.
2. swap the first bit and the second bit.
3. flip the first bit.
4. return the result.
Remark: The step 2 and the step 3 together can be summerised as (a,b) --> (-b, a). Looks familiar? That should remind you of the 90 degree rotation of the plane and the multiplication by the squar root of -1.
If I just presented the pseudo-code alone without the long prelude, it would seem like a rabbit out of the hat, I wanted to explain how I got the solution.

Develop Reference

r css asp.net wordpress firebase qt symfony nginx http apache-flex

Subtraction assembly with "base 10" EMU8086 - math

Related

Floating point data format sign+exponent

how to encode 27 vector3's into a 0-256 value?

Arduino: Formula to convert byte

Divide by 10 using bit shifts?

Designing function f(f(n)) == -n

Categories

Resources