I'm wondering whether there are any standard approaches to reversing AND routines by brute force.
For example I have the following transformation:
MOV(eax, 0x5b3e0be0) # Here we move 0x5b3e0be0 to EAX.
MOV(edx, eax) # Here we copy 0x5b3e0be0 to EDX as well.
SHL(edx, 0x7) # Bitshift 0x5b3e0be0 with 0x7 which results in 0x9f05f000
AND(edx, 0x9d2c5680) # AND 0x9f05f000 with 0x9d2c5680 which results in 0x9d045000
XOR(edx, eax) # XOR 0x9d045000 with original value 0x5b3e0be0 which results in 0xc63a5be0
My question is how to brute force and reverse this routine (i.e. transform 0xc63a5be0 back into 0x5b3e0be0).
One idea I had (which didn't work) was this, using a PeachPy implementation:
#Input values
MOV(esi, 0xffffffff) # Initial value to AND with, which will be decreased by 1 in a loop.
MOV(cl, 0x1) # Initial value to SHR with, which will be increased by 1 until 0x1f.
MOV(eax, 0xc63a5be0) # Target result which I'm looking to get using the loop below.
MOV(edx, 0x5b3e0be0) # Input value which will be transformed.
sub_esi = peachpy.x86_64.Label()
with loop:
#End the loop if ESI = 0x0
TEST(esi, esi)
JZ(loop.end)
#Test the routine and check if it matches end result.
MOV(ebx, eax)
SHR(ebx, cl)
TEST(ebx, ebx)
JZ(sub_esi)
AND(ebx, esi)
XOR(ebx, eax)
CMP(ebx, edx)
JZ(loop.end)
#Add to the CL register which is used for SHR.
#Also check if we've reached the last potential value of CL which is 0x1f
ADD(cl, 0x1)
CMP(cl, 0x1f)
JNZ(loop.begin)
#Decrement ESI by 1, reset CL and restart routine.
peachpy.x86_64.LABEL(sub_esi)
SUB(esi, 0x1)
MOV(cl, 0x1)
JMP(loop.begin)
#The ESI result here will either be 0x0 or a valid value to AND with and get the necessary result.
RETURN(esi)
Maybe an article or a book you can recommend specific to this?
It's not lossy, the final operation is an XOR.
The whole routine can be modeled in C as
#define K 0x9d2c5680
uint32_t hash(uint32_t num)
{
return num ^ ( (num << 7) & K);
}
Now, if we have two bits x and y and the operation x XOR y, when y is zero the result is x.
So given two numbers n1 and n2 and considering their XOR, the bits of n1 that pair with a zero in n2 make it to the result unchanged (the others are flipped).
So in considering num ^ ( (num << 7) & K) we can identify num with n1 and (num << 7) & K with n2.
Since n2 is an AND, we can tell that it must have at least the same zero bits that K has.
This means that each bit of num that corresponds to a zero bit in the constant K will make it unchanged into the result.
Thus, by extracting those bits from the result we already have a partial inverse function:
/*hash & ~K extracts the bits of hash that pair with a zero bit in K*/
partial_num = hash & ~K
Technically, the factor num << 7 would also introduce other zeros in the result of the AND. We know for sure that the lowest 7 bits must be zero.
However K already has the lowest 7 bits zero, so we cannot exploit this information.
So we will just use K here, but if its value were different you'd need to consider the AND (which, in practice, means to zero the lower 7 bits of K).
This leaves us with 13 bits unknown (the ones corresponding to the bits that are set in K).
If we forget about the AND for a moment, we would have x ^ (x << 7) meaning that
h_i = num_i for i from 0 to 6 inclusive
h_i = num_i ^ num_(i-7) for i from 7 to 31 inclusive
(The first line holds because the lower 7 bits of the right-hand operand are zero.)
From this, starting from h_7 and going up, we can retrieve num_7 as h_7 ^ num_0 = h_7 ^ h_0.
For k from 7 onward, the equality num_k = h_k no longer holds, so from h_14 upward we need the actual num_k (for the suitable k); luckily we have already computed its value in a previous step (that's why we go from lower to higher bits).
What the AND does to this is just restricting the values the index i runs in, specifically only to the bits that are set in K.
So to fill in the thirteen remaining bits one has to do:
part_num_7 = h_7 ^ part_num_0
part_num_9 = h_9 ^ part_num_2
part_num_10 = h_10 ^ part_num_3
part_num_12 = h_12 ^ part_num_5
...
part_num_31 = h_31 ^ part_num_24
Note that we exploited the fact that part_num_0..6 = h_0..6.
Here's a C program that inverts the function:
#include <stdio.h>
#include <stdint.h>
#define BIT(i, hash, result) ( (((result >> i) ^ (hash >> (i+7))) & 0x1) << (i+7) )
#define K 0x9d2c5680
uint32_t base_candidate(uint32_t hash)
{
uint32_t result = hash & ~K;
result |= BIT(0, hash, result);
result |= BIT(2, hash, result);
result |= BIT(3, hash, result);
result |= BIT(5, hash, result);
result |= BIT(7, hash, result);
result |= BIT(11, hash, result);
result |= BIT(12, hash, result);
result |= BIT(14, hash, result);
result |= BIT(17, hash, result);
result |= BIT(19, hash, result);
result |= BIT(20, hash, result);
result |= BIT(21, hash, result);
result |= BIT(24, hash, result);
return result;
}
uint32_t hash(uint32_t num)
{
return num ^ ( (num << 7) & K);
}
int main()
{
uint32_t tester = 0x5b3e0be0;
uint32_t candidate = base_candidate(hash(tester));
printf("candidate: %x, tester %x\n", candidate, tester);
return 0;
}
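For comparison, the same low-to-high recovery can be written as a short fixed-point iteration instead of thirteen explicit bit steps. This is only a sketch in Python: the names hash32 and unhash32 are made up for illustration, and the pass count of 4 assumes the shift of 7 used here (7 low bits are known from the start and every pass fixes at least 7 more, so 4 passes cover all 32 bits).

```python
K = 0x9D2C5680
MASK32 = 0xFFFFFFFF

def hash32(num):
    """Forward transform: num ^ ((num << 7) & K), truncated to 32 bits."""
    return (num ^ ((num << 7) & K)) & MASK32

def unhash32(h):
    """Recover num from h by refining a guess from the low bits upward."""
    x = h  # bits 0..6 of h already equal bits 0..6 of num
    for _ in range(4):
        # Every pass makes at least 7 more low bits of x correct.
        x = (h ^ ((x << 7) & K)) & MASK32
    return x

print(hex(hash32(0x5B3E0BE0)))    # 0xc63a5be0
print(hex(unhash32(0xC63A5BE0)))  # 0x5b3e0be0
```

Because every hash value has exactly one preimage under this scheme, no search over candidates is needed.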
Since the original question was how to "bruteforce" rather than solve, here's something I eventually came up with that works just as well. Obviously it's prone to errors depending on the input (there might be more than one result).
import peachpy.x86_64
from peachpy import *
from peachpy.x86_64 import *
input = 0xc63a5be0
x = Argument(uint32_t)
with Function("DotProduct", (x,), uint32_t) as asm_function:
    LOAD.ARGUMENT(edx, x) # EDX = target value (0xc63a5be0 for the input above)
    MOV(esi, 0xffffffff)
    with Loop() as loop:
        TEST(esi,esi)
        JZ(loop.end)
        MOV(eax, esi)
        SHL(eax, 0x7)
        AND(eax, 0x9d2c5680)
        XOR(eax, esi)
        CMP(eax, edx)
        JZ(loop.end)
        SUB(esi, 0x1)
        JMP(loop.begin)
    RETURN(esi)
#Read Assembler Return
abi = peachpy.x86_64.abi.detect()
encoded_function = asm_function.finalize(abi).encode()
python_function = encoded_function.load()
print(hex(python_function(input)))
It's about this dynamic programming challenge.
If you have a hard time understanding the problem, see also AbhishekVermaIIT's post.
Basically, you get an array B as input and you construct an array A. For this array A you need the maximum possible sum of absolute(A[i] - A[i-1]), for i = 1 to N. How do you construct array A? You can choose for every element A[i] either the value 1 or B[i]. (As you will deduce from the problem description, any other value between these two doesn't make sense.)
And I came up with this recursive Java solution (without memoization):
static int costHelper(int[] arr, int i) {
if (i < 1) return 0;
int q = max(abs(1 - arr[i-1]) + costHelper(arr, i-1) , abs(arr[i] - arr[i-1]) + costHelper(arr, i-1));
int[] arr1 = new int[i];
for (int j = 0; j < arr1.length-1; j++) {
arr1[j] = arr[j];
}
arr1[i-1] = 1;
int r = max(abs(1 - 1) + costHelper(arr1, i-1) , abs(arr[i] - 1) + costHelper(arr1, i-1));
return max(q , r);
}
static int cost(int[] arr) {
return costHelper(arr, arr.length-1);
}
public static void main(String[] args) {
int[] arr = {55, 68, 31, 80, 57, 18, 34, 28, 76, 55};
int result = cost(arr);
System.out.println(result);
}
Basically, I start at the end of the array and check what is maximizing the sum of the last element minus last element - 1. But I have 4 cases:
(1 - arr[i-1])
(arr[i] - arr[i-1])
(1 - 1) // I know, it is not necessary.
(arr[i] -1)
For the 3rd or 4th case I construct a new array one element smaller in size than the input array and with a 1 as the last element.
Now, the result of arr = 55 68 31 80 57 18 34 28 76 55 according to Hackerrank should be 508. But I get 564.
Since it has to be 508 I guess the array should be 1 68 1 80 1 1 34 1 76 1.
For other arrays I get the right answer. For example:
79 6 40 68 68 16 40 63 93 49 91 --> 642 (OK)
100 2 100 2 100 --> 396 (OK)
I don't understand what is wrong with this algorithm.
I'm not sure exactly what's happening with your particular solution but I suspect it might be that the recursive function only has one dimension, i, since we need a way to identify the best previous solution, f(i-1), both if B_(i-1) was chosen and if 1 was chosen at that point, so we can choose the best among them vis-a-vis f(i). (It might help if you could add a description of your algorithm in words.)
Let's look at the brute-force dynamic program: let m[i][j1] represent the best sum-of-abs-diff in A[0..i] when A_i is j1. Then, generally:
m[i][j1] = max(abs(j1 - j0) + m[i-1][j0])
for j0 in [1..B_(i-1)] and j1 in [1..B_i]
Python code:
def cost(arr):
    if len(arr) == 1:
        return 0
    m = [[float('-inf')]*101 for i in xrange(len(arr))]
    for i in xrange(1, len(arr)):
        for j0 in xrange(1, arr[i-1] + 1):
            for j1 in xrange(1, arr[i] + 1):
                m[i][j1] = max(m[i][j1], abs(j1 - j0) + (m[i-1][j0] if i > 1 else 0))
    return max(m[len(arr) - 1])
That works but times out since we are looping potentially 100*100*10^5 iterations.
I haven't thought through the proof for it, but, as you suggest, apparently we can choose only from either 1 or B_i for each A_i for an optimal solution. This allows us to choose between those directly in a significantly more efficient solution that won't time out:
def cost(arr):
    if len(arr) == 1:
        return 0
    m = [[float('-inf')]*2 for i in xrange(len(arr))]
    for i in xrange(1, len(arr)):
        for j0 in [1, arr[i-1]]:
            for j1 in [1, arr[i]]:
                a_i = 0 if j1 == 1 else 1
                b_i = 0 if j0 == 1 else 1
                m[i][a_i] = max(m[i][a_i], abs(j1 - j0) + (m[i-1][b_i] if i > 1 else 0))
    return max(m[len(arr) - 1])
This is a bottom-up tabulation but we could easily convert it to a recursive one using the same idea.
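Since only the previous row of the table is ever read, the two-state idea can also be sketched with two running values. This is a Python 3 sketch, not the answer's original code: max_cost, low and high are illustrative names, and it relies on the same unproven assumption that each A[i] is either 1 or B[i].

```python
def max_cost(b):
    """Maximum sum of |A[i] - A[i-1]| when each A[i] is either 1 or b[i]."""
    low = high = 0  # best sums so far with A[i] = 1 and with A[i] = b[i]
    for i in range(1, len(b)):
        new_low = max(low, high + abs(1 - b[i - 1]))
        new_high = max(low + abs(b[i] - 1), high + abs(b[i] - b[i - 1]))
        low, high = new_low, new_high
    return max(low, high)

print(max_cost([55, 68, 31, 80, 57, 18, 34, 28, 76, 55]))  # 508
print(max_cost([100, 2, 100, 2, 100]))                     # 396
```

The first call reproduces the 508 that the question expects for its failing test array.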
Here is the JavaScript code with memoization:
function cost(B,n,val) {
if(n==-1){
return 0;
}
let prev1=0,prev2=0;
if(n!=0){
if(dp[n-1][0]==-1)
dp[n-1][0] = cost(B,n-1,1);
if(dp[n-1][1]==-1)
dp[n-1][1] = cost(B,n-1,B[n]);
prev1=dp[n-1][0];
prev2=dp[n-1][1];
}
prev1 = prev1 + Math.abs(val-1);
prev2 = prev2+ Math.abs(val-B[n]);
return Math.max(prev1,prev2);
}
where B is the given array, n is the total length, and val is 1 or B[n], the value considered by the calling function.
Initial call: Math.max(cost(B,n-2,1),cost(B,n-2,B[n-1]));
BTW, this took me around 3 hours; I could have done it much more easily with the iterative method. :p
//dp[][0] is when a[i]=b[i]
dp[i][0]=max((dp[i-1][0]+abs(b[i]-b[i-1])),(dp[i-1][1]+abs(b[i]-1)));
dp[i][1]=max((dp[i-1][1]+abs(1-1)),(dp[i-1][0]+abs(b[i-1]-1)));
Initially all the elements in dp have the value of 0.
We know that we will get the answer if at any i the value is b[i] or 1. So the final answer is :
max(dp[n-1][0],dp[n-1][1])
dp[i][0] signifies a[i]=b[i] and dp[i][1] signifies a[i]=1.
So at every i we want the maximum of [i-1][0] (previous element is b[i-1]) or [i-1][1] (previous element is 1)
Given XOR & SUM of two numbers. How to find the numbers?
For example, x = a+b, y = a^b; if x,y are given, how to get a, b?
And if can't, give the reason.
This cannot be done reliably. A single counter-example is enough to destroy any theory and, in your case, that example is 0, 100 and 4, 96. Both of these sum to 100 and xor to 100 as well:
  0 = 0000 0000          4 = 0000 0100
100 = 0110 0100         96 = 0110 0000
      ---- ----              ---- ----
xor   0110 0100 = 100  xor   0110 0100 = 100
Hence given a sum of 100 and an xor of 100, you cannot know which of the possibilities generated that situation.
For what it's worth, this program checks the possibilities with just the numbers 0..255:
#include <stdio.h>
static void output (unsigned int a, unsigned int b) {
printf ("%u:%u = %u %u\n", a+b, a^b, a, b);
}
int main (void) {
unsigned int limit = 256;
unsigned int a, b;
output (0, 0);
for (b = 1; b != limit; b++)
output (0, b);
for (a = 1; a != limit; a++)
for (b = 1; b != limit; b++)
output (a, b);
return 0;
}
You can then take that output and massage it to give you all the repeated possibilities:
testprog | sed 's/ =.*$//' | sort | uniq -c | grep -v ' 1 ' | sort -k1 -n -r
which gives:
255 255:255
128 383:127
128 319:191
128 287:223
128 271:239
128 263:247
:
and so on.
Even in that reduced set, there are quite a few combinations which generate the same sum and xor, the worst being the large number of possibilities that generate a sum/xor of 255/255, which are:
255:255 = 0 255
255:255 = 1 254
255:255 = 2 253
255:255 = <n> <255-n>, for n = 3 thru 254 inclusive
It has already been shown that it can't be done, but here are two further reasons why.
For the (rather large) subset of a's and b's (a & b) == 0, you have a + b == (a ^ b) (because there can be no carries) (the reverse implication does not hold). In such a case, you can, for each bit that is 1 in the sum, choose which one of a or b contributed that bit. Obviously this subset does not cover the entire input, but it at least proves that it can't be done in general.
Furthermore, there exist many pairs of (x, y) such that there is no solution to a + b == x && (a ^ b) == y, for example (there are more than just these) all pairs (x, y) where ((x ^ y) & 1) == 1 (ie one is odd and the other is even), because the lowest bit of the xor and the sum are equal (the lowest bit has no carry-in). By a simple counting-argument, that must mean that at least some pairs (x, y) must have multiple solutions: clearly all pairs of (a, b) have some pair of (x, y) associated with them, so if not all pairs of (x, y) can be used, some other pairs (x, y) must be shared.
Here is the solution to get all such pairs
Logic:
let the numbers be a and b, we know
s = a + b
x = a ^ b
therefore
x = (s-b) ^ b
Since we know x and s, for every integer b from 0 to s we just check whether this last equation is satisfied.
Here is the code for this:
public List<Pair<Integer>> pairs(int s, int x) {
List<Pair<Integer>> pairs = new ArrayList<Pair<Integer>>();
for (int i = 0; i <= s; i++) {
int calc = (s - i) ^ i;
if (calc == x) {
pairs.add(new Pair<Integer>(i, s - i));
}
}
return pairs;
}
Class pair is defined as
class Pair<T> {
T a;
T b;
public String toString() {
return a.toString() + "," + b.toString();
}
public Pair(T a, T b) {
this.a = a;
this.b = b;
}
}
Code to test this:
public static void main(String[] args) {
List<Pair<Integer>> pairs = new Test().pairs(100,100);
for (Pair<Integer> p : pairs) {
System.out.println(p);
}
}
Output:
0,100
4,96
32,68
36,64
64,36
68,32
96,4
100,0
If you have a and b, then sum = a+b = (a^b) + (a&b)*2. This equation may be useful for you.
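That identity can be turned into a direct enumeration that avoids scanning every value from 0 to s. The sketch below is a Python illustration under the assumption that s and x are non-negative; sum_xor_pairs and its local names are hypothetical. The bits of a & b are forced by (s - x) / 2, and each set bit of x can go to either a or b.

```python
def sum_xor_pairs(s, x):
    """Return every ordered pair (a, b) of non-negative ints with a + b == s and a ^ b == x."""
    if s < x or (s - x) % 2:
        return []              # 2 * (a & b) must be a non-negative even number
    both = (s - x) // 2        # a & b: the bits set in both numbers
    if both & x:
        return []              # a bit cannot be shared and differ at the same time
    free_bits = [1 << i for i in range(x.bit_length()) if (x >> i) & 1]
    pairs = []
    for choice in range(1 << len(free_bits)):
        # Pick which of the differing bits belong to a; the rest belong to b.
        a_only = sum(bit for j, bit in enumerate(free_bits) if (choice >> j) & 1)
        pairs.append((both | a_only, both | (x ^ a_only)))
    return sorted(pairs)

print(sum_xor_pairs(100, 100))
# [(0, 100), (4, 96), (32, 68), (36, 64), (64, 36), (68, 32), (96, 4), (100, 0)]
print(sum_xor_pairs(3, 2))  # [] because the sum and the xor differ in parity
```

The number of solutions is either zero or a power of two, which is another way to see why a unique answer cannot be expected in general.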
Suppose we are given an unsigned integer x. Without using any arithmetic operators, i.e. + - / * or %, we are to find x mod 15. We may use binary bit manipulations.
As far as I could go, I got this based on 2 points.
a = a mod 15 = a mod 16 for a<15
Let a = x mod 15
then a = x - 15k (for some non-negative k).
ie a = x - 16k + k...
ie a mod 16 = ( x mod 16 + k mod 16 ) mod 16
ie a mod 15 = ( x mod 16 + k mod 16 ) mod 16
ie a = ( x mod 16 + k mod 16 ) mod 16
OK. Now to implement this. A mod 16 operation is basically & 0xF, and k is basically x>>4.
So a = ( (x & 0xF) + ((x>>4) & 0xF) ) & 0xF.
It boils down to adding 2 4-bit numbers. Which can be done by bit expressions.
sum[0] = a[0] ^ b[0]
sum[1] = a[1] ^ b[1] ^ (a[0] & b[0])
...
and so on
This seems like cheating to me. I'm hoping for a more elegant solution
This reminds me of an old trick from base 10 called "casting out the 9s". This was used for checking the result of large sums performed by hand.
In this case 123 mod 9 = 1 + 2 + 3 mod 9 = 6.
This happens because 9 is one less than the base of the digits (10). (Proof omitted ;) )
So considering the number in base 16 (Hex). you should be able to do:
0xABCDE123 mod 0xF = (0xA + 0xB + 0xC + 0xD + 0xE + 0x1 + 0x2 + 0x3 ) mod 0xF
= 0x42 mod 0xF
= 0x6
Now you'll still need to do some magic to make the additions disappear. But it gives the right answer.
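One way to make the additions disappear is to build the adder from XOR, AND and shifts. The following Python sketch assumes non-negative input and that loops are allowed; bit_add and mod15 are illustrative names, not part of the answer above.

```python
def bit_add(a, b):
    """Add two non-negative integers using only bitwise operators."""
    while b:
        carry = (a & b) << 1
        a ^= b
        b = carry
    return a

def mod15(x):
    """x mod 15 for non-negative x, by repeatedly summing hexadecimal digits."""
    while x >> 4:                 # more than one hex digit left
        total = 0
        while x:
            total = bit_add(total, x & 0xF)
            x >>= 4
        x = total
    return 0 if x == 0xF else x   # a lone digit 15 is congruent to 0

print(mod15(0xABCDE123))  # 6
```

The outer loop is needed because the digit sum itself can have more than one hex digit, as with 0x42 in the example.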
UPDATE:
Here's a complete implementation in C++. The f lookup table maps pairs of hex digits to their sum mod 15 (which is the same as the byte mod 15). We then repack these results and reapply the table to half as much data each round.
#include <cstdint>
#include <iostream>
uint8_t f[256]={
0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,0,
1,2,3,4,5,6,7,8,9,10,11,12,13,14,0,1,
2,3,4,5,6,7,8,9,10,11,12,13,14,0,1,2,
3,4,5,6,7,8,9,10,11,12,13,14,0,1,2,3,
4,5,6,7,8,9,10,11,12,13,14,0,1,2,3,4,
5,6,7,8,9,10,11,12,13,14,0,1,2,3,4,5,
6,7,8,9,10,11,12,13,14,0,1,2,3,4,5,6,
7,8,9,10,11,12,13,14,0,1,2,3,4,5,6,7,
8,9,10,11,12,13,14,0,1,2,3,4,5,6,7,8,
9,10,11,12,13,14,0,1,2,3,4,5,6,7,8,9,
10,11,12,13,14,0,1,2,3,4,5,6,7,8,9,10,
11,12,13,14,0,1,2,3,4,5,6,7,8,9,10,11,
12,13,14,0,1,2,3,4,5,6,7,8,9,10,11,12,
13,14,0,1,2,3,4,5,6,7,8,9,10,11,12,13,
14,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,
0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,0};
uint64_t mod15( uint64_t in_v )
{
uint8_t * in = (uint8_t*)&in_v;
// 12 34 56 78 12 34 56 78 => aa bb cc dd
in[0] = f[in[0]] | (f[in[1]]<<4);
in[1] = f[in[2]] | (f[in[3]]<<4);
in[2] = f[in[4]] | (f[in[5]]<<4);
in[3] = f[in[6]] | (f[in[7]]<<4);
// aa bb cc dd => AA BB
in[0] = f[in[0]] | (f[in[1]]<<4);
in[1] = f[in[2]] | (f[in[3]]<<4);
// AA BB => DD
in[0] = f[in[0]] | (f[in[1]]<<4);
// DD => D
return f[in[0]];
}
int main()
{
uint64_t x = 12313231;
std::cout<< mod15(x)<<" "<< (x%15)<<std::endl;
}
Your logic is flawed somewhere, but I couldn't put a finger on it at first. Think about it yourself: your final formula operates on the first 8 bits and ignores the rest. That could only be valid if the part you throw away (bits 9 and up) were always a multiple of 15. In reality (in binary numbers) bits 9 and up are always multiples of 16, not 15. For example, try putting 1 0000 0000 and 11 0000 0000 into your formula. Your formula will give 0 as the result in both cases, while in reality the answers are 1 and 3.
In essence, I'm almost sure that your task cannot be solved without loops. And if you are allowed to use loops, then nothing is easier than implementing a bitwiseAdd function and doing whatever you like with it.
Added:
Found your problem. Here it is:
... a = x - 15k (for some non-negative k).
... and k is basically x>>4
It equals x>>4 only by pure coincidence for some numbers. Take any big example, for instance x=11110000. By your calculation k = 15, while in reality it is k=16: 16*15 = 11110000.
A question I got on my last interview:
Design a function f, such that:
f(f(n)) == -n
Where n is a 32 bit signed integer; you can't use complex numbers arithmetic.
If you can't design such a function for the whole range of numbers, design it for the largest range possible.
Any ideas?
You didn't say what kind of language they expected... Here's a static solution (Haskell). It's basically messing with the 2 most significant bits:
import Data.Bits (testBit, complementBit)
f :: Int -> Int
f x | (testBit x 30 /= testBit x 31) = negate $ complementBit x 30
    | otherwise = complementBit x 30
It's much easier in a dynamic language (Python). Just check if the argument is a number X and return a lambda that returns -X:
def f(x):
    if isinstance(x,int):
        return (lambda: -x)
    else:
        return x()
How about:
f(n) = sign(n) - (-1)ⁿ * n
In Python:
def f(n):
    if n == 0: return 0
    if n >= 0:
        if n % 2 == 1:
            return n + 1
        else:
            return -1 * (n - 1)
    else:
        if n % 2 == 1:
            return n - 1
        else:
            return -1 * (n + 1)
Python automatically promotes integers to arbitrary length longs. In other languages the largest positive integer will overflow, so it will work for all integers except that one.
To make it work for real numbers you need to replace the n in (-1)ⁿ with { ceiling(n) if n>0; floor(n) if n<0 }.
In C# (works for any double, except in overflow situations):
static double F(double n)
{
if (n == 0) return 0;
if (n < 0)
return ((long)Math.Ceiling(n) % 2 == 0) ? (n + 1) : (-1 * (n - 1));
else
return ((long)Math.Floor(n) % 2 == 0) ? (n - 1) : (-1 * (n + 1));
}
Here's a proof of why such a function can't exist, for all numbers, if it doesn't use extra information(except 32bits of int):
We must have f(0) = 0. (Proof: Suppose f(0) = x. Then f(x) = f(f(0)) = -0 = 0. Now, -x = f(f(x)) = f(0) = x, which means that x = 0.)
Further, for any x and y, suppose f(x) = y. We want f(y) = -x then. And f(f(y)) = -y => f(-x) = -y. To summarize: if f(x) = y, then f(-x) = -y, and f(y) = -x, and f(-y) = x.
So, we need to divide all integers except 0 into sets of 4, but we have an odd number of such integers; not only that, if we remove the integer that doesn't have a positive counterpart, we still have 2(mod4) numbers.
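The counting argument can be checked exhaustively on a tiny word size. The Python sketch below is an illustration only: it uses 2-bit signed integers (-2..1) as a stand-in for 32 bits so that all 256 candidate functions can be enumerated, and wrap_neg and search are hypothetical names.

```python
from itertools import product

def wrap_neg(n, bits):
    """Two's-complement negation: the minimum value negates to itself."""
    lowest = -(1 << (bits - 1))
    return n if n == lowest else -n

def search(bits):
    """Return (number of functions with f(f(n)) == -n for all n, best count of n satisfied)."""
    values = list(range(-(1 << (bits - 1)), 1 << (bits - 1)))
    perfect = best = 0
    for images in product(values, repeat=len(values)):
        f = dict(zip(values, images))
        good = sum(f[f[n]] == wrap_neg(n, bits) for n in values)
        perfect += good == len(values)
        best = max(best, good)
    return perfect, best

print(search(2))  # (0, 3)
```

No function satisfies the condition everywhere, and the best candidates miss exactly one input.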
If we remove the 2 maximal numbers left (by abs value), we can get the function:
int sign(int n)
{
if(n>0)
return 1;
else
return -1;
}
int f(int n)
{
if(n==0) return 0;
switch(abs(n)%2)
{
case 1:
return sign(n)*(abs(n)+1);
case 0:
return -sign(n)*(abs(n)-1);
}
}
Of course another option, is to not comply for 0, and get the 2 numbers we removed as a bonus. (But that's just a silly if.)
Thanks to overloading in C++:
double f(int var)
{
return double(var);
}
int f(double var)
{
return -int(var);
}
int main(){
int n(42);
std::cout<<f(f(n));
}
Or, you could abuse the preprocessor:
#define f(n) (f##n)
#define ff(n) -n
int main()
{
int n = -42;
cout << "f(f(" << n << ")) = " << f(f(n)) << endl;
}
This is true for all negative numbers.
f(n) = abs(n)
Because there is one more negative number than there are positive numbers for two's complement integers, f(n) = abs(n) is valid for one more case than the f(n) = n > 0 ? -n : n solution, which is the same as f(n) = -abs(n). Got you by one ... :D
UPDATE
No, it is not valid for one more case, as I just realized thanks to litb's comment ... abs(Int.Min) will just overflow ...
I thought about using mod 2 information, too, but concluded that it does not work ... too early. If done right, it will work for all numbers except Int.Min, because this will overflow.
UPDATE
I played with it for a while, looking for a nice bit manipulation trick, but I could not find a nice one-liner, while the mod 2 solution fits in one.
f(n) = 2n(abs(n) % 2) - n + sgn(n)
In C#, this becomes the following:
public static Int32 f(Int32 n)
{
return 2 * n * (Math.Abs(n) % 2) - n + Math.Sign(n);
}
To get it working for all values, you have to replace Math.Abs() with (n > 0) ? +n : -n and include the calculation in an unchecked block. Then you get even Int.Min mapped to itself as unchecked negation does.
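To see how the unchecked version behaves at the edges of the range, the formula can be emulated with wrapping arithmetic on a small word. This is a Python sketch under stated assumptions: 8-bit signed integers stand in for Int32, wrap emulates two's-complement overflow, and the helper names are made up for illustration.

```python
def wrap(value, bits):
    """Reduce value to a signed two's-complement integer of the given width."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

def f(n, bits=8):
    """f(n) = 2n(|n| % 2) - n + sgn(n), evaluated with wrapping arithmetic."""
    magnitude = n if n > 0 else wrap(-n, bits)  # stands in for (n > 0) ? +n : -n
    parity = magnitude & 1                      # same as % 2, also for the minimum value
    sign = (n > 0) - (n < 0)
    return wrap(2 * n * parity - n + sign, bits)

def failures(bits=8):
    """All n in range for which f(f(n)) differs from the wrapped negation of n."""
    low, high = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return [n for n in range(low, high + 1)
            if f(f(n, bits), bits) != wrap(-n, bits)]

print(failures(8))  # [127]
```

In this emulation the minimum value is mapped back to itself after two applications, and the only input that fails is the maximum value.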
UPDATE
Inspired by another answer I am going to explain how the function works and how to construct such a function.
Let's start at the very beginning. The function f is repeatedly applied to a given value n, yielding a sequence of values.
n => f(n) => f(f(n)) => f(f(f(n))) => f(f(f(f(n)))) => ...
The question demands f(f(n)) = -n, that is two successive applications of f negate the argument. Two further applications of f - four in total - negate the argument again yielding n again.
n => f(n) => -n => f(f(f(n))) => n => f(n) => ...
Now there is an obvious cycle of length four. Substituting x = f(n) and noting that the obtained equation f(f(f(n))) = f(f(x)) = -x holds yields the following.
n => x => -n => -x => n => ...
So we get a cycle of length four with two numbers and the two numbers negated. If you imagine the cycle as a rectangle, negated values are located at opposite corners.
One of many ways to construct such a cycle, starting from n, is the following.
n => negate and subtract one
-n - 1 = -(n + 1) => add one
-n => negate and add one
n + 1 => subtract one
n
A concrete example of such a cycle is +1 => -2 => -1 => +2 => +1. We are almost done. Noting that the constructed cycle contains an odd positive number, its even successor, and both numbers negated, we can easily partition the integers into many such cycles (2^32 is a multiple of four) and have found a function that satisfies the conditions.
But we have a problem with zero. The cycle must contain 0 => x => 0 because zero is negated to itself. And because the cycle already states 0 => x, it follows that 0 => x => 0 => x. This is only a cycle of length two, and x is turned into itself after two applications, not into -x. Luckily there is one case that solves the problem. If x equals zero, we obtain a cycle of length one containing only zero, and we have solved that problem by concluding that zero is a fixed point of f.
Done? Almost. We have 2^32 numbers, zero is a fixed point leaving 2^32 - 1 numbers, and we must partition that number into cycles of four numbers. Bad that 2^32 - 1 is not a multiple of four - there will remain three numbers not in any cycle of length four.
I will explain the remaining part of the solution using the smaller set of 3-bit signed integers ranging from -4 to +3. We are done with zero. We have one complete cycle +1 => -2 => -1 => +2 => +1. Now let us construct the cycle starting at +3.
+3 => -4 => -3 => +4 => +3
The problem that arises is that +4 is not representable as a 3-bit integer. We would obtain +4 by negating -3 to +3, which is still a valid 3-bit integer, but then adding one to +3 (binary 011) yields binary 100. Interpreted as an unsigned integer it is +4, but we have to interpret it as the signed integer -4. So -4 in this example, or Int.MinValue in the general case, is a second fixed point of integer arithmetic negation: 0 and Int.MinValue are mapped to themselves. So the cycle is actually as follows.
+3 => -4 => -3 => -4 => -3
It is a cycle of length two and additionally +3 enters the cycle via -4. In consequence -4 is correctly mapped to itself after two function applications, +3 is correctly mapped to -3 after two function applications, but -3 is erroneously mapped to itself after two function applications.
So we constructed a function that works for all integers but one. Can we do better? No, we cannot. Why? We have to construct cycles of length four and are able to cover the whole integer range up to four values. The remaining values are the two fixed points 0 and Int.MinValue that must be mapped to themselves and two arbitrary integers x and -x that must be mapped to each other by two function applications.
To map x to -x and vice versa they must form a four cycle and they must be located at opposite corners of that cycle. In consequence 0 and Int.MinValue have to be at opposite corners, too. This will correctly map x and -x but swap the two fixed points 0 and Int.MinValue after two function applications and leave us with two failing inputs. So it is not possible to construct a function that works for all values, but we have one that works for all values except one and this is the best we can achieve.
Using complex numbers, you can effectively divide the task of negating a number into two steps:
multiply n by i, and you get n*i, which is n rotated 90° counter-clockwise
multiply again by i, and you get -n
The great thing is that you don't need any special handling code. Just multiplying by i does the job.
But you're not allowed to use complex numbers. So you have to somehow create your own imaginary axis, using part of your data range. Since you need exactly as many imaginary (intermediate) values as initial values, you are left with only half the data range.
I tried to visualize this on the following figure, assuming signed 8-bit data. You would have to scale this for 32-bit integers. The allowed range for initial n is -64 to +63.
Here's what the function does for positive n:
If n is in 0..63 (initial range), the function call adds 64, mapping n to the range 64..127 (intermediate range)
If n is in 64..127 (intermediate range), the function subtracts n from 64, mapping n to the range 0..-63
For negative n, the function uses the intermediate range -65..-128.
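The scheme described above can be sketched for the signed 8-bit case as follows. This is a Python illustration, not the answer's own code: the two positive branches follow the description, while the two negative branches are an assumed mirror image that uses -65..-128 as the intermediate range.

```python
def f(n):
    """Signed 8-bit sketch: initial values -64..63, intermediate values elsewhere."""
    if 0 <= n <= 63:
        return n + 64        # initial -> intermediate range 64..127
    if 64 <= n <= 127:
        return 64 - n        # intermediate -> 0..-63
    if -64 <= n <= -1:
        return n - 64        # assumed mirror: initial -> -65..-128
    if -128 <= n <= -65:
        return -64 - n       # assumed mirror: intermediate -> 1..64
    raise ValueError("n does not fit in a signed 8-bit integer")

print(all(f(f(n)) == -n for n in range(-64, 64)))  # True
```

The price is the one stated above: only half of the representable values are valid initial inputs.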
Works except int.MaxValue and int.MinValue
public static int f(int x)
{
if (x == 0) return 0;
if ((x % 2) != 0)
return x * -1 + (-1 *x) / (Math.Abs(x));
else
return x - x / (Math.Abs(x));
}
The question doesn't say anything about what the input type and return value of the function f have to be (at least not the way you've presented it)...
...just that when n is a 32-bit integer then f(f(n)) = -n
So, how about something like
Int64 f(Int64 n)
{
return(n > Int32.MaxValue ?
-(n - 4L * Int32.MaxValue):
n + 4L * Int32.MaxValue);
}
If n is a 32-bit integer then the statement f(f(n)) == -n will be true.
Obviously, this approach could be extended to work for an even wider range of numbers...
for javascript (or other dynamically typed languages) you can have the function accept either an int or an object and return the other. i.e.
function f(n) {
if (n.passed) {
return -n.val;
} else {
return {val:n, passed:1};
}
}
giving
js> f(f(10))
-10
js> f(f(-10))
10
alternatively you could use overloading in a strongly typed language although that may break the rules ie
int f(long n) {
return n;
}
long f(int n) {
return -n;
}
Depending on your platform, some languages allow you to keep state in the function. VB.Net, for example:
Function f(ByVal n As Integer) As Integer
Static flag As Integer = -1
flag *= -1
Return n * flag
End Function
IIRC, C++ allowed this as well. I suspect they're looking for a different solution though.
Another idea is that since they didn't define the result of the first call to the function you could use odd/evenness to control whether to invert the sign:
int f(int n)
{
int sign = n>=0?1:-1;
if (abs(n)%2 == 0)
return (abs(n)+1)*sign * -1;
else
return (abs(n)-1)*sign;
}
Add one to the magnitude of all even numbers, subtract one from the magnitude of all odd numbers. The result of two calls has the same magnitude, but the one call where it's even we swap the sign. There are some cases where this won't work (-1, max or min int), but it works a lot better than anything else suggested so far.
Exploiting JavaScript exceptions.
function f(n) {
try {
return n();
}
catch(e) {
return function() { return -n; };
}
}
f(f(0)) => 0
f(f(1)) => -1
For all 32-bit values (with the caveat that -0 is -2147483648)
int rotate(int x)
{
static const int split = INT_MAX / 2 + 1;
static const int negativeSplit = INT_MIN / 2 + 1;
if (x == INT_MAX)
return INT_MIN;
if (x == INT_MIN)
return x + 1;
if (x >= split)
return x + 1 - INT_MIN;
if (x >= 0)
return INT_MAX - x;
if (x >= negativeSplit)
return INT_MIN - x + 1;
return split -(negativeSplit - x);
}
You basically need to pair each -x => x => -x loop with a y => -y => y loop. So I paired up opposite sides of the split.
e.g. For 4 bit integers:
0 => 7 => -8 => -7 => 0
1 => 6 => -1 => -6 => 1
2 => 5 => -2 => -5 => 2
3 => 4 => -3 => -4 => 3
A C++ version, probably bending the rules somewhat but works for all numeric types (floats, ints, doubles) and even class types that overload the unary minus:
template <class T>
struct f_result
{
T value;
};
template <class T>
f_result <T> f (T n)
{
f_result <T> result = {n};
return result;
}
template <class T>
T f (f_result <T> n)
{
return -n.value;
}
int main (void)
{
int n = 45;
cout << "f(f(" << n << ")) = " << f(f(n)) << endl;
float p = 3.14f;
cout << "f(f(" << p << ")) = " << f(f(p)) << endl;
}
x86 asm (AT&T style):
# input %edi
# output %eax
# clobbered regs: %ecx, %edx
f:
testl %edi, %edi
je .zero
movl %edi, %eax
movl $1, %ecx
movl %edi, %edx
andl $1, %eax
addl %eax, %eax
subl %eax, %ecx
xorl %eax, %eax
testl %edi, %edi
setg %al
shrl $31, %edx
subl %edx, %eax
imull %ecx, %eax
subl %eax, %edi
movl %edi, %eax
imull %ecx, %eax
ret
.zero:
xorl %eax, %eax
ret
Code checked, all possible 32bit integers passed, error with -2147483647 (underflow).
Uses globals...but so?
bool done = false;
int f(int n)
{
int out = n;
if(!done)
{
out = n * -1;
done = true;
}
return out;
}
This Perl solution works for integers, floats, and strings.
sub f {
my $n = shift;
return ref($n) ? -$$n : \$n;
}
Try some test data.
print $_, ' ', f(f($_)), "\n" for -2, 0, 1, 1.1, -3.3, 'foo', '-bar';
Output:
-2 2
0 0
1 -1
1.1 -1.1
-3.3 3.3
foo -foo
-bar +bar
Nobody ever said f(x) had to be the same type.
def f(x):
    if type(x) == list:
        return -x[0]
    return [x]
f(2) => [2]
f(f(2)) => -2
I'm not actually trying to give a solution to the problem itself, but I do have a couple of comments, as the question states this problem was posed as part of a (job?) interview:
I would first ask "Why would such a function be needed? What is the bigger problem this is part of?" instead of trying to solve the actual posed problem on the spot. This shows how I think and how I tackle problems like this. Who knows? That might even be the actual reason the question is asked in an interview in the first place. If the answer is "Never you mind, assume it's needed, and show me how you would design this function", I would then continue to do so.
Then, I would write the C# test case code I would use (the obvious: loop from int.MinValue to int.MaxValue, and for each n in that range call f(f(n)) and checking the result is -n), telling I would then use Test Driven Development to get to such a function.
Only if the interviewer continues asking for me to solve the posed problem would I actually start to try and scribble pseudocode during the interview itself to try and get to some sort of an answer. However, I don't really think I would be jumping to take the job if the interviewer would be any indication of what the company is like...
Oh, this answer assumes the interview was for a C# programming related position. Would of course be a silly answer if the interview was for a math related position. ;-)
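The test loop described above could be sketched roughly as follows. This is written in Python rather than C#, and the names `failing_inputs`, `sample_int32`, and `wrap_then_negate` are hypothetical helpers made up for illustration. It samples the 32-bit range instead of sweeping all of it, because a full sweep is slow in Python.

```python
INT_MIN, INT_MAX = -2**31, 2**31 - 1


def failing_inputs(f, candidates):
    """Return every candidate n for which f(f(n)) != -n."""
    return [n for n in candidates if f(f(n)) != -n]


def sample_int32(step=1_000_003):
    """Strided sample of the int32 range plus a few edge values."""
    edges = [INT_MIN, INT_MIN + 1, -1, 0, 1, INT_MAX - 1, INT_MAX]
    return sorted(set(edges) | set(range(INT_MIN, INT_MAX + 1, step)))


def wrap_then_negate(x):
    # Candidate under test: wrap on the first call, unwrap and negate on the second.
    if isinstance(x, list):
        return -x[0]
    return [x]


print(failing_inputs(wrap_then_negate, sample_int32()))  # prints []
```

Python integers are unbounded, so -INT_MIN does not overflow here; a C# version of the same loop would have to treat int.MinValue separately.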
I would change the 2 most significant bits.
00.... => 01.... => 10.....
01.... => 10.... => 11.....
10.... => 11.... => 00.....
11.... => 00.... => 01.....
As you can see, it's just an addition, leaving out the carried bit.
How did I get to the answer? My first thought was just a need for symmetry: 4 turns to get back where I started. At first I thought, that's a 2-bit Gray code. Then I thought actually standard binary is enough.
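A minimal sketch of the table above, treating the value as a raw 32-bit pattern. The names `MASK32` and `SIGN_BIT` are made up for this example, and it assumes that "negation" means flipping the top bit, which is true for a sign-magnitude representation but not for two's complement.

```python
MASK32 = 0xFFFFFFFF
SIGN_BIT = 1 << 31


def f(bits):
    """Add 1 to the two most significant bits, dropping the carried bit."""
    return (bits + (1 << 30)) & MASK32


# One pattern for each of the four rows of the table (00, 01, 10, 11).
for bits in (0x00000005, 0x40000005, 0x80000005, 0xC0000005):
    # Two applications add 1 << 31, which only flips the top bit.
    assert f(f(bits)) == bits ^ SIGN_BIT
```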
Here is a solution that is inspired by the requirement or claim that complex numbers can not be used to solve this problem.
Multiplying by the square root of -1 is an idea that only seems to fail because -1 does not have a square root over the integers. But playing around with a program like Mathematica gives, for example, the equation
(1849436465^2 + 1) mod (2^32 - 3) = 0.
and this is almost as good as having a square root of -1. The result of the function needs to be a signed integer. Hence I'm going to use a modified modulo operation mods(x,n) that returns the integer y congruent to x modulo n that is closest to 0. Only very few programming languages have such a modulo operation, but it can easily be defined. E.g. in Python it is:
def mods(x, n):
y = x % n
if y > n/2: y-= n
return y
Using the equation above, the problem can now be solved as
def f(x):
return mods(x*1849436465, 2**32-3)
This satisfies f(f(x)) = -x for all integers in the range [-(2^31-2), 2^31-2]. The results of f(x) are also in this range, but of course the computation would need 64-bit integers.
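As a quick sanity check, the claim can be spot-tested with a self-contained sketch like the one below. The names `ROOT`, `MODULUS`, and `LIMIT` are introduced here for readability only, and the comparison uses integer division, which gives the same result as `n/2` because the modulus is odd.

```python
ROOT = 1849436465      # behaves like a square root of -1 modulo MODULUS
MODULUS = 2**32 - 3
LIMIT = 2**31 - 2      # largest magnitude for which f(f(x)) == -x holds


def mods(x, n):
    """Return the integer congruent to x modulo n that is closest to 0."""
    y = x % n
    if y > n // 2:
        y -= n
    return y


def f(x):
    return mods(x * ROOT, MODULUS)


# ROOT squared is congruent to -1 modulo MODULUS.
assert (ROOT * ROOT + 1) % MODULUS == 0

# Spot-check the ends and the middle of the valid range.
for x in (-LIMIT, -12345, -1, 0, 1, 2, 12345, LIMIT):
    assert -LIMIT <= f(x) <= LIMIT
    assert f(f(x)) == -x
```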
C# for a range of 2^32 - 1 numbers: all int32 values except Int32.MinValue.
Func<int, int> f = n =>
n < 0
? (n & (1 << 30)) == (1 << 30) ? (n ^ (1 << 30)) : - (n | (1 << 30))
: (n & (1 << 30)) == (1 << 30) ? -(n ^ (1 << 30)) : (n | (1 << 30));
Console.WriteLine(f(f(Int32.MinValue + 1))); // -2147483648 + 1
for (int i = -3; i <= 3 ; i++)
Console.WriteLine(f(f(i)));
Console.WriteLine(f(f(Int32.MaxValue))); // 2147483647
prints:
2147483647
3
2
1
0
-1
-2
-3
-2147483647
Essentially the function has to divide the available range into cycles of size 4, with -n at the opposite end of n's cycle. However, 0 must be part of a cycle of size 1, because otherwise 0->x->0->x != -x. Because of 0 being alone, there must be 3 other values in our range (whose size is a multiple of 4) not in a proper cycle with 4 elements.
I chose these extra weird values to be MIN_INT, MAX_INT, and MIN_INT+1. Furthermore, MIN_INT+1 will map to MAX_INT correctly, but get stuck there and not map back. I think this is the best compromise, because it has the nice property of only the extreme values not working correctly. Also, it means it would work for all BigInts.
int f(int n):
if n == 0 or n == MIN_INT or n == MAX_INT: return n
return ((Math.abs(n) mod 2) * 2 - 1) * n + Math.sign(n)
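A sketch of the same formula in Python, where integers are unbounded, so the special cases for MIN_INT and MAX_INT are left out. The `sign` helper is an assumption standing in for Math.sign.

```python
def sign(n):
    """Return -1, 0 or 1 according to the sign of n."""
    return (n > 0) - (n < 0)


def f(n):
    if n == 0:
        return 0
    # Odd n moves one step away from zero; even n is negated and moves one step back.
    return ((abs(n) % 2) * 2 - 1) * n + sign(n)


# Walk one cycle of size 4 starting from 1.
cycle = [1]
for _ in range(3):
    cycle.append(f(cycle[-1]))
print(cycle)  # prints [1, 2, -1, -2]

assert all(f(f(n)) == -n for n in range(-1000, 1001))
```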
Nobody said it had to be stateless.
int32 f(int32 x) {
static bool idempotent = false;
if (!idempotent) {
idempotent = true;
return -x;
} else {
return x;
}
}
Cheating, but not as much as a lot of the examples. Even more evil would be to peek up the stack to see if your caller's address is &f, but this is going to be more portable (although not thread safe... the thread-safe version would use TLS). Even more evil:
int32 f (int32 x) {
static int32 answer = -x;
return answer;
}
Of course, neither of these works too well for the case of MIN_INT32, but there is precious little you can do about that unless you are allowed to return a wider type.
I could imagine using the 31st bit as an imaginary (i) bit would be an approach that would support half the total range.
works for n= [0 .. 2^31-1]
int f(int n) {
if (n & (1 << 31)) // highest bit set?
return -(n & ~(1 << 31)); // return negative of original n
else
return n | (1 << 31); // return n with highest bit set
}
The problem states "32-bit signed integers" but doesn't specify whether they are twos-complement or ones-complement.
If you use ones-complement then all 2^32 values occur in cycles of length four - you don't need a special case for zero, and you also don't need conditionals.
In C:
int32_t f(int32_t x)
{
return (((x & 0xFFFFU) << 16) | ((x & 0xFFFF0000U) >> 16)) ^ 0xFFFFU;
}
This works by:
1. Exchanging the high and low 16-bit blocks
2. Inverting one of the blocks
After two passes we have the bitwise inverse of the original value, which in ones-complement representation is equivalent to negation.
Examples:
Pass | x
-----+-------------------
0 | 00000001 (+1)
1 | 0001FFFF (+131071)
2 | FFFFFFFE (-1)
3 | FFFE0000 (-131071)
4 | 00000001 (+1)
Pass | x
-----+-------------------
0 | 00000000 (+0)
1 | 0000FFFF (+65535)
2 | FFFFFFFF (-0)
3 | FFFF0000 (-65535)
4 | 00000000 (+0)
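The first table can be reproduced with a small Python simulation that works on raw 32-bit patterns. The helper `ones_complement_value` is made up for this example and reads -0 as 0.

```python
MASK32 = 0xFFFFFFFF


def f(bits):
    """Swap the 16-bit halves of a 32-bit pattern, then invert the low half."""
    swapped = ((bits & 0xFFFF) << 16) | ((bits & 0xFFFF0000) >> 16)
    return swapped ^ 0xFFFF


def ones_complement_value(bits):
    """Interpret a 32-bit pattern as a ones-complement integer."""
    if bits & 0x80000000:
        return -((~bits) & MASK32)
    return bits


trace = [0x00000001]
for _ in range(4):
    trace.append(f(trace[-1]))

print([f"{b:08X}" for b in trace])
# prints ['00000001', '0001FFFF', 'FFFFFFFE', 'FFFE0000', '00000001']
print([ones_complement_value(b) for b in trace])
# prints [1, 131071, -1, -131071, 1]
```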
:D
boolean inner = true;
int f(int input) {
if(inner) {
inner = false;
return input;
} else {
inner = true;
return -input;
}
}
return x ^ ((x%2) ? 1 : -INT_MAX);
I'd like to share my point of view on this interesting problem as a mathematician. I think I have the most efficient solution.
If I remember correctly, you negate a signed 32-bit integer by just flipping the first bit (this holds for a sign-magnitude representation; two's complement negation works differently). For example, if n = 1001 1101 1110 1011 1110 0000 1110 1010, then -n = 0001 1101 1110 1011 1110 0000 1110 1010.
So how do we define a function f that takes a signed 32-bit integer and returns another signed 32-bit integer with the property that taking f twice is the same as flipping the first bit?
Let me rephrase the question without mentioning arithmetic concepts like integers.
How do we define a function f that takes a sequence of zeros and ones of length 32 and returns a sequence of zeros and ones of the same length, with the property that taking f twice is the same as flipping the first bit?
Observation: If you can answer the above question for the 32-bit case, then you can also answer it for the 64-bit case, the 100-bit case, etc. You just apply f to the first 32 bits.
Now if you can answer the question for 2 bit case, Voila!
And yes it turns out that changing the first 2 bits is enough.
Here's the pseudo-code
1. take n, which is a signed 32-bit integer.
2. swap the first bit and the second bit.
3. flip the first bit.
4. return the result.
Remark: Step 2 and step 3 together can be summarised as (a,b) --> (-b, a). Looks familiar? That should remind you of the 90 degree rotation of the plane and the multiplication by the square root of -1.
If I had just presented the pseudo-code alone without the long prelude, it would have seemed like a rabbit out of the hat. I wanted to explain how I got the solution.