128.128 Unsigned fixed point Division in Solidity - math

I am currently trying to figure out a concrete way to perform fixed point division on two 128.128 uint256 numbers. This seems like a fairly straightforward thing, but I haven't been able to code up a solution.
For two 64.64 fixed point numbers the following works just fine.
function div64x64 (uint128 x, uint128 y) internal pure returns (uint128) {
unchecked {
require (y != 0);
uint256 answer = (uint256 (x) << 64) / y;
require (answer <= 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF);
return uint128 (answer);
}
}
But the same logic does not hold for uint256 128.128 fixed point numbers since you cannot cast x into a larger uint type to left shift x. This is my sad attempt at solving this for 128.128 which doesn't include the correct decimals in the output, but my attempt does include the correct values left of the decimal.
function div128x128 (uint256 x, uint256 y) internal pure returns (uint256) {
unchecked {
require (y != 0);
uint256 xInt = x>>128;
uint256 xDecimal = x<<128;
uint256 yInt = y>>128;
uint256 yDecimal = y<<128;
uint256 hi = ((uint256(xInt) << 64)/yInt)<<64;
uint256 lo = ((uint256(xDecimal)<<64)/yDecimal);
require (hi+lo <= MAX_128x128);
return hi+lo;
}
}
Does anyone know the best way to accomplish this, or even just a conceptual explanation of how to do it would be super appreciated. Thanks in advance!

Okay, so I'll post the solution here for the next person. The key is the fairly obvious fact that a fraction can be split into a sum of fractions over a common denominator. For example, 12.525/9.5 = (12/9.5) + (0.525/9.5). With this in mind, we can split x into its integer and fractional parts, divide each part by y, and add the two results back together with some shifting.
function div128x128 (uint256 x, uint256 y) internal pure returns (uint256) {
unchecked {
// Require denominator != 0
require (y != 0);
// xDec = x & (2**128 - 1): the fractional part of x (low 128 bits)
uint256 xDec = x & 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF;
// xInt = x >> 128: the integer part of x (high 128 bits)
uint256 xInt = x >> 128;
// MAX_128x128 / y approximates 2**256 / y, i.e. the reciprocal of y scaled to fill a uint256.
// hi = xInt * (2**256 / y): the integer part of x divided by y, as a 128.128 value
uint256 hi = xInt*(MAX_128x128/y);
// lo = (xDec * (2**256 / y)) >> 128: the fractional part of x divided by y, as a 128.128 value; the right shift by 128 accounts for xDec being a fraction
uint256 lo = (xDec*(MAX_128x128/y))>>128;
/* Example: 12.525/9.5 = 12/9.5 + 0.525/9.5. It is legal to split a fraction into additive pieces with a common denominator; the code above just pads the pieces to fit a 128.128 output in a uint256.
*/
require (hi+lo <= MAX_128x128);
return hi+lo;
}
}
This is one solution that seems to work as long as the require criteria are met. There are almost certainly optimizations to be made, but I have tested this on some real data, and it seems to be accurate to 128.128 precision.
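For checking results off-chain, the value a 128.128 division is meant to produce is floor(x * 2^128 / y). The Python sketch below (Python integers are unbounded, so no 512-bit tricks are needed) computes that reference value next to the reciprocal-based arithmetic of the function above. The names `div128x128_exact` and `div128x128_reciprocal` are made up for this illustration, and `MAX_128x128` is assumed to be 2**256 - 1, as the comments above suggest.

```python
ONE = 1 << 128                 # 1.0 in 128.128 fixed point
MAX_128x128 = (1 << 256) - 1   # assumption: the largest uint256


def div128x128_exact(x: int, y: int) -> int:
    """Reference quotient floor(x * 2**128 / y), computed with unbounded integers."""
    if y == 0:
        raise ZeroDivisionError("division by zero")
    q = (x << 128) // y
    if q > MAX_128x128:
        raise OverflowError("quotient does not fit in 128.128")
    return q


def div128x128_reciprocal(x: int, y: int) -> int:
    """Same arithmetic as the Solidity function above, with an explicit range check."""
    if y == 0:
        raise ZeroDivisionError("division by zero")
    recip = MAX_128x128 // y           # approximately 2**256 / y, rounded down
    x_int = x >> 128                   # integer part of x
    x_dec = x & (ONE - 1)              # fractional part of x
    result = x_int * recip + ((x_dec * recip) >> 128)
    if result > MAX_128x128:
        raise OverflowError("quotient does not fit in 128.128")
    return result


x, y = 3 * ONE, 2 * ONE                # 3.0 / 2.0
exact = div128x128_exact(x, y)
approx = div128x128_reciprocal(x, y)
print(exact == 3 * ONE // 2)           # True: exactly 1.5
print(exact - approx)                  # 3: the reciprocal version is 3 units in the last place too low
```

The rounded-down reciprocal can be short by almost one unit, and that shortfall is multiplied by the integer part of x, so the lowest bits drift as x grows. Also, inside `unchecked` a uint256 sum wraps around, so a check of the form `hi+lo <= MAX_128x128` cannot fail if `MAX_128x128` is the largest uint256; an overflow in `xInt*(MAX_128x128/y)` would need its own test. A full-width multiply-then-divide routine (often called `mulDiv`) is the usual way to get the exact quotient in Solidity.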

Related

Wrong Answer for n = 656 Nth Fibonacci Number using Dynamic Programming

class Solution {
public:
long long int nthFibonacci(long long int n){
// code here
//TABULATION
long long int lookup[1001];
lookup[0]=0;
lookup[1]=1;
for(long long int i=2;i<=n;i++){
lookup[i]=lookup[i-1]+lookup[i-2];
}
return (lookup[n]%1000000007);
}
};
When I submit it on GFG, it is showing that your code is giving wrong output for n=656
Wrong Answer. !!!Wrong Answer
Possibly your code doesn't work correctly for multiple test-cases (TCs).
The first test case where your code failed:
Input:
656
Its Correct output is:
823693831
And Your Code's output is:
-584713349
Fibonacci numbers grow quite quickly.
fib(n) / pow(phi, n) -> 1/sqrt(5) as n -> infinity
where phi is the golden ratio, phi ~= 1.618.
This means that fib(n) requires around
n*log2(phi) - 0.5*log2(5) ~ 0.694*(n-2) bits,
so fib(656) needs about 454 bits.
A long long int is unlikely to be that big!
In your original code the table, of long long ints, cannot hold the correct result for the larger fibonacci numbers.
In C (and I think C++) it is awkward to detect and correct overflow. It is better to ensure that it never happens. In your case you only want the fibonacci numbers mod m (m=1000000007). So you can fill the table with the fibonacci numbers mod m instead. Since m fits in 32 bits, all numbers modulo that and the sum of two of them fit in 32 bits and so overflow cannot occur.
By the way you have some redundant mods in your amended code; you could have
for(long long int i=2;i<=n;i++){
lookup[i]=(lookup[i-1]+lookup[i-2])%mod;
}
because when you use it, lookup[i-1] has already been reduced mod m.
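The size argument can be checked directly. The following Python sketch uses unbounded integers to measure how many bits the exact Fibonacci numbers need, and compares the exact value reduced mod 1000000007 with a version that reduces after every addition. The function names `fib_exact` and `fib_mod` are chosen for this illustration only.

```python
MOD = 1_000_000_007


def fib_exact(n: int) -> int:
    """n-th Fibonacci number (fib(0) = 0, fib(1) = 1) with unbounded integers."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def fib_mod(n: int, mod: int = MOD) -> int:
    """Same recurrence, reduced after every addition so values stay below mod."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, (a + b) % mod
    return a


print(fib_exact(656).bit_length())            # 455, far more than a 64-bit long long holds
print(fib_exact(93).bit_length())             # 64: fib(93) no longer fits in a signed 64-bit integer
print(fib_mod(656) == fib_exact(656) % MOD)   # True
```

So the table of long long values in the original code starts overflowing at index 93, long before n = 656, while the reduced version never holds a value above 1000000006.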
Update: I found the correct solution to the problem, but I still don't know what was wrong with my earlier code.
If someone could have a look at the correct solution and point out my mistake, thanks in advance!
Correct Solution:
class Solution {
public:
long long int nthFibonacci(long long int n){
// code here
//TABULATION
const long long int mod=1000000007;
long long int lookup[1001];
lookup[0]=0;
lookup[1]=1;
for(long long int i=2;i<=n;i++){
lookup[i]=(lookup[i-1]%mod+lookup[i-2]%mod)%mod;
}
return lookup[n];
}
};

How to find prime number on O(1) runtime

I got this question in an interview
Please provide a solution to check if a number is a prime number using
a loop of one - O(1). The input number can be between 1 and 10,000
only.
I said that it's impossible unless you have stored all prime numbers up to 10,000. Now I am not entirely sure whether my answer was correct. I tried to search for an answer on the internet, and the best I came up with was the AKS algorithm, with a run time of O((log n)^6).
It is doable using the Sieve of Eratosthenes (SoE). Its result is an array of bools, usually encoded as a single bit each in a BYTE/WORD/DWORD array for denser storage. Often only the odd numbers are stored, since no even number except 2 is prime. Usually a true value means the number is not prime.
So the naive O(1) C++ code for checking x would look like:
bool SoE[10001]; // precomputed sieve array
int x = 27; // any x <0,10000>
bool x_is_prime = !SoE[x];
If the SoE is encoded as an 8-bit BYTE array, you need to tweak the access a bit:
BYTE SoE[1251]; // precomputed sieve array ceil(10001/8)
int x = 27; // any x <0,10000>
bool x_is_prime = !(SoE[x>>3]&(1<<(x&7))); // test bit (x&7) of byte (x>>3); a set bit means "not prime"
Of course, constructing the SoE is not O(1)! Here is an example that uses it heavily to speed up my IsPrime function:
Prime numbers by Eratosthenes quicker sequential than concurrently?
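To make the precompute-then-lookup idea concrete, here is a minimal Python sketch for the 1 to 10,000 range from the question. It builds the table once, keeps the "true means not prime" convention described above, and offers both a byte-per-number and a bit-packed lookup. The names `build_sieve`, `pack_bits`, `is_prime` and `is_prime_packed` are made up for this example.

```python
LIMIT = 10_000


def build_sieve(limit: int) -> bytearray:
    """Sieve of Eratosthenes; composite[i] is 1 when i is NOT prime."""
    composite = bytearray(limit + 1)
    composite[0] = composite[1] = 1
    i = 2
    while i * i <= limit:
        if not composite[i]:
            for multiple in range(i * i, limit + 1, i):
                composite[multiple] = 1
        i += 1
    return composite


def pack_bits(flags: bytearray) -> bytearray:
    """Pack one flag per byte into one flag per bit (bit x & 7 of byte x >> 3)."""
    packed = bytearray((len(flags) + 7) // 8)
    for x, flag in enumerate(flags):
        if flag:
            packed[x >> 3] |= 1 << (x & 7)
    return packed


SOE = build_sieve(LIMIT)     # precomputed once; this part is not O(1)
SOE_BITS = pack_bits(SOE)    # 1251 bytes cover 0..10000


def is_prime(x: int) -> bool:
    """O(1) lookup in the byte-per-number table."""
    return not SOE[x]


def is_prime_packed(x: int) -> bool:
    """O(1) lookup in the bit-packed table: test the bit, then negate."""
    return not (SOE_BITS[x >> 3] & (1 << (x & 7)))


print(is_prime(27), is_prime(9973))    # False True
```

Only the lookups are constant time; the table construction is paid once up front, which is presumably what the interviewer's input bound was hinting at.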
Yes!
You can use the Sieve of Eratosthenes to check whether a number is prime.
However, you have to precompute the sieve up to the largest value you need and store it in an array; after that, each query is an O(1) lookup.
If you do not want to precompute (building the sieve takes O(n log log n) time), you can use this fact as a quick filter:
if P is a prime number greater than 3, then P^2 - 1 is divisible by 24.
Note that this is only a necessary condition: 25^2 - 1 = 624 = 24 * 26, yet 25 is not prime, so a number that passes the filter still needs a divisibility check.
In C++, if the given number is less than or equal to 10^9, P^2 fits in a 64-bit integer, so the filter can be applied directly.
The source of this fact can be found at www.brilliant.org
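public static boolean prime(int n) {
    // Valid for 1 <= n <= 10,000: every composite number in that range has a prime factor <= 97.
    final int[] smallPrimes = {2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41,
                               43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97};
    if (n < 2)
        return false;
    // At most 25 iterations regardless of n, so the loop is O(1) for this input range.
    for (int p : smallPrimes) {
        if (n == p)
            return true;
        if (n % p == 0)
            return false;
    }
    return true;
}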

What is the difference between these two implementations of a recursive algorithm?

I am doing a leetcode problem.
A robot is located at the top-left corner of a m x n grid (marked 'Start' in the diagram below).
The robot can only move either down or right at any point in time. The robot is trying to reach the bottom-right corner of the grid (marked 'Finish' in the diagram below).
How many possible unique paths are there?
So I tried this implementation first and got an "exceeds runtime" error (I forgot the exact term, but it means the implementation is too slow). So I changed it to version 2, which uses an array to save the results. I honestly don't know how the recursion works internally and why these two implementations differ in efficiency.
version 1(slow):
class Solution {
// int res[101][101]={{0}};
public:
int uniquePaths(int m, int n) {
if (m==1 || n==1) return 1;
else{
return uniquePaths(m-1,n) + uniquePaths(m,n-1);
}
}
};
version2 (faster):
class Solution {
int res[101][101]={{0}};
public:
int uniquePaths(int m, int n) {
if (m==1 || n==1) return 1;
else{
if (res[m-1][n]==0) res[m-1][n] = uniquePaths(m-1,n);
if (res[m][n-1]==0) res[m][n-1] = uniquePaths(m,n-1);
return res[m-1][n] + res[m][n-1];
}
}
};
Version 1 is slower because you are calculating the same data again and again. I'll try to explain this with a different problem; I guess you know the Fibonacci numbers. You can calculate any Fibonacci number with the following recursive algorithm:
fib(n):
if n == 0 then return 0
if n == 1 then return 1
return fib(n-1) + fib(n-2)
But what actually are you calculating? If you want to find fib(5) you need to calculate fib(4) and fib(3), then to calculate fib(4) you need to calculate fib(3) again! Take a look at the image to fully understand:
The same thing happens in your code: version 1 computes uniquePaths(m,n) again even if it has been calculated before. To avoid that, your second version uses an array to store computed results, so a value does not have to be computed again when res[m][n]!=0.
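The difference can be measured by counting calls. The Python sketch below is an illustration rather than a translation of either C++ version: `unique_paths_plain` mirrors version 1, and `unique_paths_memo` uses `functools.lru_cache` in place of the `res` array. The counters are added only for the demonstration.

```python
from functools import lru_cache

calls_plain = 0
calls_memo = 0


def unique_paths_plain(m: int, n: int) -> int:
    """Plain recursion: the same (m, n) is recomputed many times."""
    global calls_plain
    calls_plain += 1
    if m == 1 or n == 1:
        return 1
    return unique_paths_plain(m - 1, n) + unique_paths_plain(m, n - 1)


@lru_cache(maxsize=None)
def unique_paths_memo(m: int, n: int) -> int:
    """Memoized recursion: the body runs at most once per distinct (m, n)."""
    global calls_memo
    calls_memo += 1
    if m == 1 or n == 1:
        return 1
    return unique_paths_memo(m - 1, n) + unique_paths_memo(m, n - 1)


print(unique_paths_plain(10, 10), calls_plain)   # 48620 97239
print(unique_paths_memo(10, 10), calls_memo)     # 48620 99
```

The plain version makes roughly twice as many calls as there are paths, so its cost grows with the answer itself, while the memoized version is bounded by the number of grid cells.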

Hacks for clamping integer to 0-255 and doubles to 0.0-1.0?

Are there any branch-less or similar hacks for clamping an integer to the interval of 0 to 255, or a double to the interval of 0.0 to 1.0? (Both ranges are meant to be closed, i.e. endpoints are inclusive.)
I'm using the obvious minimum-maximum check:
int value = (value < 0? 0 : value > 255? 255 : value);
but is there a way to get this faster -- similar to the "modulo" clamp value & 255? And is there a way to do similar things with floating points?
I'm looking for a portable solution, so preferably no CPU/GPU-specific stuff please.
This is a trick I use for clamping an int to a 0 to 255 range:
/**
* Clamps the input to a 0 to 255 range.
* @param v any int value
* @return {@code v < 0 ? 0 : v > 255 ? 255 : v}
*/
public static int clampTo8Bit(int v) {
// if out of range
if ((v & ~0xFF) != 0) {
// invert sign bit, shift to fill, then mask (generates 0 or 255)
v = ((~v) >> 31) & 0xFF;
}
return v;
}
That still has one branch, but a handy thing about it is that you can test whether any of several ints are out of range in one go by ORing them together, which makes things faster in the common case that all of them are in range. For example:
/** Packs four 8-bit values into a 32-bit value, with clamping. */
public static int ARGBclamped(int a, int r, int g, int b) {
if (((a | r | g | b) & ~0xFF) != 0) {
a = clampTo8Bit(a);
r = clampTo8Bit(r);
g = clampTo8Bit(g);
b = clampTo8Bit(b);
}
return (a << 24) + (r << 16) + (g << 8) + (b << 0);
}
Note that your compiler may already give you what you want if you code value = min (value, 255). This may be translated into a MIN instruction if it exists, or into a comparison followed by conditional move, such as the CMOVcc instruction on x86.
The following code assumes two's complement representation of integers, which is usually a given today. The conversion from Boolean to integer should not involve branching under the hood, as modern architectures either provide instructions that can directly be used to form the mask (e.g. SETcc on x86 and ISETcc on NVIDIA GPUs), or can apply predication or conditional moves. If all of those are lacking, the compiler may emit a branchless instruction sequence based on arithmetic right shift to construct a mask, along the lines of Boann's answer. However, there is some residual risk that the compiler could do the wrong thing, so when in doubt, it would be best to disassemble the generated binary to check.
int value, mask;
mask = 0 - (value > 255); // mask = all 1s if value > 255, all 0s otherwise
value = (255 & mask) | (value & ~mask);
On many architectures, use of the ternary operator ?: can also result in a branchless instruction sequence. The hardware may support select-type instructions which are essentially the hardware equivalent of the ternary operator, such as ICMP on NVIDIA GPUs. Or it may provide CMOV (conditional move) as in x86, or predication as on ARM, both of which can be used to implement branchless code for ternary operators. As in the previous case, one would want to examine the disassembled binary code to be absolutely sure the resulting code is without branches.
int value;
value = (value > 255) ? 255 : value;
In case of floating-point operands, modern floating-point units typically provide FMIN and FMAX instructions which map straight to the C/C++ standard math functions fmin() and fmax(). Alternatively fmin() and fmax() may be translated into a comparison followed by a conditional move. Again, it would be prudent to examine the generated code to make sure it is branchless.
double value;
value = fmax (fmin (value, 1.0), 0.0);
I use this thing, 100% branchless.
int clampU8(int val)
{
val &= (val<0)-1; // clamp < 0
val |= -(val>255); // clamp > 255
return val & 0xFF; // mask out
}
For those using C#, Kotlin or Java this is the best I could do, it's nice and succinct if somewhat cryptic:
(x & ~(x >> 31) | 255 - x >> 31) & 255
It only works on signed integers so that might be a blocker for some.
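Since these bit tricks are easy to get subtly wrong, it can help to check them against the obvious clamp over a range of inputs. The Python sketch below ports three of the expressions above (Python's `>>` on negative numbers is an arithmetic shift, and comparisons yield 0 or 1, so the expressions carry over unchanged for ordinary 32-bit values). The function names are made up for this test harness.

```python
def clamp_reference(v: int) -> int:
    """The obvious min/max clamp to 0..255."""
    return max(0, min(255, v))


def clamp_to_8bit(v: int) -> int:
    """Port of clampTo8Bit: one range test, then a sign-derived 0 or 255."""
    if (v & ~0xFF) != 0:
        v = ((~v) >> 31) & 0xFF
    return v


def clamp_mask(v: int) -> int:
    """Port of clampU8: comparisons become all-zeros or all-ones masks."""
    v &= (v < 0) - 1      # v < 0   -> AND with 0; otherwise AND with -1 (unchanged)
    v |= -(v > 255)       # v > 255 -> OR with -1 (all ones); otherwise OR with 0
    return v & 0xFF       # keep the low 8 bits


def clamp_shift(x: int) -> int:
    """Port of the one-line expression; x >> 31 is -1 for negative 32-bit values, else 0."""
    return (x & ~(x >> 31) | (255 - x) >> 31) & 255


print(clamp_mask(-5), clamp_mask(100), clamp_mask(300))   # 0 100 255
print(all(clamp_shift(v) == clamp_reference(v) for v in range(-1000, 1000)))   # True
```

One caveat for the fixed-width original of `clamp_shift`: for inputs very close to the most negative int, `255 - x` overflows, so that expression is only safe when such extreme values cannot occur.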
For clamping doubles, I'm afraid there's no language/platform agnostic solution.
The problem with floating point is that compilers offer modes ranging from fastest (MSVC /fp:fast, gcc -funsafe-math-optimizations) to fully precise and safe (MSVC /fp:strict, gcc -frounding-math -fsignaling-nans). In fully precise mode the compiler does not try to use any bit hacks, even if it could.
A solution that manipulates the bits of a double cannot be portable. Endianness may differ, there may be no (efficient) way to get at the bits, and double is not necessarily IEEE 754 binary64 after all. In addition, direct bit manipulation will not raise signals for signaling NaNs where they are expected.
For integers most likely the compiler will do it right anyway, otherwise there are already good answers given.

One-to-one integer mapping function

We are using MySQL and developing an application where we'd like the ID sequence not to be publicly visible... the IDs are hardly top secret and there is no significant issue if someone indeed was able to decode them.
So, a hash is of course the obvious solution, and we are currently using MD5: 32-bit integers go in, and we trim the MD5 to 64 bits and then store that. However, we have no idea how likely collisions are when you trim like this (especially since all numbers come from autoincrement or the current time). We currently check for collisions, but since we may be inserting 100,000 rows at once the performance is terrible (can't bulk insert).
But in the end, we really don't need the security offered by the hashes and they consume unnecessary space and also require an additional index... so, is there any simple and good enough function/algorithm out there that guarantees one-to-one mapping for any number without obvious visual patterns for sequential numbers?
EDIT: I'm using PHP which does not support integer arithmetic by default, but after looking around I found that it could be cheaply replicated with bitwise operators. Code for 32bit integer multiplication can be found here: http://pastebin.com/np28xhQF
You could simply XOR with 0xDEADBEEF, if that's good enough.
Alternatively multiply by an odd number mod 2^32. For the inverse mapping just multiply by the multiplicative inverse
Example: n = 2345678901; multiplicative inverse (mod 2^32): 2313902621
For the mapping just multiply by 2345678901 (mod 2^32):
1 --> 2345678901
2 --> 396390506
For the inverse mapping, multiply by 2313902621.
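The numbers in this example can be reproduced with a few lines of Python. This is only a sketch of the multiply-by-an-odd-constant idea; the helper names `encode` and `decode` are made up, and the built-in three-argument `pow` with exponent -1 (available from Python 3.8) is used to compute the modular inverse.

```python
MASK32 = 0xFFFFFFFF
MULTIPLIER = 2345678901                 # odd, so it is invertible mod 2**32
INVERSE = pow(MULTIPLIER, -1, 1 << 32)  # modular inverse of MULTIPLIER mod 2**32


def encode(n: int) -> int:
    """Map a 32-bit value to its scrambled form."""
    return (n * MULTIPLIER) & MASK32


def decode(m: int) -> int:
    """Undo encode() by multiplying with the modular inverse."""
    return (m * INVERSE) & MASK32


print(INVERSE)                   # 2313902621
print(encode(1), encode(2))      # 2345678901 396390506
print(decode(encode(123456)))    # 123456
```

Any odd multiplier works, because odd numbers are exactly the ones that have an inverse modulo a power of two; an even multiplier would map distinct inputs to the same output.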
If you want to ensure a 1:1 mapping then use an encryption (i.e. a permutation), not a hash. Encryption has to be 1:1 because it can be decrypted.
If you want 32 bit numbers then use Hasty Pudding Cypher or just write a simple four round Feistel cypher.
Here's one I prepared earlier:
import java.util.Random;
/**
* IntegerPerm is a reversible keyed permutation of the integers.
* This class is not cryptographically secure as the F function
* is too simple and there are not enough rounds.
*
* @author Martin Ross
*/
public final class IntegerPerm {
//////////////////
// Private Data //
//////////////////
/** Non-zero default key, from www.random.org */
private final static int DEFAULT_KEY = 0x6CFB18E2;
private final static int LOW_16_MASK = 0xFFFF;
private final static int HALF_SHIFT = 16;
private final static int NUM_ROUNDS = 4;
/** Permutation key */
private int mKey;
/** Round key schedule */
private int[] mRoundKeys = new int[NUM_ROUNDS];
//////////////////
// Constructors //
//////////////////
public IntegerPerm() { this(DEFAULT_KEY); }
public IntegerPerm(int key) { setKey(key); }
////////////////////
// Public Methods //
////////////////////
/** Sets a new value for the key and key schedule. */
public void setKey(int newKey) {
assert (NUM_ROUNDS == 4) : "NUM_ROUNDS is not 4";
mKey = newKey;
mRoundKeys[0] = mKey & LOW_16_MASK;
mRoundKeys[1] = ~(mKey & LOW_16_MASK);
mRoundKeys[2] = mKey >>> HALF_SHIFT;
mRoundKeys[3] = ~(mKey >>> HALF_SHIFT);
} // end setKey()
/** Returns the current value of the key. */
public int getKey() { return mKey; }
/**
* Calculates the enciphered (i.e. permuted) value of the given integer
* under the current key.
*
* @param plain the integer to encipher.
*
* @return the enciphered (permuted) value.
*/
public int encipher(int plain) {
// 1 Split into two halves.
int rhs = plain & LOW_16_MASK;
int lhs = plain >>> HALF_SHIFT;
// 2 Do NUM_ROUNDS simple Feistel rounds.
for (int i = 0; i < NUM_ROUNDS; ++i) {
if (i > 0) {
// Swap lhs <-> rhs
final int temp = lhs;
lhs = rhs;
rhs = temp;
} // end if
// Apply Feistel round function F().
rhs ^= F(lhs, i);
} // end for
// 3 Recombine the two halves and return.
return (lhs << HALF_SHIFT) + (rhs & LOW_16_MASK);
} // end encipher()
/**
* Calculates the deciphered (i.e. inverse permuted) value of the given
* integer under the current key.
*
* @param cypher the integer to decipher.
*
* @return the deciphered (inverse permuted) value.
*/
public int decipher(int cypher) {
// 1 Split into two halves.
int rhs = cypher & LOW_16_MASK;
int lhs = cypher >>> HALF_SHIFT;
// 2 Do NUM_ROUNDS simple Feistel rounds.
for (int i = 0; i < NUM_ROUNDS; ++i) {
if (i > 0) {
// Swap lhs <-> rhs
final int temp = lhs;
lhs = rhs;
rhs = temp;
} // end if
// Apply Feistel round function F().
rhs ^= F(lhs, NUM_ROUNDS - 1 - i);
} // end for
// 3 Recombine the two halves and return.
return (lhs << HALF_SHIFT) + (rhs & LOW_16_MASK);
} // end decipher()
/////////////////////
// Private Methods //
/////////////////////
// The F function for the Feistel rounds.
private int F(int num, int round) {
// XOR with round key.
num ^= mRoundKeys[round];
// Square, then XOR the high and low parts.
num *= num;
return (num >>> HALF_SHIFT) ^ (num & LOW_16_MASK);
} // end F()
} // end class IntegerPerm
Do what Henrik said in his second suggestion. But since these values seem to be used by people (else you wouldn't want to randomize them), take one additional step. Multiply the sequential number by a large prime and reduce mod N, where N is a power of 2. Choose N a few bits smaller than what you can store, so that there is headroom for the next step. Next, multiply the result by 11 and use that. So we have:
Hash = ((count * large_prime) % 536870912) * 11
The multiplication by 11 protects against most data entry errors: if any single digit is typed wrong, the result will not be a multiple of 11, and if two adjacent digits are transposed, the result will not be a multiple of 11 either. So as a preliminary check of any value entered, you check whether it is divisible by 11 before even looking in the database.
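The error-detection claim is easy to test by brute force. The Python sketch below builds a code with the formula above and then tries every single-digit typo and every swap of two neighbouring digits. `LARGE_PRIME` is an example odd multiplier picked for this illustration, not a value from the answer, and the helper names are made up.

```python
LARGE_PRIME = 1_000_003   # example multiplier; any odd value gives a one-to-one map mod 2**29
N = 536870912             # 2**29, as in the formula above


def make_code(count: int) -> int:
    """Scramble the counter, then make the result a multiple of 11."""
    return ((count * LARGE_PRIME) % N) * 11


def looks_valid(code: int) -> bool:
    """Cheap pre-check before touching the database."""
    return code % 11 == 0


def single_digit_typos(code: int):
    """Every number obtained by changing exactly one decimal digit of code."""
    s = str(code)
    for i, ch in enumerate(s):
        for d in "0123456789":
            if d != ch:
                yield int(s[:i] + d + s[i + 1:])


def adjacent_swaps(code: int):
    """Every number obtained by swapping two different neighbouring digits."""
    s = str(code)
    for i in range(len(s) - 1):
        if s[i] != s[i + 1]:
            yield int(s[:i] + s[i + 1] + s[i] + s[i + 2:])


code = make_code(12345)
print(looks_valid(code))                                       # True
print(any(looks_valid(t) for t in single_digit_typos(code)))   # False
print(any(looks_valid(t) for t in adjacent_swaps(code)))       # False
```

Swaps of digits that are not neighbours are a different matter: when the two positions are an even distance apart, the swapped value stays a multiple of 11, so the check does not catch every possible transposition.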
You can use a mod operation with big prime numbers:
your number * big prime number 1 mod big prime number 2.
Prime number 1 should be bigger than prime number 2. Prime number 2 should be close to 2^32 but less than it. Then the mapping will be hard to guess.
Prime 1 and Prime 2 should be constants.
For our application, we use bit shuffle to generate the ID. It is very easy to reverse back to the original ID.
func (m Meeting) MeetingCode() uint {
hashed := (m.ID + 10000000) & 0x00FFFFFF
chunks := [24]uint{}
for i := 0; i < 24; i++ {
chunks[i] = hashed >> i & 0x1
}
shuffle := [24]uint{14, 1, 15, 21, 0, 6, 5, 10, 4, 3, 20, 22, 2, 23, 8, 13, 19, 9, 18, 12, 7, 11, 16, 17}
result := uint(0)
for i := 0; i < 24; i++ {
result = result | (chunks[shuffle[i]] << i)
}
return result
}
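The reverse direction is not shown above, so here is a sketch of how it could look, written in Python as a port of `MeetingCode` plus a hypothetical inverse. The function names `meeting_code` and `meeting_id_from_code` are made up; the shuffle table, the offset and the 24-bit mask are copied from the Go code.

```python
SHUFFLE = [14, 1, 15, 21, 0, 6, 5, 10, 4, 3, 20, 22,
           2, 23, 8, 13, 19, 9, 18, 12, 7, 11, 16, 17]
OFFSET = 10_000_000
MASK24 = 0x00FFFFFF


def meeting_code(meeting_id: int) -> int:
    """Port of MeetingCode: output bit i is input bit SHUFFLE[i]."""
    hashed = (meeting_id + OFFSET) & MASK24
    result = 0
    for i in range(24):
        result |= ((hashed >> SHUFFLE[i]) & 1) << i
    return result


def meeting_id_from_code(code: int) -> int:
    """Inverse: move output bit i back to position SHUFFLE[i], then undo the offset mod 2**24."""
    hashed = 0
    for i in range(24):
        hashed |= ((code >> i) & 1) << SHUFFLE[i]
    return (hashed - OFFSET) & MASK24


print(meeting_id_from_code(meeting_code(42)))   # 42
```

The round trip works because the table is a permutation of 0 to 23. It only recovers IDs below 2^24, since higher bits are discarded by the mask.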
There is an exceedingly simple solution that nobody has posted. Even though an answer has been selected, I highly advise anyone visiting this question to consider the nature of binary representations and the application of modular arithmetic.
Given a finite range of integers, the values can be permuted through a simple addition over their index, kept within the range of the index by a modulus. You could even leverage simple integer overflow so that using the modulo operator is not necessary.
Essentially, you'd have a static variable in memory, and a function that, when called, increments the static variable by some constant, enforces the boundaries, and then returns the value. This output could be an index into a collection of desired outputs, or the desired output itself.
The increment constant that defines the mapping may be several times the size in memory of the value being returned, but given any mapping there exists some finite constant that will achieve the mapping through trivial modular arithmetic.

Resources