Triple Modular Multiplication - math

I am calculating the following sum:
(a[x] + m*a[x-1] + 2*m*a[x-2] + 3*m*a[x-3] + ...) % MOD (MOD = 1e9+7)
For this, I am using this loop.
long long mulmod(long long a,long long b,long long c)
{
    if (a == 0 || b == 0)
        return 0;
    if (a == 1)
        return b;
    if (b == 1)
        return a;
    long long a2 = mulmod(a, b / 2, c);
    if ((b & 1) == 0)
    {
        return (a2 + a2) % c;
    }
    else
    {
        return ((a % c) + (a2 + a2)) % c;
    }
}
res=a[x]%MOD;
for(i=x-1;i>=1;i--)
    res=(res%MOD+mulmod(mulmod(x-i,m,MOD),a[i],MOD))%MOD;
However, this is still giving me overflow errors. The basic error, I guess, is in (a*b*c)%MOD.
Thank you.

You need to incorporate the following modular arithmetic identities into your program to avoid overflow:
(A + B + ...) mod C = (A mod C + B mod C + ... mod C) mod C
and
(A * B * ...) mod C = (A mod C * B mod C * ... mod C) mod C
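As a minimal sketch of how these identities apply to the sum in the question: because MOD = 1e9+7 is below 2^30, the product of two already-reduced values stays below 2^60 and fits in a 64-bit integer, so reducing every factor before each multiplication is enough. The function name weighted_sum and the 1-based array layout are assumptions made for illustration, not the asker's code.

```c
#include <stdint.h>

#define MOD 1000000007ULL

/* Computes (a[x] + 1*m*a[x-1] + 2*m*a[x-2] + ... + (x-1)*m*a[1]) % MOD.
   Assumes a[] is 1-based (a[0] is unused) and holds non-negative values. */
uint64_t weighted_sum(const uint64_t *a, int x, uint64_t m)
{
    uint64_t res = a[x] % MOD;
    uint64_t mm = m % MOD;                    /* reduce m once up front */
    for (int i = x - 1; i >= 1; i--) {
        uint64_t k = (uint64_t)(x - i) % MOD; /* the multiplier 1, 2, 3, ... */
        uint64_t term = (k * mm) % MOD;       /* both factors < MOD, product < 2^60 */
        term = (term * (a[i] % MOD)) % MOD;   /* reduce a[i] before multiplying */
        res = (res + term) % MOD;
    }
    return res;
}
```

Every intermediate value here is either below MOD or the product of two values below MOD, so nothing can exceed the 64-bit range regardless of how large a[i] and m are.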

Related

Modular inverses and unsigned integers

Modular inverses can be computed as follows (from Rosetta Code):
#include <stdio.h>
int mul_inv(int a, int b)
{
    int b0 = b, t, q;
    int x0 = 0, x1 = 1;
    if (b == 1) return 1;
    while (a > 1) {
        q = a / b;
        t = b, b = a % b, a = t;
        t = x0, x0 = x1 - q * x0, x1 = t;
    }
    if (x1 < 0) x1 += b0;
    return x1;
}
However, the inputs are ints, as you can see. Would the above code work for unsigned integers (e.g. uint64_t) as well? That is, would it be OK to replace every int with uint64_t? I could try a few inputs, but it is not feasible to try all 64-bit combinations.
I'm specifically interested in two aspects:
- For values of a and b in [0, 2^64), would every calculation avoid overflow/underflow (or overflow harmlessly)?
- What would (x1 < 0) look like in the unsigned case?
First of all, how does this algorithm work? It is based on the Extended Euclidean algorithm for computing the GCD. In short, the idea is the following: if we can find integer coefficients m and n such that
a*m + b*n = 1
then m will be the answer for the modular inverse problem. It is easy to see because
a*m + b*n = a*m (mod b)
Luckily, the Extended Euclidean algorithm does exactly that: if a and b are co-prime, it finds such m and n. It works as follows: on each iteration it tracks two triplets (a_i, xa_i, ya_i) and (b_i, xb_i, yb_i) such that at every step
a_i = a_0*xa_i + b_0*ya_i
b_i = a_0*xb_i + b_0*yb_i
so when the algorithm finally stops at the state a_i = 0 and b_i = GCD(a_0, b_0), then
1 = GCD(a_0, b_0) = a_0*xb_i + b_0*yb_i
This is done using a more explicit way to calculate the remainder: if
q = a / b
r = a % b
then
r = a - q * b
Another important point is that it can be proven that for positive a and b, at every step |xa_i|, |xb_i| <= b and |ya_i|, |yb_i| <= a. This means there can be no overflow while calculating those coefficients. Unfortunately, negative values are possible; moreover, on every step after the first, one coefficient in each equation is positive and the other is negative.
The code in your question is a reduced version of the same algorithm: since all we are interested in is the xa and xb coefficients, it tracks only those and ignores ya and yb. The simplest way to make that code work for uint64_t is to track the sign explicitly in a separate field, like this:
#include <stdbool.h>
#include <stdint.h>

typedef struct tag_uint64AndSign {
    uint64_t value;
    bool isNegative;
} uint64AndSign;

uint64_t mul_inv(uint64_t a, uint64_t b)
{
    if (b <= 1)
        return 0;
    uint64_t b0 = b;
    uint64AndSign x0 = { 0, false }; // b = 1*b + 0*a
    uint64AndSign x1 = { 1, false }; // a = 0*b + 1*a
    while (a > 1)
    {
        if (b == 0) // means original A and B were not co-prime so there is no answer
            return 0;
        uint64_t q = a / b;
        // (b, a) := (a % b, b)
        // which is the same as
        // (b, a) := (a - q * b, b)
        uint64_t t = b; b = a % b; a = t;
        // (x0, x1) := (x1 - q * x0, x0)
        uint64AndSign t2 = x0;
        uint64_t qx0 = q * x0.value;
        if (x0.isNegative != x1.isNegative)
        {
            // opposite signs: magnitudes add, sign follows x1
            x0.value = x1.value + qx0;
            x0.isNegative = x1.isNegative;
        }
        else
        {
            // same sign: subtract the smaller magnitude from the larger
            x0.value = (x1.value > qx0) ? x1.value - qx0 : qx0 - x1.value;
            x0.isNegative = (x1.value > qx0) ? x1.isNegative : !x0.isNegative;
        }
        x1 = t2;
    }
    return x1.isNegative ? (b0 - x1.value) : x1.value;
}
Note that if a and b are not co-prime or when b is 0 or 1, this problem has no solution. In all those cases my code returns 0 which is an impossible value for any real solution.
Note also that although the calculated value is really the modular inverse, simple multiplication will not always produce 1 because of the overflow at multiplication over uint64_t. For example for a = 688231346938900684 and b = 2499104367272547425 the result is inv = 1080632715106266389
a * inv = 688231346938900684 * 1080632715106266389 =
= 743725309063827045302080239318310076 =
= 2499104367272547425 * 297596738576991899 + 1 =
= b * 297596738576991899 + 1
But if you do a naive multiplication of those a and inv of type uint64_t, you'll get 4042520075082636476 so (a*inv)%b will be 1543415707810089051 rather than expected 1.
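One way to check such a result without a 128-bit type is to multiply by repeated doubling and reduce after every addition. The following is only a sketch under that assumption; the helper names addmod_u64 and mulmod_u64 are hypothetical and not part of the code above. It works for any modulus m > 0 up to 2^64 - 1 because no intermediate value ever reaches m.

```c
#include <stdint.h>

/* (a + b) mod m for a, b < m, written so the sum never wraps past 2^64. */
static uint64_t addmod_u64(uint64_t a, uint64_t b, uint64_t m)
{
    return (a >= m - b) ? a - (m - b) : a + b;
}

/* (a * b) mod m by binary "double and add"; requires m > 0. */
uint64_t mulmod_u64(uint64_t a, uint64_t b, uint64_t m)
{
    uint64_t result = 0;
    a %= m;
    b %= m;
    while (b != 0) {
        if (b & 1)
            result = addmod_u64(result, a, m); /* add the current multiple of a */
        a = addmod_u64(a, a, m);               /* double a modulo m */
        b >>= 1;
    }
    return result;
}
```

Used in place of (a * inv) % b, mulmod_u64(a, inv, b) performs the check that the long-hand arithmetic above carries out by hand.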
The mod_inv C function:
- returns the modular multiplicative inverse of n with respect to the modulus mod
- returns 0 if the linear congruence has no solution
unsigned mod_inv(unsigned n, const unsigned mod) {
    unsigned a = mod, b = a, c = 0, d = 0, e = 1, f, g;
    for (n *= a > 1; n > 1 && (n *= a > 0); e = g, c = (c & 3) | (c & 1) << 2) {
        g = d, d *= n / (f = a);
        a = n % a, n = f;
        c = (c & 6) | (c & 2) >> 1;
        f = c > 1 && c < 6;
        c = (c & 5) | (f || e > d ? (c & 4) >> 1 : ~c & 2);
        d = f ? d + e : e > d ? e - d : d - e;
    }
    return n ? c & 4 ? b - e : e : 0;
}
Examples
n = 7 and mod = 45 then res = 13 so 1 == ( 13 * 7 ) % 45
n = 52 and mod = 107 then res = 35 so 1 == ( 35 * 52 ) % 107
n = 213 and mod = 155 then res = 147 so 1 == ( 147 * 213 ) % 155
n = 392 and mod = 45 then res = 38 so 1 == ( 38 * 392 ) % 45
n = 3708141711 and mod = 4280761040 it still works...

finding time complexity of backtracking solution for generate all subset problem

Given a set of distinct integers, generate all subsets.
https://www.interviewbit.com/problems/subset/
I have found two solutions.
First solution:
void helper_subsets(vector<vector<int>> &res , vector<int> &A ,
                    vector<int> &subset ,int current)
{
    if(current == A.size())
        res.push_back(subset) ;
    else
    {
        helper_subsets(res,A,subset,current+1) ;
        subset.push_back(A[current]) ;
        helper_subsets(res,A,subset,current+1) ;
        subset.pop_back() ;
    }
}
vector<vector<int> >subsets(vector<int> &A) {
    vector<vector<int>> res ;
    sort(A.begin(),A.end()) ;
    vector<int> subset ;
    helper_subsets(res , A , subset , 0 ) ;
    sort(res.begin(),res.end()) ;
    return res ;
}
Second solution:
void helper_subsets(vector<vector<int>> &res , vector<int> &A ,
                    vector<int> &subset ,int current)
{
    res.push_back(subset) ;
    for(int i = current ; i < A.size() ; i++)
    {
        subset.push_back(A[i]) ;
        helper_subsets(res,A,subset,i+1) ;
        subset.pop_back() ;
    }
}
vector<vector<int> > subsets(vector<int> &A) {
    vector<vector<int>> res ;
    sort(A.begin(),A.end()) ;
    vector<int> subset ;
    helper_subsets(res , A , subset , 0 ) ;
    sort(res.begin(),res.end()) ;
    return res ;
}
I am able to calculate the time complexity of the first solution both mathematically and with a recursion tree:
t(n) = 2t(n-1) + c (i.e 2 recursive calls with size n-1 and some constant time for each n)
t(n) = O(2^n) by solving the above recurrence relation.
But with the second solution, I am not able to define a recurrence relation that I can solve by back substitution, and I could not get the complexity with the recursion tree method either. Please help me find the time complexity of the second solution.
The analogous recurrence relation for the second solution is:
T(n) = Σ_{i=1}^{n-1} T(n - i) + c
This follows from the for-loop from current to A.size(). To solve it, expand the first term, T(n - 1), using the same recurrence:
T(n) = T(n - 1) + [T(n - 2) + T(n - 3) + ... + T(1) + c]
     = [T(n - 2) + T(n - 3) + ... + T(1) + c] + [T(n - 2) + T(n - 3) + ... + T(1) + c]
     = 2 * [T(n - 2) + T(n - 3) + ... + T(1) + c]
     = 2 * T(n - 1)
i.e., a very similar recurrence relation differing only by a constant. It still evaluates to O(2^n), taking the base case to be T(1) = O(1).
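To sanity-check the 2^n result empirically, the call pattern of the second solution can be mirrored by a small counter. This is only a sketch with a hypothetical name (count_calls); it models the number of helper_subsets invocations and ignores the cost of copying each subset into res.

```c
/* Number of helper_subsets invocations when `remaining` elements are left,
   i.e. remaining = A.size() - current in the second solution. */
unsigned long long count_calls(int remaining)
{
    unsigned long long calls = 1;                 /* the current invocation */
    for (int i = 0; i < remaining; i++)           /* mirrors i = current .. A.size()-1 */
        calls += count_calls(remaining - 1 - i);  /* the recursive call made with i + 1 */
    return calls;
}
```

For small inputs the count doubles each time remaining grows by one, which agrees with T(n) = 2 * T(n - 1).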

Division with numerator 2^64

How to divide the constant 2^64 (i.e. ULLONG_MAX + 1) by uint64 larger than 2, without using uint128?
In other words, given x such as 2 <= x <= 2^64-1, how to obtain the quotient 2^64 / x, using just uint64?
The problem is that I cannot represent 2^64, let alone to divide it so I was hoping there is a trick which would simulate the result.
How to divide the constant 2^64 (i.e. ULLONG_MAX + 1) by uint64 larger than 2
a/b --> (a-b)/b + 1
First subtract x from (max_value + 1), then divide by x, then add 1.
// C solution:
uint64_t foo(uint64_t x) {
    return (0u - x)/x + 1; // max_value + 1 is 0 and unsigned subtraction wraps around.
}
Of course division by 0 is a no-no. Code works for x >= 2, but not x == 1 as the quotient is also not representable.
Take ULLONG_MAX / denom, and add 1 if denom is a power of 2. In pseudocode:
if (denom == 0) {
    throw ZeroDivisionException;
} else if (denom == 1) {
    throw OverflowException;
} else {
    return ULLONG_MAX / denom + ((denom & (denom - 1)) == 0);
}
Alternatively, take ULLONG_MAX / denom for odd denom, and take 2^63 / (denom / 2) for even denom:
if (denom == 0) {
    throw ZeroDivisionException;
} else if (denom == 1) {
    throw OverflowException;
} else if (denom & 1) {
    return ULLONG_MAX / denom;
} else {
    return (1ULL << 63) / (denom >> 1);
}
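The three approaches described in the answers can be written side by side in C and cross-checked against each other. This is a sketch only; the function names div_wrap, div_pow2_fix and div_parity are made up for illustration, and all three assume x >= 2 (x == 0 divides by zero, and for x == 1 the quotient is not representable).

```c
#include <stdint.h>

/* (2^64 - x) / x + 1, relying on unsigned wraparound to form 2^64 - x. */
uint64_t div_wrap(uint64_t x)
{
    return (UINT64_C(0) - x) / x + 1;
}

/* floor((2^64 - 1) / x), plus 1 exactly when x divides 2^64,
   which happens only when x is a power of two. */
uint64_t div_pow2_fix(uint64_t x)
{
    return UINT64_MAX / x + ((x & (x - 1)) == 0);
}

/* Odd x never divides 2^64, so (2^64 - 1) / x is already the answer;
   for even x, 2^64 / x equals 2^63 / (x / 2). */
uint64_t div_parity(uint64_t x)
{
    if (x & 1)
        return UINT64_MAX / x;
    return (UINT64_C(1) << 63) / (x >> 1);
}
```

Note the extra parentheses around (x & (x - 1)): in C, == binds tighter than &, so the power-of-two test has to be parenthesized explicitly.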

3d line-intersection code not working properly

I created this piece of code to get the intersection of two 3d line-segments.
Unfortunately the result of this code is inaccurate, the intersection-point is not always on both lines.
I am confused and unsure what I'm doing wrong.
Here is my code:
--dir = direction
--p1,p2 = represents the line
function GetIntersection(dirStart, dirEnd, p1, p2)
    local s1_x, s1_y, s2_x, s2_y = dirEnd.x - dirStart.x, dirEnd.z - dirStart.z, p2.x - p1.x, p2.z - p1.z
    local div = (-s2_x * s1_y) + (s1_x * s2_y)
    if div == 0 then return nil end
    local s = (-s1_y * (dirStart.x - p1.x) + s1_x * (dirStart.z - p1.z)) / div
    local t = ( s2_x * (dirStart.z - p1.z) - s2_y * (dirStart.x - p1.x)) / div
    if (s >= 0 and s <= 1 and t >= 0 and t <= 1) and (Vector(dirStart.x + (t * s1_x), 0, dirStart.z + (t * s1_y)) or nil) then
        local v = Vector(dirStart.x + (t * s1_x),0,dirStart.z + (t * s1_y))
        return v
    end
end
This is an example of Delphi code to find the distance between two skew lines in 3D. For your purposes, check that the result is small enough (an intersection exists), check that the s and t parameters are in the range 0..1, then calculate the point using parameter s.
The math of this approach is described in the 'the shortest line...' section of Paul Bourke's page.
VecDiff is the vector difference function, Dot is the scalar product function.
function LineLineDistance(const L0, L1: TLine3D; var s, t: Double): Double;
var
  u: TPoint3D;
  a, b, c, d, e, det, invdet:Double;
begin
  u := VecDiff(L1.Base, L0.Base);
  a := Dot(L0.Direction, L0.Direction);
  b := Dot(L0.Direction, L1.Direction);
  c := Dot(L1.Direction, L1.Direction);
  d := Dot(L0.Direction, u);
  e := Dot(L1.Direction, u);
  det := a * c - b * b;
  if det < eps then
    Result := -1
  else begin
    invdet := 1 / det;
    s := invdet * (b * e - c * d);
    t := invdet * (a * e - b * d);
    Result := Distance(PointAtParam(L0, s), PointAtParam(L1, t));
  end;
end;
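For readers without Delphi, the same closest-approach computation can be sketched in C. This is not a line-by-line port: the type Vec3 and the helper names are hypothetical, the offset vector is assumed to be base0 - base1 (which is what makes the s and t formulas minimize the distance), the tolerance eps is an arbitrary choice, and the function returns the squared distance so that no square root is needed.

```c
typedef struct { double x, y, z; } Vec3;

static double dot3(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

static Vec3 sub3(Vec3 a, Vec3 b)
{
    Vec3 r = { a.x - b.x, a.y - b.y, a.z - b.z };
    return r;
}

static Vec3 at_param(Vec3 base, Vec3 dir, double k)
{
    Vec3 r = { base.x + k * dir.x, base.y + k * dir.y, base.z + k * dir.z };
    return r;
}

/* Lines P(s) = base0 + s*dir0 and Q(t) = base1 + t*dir1.
   Stores the parameters of the closest points in *s and *t and returns the
   squared distance between them, or -1.0 if the lines are (nearly) parallel. */
double closest_approach_sq(Vec3 base0, Vec3 dir0, Vec3 base1, Vec3 dir1,
                           double *s, double *t)
{
    const double eps = 1e-12;              /* assumed tolerance */
    Vec3 w = sub3(base0, base1);           /* offset between the base points */
    double a = dot3(dir0, dir0), b = dot3(dir0, dir1), c = dot3(dir1, dir1);
    double d = dot3(dir0, w), e = dot3(dir1, w);
    double det = a * c - b * b;            /* >= 0 by the Cauchy-Schwarz inequality */
    if (det < eps)
        return -1.0;
    *s = (b * e - c * d) / det;
    *t = (a * e - b * d) / det;
    Vec3 diff = sub3(at_param(base0, dir0, *s), at_param(base1, dir1, *t));
    return dot3(diff, diff);
}
```

Two segments then intersect when the returned value is below a small threshold and both s and t lie in 0..1, with the direction vectors taken as end minus start.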
As far as I can tell your code is good. I've implemented this in JavaScript at https://jsfiddle.net/SalixAlba/kkrc9kcf/
and it seems to work for all the cases I can think of.
The only change I've made is to port it from Lua to JavaScript. The final condition was commented out.
function GetIntersection(dirStart, dirEnd, p1, p2) {
    var s1_x = dirEnd.x - dirStart.x;
    var s1_y = dirEnd.z - dirStart.z;
    var s2_x = p2.x - p1.x;
    var s2_y = p2.z - p1.z;
    var div = (-s2_x * s1_y) + (s1_x * s2_y);
    if (div == 0)
        return new Vector(0,0);
    var s = (-s1_y * (dirStart.x - p1.x) + s1_x * (dirStart.z - p1.z)) / div;
    var t = ( s2_x * (dirStart.z - p1.z) - s2_y * (dirStart.x - p1.x)) / div;
    if (s >= 0 && s <= 1 && t >= 0 && t <= 1) {
        //and (Vector(dirStart.x + (t * s1_x), 0, dirStart.z + (t * s1_y)) or nil) then
        var v = new Vector(
            dirStart.x + (t * s1_x),
            dirStart.z + (t * s1_y));
        return v;
    }
    return new Vector(0,0);
}
Mathematically it makes sense. Suppose A,B and C,D are your two lines, and let s1 = B-A, s2 = D-C. A point on the line AB is given by A + t s1 and a point on the line CD is given by C + s s2. For an intersection we require
A + t s1 = C + s s2
or
(A-C) + t s1 = s s2
Your two formulas for s and t are found by taking the 2D cross product with each of the vectors s1 and s2
(A-C)^s1 + t s1^s1 = s s2^s1
(A-C)^s2 + t s1^s2 = s s2^s2
recalling s1^s1=s2^s2=0 and s2^s1= - s1^s2 we get
(A-C)^s1 = s s2^s1
(A-C)^s2 + t s1^s2 = 0
which can be solved to get s and t. This matches your equations.
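The two solved equations translate directly into code. The following C sketch is an illustration only (Vec2, cross2 and segment_intersection are hypothetical names, and it works in a plain 2D plane rather than the x/z plane of the original Lua); unlike the JavaScript port it reports failure through its return value instead of returning a zero vector.

```c
#include <stdbool.h>

typedef struct { double x, y; } Vec2;

/* 2D cross product u ^ v. */
static double cross2(Vec2 u, Vec2 v) { return u.x * v.y - u.y * v.x; }

/* Intersects segment AB with segment CD. On success stores the point in *out
   and returns true; returns false for parallel or non-overlapping segments. */
bool segment_intersection(Vec2 A, Vec2 B, Vec2 C, Vec2 D, Vec2 *out)
{
    Vec2 s1 = { B.x - A.x, B.y - A.y };
    Vec2 s2 = { D.x - C.x, D.y - C.y };
    Vec2 ac = { A.x - C.x, A.y - C.y };
    double div = cross2(s1, s2);
    if (div == 0.0)
        return false;                     /* parallel or collinear */
    double s = cross2(s1, ac) / div;      /* from (A-C)^s1 = s s2^s1 */
    double t = cross2(s2, ac) / div;      /* from (A-C)^s2 + t s1^s2 = 0 */
    if (s < 0.0 || s > 1.0 || t < 0.0 || t > 1.0)
        return false;                     /* the lines cross outside the segments */
    out->x = A.x + t * s1.x;
    out->y = A.y + t * s1.y;
    return true;
}
```

The exact comparison div == 0.0 follows the original code; with noisy floating-point input a small tolerance may be a better choice.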

Is there an algorithm known for power towers modulo a number managing all cases?

I would like to have an implementation in PARI/GP
for the calculation of
a_1 ^ a_2 ^ ... ^ a_n (mod m)
which manages all cases, especially the cases where high powers appear in the phi-chain.
Does anyone know of such an implementation?
Here's a possibility using Chinese remainders to make sure the modulus is a prime power. This simplifies the computation of x^n mod m in the painful case where gcd(x,m) is not 1. The code assumes the a_i are > 1; most of the code checks whether p^a_1^a_2^...^a_n is 0 mod (p^e) for a prime number p, while avoiding overflow.
\\ x[1]^x[2]^ ...^ x[#x] mod m, assuming x[i] > 1 for all i
tower(x, m) =
{ my(f = factor(m), P = f[,1], E = f[,2]);
  chinese(vector(#P, i, towerp(x, P[i], E[i])));
}
towerp(x, p, e) =
{ my(q = p^e, i, t, v);
  if (#x == 0, return (Mod(1, q)));
  if (#x == 1, return (Mod(x[1], q)));
  if (v = valuation(x[1], p),
    \\ t starts as the top of the tower, so the loop begins one level below it
    t = x[#x]; i = #x - 1;
    while (i > 1,
      if (t >= e, return (Mod(0, q)));
      t = x[i]^t; i--);
    if (t * v >= e, return (Mod(0, q)));
    return (Mod(x[1], q)^t);
  );
  Mod(x[1], q)^lift(tower(x[^1], (p-1)*p^e));
}
For instance
? 5^(4^(3^2)) % 163 \\ direct computation, wouldn't scale
%1 = 158
? tower([5,4,3,2], 163)
%2 = Mod(158, 163)

Resources