Implementing FFT over finite fields - math

I would like to implement multiplication of polynomials using NTT. I followed Number-theoretic transform (integer DFT) and it seems to work.
Now I would like to implement multiplication of polynomials over finite fields Z_p[x] where p is arbitrary prime number.
Does it changes anything that the coefficients are now bounded by p, compared to the former unbounded case?
In particular, original NTT required to find prime number N as the working modulus that is larger than (magnitude of largest element of input vector)^2 * (length of input vector) + 1 so that the result never overflows. If the result is going to be bounded by that p prime anyway, how small can the modulus be? Note that p - 1 does not have to be of form (some positive integer) * (length of input vector).
Edit: I copy-pasted the source from the link above to illustrate the problem:
#
# Number-theoretic transform library (Python 2, 3)
#
# Copyright (c) 2017 Project Nayuki
# All rights reserved. Contact Nayuki for licensing.
# https://www.nayuki.io/page/number-theoretic-transform-integer-dft
#
import itertools, numbers
def find_params_and_transform(invec, minmod):
check_int(minmod)
mod = find_modulus(len(invec), minmod)
root = find_primitive_root(len(invec), mod - 1, mod)
return (transform(invec, root, mod), root, mod)
def check_int(n):
if not isinstance(n, numbers.Integral):
raise TypeError()
def find_modulus(veclen, minimum):
check_int(veclen)
check_int(minimum)
if veclen < 1 or minimum < 1:
raise ValueError()
start = (minimum - 1 + veclen - 1) // veclen
for i in itertools.count(max(start, 1)):
n = i * veclen + 1
assert n >= minimum
if is_prime(n):
return n
def is_prime(n):
check_int(n)
if n <= 1:
raise ValueError()
return all((n % i != 0) for i in range(2, sqrt(n) + 1))
def sqrt(n):
check_int(n)
if n < 0:
raise ValueError()
i = 1
while i * i <= n:
i *= 2
result = 0
while i > 0:
if (result + i)**2 <= n:
result += i
i //= 2
return result
def find_primitive_root(degree, totient, mod):
check_int(degree)
check_int(totient)
check_int(mod)
if not (1 <= degree <= totient < mod):
raise ValueError()
if totient % degree != 0:
raise ValueError()
gen = find_generator(totient, mod)
root = pow(gen, totient // degree, mod)
assert 0 <= root < mod
return root
def find_generator(totient, mod):
check_int(totient)
check_int(mod)
if not (1 <= totient < mod):
raise ValueError()
for i in range(1, mod):
if is_generator(i, totient, mod):
return i
raise ValueError("No generator exists")
def is_generator(val, totient, mod):
check_int(val)
check_int(totient)
check_int(mod)
if not (0 <= val < mod):
raise ValueError()
if not (1 <= totient < mod):
raise ValueError()
pf = unique_prime_factors(totient)
return pow(val, totient, mod) == 1 and all((pow(val, totient // p, mod) != 1) for p in pf)
def unique_prime_factors(n):
check_int(n)
if n < 1:
raise ValueError()
result = []
i = 2
end = sqrt(n)
while i <= end:
if n % i == 0:
n //= i
result.append(i)
while n % i == 0:
n //= i
end = sqrt(n)
i += 1
if n > 1:
result.append(n)
return result
def transform(invec, root, mod):
check_int(root)
check_int(mod)
if len(invec) >= mod:
raise ValueError()
if not all((0 <= val < mod) for val in invec):
raise ValueError()
if not (1 <= root < mod):
raise ValueError()
outvec = []
for i in range(len(invec)):
temp = 0
for (j, val) in enumerate(invec):
temp += val * pow(root, i * j, mod)
temp %= mod
outvec.append(temp)
return outvec
def inverse_transform(invec, root, mod):
outvec = transform(invec, reciprocal(root, mod), mod)
scaler = reciprocal(len(invec), mod)
return [(val * scaler % mod) for val in outvec]
def reciprocal(n, mod):
check_int(n)
check_int(mod)
if not (0 <= n < mod):
raise ValueError()
x, y = mod, n
a, b = 0, 1
while y != 0:
a, b = b, a - x // y * b
x, y = y, x % y
if x == 1:
return a % mod
else:
raise ValueError("Reciprocal does not exist")
def circular_convolve(vec0, vec1):
if not (0 < len(vec0) == len(vec1)):
raise ValueError()
if any((val < 0) for val in itertools.chain(vec0, vec1)):
raise ValueError()
maxval = max(val for val in itertools.chain(vec0, vec1))
minmod = maxval**2 * len(vec0) + 1
temp0, root, mod = find_params_and_transform(vec0, minmod)
temp1 = transform(vec1, root, mod)
temp2 = [(x * y % mod) for (x, y) in zip(temp0, temp1)]
return inverse_transform(temp2, root, mod)
vec0 = [24, 12, 28, 8, 0, 0, 0, 0]
vec1 = [4, 26, 29, 23, 0, 0, 0, 0]
print(circular_convolve(vec0, vec1))
def modulo(vec, prime):
return [x % prime for x in vec]
print(modulo(circular_convolve(vec0, vec1), 31))
Prints:
[96, 672, 1120, 1660, 1296, 876, 184, 0]
[3, 21, 4, 17, 25, 8, 29, 0]
However, where I change minmod = maxval**2 * len(vec0) + 1 to minmod = maxval + 1, it stops working:
[14, 16, 13, 20, 25, 15, 20, 0]
[14, 16, 13, 20, 25, 15, 20, 0]
What is the smallest minmod (N in the link above) be in order to work as expected?

If your input of n integers is bound to some prime q (any mod q not just prime will be the same) You can use it as a max value +1 but beware you can not use it as a prime p for the NTT because NTT prime p has special properties. All of them are here:
Translation from Complex-FFT to Finite-Field-FFT
so our max value of each input is q-1 but during your task computation (Convolution on 2 NTT results) the magnitude of first layer results can rise up to n.(q-1) but as we are doing convolution on them the input magnitude of final iNTT will rise up to:
m = n.((q-1)^2)
If you are doing different operations on the NTTs than the m equation might change.
Now let us get back to the p so in a nutshell you can use any prime p that upholds these:
p mod n == 1
p > m
and there exist 1 <= r,L < p such that:
p mod (L-1) = 0
r^(L*i) mod p == 1 // i = { 0,n }
r^(L*i) mod p != 1 // i = { 1,2,3, ... n-1 }
If all this is satisfied then p is nth root of unity and can be used for NTT. To find such prime and also the r,L look at the link above (there is C++ code that finds such).
For example during string multiplication we take 2 strings do NTT on them then convolute the result and iNTT back the result (that is sum of both input sizes). So for example:
99999999999999999999999999999999
*99999999999999999999999999999999
----------------------------------------------------------------
9999999999999999999999999999999800000000000000000000000000000001
the q = 10 and both operands are 9^32 so n=32 hence m = 9*9*32 = 2592 and the found prime is p = 2689. As you can see the result matches so no overflow occurs. However if I use any smaller prime that still fit all the other conditions the result will not match. I used this specifically to stretch the NTT values as much as possible (all values are q-1 and sizes are equal to the same power of 2)
In case your NTT is fast and n is not a power of 2 then you need to zero pad to nearest higher or equal power of 2 size for each NTT. But that should not affect the m value as zero pad should not increase the magnitude of values. My testing proves it so for convolution you can use:
m = (n1+n2).((q-1)^2)/2
where n1,n2 are the raw inputs sizes before zeropad.
For more info about implementing NTT you can check out mine in C++ (extensively optimized):
Modular arithmetics and NTT (finite field DFT) optimizations
So to answer your questions:
yes you can take advantage of the fact that input is mod q but you can not use q as p !!!
You can use minmod = n * (maxval + 1) only for single NTT (or first layer of NTTs) but as you are chaining them with convolution during your NTT usage you can not use that for the final INTT stage !!!
However as I mentioned in the comments easiest is to use max possible p that fits in the data type you are using and is usable for all power of 2 sizes of input supported.
Which basically renders your question irrelevant. The only case I can think of where this is not possible/desired is on arbitrary precision numbers where there is "no" max limit. There are many performance issues binded to variable p as the search for p is really slow (may be even slower than the NTT itself) and also variable p disables many performance optimizations of the modular arithmetics needed making the NTT really slow.

Related

How to approach this type of problem in permutation and combination?

Altitudes
Alice and Bob took a journey to the mountains. They have been climbing
up and down for N days and came home extremely tired.
Alice only remembers that they started their journey at an altitude of
H1 meters and they finished their wandering at an alitude of H2
meters. Bob only remembers that every day they changed their altitude
by A, B, or C meters. If their altitude on the ith day was x,
then their altitude on day i + 1 can be x + A, x + B, or x + C.
Now, Bob wonders in how many ways they could complete their journey.
Two journeys are considered different if and only if there exist a day
when the altitude that Alice and Bob covered that day during the first
journey differs from the altitude Alice and Bob covered that day during
the second journey.
Bob asks Alice to tell her the number of ways to complete the journey.
Bob needs your help to solve this problem.
Input format
The first and only line contains 6 integers N, H1, H2, A, B, C that
represents the number of days Alice and Bob have been wandering,
altitude on which they started their journey, altitude on which they
finished their journey, and three possible altitude changes,
respectively.
Output format
Print the answer modulo 10**9 + 7.
Constraints
1 <= N <= 10**5
-10**9 <= H1, H2 <= 10**9
-10**9 <= A, B, C <= 10**9
Sample Input
2 0 0 1 0 -1
Sample Output
3
Explanation
There are only 3 possible journeys-- (0, 0), (1, -1), (-1, 1).
Note
This problem comes originally from a hackerearth competition, now closed. The explanation for the sample input and output has been corrected.
Here is my solution in Python 3.
The question can be simplified from its 6 input parameters to only 4 parameters. There is no need for the beginning and ending altitudes--the difference of the two is enough. Also, we can change the daily altitude changes A, B, and C and get the same answer if we make a corresponding change to the total altitude change. For example, if we add 1 to each of A, B, and C, we could add N to the altitude change: 1 additional meter each day over N days means N additional meters total. We can "normalize" our daily altitude changes by sorting them so A is the smallest, then subtract A from each of the altitude changes and subtract N * A from the total altitude change. This means we now need to add a bunch of 0's and two other values (let's call them D and E). D is not larger than E.
We now have an easier problem: take N values, each of which is 0, D, or E, so they sum to a particular total (let's say H). This is the same at using up to N numbers equaling D or E, with the rest zeros.
We can use mathematics, in particular Bezout's identity, to see if this is possible. Some more mathematics can find all the ways of doing this. Once we know how many 0's, D's, and E's, we can use multinomial coefficients to find how many ways these values can be rearranged. Total all these up and we have the answer.
This code finds the total number of ways to complete the journey, and takes it modulo 10**9 + 7 only at the very end. This is possible since Python uses large integers. The largest result I found in my testing is for the input values 100000 0 100000 0 1 2 which results in a number with 47,710 digits before taking the modulus. This takes a little over 8 seconds on my machine.
This code is a little longer than necessary, since I made some of the routines more general than necessary for this problem. I did this so I can use them in other problems. I used many comments for clarity.
# Combinatorial routines -----------------------------------------------
def comb(n, k):
"""Compute the number of ways to choose k elements out of a pile of
n, ignoring the order of the elements. This is also called
combinations, or the binomial coefficient of n over k.
"""
if k < 0 or k > n:
return 0
result = 1
for i in range(min(k, n - k)):
result = result * (n - i) // (i + 1)
return result
def multcoeff(*args):
"""Return the multinomial coefficient
(n1 + n2 + ...)! / n1! / n2! / ..."""
if not args: # no parameters
return 1
# Find and store the index of the largest parameter so we can skip
# it (for efficiency)
skipndx = args.index(max(args))
newargs = args[:skipndx] + args[skipndx + 1:]
result = 1
num = args[skipndx] + 1 # a factor in the numerator
for n in newargs:
for den in range(1, n + 1): # a factor in the denominator
result = result * num // den
num += 1
return result
def new_multcoeff(prev_multcoeff, x, y, z, ag, bg):
"""Given a multinomial coefficient prev_multcoeff =
multcoeff(x-bg, y+ag, z+(bg-ag)), calculate multcoeff(x, y, z)).
NOTES: 1. This uses bg multiplications and bg divisions,
faster than doing multcoeff from scratch.
"""
result = prev_multcoeff
for d in range(1, ag + 1):
result *= y + d
for d in range(1, bg - ag + 1):
result *= z + d
for d in range(bg):
result //= x - d
return result
# Number theory routines -----------------------------------------------
def bezout(a, b):
"""For integers a and b, find an integral solution to
a*x + b*y = gcd(a, b).
RETURNS: (x, y, gcd)
NOTES: 1. This routine uses the convergents of the continued
fraction expansion of b / a, so it will be slightly
faster if a <= b, i.e. the parameters are sorted.
2. This routine ensures the gcd is nonnegative.
3. If a and/or b is zero, the corresponding x or y
will also be zero.
4. This routine is named after Bezout's identity, which
guarantees the existences of the solution x, y.
"""
if not a:
return (0, (b > 0) - (b < 0), abs(b)) # 2nd is sign(b)
p1, p = 0, 1 # numerators of the two previous convergents
q1, q = 1, 0 # denominators of the two previous convergents
negate_y = True # flag if negate y=q (True) or x=p (False)
quotient, remainder = divmod(b, a)
while remainder:
b, a = a, remainder
p, p1 = p * quotient + p1, p
q, q1 = q * quotient + q1, q
negate_y = not negate_y
quotient, remainder = divmod(b, a)
if a < 0:
p, q, a = -p, -q, -a # ensure the gcd is nonnegative
return (p, -q, a) if negate_y else (-p, q, a)
def byzantine_bball(a, b, s):
"""For nonnegative integers a, b, s, return information about
integer solutions x, y to a*x + b*y = s. This is
equivalent to finding a multiset containing only a and b that
sums to s. The name comes from getting a given basketball score
given scores for shots and free throws in a hypothetical game of
"byzantine basketball."
RETURNS: None if there is no solution, or an 8-tuple containing
x the smallest possible nonnegative integer value of
x.
y the value of y corresponding to the smallest
possible integral value of x. If this is negative,
there is no solution for nonnegative x, y.
g the greatest common divisor (gcd) of a, b.
u the found solution to a*u + b*v = g
v " "
ag a // g, or zero if g=0
bg b // g, or zero if g=0
sg s // g, or zero if g=0
NOTES: 1. If a and b are not both zero and one solution x, y is
returned, then all integer solutions are given by
x + t * bg, y - t * ag for any integer t.
2. This routine is slightly optimized for a <= b. In that
case, the solution returned also has the smallest sum
x + y among positive integer solutions.
"""
# Handle edge cases of zero parameter(s).
if 0 == a == b: # the only score possible from 0, 0 is 0
return (0, 0, 0, 0, 0, 0, 0, 0) if s == 0 else None
if a == 0:
sb = s // b
return (0, sb, b, 0, 1, 0, 1, sb) if s % b == 0 else None
if b == 0:
sa = s // a
return (sa, 0, a, 1, 0, 1, 0, sa) if s % a == 0 else None
# Find if the score is possible, ignoring the signs of x and y.
u, v, g = bezout(a, b)
if s % g:
return None # only multiples of the gcd are possible scores
# Find one way to get the score, ignoring the signs of x and y.
ag, bg, sg = a // g, b // g, s // g # we now have ag*u + bg*v = 1
x, y = sg * u, sg * v # we now have a*x + b*y = s
# Find the solution where x is nonnegative and as small as possible.
t = x // bg # Python rounds toward minus infinity--what we want
x, y = x - t * bg, y + t * ag
# Return the information
return (x, y, g, u, v, ag, bg, sg)
# Routines for this puzzle ---------------------------------------------
def altitude_reduced(n, h, d, e):
"""Return the number of distinct n-tuples containing only the
values 0, d, and e that sum to h. Assume that all these
numbers are integers and that 0 <= d <= e.
"""
# Handle some impossible special cases
if n < 0 or h < 0:
return 0
# Handle some other simple cases with zero values
if n == 0:
return 0 if h else 1
if 0 == d == e: # all step values are zero
return 0 if h else 1
if 0 == d or d == e: # e is the only non-zero step value
# If possible, return # of tuples with proper # of e's, the rest 0's
return 0 if h % e else comb(n, h // e)
# Handle the main case 0 < d < e
# --Try to get the solution with the fewest possible non-zero days:
# x d's and y e's and the rest zeros: all solutions are given by
# x + t * bg, y - t * ag
solutions_info = byzantine_bball(d, e, h)
if not solutions_info:
return 0 # no way at all to get h from d, e
x, y, _, _, _, ag, bg, _ = solutions_info
# --Loop over all solutions with nonnegative x, y, small enough x + y
result = 0
while y >= 0 and x + y <= n: # at most n non-zero days
# Find multcoeff(x, y, n - x - y), in a faster way
if result == 0: # 1st time through loop: no prev coeff available
amultcoeff = multcoeff(x, y, n - x - y)
else: # use previous multinomial coefficient
amultcoeff = new_multcoeff(amultcoeff, x, y, n - x - y, ag, bg)
result += amultcoeff
x, y = x + bg, y - ag # x+y increases by bg-ag >= 0
return result
def altitudes(input_str=None):
# Get the input
if input_str is None:
input_str = input('Numbers N H1 H2 A B C? ')
# input_str = '100000 0 100000 0 1 2' # replace with prev line for input
n, h1, h2, a, b, c = map(int, input_str.strip().split())
# Reduce the number of parameters by normalizing the values
h_diff = h2 - h1 # net altitude change
a, b, c = sorted((a, b, c)) # a is now the smallest
h, d, e = h_diff - n * a, b - a, c - a # reduce a to zero
# Solve the reduced problem
print(altitude_reduced(n, h, d, e) % (10**9 + 7))
if __name__ == '__main__':
altitudes()
Here are some of my test routines for the main problem. These are suitable for pytest.
# Testing, some with pytest ---------------------------------------------------
import itertools # for testing
import collections # for testing
def brute(n, h, d, e):
"""Do alt_reduced with brute force."""
return sum(1 for v in itertools.product({0, d, e}, repeat=n)
if sum(v) == h)
def brute_count(n, d, e):
"""Count achieved heights with brute force."""
if n < 0:
return collections.Counter()
return collections.Counter(
sum(v) for v in itertools.product({0, d, e}, repeat=n)
)
def test_impossible():
assert altitude_reduced(0, 6, 1, 2) == 0
assert altitude_reduced(-1, 6, 1, 2) == 0
assert altitude_reduced(3, -1, 1, 2) == 0
def test_simple():
assert altitude_reduced(1, 0, 0, 0) == 1
assert altitude_reduced(1, 1, 0, 0) == 0
assert altitude_reduced(1, -1, 0, 0) == 0
assert altitude_reduced(1, 1, 0, 1) == 1
assert altitude_reduced(1, 1, 1, 1) == 1
assert altitude_reduced(1, 2, 0, 1) == 0
assert altitude_reduced(1, 2, 1, 1) == 0
assert altitude_reduced(2, 4, 0, 3) == 0
assert altitude_reduced(2, 4, 3, 3) == 0
assert altitude_reduced(2, 4, 0, 2) == 1
assert altitude_reduced(2, 4, 2, 2) == 1
assert altitude_reduced(3, 4, 0, 2) == 3
assert altitude_reduced(3, 4, 2, 2) == 3
assert altitude_reduced(4, 4, 0, 2) == 6
assert altitude_reduced(4, 4, 2, 2) == 6
assert altitude_reduced(2, 6, 0, 2) == 0
assert altitude_reduced(2, 6, 2, 2) == 0
def test_main():
N = 12
maxcnt = 0
for n in range(-1, N):
for d in range(N): # must have 0 <= d
for e in range(d, N): # must have d <= e
counts = brute_count(n, d, e)
for h, cnt in counts.items():
if cnt == 25653:
print(n, h, d, e, cnt)
maxcnt = max(maxcnt, cnt)
assert cnt == altitude_reduced(n, h, d, e)
print(maxcnt) # got 25653 for N = 12, (n, h, d, e) = (11, 11, 1, 2) etc.

cvxpy contrained normalization equations (abs)

I am working in an optimization problem (A*v = b) where I would like to rank a set of alternatives X = {x1,x2,x3,x4}. However, I have the following normalization constraint: |v[i] - v[j]| <= 1, which can be in the form -1 <= v[i] - v[j] <= 1.
My code is as follows:
import cvxpy as cp
n = len(X) #set of alternatives
v = cp.Variable(n)
objective = cp.Minimize(cp.sum_squares(A*v - b))
constraints = [0 <= v]
#Normalization condition -1 <= v[i] - v[j] <= 1
for i in range(n):
for j in range(n):
constraints = [-1 <= v[i]-v[j], 1 >= v[i]-v[j]]
prob = cp.Problem(objective, constraints)
# The optimal objective value is returned by `prob.solve()`.
result = prob.solve()
# The optimal value for v is stored in `v.value`.
va2 = v.value
Which outputs:
[-0.15 0.45 -0.35 0.05]
Result, which is not close to what should be and even have negative values. I think, my code for the normalization contraint most probably is wrong.
You are not appending your constraints, instead you are overwriting them each time. Instead of this line
constraints = [-1 <= v[i]-v[j], 1 >= v[i]-v[j]]
You should have
constraints += [-1 <= v[i]-v[j], 1 >= v[i]-v[j]]
For cleanliness you may want to change this
for i in range(n):
for j in range(n):
To only consider each pair once:
for i in range(n):
for j in range(i+1, n):

runif with .Machine$double.xmax as boundaries

I wanted to generate a random real (I guess rational) number.
To do this I wanted to use runif(1, min = m, max = M) and my thoughts were to set m; M (absolute) as large as possible in order to make the interval as large as possible. Which brings me to my question:
M <- .Machine$double.xmax
m <- -M
runif(1, m, M)
## which returns
[1] Inf
Why is it not returning a number? Is the chosen interval simply too large?
PS
> .Machine$double.xmax
[1] 1.797693e+308
As hinted by mt1022 the reason is in runif C source:
double runif(double a, double b)
{
if (!R_FINITE(a) || !R_FINITE(b) || b < a) ML_ERR_return_NAN;
if (a == b)
return a;
else {
double u;
/* This is true of all builtin generators, but protect against
user-supplied ones */
do {u = unif_rand();} while (u <= 0 || u >= 1);
return a + (b - a) * u;
}
}
In the return argument you can see the formula a + (b - a) * u which transoform uniformly [0, 1] generated random value in the user supplied interval [a, b]. In your case it will be -M + (M + M) * u. So M + M in case it 1.79E308 + 1.79E308 generates Inf. I.e. finite + Inf * finite = Inf:
M + (M - m) * runif(1, 0, 1)
# Inf

Is it safe to replace "a/(b*c)" with "a/b/c" when using integer-division?

Is it safe to replace a/(b*c) with a/b/c when using integer-division on positive integers a,b,c, or am I at risk losing information?
I did some random tests and couldn't find an example of a/(b*c) != a/b/c, so I'm pretty sure it's safe but not quite sure how to prove it.
Thank you.
Mathematics
As mathematical expressions, ⌊a/(bc)⌋ and ⌊⌊a/b⌋/c⌋ are equivalent whenever b is nonzero and c is a positive integer (and in particular for positive integers a, b, c). The standard reference for these sorts of things is the delightful book Concrete Mathematics: A Foundation for Computer Science by Graham, Knuth and Patashnik. In it, Chapter 3 is mostly on floors and ceilings, and this is proved on page 71 as a part of a far more general result:
In the 3.10 above, you can define x = a/b (mathematical, i.e. real division), and f(x) = x/c (exact division again), and plug those into the result on the left ⌊f(x)⌋ = ⌊f(⌊x⌋)⌋ (after verifying that the conditions on f hold here) to get ⌊a/(bc)⌋ on the LHS equal to ⌊⌊a/b⌋/c⌋ on the RHS.
If we don't want to rely on a reference in a book, we can prove ⌊a/(bc)⌋ = ⌊⌊a/b⌋/c⌋ directly using their methods. Note that with x = a/b (the real number), what we're trying to prove is that ⌊x/c⌋ = ⌊⌊x⌋/c⌋. So:
if x is an integer, then there is nothing to prove, as x = ⌊x⌋.
Otherwise, ⌊x⌋ < x, so ⌊x⌋/c < x/c which means that ⌊⌊x⌋/c⌋ ≤ ⌊x/c⌋. (We want to show it's equal.) Suppose, for the sake of contradiction, that ⌊⌊x⌋/c⌋ < ⌊x/c⌋ then there must be a number y such that ⌊x⌋ < y ≤ x and y/c = ⌊x/c⌋. (As we increase a number from ⌊x⌋ to x and consider division by c, somewhere we must hit the exact value ⌊x/c⌋.) But this means that y = c*⌊x/c⌋ is an integer between ⌊x⌋ and x, which is a contradiction!
This proves the result.
Programming
#include <stdio.h>
int main() {
unsigned int a = 142857;
unsigned int b = 65537;
unsigned int c = 65537;
printf("a/(b*c) = %d\n", a/(b*c));
printf("a/b/c = %d\n", a/b/c);
}
prints (with 32-bit integers),
a/(b*c) = 1
a/b/c = 0
(I used unsigned integers as overflow behaviour for them is well-defined, so the above output is guaranteed. With signed integers, overflow is undefined behaviour, so the program can in fact print (or do) anything, which only reinforces the point that the results can be different.)
But if you don't have overflow, then the values you get in your program are equal to their mathematical values (that is, a/(b*c) in your code is equal to the mathematical value ⌊a/(bc)⌋, and a/b/c in code is equal to the mathematical value ⌊⌊a/b⌋/c⌋), which we've proved are equal. So it is safe to replace a/(b*c) in code by a/b/c when b*c is small enough not to overflow.
While b*c could overflow (in C) for the original computation, a/b/c can't overflow, so we don't need to worry about overflow for the forward replacement a/(b*c) -> a/b/c. We would need to worry about it the other way around, though.
Let x = a/b/c. Then a/b == x*c + y for some y < c, and a == (x*c + y)*b + z for some z < b.
Thus, a == x*b*c + y*b + z. y*b + z is at most b*c-1, so x*b*c <= a <= (x+1)*b*c, and a/(b*c) == x.
Thus, a/b/c == a/(b*c), and replacing a/(b*c) by a/b/c is safe.
Nested floor division can be reordered as long as you keep track of your divisors and dividends.
#python3.x
x // m // n = x // (m * n)
#python2.x
x / m / n = x / (m * n)
Proof (sucks without LaTeX :( ) in python3.x:
Let k = x // m
then k - 1 < x / m <= k
and (k - 1) / n < x / (m * n) <= k / n
In addition, (x // m) // n = k // n
and because x // m <= x / m and (x // m) // n <= (x / m) // n
k // n <= x // (m * n)
Now, if k // n < x // (m * n)
then k / n < x / (m * n)
and this contradicts the above statement that x / (m * n) <= k / n
so if k // n <= x // (m * n) and k // n !< x // (m * n)
then k // n = x // (m * n)
and (x // m) // n = x // (m * n)
https://en.wikipedia.org/wiki/Floor_and_ceiling_functions#Nested_divisions

Calculating the modulo of two intervals

I want to understand how the modulus operator works when applied to two intervals. Adding, subtracting and multiplying two intervals is trivial to implement in code, but how do you do it for modulus?
I'd be happy if someone can show me the formula, sample code or a link which explains how it works.
Background info: You have two integers x_lo < x < x_hi and y_lo < y < y_hi. What is the the lower and upper bound for mod(x, y)?
Edit: I'm unsure if it is possible to come up with the minimal bounds in an efficient manner (without calculating the mod for all x or for all y). If so, then I'll accept an accurate but non-optimal answer for the bounds. Obviously, [-inf,+inf] is a correct answer then :) but I want a bound that is more limited in size.
It turns out, this is an interesting problem. The assumption I make is that for integer intervals, modulo is defined with respect to truncated division (round towards 0).
As a consequence, mod(-a,m) == -mod(a,m) for all a, m. Moreover, sign(mod(a,m)) == sign(a).
Definitions, before we start
Closed interval from a to b: [a,b]
Empty interval: [] := [+Inf,-Inf]
Negation: -[a,b] := [-b,-a]
Union: [a,b] u [c,d] := [min(a,c),max(b,d)]
Absolute value: |m| := max(m,-m)
Simpler Case: Fixed modulus m
It is easier to start with a fixed m. We will later generalize this to the modulo of two intervals. The definition builds up recursively. It should be no problem to implement this in your favorite programming language. Pseudocode:
def mod1([a,b], m):
// (1): empty interval
if a > b || m == 0:
return []
// (2): compute modulo with positive interval and negate
else if b < 0:
return -mod1([-b,-a], m)
// (3): split into negative and non-negative interval, compute and join
else if a < 0:
return mod1([a,-1], m) u mod1([0,b], m)
// (4): there is no k > 0 such that a < k*m <= b
else if b-a < |m| && a % m <= b % m:
return [a % m, b % m]
// (5): we can't do better than that
else
return [0,|m|-1]
Up to this point, we can't do better than that. The resulting interval in (5) might be an over-approximation, but it is the best we can get. If we were allowed to return a set of intervals, we could be more precise.
General case
The same ideas apply to the case where our modulus is an interval itself. Here we go:
def mod2([a,b], [m,n]):
// (1): empty interval
if a > b || m > n:
return []
// (2): compute modulo with positive interval and negate
else if b < 0:
return -mod2([-b,-a], [m,n])
// (3): split into negative and non-negative interval, compute, and join
else if a < 0:
return mod2([a,-1], [m,n]) u mod2([0,b], [m,n])
// (4): use the simpler function from before
else if m == n:
return mod1([a,b], m)
// (5): use only non-negative m and n
else if n <= 0:
return mod2([a,b], [-n,-m])
// (6): similar to (5), make modulus non-negative
else if m <= 0:
return mod2([a,b], [1, max(-m,n)])
// (7): compare to (4) in mod1, check b-a < |modulus|
else if b-a >= n:
return [0,n-1]
// (8): similar to (7), split interval, compute, and join
else if b-a >= m:
return [0, b-a-1] u mod2([a,b], [b-a+1,n])
// (9): modulo has no effect
else if m > b:
return [a,b]
// (10): there is some overlapping of [a,b] and [n,m]
else if n > b:
return [0,b]
// (11): either compute all possibilities and join, or be imprecise
else:
return [0,n-1] // imprecise
Have fun! :)
Let see mod(x, y) = mod.
In general 0 <= mod <= y. So it's always true: y_lo < mod < y_hi
But we can see some specific cases below:
- if: x_hi < y_lo then div(x, y) = 0, then x_low < mod < x_hi
- if: x_low > y_hi then div(x, y) > 0, then y_low < mod < y_hi
- if: x_low < y_low < y_hi < x_hi, then y_low < mod < y_hi
- if: x_low < y_low < x_hi < y_hi, then y_low < mod < x_hi
- if: y_low < x_low < y_hi < x_hi, then y_low < mod < y_hi
....

Resources