Learning Binary Search in Python 3.7 - runtime error

I found this code on https://www.geeksforgeeks.org/binary-search/
# Python Program for recursive binary search.
# Returns index of x in arr if present, else -1
def binarySearch (arr, l, r, x):
    # Check base case
    if r >= l:
        mid = l + (r - l)/2;
        # If element is present at the middle itself
        if arr[mid] == x:
            return mid
        # If element is smaller than mid, then it
        # can only be present in left subarray
        elif arr[mid] > x:
            return binarySearch(arr, l, mid-1, x)
        # Else the element can only be present
        # in right subarray
        else:
            return binarySearch(arr, mid+1, r, x)
    else:
        # Element is not present in the array
        return -1

# Test array
arr = [ 2, 3, 4, 10, 40, 50, 80, 140, 200, 2000, 100]
x = 50
# Function call
result = binarySearch(arr, 0, len(arr)-1, int)
if result != -1:
    print ("Element is present at index %d" % result)
else:
    print ("Element is not present in array")
However, when I run it I get this error: TypeError: list indices must be integers or slices, not float
I'm not sure how to fix that. I tried converting the entire array to int, and I tried replacing x with int, but neither worked.
Any suggestions?

The issue is on this line:
mid = l + (r - l)/2;
In Python 3, / performs floating-point division, and since mid is used as a list index it must be an int. Use // for integer (floor) division:
mid = l + (r - l) // 2;
There is also another issue with the call to the function:
result = binarySearch(arr, 0, len(arr) - 1, int)
The last parameter should not be int but x (the variable you are searching for):
result = binarySearch(arr, 0, len(arr) - 1, x)
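When you pass int as the last parameter, the comparison arr[mid] > x fails with TypeError: '>' not supported between instances of 'int' and 'type' (Python 3.5 and earlier word it as TypeError: unorderable types: int() > type()).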
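Putting both fixes together, a minimal corrected sketch could look like the following. The names binary_search, lo and hi are my own, and I sort the test list first because binary search assumes sorted input (in the list from the question, 100 comes after 2000).

```python
def binary_search(arr, lo, hi, x):
    """Return an index of x in the sorted slice arr[lo..hi], or -1 if absent."""
    if hi < lo:
        return -1  # empty range: x is not present
    mid = lo + (hi - lo) // 2  # floor division keeps mid an int
    if arr[mid] == x:
        return mid
    if arr[mid] > x:
        # x can only be in the left half
        return binary_search(arr, lo, mid - 1, x)
    # x can only be in the right half
    return binary_search(arr, mid + 1, hi, x)


# The question's list, sorted so that binary search is valid on it.
arr = sorted([2, 3, 4, 10, 40, 50, 80, 140, 200, 2000, 100])
x = 50
result = binary_search(arr, 0, len(arr) - 1, x)  # pass x, not the type int
if result != -1:
    print("Element is present at index %d" % result)
else:
    print("Element is not present in array")
```

For non-recursive use, the standard library's bisect module offers the same search on sorted lists.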


Could someone explain why this function doesn't always return the first element of the list

I have been having a look at some simple recursive functions to wrap my head around the concept. However one example has me a little confused.
The function below uses recursion to obtain the largest integer from a list:
A = [-4, 2, 4]
n = len(A)

def findMaxRec(A, n):
    if (n == 1):
        return A[0]
    else:
        return max(A[n - 1], findMaxRec(A, n - 1))
Since n will eventually equal 1, why does the function not always return the first element in the list?
In such cases it might be helpful to just write out what code will be executed. I've tried to do that with your function as pseudo-code, where n is replaced with its value:
findMaxRec(A, 3):
    if (3 == 1):
        return A[0]
    else:
        return max(A[3 - 1],
            findMaxRec(A, 2):
                if (2 == 1):
                    return A[0]
                else:
                    return max(A[2 - 1],
                        findMaxRec(A, 1):
                            if (1 == 1):
                                return A[0]
                    )
        )
Effectively, this results in:
max(A[2], max(A[1], A[0]))
Where the inner call max(A[1], A[0]) = max(2, -4) = 2
And the outer call max(A[2], ...) = max(4, 2) = 4
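If you would rather watch this happen than expand it by hand, here is a small sketch that prints each call as it returns. The function name find_max_rec and the extra depth parameter are my additions for tracing; the logic is otherwise the same as in the question.

```python
def find_max_rec(A, n, depth=0):
    """Largest of the first n items of A, printing each call as it resolves."""
    pad = "  " * depth
    if n == 1:
        print(f"{pad}find_max_rec(A, 1) returns A[0] = {A[0]}")
        return A[0]
    # The base case result is only one operand of max(); every outer call
    # still compares it against its own element A[n - 1].
    inner = find_max_rec(A, n - 1, depth + 1)
    result = max(A[n - 1], inner)
    print(f"{pad}find_max_rec(A, {n}) returns max(A[{n - 1}]={A[n - 1]}, {inner}) = {result}")
    return result


A = [-4, 2, 4]
largest = find_max_rec(A, len(A))
print(largest)
```

The innermost call does return A[0], but that value is handed back to the callers, which each apply max() before returning.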

Hi everyone, I'm trying to code this formula in Prolog, any help is appreciated :)

I'm trying to code this formula in Prolog:
"str" is the input number as a string.
"base" is the base of the input number.
The result is:
(base)^0 * str[len-1] + (base)^1 * str[len-2] + (base)^2 * str[len-3] + ...
I'm new to Prolog and I have this right now:
calc([],_,0):- !.
calc([H|T],Base,Res):-
    length([H|T],Long),
    Long >= 0,
    Size is Long - 1,
    power(Base , Size, Res),
    Res1 is Res * H,
    calc(T,Base,Res1).
but it doesn't work properly. I spent yesterday trying to solve the problem, with no success.
Any help is appreciated :)
You can do something like this:
value(String, Base, Value) :-
    string_chars(String, Digits),
    value(Digits, Base, 0, Value).

value([], _, Value, Value).
value([Digit|Digits], Base, Accumulator, Value) :-
    atoi(Digit, Number),
    NewAccumulator is Base*Accumulator + Number,
    value(Digits, Base, NewAccumulator, Value).

atoi(Char, Int) :- % convert ASCII code to integer
    char_code(Char, Code),
    Int is Code - 48.
The predefined predicate string_chars converts a string into a list of chars:
?- string_chars("1101", Chars).
Chars = ['1', '1', '0', '1'].
The predicate atoi converts a character representing a digit into a corresponding integer:
?- atoi('3', Integer).
Integer = 3.
Supposing that [1,1,0,1] is a list of integers (representing a number in base 2), its corresponding value in base 10 can be computed as follows:
Digit   Accumulator
-       0
1       2 x 0 + 1 = 1
1       2 x 1 + 1 = 3
0       2 x 3 + 0 = 6
1       2 x 6 + 1 = 13
Here are some examples:
?- value("1101", 2, V).
V = 13.
?- value("1201", 3, V).
V = 46.
Alternative solution: supposing that you already have a list of integers representing the digits of a number, the solution is even simpler:
value_integers(Digits, Base, Value) :-
    value_integers(Digits, Base, 0, Value).

value_integers([], _, Value, Value).
value_integers([Digit|Digits], Base, Accumulator, Value) :-
    NewAccumulator is Base*Accumulator + Digit,
    value_integers(Digits, Base, NewAccumulator, Value).
Here are some examples:
?- value_integers([1,1,0,1], 2, Value).
Value = 13.
?- value_integers([1,2,0,1], 3, Value).
Value = 46.
?- value_integers([1,2,0,1], 10, Value).
Value = 1201.
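For readers more at home outside Prolog, the same accumulator scheme (Horner's rule) can be sketched in Python and checked against the positional formula from the question. The function names here are my own, and the string version assumes every character is a decimal digit smaller than the base.

```python
def value_of_digits(digits, base):
    """Accumulate left to right: acc = base*acc + digit (Horner's rule)."""
    acc = 0
    for d in digits:
        acc = base * acc + d
    return acc


def value_by_formula(digits, base):
    """The question's formula: base^0 * last + base^1 * second-to-last + ..."""
    n = len(digits)
    return sum(base ** p * digits[n - 1 - p] for p in range(n))


def value_of_string(s, base):
    """Same as value_of_digits, starting from a string of digit characters."""
    return value_of_digits([int(ch) for ch in s], base)


print(value_of_string("1101", 2))
print(value_of_digits([1, 2, 0, 1], 3))
print(value_by_formula([1, 2, 0, 1], 10))
```

Both functions compute the same number; the accumulator version simply avoids computing any powers.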

Continued fractions and Pell's equation - numerical issues

Mathematical background
Continued fractions are a way to represent numbers (rational or not), with a basic recursion formula to calculate them. Given a number r, we define r[0]=r and have:
for n in range(0..N):
    a[n] = floor(r[n])
    if r[n] == a[n]: break
    r[n+1] = 1 / (r[n]-a[n])
where a is the final representation. We can also define a series of convergents by
h[-2], h[-1] = 0, 1
k[-2], k[-1] = 1, 0
h[n] = a[n]*h[n-1]+h[n-2]
k[n] = a[n]*k[n-1]+k[n-2]
where h[n]/k[n] converge to r.
Pell's equation is a problem of the form x^2-D*y^2=1 where all numbers are integers and D is not a perfect square in our case. A solution for a given D that minimizes x is given by continued fractions. Basically, for the above equation, it is guaranteed that this (fundamental) solution is x=h[n] and y=k[n] for the lowest n found which solves the equation in the continued fraction expansion of sqrt(D).
Problem
I am failing to get this simple algorithm to work for D=61. I first noticed it did not solve Pell's equation within 100 coefficients, so I compared it against Wolfram Alpha's convergents and continued fraction representation and noticed that the 20th elements differ: the coefficient should be 3 but I get 4, which yields different convergents (h[20]=335159612 on Wolfram Alpha compared to 425680601 for me).
I tested the code below, two languages (though to be fair, Python is C under the hood I guess), on two systems and get the same result - a diff on loop 20. I'll note that the convergents are still accurate and converge! Why am I getting different results compared to Wolfram Alpha, and is it possible to fix it?
For testing, here's a Python program to solve Pell's equation for D=61, printing first 20 convergents and the continued fraction representation cf (and some extra unneeded fluff):
from math import floor, sqrt # Can use mpmath here as well.

def continued_fraction(D, count=100, thresh=1E-12, verbose=False):
    cf = []
    h = (0, 1)
    k = (1, 0)
    r = start = sqrt(D)
    initial_count = count
    x = (1+thresh+start)*start
    y = start
    while abs(x/y - start) > thresh and count:
        i = int(floor(r))
        cf.append(i)
        f = r - i
        x, y = i*h[-1] + h[-2], i*k[-1] + k[-2]
        if verbose is True or verbose == initial_count-count:
            print(f'{x}\u00B2-{D}x{y}\u00B2 = {x**2-D*y**2}')
        if x**2 - D*y**2 == 1:
            print(f'{x}\u00B2-{D}x{y}\u00B2 = {x**2-D*y**2}')
            print(cf)
            return
        count -= 1
        r = 1/f
        h = (h[1], x)
        k = (k[1], y)

    print(cf)
    raise OverflowError(f"Converged on {x} {y} with count {count} and diff {abs(start-x/y)}!")

continued_fraction(61, count=20, verbose=True, thresh=-1) # We don't want to stop on account of thresh in this example
A C program doing the same:
#include<stdio.h>
#include<math.h>
#include<stdlib.h>

int main() {
    long D = 61;
    double start = sqrt(D);
    long h[] = {0, 1};
    long k[] = {1, 0};
    int count = 20;
    float thresh = 1E-12;
    double r = start;
    long x = (1+thresh+start)*start;
    long y = start;
    /* fabs, not abs: abs() takes an int and would truncate the double */
    while(fabs(x/(double)y-start) > -1 && count) {
        long i = floor(r);
        double f = r - i;
        x = i * h[1] + h[0];
        y = i * k[1] + k[0];
        /* the last argument is a long, so it is printed with %ld */
        printf("%ld\u00B2-%ldx%ld\u00B2 = %ld\n", x, D, y, x*x-D*y*y);
        r = 1/f;
        --count;
        h[0] = h[1];
        h[1] = x;
        k[0] = k[1];
        k[1] = y;
    }
    return 0;
}
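With 64-bit floats, the rounding error in r is amplified by every r = 1/f step, so after about 20 iterations floor(r) comes out wrong. mpmath, Python's multi-precision library, can be used to get more working precision. Just be careful that all the important numbers are in mp format.
In the code below, x, y and i are standard multi-precision integers. r and f are multi-precision real numbers. Note that the initial count is set higher than 20.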
from mpmath import mp, mpf
mp.dps = 50 # precision in number of decimal digits

def continued_fraction(D, count=22, thresh=mpf(1E-12), verbose=False):
    cf = []
    h = (0, 1)
    k = (1, 0)
    r = start = mp.sqrt(D)
    initial_count = count
    x = 0 # some dummy starting values, they will be overwritten early in the while loop
    y = 1
    while abs(x/y - start) > thresh and count > 0:
        i = int(mp.floor(r))
        cf.append(i)
        x, y = i*h[-1] + h[-2], i*k[-1] + k[-2]
        if verbose or initial_count == count:
            print(f'{x}\u00B2-{D}x{y}\u00B2 = {x**2-D*y**2}')
        if x**2 - D*y**2 == 1:
            print(f'{x}\u00B2-{D}x{y}\u00B2 = {x**2-D*y**2}')
            print(cf)
            return
        count -= 1
        f = r - i
        r = 1/f
        h = (h[1], x)
        k = (k[1], y)

    print(cf)
    raise OverflowError(f"Converged on {x} {y} with count {count} and diff {abs(start-x/y)}!")

continued_fraction(61, count=22, verbose=True, thresh=mpf(1e-100))
Output is similar to Wolfram Alpha's:
...
335159612²-61x42912791² = 3
1431159437²-61x183241189² = -12
1766319049²-61x226153980² = 1
[7, 1, 4, 3, 1, 2, 2, 1, 3, 4, 1, 14, 1, 4, 3, 1, 2, 2, 1, 3, 4, 1]
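Extra precision only postpones the failure. For square roots specifically there is a well-known integer-only recurrence for the continued fraction terms, so no floating point is needed at all. The sketch below is my own illustration of that recurrence, not code from the question; the names sqrt_cf_terms and pell_fundamental are made up here, and math.isqrt requires Python 3.8 or later.

```python
from itertools import islice
from math import isqrt  # Python 3.8+


def sqrt_cf_terms(D):
    """Yield the continued fraction terms of sqrt(D) using integers only."""
    a0 = isqrt(D)
    if a0 * a0 == D:
        raise ValueError("D must not be a perfect square")
    # Invariant: the current complete quotient is (sqrt(D) + m) / d.
    m, d, a = 0, 1, a0
    while True:
        yield a
        m = d * a - m
        d = (D - m * m) // d  # this division is always exact
        a = (a0 + m) // d


def pell_fundamental(D):
    """Return the smallest (x, y) with y > 0 and x*x - D*y*y == 1."""
    h_prev, h = 0, 1  # h[-2], h[-1]
    k_prev, k = 1, 0  # k[-2], k[-1]
    for a in sqrt_cf_terms(D):
        h_prev, h = h, a * h + h_prev
        k_prev, k = k, a * k + k_prev
        if h * h - D * k * k == 1:
            return h, k


first_terms = list(islice(sqrt_cf_terms(61), 22))
solution = pell_fundamental(61)
print(first_terms)
print(solution)
```

Because Python integers are arbitrary precision, this stays exact however long the period of the expansion is.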

Implementing FFT over finite fields

I would like to implement multiplication of polynomials using the NTT. I followed Number-theoretic transform (integer DFT) and it seems to work.
Now I would like to implement multiplication of polynomials over finite fields Z_p[x], where p is an arbitrary prime number.
Does it change anything that the coefficients are now bounded by p, compared to the former unbounded case?
In particular, the original NTT requires finding a prime number N, the working modulus, that is larger than (magnitude of largest element of input vector)^2 * (length of input vector) + 1 so that the result never overflows. If the result is going to be reduced modulo the prime p anyway, how small can the modulus be? Note that p - 1 does not have to be of the form (some positive integer) * (length of input vector).
Edit: I copy-pasted the source from the link above to illustrate the problem:
#
# Number-theoretic transform library (Python 2, 3)
#
# Copyright (c) 2017 Project Nayuki
# All rights reserved. Contact Nayuki for licensing.
# https://www.nayuki.io/page/number-theoretic-transform-integer-dft
#
import itertools, numbers

def find_params_and_transform(invec, minmod):
    check_int(minmod)
    mod = find_modulus(len(invec), minmod)
    root = find_primitive_root(len(invec), mod - 1, mod)
    return (transform(invec, root, mod), root, mod)

def check_int(n):
    if not isinstance(n, numbers.Integral):
        raise TypeError()

def find_modulus(veclen, minimum):
    check_int(veclen)
    check_int(minimum)
    if veclen < 1 or minimum < 1:
        raise ValueError()
    start = (minimum - 1 + veclen - 1) // veclen
    for i in itertools.count(max(start, 1)):
        n = i * veclen + 1
        assert n >= minimum
        if is_prime(n):
            return n

def is_prime(n):
    check_int(n)
    if n <= 1:
        raise ValueError()
    return all((n % i != 0) for i in range(2, sqrt(n) + 1))

def sqrt(n):
    check_int(n)
    if n < 0:
        raise ValueError()
    i = 1
    while i * i <= n:
        i *= 2
    result = 0
    while i > 0:
        if (result + i)**2 <= n:
            result += i
        i //= 2
    return result

def find_primitive_root(degree, totient, mod):
    check_int(degree)
    check_int(totient)
    check_int(mod)
    if not (1 <= degree <= totient < mod):
        raise ValueError()
    if totient % degree != 0:
        raise ValueError()
    gen = find_generator(totient, mod)
    root = pow(gen, totient // degree, mod)
    assert 0 <= root < mod
    return root

def find_generator(totient, mod):
    check_int(totient)
    check_int(mod)
    if not (1 <= totient < mod):
        raise ValueError()
    for i in range(1, mod):
        if is_generator(i, totient, mod):
            return i
    raise ValueError("No generator exists")

def is_generator(val, totient, mod):
    check_int(val)
    check_int(totient)
    check_int(mod)
    if not (0 <= val < mod):
        raise ValueError()
    if not (1 <= totient < mod):
        raise ValueError()
    pf = unique_prime_factors(totient)
    return pow(val, totient, mod) == 1 and all((pow(val, totient // p, mod) != 1) for p in pf)

def unique_prime_factors(n):
    check_int(n)
    if n < 1:
        raise ValueError()
    result = []
    i = 2
    end = sqrt(n)
    while i <= end:
        if n % i == 0:
            n //= i
            result.append(i)
            while n % i == 0:
                n //= i
            end = sqrt(n)
        i += 1
    if n > 1:
        result.append(n)
    return result

def transform(invec, root, mod):
    check_int(root)
    check_int(mod)
    if len(invec) >= mod:
        raise ValueError()
    if not all((0 <= val < mod) for val in invec):
        raise ValueError()
    if not (1 <= root < mod):
        raise ValueError()
    outvec = []
    for i in range(len(invec)):
        temp = 0
        for (j, val) in enumerate(invec):
            temp += val * pow(root, i * j, mod)
            temp %= mod
        outvec.append(temp)
    return outvec

def inverse_transform(invec, root, mod):
    outvec = transform(invec, reciprocal(root, mod), mod)
    scaler = reciprocal(len(invec), mod)
    return [(val * scaler % mod) for val in outvec]

def reciprocal(n, mod):
    check_int(n)
    check_int(mod)
    if not (0 <= n < mod):
        raise ValueError()
    x, y = mod, n
    a, b = 0, 1
    while y != 0:
        a, b = b, a - x // y * b
        x, y = y, x % y
    if x == 1:
        return a % mod
    else:
        raise ValueError("Reciprocal does not exist")

def circular_convolve(vec0, vec1):
    if not (0 < len(vec0) == len(vec1)):
        raise ValueError()
    if any((val < 0) for val in itertools.chain(vec0, vec1)):
        raise ValueError()
    maxval = max(val for val in itertools.chain(vec0, vec1))
    minmod = maxval**2 * len(vec0) + 1
    temp0, root, mod = find_params_and_transform(vec0, minmod)
    temp1 = transform(vec1, root, mod)
    temp2 = [(x * y % mod) for (x, y) in zip(temp0, temp1)]
    return inverse_transform(temp2, root, mod)

vec0 = [24, 12, 28, 8, 0, 0, 0, 0]
vec1 = [4, 26, 29, 23, 0, 0, 0, 0]
print(circular_convolve(vec0, vec1))

def modulo(vec, prime):
    return [x % prime for x in vec]

print(modulo(circular_convolve(vec0, vec1), 31))
Prints:
[96, 672, 1120, 1660, 1296, 876, 184, 0]
[3, 21, 4, 17, 25, 8, 29, 0]
However, when I change minmod = maxval**2 * len(vec0) + 1 to minmod = maxval + 1, it stops working:
[14, 16, 13, 20, 25, 15, 20, 0]
[14, 16, 13, 20, 25, 15, 20, 0]
What is the smallest minmod (N in the link above) that still works as expected?
If your n input integers are bounded by some prime q (the same holds for any modulus q, not just a prime), you can use q as the maximum value + 1. But beware: you cannot use q as the prime p for the NTT, because the NTT prime p must have special properties. All of them are described here:
Translation from Complex-FFT to Finite-Field-FFT
So the maximum value of each input is q-1, but during your computation (convolution of 2 NTT results) the magnitude of the first-layer results can rise up to n.(q-1), and since we are convolving them, the magnitude of the input to the final iNTT can rise up to:
m = n.((q-1)^2)
If you perform different operations on the NTTs, then the equation for m might change.
Now let us get back to p. In a nutshell, you can use any prime p that satisfies these:
p mod n == 1
p > m
and there exist 1 <= r,L < p such that:
p mod (L-1) = 0
r^(L*i) mod p == 1 // i = { 0,n }
r^(L*i) mod p != 1 // i = { 1,2,3, ... n-1 }
If all this is satisfied, then r^L is a primitive nth root of unity modulo p, and p can be used for the NTT. To find such a prime and also r,L, look at the link above (it contains C++ code that finds them).
For example, during string multiplication we take 2 strings, do the NTT on each, multiply the transformed vectors element-wise (which performs the convolution), and do the iNTT to get the result (whose size is the sum of both input sizes). So for example:
99999999999999999999999999999999
*99999999999999999999999999999999
----------------------------------------------------------------
9999999999999999999999999999999800000000000000000000000000000001
Here q = 10 and both operands consist of 32 nines, so n=32, hence m = 9*9*32 = 2592 and the found prime is p = 2689. As you can see, the result matches, so no overflow occurs. However, if I use any smaller prime that still fits all the other conditions, the result will not match. I chose this example specifically to stretch the NTT values as much as possible (all values are q-1 and both sizes are equal to the same power of 2).
In case your NTT is fast and n is not a power of 2, you need to zero pad each NTT to the nearest higher or equal power of 2. But that should not affect the m value, as zero padding does not increase the magnitude of the values. My testing confirms it, so for convolution you can use:
m = (n1+n2).((q-1)^2)/2
where n1,n2 are the raw input sizes before zero padding.
For more info about implementing NTT you can check out mine in C++ (extensively optimized):
Modular arithmetics and NTT (finite field DFT) optimizations
So to answer your questions:
Yes, you can take advantage of the fact that the input is mod q, but you cannot use q as p!
You can use minmod = n * (maxval + 1) only for a single NTT (or the first layer of NTTs), but as you are chaining them with convolution, you cannot use that for the final iNTT stage!
However, as I mentioned in the comments, the easiest approach is to use the maximum possible p that fits in the data type you are using and is usable for all supported power-of-2 input sizes.
That basically renders your question irrelevant. The only case I can think of where this is not possible or desired is with arbitrary-precision numbers, where there is "no" maximum limit. There are many performance issues tied to a variable p: the search for p is really slow (maybe even slower than the NTT itself), and a variable p also rules out many of the performance optimizations of the modular arithmetic, making the NTT really slow.
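To see concretely why the working modulus must exceed the largest true coefficient, here is a small sketch using the vectors from the question. It uses a plain schoolbook circular convolution instead of an NTT, so the helper circular_convolve_exact is only my stand-in for "the exact answer"; 41 is the prime that find_modulus(8, 30) returns once minmod is lowered to maxval + 1.

```python
def circular_convolve_exact(a, b):
    """Schoolbook circular convolution over the integers (no modulus)."""
    n = len(a)
    out = [0] * n
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[(i + j) % n] += x * y
    return out


vec0 = [24, 12, 28, 8, 0, 0, 0, 0]
vec1 = [4, 26, 29, 23, 0, 0, 0, 0]
q = 31  # the field we actually care about

exact = circular_convolve_exact(vec0, vec1)
bound = len(vec0) * (q - 1) ** 2  # m = n.((q-1)^2) from the answer

# Reducing the exact coefficients mod q gives the wanted product in Z_31[x].
wanted = [c % q for c in exact]

# A working modulus of 41 is smaller than some exact coefficients, so they
# wrap around mod 41 first and the later reduction mod 31 is meaningless.
wrapped = [c % 41 for c in exact]

print(exact, max(exact) <= bound)
print(wanted)
print(wrapped)
```

The wrapped values are exactly the wrong output shown in the question, which suggests the transform itself is fine and only the modulus is too small.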

Nth Combination

Is there a direct way of getting the Nth combination of an ordered set of all combinations of nCr?
Example: I have four elements: [6, 4, 2, 1]. All the possible combinations by taking three at a time would be:
[[6, 4, 2], [6, 4, 1], [6, 2, 1], [4, 2, 1]].
Is there an algorithm that would give me e.g. the 3rd answer, [6, 2, 1], in the ordered result set, without enumerating all the previous answers?
Note you can generate the sequence by recursively generating all combinations with the first element, then all combinations without. In both recursive cases, you drop the first element to get all combinations from n-1 elements. In Python:
def combination(l, r):
    if r == 0:
        yield []
    elif len(l) == r:
        yield l
    else:
        for c in (combination(l[1:], r-1)):
            yield l[0:1]+c
        for c in (combination(l[1:], r)):
            yield c
Any time you're generating a sequence by making a choice like this, you can recursively generate the kth element by counting how many elements a choice generates and comparing the count to k. If k is less than the count, you make that choice. Otherwise, subtract the count and repeat for the other possible choices you could make at that point. If there are always b choices, you can view this as generating a number in base b. The technique still works if the number of choices varies. In pseudocode (when all choices are always available):
kth(k, choicePoints)
    if choicePoints is empty
        return empty list
    for each choice in head of choicePoints:
        if k < size of choice
            return choice and kth(k, tail of choicePoints)
        else
            k -= size of choice
    signal exception: k is out-of-bounds
This gives you a 0-based index. If you want 1-based, change the comparison to k <= size of choice.
The tricky part (and what is unspecified in the pseudocode) is that the size of a choice depends on previous choices. Note the pseudocode can be used to solve a more general case than the problem.
For this specific problem, there are two choices (b = 2) and the size of the 1st choice (i.e. including the 1st element) is given by C(n-1, r-1), the number of ways to pick the remaining r-1 elements from the other n-1. Here's one implementation (which requires a suitable nCr):
def kthCombination(k, l, r):
    if r == 0:
        return []
    elif len(l) == r:
        return l
    else:
        i=nCr(len(l)-1, r-1)
        if k < i:
            return l[0:1] + kthCombination(k, l[1:], r-1)
        else:
            return kthCombination(k-i, l[1:], r)
If you reverse the order of the choices, you reverse the order of the sequence.
def reverseKthCombination(k, l, r):
    if r == 0:
        return []
    elif len(l) == r:
        return l
    else:
        i=nCr(len(l)-1, r)
        if k < i:
            return reverseKthCombination(k, l[1:], r)
        else:
            return l[0:1] + reverseKthCombination(k-i, l[1:], r-1)
Putting it to use:
>>> l = [6, 4, 2, 1]
>>> [kthCombination(k, [6, 4, 2, 1], 3) for k in range(nCr(len(l), 3)) ]
[[6, 4, 2], [6, 4, 1], [6, 2, 1], [4, 2, 1]]
>>> powOf2s=[2**i for i in range(4,-1,-1)]
>>> [sum(kthCombination(k, powOf2s, 3)) for k in range(nCr(len(powOf2s), 3))]
[28, 26, 25, 22, 21, 19, 14, 13, 11, 7]
>>> [sum(reverseKthCombination(k, powOf2s, 3)) for k in range(nCr(len(powOf2s), 3))]
[7, 11, 13, 14, 19, 21, 22, 25, 26, 28]
TLDR? Just scroll to the very bottom for my final solution.
I stumbled across this question while looking for methods both to get the index at which a specified combination would be located in a lexicographically sorted list and, vice versa, to get the combination at a given index, for a choice of objects from some potentially very large set. I couldn't find much on the latter (the inverse of your problem is not so elusive).
Since I also solved (what I thought was) your exact problem before, I thought I'd post my solutions to both here.
EDIT: My requirement is what your requirement was too. I saw the answers and thought recursion was fine. Well now, after six long years, you have it; just scroll down.
For your requirement as (I thought it was) posed in the question this will do the job just fine:
def iterCombinations(n, k):
    if k==1:
        for i in range(n):
            yield [i]
        return
    result = []
    for a in range(k-1, n):
        for e in iterCombinations(n, k-1):
            if e[-1] == a:
                break
            yield e + [a]
You can then lookup the item in a collection ordered in the descending order (or use some equivalent compare methodology), so for the case in question:
>>> itemsDescending = [6,4,2,1]
>>> for c in iterCombinations(4, 3):
...     [itemsDescending[i] for i in c]
...
[6, 4, 2]
[6, 4, 1]
[6, 2, 1]
[4, 2, 1]
This is also possible straight out of the box in Python, however:
>>> import itertools
>>> for c in itertools.combinations(itemsDescending, 3):
...     c
...
(6, 4, 2)
(6, 4, 1)
(6, 2, 1)
(4, 2, 1)
Here is what I did for my requirement (and really for yours!) of a non-recursive algorithm that does not create or traverse the ordered list for either direction, but rather uses a simple but effective non-recursive implementation of nCr, choose(n, k):
def choose(n, k):
    '''Returns the number of ways to choose k items from n items'''
    reflect = n - k
    if k > reflect:
        if k > n:
            return 0
        k = reflect
    if k == 0:
        return 1
    for nMinusIPlus1, i in zip(range(n - 1, n - k, -1), range(2, k + 1)):
        n = n * nMinusIPlus1 // i
    return n
To get the combination at some (zero-based) index in a forward sorted list:
def iterCombination(index, n, k):
    '''Yields the items of the single combination that would be at the provided
    (0-based) index in a lexicographically sorted list of combinations of choices
    of k items from n items [0,n), given the combinations were sorted in
    descending order. Yields in descending order.
    '''
    if index < 0 or index >= choose(n, k):
        return
    n -= 1
    for i in range(k):
        while choose(n, k) > index:
            n -= 1
        yield n
        index -= choose(n, k)
        n -= 1
        k -= 1
To get the (zero-based) index at which some combination would reside in a reverse ordered list:
def indexOfCombination(combination):
    '''Returns the (0-based) index the given combination would have if it were in
    a reverse-lexicographically sorted list of combinations of choices of
    len(combination) items from any possible number of items (given the
    combination's length and maximum value)
    - combination must already be in descending order,
      and its items drawn from the set [0,n).
    '''
    result = 0
    for i, a in enumerate(combination):
        result += choose(a, i + 1)
    return result
It's overkill for your example (but I realise now that that was just an example); this is how that would go for each index in turn:
def exampleUseCase(itemsDescending=[6,4,2,1], k=3):
    n = len(itemsDescending)
    print("index -> combination -> and back again:")
    for i in range(choose(n, k)):
        c = [itemsDescending[j] for j in iterCombination(i, n, k)][-1::-1]
        index = indexOfCombination([itemsDescending.index(v) for v in c])
        print("{0} -> {1} -> {2}".format(i, c, index))
>>> exampleUseCase()
index -> combination -> and back again:
0 -> [6, 4, 2] -> 0
1 -> [6, 4, 1] -> 1
2 -> [6, 2, 1] -> 2
3 -> [4, 2, 1] -> 3
This can find the index given some long list or return the combination at some astronomical index in the blink of an eye, for example:
>>> choose(2016, 37)
9617597205504126094112265433349923026485628526002095715212972063686138242753600
>>> list(iterCombination(_-1, 2016, 37))
[2015, 2014, 2013, 2012, 2011, 2010, 2009, 2008, 2007, 2006, 2005, 2004, 2003,
2002, 2001, 2000, 1999, 1998, 1997, 1996, 1995, 1994, 1993, 1992, 1991, 1990, 1989,
1988, 1987, 1986, 1985, 1984, 1983, 1982, 1981, 1980, 1979]
or, since that was the very last one and could be fast due to the reflection in choose(n, k), here's one from right in the middle and it seems just as fast...
>>> choose(2016, 37)//2
4808798602752063047056132716674961513242814263001047857606486031843069121376800
>>> list(iterCombination(_, 2016, 37))
[1978, 1973, 1921, 1908, 1825, 1775, 1747, 1635, 1613, 1598, 1529, 1528, 1521,
1445, 1393, 1251, 1247, 1229, 1204, 1198, 922, 901, 794, 699, 685, 633, 619, 598,
469, 456, 374, 368, 357, 219, 149, 93, 71]
This last example pauses for thought for a split second, but wouldn't you?
>>> import random
>>> rSet = set(random.randint(0, 10000000) for i in range(900))
>>> len(rSet)
900
>>> rList = sorted(rSet, reverse=True)
>>> combinations.indexOfCombination(rList)
61536587905102303838316048492163850175478325236595592744487336325506086930974887
88085020093159925576117511028315621934208381981476407812702689774826510322023536
58905845549371069786639595263444239118366962232872361362581506476113967993096033
00541202874946853699568596881200225925266331936183173583581021914595163799417151
30442624813775945054888304722079206982972852037480516813527237183254850056012217
59834465303543702263588008387352235149083914737690225710105023486226582087736870
38383323140972279867697434315252036074490127510158752080225274972225311906715033
86851377357968649982293794242170046400174118714525559851836064661141086690326842
25236658978135989907667078625869419802333512020715700514133380517628637151215549
05922388534567108671308819960483147825031620798631811671493891643972220604919591
22785587505280326638477135315176731640100473359830821781905546117103137944239120
34912084544221250309244925308316352643060056100719194985568284049903555621750881
39419639825279398618630525081169688672242833238889454445237928356800414839702024
66807635358129606994342005075585962080795273287472139515994244684088406544976674
84183671032002497594936116837768233617073949894918741875863985858049825755901232
89317507965160689287607868119414903299382093412911433254998227245783454244894604
83654290108678890682359278892580855226717964180806265176337132759167920384512456
91624558534942279041452960272707049107641475225516294235268581475735143470692000
78400891862852130481822509803019636619427631175355448729708451565341764545325720
79277290914349746541071731127111532099038538549697091038496002102703737347343739
96398832832674081286904287066696046621691978697914823322322650123025472624927566
99891468668052668317066769517155581261265629289158798073055495539590686279250097
27295943276536772955923599217742543093669565147228386873469711200278811335649924
13587219640724942441913695193417732608127949738209466313175361161142601108707568
19470026889319648128790363676253707359290547393198350533094409863254710237344552
47692325209744353688541868412075798500629908908768438513508959321262250985142709
19794478379412756202638771417821781240327337108495689300616872374578607430951230
96908870723878513999404242546015617238957825116802801618973562178005776911079790
22026655573872019955677676783191505879571719659770550759779880002320421606755826
75809722478174545846409923210824885805972611279030267270741509747224602604003738
30411365119180944456819762167312738395140461035991994771968906979578667047734952
21981545694935313345331923300019842406900689401417602004228459137311983483386802
30352489602769346000257761959413965109940729263098747702427952104316612809425394
85037536245288888254374135695390839718978818689595231708490351927063849922772653
26064826999661128817511630298712833048667406916285156973335575847429111697259113
53969532522640227276562651123634766230804871160471143157687290382053412295542343
14022687833967461351170188107671919648640149202504369991478703293224727284508796
06843631262345918398240286430644564444566815901074110609701319038586170760771099
41252989796265436701638358088345892387619172572763571929093224171759199798290520
71975442996399826830220944004118266689537930602427572308646745061258472912222347
18088442198837834539211242627770833874751143136048704550494404981971932449150098
52555927020553995188323691320225317096340687798498057634440618188905647503384292
79493920419695886724506109053220167190536026635080266763647744881063220423654648
36855624855494077960732944499038847158715263413026604773216510801253044020991845
89652657529729792772055725210165026891724511953666038764273616212464901231675592
46950937136633665320781952510620087284589083139308516989522633786063418913473703
96532777760440118656525488729217328376766171004246127636983612583177565603918697
15557602015171235214344399010185766876727226408494760175957535995025356361689144
85181975631986409708533731043231896096597038345028523539733981468056497208027899
6245509252811753667386001506195
However, going back from that index to the combination of 10,000,000-choose-900 that it represents would be very slow with the previous implementation (since it simply subtracts one from n at each iteration).
For such large lists of combinations we can instead do a binary search of the space, and the overhead we add means it will only be a little slower for small lists of combinations:
def iterCombination(index, n, k):
    '''Yields the items of the single combination that would be at the provided
    (0-based) index in a lexicographically sorted list of combinations of choices
    of k items from n items [0,n), given the combinations were sorted in
    descending order. Yields in descending order.
    '''
    if index < 0 or n < k or n < 1 or k < 1 or choose(n, k) <= index:
        return
    for i in range(k, 0, -1):
        d = (n - i) // 2 or 1
        n -= d
        while 1:
            nCi = choose(n, i)
            while nCi > index:
                d = d // 2 or 1
                n -= d
                nCi = choose(n, i)
            if d == 1:
                break
            n += d
            d //= 2
            n -= d
        yield n
        index -= nCi
From this one may notice that all the calls to choose have terms that cancel, if we cancel everything out we end up with a much faster implementation and what is, I think...
The optimal function for this problem
def iterCombination(index, n, k):
    '''Yields the items of the single combination that would be at the provided
    (0-based) index in a lexicographically sorted list of combinations of choices
    of k items from n items [0,n), given the combinations were sorted in
    descending order. Yields in descending order.
    '''
    nCk = 1
    for nMinusI, iPlus1 in zip(range(n, n - k, -1), range(1, k + 1)):
        nCk *= nMinusI
        nCk //= iPlus1
    curIndex = nCk
    for k in range(k, 0, -1):
        nCk *= k
        nCk //= n
        while curIndex - nCk > index:
            curIndex -= nCk
            nCk *= (n - k)
            nCk -= nCk % k
            n -= 1
            nCk //= n
        n -= 1
        yield n
A final reminder that for the use case of the question one would do something like this:
def combinationAt(index, itemsDescending, k):
    return [itemsDescending[i] for i in
            list(iterCombination(index, len(itemsDescending), k))[-1::-1]]
>>> itemsDescending = [6,4,2,1]
>>> numberOfItemsBeingChosen = 3
>>> zeroBasedIndexWanted = 1
>>> combinationAt(zeroBasedIndexWanted, itemsDescending, numberOfItemsBeingChosen)
[6, 4, 1]
One way to do it would be by using properties of bits. This still requires some enumeration, but you wouldn't have to enumerate every set.
For your example, you have 4 numbers in your set. So if you were generating all the possible combinations of 4 numbers, you could enumerate them as follows:
{6, 4, 2, 1}
0000 - {(no numbers in set)}
0001 - {1}
0010 - {2}
0011 - {2, 1}
...
1111 - {6, 4, 2, 1}
See how each "bit" corresponds to "whether that number is in your set"? We see here that there are 16 possibilities (2^4).
So now we can go through and find all of the possibilities that have only 3 bits turned on. This will tell us all of the combinations of "3" that exist:
0111 - {4, 2, 1}
1011 - {6, 2, 1}
1101 - {6, 4, 1}
1110 - {6, 4, 2}
And let's rewrite each of our binary values as decimal values:
0111 = 7
1011 = 11
1101 = 13
1110 = 14
Now that we've done that: you said you wanted the "3rd" enumeration. So let's look at the 3rd largest number: 11. It has the bit pattern 1011, which corresponds to... {6, 2, 1}
Cool!
Basically, you can use the same concept for any set. So now all we've done is translate the problem from "enumerating all the sets" to "enumerating all of the integers". This might be a lot easier for your problem.
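As a rough sketch of the bit idea (the function name and the choice to map the leftmost bit to the first item are mine), the following lists every mask with exactly r bits set, largest first, and decodes each one back into a subset. Note that it still scans all 2^n integers, so it only illustrates the correspondence and is not a fast way to reach the Nth combination.

```python
def combos_by_bitmask(items, r):
    """Return (mask, subset) pairs for all r-subsets, largest mask first."""
    n = len(items)
    # Keep only the integers whose binary form has exactly r one-bits.
    masks = [m for m in range(1 << n) if bin(m).count("1") == r]
    masks.sort(reverse=True)
    result = []
    for m in masks:
        # The leftmost (most significant) bit stands for items[0].
        subset = [items[i] for i in range(n) if m & (1 << (n - 1 - i))]
        result.append((m, subset))
    return result


pairs = combos_by_bitmask([6, 4, 2, 1], 3)
for mask, subset in pairs:
    print(format(mask, "04b"), mask, subset)

third = pairs[2]  # the "3rd" enumeration from the question
print(third)
```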
From the Python 3.6 itertools recipes:
def nth_combination(iterable, r, index):
    'Equivalent to list(combinations(iterable, r))[index]'
    pool = tuple(iterable)
    n = len(pool)
    if r < 0 or r > n:
        raise ValueError
    c = 1
    k = min(r, n-r)
    for i in range(1, k+1):
        c = c * (n - k + i) // i
    if index < 0:
        index += c
    if index < 0 or index >= c:
        raise IndexError
    result = []
    while r:
        c, n, r = c*r//n, n-1, r-1
        while index >= c:
            index -= c
            c, n = c*(n-r)//n, n-1
        result.append(pool[-1-n])
    return tuple(result)
In practice:
iterable, r, index = [6, 4, 2, 1], 3, 2
nth_combination(iterable, r, index)
# (6, 2, 1)
Alternatively, as mentioned in the docstring:
import itertools as it
list(it.combinations(iterable, r))[index]
# (6, 2, 1)
See also more_itertools - a third party library that implements this recipe for you. Install via:
> pip install more_itertools
Just a rough sketch:
Arrange your numbers into an upper triangular matrix of tuples:
A(n-1,n-1)
Aij = [i+1, j-1]
If you traverse the matrix row first, you will get the combinations of two elements in increasing order. To generalize to three elements, think of your matrix rows as another triangular matrix, rather than a vector. It kind of creates a corner of a cube.
At least this is how I would approach the problem.
Let me clarify this: you do not have to store the matrix, you only need to compute the index.
Let me work out the two-dimensional example, which you could in principle expand to 20 dimensions (the bookkeeping may be atrocious).
ij = (i*i + i)/2 + j // ij is also the combination number
(i,j) = decompose(ij) // from ij one can recover i,j components
I = i // actual first index
J = j + 1 // actual second index
This two-dimensional example works for any number n, and you don't have to tabulate permutations.
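The decompose step is left undefined above. One way it could be done, assuming 0 <= j <= i, is to invert the triangular number with an integer square root. The function names are mine, and mapping (i, j) to the two distinct positions j and i + 1 is my own choice of offsets rather than the answer's. math.isqrt needs Python 3.8 or later.

```python
from itertools import combinations
from math import isqrt  # Python 3.8+


def compose(i, j):
    """Rank of the pair (i, j) with 0 <= j <= i: ij = (i*i + i)/2 + j."""
    return i * (i + 1) // 2 + j


def decompose(ij):
    """Inverse of compose: recover (i, j) from the rank ij."""
    # i is the largest integer with i*(i+1)/2 <= ij.
    i = (isqrt(8 * ij + 1) - 1) // 2
    j = ij - i * (i + 1) // 2
    return i, j


def nth_pair(items, ij):
    """The ij-th 2-combination of items, without enumerating the others."""
    i, j = decompose(ij)
    return items[j], items[i + 1]  # j < i + 1, so the positions differ


items = [6, 4, 2, 1]
all_pairs = [nth_pair(items, ij) for ij in range(6)]
print(all_pairs)
```

The pairs come out in a different order than itertools.combinations would produce, but each of the six pairs appears exactly once.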
Yes, there is a direct way of getting the Nth combination of an ordered set of all combinations of nCr. Say you need to generate the 0th, 3rd, 6th, ... combinations of a given set. You can generate them directly, without generating the combinations in between, using JNumberTools. You can even jump to the billionth combination (if your set is large enough).
Here is the code example:
JNumberTools.combinationsOf(list)
    .uniqueNth(8,1000_000_000) //skip to billionth combination of size 8
    .forEach(System.out::println);
The maven dependency for JNumberTools is :
<dependency>
    <groupId>io.github.deepeshpatel</groupId>
    <artifactId>jnumbertools</artifactId>
    <version>1.0.0</version>
</dependency>
