Standard ML semifactorial using a reduce function - functional-programming

I am following this tutorial on standard ML: http://homepages.inf.ed.ac.uk/stg/NOTES/node2.html, and I came across this problem:
The semifactorial of a positive integer n is 1 × 3 × 5 × ... × n if n is odd and 2 × 4 × 6 × ... × n if n is even. Use the reduce function to define a semifac function which calculates semifactorials.
Reduce is defined as:
fun reduce (g, e, m, n, f) =
if m > n then e else g (reduce (g, e, m, n-1, f), f n);
I've spent a couple of hours messing around with the problem and can't find a satisfying answer which doesn't require altering the reduce function. The problem becomes easy if you redefine reduce as:
fun reduce' (g, e, m, n, f) =
if m > n then e else g (reduce'(g, e, m, n-2, f), f n);
With the final solution as:
fun semifactorial n = reduce'(fn (x,y) => x * y, 1, 1, n, fn x=>x);
Which is what I think the author was trying to get at, but I'm not sure. Is there anyway to solve this without altering the definition of reduce? I'm thinking that there is some really obvious functional way to do this, but I can't see how to make reduce decrement by two instead of one (my intuition says the answer lies in choosing correct function vals for g and f).

You're very close. The answer is in choosing a correct f. In particular, it's key that f can depend on n.
fun semifactorial n = reduce(fn (x,y) => x * y, 1, 1,
n, fn x => if x mod 2 = n mod 2 then x else 1)
should work.
Alternately, and a bit more cleanly:
fun foo n x = if (x + n) mod 2 = 0 then x else 1
fun semifactorial n = reduce((op* ), 1, 1, n, foo n x)
though this form does have potential to fail with particularly large values of n and x if your ML implementation doesn't support arbitrary precision integers.

You can use f to decide whether to multiply the index or not. It should only multiply them if the parities are the same.
fun sfact n = reduce(op*, 1, 1, n, fn k => if n mod 2 + k mod 2 = 1 then 1 else k);

Related

Implementation of Pohlig-Hellman unable to solve for large exponents

I'm trying to create an implementation of the Pohlig-Hellman algorithm in order to create a utility to craft / exploit backdoors in implementations of the Diffie-Hellman protocol. This project is inspired by this 2016 white-paper by NCC Group.
I currently have an implementation, here, that works for relatively small exponents – i.e. Given a linear congruence, g^x = h (mod n), for some specially-crafted modulus, n = pq, where p and q are prime, my implementation can solve for values of x smaller than min{ p, q }.
However, if x is larger than the smallest prime factor of n, then my implementation will give an incorrect solution. I suspect that the issue may not be with my implementation of Pohlig-Hellman, itself, but with the arguments I am passing to it. All the code can be found at the link, provided above, but I'll copy the relevant code snippets, here:
#
# Implementation of Pohlig-Hellman algorithm
#
# The `crt` function implements the Chinese Remainder Theorem, and the `pollard` function implements
# Pollard's Rho algorithm for discrete logarithms (see /dph/crt.py and /dph/pollard.py).
#
def pohlig(G, H, P, factors):
g = [pow(G, divexact(P - 1, f), P) for f in factors]
h = [pow(H, divexact(P - 1, f), P) for f in factors]
if Config.verbose:
x = []
total = len(factors)
for i, (gi, hi) in enumerate(zip(g, h), start=1):
print('Solving discrete logarithm {}/{}...'.format(str(i).rjust(len(str(total))), total))
result = pollard(gi, hi, P)
x.append(result)
print(f'x = 0x{result.digits(16)}')
else:
x = [pollard(gi, hi, P) for gi, hi in zip(g, h)]
return crt(x, factors)
Above is my implementation of Pohlig-Hellman, and below is where I call it to exploit a backdoor in some implementation of the Diffie-Hellman protocol.
def _exp(args):
g = args.g
h = args.h
p_factors = list(map(mpz, args.p_factors.split(',')))
try:
p_factors.remove(2)
except ValueError:
pass
q_factors = list(map(mpz, args.q_factors.split(',')))
try:
q_factors.remove(2)
except ValueError:
pass
p = 2 * _product(*p_factors) + 1
q = 2 * _product(*q_factors) + 1
if Config.verbose:
print(f'p = 0x{p.digits(16)}')
print(f'q = 0x{q.digits(16)}')
print()
print(f'Compute the discrete logarithm modulo `p`')
print(f'-----------------------------------------')
px = pohlig(g % p, h % p, p, p_factors)
if Config.verbose:
print()
print(f'Compute the discrete logarithm modulo `q`')
print(f'-----------------------------------------')
qx = pohlig(g % q, h % q, q, q_factors)
if Config.verbose:
print()
x = crt([px, qx], [p, q])
print(f'x = 0x{x.digits(16)}')
Here is a summary of what I am doing:
Choose a prime p = 2 * prod{ p_i } + 1, where p_i denotes a set of primes.
Choose a prime q = 2 * prod{ q_j } + 1, where q_j denotes a set of primes.
Inject n = pq as the backdoor modulus in some implementation of Diffie-Hellman.
Wait for a victim (e.g. Alice computes A = g^a (mod n), and Bob computes B = g^b (mod n)).
Solve for Alice's or Bob's secret exponent, a or b, and compute their shared secret key, K = A^b = B^a (mod n).
Step #5 is done by performing the Pohlig-Hellman algorithm twice to solve for x (mod p) and x (mod q), and then the Chinese Remainder Theorem is used to solve for x (mod n).
EDIT
The x that I am referring to in the description of step #5 is either Alice's secret exponent, a, or Bob's secret exponent, b, depending on which we choose to solve for, since only one is needed to compute the shared secret key, K.

Concatenation of binary representation of first n positive integers in O(logn) time complexity

I came across this question in a coding competition. Given a number n, concatenate the binary representation of first n positive integers and return the decimal value of the resultant number formed. Since the answer can be large return answer modulo 10^9+7.
N can be as large as 10^9.
Eg:- n=4. Number formed=11011100(1=1,10=2,11=3,100=4). Decimal value of 11011100=220.
I found a stack overflow answer to this question but the problem is that it only contains a O(n) solution.
Link:- concatenate binary of first N integers and return decimal value
Since n can be up to 10^9 we need to come up with solution that is better than O(n).
Here's some Python code that provides a fast solution; it uses the same ideas as in Abhinav Mathur's post. It requires Python >= 3.8, but it doesn't use anything particularly fancy from Python, and could easily be translated into another language. You'd need to write algorithms for modular exponentiation and modular inverse if they're not already available in the target language.
First, for testing purposes, let's define the slow and obvious version:
# Modulus that results are reduced by,
M = 10 ** 9 + 7
def slow_binary_concat(n):
"""
Concatenate binary representations of 1 through n (inclusive).
Reinterpret the resulting binary string as an integer.
"""
concatenation = "".join(format(k, "b") for k in range(n + 1))
return int(concatenation, 2) % M
Checking that we get the expected result:
>>> slow_binary_concat(4)
220
>>> slow_binary_concat(10)
462911642
Now we'll write a faster version. First, we split the range [1, n) into subintervals such that within each subinterval, all numbers have the same length in binary. For example, the range [1, 10) would be split into four subintervals: [1, 2), [2, 4), [4, 8) and [8, 10). Here's a function to do that splitting:
def split_by_bit_length(n):
"""
Split the numbers in [1, n) by bit-length.
Produces triples (a, b, 2**k). Each triple represents a subinterval
[a, b) of [1, n), with a < b, all of whose elements has bit-length k.
"""
a = 1
while n > a:
b = 2 * a
yield (a, min(n, b), b)
a = b
Example output:
>>> list(split_by_bit_length(10))
[(1, 2, 2), (2, 4, 4), (4, 8, 8), (8, 10, 16)]
Now for each subinterval, the value of the concatenation of all numbers in that subinterval is represented by a fairly simple mathematical sum, which can be computed in exact form. Here's a function to compute that sum modulo M:
def subinterval_concat(a, b, l):
"""
Concatenation of values in [a, b), all of which have the same bit-length k.
l is 2**k.
Equivalently, sum(i * l**(b - 1 - i)) for i in range(a, b)) modulo M.
"""
n = b - a
inv = pow(l - 1, -1, M)
q = (pow(l, n, M) - 1) * inv
return (a * q + (q - n) * inv) % M
I won't go into the evaluation of the sum here: it's a bit off-topic for this site, and it's hard to express without a good way to render formulas. If you want the details, that's a topic for https://math.stackexchange.com, or a page of fairly simple algebra.
Finally, we want to put all the intervals together. Here's a function to do that.
def fast_binary_concat(n):
"""
Fast version of slow_binary_concat.
"""
acc = 0
for a, b, l in split_by_bit_length(n + 1):
acc = (acc * pow(l, b - a, M) + subinterval_concat(a, b, l)) % M
return acc
A comparison with the slow version shows that we get the same results:
>>> fast_binary_concat(4)
220
>>> fast_binary_concat(10)
462911642
But the fast version can easily be evaluated for much larger inputs, where using the slow version would be infeasible:
>>> fast_binary_concat(10**9)
827129560
>>> fast_binary_concat(10**18)
945204784
You just have to note a simple pattern. Taking up your example for n=4, let's gradually build the solution starting from n=1.
1 -> 1 #1
2 -> 2^2(1) + 2 #6
3 -> 2^2[2^2(1)+2] + 3 #27
4 -> 2^3{2^2[2^2(1)+2]+3} + 4 #220
If you expand the coefficients of each term for n=4, you'll get the coefficients as:
1 -> (2^3)*(2^2)*(2^2)
2 -> (2^3)*(2^2)
3 -> (2^3)
4 -> (2^0)
Let the N be total number of bits in the string representation of our required number, and D(x) be the number of bits in x. The coefficients can then be written as
1 -> 2^(N-D(1))
2 -> 2^(N-D(1)-D(2))
3 -> 2^(N-D(1)-D(2)-D(3))
... and so on
Since the value of D(x) will be the same for all x between range (2^t, 2^(t+1)-1) for some given t, you can break the problem into such ranges and solve for each range using mathematics (not iteration). Since the number of such ranges will be log2(Given N), this should work in the given time limit.
As an example, the various ranges become:
1. 1 (D(x) = 1)
2. 2-3 (D(x) = 2)
3. 4-7 (D(x) = 3)
4. 8-15 (D(x) = 4)

Generating a formula

x = n
while x > 0:
x = x // 2
Let x_k denote the variable x after k iterations. How do I find x_k?
Is it floor(n / 2)^k?
For integers you can use right shift
x_k = n >> k
If you want to use division (note k power is not applied to n)
x_k = n // (2**k)

Creating an iterative function in SML

I've been stuck on this problem for a while now, and can't think of how to solve it.
Consider the following function f : N → N .
f(0) = 2, f(1) = 0, f(2) = 3,
f(n) = 3f(n-3) + 2f(n-2) - f(n-1) for n≥3.
Define an iterative version of f.
I know my solution should look something like this
fun myFun 0 = 2
| myFun 1 = 0
| myFun 2 = 3
| myFun n =
let
(* code *)
in
funHelp(3,2,0,n)
end ;
I know iterative functions are only suppose to use one recursive call, while letting the arguments do all the work. I can't figure out how to do it with this problem, though! Any help would be much appreciated!
I assume this is homework, so I just give you a partial solution:
fun f n =
let
fun iter(0, n0, n1, n2) = n0
| iter(1, n0, n1, n2) = n1
| iter(2, n0, n1, n2) = n2
| iter(n, n0, n1, n2) = iter(n - 1, n1, n2, ???)
in
iter(n, 2, 0, 3)
end
Filling in the ??? shouldn't be too hard.
fun funHelp 0 = (2,0,3)
| funHelp n = let val (x,y,z) = funHelp n - 1
in
(y,z,(3 * x) + (2 * y) - z)
end
fun myFun n = let val (x,_,_) = funHelp n
in
x
end
This should do what you seem to want, if a bit messily.

Accumulating Curried Function (SML)

I have a set of problems that I've been working through and can't seem to understand what the last one is asking. Here is the first problem, and my solution to it:
a) Often we are interested in computing ∑i=m..n f(i), the sum of function values f(i) for i = m through n. Define sigma f m n which computes ∑i=m..n f(i). This is different from defining sigma (f, m, n).
fun sigma f m n = if (m=n) then f(m) else (f(m) + sigma f (m+1) n);
The second problem, and my solution:
b) In the computation of sigma above, the index i goes from current
i to next value i+1. We may want to compute the sum of f(i) where i
goes from current i to the next, say i+2, not i+1. If we send this
information as an argument, we can compute more generalized
summation. Define ‘sum f next m n’ to compute such summation, where
‘next’ is a function to compute the next index value from the
current index value. To get ‘sigma’ in (a), you send the successor
function as ‘next’.
fun sum f next m n = if (m>=n) then f(m) else (f(m) + sum f (next) (next(m)) n);
And the third problem, with my attempt:
c) Generalizing sum in (b), we can compute not only summation but also
product and other forms of accumulation. If we want to compute sum in
(b), we send addition as an argument; if we want to compute the
product of function values, we send multiplication as an argument for
the same parameter. We also have to send the identity of the
operator. Define ‘accum h v f next m n’ to compute such accumulation,
where h is a two-variable function to do accumulation, and v is the
base value for accumulation. If we send the multiplication function
for h, 1 for v, and the successor function as ‘next’, this ‘accum’
computes ∏i=m..n f(i). Create examples whose ‘h’ is not addition or
multiplication, too.
fun accum h v f next m n = if (m>=n) then f(m) else (h (f(m)) (accum (h) (v) (f) (next) (next(m)) n));
In problem C, I'm unsure of what i'm suppose to do with my "v" argument. Right now the function will take any interval of numbers m - n and apply any kind of operation to them. For example, I could call my function
accum mult (4?) double next3 1 5;
where double is a doubling function and next3 adds 3 to a given value. Any ideas on how i'm suppoes to utilize the v value?
This set of problems is designed to lead to implementation of accumulation function. It takes
h - combines previous value and current value to produce next value
v - starting value for h
f - function to be applied to values from [m, n) interval before passing them to h function
next - computes next value in sequence
m and n - boundaries
Here is how I'd define accum:
fun accum h v f next m n = if m >= n then v else accum h (h (f m) v) f next (next m) n
Examples that were described in C will look like this:
fun sum x y = x + y;
fun mult x y = x * y;
fun id x = x;
accum sum 0 id next 1 10; (* sum [1, 10) staring 0 *)
accum mult 1 id next 1 10; (* prod [1, 10) starting 1 *)
For example, you can calculate sum of numbers from 1 to 10 and plus 5 if you pass 5 as v in first example.
The instructions will make more sense if you consider the possibility of an empty interval.
The "sum" of a single value n is n. The sum of no values is zero.
The "product" of a single value n is n. The product of no values is one.
A list of a single value n is [n] (n::nil). A list of no values is nil.
Currently, you're assuming that m ≤ n, and treating m = n as a special case that returns f m. Another approach is to treat m > n as the special case, returning v. Then, when m = n, your function will automatically return h v (f m), which is the same as (f m) (provided that v was selected properly for this h).
To be honest, though, I think the v-less approach is fine when the function's arguments specify an interval of the form [m,n], since there's no logical reason that such a function would support an empty interval. (I mean, [m,m−1] isn't so much "the empty interval" as it is "obvious error".) The v-ful approach is chiefly useful when the function's arguments specify a list or set of elements in some way that really could conceivably be empty, e.g. as an 'a list.

Resources