I'm trying to create an implementation of the Pohlig-Hellman algorithm in order to create a utility to craft / exploit backdoors in implementations of the Diffie-Hellman protocol. This project is inspired by this 2016 white-paper by NCC Group.
I currently have an implementation, here, that works for relatively small exponents, i.e. given a discrete logarithm problem, g^x = h (mod n), for some specially-crafted modulus, n = pq, where p and q are prime, my implementation can solve for values of x smaller than min{ p, q }.
However, if x is larger than the smallest prime factor of n, then my implementation will give an incorrect solution. I suspect that the issue may not be with my implementation of Pohlig-Hellman, itself, but with the arguments I am passing to it. All the code can be found at the link, provided above, but I'll copy the relevant code snippets, here:
#
# Implementation of Pohlig-Hellman algorithm
#
# The `crt` function implements the Chinese Remainder Theorem, and the `pollard` function implements
# Pollard's Rho algorithm for discrete logarithms (see /dph/crt.py and /dph/pollard.py).
#
def pohlig(G, H, P, factors):
    g = [pow(G, divexact(P - 1, f), P) for f in factors]
    h = [pow(H, divexact(P - 1, f), P) for f in factors]
    if Config.verbose:
        x = []
        total = len(factors)
        for i, (gi, hi) in enumerate(zip(g, h), start=1):
            print('Solving discrete logarithm {}/{}...'.format(str(i).rjust(len(str(total))), total))
            result = pollard(gi, hi, P)
            x.append(result)
            print(f'x = 0x{result.digits(16)}')
    else:
        x = [pollard(gi, hi, P) for gi, hi in zip(g, h)]
    return crt(x, factors)
Above is my implementation of Pohlig-Hellman, and below is where I call it to exploit a backdoor in some implementation of the Diffie-Hellman protocol.
def _exp(args):
    g = args.g
    h = args.h
    p_factors = list(map(mpz, args.p_factors.split(',')))
    try:
        p_factors.remove(2)
    except ValueError:
        pass
    q_factors = list(map(mpz, args.q_factors.split(',')))
    try:
        q_factors.remove(2)
    except ValueError:
        pass
    p = 2 * _product(*p_factors) + 1
    q = 2 * _product(*q_factors) + 1
    if Config.verbose:
        print(f'p = 0x{p.digits(16)}')
        print(f'q = 0x{q.digits(16)}')
        print()
        print(f'Compute the discrete logarithm modulo `p`')
        print(f'-----------------------------------------')
    px = pohlig(g % p, h % p, p, p_factors)
    if Config.verbose:
        print()
        print(f'Compute the discrete logarithm modulo `q`')
        print(f'-----------------------------------------')
    qx = pohlig(g % q, h % q, q, q_factors)
    if Config.verbose:
        print()
    x = crt([px, qx], [p, q])
    print(f'x = 0x{x.digits(16)}')
Here is a summary of what I am doing:
1. Choose a prime p = 2 * prod{ p_i } + 1, where p_i denotes a set of primes.
2. Choose a prime q = 2 * prod{ q_j } + 1, where q_j denotes a set of primes.
3. Inject n = pq as the backdoor modulus in some implementation of Diffie-Hellman.
4. Wait for a victim (e.g. Alice computes A = g^a (mod n), and Bob computes B = g^b (mod n)).
5. Solve for Alice's or Bob's secret exponent, a or b, and compute their shared secret key, K = A^b = B^a (mod n).
Step #5 is done by performing the Pohlig-Hellman algorithm twice to solve for x (mod p) and x (mod q), and then the Chinese Remainder Theorem is used to solve for x (mod n).
EDIT
The x that I am referring to in the description of step #5 is either Alice's secret exponent, a, or Bob's secret exponent, b, depending on which we choose to solve for, since only one is needed to compute the shared secret key, K.
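A toy sketch of step #5 with small primes, which may help when checking the moduli passed to the final `crt` call. Everything here is an assumption made for illustration: the primes 67 and 71, the base g = 2, the brute-force `dlog_bruteforce` standing in for `pollard`, and the local `crt` helper are not taken from the linked project. Because 2 is removed from the factor lists, each Pohlig-Hellman call in this sketch can only determine x modulo the product of the remaining factors, (p - 1)/2 and (q - 1)/2, so the sketch combines the two residues with those moduli and compares the result against combining them modulo p and q.

```python
from math import prod


def crt(residues, moduli):
    """Combine x = r_i (mod m_i) for pairwise-coprime moduli m_i."""
    M = prod(moduli)
    x = 0
    for r, m in zip(residues, moduli):
        Mi = M // m
        x += r * Mi * pow(Mi, -1, m)
    return x % M


def dlog_bruteforce(g, h, p, order):
    """Smallest e in [0, order) with g^e = h (mod p); stand-in for Pollard's rho."""
    for e in range(order):
        if pow(g, e, p) == h:
            return e
    raise ValueError("no discrete logarithm found")


def pohlig_toy(g, h, p, factors):
    """Pohlig-Hellman over the given prime factors of p - 1 (each with exponent 1)."""
    residues = []
    for f in factors:
        gi = pow(g, (p - 1) // f, p)  # element of order f (or 1)
        hi = pow(h, (p - 1) // f, p)
        residues.append(dlog_bruteforce(gi, hi, p, f))
    return crt(residues, factors)  # x modulo prod(factors)


p_factors, q_factors = [3, 11], [5, 7]
p = 2 * prod(p_factors) + 1  # 67
q = 2 * prod(q_factors) + 1  # 71
n = p * q

g = 2
x = 1000  # deliberately larger than min(p, q)
h = pow(g, x, n)

px = pohlig_toy(g % p, h % p, p, p_factors)  # x mod (p - 1)/2
qx = pohlig_toy(g % q, h % q, q, q_factors)  # x mod (q - 1)/2

x_mod_pq = crt([px, qx], [p, q])
x_mod_orders = crt([px, qx], [prod(p_factors), prod(q_factors)])

print("combined modulo p and q:            ", x_mod_pq)
print("combined modulo (p-1)/2 and (q-1)/2:", x_mod_orders)
```

In this sketch the second combination is only defined modulo (p - 1)(q - 1)/4, so larger exponents would additionally need the residue for the factor 2 that was dropped.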
I am trying to add two points on an elliptic curve over a prime field, converting the points from affine to Jacobian coordinates and back to affine, but I cannot get a correct result (the curve I am testing has a=0). Can anyone see what's wrong?
// From Affine
BigInteger X1=P.x;
BigInteger Y1=P.y;
BigInteger Z1=BigInteger.ONE;
BigInteger X2=Q.x;
BigInteger Y2=Q.y;
BigInteger Z2=BigInteger.ONE;
// Point addition in Jacobian coordinates for a=0
// see http://www.hyperelliptic.org/EFD/g1p/auto-shortw-jacobian-0.html#addition-add-2007-bl
BigInteger Z1Z1 = Z1.multiply(Z1);
BigInteger Z2Z2 = Z2.multiply(Z2);
BigInteger U1 = X1.multiply(Z2Z2);
BigInteger U2 = X2.multiply(Z1Z1);
BigInteger S1 = Y1.multiply(Z2).multiply(Z2Z2);
BigInteger S2 = Y2.multiply(Z1).multiply(Z1Z1);
BigInteger H = U2.subtract(U1);
BigInteger I = H.add(H).multiply(H.add(H));
BigInteger J = H.multiply(I);
BigInteger r = S2.subtract(S1).add(S2.subtract(S1));
BigInteger V = U1.multiply(I);
BigInteger X3 = r.multiply(r).subtract(J).subtract(V.add(V)).mod(FIELD);
BigInteger Y3 = r.multiply(V.subtract(X3)).subtract(S1.add(S1).multiply(J)).mod(FIELD);
BigInteger Z3 = Z1.add(Z2).multiply(Z1.add(Z2)).subtract(Z1Z1).subtract(Z2Z2).multiply(H).mod(FIELD);
//To affine
BigInteger Z3Z3 = Z3.multiply(Z3);
BigInteger Z3Z3Z3 = Z3Z3.multiply(Z3);
return new Point(X3.divide(Z3Z3),Y3.divide(Z3Z3Z3));
CodesInChaos said:
The division can't be right. You need to compute the multiplicative inverse modulo FIELD. This operation is quite expensive, and should only be performed once at the end of a scalar multiplication, not after each doubling/addition. Use z^{-1} = ModPow(z, FIELD-2, FIELD).
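A minimal sketch of the conversion back to affine coordinates described in the comment above, written in Python rather than Java for brevity. The function name `to_affine` and the small prime 97 are assumptions chosen for illustration; the inverse is computed with Fermat's little theorem, so the modulus must be prime and Z must be non-zero modulo it.

```python
def to_affine(X, Y, Z, field):
    """Convert Jacobian (X, Y, Z) to affine (X / Z^2, Y / Z^3) modulo a prime."""
    z_inv = pow(Z, field - 2, field)  # Z^(-1) mod field, valid for prime field
    z_inv2 = (z_inv * z_inv) % field
    x = (X * z_inv2) % field
    y = (Y * z_inv2 * z_inv) % field
    return x, y


# Round trip: scale an affine pair into Jacobian form, then convert it back.
field = 97
x, y, z = 5, 7, 10
X, Y, Z = (x * z * z) % field, (y * z ** 3) % field, z
print(to_affine(X, Y, Z, field))
```

The round trip only checks the coordinate conversion; it does not check that the pair lies on any particular curve.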
I have a RSA private key with modulus m, public exponent e and private exponent d, but the program I am using needs the modulus's prime factors p and q.
Is it possible to use e and d to get p and q?
Yes -- once you know the modulus N, and the public/private exponents e and d, it is not too difficult to obtain p and q such that N=pq.
This paper by Dan Boneh describes an algorithm for doing so. It relies
on the fact that, by definition,
de = 1 mod phi(N).
For any randomly chosen "witness"
in (2,N), there is about a 50% chance of being able to use it to find a nontrivial
square root of 1 mod N (call it x). Then gcd(x-1,N) gives one of the factors.
You can use the open source tool I have developed in 2009 that converts RSA keys between the SFM format (n,e,d) and CRT format (p,q,dp,dq,u), and the other way around. It is on SourceForge : http://rsaconverter.sourceforge.net/
The algorithm I implemented is based on ideas presented by Dan Boneh, as described by the previous answer.
I hope this will be useful.
Mounir IDRASSI - IDRIX
I posted a response on the crypto stack exchange answering the same question here. It uses the same approach as outlined in Boneh's paper, but does a lot more explanation as to how it actually works. I also try to assume a minimal amount of prior knowledge.
Hope this helps!
I put in the effort to dig through Boneh's paper. The "algorithm" for deriving (p, q) from (n, d) is buried at the end of §1.1, coded in maths jargon, and left as an exercise for the reader to render out of his (rather terse) proof that it's efficient to do so.
Let 〈N, e〉 be an RSA public key. Given the private key d, one can efficiently factor the modulus N = pq.
Proof. Compute k = de − 1. By definition of d and e we know that k is a multiple of φ(N). Since φ(N) is even, k = 2^t · r with r odd and t ≥ 1. We have g^k = 1 for every g ∈ ℤ_N^×, and therefore g^(k/2) is a square root of unity modulo N. By the Chinese Remainder Theorem, 1 has four square roots modulo N = pq. Two of these square roots are ±1. The other two are ±x where x satisfies x = 1 mod p and x = −1 mod q. Using either one of these last two square roots, the factorization of N is revealed by computing gcd(x − 1, N). A straightforward argument shows that if g is chosen at random from ℤ_N^× then with probability at least 1/2 (over the choice of g) one of the elements in the sequence g^(k/2), g^(k/4), …, g^(k/2^t) mod N is a square root of unity that reveals the factorization of N. All elements in the sequence can be efficiently computed in time O(n^3) where n = log_2(N).
Obviously, this is pretty close to meaningless for anyone who doesn't know what $Z_N^\ast$ is, and has a pretty nonlinear structure that takes a good deal of time to twist into a linear algorithm.
So here is the worked solution:
from random import randrange
from math import gcd

def ned_to_pqe(secret_key):
    """
    https://crypto.stanford.edu/~dabo/papers/RSA-survey.pdf#:~:text=Given%20d%2C,reveals%20the%20factorization%20of%20N%2E
    """
    n, e, d = secret_key
    k = d * e - 1
    t = bit_scan1(k)
    trivial_sqrt1 = {1, n - 1}
    while True:
        g = randrange(2, n - 1)
        for j in range(1, t + 1):
            x = pow(g, k >> j, n)
            if pow(x, 2, n) == 1:
                if x in trivial_sqrt1: continue
                p = gcd(x - 1, n)
                q = n // p
                if q > p: p, q = q, p
                return p, q, e

def pqe_to_ned(secret_key):
    p, q, e = secret_key
    n = p * q
    l = (p - 1) * (q - 1)
    d = pow(e, -1, l)
    return n, e, d

def bit_scan1(i):
    """
    https://gmpy2.readthedocs.io/en/latest/mpz.html#mpz.bit_scan1
    """
    # https://stackoverflow.com/a/63552117/1874170
    return (i & -i).bit_length() - 1

def test():
    secret_key = (
        # https://en.wikipedia.org/wiki/RSA_numbers#RSA-100
        # Should take upwards of an hour to factor on a consumer desktop ca. 2022
        1522605027922533360535618378132637429718068114961380688657908494580122963258952897654000350692006139,
        65537,
        1435319569480661473883310243084583371347212233430112391255270984679722445287591616684593449660400673
    )
    if secret_key != pqe_to_ned(ned_to_pqe(secret_key)):
        raise ValueError

if __name__ == '__main__':
    test()
    print("Self-test OK")
Live demo (JS):
function ned_to_pqe({n, e, d}) {
  // https://crypto.stanford.edu/~dabo/papers/RSA-survey.pdf#:~:text=Given%20d%2C,reveals%20the%20factorization%20of%20N%2E
  let k = d * e - 1n;
  let t = scan1(k);
  let trivial_sqrt1 = new Set([1n, n - 1n]);
  while (true) {
    let g = insecure_randrange(2n, n - 1n);
    for ( let j = t ; j > 0 ; --j ) {
      let x = bn_powMod(g, k >> j, n);
      if (bn_powMod(x, 2n, n) === 1n) {
        if (trivial_sqrt1.has(x)) continue;
        let p = gcd(x - 1n, n), q = n/p;
        if (q > p) [p, q] = [q, p];
        return {p, q, e};
      }
    }
  }
}

function pqe_to_ned({p, q, e}) {
  let n = p * q;
  let l = (p - 1n) * (q - 1n);
  let d = bn_modInv(e, l);
  return {n, e, d};
}
function bn_powMod(x, e, m) {
  // h/t https://umaranis.com/2018/07/12/calculate-modular-exponentiation-powermod-in-javascript-ap-n/
  if (m === 1n) return 0n;
  let y = 1n;
  x = x % m;
  while (e > 0n) {
    if (e % 2n === 1n) //odd number
      y = (y * x) % m;
    e = e >> 1n; //divide by 2
    x = (x * x) % m;
  }
  return y;
}

function bn_modInv(x, m) {
  // TOY IMPLEMENTATION
  // DO NOT USE IN GENERAL-PURPOSE CODE
  // h/t https://rosettacode.org/wiki/Modular_inverse#C
  let m0 = m, t, q;
  let x0 = 0n, y = 1n;
  if (m === 1n) return 1n;
  while (x > 1n) {
    q = x / m;
    t = m;
    m = x % m;
    x = t;
    t = x0;
    x0 = y - q * x0;
    y = t;
  }
  if (y < 0n) y += m0;
  return y;
}
function gcd(a, b) {
  // h/t https://stackoverflow.com/a/17445304/1874170
  while (b) {
    [a, b] = [b, a % b];
  }
  return a;
}

function scan1(i) {
  // https://gmplib.org/manual/Integer-Logic-and-Bit-Fiddling#mpz_scan1
  let k = 0n;
  if ( i !== 0n ) {
    while( (i & 1n) === 0n ) {
      i >>= 1n;
      k += 1n;
    }
  }
  return k;
}

function insecure_randrange(a, b) {
  // h/t https://arxiv.org/abs/1304.1916
  let numerator = 0n;
  let denominator = 1n;
  let n = (b - a);
  while (true) {
    numerator <<= 1n;
    denominator <<= 1n;
    numerator |= BigInt(Math.random()>1/2);
    if (denominator >= n) {
      if (numerator < n)
        return a + numerator;
      numerator -= n;
      denominator -= n;
    }
  }
}
<form action="javascript:" onsubmit="(({target:form,submitter:{value:action}})=>{eval(action)(form)})(event)">
<p>
<label for="p">p=</label><input name="p" value="37975227936943673922808872755445627854565536638199" /><br />
<label for="q">q=</label><input name="q" value="40094690950920881030683735292761468389214899724061" /><br />
<label for="n">n=</label><input name="n" /><br />
<label for="e">e=</label><input name="e" placeholder="65537" /><br />
<label for="d">d=</label><input name="d" /><br />
</p>
<p>
<button type="submit" value="pqe2nd">Get (n,d) from (p,q,e)</button><br />
<button type="submit" value="delpq">Forget (p,q)</button><br />
<button type="submit" value="ned2pq">Get (p,q) from (n,e,d)</button>
</p>
</form>
<script>
function pqe2nd({elements}) {
  if (!elements['e'].value) elements['e'].value = elements['e'].placeholder;
  let p = BigInt(elements['p'].value||undefined);
  let q = BigInt(elements['q'].value||undefined);
  let e = BigInt(elements['e'].value||undefined);
  let {n, d} = pqe_to_ned({p,q,e});
  elements['n'].value = n.toString();
  elements['d'].value = d.toString();
}
function ned2pq({elements}) {
  if (!elements['e'].value) elements['e'].value = elements['e'].placeholder;
  let n = BigInt(elements['n'].value||undefined);
  let e = BigInt(elements['e'].value||undefined);
  let d = BigInt(elements['d'].value||undefined);
  let {p, q} = ned_to_pqe({n,e,d});
  elements['p'].value = p.toString();
  elements['q'].value = q.toString();
}
function delpq({elements}) {
  elements['p'].value = null;
  elements['q'].value = null;
}
</script>
To answer the question as-stated in the title: factoring N entails finding N. But you cannot, in the general case, derive N from (e, d). Therefore, you cannot, in the general case, derive the factors of N from (e, d); QED.
Finding n from (e, d) is computationally feasible with fair probability, or even certainty, for a small but observable fraction of RSA keys of practical interest.
If you want to try to do so anyway, you'll need to be able to factorize e * d - 1 (if I understand the above-linked answer correctly):
from functools import reduce
from itertools import permutations
from operator import mul

def prod(l):
    return reduce(mul, l)

def ed_to_pq(e, d):
    # NOT ALWAYS POSSIBLE -- the number e*d-1 must be small enough to factorize
    # h/t https://crypto.stackexchange.com/a/81620/8287
    # NOTE: `factorize` is not defined in this snippet; it must be supplied
    # and is expected to return a list of the prime factors of its argument
    factors = factorize(e * d - 1)
    factors.sort()
    # Unimplemented optimization:
    # if two factors are larger than (p * q).bit_length()//4
    # and the greater of (p, q) is not many times bigger than the lesser,
    # then you can safely assume that the large factors belong to (p-1) and (q-1)
    # and thereby reduce the number of iterations in the following loops
    # Unimplemented optimization:
    # permutations are overkill for this partitioning scheme;
    # a clever mathematician could come up with something more efficient
    # Unimplemented optimization:
    # prune permutations based on "sanity" factor of logarithm knapsacking
    l = len(factors)
    for arrangement in permutations(factors):
        for l_pm1 in range(1, l - 1):
            for l_qm1 in range(1, l_pm1):
                pm1 = prod(arrangement[:l_pm1])
                qm1 = prod(arrangement[l_pm1:l_pm1+l_qm1])
                try:
                    if pow(e, -1, pm1 * qm1) == d:
                        return (pm1 + 1, qm1 + 1)
                except Exception:
                    pass
Can anyone explain to me in detail how this log2 function works:
inline float fast_log2 (float val)
{
    int * const exp_ptr = reinterpret_cast <int *> (&val);
    int x = *exp_ptr;
    const int log_2 = ((x >> 23) & 255) - 128;
    x &= ~(255 << 23);
    x += 127 << 23;
    *exp_ptr = x;

    val = ((-1.0f/3) * val + 2) * val - 2.0f/3; // (1)

    return (val + log_2);
}
IEEE floats internally have an exponent E and a mantissa M, each represented as binary integers. The actual value is basically
2^E * M
Basic logarithmic math says:
log2(2^E * M)
= log2(2^E) + log2(M)
= E + log2(M)
The first part of your code separates E and M. The line commented (1) computes log2(M) by using a polynomial approximation. The final line adds E and the result of the approximation.
It's an approximation. It first reads the exponent field directly, which gives the integer part of log2 (trivial to do), then uses an approximation formula for log2 of the mantissa. It then adds these two log2 components to give the final result.
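The same steps can be sketched in Python, using `struct` to reinterpret the bits of a 32-bit float. This is an illustrative port under the assumption that the input is a positive, normal float; it is not a replacement for the C++ function. One detail worth noting: the polynomial evaluates to 1 at M = 1 and to 2 at M = 2, so it approximates log2(M) + 1, which is why the code subtracts 128 rather than the exponent bias 127.

```python
import math
import struct


def fast_log2(val: float) -> float:
    """Approximate log2 of a positive, normal float via its IEEE 754 bits."""
    # Reinterpret the 32-bit float as an unsigned integer.
    (x,) = struct.unpack("<I", struct.pack("<f", val))

    # Biased exponent minus 128 (one more than the bias, see the polynomial).
    log_2 = ((x >> 23) & 255) - 128

    # Replace the exponent field with 127 so the float becomes M in [1, 2).
    x &= ~(255 << 23) & 0xFFFFFFFF
    x += 127 << 23
    (m,) = struct.unpack("<f", struct.pack("<I", x))

    # Quadratic approximation of log2(M) + 1 on [1, 2).
    m = ((-1.0 / 3) * m + 2) * m - 2.0 / 3
    return m + log_2


for v in (0.1, 1.0, 3.0, 8.0, 1000.0):
    print(v, fast_log2(v), math.log2(v))
```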
On my midterm I had the problem:
T(n) = 8T(n/2) + n^3
and I am supposed to find its big theta notation using either the Master theorem or an alternative method. So what I did was
a = 8, b = 2, k = 3
log_2(8) = 3 = k
therefore, T(n) is big theta n^3. I got 1/3 points so I must be wrong. What did I do wrong?
T(n) = aT(n/b) + f(n)
You applied the version when f(n) = O(n^(log_b(a) - e)) for some e > 0.
This is important, you need this to be true for some e > 0.
For f(n) = n^3, b = 2 and a = 8,
n^3 = O(n^(3-e)) is not true for any e > 0.
So you picked the wrong version of the Master theorem.
You need to apply a different version of Master theorem:
if f(n) = Theta ((log n)^k * n^log_b(a)) for some k >= 0,
then
T(n) = Theta((log n)^(k+1) * n^log_b(a))
In your problem, you can apply this case, and that gives T(n) = Theta(n^3 log n).
An alternative way to solve your problem would be:
T(n) = 8 T(n/2) + n^3.
Let g(n) = T(n)/n^3.
Then
n^3 *g(n) = 8 * (n/2)^3 * g(n/2)+ n^3
i.e. g(n) = g(n/2) + 1.
This implies g(n) = Theta(log n) and so T(n) = Theta(n^3 log n).
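A quick numerical check of this result, as a sketch: the base case T(1) = 1 is an assumption (the problem statement gives none), and n is restricted to powers of two so that n/2 is exact. With that base case the recurrence unrolls to T(2^k) = 8^k * (k + 1), so T(n)/n^3 keeps growing while T(n)/(n^3 log2 n) settles towards a constant.

```python
from functools import lru_cache
from math import log2


@lru_cache(maxsize=None)
def T(n: int) -> int:
    """T(n) = 8 T(n/2) + n^3, with the assumed base case T(1) = 1."""
    if n <= 1:
        return 1
    return 8 * T(n // 2) + n ** 3


for k in (4, 8, 16, 32):
    n = 2 ** k
    # First ratio grows like log2(n); second ratio approaches a constant.
    print(k, T(n) / n ** 3, T(n) / (n ** 3 * log2(n)))
```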