I am learning about the RSA algorithm. I perform the algorithm on very small prime numbers and use online Big Integer calculators to perform the encryption and decryption and everything works just fine.
My question is about the size of the exponent we create; when it comes to bigger numbers, the calculation seems infeasible.
For example, the algorithm starts with picking two prime numbers p and q. You compute n = p x q and then the totient of n. Next you pick a number 'e' such that 1 < e < totient(n) and e is coprime with totient(n).
Then to perform an encryption you take, say, the ASCII character 'A', which is 65, and you raise it to the power of e (65^e).
The online big integer calculator started getting very slow and sluggish (over a minute to calculate) once e was bigger than about 100,000 (6 digits).
My question is then, for the working RSA algorithm, what size (number of digits) number does that algorithm pick?
One thought I had was that the online calculator I was using might not be using the best method for exponentiation. This is the calculator I am using: http://www.javascripter.net/math/calculators/100digitbigintcalculator.htm
Let's say M is the modulus. So YES, you could first compute intermediate = 65^e, and then compute intermediate mod M. And of course, intermediate would be a very, very big integer (if e equals 65537, the decimal representation of intermediate contains 118813 digits!).
BUT, thanks to a very basic modular arithmetic theorem,
(65^e) mod M = ((((65 mod M) * 65) mod M) * 65) mod M [...] (e times)
(the theorem states that in a quotient ring, the n-th power of the class of an element is the class of the n-th power of the element)
As you can see, this does not need any very big integer library, since after each arithmetic product, you use mod M that returns an integer between 0 and M-1. So, you only have to compute arithmetic products of integers less than M.
As an example, here is a simple shell script (bash) that computes 65^65537 mod 991*997. As you can see, no need to get a big number library:
#!/bin/bash

# set RSA parameters
m=65             # message to encode
M=$((991*997))   # modulus (both 991 and 997 are prime numbers)
e=65537          # public exponent (coprime with 990*996, thus compliant with RSA)

# compute (m^e) mod M by multiplying by m and reducing mod M, e times
ret=1
for ((i = 1; i <= e; i++))
do
    ret=$(( (ret * m) % M ))
done

# display the result
echo $ret
It returns 784933 almost immediately, thus 65^65537 mod (991*997) = 784933.
The biggest integer computed with your method has 118813 digits, but the biggest integer handled by this shell script has at most 12 digits ((M-1)^2 is made of 12 digits).
Given these explanations, we can now answer your question:
My question is then, for the working RSA algorithm, what size (number of digits) number does that algorithm pick?
With the above explanations, you can see that the maximum number of digits in the decimal representation of integers you have to manipulate is 1+log10((M-1)^2), because you will, at most, compute a product of two integers between 0 and M-1.
Note that 1 + log10((M-1)^2) = 1 + 2*log10(M-1) < 2 + 2*log10(M) = 2*(1 + log10(M)). Also note that 1 + log10(M) is (roughly) the number of digits of M.
Therefore, as a conclusion, this shows that the number of digits your library has to handle correctly is at most twice the number of digits of the modulus (if you compute the exponentiation with integer multiplications the way explained here).
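For reference, here is the same computation as a short Python sketch (Python is used here purely for illustration; it is not part of the shell example above). Python's built-in three-argument pow(base, exp, mod) performs the modular exponentiation directly.

# same parameters as the shell script above
m = 65            # message to encode
M = 991 * 997     # modulus
e = 65537         # public exponent

# repeated multiplication, reducing mod M after every product
ret = 1
for _ in range(e):
    ret = (ret * m) % M
print(ret)            # 784933, matching the shell script

# Python's built-in pow(base, exp, mod) computes the same value efficiently
print(pow(m, e, M))   # 784933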
I have a pretty easy question (I think). As much as I've tried, I cannot find an answer to it.
I am creating a function for which I want the user to enter two numbers. The first is the number of terms of a certain infinite series to add together. The second is the number of digits the user would like the truncated sum to be accurate to.
Say the terms of the series are a_i. How much precision n would be required in mpfr to ensure that the sum of the a_i, from i = 0 up to the number of terms the user entered, is accurate to the number of digits the user requested?
By the way, I'm adding the a_i in a naive way.
Any help will be much appreciated.
Thanks,
Rick
You can convert between decimal digits of precision, d, and binary digits of precision, b, with logarithms
b = d × log(10) / log(2)
A little rearranging shows why
b × log(2) = d × log(10)
log(2^b) = log(10^d)
2^b = 10^d
Each term of the series (and each addition) will introduce a rounding error at the least significant digit so, assuming each of the t terms involves n (two argument) arithmetic operations, you will want to add an extra
log(t * (n+2))/log(2)
bits.
You'll need to round the number of bits of precision up to be sure that you have enough room for your decimal digits of precision
b = ceil((d*log(10.0) + log(t*(n+2)))/log(2.0));
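As a quick usage sketch (the values of d, t and n below are made-up example inputs, not anything from the question; Python is used just for illustration):

from math import ceil, log

d = 50    # decimal digits of accuracy requested (example value)
t = 1000  # number of terms to be summed (example value)
n = 2     # two-argument arithmetic operations per term (example value)

b = int(ceil((d * log(10.0) + log(t * (n + 2))) / log(2.0)))
print(b)  # 179 bits of working precision for this example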
Finally, you should be aware that the terms may introduce cancellation errors, in which case this simple calculation will dramatically underestimate the required number of bits, even assuming I've got it right in the first place ;-)
I could just use division and modulus in a loop, but this is slow for really large integers. The number is stored in base two, and may be as large as 2^8192. I only need to know if it is a power of ten, so I figure there may be a shortcut (other than using a lookup table).
If your number x is a power of ten then
x = 10^y
for some integer y, which means that
x = (2^y)(5^y)
So, shift the integer right until there are no more trailing zero bits (this should be a very cheap operation) and count the number of shifts (call this k). Now check whether the remaining number is 5^k. If it is, then your original number is a power of 10. Otherwise, it's not. Since 2 and 5 are both prime, this will always work.
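A small sketch of that check in Python (Python's arbitrary-precision integers handle values of the size you mention; this is just an illustration, not a tuned implementation):

def is_power_of_ten(x):
    # strip factors of two, counting the shifts
    if x <= 0:
        return False
    k = 0
    while x & 1 == 0:
        x >>= 1
        k += 1
    # what remains must be exactly 5^k
    return x == 5 ** k

print(is_power_of_ten(10 ** 100))   # True
print(is_power_of_ten(2 ** 100))    # False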
Let's say that X is your input value, and we start with the assumption
X = 10 ^ Something
where Something is an integer.
Taking logarithms, we get the following:
log10(X) = Something.
So if X is a power of 10, then Something will be an integer.
Example
int x = 10000;
double test = Math.log10(x);
if (test == ((int) test))
    System.out.println("Is a power of 10");
EDIT
So it seems I "underestimated" what varying length numbers meant. I didn't even think about situations where the operands are 100 digits long. In that case, my proposed algorithm is definitely not efficient. I'd probably need an implementation whose complexity depends on the number of digits in each operand, as opposed to their numerical values, right?
As suggested below, I will look into the Karatsuba algorithm...
Write the pseudocode of an algorithm that takes in two arbitrary length numbers (provided as strings), and computes the product of these numbers. Use an efficient procedure for multiplication of large numbers of arbitrary length. Analyze the efficiency of your algorithm.
I decided to take the (semi) easy way out and use the Russian Peasant Algorithm. It works like this:
a * b = a/2 * 2b if a is even
a * b = (a-1)/2 * 2b + b if a is odd
My pseudocode is:
rpa(x, y){
    if x is 1
        return y
    if x is even
        return rpa(x/2, 2y)
    if x is odd
        return rpa((x-1)/2, 2y) + y
}
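(For reference, here is a direct, runnable translation of that pseudocode into Python; this is just to check the logic, not an answer to the string-input part of the assignment.)

def rpa(x, y):
    # Russian Peasant multiplication of x and y, assuming x >= 1
    if x == 1:
        return y
    if x % 2 == 0:
        return rpa(x // 2, 2 * y)
    return rpa((x - 1) // 2, 2 * y) + y

print(rpa(17, 23))   # 391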
I have 3 questions:
Is this efficient for arbitrary length numbers? I implemented it in C and tried varying length numbers. The run-time was near-instant in all cases, so it's hard to tell empirically...
Can I apply the Master Theorem to understand the complexity...?
a = # subproblems per recursion = 1 (at most 1 recursive call in every case)
n / b = size of each subproblem = n / 2 -> b = 2 (x is halved at each call)
f(n) = work done outside the recursive calls = O(1) -> d = 0 (the addition when x is odd)
a = 1, b^d = 1, a = b^d -> complexity is O(n^d * log(n)) = O(log(n))
this makes sense logically since we are halving the problem at each step, right?
What might my professor mean by providing arbitrary length numbers "as strings"? Why do that?
Many thanks in advance
What might my professor mean by providing arbitrary length numbers "as strings"? Why do that?
This actually changes everything about the problem (and makes your algorithm incorrect).
It means that 1234 is provided as 1,2,3,4 and you cannot operate directly on the whole number. You need to analyze your algorithm in terms of the number of additions, multiplications, and divisions.
You should expect a division to be a bit more expensive than a multiplication, and a multiplication to be a lot more expensive than an addition. So a good algorithm tries to reduce the number of divisions and multiplications.
Check out the Karatsuba algorithm (but don't just copy it, that's not what your teacher wants); it is one of the fastest for this kind of problem.
Regarding 3): Native integers are limited in how large (or small) numbers they can represent (32- or 64-bit integers, for example). To represent arbitrary length numbers you can choose strings, because then you are not really limited by this. The problem is then, of course, that your arithmetic units are not really made to add strings ;-)
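To make the "numbers as strings" point concrete, here is a tiny sketch (Python, purely illustrative) of schoolbook addition performed digit by digit on strings; the multiplication the assignment asks for is built out of operations like this one:

def add_strings(a, b):
    # schoolbook addition of two non-negative integers given as digit strings
    a, b = a[::-1], b[::-1]
    digits, carry = [], 0
    for i in range(max(len(a), len(b))):
        da = int(a[i]) if i < len(a) else 0
        db = int(b[i]) if i < len(b) else 0
        carry, d = divmod(da + db + carry, 10)
        digits.append(str(d))
    if carry:
        digits.append(str(carry))
    return ''.join(reversed(digits))

print(add_strings("1234", "987"))   # 2221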
Let's say I have guessed a lottery number of:
1689
And the way the lottery works is, the order of the digits doesn't matter as long as the digits match up 1:1 with the digits in the actual winning lottery number.
So, the number 1689 would be a winning lottery number with:
1896, 1698, 9816, etc.
As long as each digit in your guess was present in the target number, then you win the lottery.
Is there a mathematical way I can do this?
I've solved this problem with an O(N^2) loop, checking each digit of the guess against each digit of the winning lottery number (separating the digits with modulus). That's fine, it works, but I want to know if there are any neat math tricks I can use.
For example, at first... I thought I could be tricky and just take the sum and product of each digit in both numbers and if they matched then you won.
^ Do you think that would work?
However, I quickly disproved this when I found that the lottery guesses 222 and 124 have different digits but the same product and sum.
Anyone know any math tricks I can use to quickly determine if the digits in num1 match the digits in num2 regardless of order?
How about going through each number and counting up the number of appearances of each digit (into two different 10-element arrays)? After you've done the totaling, compare the counts of each digit. Since you only look at each digit once, that's O(N).
Code would look something like:
for(int i = 0; i < digit_count; i++)
{
    guessCounts[guessDigits[i] - '0']++;
    actualCounts[actualDigits[i] - '0']++;
}

bool winner = true;
for(int i = 0; i < 10 && winner; i++)
{
    winner &= guessCounts[i] == actualCounts[i];
}
Above code makes the assumption that guessDigits and actualDigits are both char strings; if they held the actual digits then you can just skip the - '0' business.
There are probably optimizations that would make this take less space or terminate sooner, but it's a pretty straightforward example of an O(N) approach.
By the way, as I mentioned in a comment, the multiplication/sum comparison will definitely not work because of zeros. Consider 0123 and 0222. Product is 0, sum is 6 in both cases.
Split into array, sort array, join into string, compare strings.
(Not a math trick, I know)
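In Python, just to show how short that is (an illustrative sketch, not from the original answer):

def same_digits(a, b):
    # sort the digit characters of both numbers and compare
    return sorted(str(a)) == sorted(str(b))

print(same_digits(1689, 9816))   # True
print(same_digits(1689, 9916))   # False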
You can place the digits into an array, sort the array, then compare the arrays element by element. This will give you O(N log N) complexity, which is better than O(N^2).
If N can become large, sorting the digits is the answer.
Because digits are 0..9 you can count the number of occurrences of each digit of the lottery answer in an array [0..9].
To compare you can subtract 1 for each digit you encounter in the guess. When you encounter a digit where the count is already 0, you know the guess is different. When you get through all the digits, the guess is the same (as long as the guess has as many digits as the lottery answer).
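A short sketch of that count-and-subtract idea (Python, for illustration only):

def is_winner(guess, answer):
    counts = [0] * 10
    for ch in str(answer):
        counts[int(ch)] += 1          # count digits of the winning number
    for ch in str(guess):
        if counts[int(ch)] == 0:      # digit not available: the guess differs
            return False
        counts[int(ch)] -= 1
    return len(str(guess)) == len(str(answer))

print(is_winner(9816, 1689))   # True
print(is_winner(9916, 1689))   # False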
For each digit d multiply with the (d+1)-th prime number.
This is more mathematical but less efficient than the sorting or bucket methods. Actually it is the bucket method in disguise.
I'd sort both number's digits and compare them.
If you are only dealing with 4 digits I don't think you have to put much thought into which algorithm you use. They will all perform roughly the same.
Also, 222 and 124 don't have the same sum.
You have to consider that when n is small, the order of efficiency is irrelevant, and the constants start to matter more. How big can your numbers actually get? Can you get up to 10 digits? 20? 100? If your numbers have just a few digits, n^2 really isn't that bad. If you have strings of thousands of digits, then you might actually need to do something more clever like sorting or bucketing. (i.e. count the 0s, count the 1s, etc.)
I'm stealing the answer from Yuliy, and starblue (upvote them)
Bucketing is the fastest aside from the O(1)
lottonumbers == mynumbers;
Sorting is O(n log2 n).
Bucket sort is an O(n) algorithm.
So all you need to do is do it twice (once for your numbers, once for the target set), and if the digit counts match up, then the numbers match.
Any kind of sorting is an added overhead that is unnecessary in this case.
int digits[10] = {0};

while(targetnum > 0)
{
    short currDig = targetnum % 10;
    digits[currDig]++;            // count each digit of the winning number
    targetnum = targetnum / 10;
}

while(mynum > 0)
{
    short myDig = mynum % 10;
    digits[myDig]--;              // subtract each digit of the guess
    mynum = mynum / 10;
}

bool winner = true;
for(int i = 0; i < 10; i++)
{
    if(digits[i] != 0)
        winner = false;           // counts didn't cancel out: not a match
}
Not the prettiest code, I'll admit.
Create an array of 10 integers subscripted [0 .. 9].
Initialize each element to a different prime number
Set product to 1.
Use each digit from the number to subscript into the array,
pull out the prime number, and multiply the product by it.
That gives you a unique representation which is digit order independent.
Do the same procedure for the other number.
If the unique representations match, then the original numbers match.
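A sketch of that prime-product fingerprint (Python, purely illustrative); by unique factorization, two products are equal exactly when the digit multisets are equal:

PRIMES = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]   # one prime per digit 0..9

def fingerprint(number):
    product = 1
    for ch in str(number):
        product *= PRIMES[int(ch)]
    return product

print(fingerprint(1689) == fingerprint(9816))   # True
print(fingerprint(222) == fingerprint(124))     # False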
If there are no repeating digits allowed (not sure if this is the case though) then use a 10-bit binary number. The most significant bit represents the digit 9 and the LSB represents the digit 0. Work through each number in turn and set the appropriate bit for each digit that you find.
So 1689 would be: 1101000010
and 9816 would also be: 1101000010
then an XOR or a subtraction will leave 0 if you are a winner
This is just a simple form of bucketing
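A minimal sketch of the bitmask idea (Python; only valid when digits within a number don't repeat):

def digit_mask(number):
    mask = 0
    for ch in str(number):
        mask |= 1 << int(ch)     # set the bit for this digit
    return mask

print(bin(digit_mask(1689)))                       # 0b1101000010
print(digit_mask(1689) ^ digit_mask(9816) == 0)    # True: a winner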
Just for fun, and thinking outside the box, instead of sorting and other approaches, do the deletion thing. If the resulting string is empty, you have a winner!
Dim ticket As String = "1324"
Dim WinningNumber As String = "4321"

For Each s As String In WinningNumber.ToCharArray
    ticket = Replace(ticket, s, "", 1, 1)   ' remove one matching digit, if present
Next

If ticket = "" Then MsgBox("WINNER!")
If ticket.Length = 1 Then MsgBox("Close but no cigar!")
This works with repeating digits too.
Sort digits before storing a number. After that, your numbers will be equal.
One cute solution is to use a variant of Zobrist hashing. (Yes, I know it's overkill, as well as probabilistic, but hey, it's "clever".)
Initialize a ten-element array a[0..9] to random integers. Then, for each number d[], compute the sum of a[d[i]]. If the numbers contain the same digits, the resulting sums will be equal; with high probability (the chance of a spurious match is roughly 1 in the number of possible ints), the converse is true as well.
(If you know that there will be at most 10 digits total, then you can use the fixed numbers 1, 10, 100, ... instead of random numbers for guaranteed success. This is bucket sorting in not-too-much disguise.)
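A rough sketch of that hashing idea (Python, illustrative only):

import random

a = [random.getrandbits(64) for _ in range(10)]   # one random value per digit

def digit_hash(number):
    return sum(a[int(ch)] for ch in str(number))

# equal digit multisets always hash equal; different ones collide only rarely
print(digit_hash(1689) == digit_hash(9816))   # True
print(digit_hash(222) == digit_hash(124))     # almost certainly False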