Basic encryption/RSA prime number concept? [beginner] [closed] - encryption

So I understand that for RSA encryption, you multiply two large primes to get a very large composite number, which is your public key. The two large primes are your private key, so you want to keep both of those primes secret, and you can do so because it is infeasible for a computer to guess them: you would have to increment your guess by 1 each time, and each prime is so large that this takes far too long.
However, my question is: why won't composite numbers work just as well for this? If one of the large numbers in your private key were a large composite, wouldn't you still have to guess it by incrementing by 2, 3, 5, 7, 11, etc.? I know incrementing like this wouldn't take as long as incrementing by 1, but it would still take a long time because you don't know whether to increment by 2, 3, 5, 7, etc.
Although if incrementing by 2 doesn't work, you can be sure private key < public key / 2; if incrementing by 3 doesn't work, you can be sure private key < public key / 3, and so on.
But anyway, can someone tell me why composite numbers aren't as good for encryption? How would you guess a large composite?

The use of prime numbers isn't about the difficulty of factoring - the most efficient method for factoring large composite numbers like the modulus of an RSA key is the General Number Field Sieve, which is much more efficient than simply trying to divide by successive numbers.
In fact, your statement that it doesn't work with composite numbers isn't really true: whilst RSA is generally performed using two primes, it still works with more than two distinct primes, and since every composite number is a product of primes, you could choose two composite numbers (provided they are each a product of distinct primes, and the two have no common factors) and the algorithm will still work.
For example, if we choose two composite numbers with distinct prime factors and no common factors:
a = 15 (= 3 * 5)
b = 77 (= 7 * 11)
This is equivalent to 4-prime RSA:
p = 3
q = 5
r = 7
s = 11
So far, so good however you go on to say that the "primes are your private key", which isn't really true either - the private key is actually the combination of the modulus n = p * q (which is also part of the public key), and the private exponent, which is the number d such that e * d = 1 mod phi(n) where e is the public exponent (which is often just 65537), and phi is Euler's totient function. Once d is calculated, the original primes can be discarded.
The problem is that calculating phi(n) requires knowing the prime factors of n. In ordinary RSA where n = p * q we already know the prime factors of n, but if we instead chose two composite numbers (like the a and b above) and use n = a * b we need to factor these numbers in order to calculate phi(n). Since the security of RSA itself relies on the difficulty of factoring large composite numbers, it is infeasible to calculate the private exponent if we haven't constructed the modulus directly from its prime factors, which is the real reason that you have to use primes rather than composite numbers.
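To make that concrete, here is a minimal sketch (toy numbers; the variable names and the choice e = 7 are mine) of RSA built on the composites a = 15 and b = 77 above. It only works because we already know the primes hiding inside a and b, which is exactly the catch described in the previous paragraph:
from math import gcd

p, q, r, s = 3, 5, 7, 11                       # the primes inside a = 15 and b = 77
a, b = p * q, r * s
n = a * b                                      # 1155, the modulus
phi = (p - 1) * (q - 1) * (r - 1) * (s - 1)    # 480; needs the prime factorization

e = 7                                          # toy public exponent, coprime to phi
assert gcd(e, phi) == 1
d = pow(e, -1, phi)                            # private exponent (Python 3.8+)

m = 42                                         # toy message, less than n
c = pow(m, e, n)                               # encrypt
assert pow(c, d, n) == m                       # decrypt recovers the message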

Because, like you said, with a prime number you have to increment by one. So if I have the number 13, it would take me 13 steps to get there, since it's prime and I'd be incrementing by one. But if I have the number 14, it would only take me 7 steps, because I could go by twos. And the larger you scale the numbers, the bigger the difference becomes. I'll admit you don't know what you need to increment by, but even incrementing by 2 is still faster than by one.

Related

In RSA encryption algorithm, Can we find P ,Q and totient of N if we have N value?

N is p*q, while totient(N) is the product (p-1)(q-1), and (p-1), (q-1) will not be prime after subtracting 1 from them. For example, take N = 51. 51 = p*q, while totient(N) = pq - p - q + 1, so totient(N) = 51 - p - q + 1. What should I do after this? How do I get p, q from the value of N (RSA)?
The only analytic (non-implementation) way of obtaining p,q from n is to factor n. For a toy value like 51, this is easy; just try possible values of p until you find p=3 q=17 (or swap to p=17 q=3 if you like). For the sizes of n used in practice -- until a few years ago usually 1024 bits which is about 308 decimal digits, now at least 2048 bits (616 digits) and sometimes more -- there is no known way to factor in less than thousands of years, and that is why RSA is considered generically secure, because knowing p,q enables you to trivially recover the private exponent and decrypt and/or forge data.
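As a sketch of what "just try possible values of p" looks like for the toy value (the function name and the choice e = 5 here are mine):
def factor_small(n):
    p = 2
    while p * p <= n:
        if n % p == 0:
            return p, n // p
        p += 1
    raise ValueError("n is prime")

n, e = 51, 5                  # toy modulus and a valid public exponent for it
p, q = factor_small(n)        # (3, 17)
phi = (p - 1) * (q - 1)       # 32
d = pow(e, -1, phi)           # 13, since 5 * 13 = 65 = 2 * 32 + 1
print(p, q, d)                # knowing p, q makes recovering d trivial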
Particular implementations sometimes choose the RSA primes badly when generating keys, or leak information about them through side channels, or leak information about the resulting private exponent (d). These attacks are specific to an implementation, and depend on a lot of details and a much higher level of knowledge than exhibited in your question. Knowing d plus the public key allows you to compute p,q; this has been asked and answered many times on other stacks which I'll dig up later. Note that your question is really about the mathematics behind RSA not any program code or language(s), although if there were a method it could well be embodied in code, so it is less suitable here and would be more suitable (but trivial or duplicate) on https://crypto.stackexchange.com or https://security.stackexchange.com .
It is believed that if and when research into quantum computers is successful they will be vastly more efficient at factoring, enough to make RSA insecure. If and when this happens you can expect to see it on every news channel and site in the world.

What does it mean when a key is of a specific length in bits?

I'm learning some basic cryptography related programming and I'm now learning diffie-hellman-merkle key exchange. I was watching a video by Computerphile where they explain the mathematics of it.
In the video, they said that you should use an n that is 2000 or 4000 bits long. I've seen key lengths discussed in bits in several other places, like with AES. But I don't understand what "length in bits" means. What does it mean to have a key that is 2000 bits long, and if I need to write a program that creates or uses keys of a certain length, what would I need to do?
If you need to write a program that creates keys of a certain length, you pass the desired key length to the program. You need to know in which unit the length is expressed (for example, some interfaces might require bytes rather than bits), but you usually don't need to know what the key length means under the hood.
The concrete meaning of the key length depends on the cryptographic scheme, and for some schemes, there can be an ambiguity as to what the “key length” is. It's typically one of three things:
The length of a string that is associated with the algorithm, such as a key.
A number n such that an integer parameter of the algorithm is picked between 2^(n-1) and (2^n)-1.
A number n such that an integer parameter of the algorithm is picked between 1 (or some other small bound) and (2^n)-1.
Both of the last two cases are called “n-bit numbers”. In cryptography, “n-bit number” sometimes means a number that can be written with n digits in base 2, and sometimes a number which requires exactly n digits in base 2. In other words, “n-bit number” sometimes means a number whose bit-size is exactly n, and sometimes a number whose bit-size is at most n. You have to check the exact requirement in the description of each cryptographic scheme.
Depending on the cryptographic scheme, a different number is conventionally chosen as the “key length”. For any specific scheme, a larger key length is harder to break, but you can't compare key lengths between different schemes.
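As a small illustration of the "at most n bits" versus "exactly n bits" distinction (the function names are mine; this uses Python's secrets module):
import secrets

def random_at_most_n_bits(n):
    # uniform in [0, 2^n - 1]: bit-size is *at most* n
    return secrets.randbits(n)

def random_exactly_n_bits(n):
    # uniform in [2^(n-1), 2^n - 1]: the top bit is forced to 1
    return (1 << (n - 1)) | secrets.randbits(n - 1)

print(random_exactly_n_bits(2048).bit_length())   # always 2048
print(random_at_most_n_bits(2048).bit_length())   # 2048 or less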
For most symmetric schemes, the key is a randomly generated string (each bit of the string has an independent ½ chance of being 0 or 1), and the length is the number of bits of the string. For example, AES-128 is AES using a 128-bit (16-byte) key. There is only one exception worth mentioning: DES keys are expressed as 64-bit strings, but only 56 of those bits are random (the other 8 are calculated from the random 56), and so DES is sometimes considered to have a “56-bit” key length and sometimes a “64-bit” key length.
For Diffie-Hellman, the key length n is the exact size of the group (conventionally written p). Both the private key and the public key are numbers between 1 and p, so they're at-most n-bit numbers. This is as simple as it goes in terms of key length for asymmetric cryptography.
For RSA, the key length n is the exact size of the modulus, which is one part of the public key (the public key is a pair of numbers: the modulus and the public exponent). For example, 4096-bit RSA means that the modulus is between 2^4095 and 2^4096-1. The private key is also an n-bit number, but in the at-most sense.
For DSA, there are two numbers that can be called the key length, because the private key and the public key are chosen in intervals that have different sizes. The public key length is the size of the larger prime p; the public key is a number between 2 and p-2. The private key length is the size of the smaller prime q; the private key is a number between 1 and q-1.
For elliptic curve cryptography, the domain parameters of the algorithm are called a curve: a set of points, and a parametrization of this set of points. A private key is a parameter value that designates a point on the curve, and a public key is a pair of integers that are the coordinates of a point on the curve. In general, since the private key and the public key live in different mathematical spaces, there are two numbers that could be called the “key size”. A private key is a number between 1 and n-1 for some m-bit number n, and a public key is a point with two coordinates, each of which is between 0 and q for some ℓ-bit number q. In general, m and ℓ don't have to be equal. However, n and q are usually close (if they aren't, it's a waste of performance for a given security level), and so m and ℓ are usually equal and can be called the “key length” without ambiguity.
Every bit can be either 1 or 0. It's the basic unit in the digital world. As you may know, everything digital ends up being either 1s or 0s, and each 1 or 0 is a bit.
So something of length n bits means that it consists of n such 1s and 0s.

Advantages and disadvantages of single numeric (float) data type [closed]

Why do we use various data types in programming languages? Why not use float everywhere? I have heard some arguments like:
Arithmetic on int is faster (but why?)
It takes more memory to store float. (I get it.)
What are the additional benefits of using various types of numeric data types ?
Arithmetic on integers has traditionally been faster because it's a simpler operation. It can be implemented in logic gates and, if properly designed, the whole thing can happen in a single clock cycle.
On most modern PCs floating-point support is actually quite fast, because loads of time has been invested into making it fast. It's only on lower-end processors (like Arduino, or some versions of the ARM platform) where floating point seriously suffers, or is absent from the CPU altogether.
A floating point number contains a few different pieces of data: there's a sign bit, and the mantissa, and the exponent. To put those three parts together to determine the value they represent, you do something like this:
value = sign * mantissa * 2^exponent
It's a little more complicated than that, because floating point numbers optimize how they store the mantissa a bit (for instance, the first bit of the mantissa is assumed to be 1, so it doesn't actually need to be stored; but this also means zero has to be stored in a particular way, and there are various "special values" that can be stored in floats, like "not a number" and infinity, that have to be handled correctly when working with floats).
So to store the number "3" you'd have a mantissa of 0.75 and an exponent of 2. (0.75 * 2^2 = 3).
But then to add two floats together, you first have to align them. For instance, 3 + 10:
m3 = 0.75 (stored as binary (1)1000000... the first (1) implicit and not actually stored)
e3 = 2
m10 = .625 (stored as binary (1)010000...)
e10 = 4 (.625 * 2^4 = 10)
You can't just add m3 and m10 together, 'cause you'd get the wrong answer. You first have to shift m3 over by a couple bits to get e3 and e10 to match, then you can add the mantissas together and reassemble the result into a new floating point number. A CPU with good floating-point implementation will do all that for you, of course, and do it fast.
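You can check this decomposition directly; for example, Python's math.frexp returns a mantissa in [0.5, 1) and a base-2 exponent, matching the convention used above:
import math

print(math.frexp(3.0))     # (0.75, 2)   because 0.75 * 2**2 == 3
print(math.frexp(10.0))    # (0.625, 4)  because 0.625 * 2**4 == 10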
So why else would you not want to use floating point values for everything? Well, for starters there's the problem of exactness. If you add or multiply two integers to get another integer, as long as you don't exceed the limits of your integer size, the answer you get will be exactly correct. This isn't the case with floating-point. For instance:
x = 1000000000.0
y = .0000000001
for (cc = 0; cc < 1000000000; cc++) { x += y; }
Logically you'd expect the final value of (x) to be 1000000000.1, but that's almost certainly not what you're going to get. When you add (y) to (x), the change to (x)'s mantissa may be so small that it doesn't even fit into the float, and so (x) may not change at all. And even if that's not the case, (y)'s value is not exact. There are no two integers (a, b) such that (a * 2^b = 10^-10). That's true for many common decimal values, actually. Even something simple like 0.3 can't be stored as an exact value in a binary floating-point number.
So (y) isn't exactly 10^-10; it's actually off by some small amount. For a 64-bit double it'll be off by about 10^-26:
y = 10^-10 + error, error is about 10^-26
Then if you add (y) together a billion times, that error is magnified by about a billion times as well, so it alone contributes an error of around 10^-17.
A good floating-point implementation will try to minimize these errors, but it can't always get it right. The problem is fundamental to how the numbers are stored, and to some extent unavoidable. As a result, for instance, even though it seems natural to store a money value in a float, it might be preferable to store it as an integer instead, to get that assurance that the value is always exact.
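A quick way to see the exactness point (this sketch uses Python doubles and its standard decimal module purely as an illustration):
from decimal import Decimal

print(0.1 + 0.2 == 0.3)                    # False: neither operand is exact in binary
print(0.1 + 0.2)                           # 0.30000000000000004

price_in_cents = 1999 + 501                # money as integer cents: always exact
print(price_in_cents)                      # 2500
print(Decimal("0.10") + Decimal("0.20"))   # 0.30, exact decimal arithmetic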
The "exactness" issue also means that when you test the value of a floating point number, generally speaking, you can't use exact comparisons. For instance:
x = 11.0 / 500
if (x * 50 == 1.1) { ... }   // you'd expect this test to pass, but it doesn't!
for (float x = 0.0; x < 1.0; x += 0.01) { print x; }
// prints 101 values instead of 100, the last one being 0.9999999...
The test fails because (x) isn't exactly the value we specified, and 1.1, when encoded as a float, isn't exactly the value we specified either. They're both close but not exact. So you have to do inexact comparisons:
if (abs(x - expected_value) < small_value) {...
Choosing the correct "small_value" is a problem unto itself. It can depend on what you're doing with the values, what kind of behavior you're trying to achieve.
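For example, one way that comparison might look (math.isclose is a convenient option in Python; the tolerance values here are just examples):
import math

x = 11.0 / 500
print(x * 50 == 1.1)                               # False: both sides are slightly off
print(abs(x * 50 - 1.1) < 1e-9)                    # True: inexact comparison passes
print(math.isclose(x * 50, 1.1, rel_tol=1e-9))     # True: relative tolerance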
Finally, if you look at the "it takes more memory" issue, you can also turn that around and think of it in terms of what you get for the memory you use.
If you can work with integer math for your problem, a 32-bit unsigned integer lets you work with (exact) values between 0 and around 4 billion.
If you're using 32-bit floats instead of 32-bit integers, you can store larger values than 4 billion, but you're still limited by the representation: of those 32 bits, one is used for the sign bit and eight for the exponent, so you get 23 bits (24, effectively) of mantissa. Once (x >= 2^24), you're beyond the range where integers are stored "exactly" in that float, so (x+1 = x). So a loop like this:
float i;
for (i = 16000000; i < 17000000; i += 1);
would never terminate: (i) would reach (2^24 = 16777216), and the least-significant bit of its mantissa would be of a magnitude greater than 1, so adding 1 to (i) would cease to have any effect.
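You can check that threshold directly; here is a tiny sketch using numpy's float32 type (numpy is only an assumption here for getting a real 32-bit float; plain Python floats are 64-bit and hit the same wall at 2^53):
import numpy as np

x = np.float32(16777216)            # 2**24
print(x + np.float32(1) == x)       # True: adding 1 no longer changes the value
print(float(x + np.float32(1)))     # 16777216.0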

Multiplication using FFT in integer rings

I need to multiply long integer numbers with an arbitrary BASE of the digits using FFT in integer rings. Operands are always of length n = 2^k for some k, and the convolution vector has 2n components, therefore I need a 2n'th primitive root of unity.
I'm not particularly concerned with efficiency issues, so I don't want to use the Schönhage-Strassen algorithm - just computing the basic convolution, then some carries, and nothing else.
Even though it seems simple to many mathematicians, my understanding of algebra is really bad, so I have lots of questions:
What are essential differences or nuances between performing the FFT in integer rings modulo 2^n + 1 (perhaps composite) and in integer FIELDS modulo some prime p?
I ask this because 2 is a (2n)th primitive root of unity in such a ring, because 2^n == -1 (mod 2^n+1). In contrast, integer field would require me to search for such a primitive root.
But maybe there are other nuances which will prevent me from using rings of such a form for the FFT.
If I pick integer rings, what are sufficient conditions for the existence of a 2^n-th root of unity in such a ring?
All other 2^k-th roots of unity of smaller order could be obtained by squaring this root, right?..
What essential restrictions are imposed on the multiplication by the modulo of the ring? Maybe on their length, maybe on the numeric base, maybe even on the numeric types used for multiplication.
I suspect that there may be some loss of information if the coefficients of the convolution are reduced by the modulo operation. Is it true and why?.. What are general conditions that will allow me to avoid this?
Is there any possibility that just primitive-typed dynamic lists (i.e. long) will suffice for FFT vectors, their product and the convolution vector? Or should I transform the coefficients to BigInteger just in case (and what is the "case" when I really should)?
If a general answer to these question takes too long, I would be particularly satisfied by an answer under the following conditions. I've found a table of primitive roots of unity of order up to 2^30 in the field Z_70383776563201:
http://people.cis.ksu.edu/~rhowell/calculator/roots.html
So if I use 2^30th root of unity to multiply numbers of length 2^29, what are the precision/algorithmic/efficiency nuances I should consider?..
Thank you so much in advance!
I am going to award a bounty to the best answer - please consider helping out with some examples.
First, an arithmetic clue about your identity: 70383776563201 = 1 + 65550 * 2^30. And that long number is prime. There's a lot of insight into your modulus on the page How the FFT constants were found.
Here's a fact of group theory you should know. The multiplicative group of integers modulo N is the product of cyclic groups whose orders are determined by the prime factors of N. When N is prime, there's one cycle. The orders of the elements in such a cyclic group, however, are related to the prime factors of N - 1. Here 70383776563201 - 1 = 2^31 * 3 * 5^2 * 19 * 23, and the divisors of this number are the possible orders of elements.
(1) You don't necessarily need a primitive root; you need an element whose order is at least large enough. There are probabilistic algorithms for finding elements of "high" order; they're used in cryptography to ensure you have strong parameters for keying material. Numbers of the form 2^n+1 specifically have received a lot of factoring attention, and you can look up the results.
(2) The sufficient (and necessary) condition for an element of order 2^n is illustrated by the example modulus. The condition is that some prime factor p of the modulus has to have the property that 2^n | p - 1.
(3) Loss of information only happens when elements aren't multiplicatively invertible, which isn't the case for the cyclic multiplicative group of a prime modulus. If you work in a modular ring with a composite modulus, some elements are not so invertible.
(4) If you want to use arrays of long, you'll be essentially rewriting your big-integer library.
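To tie (1) and (2) together, here is a small, unoptimized sketch (function names are mine, and the O(n^2) transform is only there to check the algebra on tiny inputs, not for speed) that finds a root of unity of power-of-two order modulo the prime from the question and uses it to multiply two digit vectors:
import random

P = 70383776563201   # prime from the question, P - 1 = 2^31 * 3 * 5^2 * 19 * 23

def find_root_of_unity(order, p=P):
    # find an element of exact order `order` (a power of two dividing p - 1)
    assert (p - 1) % order == 0
    while True:
        g = pow(random.randrange(2, p - 1), (p - 1) // order, p)
        if order == 1 or pow(g, order // 2, p) != 1:
            return g

def ntt(a, w, p=P):
    # naive O(n^2) transform, enough to check the algebra on small inputs
    n = len(a)
    return [sum(a[j] * pow(w, i * j, p) for j in range(n)) % p for i in range(n)]

def multiply(x, y, base=10):
    # multiply two little-endian digit vectors via convolution in Z_P
    n = 1
    while n < 2 * max(len(x), len(y)):
        n *= 2
    a = x + [0] * (n - len(x))
    b = y + [0] * (n - len(y))
    w = find_root_of_unity(n)
    w_inv, n_inv = pow(w, P - 2, P), pow(n, P - 2, P)
    fc = [u * v % P for u, v in zip(ntt(a, w), ntt(b, w))]
    c = [v * n_inv % P for v in ntt(fc, w_inv)]    # inverse transform
    carry, digits = 0, []                          # resolve carries in the given base
    for coeff in c:
        carry, digit = divmod(carry + coeff, base)
        digits.append(digit)
    while carry:
        carry, digit = divmod(carry, base)
        digits.append(digit)
    return digits

print(multiply([3, 2, 1], [6, 5, 4]))   # 123 * 456 = 56088 -> [8, 8, 0, 6, 5, 0, 0, 0]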
Suppose we need to multiply two n-bit integers where
n = 2^30;
m = 2*n; p = 2^n + 1
Now,
w = 2, x = [w^0, w^1, ..., w^{m-1}] (mod p).
The issue is that each x[i] will be huge, so we cannot compute w * a_i in O(1) time.

Pseudo-random numbers from a 32-bit auto-increment INTEGER

I have a table with an auto-increment 32-bit integer primary key in a database, which will produce numbers ranging 1-4294967295.
I would like to keep the convenience of an auto-generated primary key, while having my numbers on the front-end of an application look like randomly generated.
Is there a mathematical function which would allow a two-way, one-to-one transformation between an integer and another?
For example a function would take a number, and translate it to another:
1 => 1538645623
2 => 2043145593
3 => 393439399
And another function the way back:
1538645623 => 1
2043145593 => 2
393439399 => 3
I'm not necessarily looking for an implementation here, but rather a hint on what I suppose, must be a well-known mathematical problem somewhere :)
Mathematically this is almost exactly the same problem as cryptography.
You: I want to go from an id(string of bits) to another number (string of bits) and back again in a non-obvious way.
Cryptography: I want to go from plaintext (string of bits) to another string of bits and back again (reversible) in a non-obvious way.
So for a simple solution, can I suggest just plugging in whatever cryptography algorithm is most convenient in your language, and encrypt and decrypt your id?
If you wanted to be a bit cleverer you can do what is called "salting" in addition to cryptography. Take your id as a 32 bit (or whatever) number. Concatenate it with a random 32 bit number. Encrypt the result. To reverse, just decrypt, and throw away the random part.
Of course, if someone was seriously attacking this, this might be vulnerable to known plaintext/differential cryptanalysis attacks as you have a very small known plaintext space, but it sounds like you aren't trying to defend against serious attacks.
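If you'd rather not pull in a full cryptography library, a small Feistel network gives a reversible 32-bit-to-32-bit mapping in a few lines; the round keys and round function below are arbitrary examples I made up (this is obfuscation, not a vetted cipher):
import hashlib

ROUND_KEYS = [0xA5A5A5A5, 0x3C3C3C3C, 0x0F0F0F0F, 0x96969696]   # arbitrary example keys

def _round(half16, key):
    # arbitrary round function: hash the 16-bit half mixed with the round key
    data = (half16 ^ key).to_bytes(4, "big")
    return int.from_bytes(hashlib.sha256(data).digest()[:2], "big")

def obfuscate_id(x):
    left, right = x >> 16, x & 0xFFFF
    for k in ROUND_KEYS:
        left, right = right, left ^ _round(right, k)
    return (left << 16) | right

def recover_id(y):
    left, right = y >> 16, y & 0xFFFF
    for k in reversed(ROUND_KEYS):
        left, right = right ^ _round(left, k), left
    return (left << 16) | right

for i in (1, 2, 3):
    print(i, "->", obfuscate_id(i), "->", recover_id(obfuscate_id(i)))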
First remove the offset of 1, so you get numbers in the range 0 to 2^32 - 2. Let m = 2^32 - 1.
Choose some a that is relatively prime to m. Since it is relatively prime, it has an inverse a' so that a * a' = 1 (mod m). Also choose some b. Choose big numbers to get a good mixing effect.
Then you can compute your desired pseudo-random number by y = (a * x + b) % m, and get back the original by x = ((y - b) * a') % m.
This is essentially one step of a linear congruential generator (LCG) for pseudo-random numbers.
Note that this is not secure, it is only obfuscation. For example, if a user can get two numbers in sequence then he can recover a and b easily.
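Here is a minimal sketch of that scheme (the particular a and b are arbitrary example constants; the assert checks the coprimality requirement, and pow(a, -1, m) needs Python 3.8+):
import math

m = 2**32 - 1
a = 2654435761                    # example multiplier, must be coprime to m
b = 104729                        # example offset
assert math.gcd(a, m) == 1
a_inv = pow(a, -1, m)             # modular inverse, so (a * a_inv) % m == 1

def to_public(db_id):
    return (a * (db_id - 1) + b) % m          # remove the offset of 1, then mix

def to_db_id(public_id):
    return ((public_id - b) * a_inv) % m + 1

for i in (1, 2, 3):
    print(i, "->", to_public(i), "->", to_db_id(to_public(i)))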
In most cases web apps use a hash of a randomly generated number as a reference to a table row. This hash can be stored as a number and displayed as a string for the end user.
This hash is unique and acts as the external identifier; the auto-increment id is only used inside the application itself and is never shown to the outside world.
