I have recently been learning about public/private key encryption in my computer science lessons, and how it works in terms of data encryption/decryption. We also covered how it can be used for digital signatures. However, we didn't go into too much detail on how the actual keys are generated themselves.
I know that it begins with a very large number, which is then passed through some kind of keygen algorithm which returns two distinctive keys, one of which is private and the other is public. Are these algorithms known or are they black box systems? And does one user always have the same pair of keys linked to them or do they ever change at any point?
It just seems like a very mathematical issue, as the keys are linked, yet one is not deducible from the other.
I know that it begins with a very large number, which is then passed through some kind of keygen algorithm which returns two distinctive keys, one of which is private and the other is public.
Well, that's not entirely correct. Most asymmetric algorithms are of course based on large numbers, but this is not a requirement. There are, for instance, algorithms based on hashing, and hashing is based on bits/bytes, not numbers.
But yes, for asymmetric algorithms usually contain a specific algorithm to perform the key pair generation. For instance, asymmetric encryption consists of a triple Gen, Enc and Dec where Gen represents the key pair generation. And the key pair of course consists of a public and a private part.
RSA basically starts off by generating two large random primes, it doesn't start with a single number necessarily.
Are these algorithms known or are they black box systems?
They are known, and they are fundamental to the security of the system. You cannot use just any numbers to perform, e.g., RSA. Note that for RSA there are different algorithms and configurations possible; not every system will use the same Gen.
And does one user always have the same pair of keys linked to them or do they ever change at any point?
That depends on the key management of the system. Usually there is some way of refreshing or regenerating keys. For instance X.509 certificates tend to have a end date (the date of expiry or expiration date), so you cannot even keep using the corresponding private key forever; you have to refresh the certificates and keys now and then.
It just seems like a very mathematical issue, as the keys are linked, yet one is not deducible from the other.
That's generally not correct. The public key is usually easy to derive from the private key. For RSA the public exponent may not be known, but it is usually set to a fixed number (65537). This together with the modulus - also part of the private key - makes the public key. For Elliptic Curve keys a private random value is first produced and the public key is directly derived from it.
You can of course never derive the private key from the public key; that would make no sense - it would not be very private if you could.
In RSA the generated two numbers p and q are very large prime numbers more or less the same size, which are used to calculate N which derives the public/private keys using modulo arithmetic.
The following answer in crypto.stackexchange.com describes in more details how we can start from a random (large) number and use Fermat test and Miller-Rabin tests to reach a number that is very probable to be prime.
Related
I was considering hashing small blocks of sensitive ID data but I require to maintain the full uniqueness of the data blocks as a whole once obfuscated.
So, I came up with the idea of encrypting some publicly-known input data (say, 128 bits of zeroes), and use the data I want to obfuscate as the key/password, then throw it away, thus protecting the original data from ever being discovered.
I already know about hashing algorithms, but my problem is that I need to maintain full uniqueness (generally speaking a 1:1 mapping of input to output) while still making it impossible to retrieve the actual input. A hash cannot serve this function because information is lost during the process.
It is not necessary that the data be retrieved once "encrypted". It is only to be used as an ID number from then on.
An actual GUID/UUID is not suitable here because I need to manually control the identifiers on a per-identifier basis. The IDs cannot be unknown or arbitrarily generated data.
EDIT: To clarify exactly what these identifiers are made of:
(unencrypted) 64bit Time Stamp
ID Generation Counter (one count for each filetype)
Random Data (to make multiple encrypted keys dissimilar)
MAC Address (or if that's not available, set top bit + random digits)
Other PC-Specific Information (from registry)
The whole thing should add up to 192 bits, but the encrypted section's content size(s) could vary (this is by no means a final specification).
Given:
A static IV value
Any arbitrary 128bit key
A static 128 bits of input
Are AES keys treated in a fashion that would result in a 1:1 key<---->output mapping, given the same input and IV value?
No. AES is, in the abstract, a family of permutations of which you select a random one with the key. It is the case that for one of those permutations(i.e. for encryption under a given AES key) you will not get collisions because permutations are bijective.
However, for two different permutations (i.e. encryption under different AES keys, which is what you have), there is no guarantee what so ever that you don't get a collision. Indeed, because of the birthday paradox, the likelihood of a collision is probably higher than you think.
If your ID's are short ( < 1024 bits) you could just do an RSA encryption of them which would give you want you want. You'd just need to forget the private key.
On the one hand, I hear people saying that the two keys are totally interchangeable, the first one will decrypt what the second one encrypted. This makes me think that the two keys are interchangeable.
But on the other hand, RSA generated keys appear to have different length, and on another topic encrypting with a private key was called “signing” and was deemed less safe than encrypting with a public key. (2)
On top of that comes the idea that the private key should be kept undisclosed when the public key should be openly distributed in the wild. (3)
I planned to receive data from an unique server, so my idea was to keep a public key on that server to encrypt data, and distribute a private key to all the possible customers, but this goes against (3). Conversely, if I distribute public keys and encrypt my data with the private key, the encryption is less safe according to (2).
Should I distribute a public key and encrypt with a private one to satisfy (2) or the other way around?
NB: in my case, performance is not an issue.
The answer depends on whether you are asking your question out of mathematic curiosity, or for purely practical, cryptographic reasons.
If you are implementing a crypto system you should never disclose your private key, so in this sense the keys are absolutely not interchangeable. Furthermore, the usage scenario you describe seems like a good match for authentication rather than confidentiality, so the message that is sent by the server to the clients should indeed be signed and not encrypted. If you need confidentiality as well, you need a few more steps in your protocol.
From a mathematical point of view, the answer is OTOH "yes", presuming you use an internal representation of the private key that only contains the modulus N and the exponent D, and the other exponent E is generated randomly. The formula that describes the relation between the two exponents is 1 = E*D (mod phi(N)), so from a mathematical point of view it doesn't really matter which exponent is which.
But on the other hand, RSA generated keys appear to have different length
If you are using an implementation that produces RSA private keys that are significantly longer than the corresponding public keys, this almost always means the implementation is absolutely not suitable for using public and private keys interchangeably. The difference in length is usually due to a combination of the following:
The public exponent E is not generated randomly, but is a small, fixed constant, such as 3 or 0x10001. The private exponent D will on the other hand be almost as large as the modulus, so the private key data will be almost twice the size of the public key data. If you only got a RSA private key (N,D), your first guess on the public exponent would be either of the values 3 or 0x10001, and it would be easy the check if the guess is correct. Should you want the keys to be interchangeable, the exponent you pick first has to be picked randomly as an odd integer greater than 1 and less than phi(N) and with no prime factors in common with N or phi(N).
The private key data includes the factors P,Q of the public modulus N.
The private key data includes the public exponent E.
Your public key is used to encrypt a message, your private one to decrypt it. Thus with the public key, which you distribute, anyone can encrypt a message safe in the knowledge that only you (or someone with your private key) can decrypt it. To answer your question directly, no they are not interchangeable. You should never distribute your private key.
If you want to share a key with multiple possible customers, then there are really two options. Either you abandon asymmetric cryptography and find a secure way to distribute a symmetric key, for use with something like AES instead of RSA, to each of them, or you ask each of them to generate a key pair and provide you with their public key. Then you can decrypt what comes from the server, and re-encrypt for each customer. The number of customers will help dictate your choice between the two.
Assuming a keystore is secure and one needs to service around a million keys, is it better to generate asymmetric keys in real-time or is it better to generate a bunch of keys and store them to be used as and when required?
Edit 1: By real time I mean generate a key pair when a user registers for the first time, from then on that key pair is used for all communication with the user.
Asymmetric keys have a public part and a private part; the public part is used to perform the operation which complements that which is done with the private part (e.g. you sign with the private key, and you verify the signature with the public key; or you encrypt data with the public key, and decrypt it with the private key). The point of asymmetric keys is that the private and public parts can be known by distinct entities; namely, that the public part is, well, public (everybody knows it) while the private part remains private.
Consequently, generating an asymmetric key "in real-time" makes little sense in most situations: what gives some value to a private key is that the public key is already known to some other party.
One can still imagine some situations in which "real-time" generation of asymmetric keys can be of use. For instance, SSL connections using one of the "ephemeral Diffie-Hellman" cipher suites: the DH keys, which can be called "asymmetric", are generated for each connection, the public part being then signed by the server (with another asymmetric key, which is not generated on-the-fly: the public key is the one in the server certificate) and then sent to the connecting client. In such a situation, pre-generating DH key pairs and storing them could be viewed as a kind of optimization, but a bad one since DH key pair generation is very fast, and private key storage is a complex and delicate issue.
Edit: if your problem is about key generation upon user registration vs key generation and storage in advance: assuming that server-side key generation is indeed what you want, key generation and storage in advance is worthwhile only as an optimization, if on-the-fly generation proves to be too expensive to handle peaks (occasionally, many users trying to register at the same time). I suggest that you try and bench and make sure that the problem really exists, before implementing a "solution", because private key secure storage is somewhat tricky. RSA key generation is quite fast (on a basic PC, you can easily generate a dozen RSA keys per second), and with discrete-log (DSA, Diffie-Hellman, El-Gamal) or elliptic-curve based cryptosystems, it is even considerably faster (e.g. ten thousands new EC key pairs per second, with a PC).
I'm working on a new licensing scheme for my software, based on OpenSSL public / private key encryption. My past approach, based on this article, was to use a large private key size and encrypt an SHA1 hashed string, which I sent to the customer as a license file (the base64 encoded hash is about a paragraph in length). I know someone could still easily crack my application, but it prevented someone from making a key generator, which I think would hurt more in the long run.
For various reasons I want to move away from license files and simply email a 16 character base32 string the customer can type into the application. Even using small private keys (which I understand are trivial to crack), it's hard to get the encrypted hash this small. Would there be any benefit to using the same strategy to generated an encrypted hash, but simply using the first 16 characters as a license key? If not, is there a better alternative that will create keys in the format I want?
DSA signatures are signficantly shorter than RSA ones. DSA signatures are the twice the size of the Q parameter; if you use the OpenSSL defaults, Q is 160 bits, so your signatures fit into 320 bits.
If you can switch to a base-64 representation (which only requires upper-and-lower case alphanumerics, the digits and two other symbols) then you will need 53 symbols, which you could do with 11 groups of 5. Not quite the 16 that you wanted, but still within the bounds of being user-enterable.
Actually, it occurs to me that you could halve the number of bits required in the license key. DSA signatures are made up of two numbers, R and S, each the size of Q. However, the R values can all be pre-computed by the signer (you) - the only requirement is that you never re-use them. So this means that you could precalculate a whole table of R values - say 1 million of them (taking up 20MB) - and distribute these as part of the application. Now when you create a license key, you pick the next un-used R value, and generate the S value. The license key itself only contains the index of the R value (needing only 20 bits) and the complete S value (160 bits).
And if you're getting close to selling a million copies of the app - a nice problem to have - just create a new version with a new R table.
Did you consider using some existing protection + key generation scheme? I know that EXECryptor (I am not advertising it at all, this is just some info I remember) offers strong protection whcih together with complimentatary product of the same guys, StrongKey (if memory serves) offers short keys and protection against cracking. Armadillo is another product name that comes to my mind, though I don't know what level of protection they offer now. But they also had short keys earlier.
In general, cryptographically strong short keys are based on some aspects of ECC (elliptic curve cryptography). Large part of ECC is patented, and in overall ECC is hard to implement right and so industry solution is a preferred way to go.
Of course, if you don't need strong keys, you can go with just a hash of "secret word" (salt) + user name, and verify them in the application, but this is crackable in minutes.
Why use public key crypto? It gives you the advantage that nobody can reverse-engineer the executable to create a key generator, but key generators are a somewhat secondary risk compared to patching the executable to skip the check, which is generally much easier for an attacker, even with well-obfuscated executables.
Eugene's suggestion of using ECC is a good one - ECC keys are much shorter than RSA or DSA for a given security level.
However, 16 characters in base 32 is still only 5*16=80 bits, which is low enough that brute-forcing for valid keys might be practical, regardless of what algorithm you use.
In an asymetric encryption scheme, I was wondering if it's possible to achieve the following:
Bob sends to Alice his public key
Alice alters Bob's public key and encrypt some document with it
Alice sends the encrypted document to Bob
Bob retrieve the document but can't decrypt it with his private key
Later, Alice sends some additional information (probably related to the method she used to alter Bob's public key) to Bob
Bob uses this additional information to modify his private key and successfully decrypt the document
Anyone?
I am assuming RSA for the keys generation, encryption and decryption but if it's easier to do with another scheme feel free to comment.
(I assume you talk about RSA.)
Yes it is possible, but not 100%.
The public key is a part of the private key. It contains the modulus and the exponent of the key.
You can completely forget changing the modulus, because you would have to generate a new rsa keypair, which is the same problem as the one we are trying to solve.
But it is possible to change the exponent. You can select any (prime) number between 1 and your exponent as the new exponent and hope that it is coprime with the totient. Without knowing the totient it's impossible to select always a correct exponent. To find out the totient you would have to know the prime factors of the key, which means that you would have to break the key (have fun!).
So, it's actually impossible to have a 100% percent working method to do that, at least not while knowing only the public key.
If you need more information about the theory check here
I hope my idea works.
Let us assume that (e,d,n) is a tuple of the RSA public exponent. The RSA private exponent and the RSA modulus n :
Select a prime number, say p, between 1 and a 256 bit integer.
To encrypt a message m, compute the new public exponent as e*p and the ciphertext as:
c= m^{e*p} mod n.
To decrypt, the receiver should know the prime p, so you send this p later to him, with this he computes
(1) P = p^{-1} mod phi(n)
and
(2) m^e=c^{P} mod n
and
finally m=(m^e)^d mod n. This works as the receiver knows phi(n).
By the way, where can we use this? Is there any application you have in mind for this?
As silky implies in his answer, the way in which RSA is usually used to encrypt a document is in combination with a symmetric algorithm, like AES. A secure random key is generated for the AES algorithm, the documented is encrypted with that AES key, and the AES key is encrypted with the recipient's public key. Both parts are supplied to the recipient.
You can adapt this to your situation simply by sending only the document encrypted with the AES key in the first step, and withholding the AES key encrypted with the recipient's public key until the second step. The first part will be on the order of the original file size, and the second part will be a small, constant size (on the order of the RSA key size).
Hmm, interesting.
You're referring to RSA, I assume?
FYI, RSA isn't actually used to encrypt documents. It's used to exchange keys (keys for a symmetric algorithm, like AES).
So what you're really talking about is an approach that changes the keys.
Technically (mathmatically) if you put a different number in, you'll get a different number out. So that's not an issue; changing the public key in some fashion will (assuming you convince your RSA implementation to use it, or prepare an appropriately different number) result in a different symmetric key, thus an undecryptable document by Bob (because he'll expect a different key).
Really, though, I'm not so sure you care about this. It's a fairly useless thing to do. Perhaps, however, you're actually interested in Key Splitting (or "Secret Sharing" as wikipedia seems to call it).
HTH. I'm by no means an expert.