I'm looking for a method that lets a user generate a public/private key pair from an initial key provided to them. I don't know whether this is called hierarchical key generation, multilevel key generation, or something else. The higher-level key does not need to be able to decrypt the lower-level key's data; I just need the pair to be generated from another key.
I have seen some articles, but they're all purely theoretical. Is there a way to achieve this for RSA?
It is pretty easy actually.
The algorithm for generating an RSA key pair boils down to finding a set of big prime numbers that fulfil some algebraic properties and are of appropriate size.
If you need a 2048-bit RSA key, you will typically look for 2 prime numbers, each roughly 1024 bits long.
The process of finding a prime number is trial-and-error: you randomly pick an integer of appropriate size, and test if it is prime. If it is not, you retry.
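As a rough illustration of that trial-and-error loop, here is a minimal sketch in Python 3 using only the standard library. The function names `is_probable_prime` and `find_prime` are made up for this example, and the primality check is a plain Miller-Rabin test; real libraries add sieving and extra conditions on the primes.

```python
import secrets

def is_probable_prime(n, rounds=40):
    """Miller-Rabin probabilistic primality test."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29):
        if n % p == 0:
            return n == p
    # Write n - 1 as d * 2**r with d odd.
    d, r = n - 1, 0
    while d % 2 == 0:
        d //= 2
        r += 1
    for _ in range(rounds):
        a = secrets.randbelow(n - 3) + 2   # random base in [2, n - 2]
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False   # this base proves n is composite
    return True

def find_prime(bits):
    """Pick random odd candidates of the given size until one tests prime."""
    while True:
        candidate = secrets.randbits(bits)
        candidate |= (1 << (bits - 1)) | 1   # force the top bit and oddness
        if is_probable_prime(candidate):
            return candidate
```

Here the candidates come from the operating system's random source via `secrets`. For a derived key you would draw them from a seeded, deterministic generator instead.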
In the real world, the random generator that drives the algorithm is a deterministic PRNG which is seeded with a secret of appropriate entropy (e.g. 128 bits of true randomness).
In your case, the PRNG seed can be derived from a user secret or even from another key (provided it is secret of course). Derivation should be performed with a salted KDF like HKDF, PBKDF2, etc.
You don't specify which crypto library you use; whatever it is, you must be clear on how it draws randomness and how to set the seed of the PRNG.
Example (in Python 2.x):
from Crypto.PublicKey import RSA
from Crypto.Hash import HMAC
from struct import pack
# The first key could also be read from a file
first_key = RSA.generate(2048)
# Here we encode the first key into bytes and in a platform-independent format.
# The actual format is not important (PKCS#1 in this case), but it must
# include the private key.
encoded_first_key = first_key.exportKey('DER')
seed_128 = HMAC.new(encoded_first_key + b"Application: 2nd key derivation").digest()
class PRNG(object):
    def __init__(self, seed):
        self.index = 0
        self.seed = seed
        self.buffer = b""
    def __call__(self, n):
        while len(self.buffer) < n:
            self.buffer += HMAC.new(self.seed +
                                    pack("<I", self.index)).digest()
            self.index += 1
        result, self.buffer = self.buffer[:n], self.buffer[n:]
        return result
second_key = RSA.generate(2048, randfunc=PRNG(seed_128))
The drawbacks to keep in mind are that:
the derived key will get compromised as soon as the first key is compromised.
the derived key cannot be stronger than the first key: the algorithm does not magically generate entropy. If the secret key or passphrase is short, you end up with a weak derived key.
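The PyCrypto example above targets Python 2. As a hedged sketch of only the seed derivation and the deterministic generator, here is a Python 3 variant that needs nothing beyond the standard library. The names `derive_seed` and `DeterministicPRNG` are hypothetical, PBKDF2-HMAC-SHA256 and HMAC-SHA256 in counter mode are my own choices, and this is not a vetted DRBG. Whether your RSA library accepts such a callable as its randomness source is something to check in its documentation.

```python
import hashlib
import hmac
import struct

def derive_seed(parent_secret, context, salt=b"", iterations=100_000):
    """Derive a 32-byte seed from a parent secret with PBKDF2-HMAC-SHA256.

    `context` (an assumption of this sketch) separates different uses of
    the same parent secret; it is simply appended to the salt.
    """
    return hashlib.pbkdf2_hmac("sha256", parent_secret, salt + context, iterations)

class DeterministicPRNG:
    """Byte generator: HMAC-SHA256(seed, counter) blocks, concatenated."""

    def __init__(self, seed):
        self.seed = seed
        self.index = 0
        self.buffer = b""

    def __call__(self, n):
        # Refill the buffer one 32-byte block at a time until n bytes exist.
        while len(self.buffer) < n:
            counter = struct.pack("<I", self.index)
            self.buffer += hmac.new(self.seed, counter, hashlib.sha256).digest()
            self.index += 1
        result, self.buffer = self.buffer[:n], self.buffer[n:]
        return result
```

Two generators built from the same seed return the same bytes in the same order, which is what makes the derived key reproducible.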
If the user enters a wrong key for AES decryption, some garbage data is generated. I want to verify the given decryption key and throw an error if the key is incorrect. How can I verify the key entered by the user?
Use an HMAC. The basic premise is that you run the plaintext through an HMAC, append the result to the plaintext, and then encrypt. When decrypting, do the reverse: if the HMAC recomputed over the decrypted plaintext matches the stored one, you know you've got the correct key.
OR, if you want to know prior to decryption, use the key material provided by the user to derive two further keys (using, say PBKDF2). Use one for encryption and another for an HMAC. In this case, encrypt first and then apply the HMAC using the second key. This way you can compute the HMAC and check if it matches before you decrypt.
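A minimal sketch of that second variant, using only Python's standard library. The encryption step itself is left out and the ciphertext is treated as opaque bytes; the function names, the iteration count and the choice of HMAC-SHA256 are assumptions made for this example.

```python
import hashlib
import hmac

def derive_keys(passphrase, salt, iterations=200_000):
    """Stretch the user's key material into two independent 32-byte keys."""
    material = hashlib.pbkdf2_hmac("sha256", passphrase, salt, iterations, dklen=64)
    return material[:32], material[32:]   # (encryption key, MAC key)

def tag_ciphertext(mac_key, ciphertext):
    """HMAC over the ciphertext (encrypt first, then MAC)."""
    return hmac.new(mac_key, ciphertext, hashlib.sha256).digest()

def key_is_correct(passphrase, salt, ciphertext, stored_tag, iterations=200_000):
    """Check the key before decrypting by recomputing the tag."""
    _, mac_key = derive_keys(passphrase, salt, iterations)
    expected = tag_ciphertext(mac_key, ciphertext)
    return hmac.compare_digest(expected, stored_tag)   # constant-time compare
```

The salt and the tag are stored next to the ciphertext. If `key_is_correct` returns False, either the key is wrong or the ciphertext was modified, and decryption can be skipped.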
Simplest approach is to add a magic number to the plaintext file data in a predictable location before encrypting; when decrypting, if the magic number is wrong, you used the wrong key. Downside to this approach is that it cannot validate the integrity/authenticity of the entire message.
To do that, use AES in an authenticated mode (e.g. AES-GCM) which gives stronger guarantees that the rest of the message was not tampered with.
One common way to verify that a key was entered correctly, without revealing the actual key, is a KCV (Key Check Value). When you create the key, you calculate the KCV at the same time; when the key is later entered manually, you can verify the entry by recalculating the KCV. This is used, e.g., when entering keys manually into HSMs from physical key letters.
To calculate a KCV for an AES key, you encrypt an all-zero (0x00) block with the key; the first 3 bytes of the resulting ciphertext block are the KCV.
Take a look here
For example, if I create a dictionary in python I can use d.keys() to retrieve the keys.
What is a hash table/dictionary without this kind of access? Storage might be an issue and the keys may be of least importance.
Edit (clarification): I want a data structure that can access values through the key but doesn't know the key, only the hash. For example:
Hash Value
-----------------------------------------------------------------------
2c26b46b68ffc68ff99b453c1d30413413422d706483bfa0f98a5e886266e7ae|hey!
c9fc5d06292274fd98bcb57882657bf71de1eda4df902c519d915fc585b10190|hello!
If I try and access the data structure with the key "this is a key", it will hash that and get "hello!". If I try to access it with the key "foo", I will get "hey!".
We cannot retrieve the keys from this hash table, but we can access the data. This would be useful in cases where storage is important.
Normally, this would be the table:
Hash Value Key
-------------------------------------------------------------------------------------
2c26b46b68ffc68ff99b453c1d30413413422d706483bfa0f98a5e886266e7ae|hey! |foo
c9fc5d06292274fd98bcb57882657bf71de1eda4df902c519d915fc585b10190|hello!|this is a key
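The structure described in the question can be sketched as a thin wrapper around a normal dictionary that stores only the digest of each key. The class name `HashOnlyDict` is made up, and SHA-256 is an assumption based on the 64-hex-character digests in the tables above.

```python
import hashlib

class HashOnlyDict:
    """Maps hash(key) -> value; the original keys are never stored."""

    def __init__(self):
        self._table = {}

    @staticmethod
    def _digest(key):
        return hashlib.sha256(key.encode("utf-8")).hexdigest()

    def __setitem__(self, key, value):
        self._table[self._digest(key)] = value

    def __getitem__(self, key):
        return self._table[self._digest(key)]

    def __contains__(self, key):
        return self._digest(key) in self._table

    def digests(self):
        """All that can be listed: the digests, not the keys."""
        return list(self._table)

d = HashOnlyDict()
d["foo"] = "hey!"
d["this is a key"] = "hello!"
print(d["foo"])
```

Two different keys with the same digest would silently overwrite each other, because there is no stored key left to compare against. With a 256-bit digest that is not a practical concern.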
This is called a Set - in this case the value is the key, and implementations generally use the hashcode and equality operations on the items before adding them to the set.
Some implementations of Set can be sorted; those are generally referred to as SortedSet. Think of Set<T> as an equivalent of Dictionary<T,T> (and SortedSet<T> as approximately a SortedDictionary<T,T>) in C# parlance.
Sorted variants are generally implemented using binary trees, whereas unsorted implementations use hashing tables. As the key is the value, most implementations only store the value itself.
Which platform / language are you using? Java?
I have a 1024-bit private key, and use it to generate a public key.
Does that automatically mean that my public key is also 1024 bits? Or can it be of a smaller size (512, 256...)?
PS: What I'm mostly interested in, and talking about, is the size of the modulus ("n") in RSA keys. The size is typically 1024 or 2048 bits. But I'm glad to see that this sparked a discussion, and all this is feeding my interest in cryptography.
This depends on the encryption algorithm and on what precisely you call public/private key. Sometimes it's possible to use a different size in RAM compared to serialization on disk or the network.
RSA
An RSA public key consists of a modulus n and a public exponent e. We usually choose a small value for e (3, or 65537 are common). The size of e has little influence on security. Since e is usually less than four bytes and n over a hundred, the total size is dominated by the modulus. If you really want to, you can fix e as part of your protocol specification so there is only n to store.
An RSA private key can be represented in different forms, but typically we store the values p, q, dp, dq, e, d, n, InvQ. Their combined size is larger than the public key. Most of these aren't strictly required, but it's convenient to have them available instead of regenerating them. Regenerating all of them given e, p and q is straightforward.
When we talk about key-size in the context of RSA we always mean the size of the modulus, ignoring all the other elements. This is a useful convention, since this is the only value that affects security. A typical size for n is 2048 bits.
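A toy example with deliberately tiny primes makes the point concrete. The numbers are only for illustration and far too small for real use; `pow(e, -1, phi)` needs Python 3.8 or later.

```python
# Toy RSA parameters (illustration only, far too small to be secure).
p, q = 61, 53
n = p * q                    # modulus, shared by the public and private key
phi = (p - 1) * (q - 1)
e = 17                       # small public exponent
d = pow(e, -1, phi)          # private exponent: inverse of e modulo phi

public_key = (n, e)
private_key = (n, d)

message = 65
ciphertext = pow(message, e, n)      # encrypt with the public key
recovered = pow(ciphertext, d, n)    # decrypt with the private key

# "Key size" is the bit length of n; e and d may have different lengths.
print(n.bit_length(), e.bit_length(), d.bit_length())
```

Both keys contain the same n, so they have the same key size in the sense used above, even though e is much shorter than d.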
Finite field crypto (Diffie-Hellman, DSA, etc.)
The private key is a scalar twice the size of the security level. A typical value is 256 bits.
The public key is a group element, which is much larger than the private key. A typical value is 2048 bits.
So with finite field crypto the public key is much larger than the private key.
Elliptic curves
The private key is a scalar twice the size of the security level. A typical value is 256 bits. This part is identical to finite field crypto.
The public key is a group element. There are two forms of serializing such an element. The compressed form is slightly larger than the private key (a couple of bits at most). The uncompressed form is about twice the size of the private key. A typical value for the compressed form is 256 bits and 512 bits for the uncompressed form.
Private key as seed
When you generate public/private key pairs yourself, you can always store them as seeds for a PRNG. That way they're quite small, 160 bits or so regardless of the scheme you use. The downside of this is that regenerating the natural form of the private key may be expensive. It is required that the method of creating the key pair remains constant.
Fingerprint of public key
Instead of storing the full public key, you can often store only a fingerprint, which is 160 bits or so in size. The downside of this is that it increases the size of the message/signature.
Summary
For some algorithms the size of public and private key are the same, for some they differ, and it is often possible to compress either or both of them at a cost (decompression time or message size).
No. The public key in a key pair always matches the private key size, in fact it is derived from the private key.
However, with some public key cryptographic implementations, such as OpenPGP, keys are created with subkeys assigned to different tasks. Those subkeys can be different sizes to each other and the master key used to create them. In those cases the public key data will indicate the key sizes for the master key and the subkey(s) which will match the corresponding private key data.
Whereas many other public key implementations do not utilise subkeys (e.g. TLS) so you will only ever see the single key size. Again that key size will be indicated in both the public and private key data.
The only variation in key sizes you will see is when asymmetric encryption is used in conjunction with symmetric encryption. The symmetric encryption (session key) will be smaller, but it uses entirely different algorithms (e.g. AES, TWOFISH, etc.) and is not part of the public key (except in OpenPGP, where symmetric cipher preferences can be saved because it does not utilise a live connection to establish the symmetrically encrypted communication and exchange session key data).
EDIT: More detail on the relationship between the public and private key data (also known as proving David wrong)
Pointing to RSA is all very well and good, but it depends on the key exchange protocol and for that we go to Diffie-Hellman key exchange and the original patent, which is now expired. Both of these have examples and explanations of the key exchange methods and the relationship between the public and private keys.
Algorithms implementing this relationship, including RSA and El-Gamal, all create both the public and private keys simultaneously. Specifically by creating a private key which then generates the public key. The public key inherits all the features of the private key which made it. The only way to get mis-matched details between the two components would be by somehow generating a public key independently of the private key. The problem there, of course, is that they would no longer be a key pair.
The key generation descriptions for both RSA and El-Gamal explain the common data between the public and private keys and specifically that all the components of the public key are a part of the private key, but the private key contains additional data necessary to decrypt data and/or sign data. In El-Gamal the public components are G, q, g and h while the private components are G, q, g, h and x.
Now, on to the lack of mention of the bit size of the key pairs in the algorithms, yes, that's true, but every practical implementation of them incorporates the selected key size as one of the constants when generating the private key. Here's the relevant code (after all the options are selected, including selecting the key size and specifying the passphrase) for generating keys in GnuPG:
static int
do_create( int algo, unsigned int nbits, KBNODE pub_root, KBNODE sec_root,
           DEK *dek, STRING2KEY *s2k, PKT_secret_key **sk, u32 timestamp,
           u32 expiredate, int is_subkey )
{
    int rc=0;
    if( !opt.batch )
        tty_printf(_(
            "We need to generate a lot of random bytes. It is a good idea to perform\n"
            "some other action (type on the keyboard, move the mouse, utilize the\n"
            "disks) during the prime generation; this gives the random number\n"
            "generator a better chance to gain enough entropy.\n") );
    if( algo == PUBKEY_ALGO_ELGAMAL_E )
        rc = gen_elg(algo, nbits, pub_root, sec_root, dek, s2k, sk, timestamp,
                     expiredate, is_subkey);
    else if( algo == PUBKEY_ALGO_DSA )
        rc = gen_dsa(nbits, pub_root, sec_root, dek, s2k, sk, timestamp,
                     expiredate, is_subkey);
    else if( algo == PUBKEY_ALGO_RSA )
        rc = gen_rsa(algo, nbits, pub_root, sec_root, dek, s2k, sk, timestamp,
                     expiredate, is_subkey);
    else
        BUG();
    return rc;
}
The slight differences between the three algorithms relate to the values for the items referred to in the published algorithms, yet in each case the "nbits" is a constant.
You'll find the same consistency relating to the key size in the code for generating keys in OpenSSL, OpenSSH and any other system utilising public key cryptography. In every implementation in order to have a matched public and private key pair the public key must be derived from the private key. Since the private key is generated with the key size as a constant, that key size must be inherited by the public key. If the public key does not contain all the correct shared information with the private key then it will be, by definition, not matched to that key and thus the encryption/decryption processes and the signing/verifying processes will fail.
Having looked at various sources, my conclusion is that the modulus (n = p*q) used in RSA key generation is the same for the public and the private key. The modulus determines the length of the key for both.
For RSA your public key can be as small as 2 bits. That is the number 3 can be your public key.
A popular choice for RSA public key is 17.
From what I understand, there is no requirement that both keys be the same size. Check below for how to generate keys:
http://en.wikipedia.org/wiki/RSA_algorithm#Key_generation
However I believe that if one key (or factor of the modulus) is significantly smaller it would weaken the strength against cryptanalysis.
Edit:
This discussion has largely become irrelevant since the OP clarified that they were most interested in the size of the modulus, which will naturally be the same for encryption and decryption (excluding any bizarre unknown cryptosystems).
Just to clarify my point, I am simply saying that if e << d (or d << e) you can distribute the keys as different key sizes. They would be generated by the same algorithm using the same bit-size mathematics (e.g. 256 bits), and similarly encryption and decryption would require the same number of bits. If you look at (for the sake of argument) the numbers 1 and 128, you have a range of choices in how to represent them. They could both be 8 bit, or 1 could be represented by any number of bits from 1-7 bits. This could be considered a cheap trick unless your key generation method guarantees that the magnitudes of d and e would be significantly different in a predictable way. However as stated, I don't see much point to doing this.
I have to encrypt some text using the DES algorithm, with a key derived from an MD5 hash.
The MD5 function takes the parameters salt (byte[8]) and key (a 6-character string), and it has to iterate 1000 times. When I call the MD5 function, it returns a byte[16].
The DES function's parameters are the string to encrypt and the key (returned by the MD5 function). But when I try to assign the key value to the encryptor's key I get an exception, because it expects a byte[8] instead of a byte[16]. I've tried taking the first 8 bytes or the last 8 bytes, but neither works (I have an example and I have to get the same result).
Any ideas?
DES (not to be confused with 3DES) has 56 bit keys. Your problem will require more definition in order to determine the correct choice for the key.
There is no reason to use DES today. There are far better, unbroken, algorithms available.
Why are you using the hash as an encryption key? Keys should be cryptographically secure random data, something a hash is not. Hashing itself is not encryption at all.
DES keys are 56 bits, normally packaged in 8 bytes where one bit of each byte is a parity bit. Taking the first 8 bytes from the hash therefore gives you 64 bits, of which only 56 actually act as key material; if you must use the hash as a source, those 56 bits are what you are extracting.
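One possible reading of the question (an assumption, since the reference example is not shown) is that it follows the PBES1 scheme from PKCS #5, the one usually exposed as "PBEWithMD5AndDES". In that scheme the 16 bytes produced by the iterated MD5 are split in two: the first 8 bytes are the DES key and the last 8 bytes are the IV for CBC mode. A sketch of just the derivation, with hypothetical function names and the password assumed to be already encoded to bytes:

```python
import hashlib

def pbkdf1_md5(password, salt, iterations=1000):
    """PBKDF1 from PKCS #5 with MD5: hash password||salt, then re-hash."""
    digest = hashlib.md5(password + salt).digest()
    for _ in range(iterations - 1):
        digest = hashlib.md5(digest).digest()
    return digest   # 16 bytes

def des_key_and_iv(password, salt, iterations=1000):
    """Split the 16 derived bytes the way PBES1 does."""
    derived = pbkdf1_md5(password, salt, iterations)
    return derived[:8], derived[8:]   # (DES key, CBC IV)

key, iv = des_key_and_iv(b"secret", b"\x01\x02\x03\x04\x05\x06\x07\x08")
print(len(key), len(iv))
```

If the expected output still does not match, the usual suspects are the string encoding of the password, the order of password and salt, and the padding or mode used by the DES step.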
So say I want to encrypt a file and the only way I want it to be read is if two different people enter their keys. So, for instance there are four keys:
key1, key2, key3, key4.
If I encrypt it with key1, then the following combinations will decrypt it:
key2,key3
key3,key4
key2,key4
Is this possible using a standard method?
Generate a unique content key to encrypt the message (this is common to many message encryption standards), then apply an erasure code scheme such as Reed-Solomon coding against that content key concatenated with enough additional random data to ensure that any m of n "shards" of the key can be put together to create the final key. Shards are only given out from the random data portion so that none of the shards given out contain actual bits from the content key. This way, any number of collected shards short of m does not give any useful information about the key itself.
EDIT: Reed-Solomon to generate key shards appears to be identical to Shamir's secret-sharing, first published in 1979; thanks to #caf for pointing out the article.
Generate a symmetric key key1 randomly and use it to encrypt the data, then generate key2, key3 and key4 from key1 using Shamir's Secret Sharing protocol.
To securely distribute key2, key3 and key4 you can then use a public key algorithm to encrypt them using the public keys of the recipients.
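For a feel of how the sharing step works, here is a small, unaudited sketch of Shamir's scheme with a threshold of 2 out of 3 shares, using only the standard library. The function names and the choice of the field prime 2**521 - 1 are assumptions for this example; the secret is handled as an integer.

```python
import secrets

PRIME = 2**521 - 1   # a Mersenne prime, larger than any 256-bit secret

def split_secret(secret, threshold, shares):
    """Return points on a random polynomial of degree threshold - 1
    whose constant term is the secret."""
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]
    points = []
    for x in range(1, shares + 1):
        y = 0
        for c in reversed(coeffs):      # Horner evaluation modulo PRIME
            y = (y * x + c) % PRIME
        points.append((x, y))
    return points

def recover_secret(points):
    """Lagrange interpolation of the polynomial at x = 0."""
    secret = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret

key1 = secrets.randbits(256)                      # the symmetric key
key2, key3, key4 = split_secret(key1, threshold=2, shares=3)
assert recover_secret([key2, key4]) == key1       # any two shares suffice
```

A single share on its own is one point on a random line, which is consistent with every possible value of key1.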
Say you're assigning keys x1, x2, .. xN
Encrypt the file with a master symmetric key M. Then store several encrypted copies of M:
Encrypted with x1 and x2
Encrypted with x2 and x3
Encrypted with x1 and x3
...
Any two keys will unlock one of the encrypted copies of the master, which will decrypt the file.
Not as you state it, I don't think. But you could get the same effect like this: Use public key crypto; now there are 4 public and 4 private keys. As person #1, encrypt your message with each pairwise combination of the other 3. E.g. encrypt the message with key 2, then encrypt that with key 3. Now encrypt the message with key 2, then encrypt that with key 4. Finally, 3 then 4. Now if any two of the others get together they can recover the original message.
Make the fourth key the bitwise checksum (parity) of the other three. You could even rotate which key holds the checksum value, so that
key 4 bit 1 was a checksum of bit 1 in keys 1-3, and
key 1 bit 2 was a checksum of bit 2 in keys 2-4, and
key 2 bit 3 was a checksum of bit 3 in keys 1,3,4, and
key 3 bit 4 was a checksum of bit 4 in keys 1,2,4, and
key 4 bit 5 was a checksum of bit 5 in keys 1,2,3,
etc.
This is similar to what striped RAID 5 does.
This way, no matter which three of the four keys you had, you could recreate the missing one. Use some combination of all four keys to encrypt the message.
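A simplified sketch of the parity idea, with the whole checksum kept in key 4 instead of being rotated bit by bit as described above. XOR is assumed as the "bitwise checksum", and the helper names are made up.

```python
import secrets
from functools import reduce

def xor_bytes(a, b):
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def parity_of(keys):
    """XOR all the given keys together."""
    return reduce(xor_bytes, keys)

k1, k2, k3 = (secrets.token_bytes(16) for _ in range(3))
k4 = parity_of([k1, k2, k3])          # key 4 is the parity of keys 1-3

# Any one missing key is the XOR of the other three.
rebuilt_k2 = parity_of([k1, k3, k4])
assert rebuilt_k2 == k2
```

Note that this reconstructs one missing key out of four; it needs three holders to cooperate, not two.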