From what I can see, Microsoft's RSA CSP always generates probable primes of identical bit length. So if the key size is 1024 bits, the P and Q values seem to be (?) guaranteed to be 512 bits each. Does anyone know for sure whether this is, in fact, the case?
I'm building an interoperability module between my own RSA implementation and Microsoft's. In my case I have built in a small random variance between the P and Q sizes, so for a 1024-bit key I could end up with one value being 506 bits and the other 518. On a purely experimental basis, if I lock the variance to 0 (i.e. the P and Q values are equal in size), things work the way they should; as soon as I make the sizes variable, the Microsoft RSA object responds with "Bad Data" during the import process.
I'm looking for confirmation that Microsoft enforces equal prime sizes, so if anyone has any information on it, please post.
Before someone has a chance to ask why I had to implement my own RSA provider: CryptoAPI doesn't play nice in a multithreaded environment. It locks the machine keystore on CryptoServiceProvider calls, which means (rather cryptic) "File not found" errors if it is accessed from multiple threads.
For those that care, take a look here: http://blogs.msdn.com/b/alejacma/archive/2007/12/03/rsacryptoserviceprovider-fails-when-used-with-asp-net.aspx
Microsoft's RSA CSP generates and uses private keys which it can export and import in the format described on this page, which looks like this:
BLOBHEADER blobheader;
RSAPUBKEY rsapubkey;
BYTE modulus[rsapubkey.bitlen/8];         // n = p*q
BYTE prime1[rsapubkey.bitlen/16];         // p
BYTE prime2[rsapubkey.bitlen/16];         // q
BYTE exponent1[rsapubkey.bitlen/16];      // d mod (p-1)
BYTE exponent2[rsapubkey.bitlen/16];      // d mod (q-1)
BYTE coefficient[rsapubkey.bitlen/16];    // q^-1 mod p
BYTE privateExponent[rsapubkey.bitlen/8]; // d
So private keys that the CSP can handle (and in particular generate) must have the following properties:
The modulus length, in bits, must be a multiple of 16.
The length of each prime factor must be no more than half the length of the modulus.
The private exponent must not be longer than the modulus.
The private exponent, reduced modulo p-1 (resp. q-1) must be no longer than half the modulus.
Technically, there are infinitely many possible values for the private exponent d, and similarly for exponent1 and exponent2, because all that matters mathematically is the value of d modulo p-1 and modulo q-1. It has been suggested to accept slightly longer private exponent parts when they end up with a lower Hamming weight, because this would lead to some performance benefits. Bottom line: the format described above will not let you do that.
Other characteristics that the key must have to be acceptable to Microsoft's code (but not directly reported in the description above):
The numerical value of the first prime (p, aka prime1) must be greater than the numerical value of the second prime (q, aka prime2).
The public exponent (here encoded within the rsapubkey field) must fit in a 32-bit integer (unsigned).
Therefore there are many RSA key pairs which are nominally valid per the RSA standard, but which cannot be handled by Microsoft's RSA CSP code. The last constraint, on the public exponent size, is noteworthy because it is more general than just the CSP: if you set up an SSL server where the server's public key (in its certificate) has a public exponent which does not fit in 32 bits, then Internet Explorer will not be able to connect to it.
So, in practice, if you generate RSA key pairs, you will have to make sure that they comply with the rules above. Do not worry: to the best of our knowledge, these rules do not lower security.
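As a sketch of what "comply with the rules above" means in code (all names here are mine, not part of any Microsoft API), such a pre-import check could look like this in Java:

import java.math.BigInteger;

// Illustrative pre-check: does an RSA key pair satisfy the CSP constraints above?
static boolean fitsMicrosoftCsp(BigInteger n, BigInteger e, BigInteger p,
                                BigInteger q, BigInteger d) {
    int bitlen = n.bitLength();
    BigInteger dp = d.mod(p.subtract(BigInteger.ONE)); // exponent1
    BigInteger dq = d.mod(q.subtract(BigInteger.ONE)); // exponent2
    return bitlen % 16 == 0                 // modulus bit length multiple of 16
        && p.bitLength() <= bitlen / 2      // prime1 fits in bitlen/16 bytes
        && q.bitLength() <= bitlen / 2      // prime2 fits in bitlen/16 bytes
        && d.bitLength() <= bitlen          // private exponent fits in bitlen/8 bytes
        && dp.bitLength() <= bitlen / 2     // exponent1 fits
        && dq.bitLength() <= bitlen / 2     // exponent2 fits
        && p.compareTo(q) > 0               // p must be numerically greater than q
        && e.bitLength() <= 32;             // public exponent fits in unsigned 32 bits
}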
My own work and experimentation on Mono's (managed) RSA implementation and its unit tests show that the Microsoft implementation requires specific byte[] sizes when importing RSA parameter values.
It's also a common interoperability issue (there are several SO questions about it) for people who use BigInteger to convert their parameters, since the resulting arrays are often a bit smaller (e.g. 1 byte less) than what MS expects, and need to be 0-padded.
So I'm pretty sure you can pad your smaller value to make MS accept it, but you'll likely not be able to make it accept a larger value.
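A minimal sketch of that padding fix, assuming Java's BigInteger (whose toByteArray() may return fewer bytes than the importer expects, or one extra leading 0x00 sign byte):

import java.math.BigInteger;

// Returns the big-endian encoding of value, truncated or left-padded with
// zeroes to exactly the length the importer expects (e.g. bitlen/16 for p).
static byte[] toFixedLength(BigInteger value, int length) {
    byte[] raw = value.toByteArray();     // may carry a leading 0x00 sign byte
    if (raw.length == length) return raw;
    byte[] out = new byte[length];
    if (raw.length > length) {            // drop extra leading (zero/sign) bytes
        System.arraycopy(raw, raw.length - length, out, 0, length);
    } else {                              // 0-pad on the left
        System.arraycopy(raw, 0, out, length - raw.length, raw.length);
    }
    return out;
}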
I have recently been learning about public/private key encryption in my computer science lessons, and how it works in terms of data encryption/decryption. We also covered how it can be used for digital signatures. However, we didn't go into too much detail on how the actual keys are generated themselves.
I know that it begins with a very large number, which is then passed through some kind of keygen algorithm which returns two distinctive keys, one of which is private and the other is public. Are these algorithms known or are they black box systems? And does one user always have the same pair of keys linked to them or do they ever change at any point?
It just seems like a very mathematical issue, as the keys are linked, yet one is not deducible from the other.
I know that it begins with a very large number, which is then passed through some kind of keygen algorithm which returns two distinctive keys, one of which is private and the other is public.
Well, that's not entirely correct. Most asymmetric algorithms are of course based on large numbers, but this is not a requirement. There are, for instance, algorithms based on hashing, and hashing is based on bits/bytes, not numbers.
But yes, asymmetric algorithms usually come with a specific algorithm to perform the key pair generation. For instance, an asymmetric encryption scheme consists of a triple (Gen, Enc, Dec), where Gen represents the key pair generation. And the key pair of course consists of a public and a private part.
RSA basically starts off by generating two large random primes; it doesn't necessarily start with a single number.
Are these algorithms known or are they black box systems?
They are known, and they are fundamental to the security of the system. You cannot use just any numbers to perform, e.g., RSA. Note that for RSA there are different algorithms and configurations possible; not every system will use the same Gen.
And does one user always have the same pair of keys linked to them or do they ever change at any point?
That depends on the key management of the system. Usually there is some way of refreshing or regenerating keys. For instance, X.509 certificates tend to have an end date (the expiration date), so you cannot even keep using the corresponding private key forever; you have to refresh the certificates and keys now and then.
It just seems like a very mathematical issue, as the keys are linked, yet one is not deducible from the other.
That's generally not correct. The public key is usually easy to derive from the private key. For RSA the public exponent may not be known, but it is usually set to a fixed number (65537). This together with the modulus - also part of the private key - makes the public key. For Elliptic Curve keys a private random value is first produced and the public key is directly derived from it.
You can of course never derive the private key from the public key; that would make no sense - it would not be very private if you could.
In RSA, the two generated numbers p and q are very large prime numbers of more or less the same size; they are used to calculate N, from which the public/private keys are derived using modular arithmetic.
The following answer on crypto.stackexchange.com describes in more detail how we can start from a random (large) number and use the Fermat and Miller-Rabin tests to reach a number that is very probably prime.
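As a hedged illustration of that Gen step for RSA, here is a minimal Java sketch (BigInteger.probablePrime runs Miller-Rabin internally; the fixed exponent 65537 and the equal prime sizes are common but not mandatory choices):

import java.math.BigInteger;
import java.security.SecureRandom;

// Toy RSA key-pair generation; a real implementation would retry
// if gcd(e, phi) != 1 and apply additional checks to p and q.
static void generateRsaKeyPair(int bits) {
    SecureRandom rnd = new SecureRandom();
    BigInteger e = BigInteger.valueOf(65537);              // common public exponent
    BigInteger p = BigInteger.probablePrime(bits / 2, rnd);
    BigInteger q = BigInteger.probablePrime(bits / 2, rnd);
    BigInteger n = p.multiply(q);                          // public modulus
    BigInteger phi = p.subtract(BigInteger.ONE)
                      .multiply(q.subtract(BigInteger.ONE));
    BigInteger d = e.modInverse(phi);                      // private exponent
    // (n, e) is the public key; (p, q, d) the private key material.
}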
How to determine the padding scheme used in an RSA encrypted message?
The best way is probably to perform PKCS#1 v1.5 or OAEP decryption and see if you hit gold. The unpadding will fail if the wrong algorithm is chosen.
It is also possible to perform a raw decryption and then look at the resulting scheme by representing the padded message as hexadecimals. If the resulting octet string (byte array) starts with 0002 then it's likely PKCS#1 v1.5 padding. If it starts with 00 followed by a random-looking byte, it's probably OAEP. As you can see, this is just a heuristic, not a full-fledged algorithm. Note that OAEP's MGF1 can be parameterized with a hash function, but usually SHA-1 is used.
RSA-KEM is not used much, but as it results in a completely random key seed (possibly excluding the first bit), there is no way to test for RSA-KEM other than to assume it when the ciphertext and private key can be verified to be correct and the value doesn't match one of the other schemes / lacks structure.
The protocol should define which algorithm is used. Leaving the choice of algorithm to the decryption routine opens up your implementation to attacks; the security proofs of these schemes assume a fixed algorithm, not one chosen at decryption time.
So you can use above to analyze the protocol, but please do not use it in your implementation to choose between algorithms in the field.
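For offline protocol analysis only, the heuristic above might be sketched in Java like this ("RSA/ECB/NoPadding" gives the raw, textbook RSA result; again, do not ship this as an algorithm selector):

import javax.crypto.Cipher;
import java.security.PrivateKey;

// Heuristic guess at the padding scheme of one RSA ciphertext.
static String guessPadding(byte[] ciphertext, PrivateKey priv) throws Exception {
    Cipher raw = Cipher.getInstance("RSA/ECB/NoPadding"); // raw RSA decryption
    raw.init(Cipher.DECRYPT_MODE, priv);
    byte[] em = raw.doFinal(ciphertext);  // padded message, full modulus length
    if (em[0] == 0x00 && em[1] == 0x02) return "likely PKCS#1 v1.5";
    if (em[0] == 0x00)                  return "possibly OAEP";
    return "no recognizable structure (RSA-KEM?)";
}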
When encrypting with AES, you need a key size of either 128, 192 or 256 bits. But on various encryption websites you can use any key, even one that is only 1 character (8 bits) long.
http://aesencryption.net/
For example, on that website I can use any key I want and it will encrypt/decrypt just fine.
How does that work? How is it possible to use keys that aren't even the correct length?
Many encryption tools (and libraries) allow you to provide a 'password', which they use to derive an appropriately sized key. To prevent ambiguity, the term cryptographic key is often used to refer to the N-bit key used with an encryption algorithm.
If you look at the code on the page you linked, it's calculating a SHA-1 hash of the key you gave it, and taking the first 16 bytes as a 128-bit cryptographic key.
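A short Java sketch of what that amounts to (this mimics the site's behaviour; it is not a recommended way to derive keys, see the KDF discussion below):

import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.Arrays;
import javax.crypto.spec.SecretKeySpec;

// SHA-1 the password, keep the first 16 bytes as an AES-128 key.
static SecretKeySpec keyFromPassword(String password) throws Exception {
    byte[] digest = MessageDigest.getInstance("SHA-1")
            .digest(password.getBytes(StandardCharsets.UTF_8));
    return new SecretKeySpec(Arrays.copyOf(digest, 16), "AES");
}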
PHP, OpenSSL (and others)
Many web sites specifically use PHP's mcrypt_encrypt. PHP mcrypt used to accept keys and IVs of any size: a key of a size unsupported by the algorithm was zero-extended to the next supported size larger than the given key, and if the key was too large, it was cut down to the largest supported key size.
For PHP this changed in 5.6.0:
Invalid key and iv sizes are no longer accepted. mcrypt_encrypt() will now throw a warning and return FALSE if the inputs are invalid. Previously keys and IVs were padded with '\0' bytes to the next valid size.
This will probably break quite a few sites.
Note that this kind of key expansion is absolutely not best practice, and it is more than just frowned upon by cryptography experts. Instead a key derivation function or KDF should be used.
Hashing with a cryptographic hash function
Hashing with a cryptographic hash such as MD5 or SHA-1 can be used as a poor man's KDF. It doesn't provide the protection that a PBKDF offers, though (see below). It is relatively safe to take the (leftmost) bytes if a key of shorter size is required. If a hash is used, it should be clear from the API or source code.
This seems to be the method used in the example in the question.
Seeding a random number generator
Unless it is abundantly clear from the API which algorithm is used, and that a given seed is not mixed with previous seed data in the DRBG (e.g. by the operating system), this method should not be used. In general, using the key/password as the seed of a random number generator will lead to catastrophic failure. This method should be fought with all possible means. A random number generator is not a KDF. Unfortunately there are many people following bad examples.
Password encryption
Instead, for password based encryption (PBE), a PBKDF (Password Based Key Derivation Function) should be used. Examples of PBKDFs are PBKDF2, bcrypt and scrypt. This is usually explicit in the API or clearly visible in source code. A good PBKDF uses a salt, possibly a pepper (secret value), and a work factor or iteration count. This makes the password - which usually contains less entropy than a full key - somewhat safer. It won't protect against really weak passwords though.
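For instance, PBKDF2 is available in standard Java; a minimal sketch (the salt and iteration count below are placeholders, not recommendations):

import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

// Derives a 128-bit AES key from a password with PBKDF2-HMAC-SHA256.
// The salt should be random and stored next to the ciphertext.
static byte[] deriveKey(char[] password, byte[] salt) throws Exception {
    PBEKeySpec spec = new PBEKeySpec(password, salt, 100_000, 128);
    return SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256")
                           .generateSecret(spec).getEncoded();
}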
Secret key derivation
If you have a secret that does contain enough entropy, then the salt and work factor are not needed (a salt can however make your KDF much more secure). A work factor only adds a constant amount of time to your key derivation: if brute force attacks are already infeasible because of the amount of entropy, the work factor will only slow down the intended user and CPU. Arguably the most advanced KBKDF (key-based KDF) currently is HKDF. It may be tricky to find KDFs implemented in cryptographic libraries.
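If a library implementation is hard to find, HKDF (RFC 5869) is just two HMAC passes and can be sketched directly; this minimal Java version assumes HMAC-SHA-256, a non-empty salt, and an output of at most 32 bytes (one expand block):

import java.util.Arrays;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Minimal HKDF: extract with the salt, then one expand block with the info.
static byte[] hkdf(byte[] ikm, byte[] salt, byte[] info, int length) throws Exception {
    Mac mac = Mac.getInstance("HmacSHA256");
    mac.init(new SecretKeySpec(salt, "HmacSHA256")); // salt must be non-empty here
    byte[] prk = mac.doFinal(ikm);                   // extract step
    mac.init(new SecretKeySpec(prk, "HmacSHA256"));
    mac.update(info);
    mac.update((byte) 0x01);                         // counter byte of block T(1)
    return Arrays.copyOf(mac.doFinal(), length);     // expand step, first block only
}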
The http://aesencryption.net/ algorithm takes a key in string form and remaps it to an array which has a length accepted by Rijndael. If longer than 256 bits, the key is truncated to that length, otherwise it is padded with '\0' bytes until it reaches one of the accepted lengths for the algorithm, that is 128, 160, 192, 224 or 256 bits.
I reproduced the behaviour of this site by taking the key, converting it to an array and eventually truncating / padding it.
You can use the algorithm below to reproduce the key transformation of the site http://aesencryption.net
public static byte[] transformKey(String inputKey) {
    // Requires: import java.util.Arrays;
    byte[] sessionKey = inputKey.getBytes();
    // Round the bit length up to the next size Rijndael accepts, capped at 256 bits.
    int keySize = Math.min(((((sessionKey.length * 8 - 128) / 32) + 1) * 32) + 128, 256) / 8;
    // copyOf truncates, or zero-pads, to exactly keySize bytes.
    sessionKey = Arrays.copyOf(sessionKey, keySize);
    for (int i = inputKey.getBytes().length; i < sessionKey.length; i++) {
        sessionKey[i] = '\0';
    }
    return sessionKey;
}
NOTE: the for loop is useless, because Arrays.copyOf already pads the new positions with zeroes.
I want to compare a hash function and an RSA encryption using some common parameter.
I have an algorithm that uses some hash functions, and I want to claim that the computational load of these hashes is less than that of one RSA operation.
Can I compare them by the number of multiplications, for example how many multiplications each of them requires?
How can I compare them in terms of communication load? How can I determine the length of the output in RSA?
It sounds like you're trying to compare apples and oranges.
A hash function is generally expected to accept arbitrarily long inputs, and the time needed to compute it should generally scale linearly with the length of the input. Thus, a useful measure of hash function performance would be, say, "megabytes per second".
(Specifically, that would be a measure of throughput, which is the relevant measure when hashing long inputs. For short messages, a more relevant measure is the latency, which is basically the minimum time needed to hash zero-length input. Given the throughput and the latency, one can generally calculate a fairly good approximation of the time needed to hash an input of any given length as time = latency + length / throughput.)
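(For example, with purely made-up numbers: a hash with a throughput of 500 MB/s and a latency of 1 µs needs about 1 µs + 10 MB / (500 MB/s) ≈ 20 ms for a 10 MB input, i.e. the latency term is negligible for long inputs.)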
RSA, on the other hand, can only encrypt messages shorter than the modulus, which is chosen at the time the key is generated. (Typical modulus sizes might be, say, from 1024 to 4096 bits.) To "encrypt a long message with RSA" one would normally use hybrid encryption: first encrypt the message using a symmetric cipher like AES, using a suitable mode of operation and a randomly chosen key, and then encrypt the AES key with RSA.
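A hedged Java sketch of such hybrid encryption (the algorithm strings are standard JCE names; GCM IV handling is omitted for brevity but required in real code):

import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import java.security.PublicKey;

// AES encrypts the bulk data; RSA-OAEP wraps only the short AES key.
static byte[][] hybridEncrypt(byte[] message, PublicKey rsaPub) throws Exception {
    KeyGenerator kg = KeyGenerator.getInstance("AES");
    kg.init(128);
    SecretKey aesKey = kg.generateKey();

    Cipher aes = Cipher.getInstance("AES/GCM/NoPadding");
    aes.init(Cipher.ENCRYPT_MODE, aesKey);
    byte[] body = aes.doFinal(message);           // transmit aes.getIV() as well

    Cipher rsa = Cipher.getInstance("RSA/ECB/OAEPWithSHA-256AndMGF1Padding");
    rsa.init(Cipher.WRAP_MODE, rsaPub);
    byte[] wrappedKey = rsa.wrap(aesKey);         // the only RSA operation needed

    return new byte[][] { wrappedKey, body };
}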
The same length limits apply to signing messages with RSA — by itself, RSA can only sign messages shorter than the modulus. The standard workaround in this case is to first hash the message, and then sign the hash value. (There are also a lot of important details, like padding, that I'm not going to go into here, since we're not on crypto.SE, but which are absolutely crucial for security.)
The point is that, in both cases, the RSA operation itself takes a fixed amount of time regardless of the message length, and thus, for sufficiently long messages, most of the time will be consumed by AES or the hash function, not by RSA itself. So when you say you want to "claim that computation load of these hashes is less than one RSA", I would say that's meaningless, at least unless you fixed a specific input length for your hash. (And if you did, my next question would be "what's so special about that particular input length?")
I'm working on a new licensing scheme for my software, based on OpenSSL public / private key encryption. My past approach, based on this article, was to use a large private key to encrypt an SHA-1 hashed string, which I sent to the customer as a license file (the base64-encoded hash is about a paragraph in length). I know someone could still easily crack my application, but it prevented someone from making a key generator, which I think would hurt more in the long run.
For various reasons I want to move away from license files and simply email a 16-character base32 string the customer can type into the application. Even using small private keys (which I understand are trivial to crack), it's hard to get the encrypted hash this small. Would there be any benefit to using the same strategy to generate an encrypted hash, but simply using the first 16 characters as a license key? If not, is there a better alternative that will create keys in the format I want?
DSA signatures are significantly shorter than RSA ones. DSA signatures are twice the size of the Q parameter; if you use the OpenSSL defaults, Q is 160 bits, so your signatures fit into 320 bits.
If you can switch to a base-64 representation (which only requires upper- and lowercase letters, the digits and two other symbols) then you will need 54 symbols, which you could do with 11 groups of 5. Not quite the 16 that you wanted, but still within the bounds of being user-enterable.
Actually, it occurs to me that you could halve the number of bits required in the license key. DSA signatures are made up of two numbers, R and S, each the size of Q. However, the R values can all be pre-computed by the signer (you) - the only requirement is that you never re-use them. So this means that you could precalculate a whole table of R values - say 1 million of them (taking up 20MB) - and distribute these as part of the application. Now when you create a license key, you pick the next unused R value and generate the S value. The license key itself only contains the index of the R value (needing only 20 bits) and the complete S value (160 bits).
And if you're getting close to selling a million copies of the app - a nice problem to have - just create a new version with a new R table.
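To make the sizes concrete, here is a sketch of packing such a key (BigInteger's base-32 digits 0-9a-v are used purely for illustration; a real scheme would pick its own alphabet):

import java.math.BigInteger;

// Packs a 20-bit R-table index and a 160-bit S value into 180 bits,
// rendered as 36 base-32 symbols (180 / 5 bits per symbol).
static String licenseKey(int rIndex, BigInteger s) {
    BigInteger packed = BigInteger.valueOf(rIndex).shiftLeft(160).or(s);
    StringBuilder key = new StringBuilder(packed.toString(32));
    while (key.length() < 36) key.insert(0, '0');  // fixed width for easy parsing
    return key.toString();
}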
Did you consider using some existing protection + key generation scheme? I know that EXECryptor (I am not advertising it at all, this is just some info I remember) offers strong protection which, together with a complementary product of the same guys, StrongKey (if memory serves), offers short keys and protection against cracking. Armadillo is another product name that comes to mind, though I don't know what level of protection they offer now. But they also had short keys earlier.
In general, cryptographically strong short keys are based on some aspects of ECC (elliptic curve cryptography). A large part of ECC is patented, and overall ECC is hard to implement right, so an industry solution is the preferred way to go.
Of course, if you don't need strong keys, you can go with just a hash of a "secret word" (salt) + user name, and verify it in the application, but this is crackable in minutes.
Why use public key crypto? It gives you the advantage that nobody can reverse-engineer the executable to create a key generator, but key generators are a somewhat secondary risk compared to patching the executable to skip the check, which is generally much easier for an attacker, even with well-obfuscated executables.
Eugene's suggestion of using ECC is a good one - ECC keys are much shorter than RSA or DSA for a given security level.
However, 16 characters in base 32 is still only 5*16=80 bits, which is low enough that brute-forcing for valid keys might be practical, regardless of what algorithm you use.