I'm trying to decrypt an encrypted h264 I-frame, and I was given a key of length 15, is this even valid?
Shouldn't it be of length 16, so the binary representation would be 128 bits?
If you have a thing you could type on a keyboard, that is not a proper AES key, no matter the length. AES derives its power from the fact that its key is effectively random. Anything you can type on a keyboard is not an effectively random sequence of equivalent length. There are only about 96 characters you can type easily on a Latin-style keyboard. A byte has 256 values. 96^16 is a minuscule fraction of 256^16.
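To put rough numbers on that comparison (a back-of-the-envelope calculation that takes the figure of 96 typeable characters at face value and assumes each character is chosen uniformly at random, which real passwords are not):

```latex
\[
\frac{96^{16}}{256^{16}} = \left(\frac{96}{256}\right)^{16} = 0.375^{16} \approx 1.5 \times 10^{-7},
\qquad
\log_2\!\left(96^{16}\right) = 16 \log_2 96 \approx 105.4 \text{ bits, versus } 128 \text{ bits for a random key.}
\]
```

For the 15-character string in the question the same estimate gives at most about 15 x 6.585 = 98.8 bits.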
To convert a "password" that a human could type into an effectively random AES key, you need a password-based key derivation function (PBKDF). The most famous and widely available is PBKDF2. There are other excellent PBKDFs including scrypt and Argon2. All of them require a random salt, and all are (in cryptographic terms) very slow to compute.
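As a minimal sketch of what such a derivation can look like with the JDK's built-in PBKDF2 (the class name, salt size and iteration count are my own illustrative choices, not something the framework in the question is known to use):

```java
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper; the parameters below are assumptions for illustration only.
final class PasswordKeyDerivation {
    static final int SALT_BYTES = 16;
    static final int ITERATIONS = 100_000;

    /** Generates a fresh random salt; it is stored next to the ciphertext, not kept secret. */
    static byte[] newSalt() {
        byte[] salt = new byte[SALT_BYTES];
        new SecureRandom().nextBytes(salt);
        return salt;
    }

    /** Stretches a typed password into a 128-bit AES key with PBKDF2-HMAC-SHA256. */
    static SecretKeySpec deriveAesKey(char[] password, byte[] salt) {
        PBEKeySpec spec = new PBEKeySpec(password, salt, ITERATIONS, 128);
        try {
            SecretKeyFactory factory = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256");
            return new SecretKeySpec(factory.generateSecret(spec).getEncoded(), "AES");
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        } finally {
            spec.clearPassword();
        }
    }
}
```

The decrypting side has to be given the same salt and iteration count, otherwise it derives a different key.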
That said, regarding your framework, it is not possible to guess how they have converted this string into a key. You must consult the documentation or the implementation. There is an unbounded number of ways to convert strings into keys (most of them are terrible, but there is still an unbounded selection to pick from). As Michael Fehr noted, they might have done something insecure like padding with zeros. They might also have used a simple hashing function like SHA-256 and either used a 256-bit key or taken the top or bottom 128 bits. Or…almost literally anything else. There is no common practice here. Each encryption system has to document how it is implemented.
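To illustrate why guessing is hopeless, here is a sketch of three of the conversions mentioned above. The class and method names are hypothetical, and UTF-8 encoding is an assumption; none of this is known to be what the framework actually does.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Three hypothetical guesses at how a 15-character string could become a 128-bit key.
final class CandidateKeyConversions {
    /** Guess 1: raw bytes, zero-padded (or truncated) to 16 bytes. */
    static byte[] zeroPadded(String s) {
        return Arrays.copyOf(s.getBytes(StandardCharsets.UTF_8), 16);
    }

    /** Guess 2: the first 128 bits of SHA-256 over the UTF-8 bytes. */
    static byte[] sha256First16(String s) {
        return Arrays.copyOf(sha256(s), 16);
    }

    /** Guess 3: the last 128 bits of the same SHA-256 digest. */
    static byte[] sha256Last16(String s) {
        return Arrays.copyOfRange(sha256(s), 16, 32);
    }

    private static byte[] sha256(String s) {
        try {
            return MessageDigest.getInstance("SHA-256").digest(s.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 should be available on every Java platform", e);
        }
    }
}
```

All three return 16 bytes for the same input string, and all three are different keys, so a ciphertext produced with one cannot be decrypted with another.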
(Note that even if you see "AES-128," this is also ambiguous. It can mean "AES with a 128-bit key" or it can mean "AES with a 128-bit block and a key of 128, 192 or 256 bits." While the former meaning is a bit more common, the latter occurs often, for example in Apple documentation, despite being redundant (AES always has a 128-bit block). So even questions like "how long is the key" require digging into the documentation or the implementation. Cryptography is unfortunately incredibly unstandardized.)
Shouldn't it be of length 16, so the binary representation would be 128 bits?
You are right. For AES, only key lengths of 128, 192 or 256 bits are valid.
I commonly see two possibilities for having a key of different length:
You were given a password, not a key. In that case you also need to ask how the key is generated from the password (Hash? PBKDF2? Other?)
Many frameworks will silently accept a key of a different length and then trim or zero-pad the value to fit the required key size. IMHO this is not a proper approach, as it gives developers the feeling that the key is good while in reality a different (padded or trimmed) value is used.
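The alternative to silent padding is to fail loudly. A minimal sketch (the helper name is made up for illustration):

```java
// Hypothetical guard: reject anything that is not a real AES key length.
final class AesKeyLengthCheck {
    /** Returns the key unchanged if its length is valid for AES, otherwise throws. */
    static byte[] requireValidAesKey(byte[] key) {
        int n = key.length;
        if (n != 16 && n != 24 && n != 32) {
            throw new IllegalArgumentException(
                    "AES key must be 16, 24 or 32 bytes, got " + n + " (is this a password?)");
        }
        return key;
    }
}
```

With a check like this, the 15-byte value from the question is rejected immediately instead of being turned into some other key behind the developer's back.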
I'm developing an encryption protocol where I have to create an AES key in CTR mode. I decided to keep the key length 128 bits in length, as shorter key size would mean less computing power for mobile devices.
Now, to create this key, I use PBKDF2, which lets me set certain parameters such as the hash function and the iteration count, and which derives a key from some initial information, such as a password, which is what I have. As SHA-1 is broken, I wanted to use SHA-256 as the hash for the key derivation function, but I don't understand if that is possible. As I want the key to be 128 bits, and SHA-256 produces 256 bits, is PBKDF2 capable of doing that?
AES-256 is not much slower than AES-128: key setup is slightly slower, and every block only needs 4 more rounds (14 instead of 10, which means 15 round keys instead of 11). So it's about 40% slower at most, and with modern phones having dedicated AES instruction sets probably even less.
PBKDF2 can output a key of almost any size. The building block "random function" is mostly HMAC-SHA1 or HMAC-SHA256 (not SHA1 or SHA256 directly, but most APIs only accept a hash function as a parameter and do the HMAC implicitly). Either one can produce 256-bit, 128-bit or 10000-byte keys (not that you need that large a key anyway). With HMAC-SHA256 it's equally cheap or expensive to derive a 256-bit or a 128-bit key (the latter is a truncated version of the former, but that's no issue); it's the same work. With HMAC-SHA1 (which is as secure as HMAC-SHA256 for PBKDF2) it's more work to get a 256-bit key, as 256 bits is more than the 160-bit digest size, so a second output block has to be computed.
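The difference in work comes from how PBKDF2 assembles its output from digest-sized blocks. Using the notation of the PBKDF2 specification (hLen is the PRF output length, dkLen the requested key length, c the iteration count), the arithmetic is roughly:

```latex
\[
l = \left\lceil \frac{dkLen}{hLen} \right\rceil, \qquad \text{PRF calls} \approx l \cdot c
\]
\[
\text{HMAC-SHA1 } (hLen = 160 \text{ bits}): \quad l = \lceil 128/160 \rceil = 1, \qquad l = \lceil 256/160 \rceil = 2
\]
\[
\text{HMAC-SHA256 } (hLen = 256 \text{ bits}): \quad l = \lceil 128/256 \rceil = 1, \qquad l = \lceil 256/256 \rceil = 1
\]
```

So a 256-bit key from HMAC-SHA1 costs two blocks of c iterations each, while HMAC-SHA256 needs one block for either key size.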
So use AES-256 and PBKDF2-(HMAC)-SHA256; there should be no performance issue.
First of all, AES-256 is not much slower than AES-128. See this quote from Cryptography:
CPU overhead (+20% for a 192-bit key, +40% for a 256-bit key:
PBKDF2 produces its output in blocks the size of the PRF's output. In your case the PRF is HMAC with SHA-256, so each block is 256 bits.
The PBKDF2 function takes a dkLen parameter, the desired key length:
PBKDF2(PRF, Password, Salt, c, dkLen)
When you request 128 bits through this parameter, you get a 128-bit key. The output is a prefix of the full 256-bit output: you get its first 128 bits.
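The truncation can be checked directly with the JDK's PBKDF2 implementation. This is only a demonstration; the password, salt and iteration count are placeholders I picked, and a real salt should be random:

```java
import java.security.GeneralSecurityException;
import java.util.Arrays;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

final class Pbkdf2Truncation {
    /** PBKDF2-HMAC-SHA256 with the requested output length in bits. */
    static byte[] pbkdf2(char[] password, byte[] salt, int iterations, int dkLenBits) {
        try {
            PBEKeySpec spec = new PBEKeySpec(password, salt, iterations, dkLenBits);
            return SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256")
                    .generateSecret(spec).getEncoded();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        char[] password = "correct horse".toCharArray();   // placeholder input
        byte[] salt = {1, 2, 3, 4, 5, 6, 7, 8};            // fixed only to make the demo repeatable
        byte[] k128 = pbkdf2(password, salt, 10_000, 128);
        byte[] k256 = pbkdf2(password, salt, 10_000, 256);
        // Compares the 128-bit output with the first 16 bytes of the 256-bit output.
        System.out.println(Arrays.equals(k128, Arrays.copyOf(k256, 16)));
    }
}
```

Note that Java's PBEKeySpec takes the length in bits, whereas the specification's dkLen is counted in octets.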
You can see it from implementations as here
When encrypting with AES, you need to have a key size of either 128, 192 or 256 bits. But on various encrypting websites you can use any key that can even be 1 character long (8 bits).
http://aesencryption.net/
For example, on that website I can use any key I want and it will encrypt/decrypt just fine.
How does that work? How is it possible to use keys that aren't even the correct length?
Many encryption tools (and libraries) allow you to provide a 'password', which they use to derive an appropriately sized key. In order to prevent ambiguity, the term cryptographic key is often used to refer to the N-bit key used with an encryption algorithm.
If you look at the code on the page you linked, it's calculating a SHA-1 hash of the key you gave it, and taking the first 16 bytes as a 128-bit cryptographic key.
PHP, OpenSSL (and others)
Many web sites specifically use PHP mcrypt_encrypt. PHP mcrypt used to accept keys and IVs of any size. A key of a size not supported by the algorithm was extended to the next supported size larger than the number of key bytes given. If the key was too large, it was cut down to the largest key size.
For PHP this changed in 5.6.0:
Invalid key and iv sizes are no longer accepted. mcrypt_encrypt() will now throw a warning and return FALSE if the inputs are invalid. Previously keys and IVs were padded with '\0' bytes to the next valid size.
This will probably break quite a few sites.
Note that this kind of key expansion is absolutely not best practice; cryptography experts do more than just frown upon it. Instead a key derivation function or KDF should be used.
Hashing using a hash function
Hashing using a cryptographic hash such as MD5 or SHA-1 can be used as a poor man's KDF. It doesn't provide the protection that a PBKDF offers though (see below). It is relatively safe to take the (leftmost) bytes if a key of shorter size is required. If a hash is used it should be clear from the API or source code.
This seems to be the method used in the example in the question.
Seeding a random number generator
Unless it is abundantly clear from the API which algorithm is used, and that the DRBG does not mix a given seed with previous seed data (e.g. from the operating system), this method should not be used. In general, using the key/password as the seed to a random number generator will lead to catastrophic failure. This method should be fought with all possible means. A random number generator is not a KDF. Unfortunately there are many people following bad examples.
Password encryption
Instead, for password based encryption (PBE), a PBKDF (Password Based Key Derivation Function) should be used. Examples of PBKDFs are PBKDF2, bcrypt or scrypt. This is usually explicit in the API or clearly visible in source code. A good PBKDF uses a salt, possibly a pepper (secret value) and a work factor or iteration count. This makes the password, which usually contains less entropy than a full key, somewhat safer to use. It won't protect against really weak passwords though.
Secret key derivation
If you have a secret that does contain enough entropy then the salt and work factor are not needed (a salt can however make your KDF much more secure). A work factor only adds a constant amount of time to your key derivation - if brute force attacks are already not feasible because of the amount of entropy the work factor will only slow down the intended user and CPU. Arguably the most advanced KBKDF currently is HKDF. It may be tricky to find KDFs implemented in cryptographic libraries.
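Where a library offers no HKDF, it can be built from HMAC. Below is a minimal sketch of HKDF with HMAC-SHA256 following the extract-then-expand construction of RFC 5869. The class name is made up, and for production use a vetted library implementation is preferable to hand-rolled code like this.

```java
import java.security.GeneralSecurityException;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Illustrative sketch only, not a reviewed implementation.
final class HkdfSha256Sketch {
    private static final int HASH_LEN = 32;

    private static byte[] hmac(byte[] key, byte[]... parts) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(key, "HmacSHA256"));
            for (byte[] part : parts) {
                mac.update(part);
            }
            return mac.doFinal();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Extracts a pseudorandom key from the secret, then expands it to length bytes. */
    static byte[] derive(byte[] secret, byte[] salt, byte[] info, int length) {
        if (length < 1 || length > 255 * HASH_LEN) {
            throw new IllegalArgumentException("length out of range");
        }
        // Extract: a missing salt is replaced by HASH_LEN zero bytes.
        byte[] prk = hmac(salt == null || salt.length == 0 ? new byte[HASH_LEN] : salt, secret);
        // Expand: T(i) = HMAC(PRK, T(i-1) | info | i), concatenated and truncated.
        byte[] okm = new byte[length];
        byte[] previous = new byte[0];
        int pos = 0;
        for (int i = 1; pos < length; i++) {
            previous = hmac(prk, previous, info, new byte[] {(byte) i});
            int n = Math.min(HASH_LEN, length - pos);
            System.arraycopy(previous, 0, okm, pos, n);
            pos += n;
        }
        return okm;
    }
}
```

The info parameter is what lets one secret yield several independent keys, for example one for encryption and one for authentication.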
The http://aesencryption.net/ algorithm takes a key in string form and remaps it to an array which has a length accepted by Rijndael. If longer than 256 bits, the key is truncated to that length, otherwise it is padded with '\0' bytes until it reaches one of the accepted lengths for the algorithm, that is 128, 160, 192, 224 or 256 bits.
I reproduced the behaviour of this site by taking the key, converting it to an array and eventually truncating / padding it.
You can use the algorithm below to reproduce the key transformation of the site http://aesencryption.net:
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public static byte[] transformKey(String inputKey) {
    // Assumes the key string is encoded as UTF-8.
    byte[] keyBytes = inputKey.getBytes(StandardCharsets.UTF_8);
    // Round up to the next accepted size (128, 160, 192, 224 or 256 bits), capped at 256 bits.
    int keyBits = Math.min(Math.max((keyBytes.length * 8 + 31) / 32 * 32, 128), 256);
    // Arrays.copyOf truncates a longer key and pads a shorter one with zero bytes.
    return Arrays.copyOf(keyBytes, keyBits / 8);
}
NOTE: no explicit padding loop is needed because Arrays.copyOf already pads the array with zeroes.
I came across this:
I don't understand how AES128 is stronger than AES256 in a brute force attack, or how AES256 allows for more combinations than AES128.
These are my simplified premises - assuming I have 100 unique characters on my keyboard, and my ideal password length is 10 characters - there would be 100^10 (or 1x10^20) combinations for a brute force attack to decrypt a given cipher text.
In that case, whether or not AES128 or AES256 is applied doesn't make a difference - please correct me.
Yes, you are correct (in that a weak password will negate the difference between AES128 and AES256 and make bruteforcing as complex as the password is). But this applies only to the case when the password is the only source for key generation.
In normal use, AES keys are generated by a "truly" random source and never by a simple pseudorandom generator (like C++ rand()).
AES256 is "more secure" than AES128 because it has 256-bit key - that means 2^256 possible keys to bruteforce, as opposed to 2^128 (AES128). The numbers of possible keys are shown in your table as "combinations".
Personally, I use KeePass and passwords of 20 symbols and above.
Using a 20-symbol password composed of small and capital letters (26+26), digits (10) and special symbols (around 20) gives (26+26+10+20)^20 = 1.89*10^38 possible combinations - comparable to an AES128 key.
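To compare passwords and keys on the same scale, it helps to express the number of combinations in bits. A small sketch (the helper name is made up, and it assumes every symbol is picked uniformly at random, which is only realistic for generated passwords):

```java
final class KeySpaceBits {
    /** log2(alphabetSize ^ length): the size of the search space expressed in bits. */
    static double bits(int alphabetSize, int length) {
        return length * (Math.log(alphabetSize) / Math.log(2));
    }

    public static void main(String[] args) {
        System.out.printf("10 symbols out of 100: %.1f bits%n", bits(100, 10));
        System.out.printf("20 symbols out of 82:  %.1f bits%n", bits(82, 20));
    }
}
```

The 10-character password from the question comes out at roughly 66 bits, far below either AES key size, while the 20-symbol password reaches roughly 127 bits, which is where the comparison with a 128-bit key comes from.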
how AES128 is stronger than AES256 in a brute force attack
AES does multiple rounds of transforming each chunk of data, and it uses different portions of the key in these different rounds. The specification for which portions of the key get used when is called the key schedule. The key schedule for 256-bit keys is not as well designed as the key schedule for 128-bit keys. And in recent years there has been substantial progress in turning those design problems into potential attacks on AES 256. This is the basis for advice on key choice.
how AES256 allows for more combinations than AES128
AES256 uses a 256-bit key, giving you around 2^256 possible combinations, while with a 128-bit key it is 2^128.
These are my simplified premises - assuming I have 100 unique characters on my keyboard, and my ideal password length is 10 characters - there would be 100^10 (or 1x10^20) combinations for a brute force attack to decrypt a given cipher text.
I am not quite sure what your understanding is, but when you apply AES128/AES256, you actually encrypt your password into a cipher text. It is encoded information because it contains a form of the original plaintext that is unreadable by a human. It won't just use the 100 unique characters from your keyboard; it uses more than that. So, if you want to get the original password, you must find the key with which it was encrypted. And that gives you the combination figures 2^128 and 2^256.
I am creating an encryption scheme with AES in cbc mode with a 256-bit key. Before I learned about CBC mode and initial values, I was planning on creating a 32-bit salt for each act of encryption and storing the salt. The password/entered key would then be padded with this salt up to 32 bits.
ie. if the pass/key entered was "tree," instead of padding it with 28 0s, it would be padded with the first 28 chars of this salt.
However, this was before I learned of the iv, also called a salt in some places. The question for me has now arisen as to whether or not this earlier method of salting has become redundant in principle with the IV. This would be to assume that the salt and the iv would be stored with the cipher text and so a theoretical brute force attack would not be deterred any.
Storing this key and using it rather than 0s is a step that involves some effort, so it is worth asking I think whether or not it is a practically useless measure. It is not as though there could be made, with current knowledge, any brute-force decryption tables for AES, and even a 16 bit salt pains the creation of md5 tables.
Thanks,
Elijah
It's good that you know CBC, as it is certainly better than using ECB mode encryption (although even better modes such as the authenticated modes GCM and EAX exist as well).
I think there are several things that you should know about, so I'll explain them here.
Keys and passwords are not the same. Normally you create a key used for symmetric encryption out of a password using a key derivation function. The most common one discussed here is PBKDF2 (password based key derivation function #2), which is used for PBE (password based encryption). This is defined in the latest, open PKCS#5 standard by RSA labs. Before feeding the password into the function, you need to check that it is correctly translated into bytes (character encoding).
The salt is used as another input of the key derivation function. It is used to prevent brute force attacks using "rainbow tables" where keys are pre-computed for specific passwords. Because of the salt, the attacker cannot use pre-computed values, as he cannot generate one for each salt. The salt should normally be 8 bytes (64 bits) or longer; using a 128 bit salt would give you optimum security. The salt also ensures that identical passwords (of different users) do not derive the same key.
The output of the key derivation function is a secret of dkLen bytes, where dkLen is the length of the key to generate, in bytes. As an AES key does not contain anything other than these bytes, the AES key will be identical to the generated secret. dkLen should be 16, 24 or 32 bytes for the key lengths of AES: 128, 192 or 256 bits.
OK, so now you finally have an AES key to use. However, if you simply encrypt each plain text block with this key, you will get identical result if the plain text blocks are identical. CBC mode gets around this by XOR'ing the next plain text block with the last encrypted block before doing the encryption. That last encrypted block is the "vector". This does not work for the first block, because there is no last encrypted block. This is why you need to specify the first vector: the "initialization vector" or IV.
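The effect is easy to see with two identical plaintext blocks. The sketch below uses the JDK's AES with no padding; the helper names are mine, and the all-zero key and IV in the usage note are for demonstration only.

```java
import java.security.GeneralSecurityException;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

final class EcbVersusCbc {
    /** Encrypts data (a multiple of 16 bytes) without padding; iv == null selects ECB. */
    static byte[] encrypt(byte[] key, byte[] iv, byte[] data) {
        try {
            Cipher cipher = Cipher.getInstance(iv == null ? "AES/ECB/NoPadding" : "AES/CBC/NoPadding");
            SecretKeySpec aesKey = new SecretKeySpec(key, "AES");
            if (iv == null) {
                cipher.init(Cipher.ENCRYPT_MODE, aesKey);
            } else {
                cipher.init(Cipher.ENCRYPT_MODE, aesKey, new IvParameterSpec(iv));
            }
            return cipher.doFinal(data);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** True if the first two 16-byte ciphertext blocks are identical. */
    static boolean firstTwoBlocksEqual(byte[] ciphertext) {
        return Arrays.equals(Arrays.copyOfRange(ciphertext, 0, 16),
                             Arrays.copyOfRange(ciphertext, 16, 32));
    }
}
```

Calling encrypt(new byte[16], null, new byte[32]) yields two identical ciphertext blocks, while encrypt(new byte[16], new byte[16], new byte[32]) does not, because the second block is XOR'ed with the first ciphertext block before encryption.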
The block size of AES is 16 bytes independent of the key size. So the vectors, including the initialization vector, need to be 16 bytes as well. Now, if you only use the key to encrypt e.g. a single file, then the IV could simply contain 16 bytes with the value 00h.
This does not work for multiple files, because if the files contain the same text, you will be able to detect that the first part of the encrypted file is identical. This is why you need to specify a different IV for each encryption you perform with the key. It does not matter what it contains, as long as it is unique, 16 bytes and known to the application performing the decryption.
[EDIT 6 years later] The above part is not entirely correct: for CBC the IV needs to be unpredictable to an attacker, it doesn't just need to be unique. So for instance a counter cannot be used.
Now there is one trick that might allow you to use all zeros for the IV all the time: for each plain text you encrypt using AES-CBC, you could calculate a key using the same password but a different salt. In that case, you will only use the resulting key for a single piece of information. This might be a good idea if you cannot provide an IV for a library implementing password based encryption.
[EDIT] Another commonly used trick is to use additional output of PBKDF2 to derive the IV. This way the official recommendation that the IV for CBC should not be predictable by an adversary is fulfilled. You should however make sure that you do not ask for more output from the PBKDF2 function than the underlying hash function can deliver. PBKDF2 has weaknesses that would enable an adversary to gain an advantage in such a situation. So do not ask for more than 256 bits if SHA-256 is used as the hash function for PBKDF2. Note that SHA-1 is the common default for PBKDF2, so that only allows for a single 128 bit AES key.
IVs and salts are completely separate terms, although often confused. In your question, you also confuse bits and bytes, key size and block size and rainbow tables with MD5 tables (nobody said crypto is easy). One thing is certain: in cryptography it pays to be as secure as possible; redundant security is generally not a problem, unless you really (really) cannot afford the extra resources.
When you understand how this all works, I would seriously advise you to find a library that performs PBE encryption. You might just need to feed this the password, salt, plain data and, if separately configured, the IV.
[Edit] You should probably look for a library that uses Argon2 by now. PBKDF2 is still considered secure, but it does give unfair advantage to an attacker in some cases, letting the attacker perform fewer calculations than the regular user of the function. That's not a good property for a PBKDF / password hash.
If you are talking about AES-CBC then it is an Initialisation Vector (IV), not a salt. It is common practice to send the IV in clear as the first block of the enciphered message. The IV does not need to be kept secret. It should however be changed with every message - a constant IV means that effectively your first block is encrypted in ECB mode, which is not properly secure.
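A sketch of that practice with the JDK's AES-CBC: a fresh random IV per message, sent in clear as the first 16 bytes. The class name and the message layout are my own choices for illustration, and note that plain CBC like this gives no integrity protection.

```java
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

final class CbcWithPrependedIv {
    private static final SecureRandom RANDOM = new SecureRandom();

    /** Returns IV || ciphertext, using a fresh random 16-byte IV for every message. */
    static byte[] encrypt(byte[] key, byte[] plaintext) {
        try {
            byte[] iv = new byte[16];
            RANDOM.nextBytes(iv);
            Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
            cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
            byte[] ciphertext = cipher.doFinal(plaintext);
            byte[] message = Arrays.copyOf(iv, 16 + ciphertext.length);
            System.arraycopy(ciphertext, 0, message, 16, ciphertext.length);
            return message;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Reads the IV from the first 16 bytes and decrypts the remainder. */
    static byte[] decrypt(byte[] key, byte[] message) {
        try {
            Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
            cipher.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "AES"),
                    new IvParameterSpec(message, 0, 16));
            return cipher.doFinal(message, 16, message.length - 16);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Because the IV differs each time, encrypting the same plaintext twice under the same key produces two different messages.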
What is the minimum number of bits needed to represent a single character of encrypted text?
E.g., if I wanted to encrypt the letter 'a', how many bits would I require? (Assume there are many singly encrypted characters using the same key.)
Am I right in thinking that it would be the size of the key, e.g. 256 bits?
Though the question is somewhat fuzzy, first of all it would depend on whether you use a stream cipher or a block cipher.
For the stream cipher, you would get the same number of bits out that you put in - so the binary logarithm of your input alphabet size would make sense. The block cipher requires input blocks of a fixed size, so you might pad your 'a' with zeroes and encrypt that, effectively having the block size as a minimum, like you already proposed.
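The two cases can be compared with the JDK's AES. In this sketch I use CTR mode as a stand-in for a stream cipher and ECB with PKCS#5 padding as the padded block cipher; both choices, the helper name, and the all-zero key and IV are assumptions for demonstration only. The returned length does not count any IV or counter that would have to be transmitted as well.

```java
import java.security.GeneralSecurityException;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

final class CiphertextLengths {
    /** Encrypts the plaintext under an all-zero demo key and returns the ciphertext length in bytes. */
    static int ciphertextLength(String transformation, byte[] plaintext) {
        try {
            Cipher cipher = Cipher.getInstance(transformation);
            SecretKeySpec key = new SecretKeySpec(new byte[16], "AES");
            if (transformation.contains("/ECB/")) {
                cipher.init(Cipher.ENCRYPT_MODE, key);
            } else {
                cipher.init(Cipher.ENCRYPT_MODE, key, new IvParameterSpec(new byte[16]));
            }
            return cipher.doFinal(plaintext).length;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

For the single byte 'a', ciphertextLength("AES/CTR/NoPadding", ...) gives 1 byte, while ciphertextLength("AES/ECB/PKCS5Padding", ...) gives a full 16-byte block.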
I'm afraid all the answers you've had so far are quite wrong! It seems I can't reply to them, but do ask if you need more information on why they are wrong. Here is the correct answer:
About 80 bits.
You need a few bits for the "nonce" (sometimes called the IV). When you encrypt, you combine key, plaintext and nonce to produce the ciphertext, and you must never use the same nonce twice. So how big the nonce needs to be depends on how often you plan on using the same key; if you won't be using the key more than 256 times, you can use an 8 bit nonce. Note that it's only the encrypting side that needs to ensure it doesn't use a nonce twice; the decrypting side only needs to care if it cares about preventing replay attacks.
You need 8 bits for the payload, since that's how many bits of plaintext you have.
Finally, you need about 64 bits for the authentication tag. At this length, an attacker has to try on average 2^63 bogus messages minimum before they get one accepted by the remote end. Do not think that you can do without the authentication tag; this is essential for the security of the whole mode.
Put these together using AES in a chaining mode such as EAX or GCM, and you get 80 bits of ciphertext.
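The nonce + payload + tag structure can be sketched with the JDK's AES-GCM. Note the sizes here are not the minimal ones from this answer: the sketch uses a 96-bit random nonce and a 128-bit tag, which are the usual GCM parameters and, as far as I know, the default JDK provider does not accept a tag as short as 64 bits. The class name and message layout are made up for illustration.

```java
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

final class GcmSingleByte {
    static final int NONCE_BYTES = 12;   // 96-bit nonce (assumption, larger than the answer's 8 bits)
    static final int TAG_BITS = 128;     // 128-bit tag (assumption, larger than the answer's 64 bits)

    /** Returns nonce || encrypted byte || authentication tag. */
    static byte[] encryptByte(byte[] key, byte value) {
        try {
            byte[] nonce = new byte[NONCE_BYTES];
            new SecureRandom().nextBytes(nonce);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"),
                    new GCMParameterSpec(TAG_BITS, nonce));
            byte[] body = cipher.doFinal(new byte[] {value});   // 1 payload byte + 16 tag bytes
            byte[] message = Arrays.copyOf(nonce, NONCE_BYTES + body.length);
            System.arraycopy(body, 0, message, NONCE_BYTES, body.length);
            return message;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Verifies the tag and returns the decrypted byte; throws if the message was modified. */
    static byte decryptByte(byte[] key, byte[] message) {
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "AES"),
                    new GCMParameterSpec(TAG_BITS, message, 0, NONCE_BYTES));
            return cipher.doFinal(message, NONCE_BYTES, message.length - NONCE_BYTES)[0];
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With these parameters one encrypted character takes 12 + 1 + 16 = 29 bytes (232 bits), and the total is the same for a 128-bit or a 256-bit key, which matches the point that the key size does not enter into it.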
The key size isn't a consideration.
You can have the same number of bits as the plaintext if you use a one-time pad.
This is hard to answer. You should definitely first read up on some fundamentals. You can 'encrypt' an 'a' with a single bit (Huffman encoding-style), and of course you could use more bits too. A number like 256 bits without any context is meaningless.
Here's something to get you started:
Information Theory -- esp. check out Shannon's seminal paper
One Time Pad -- famously secure, but impractical, encryption scheme
Huffman encoding -- not encryption, but demonstrates the above point