I know KDFs (key derivation functions) are used to stretch user passwords, which are generally not suitable for direct use as keys in cryptographic algorithms.
But what if I create a random key (32 random bytes)? Do I still need to run a KDF on it to ensure proper encryption?
A KDF is typically used for deriving cryptographic keys from things like passphrases, which, as you correctly say, are not suitable for direct use. But KDFs are also used for deriving additional keys from a master key, which might be useful depending on your overall scheme.
Suppose you used a key agreement protocol where both parties ended up with a random shared secret. You could use a KDF to derive a key for encryption, and one for message integrity (for example, an HMAC key).
From NIST SP800-108:
When parties share a secret symmetric key (e.g., upon a successful
execution of a key-establishment scheme as specified in [1] and
[2]), it is often the case that additional keys will be needed (e.g.
as described in [3]). Separate keys may be needed for different
cryptographic purposes – for example, one key may be required for an
encryption algorithm, while another key is intended for use by an
integrity protection algorithm, such as a message authentication
code. At other times, the distinct keys required by multiple entities
may be generated by a trusted party from a single master key. Key
derivation functions are used to derive such keys.
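As a rough illustration of the shared-secret scenario above, here is a minimal sketch that derives two independent keys from one secret using HKDF (RFC 5869) built on Python's standard hmac module. HKDF is my choice for the example, not the specific construction from SP800-108, and the function names and info labels are made up for illustration.

```python
import hashlib
import hmac


def hkdf_sha256(secret: bytes, info: bytes, length: int = 32, salt: bytes = b"") -> bytes:
    """HKDF (RFC 5869) with HMAC-SHA256: extract a PRK, then expand it."""
    hash_len = hashlib.sha256().digest_size
    if length > 255 * hash_len:
        raise ValueError("requested output is too long for HKDF-SHA256")
    # Extract: an empty salt is replaced by hash_len zero bytes, as in RFC 5869.
    prk = hmac.new(salt or bytes(hash_len), secret, hashlib.sha256).digest()
    # Expand: chain HMAC blocks T(1), T(2), ... until enough output exists.
    output, block, counter = b"", b"", 1
    while len(output) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        output += block
        counter += 1
    return output[:length]


def derive_session_keys(shared_secret: bytes) -> tuple[bytes, bytes]:
    """Derive one key for encryption and one for an HMAC from the same secret."""
    # The labels are arbitrary (hypothetical); they only need to differ per purpose.
    enc_key = hkdf_sha256(shared_secret, b"example encryption key")
    mac_key = hkdf_sha256(shared_secret, b"example hmac key")
    return enc_key, mac_key
```

Because the two calls use different info labels, the encryption key and the HMAC key are unrelated to each other even though both come from the same shared secret.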
The short answer is, no, you don't need to use a KDF, assuming your key generation is correct.
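For what "correct" key generation might look like, a minimal sketch in Python is to draw the bytes from the operating system's cryptographically secure generator. The helper name generate_key is just for illustration.

```python
import secrets


def generate_key(num_bytes: int = 32) -> bytes:
    """Return num_bytes of key material from the OS's CSPRNG."""
    return secrets.token_bytes(num_bytes)


key = generate_key()  # 32 bytes, usable directly as an AES-256 key
```

A key produced this way is already uniformly random, which is why no KDF step is needed before using it.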
I need two-way (symmetric) encryption of distinct tokens. These tokens are expected to repeat (e.g., they are people's first names), but I do not want an attacker to be able to tell which encrypted tokens came from the same original token. Salting is the way to go for one-way cryptography (hashing).
Is there a method that can work in symmetric cryptography, a workaround or an alternative?
Yes. Properly used, symmetric encryption does not reveal anything about the plaintext, not even the fact that multiple plaintexts are the same.
Proper usage means choosing a mode of operation that uses an initialization vector (IV) or nonce (that is, not ECB), and choosing the IV appropriately (usually random bytes). Encrypting multiple plaintexts with the same key and the same IV reveals repeated plaintexts much as ECB mode does, and using a static IV is a common mistake.
As mentioned above, properly utilizing a symmetric encryption scheme would NOT reveal information about the plaintext. You mention the need to protect the users against a dictionary attack on the hidden tokens, and a properly utilized encryption scheme such as GCM would provide you with this property.
I recommend using GCM mode, as it is an efficient authenticated encryption scheme. Performing cryptographic operations on unauthenticated data may lead to security flaws, so an authenticated encryption scheme such as GCM is your best bet. Note that this scheme, like other CPA-secure schemes, protects you against an adversary who wishes to learn the value of an encrypted token.
For example, in correctly implemented GCM mode with a fresh nonce for every encryption, encrypting the same name twice results in different ciphertexts, i.e. GCM mode is non-deterministic.
Make sure to use a secure padding scheme and fix a length for the ciphertexts, so that an attacker can't use the length of a ciphertext to learn something about the token that produced it.
Be careful, however: hash functions and symmetric encryption schemes are not interchangeable, as they are created for very different purposes. Be careful with how you share the key, and remember that once an adversary knows the key, there is nothing random about the ciphertext.
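One possible way to fix the length, sketched in Python: pad every token to the same size before it is handed to the cipher. The 64-byte size and the one-byte length prefix are assumptions for this example, not a standard padding scheme.

```python
TOKEN_SIZE = 64  # assumed fixed plaintext size; must exceed the longest token


def pad_token(token: str, size: int = TOKEN_SIZE) -> bytes:
    """Encode a token as: 1 length byte, the UTF-8 data, zero filler up to size."""
    raw = token.encode("utf-8")
    if len(raw) > min(size - 1, 255):
        raise ValueError("token does not fit in the fixed-size field")
    return bytes([len(raw)]) + raw + bytes(size - 1 - len(raw))


def unpad_token(padded: bytes) -> str:
    """Recover the token using the length byte written by pad_token."""
    length = padded[0]
    return padded[1:1 + length].decode("utf-8")
```

Since every padded token has the same size, every ciphertext does too, so "Al" and "Alexandria" can no longer be told apart by length.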
-NOTE-
Using encryption incorrectly: if every user encrypts their token with the same key, then any user can simply decrypt everyone else's tokens and see the names that generated them.
To be safe, every user must encrypt with a different key so now you have to somehow store and manage the key for each user. This may be very painful and you have to be very careful with this.
However, if you are using salts and hash functions, then even if every user uses the same salt to compute hash(name||salt), a malicious user would have to brute force all possible names with the salt to figure out what generated these tokens.
So take this into consideration and be careful, as hash functions and symmetric encryption schemes can't be used interchangeably.
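To make the brute-force point concrete, here is a small sketch of that attack against hash(name||salt) when the salt is shared and known. SHA-256, the salt value, and the candidate list are all assumptions chosen for the example.

```python
import hashlib


def tokenize(name: str, salt: bytes) -> str:
    """Compute hash(name || salt) as a hex string."""
    return hashlib.sha256(name.encode("utf-8") + salt).hexdigest()


def dictionary_attack(token: str, salt: bytes, candidates):
    """Try every candidate name; return the one that hashes to the token, if any."""
    for name in candidates:
        if tokenize(name, salt) == token:
            return name
    return None


shared_salt = b"example-shared-salt"  # hypothetical salt known to every user
target = tokenize("alice", shared_salt)
recovered = dictionary_attack(target, shared_salt, ["bob", "carol", "alice"])
```

The attacker has to do one hash per candidate, so how much protection this gives depends entirely on how many plausible names there are; for first names that list is fairly short.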
Assuming that the only items to be ciphered are the tokens (that is, they are not embedded in a larger data structure), then initialization vectors (IVs) are the way to go.
They are quite simple to understand: let M be your token, padded to fit the block size used in the symmetric ciphering algorithm (I'm assuming it's AES) and IV be a random array of bits also the size of the ciphering block.
Then compute C = AES_ENCRYPT(M xor IV, K) where C is the ciphered data and K the symmetric key. That way, the same message M will not be ciphered the same way multiple times since IV is randomly obtained every time.
To decrypt M, just compute M = (AES_DECRYPT(C, K) xor IV).
Of course, both IV and K must be known at decryption time. The most usual way to transmit the IV is to just send it along with the ciphertext. This does not compromise security; it's pretty much like storing a salt value, since the encryption key remains unknown to everybody else.
Is it possible to retrieve an AES key using the AES initialization vector and the encrypted data?
I have an AES initialization vector and encrypted data. I have seen an online tool for decrypting AES-encrypted data using an AES key and an AES initialization vector.
online tool: http://aes.online-domain-tools.com
When I enter any key in the AES key field, it shows an AES initialization vector in the initialization vector field.
So my question is: if I have the AES initialization vector, is it possible to retrieve the AES key?
No, the AES key cannot be retrieved from the initialization vector if AES was applied correctly. In that case the IV and AES key should be independent of each other.
Sometimes, however, the AES key and IV are both generated by hashing some common value. This is not a secure method of creating an IV. In that case the IV can possibly be used as a distinguisher to validate whether a particular key is correct (but in general such a test can also be performed over the ciphertext). Deriving the IV from the key makes the use of an IV moot in the first place; IVs should be used to keep a cipher secure when a key is reused!
Sometimes the AES key is not generated correctly, for instance by using MD5 over a weak password, or by directly applying a password as a key (after padding it to the required size). In that case you may use dictionary (and related) attacks, basically brute forcing the password to get the key. It is easier to test the correctness of the result if the IV is directly derived from the key.
Both of the above techniques seem to be used by the above online tool. It clearly shows you how not to apply AES. Don't trust shit websites that are popular just because they chose an interesting name.
A question for cryptography experts. Imagine we have a conceptual Notes.app:
There are notes (title|content) stored as AES-256 encrypted strings
The application has to present a list of all notes (titles) on its main window
Every title|content is encrypted with a key generated from a password and a salt
Let's imagine the key generation algorithm takes ~80ms on a mobile device to generate a key
Under these conditions, it would take almost 1 second to decrypt 10 note titles. But what if there are lots of notes?
My 2 pennies on the problem: encrypt all notes with different initialization vectors, but with an identical salt. That would allow me to generate a decryption key only once and decrypt lots of notes fast.
The question: doing so we would end up with lots of different notes, encrypted with an identical key. Does that somehow compromise the security of AES encryption? Is it possible that knowing there's a bunch of files with not just identical password, but also identical salt somehow makes it possible to crack the encryption?
Thanks for your thoughts
AES-256 does not use a salt. But I guess you use the salt together with the password in a PBE (password-based encryption) algorithm to generate the key. These PBE algorithms are usually constructed to be computationally expensive, hence the 80 ms you see on your mobile.
When encrypting different messages, instead of using different salts to create different keys, you could just use different initialization vectors (IVs) with the same key. The different IVs ensure that messages that start with the same block encrypt to different ciphertexts.
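A minimal sketch of that approach: run the expensive password-based derivation once, then give each note its own random IV. PBKDF2-HMAC-SHA256, the iteration count, and the sample password are assumptions here; the question does not say which PBE algorithm or parameters it uses.

```python
import hashlib
import os

ITERATIONS = 200_000  # assumed work factor; tune it for the target device


def derive_key(password: str, salt: bytes, iterations: int = ITERATIONS) -> bytes:
    """Derive a 32-byte (AES-256) key from a password with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations, dklen=32)


def new_iv() -> bytes:
    """Return a fresh random 16-byte IV (one AES block) for a single note."""
    return os.urandom(16)


salt = os.urandom(16)                       # stored once, next to the notes
key = derive_key("example password", salt)  # the slow step, done a single time
note_ivs = [new_iv() for _ in range(3)]     # the cheap step, done per note
```

Only derive_key is slow, so listing many notes costs one derivation plus one fast decryption per title.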
What is the correct (acceptable) way to derive a, let's say, 128-bit AES key from the secret derived in a DH negotiation?
Use the first 128 bits
Hash the secret and use the first 128 bits
Use some more complicated derivation function
How would you derive a set of keys in a "correct" manner?
I would use a standard. One such standard is NIST Special Pub 800-56A. See in particular section 5.8.
For instance, TLS uses a pseudo-random function (based on SHA-1 and MD5 in older versions of the protocol) computed over the shared secret (i.e. the DH key exchange value), a string label (to distinguish the different purposes for which a key is generated: HMAC, cipher and so on), and shared random parameters (the client and the server each generate their own half).
So, I'd recommend adding some random data generated by both the client and the server, and hashing it together with the DH key exchange value.
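A simplified sketch of that recommendation in Python: hash the DH value together with a purpose label and both random values, then truncate to 128 bits. This is neither the TLS PRF nor the SP 800-56A construction; SHA-256, the length-prefixed encoding, and the function name are my own assumptions for the example.

```python
import hashlib


def derive_aes128_key(dh_secret: bytes, client_random: bytes,
                      server_random: bytes, label: bytes) -> bytes:
    """Hash the DH secret with a label and both randoms; keep the first 16 bytes."""
    h = hashlib.sha256()
    for field in (dh_secret, label, client_random, server_random):
        # Length-prefix each field so different inputs cannot produce the same byte stream.
        h.update(len(field).to_bytes(4, "big"))
        h.update(field)
    return h.digest()[:16]
```

Calling it with different labels (say b"cipher" and b"hmac") gives a set of distinct keys from the same negotiation, which addresses the "set of keys" part of the question.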
When passing symmetrically encrypted data in a URL, or possibly storing encrypted data in a cookie, is it reasonable and/or necessary and/or possible to also pass the symmetric encryption IV (salt) in the same URL? Is the idea of using a salt even valid in a stateless environment such as the web?
(I understand how a salt works in a database given a list of names or accounts etc., but we can't save the salt given that we are passing data in a stateless environment.)
Assuming a server side password that is used to encrypt data and then decrypt data, how can Salt be used? I guess a separate IV could be passed in the query string but is publicly exposing the salt ok?
Or can one generate a key and IV from the hash of a "password"? Assuming the IV and key come from non-overlapping areas of the hash, is this ok? (I realize that the salt / key will always be the same for a given password.)
EDIT: Typically using AES.
It is encouraged to generate random IVs for each encryption routine, and they can be passed along safely with the cipher text.
Edit:
I should probably ask what type of information you're storing and why you're using a salt with AES encryption, since salts are typically used for hashing, not symmetric encryption. If the salt is publicly available, it defeats the purpose of having it.
What you really need to do is ensure the strength of your key, because if an attacker has the salt, IV, and cipher text, a brute-force attack can easily be done on weaker keys.
You should not generate an initialization vector from the secret key. The initialization vector should be unpredictable for a given message; if you generated it from the key (or a password used to generate a key), the IV will always be the same, which defeats its purpose.
The IV doesn't need to be secret, however. It's quite common to send it with the ciphertext, unprotected. Incorporating the IV in the URL is a lot easier than trying to keep track of the IV for a given link in some server-side state.
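As a sketch of how the IV could travel in the URL, one option is to prepend it to the ciphertext and encode the result as URL-safe base64. The 16-byte IV size assumes AES; the function names and the "IV first" layout are illustrative choices, and the ciphertext is whatever your AES library produced.

```python
import base64

IV_SIZE = 16  # AES block size in bytes


def pack_for_url(iv: bytes, ciphertext: bytes) -> str:
    """Concatenate IV and ciphertext and encode them as unpadded URL-safe base64."""
    if len(iv) != IV_SIZE:
        raise ValueError("IV must be exactly one AES block")
    return base64.urlsafe_b64encode(iv + ciphertext).rstrip(b"=").decode("ascii")


def unpack_from_url(token: str) -> tuple[bytes, bytes]:
    """Split a value produced by pack_for_url back into (iv, ciphertext)."""
    padded = token + "=" * (-len(token) % 4)  # restore the stripped base64 padding
    raw = base64.urlsafe_b64decode(padded)
    if len(raw) < IV_SIZE:
        raise ValueError("token is too short to contain an IV")
    return raw[:IV_SIZE], raw[IV_SIZE:]
```

The server needs no per-link state: everything required for decryption except the key arrives with the request.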
Salt and IVs have distinct applications, but they do act in similar ways.
Cryptographic "salt" is used in password-based key derivation algorithms; storing a hashed password for authentication is a special case of this function. Salt causes the same password to yield different hashes, and thwarts "dictionary attacks", where a hacker has pre-computed hash values for common passwords, and built a "reverse-lookup" index so that they can quickly discover a password for a given hash. Like an IV, the salt used is not a secret.
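A short sketch of the password-storage case: a fresh random salt per password makes identical passwords hash differently, and the salt is simply stored alongside the hash. PBKDF2-HMAC-SHA256 and the iteration count are assumptions for this example.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed work factor


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, digest) for storage; the salt does not need to be secret."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(digest, expected)
```

Because each stored hash has its own salt, a pre-computed reverse-lookup table for common passwords no longer matches anything.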
An initialization vector is used with block ciphers like DES and AES in a feedback mode like CBC. Each ciphertext block is combined with the next plaintext block before that block is encrypted. For example, under CBC, the ciphertext of the previous block is XOR-ed with the plaintext of the current block before encryption. The IV is randomly generated to serve as a dummy initial block to bootstrap the process.
Because a different IV is (or should be, at least) chosen for each message, when the same message is encrypted with the same key, the resulting cipher text is different. In that sense, an IV is very similar to a salt. A cryptographic random generator is usually the easiest and most secure source for a salt or an IV, so they have that similarity too.
Cryptography is very easy to mess up. If you are not confident about what you are doing, you should consider the value of the information you are protecting, and budget accordingly to get the training or consultation you need.