Symmetric cryptography and salt - encryption

I need to encrypt two-way (symmetric) distinct tokens. These tokens are expected to be repeated (e.g. They are people first names), but I do not want an attacker to conclude which encrypted tokens came from the same original tokens. Salt is the way to go for one-way cryptography (hashing).
Is there a method that can work in symmetric cryptography, a workaround or an alternative?

Yes. Properly used, symmetric encryption does not reveal anything about the plaintext, not even the fact that multiple plaintexts are the same.
Proper usage means choosing a mode of operation that uses an initialization vector (IV) or nonce (that is, not ECB), and choosing the IV appropriately (usually random bytes). Encrypting multiple plaintexts with the same key and IV allows this attack pretty much just like with ECB mode, and using a static IV is a common mistake.

As mentioned above, properly utilizing a symmetric encryption scheme would NOT reveal information about the plaintext. You mention the need to protect the users against a dictionary attack on the hidden tokens, and a properly utilized encryption scheme such as GCM would provide you with this property.
I recommend utilizing GCM mode as it is an efficient authenticated encryption scheme. Performing cryptographic functions on unauthenticated data may lead to security flaws so utilizing an authenticated encryption scheme such as GCM is your best bet. Note that this encryption scheme along with other CPA-SECURE schemes will provide you security against an adversary that wishes to learn the value of an encrypted token.
For example, in correctly implemented GCM mode, the encryption of the same last name will result in a different ciphertext i.e GCM Mode is Non-Deterministic.
Make sure to utilize a secure padding scheme and fix a length for the ciphertexts to make sure an attacker can't use the lenght of the ciphertext to learn some information about the contents of what generated this token.
Be careful however, you can't interchangeably use hash functions and symmetric encryption schemes as they are created for very different purposes. Be careful with how you share the key, and remember that once an adversary has knowledge of the key, there is nothing random about the ciphertext.
-NOTE-
Using encryption incorrectly : If every user is utilizing the same key to encrypt their token then they can simply decrypt everyone else's token and see the name that generated it.
To be safe, every user must encrypt with a different key so now you have to somehow store and manage the key for each user. This may be very painful and you have to be very careful with this.
However if you are utilizing salts and hash functions, then even if every user is utilizing the same salt to compute hash(name||salt), a malicious user would have to brute force all possible names with the salt to figure out what generated these tokens.
So keep this into consideration and be careful as hash functions and symmetric encryptions schemes can't be used interchangeably.

Assuming that the only items to be ciphered are the tokens (that is, they are not embedded in a larger data structure), then Inicialization Vectors (IV's) are the way to go.
They are quite simple to understand: let M be your token, padded to fit the block size used in the symmetric ciphering algorithm (I'm assuming it's AES) and IV be a random array of bits also the size of the ciphering block.
Then compute C = AES_ENCRYPT(M xor IV, K) where C is the ciphered data and K the symmetric key. That way, the same message M will not be ciphered the same way multiple times since IV is randomly obtained every time.
To decrypt M, just compute M = (AES_DECRYPT(C, K) xor IV).
Of course, both IV and K must be known at decryption time. The most usual way to transmit the IV is to just send it along the ciphered text. This does not compromise security, it's pretty much like storing a salt value, since the encryption key will remain unknown for everybody else.

Related

Does the IV of AES-128-cbc need to be random during encryption and decryption?

I am using node and the crypto module to encrypt and decrypt a large binary file. I encrypt the file using crypto.createCipheriv and decrypt it using crypto.createDecipheriv.
For the encryption I use a random IV as follows:
const iv = crypto.randomBytes(16);
const encrypt = crypto.createCipheriv('aes-128-cbc', key, iv)
What I don't understand, do I need to pass a random IV for createDecipheriv as well? The SO here says:
The IV needs to be identical for encryption and decryption.
Can the IV be static? And if it can't, is it considered to be a secret? Where would I store the IV? In the payload?
If I use different random IVs for the encryption and decryption, my payload gets decrypted but the first 16 bytes are corrupt. This means, it looks like the IV needs to be the same but from a security perspective there is also not much value as the payload is decrypted except 16 bytes.
Can anyone elaborate what the go-to approach is? Thanks for your help!
The Key+IV pair must never be duplicated on two encryptions using CBC. Doing so leaks information about the first block (in all cases), and is creates duplicate cipher texts (which is a problem if you ever encrypt the same message prefix twice).
So, if your key changes for every encryption, then your IV could be static. But no one does that. They have a key they reuse. So the IV must change.
There is no requirement that it be random. It just shouldn't repeat and it must not be predictable (in cases where the attacker can control the messages). Random is the easiest way to do that. Anything other than random requires a lot of specialized knowledge to get right, so use random.
Reusing a Key+IV pair in CBC weakens the security of the cipher, but does not destroy it, as in CTR. IV reused with CTR can lead to trivial decryptions. In CBC, it generally just leaks information. It's a serious problem, but it is not catastrophic. (Not all insecure configurations are created equal.)
The IV is not a secret. Everyone can know it. So it is typically prepended to the ciphertext.
For security reasons, the IV needs to be chosen to meet cryptographic randomness security requirements (i.e. use crypto.randomBytes( ) in node). This was shown in Phil Rogaway's research paper. The summary is in Figure 1.2 of the paper, which I transcribe here:
CBC (SP 800-38A): An IV-based encryption scheme, the mode is secure as a probabilistic encryption scheme, achieving indistinguishability from random bits, assuming a random IV. Confidentiality is not achieved if the IV is merely a nonce, nor if it is a nonce enciphered under the same key used by the scheme, as the standard incorrectly suggests to do.
The normal way to implement this is to include the IV prepended to the ciphertext. The receiving party extracts the IV and then decrypts the ciphertext. The IV is not a secret, instead it is just used to bring necessary security properties into the mode of operation.
However, be aware that encryption with CBC does not prevent people from tampering with the data. If an attacker fiddles with ciphertext bits within a block, it affects exactly two plaintext blocks, one of which is in a very controlled way.
To make a very long story short, GCM is a better mode to use to prevent such abuses. In that case, you do not need a random IV, but instead you must never let the IV repeat (in cryptography, we call this property a "nonce"). Luke Park gives an example of how to implement it, here. He uses randomness for the nonce, which achieves the nonce property for all practical purposes (unless you are encrypting 2^48 texts, which is crazy large).
But whatever mode you do, you must never repeat an IV for a given key, which is a very common mistake.

How to decrypt AES on the server side? [duplicate]

If I am using Rijndael CBC mode, I have no idea why we would need salt.
My understanding is even if people know the password, but he cannot get the data without IV.
So from my perspective, password + IV seem to be sufficent secure.
Do I get anything wrong?
Yes, you need all of these things.
Salt (and an "iteration count") is used to derive a key from the password. Refer to PKCS #5 for more information. The salt and iteration count used for key derivation do not have to be secret. The salt should be unpredictable, however, and is best chosen randomly.
CBC mode requires an initialization vector. This is a block of random data produced for each message by a cryptographic random number generator. It serves as the dummy initial block of ciphertext. Like the key-derivation salt, it doesn't have to be kept secret, and is usually transmitted along with the cipher text.
The password, and keys derived from it, must be kept secret. Even if an attacker has the parameters for key derivation and encryption, and the ciphertext, he can do nothing without the key.
Update:
Passwords aren't selected randomly; some passwords are much more likely than others. Therefore, rather than generating all possible passwords of a given length (exhaustive brute-force search), attackers maintain a list of passwords, ordered by decreasing probability.
Deriving an encryption key from a password is relatively slow (due to the iteration of the key derivation algorithm). Deriving keys for a few million passwords could take months. This would motivate an attacker to derive the keys from his most-likely-password list once, and store the results. With such a list, he can quickly try to decrypt with each key in his list, rather than spending months of compute time to derive keys again.
However, each bit of salt doubles the space required to store the derived key, and the time it takes to derive keys for each of his likely passwords. A few bytes of salt, and it quickly becomes infeasible to create and store such a list.
Salt is necessary to prevent pre-computation attacks.
An IV (or nonce with counter modes) makes the same plain text produce different cipher texts. The prevents an attacker from exploiting patterns in the plain text to garner information from a set of encrypted messages.
An initialization vector is necessary to hide patterns in messages.
One serves to enhance the security of the key, the other enhances the security of each message encrypted with that key. Both are necessary together.
First things first: Rijndael does not have a "password" in CBC mode. Rijndael in CBC mode takes a buffer to encrypt or decrypt, a key, and an IV.
A "salt" is typically used for encrypting passwords. The salt is added to the password that is encrypted and stored with the encrypted value. This prevents someone from building a dictionary of how all passwords encrypt---you need to build a dictionary of how all passwords encrypt for all salts. That was actually possible with the old Unix password encryption algorithm, which only used a 12-bit salt. (It increased the work factor by 4096). With a 128-bit salt it is not possible.
Someone can still do a brute-force attack on a specific password, of course, provided that they can retrieve the encrypted password.
However, you have an IV, which does pretty much the same thing that a Salt does. You don't need both. Or, rather, the IV is your salt.
BTW, these days we call "Rijndael" AES.
A salt is generally used when using a hash algorithm. Rijndael is not a hash, but a two-way encryption algorithm. Ergo, a salt is not necessarily needed for encrypting the data. That being said, a salted hash of a password may be used as the Key for encrypting data. For what you're looking for, you might wish to look at hybrid cryptosystems.
The Key should be considered private and not transmitted with your encrypted data while the IV may be transmitted with the encrypted data.

How to generate AES-ECB encryption secret key given examples of plaintexts and hashes?

My question is that, suppose you have some AES-ECB encrypted hash and you want to decode it. You are also given a bunch of example plaintexts and hashes. For example:
I want: unknown_plaintext for the hash given_hash
and i have a bunch of known_plaintexts and hashes that have been encrypted with the same secret key. None of them (obviously) are the exact same to the given hash.
Please let me know if you can help. This is not for malicious intents, just to learn how Cryptography and AES systems work.
This is not computationally feasible. I.e., you can't do this.
Modern encryption algorithms like AES are resistant to known-plaintext attacks, which is what you are describing.
There has been some past success in a category called adaptive chosen plaintext attacks. Often these exploit an "oracle." In this scenario, an attacker can decrypt a single message by repeatedly asking the victim whether it can successfully decrypt a guess generated by the attacker. By being smart about choosing successive guesses, the attacker could decrypt the message with a million tries or so, which is a relatively small number. But even in this scenario, the attacker can't recover the key.
As an aside, ciphers don't generate hashes. They output cipher text. Hash functions (aka message digests) generate hashes.
For any respectable block cipher (and AES is a respectable block cipher), the only way to decrypt a ciphertext block (not "hash") is to know the key, and the only way to find the key from a bunch of plaintext-ciphertext pairs is by guessing a key and seeing if it maps a known plaintext onto the corresponding ciphertext. If you have some knowledge of how the key was chosen (e.g., SHA-256 of a pet's name), this might work; but if the key was randomly selected from the set of all possible AES keys, the number of guesses required to produce a significant probability of success is such a large number that you wander off into age-of-the-universe handwaving.
If you know that all the encrypted hashes are encrypted with the same key you can first try to find that key using your pairs of plaintexts and encrypted hashes. The most obvious way to do that would be to just take one of your plaintexts, first hash it and then try out all the possible keys to encrypt it until it matches the encrypted hash that you know. If the key you're looking for is just one of the many many possible AES keys this is set to fail, because it would take way too long to try all the keys.
Assuming you were able to recover the AES key somehow, you can decrypt that one hash you don't have a plaintext for and start looking for the plaintext.
The more you know about the plaintext, the easier this guesswork would be. You could just throw the decrypted hash into google and see what it spits out, query databases of known hashes or make guesses in the most eduated way possible. This step will again fail, if the hash is strong enough and the plaintext is random enough.
As other people have indicated, modern encryption algorithms are specifically designed to resist this kind of attack. Even a rather weak encryption algorithm like the Tiny Encryption Algorithm would require well over 8 million chosen plaintexts to do anything like this. Better algorithms like AES, Blowfish, etc. require vastly more than that.
As of right now, there are no practical attacks on AES.
If you're interested in learning about cryptography, the older Data Encryption Standard (DES) may actually be a more interesting place to start than AES; there's a lot of literature available about it and it was already broken (the code to do so is still freely available online - studying it is actually really useful).

Proper/Secure encryption of data using AES and a password

Right now, this is what I am doing:
1. SHA-1 a password like "pass123", use the first 32 characters of the hexadecimal decoding for the key
2. Encrypt with AES-256 with just whatever the default parameters are
^Is that secure enough?
I need my application to encrypt data with a password, and securely. There are too many different things that come up when I google this and some things that I don't understand about it too. I am asking this as a general question, not any specific coding language (though I'm planning on using this with Java and with iOS).
So now that I am trying to do this more properly, please follow what I have in mind:
Input is a password such as "pass123" and the data is
what I want to encrypt such as "The bank account is 038414838 and the pin is 5931"
Use PBKDF2 to derive a key from the password. Parameters:
1000 iterations
length of 256bits
Salt - this one confuses me because I am not sure where to get the salt from, do I just make one up? As in, all my encryptions would always use the salt "F" for example (since apparently salts are 8bits which is just one character)
Now I take this key, and do I hash it?? Should I use something like SHA-256? Is that secure? And what is HMAC? Should I use that?
Note: Do I need to perform both steps 2 and 3 or is just one or the other okay?
Okay now I have the 256-bit key to do the encryption with. So I perform the encryption using AES, but here's yet another confusing part (the parameters).
I'm not really sure what are the different "modes" to use, apparently there's like CBC and EBC and a bunch of others
I also am not sure about the "Initialization Vector," do I just make one up and always use that one?
And then what about other options, what is PKCS7Padding?
For your initial points:
Using hexadecimals clearly splits the key size in half. Basically, you are using AES-128 security wise. Not that that is bad, but you might also go for AES-128 and use 16 bytes.
SHA-1 is relatively safe for key derivation, but it shouldn't be used directly because of the existence/creation of rainbow tables. For this you need a function like PBKDF2 which uses an iteration count and salt.
As for the solution:
You should not encrypt PIN's if that can be avoided. Please make sure your passwords are safe enough, allow pass phrases.
Create a random number per password and save the salt (16 bytes) with the output of PBKDF2. The salt does not have to be secret, although you might want to include a system secret to add some extra security. The salt and password are hashed, so they may have any length to be compatible with PBKDF2.
No, you just save the secret generated by the PBKDF2, let the PBKDF2 generate more data when required.
Never use ECB (not EBC). Use CBC as minimum. Note that CBC encryption does not provide integrity checking (somebody might change the cipher text and you might never know it) or authenticity. For that, you might want to add an additional MAC, HMAC or use an encryption mode such as GCM. PKCS7Padding (identical to PKCS5Padding in most occurences) is a simple method of adding bogus data to get N * [blocksize] bytes, required by block wise encryption.
Don't forget to prepend a (random) IV to your cipher text in case you reuse your encryption keys. An IV is similar to a salt, but should be exactly [blocksize] bytes (16 for AES).

Encrypted data in URL and salt

When passing symetrically encrypted data in a URL or possibly storing encrypted data in a cookie, is it resonable and/or nessassary and/or possible to also pass the Symetric Encryption IV (Salt) in the same URL? Is the idea of using Salt even valid in a stateless environment such as the web?
(I understand how salt works in a database given a list of names or accounts etc. but we can't save the salt given that we are passing data in a stateless environment.
Assuming a server side password that is used to encrypt data and then decrypt data, how can Salt be used? I guess a separate IV could be passed in the query string but is publicly exposing the salt ok?
Or can one generate a key and IV from the hash of a "password". Assuming the IV and Key come from non-overlapping areas of the hash, is this ok? (I realize that the salt / key will always be the same for a given password.)
EDIT: Typically using AES.
It is encouraged to generate random IVs for each encryption routine, and they can be passed along safely with the cipher text.
Edit:
I should probably ask what type of information you're storing and why you're using a salt with AES encryption, since salts are typically used for hashing, not symmetric encryption. If the salt is publicly available, it defeats the purpose of having it.
What you really need to do is ensure the strength of your key, because if an attacker has the salt, IV, and cipher text, a brute-force attack can easily be done on weaker keys.
You should not generate an initialization vector from the secret key. The initialization vector should be unpredictable for a given message; if you generated it from the key (or a password used to generate a key), the IV will always be the same, which defeats its purpose.
The IV doesn't need to be secret, however. It's quite common to send it with the ciphertext, unprotected. Incorporating the IV in the URL is a lot easier than trying to keep track of the IV for a given link in some server-side state.
Salt and IVs have distinct applications, but they do act in similar ways.
Cryptographic "salt" is used in password-based key derivation algorithms; storing a hashed password for authentication is a special case of this function. Salt causes the same password to yield different hashes, and thwarts "dictionary attacks", where a hacker has pre-computed hash values for common passwords, and built a "reverse-lookup" index so that they can quickly discover a password for a given hash. Like an IV, the salt used is not a secret.
An initialization vector is used with block ciphers like DES and AES in a feedback mode like CBC. Each block is combined with the next block when it is encrypted. For example, under CBC, the previous block cipher text is XOR-ed with the plain text of the current block before encryption. The IV is randomly generated to serve as a dummy initial block to bootstrap the process.
Because a different IV is (or should be, at least) chosen for each message, when the same message is encrypted with the same key, the resulting cipher text is different. In that sense, an IV is very similar to a salt. A cryptographic random generator is usually the easiest and most secure source for a salt or an IV, so they have that similarity too.
Cryptography is very easy to mess up. If you are not confident about what you are doing, you should consider the value of the information you are protecting, and budget accordingly to get the training or consultation you need.

Resources