Hashing, encrypting or both - encryption

Recently I have been looking to add some security to a project. I have been doing lots of research into the situation and discovered that clearly password hashing is a must. Further I have concluded that the best options are to use bcrypt, PBKDF2 or scrypt.
Also I have seen much discussion over hashing vs encryption and discovered that it is clear that hashing is more important. That said, after many searches into the depths of Google I have yet to find any information on whether encrypting an already properly hashed password is of any benefit, serves to harm or is relatively neutral.
Is the CPU cost of doing both worth it? Are there any pitfalls?

Encrypting something leads to the need of decrypting, which in turn leads to the problem you already have: secure storage of a secret.
Assuming that you want to store passwords as hashes instead of plain text you are basically doing this:
hashpw := hash(salt + password)
You then store salt and hashpw in a file and use this data instead of the plain text passwords. (Note that the order of the concatenation of salt and password is crucial in many cases and that this is only a visualization of the process, nothing more; use a tool to generate salted hashes.)
A possible attacker who obtains the file then needs to guess the plain text password and hash it with the stored salt to check for a match with
the stored hashpw, which is as secure as the hash algorithm you're using (how hard it is to find an input that produces a given hash).
Encrypting something using some cipher has the benefit of being able to restore the plain text, which
the hashing way does not offer. It also requires the system that decrypts the cipher text to have the
key available. Say you encrypt a string foo with some key bar. To decrypt the resulting cipher text
brn you need the key bar again. This key needs secure storage on your system and if the key is exposed
to the attacker, all security is gone.
As a general rule of thumb I would say that hashing provides a good way of storing texts which are
checked against (e.g., passwords) as the security of that is determined by the collision rate of the
hashing algorithm. Encryption on the other hand, is the technique you're using to store the rest of
the data securely.

You're on the right track. Use a key derivation/password hashing function like the ones you've mentioned.
Do not use just a hash or salted hash. The main issue is that traditional hashing algorithms (MD5, SHA-*, etc.) are intended to be fast. That's not advantageous for password storage, and many implementations are breakable, even if you add a salt.
Encryption always introduces key management-related issues. It should be avoided for password storage.
The advantage of a KDF is the work factor. These functions are designed to be slow and computationally expensive, which is why they're ideal for this situation. Scrypt is the most resilient of the options you're looking at since it requires a set amount of memory to execute, which makes GPU-based attacks far more expensive. There are tradeoffs whichever way you go, but all of your choices are fine as long as you use appropriate work factors where they're configurable.
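To make the work-factor idea concrete, here is a minimal sketch of storing and checking a password with scrypt from Python's standard library. The cost parameters and the helper names `hash_password` and `verify_password` are assumptions chosen for illustration, not values recommended anywhere in this thread; tune them for your own hardware.

```python
import hashlib
import hmac
import os

# Illustrative cost parameters (assumptions): n is the CPU/memory cost,
# r the block size, p the parallelism. This setting needs roughly 16 MiB.
SCRYPT_PARAMS = {"n": 2**14, "r": 8, "p": 1, "maxmem": 64 * 1024 * 1024, "dklen": 32}


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, derived_hash) for storage; the salt is not a secret."""
    salt = os.urandom(16)  # fresh random salt per password
    digest = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-derive the hash with the stored salt and compare in constant time."""
    digest = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return hmac.compare_digest(digest, expected)
```

Nothing here is encrypted and no key has to be stored: only the salt and the derived hash go into the database.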

I would simply encrypt the password. Hashing is fast, but a little unsafe for passwords. When I use hashing for security purposes, it's usually for things like message signing e.g. message + hash(message+password) so that the message can be verified, but I'm no expert in the field. I don't see the point of doing both.
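For the message-signing use mentioned here, the usual standard construction is HMAC rather than a plain hash over `message + password`. A minimal sketch follows, assuming a shared secret key; the function names `sign` and `verify` are made up for the example.

```python
import hashlib
import hmac


def sign(message: bytes, key: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag over the message with a shared secret key."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(message: bytes, tag: bytes, key: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(sign(message, key), tag)
```

The sender transmits the message together with `sign(message, key)`; a receiver holding the same key can detect any change to the message.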

Related

What is the most secure hash algorithm in ColdFusion?

What is the most secure hash algorithm to use in ColdFusion 9 (non-Enterprise)?
According to the CF documentation, these are the options:
MD5: (default) Generates a 32-character, hexadecimal string, using the
MD5 algorithm (The algorithm used in ColdFusion MX and prior
releases).
SHA: Generates a 40-character string using the Secure Hash Standard
SHA-1 algorithm specified by National Institute of Standards and
Technology (NIST) FIPS-180-2.
SHA-256: Generates a 44-character string using the SHA-256 algorithm
specified by FIPS-180-2.
SHA-384: Generates a 64-character string using the SHA-384 algorithm
specified by FIPS-180-2.
SHA-512: Generates an 128-character string using the SHA-1 algorithm
specified by FIPS-180-2.
But in this article, it says not to use MD5 or SHA-1
I am also a little skeptical about the cf documentation. It says encoding "SHA-512" uses SHA-1, but the description of "SHA-512" for the Enterprise version is "The 512-bit secure hash algorithm defined by FIPS 180-2 and FIPS 198." And the output of SHA-512 is larger than SHA-384. Sorry, I am having a hard time getting my head around all these different encoding principles.
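One way to sanity-check the documented lengths is to compute them. The sketch below (Python, not ColdFusion) prints the length of each digest in hexadecimal and in Base64. The lengths 44 and 64 quoted above happen to match the Base64 lengths of SHA-256 and SHA-384, while 40 and 128 match the hexadecimal lengths of SHA-1 and SHA-512; whether the documentation actually mixed the two encodings is only a guess.

```python
import base64
import hashlib


def encoded_lengths(name: str) -> tuple[int, int]:
    """Return (hex length, Base64 length) of a digest for the named algorithm."""
    digest = hashlib.new(name, b"example").digest()
    return len(digest.hex()), len(base64.b64encode(digest))


for name in ("sha1", "sha256", "sha384", "sha512"):
    hex_len, b64_len = encoded_lengths(name)
    print(f"{name}: {hex_len} hex chars, {b64_len} Base64 chars")
```

The output length depends only on the algorithm and the encoding, never on the input.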
Hashes are not secure by themselves; anything that can be hashed can be broken. So in the security world you might think, OK, I need to run the hash multiple times to obscure it more, but that doesn't secure the information; it just means someone has to repeat that same process and iterate over the hash multiple times. If they know the hash algorithm you used (and you should assume they do), it's not secure. SHA-256 should be good enough for hashing information unless you are trying to secure the information. Hashes should never be used by themselves to secure information. Just because it isn't human readable does not make it secure.
If you want to secure something, use ColdFusion's encrypt functions and make sure you use a decent algorithm, like AES, because the default in ColdFusion is not secure. Then you need to use some entropic data from the information you're securing to ensure you have a unique encryption key that would be hard for someone to guess or find. Do not hard code a single key in your code; this will make it easy for someone to find and utilize a pattern in all of your encryptions.
Use something like bcrypt or scrypt for storing passwords. I know they are more work to put into use and require Java integration in ColdFusion, but they are much more secure ways of storing passwords. Remember that even with bcrypt or scrypt the information can be compromised given enough time and someone willing to put the effort into cracking it. Be paranoid when securing information.

Is it worth encrypting a password hash?

I was wondering if it is common practice to encrypt a password hash, and/or the salt, does it necessarily make it more secure or just increase the time it'd take to "guess" the password?
Thanks!
It's not common practice to encrypt a salted hash. It may slightly increase security but realistically it's not worth it, since you'd have to manage the key in some way, complicating the whole process. Using a salted hash with a secure hashing algorithm will be fine.
Generally, you don't need to encrypt a hash as long as you use a good cryptographic hash function. As for salt, salting is best done before encrypting, i.e. the salt does get encrypted. The exception would be one-time table, in which case you can easily salt afterwards. As for the third question, whole encryption is nothing but increasing the time it takes to "guess" the plaintext, exception being again one-time table. Now what's your concrete problem? Can you formulate it as a task in some concrete computer language?
If the hash was produced by a "good" algorithm, then it doesn't make any sense to cipher it, since you would be essentially ciphering something that in theory only the rightful user can generate.
Ciphering the salt doesn't add any kind of real security.

Client side encryption - best practice

I wrote a "Password Locker" C# app a while ago as an exercise in encryption. I'd like to move the data to the web so that I can access it anywhere without compromising my password data. I'd just like to run my ideas by the community to ensure I'm not making a mistake as I'm not an encryption expert.
Here's what I envision:
In the C# app all the password data is encrypted as a single chunk of text using a user supplied password. I'm using Rijndael (symmetric encryption) in CBC mode. The password is salted using a hard coded value.
Encrypted data gets sent to my database
I go to a web page on my server and download the encrypted text. Using client side javascript I input my password. The javascript will decrypt everything (still client side)
Here are my assumptions:
I assume that all transmissions can be intercepted
I assume that the javascript (which contains the decryption algo, and hard coded salt) can be intercepted (since it's really just on the web)
The password cannot be intercepted (since it's only input client side)
The result is that someone snooping could have everything except the password.
So, based on those assumptions: Is my data safe? I realize that my data is only as safe as the strength of my password... Is there something I can do to improve that? Is Rijndael decryption slow enough to prevent brute force attacks?
I thought about using a random salt value, but that would still need to be transmitted and because of that, it doesn't seem like it would be any safer. My preference is to not store the password in any form (hashed or otherwise) on the web.
Edit:
I am considering using SSL, so my "interception" assumptions may not be valid in that case.
Edit 2:
Based on comments from Joachim Isaksson, I will be running with SSL. Please continue breaking apart my assumptions!
Edit 3:
Based on comments from Nemo I will use salt on a per user basis. Also, I'm using PBKDF2 to derive a key based on passwords, so this is where I'll get my "slowness" to resist brute force attacks.
Without even going into the crypto analysis in any way, if you're assuming all your information can be intercepted (ie you're running without SSL), you're not secure.
Since anyone can intercept the Javascript, they can also change the Javascript to make the browser pass the clear text elsewhere once decrypted.
Also, anyone hacking into the site (or the site owner) can maliciously change the Javascript to do the same thing even if SSL is on.
By "password data", I assume you mean "password-protected data"?
The salt does need to be random. It is fine that it is transmitted in the clear. The purpose of a salt is protection against dictionary attacks. That is, should someone manage to obtain your entire encrypted database, they could quickly try a large dictionary of passwords against all of your users. With random salts, they need to try the dictionary against each user.
Or, alternatively, even without compromising the database, they could generate a huge collection of pre-encrypted data for lots of dictionary words, and immediately be able to recognize any known plaintext encrypted by any of those keys.
Even with a salt, dictionary attacks can be faster than you would like, so deriving key data from a password is a lot more subtle than most people realize.
Bottom line: As always, never invent your own cryptography, not even your own modes of operation. To derive an encryption key from a password, use a well-known standard like PBKDF2 (specified in PKCS #5).
Well, as this is an open question:
Issue #1
What are you going to do if the password that is supplied is incorrect, or if the salt/ciphertext is altered? You will get an incorrect decryption result, but how are you going to test that? What happens if just the last part of the ciphertext is altered? Or removed altogether?
Solution: Provide integrity protection against such attacks. Add an HMAC using a different key or use an authenticated mode like GCM.
Issue #2
What happens if you change or add a few bytes to the password (compare the encrypted store in time)?
Solution: Encrypt your key store with a different IV each time.
That's already 4 issues found :) Cryptography is hard.
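As a rough illustration of the integrity protection suggested under Issue #1, the sketch below appends an HMAC-SHA256 tag over the IV and ciphertext and checks it before any decryption is attempted. The encryption step itself is omitted (the Python standard library has no AES), so `ciphertext` is just opaque bytes here. The function names and the `iv + ciphertext + tag` layout are assumptions for the example, and `mac_key` is assumed to be a key separate from the encryption key.

```python
import hashlib
import hmac

TAG_LEN = 32  # HMAC-SHA256 output size in bytes


def add_tag(mac_key: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    """Return iv + ciphertext + tag, with the tag covering both IV and ciphertext."""
    tag = hmac.new(mac_key, iv + ciphertext, hashlib.sha256).digest()
    return iv + ciphertext + tag


def check_tag(mac_key: bytes, blob: bytes, iv_len: int = 16) -> tuple[bytes, bytes]:
    """Return (iv, ciphertext) if the tag verifies; raise ValueError otherwise."""
    if len(blob) < iv_len + TAG_LEN:
        raise ValueError("blob too short")
    body, tag = blob[:-TAG_LEN], blob[-TAG_LEN:]
    expected = hmac.new(mac_key, body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("wrong key or modified data")
    return body[:iv_len], body[iv_len:]
```

A wrong password (and therefore a wrong derived MAC key), an altered byte, or a truncated ciphertext all fail the check instead of silently producing garbage plaintext.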

Is it insecure to pass initialization vector and salt along with ciphertext?

I'm new to implementing encryption and am still learning basics, it seems.
I have need for symmetric encryption capabilities in my open source codebase. There are three components to this system:
A server that stores some user data, and information about whether or not it is encrypted, and how
A C# client that lets a user encrypt their data with a simple password when sending to the server, and decrypt with the same password when receiving
A JavaScript client that does the same and therefore must be compatible with the C# client's encryption method
Looking at various JavaScript libraries, I came across SJCL, which has a lovely demo page here: http://bitwiseshiftleft.github.com/sjcl/demo/
From this, it seems that what a client needs to know (besides the password used) in order to decrypt the ciphertext is:
The initialization vector
Any salt used on the password
The key size
Authentication strength (I'm not totally sure what this is)
Is it relatively safe to keep all of this data with the ciphertext? Keep in mind that this is an open source codebase, and there is no way I can reasonably hide these variables unless I ask the user to remember them (yeah, right).
Any advice appreciated.
Initialization vectors and salts are called such, and not keys, precisely because they need not be kept secret. It is safe, and customary, to encode such data along with the encrypted/hashed element.
What an IV or salt needs is to be used only once with a given key or password. For some algorithms (e.g. CBC encryption) there may be some additional requirements, fulfilled by choosing the IV randomly, with uniform probability and a cryptographically strong random number generator. However, confidentiality is not a needed property for an IV or salt.
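To show what keeping these values with the ciphertext can look like, here is a minimal sketch that packs the salt, IV, iteration count and ciphertext into one JSON string and reads them back. The field names are hypothetical and this is not SJCL's actual output format; the ciphertext is treated as opaque bytes produced elsewhere.

```python
import base64
import json


def pack(salt: bytes, iv: bytes, iterations: int, ciphertext: bytes) -> str:
    """Serialize the public parameters and the ciphertext into one JSON string."""
    return json.dumps({
        "salt": base64.b64encode(salt).decode("ascii"),
        "iv": base64.b64encode(iv).decode("ascii"),
        "iter": iterations,
        "ct": base64.b64encode(ciphertext).decode("ascii"),
    })


def unpack(text: str) -> tuple[bytes, bytes, int, bytes]:
    """Recover (salt, iv, iterations, ciphertext) from the JSON string."""
    data = json.loads(text)
    return (
        base64.b64decode(data["salt"]),
        base64.b64decode(data["iv"]),
        int(data["iter"]),
        base64.b64decode(data["ct"]),
    )
```

Only the password stays out of the envelope; everything else travels with the data so that any compatible client can decrypt it.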
Symmetric encryption is rarely enough to provide security; by itself, encryption protects against passive attacks, where the attacker observes but does not interfere. To protect against active attacks, you also need some kind of authentication. SJCL uses CCM or OCB2 encryption modes which combine encryption and authentication, so that's fine. The "authentication strength" is the length (in bits) of a field dedicated to authentication within the encrypted text; a strength of "64 bits" means that an attacker trying to alter a message has a maximum probability of 2^-64 to succeed in doing so without being detected by the authentication mechanism (and he cannot know whether he has succeeded without trying, i.e. having the altered message sent to someone who knows the key/password). That's enough for most purposes. A larger authentication strength implies a larger ciphertext, by (roughly) the same amount.
I have not looked at the implementation, but from the documentation it seems that the SJCL authors know their trade, and did things properly. I recommend using it.
Remember the usual caveats of passwords and Javascript:
Javascript is code which runs on the client side but is downloaded from the server. This requires that the download be integrity-protected in some way; otherwise, an attacker could inject some of his own code, for instance a simple patch which also logs a copy of the password entered by the user somewhere. In practice, this means that the SJCL code should be served across an SSL/TLS session (i.e. HTTPS).
Users are human beings and human beings are bad at choosing passwords. It is a limitation of the human brain. Moreover, computers keep getting more and more powerful while human brains keep getting more or less unchanged. This makes passwords increasingly weak towards dictionary attacks, i.e. exhaustive searches on passwords (the attacker tries to guess the user's password by trying "probable" passwords). A ciphertext produced by SJCL can be used in an offline dictionary attack: the attacker can "try" passwords on his own computers, without having to check them against your server, and he is limited only by his own computing abilities. SJCL includes some features to make offline dictionary attacks more difficult:
SJCL uses a salt, which prevents cost sharing (usually known as "precomputed tables", in particular "rainbow tables" which are a special kind of precomputed tables). At least the attacker will have to pay the full price of dictionary search for each attacked password.
SJCL uses the salt repeatedly, by hashing it with the password over and over in order to produce the key. This is what SJCL calls the "password strengthening factor". This makes the password-to-key transformation more expensive for the client, but also for the attacker, which is the point. Making the key transformation 1000 times longer means that the user will have to wait, maybe, half a second; but it also multiplies by 1000 the cost for the attacker.

How to implement password protection for individual files?

I'm writing a little desktop app that should be able to encrypt a data file and protect it with a password (i.e. one must enter the correct password to decrypt). I want the encrypted data file to be self-contained and portable, so the authentication has to be embedded in the file (or so I assume).
I have a strategy that appears workable and seems logical based on what I know (which is probably just enough to be dangerous), but I have no idea if it's actually a good design or not. So tell me: is this crazy? Is there a better/best way to do it?
Step 1: User enters plain-text password, e.g. "MyDifficultPassword"
Step 2: App hashes the user-password and uses that value as the symmetric key to encrypt/decrypt the data file. e.g. "MyDifficultPassword" --> "HashedUserPwdAndKey".
Step 3: App hashes the hashed value from step 2 and saves the new value in the data file header (i.e. the unencrypted part of the data file) and uses that value to validate the user's password. e.g. "HashedUserPwdAndKey" --> "HashedValueForAuthentication"
Basically I'm extrapolating from the common way to implement web-site passwords (when you're not using OpenID, that is), which is to store the (salted) hash of the user's password in your DB and never save the actual password. But since I use the hashed user password for the symmetric encryption key, I can't use the same value for authentication. So I hash it again, basically treating it just like another password, and save the doubly-hashed value in the data file. That way, I can take the file to another PC and decrypt it by simply entering my password.
So is this design reasonably secure, or hopelessly naive, or somewhere in between? Thanks!
EDIT: clarification and follow-up question re: Salt.
I thought the salt had to be kept secret to be useful, but your answers and links imply this is not the case. For example, this spec linked by erickson (below) says:
Thus, password-based key derivation as defined here is a function of a password, a salt, and an iteration count, where the latter two quantities need not be kept secret.
Does this mean that I could store the salt value in the same place/file as the hashed key and still be more secure than if I used no salt at all when hashing? How does that work?
A little more context: the encrypted file isn't meant to be shared with or decrypted by others, it's really single-user data. But I'd like to deploy it in a shared environment on computers I don't fully control (e.g. at work) and be able to migrate/move the data by simply copying the file (so I can use it at home, on different workstations, etc.).
Key Generation
I would recommend using a recognized algorithm such as PBKDF2 defined in PKCS #5 version 2.0 to generate a key from your password. It's similar to the algorithm you outline, but is capable of generating longer symmetric keys for use with AES. You should be able to find an open-source library that implements PBE key generators for different algorithms.
File Format
You might also consider using the Cryptographic Message Syntax as a format for your file. This will require some study on your part, but again there are existing libraries to use, and it opens up the possibility of inter-operating more smoothly with other software, like S/MIME-enabled mail clients.
Password Validation
Regarding your desire to store a hash of the password, if you use PBKDF2 to generate the key, you could use a standard password hashing algorithm (big salt, a thousand rounds of hashing) for that, and get different values.
Alternatively, you could compute a MAC on the content. A hash collision on a password is more likely to be useful to an attacker; a hash collision on the content is likely to be worthless. But it would serve to let a legitimate recipient know that the wrong password was used for decryption.
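A minimal sketch of the first option above, assuming PBKDF2-HMAC-SHA256 with two independent random salts: one derivation yields the encryption key, the other yields a check value that is stored in the unencrypted file header. The header layout, the function names and the iteration count are all assumptions made for the example.

```python
import hashlib
import hmac
import os
from typing import Optional

ITERATIONS = 100_000  # illustrative work factor, an assumption


def _derive(password: str, salt: bytes) -> bytes:
    """Derive 32 bytes from the password with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS, dklen=32)


def create_header(password: str) -> tuple[dict, bytes]:
    """Return (header to store in the clear, encryption key to keep in memory only)."""
    header = {"key_salt": os.urandom(16), "check_salt": os.urandom(16)}
    header["check"] = _derive(password, header["check_salt"])
    return header, _derive(password, header["key_salt"])


def unlock(password: str, header: dict) -> Optional[bytes]:
    """Return the encryption key if the password matches the header, else None."""
    check = _derive(password, header["check_salt"])
    if not hmac.compare_digest(check, header["check"]):
        return None
    return _derive(password, header["key_salt"])
```

Because the two salts differ, the stored check value is unrelated to the encryption key, and the file stays portable: the header carries everything except the password.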
Cryptographic Salt
Salt helps to thwart pre-computed dictionary attacks.
Suppose an attacker has a list of likely passwords. He can hash each and compare it to the hash of his victim's password, and see if it matches. If the list is large, this could take a long time. He doesn't want to spend that much time on his next target, so he records the result in a "dictionary" where a hash points to its corresponding input. If the list of passwords is very, very long, he can use techniques like a Rainbow Table to save some space.
However, suppose his next target salted their password. Even if the attacker knows what the salt is, his precomputed table is worthless—the salt changes the hash resulting from each password. He has to re-hash all of the passwords in his list, affixing the target's salt to the input. Every different salt requires a different dictionary, and if enough salts are used, the attacker won't have room to store dictionaries for them all. Trading space to save time is no longer an option; the attacker must fall back to hashing each password in his list for each target he wants to attack.
So, it's not necessary to keep the salt secret. Ensuring that the attacker doesn't have a pre-computed dictionary corresponding to that particular salt is sufficient.
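The effect can be seen in a few lines. In this sketch the attacker's precomputed dictionary finds an unsalted hash instantly, but is useless against a salted one even though the salt is known; he has to re-hash his whole list with that salt. Plain SHA-256 and the tiny word list are used only to keep the illustration short; they are assumptions of the example, not a recommendation for storing passwords.

```python
import hashlib
import os
from typing import Optional


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


wordlist = ["123456", "password", "letmein", "qwerty"]

# The attacker's precomputed dictionary: hash -> password (no salt involved).
table = {sha256_hex(word.encode()): word for word in wordlist}


def lookup(stored_hash: str) -> Optional[str]:
    """Instant lookup in the precomputed dictionary."""
    return table.get(stored_hash)


def crack_with_salt(salt: bytes, stored_hash: str) -> Optional[str]:
    """With a known salt, every candidate must be re-hashed for this target."""
    for word in wordlist:
        if sha256_hex(salt + word.encode()) == stored_hash:
            return word
    return None


unsalted = sha256_hex(b"letmein")
salt = os.urandom(16)  # stored in the clear next to the hash
salted = sha256_hex(salt + b"letmein")
```

Here `lookup(unsalted)` finds the password, `lookup(salted)` finds nothing, and `crack_with_salt(salt, salted)` succeeds only by redoing the work for this one salt.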
As Niyaz said, the approach sounds reasonable if you use a quality implementation of strong algorithms, like SHA-256 and AES for hashing and encryption. Additionally I would recommend using a salt to make it harder to build a dictionary of all password hashes.
Of course, reading Bruce Schneier's Applied Cryptography is never wrong either.
If you are using a strong hash algorithm (SHA-2) and a strong Encryption algorithm (AES), you will do fine with this approach.
Why not use a compression library that supports password-protected files? I've used a password-protected zip file containing XML content in the past :}
Is there really a need to save the hashed password into the file? Can't you just use the password (or hashed password) with some salt and then encrypt the file with it? When decrypting, just try to decrypt the file with the password + salt. If the user gives the wrong password, the decrypted file isn't correct.
The only drawback I can think of is that if the user accidentally enters the wrong password and the decryption is slow, he has to wait to try again. And of course, if the password is forgotten there's no way to decrypt the file.

Resources