remotely stored password protected private key - encryption

I was wondering if there is a protocol for this.
The idea is to have some encrypted data on a server.
Now I would like to find a protocol that fulfills the following requirements.
Since actively managing keys is beyond what most users are willing to put up with, a password protected key is stored on the server.
The server should not be able to decrypt the data. I.e. can not learn the password
Third parties should not be able to access the password protected key because they don't know the password.
There should only be one password. The problem is trivial to solve with 2 passwords. One to download the protected key and one to unprotect the key
The solution might involve some kind of zero-knowledge proof of knowledge of the password.

There is no problem generating multiple keys and authentication hashes from the same password. You shouldn't be sending the real password to the server in any case. You want to send it through PBKDF2 first. My usual recommendation for authentication looks like this:
salt = static-per-app-token + username
key = PBKDF2(salt, password, > 10000 iterations)
send key to server
on server: finalkey = SHA512(key) // This step keeps someone who steals your auth database from logging in
compare finalkey to database
Given that procedure, you can just modify the PBKDF2 step to generate a key twice as long as you need. The top X bytes are for authenticating, the bottom X bytes are for decryption. PKBDF2 can create as many bytes as you want, and having half the bytes doesn't help you figure out the other half.
For your specific problem, this is probably the most convenient and best performing. You could also do as you suggest, and use two different salts for PBKDF2. That is also fine. It just takes twice as long to compute. If that's not a problem, then it's a very simple solution and may be even more convenient depending on your code.
But just to take the discussion another step, let's say you didn't have a password. You had a (random) key, and you wanted to create independent keys from that. Or you have the output of PBKDF2 already, and want to expand it into more keys. In that case, you can expand an key using HKDF. The process is simple (|| means "concatenate"):
prk = psuedorandom key (your input key)
info = some-random-token || username
T(0) = empty string (zero length)
T(1) = HMAC-Hash(PRK, T(0) | info | 0x01)
T(2) = HMAC-Hash(PRK, T(1) | info | 0x02)
T(3) = HMAC-Hash(PRK, T(2) | info | 0x03) ...
You keep building T(n) until you have enough bytes for all your keys. You glue all the T(n) results together, and slice it up to get your keys out. This is much faster than PBKDF2, and can be applied to the output of PBKDF2 if you want.
Take a look at the RNCryptor v4 spec for an example of a process for expanding a single password into a bunch of different keys and random values.
Passwords are still horrible, and none of this can fix badly chosen passwords, but that at least stacks the cards in your favor if the password is reasonably hard to guess.

Related

symmetric AES enryption concept

i have a project for a website, running on Django. One function of it needs to store user/password for a third party website. So it needs to be symmetric encryption, as it needs to use these credentials in an automated process.
Storing credentials is never a good idea, I know, but for this case there is no other option.
My idea so far is, to create a Django app, that will save and use these passwords, and do nothing else. With this I can have 2 "webservers" that will not receive any request from outside, but only get tasking via redis or something. Therefore I can isolate them to some degree (they are the only servers who will have access to this extra db, they will not handle any web request, etc)
First question: Does this plan sound solid or is there a major flaw?
Second question is about the encryption itself:
AES requires an encryption key for all its work, ok that needs to be "secured" in some way. But I am more interested in the IV.
Every user can have one or more credential sets saved in the extra db. Would it be a good idea to use some hash of sort over the user id or something to generate a per user custom IV? Most of the time I see IV to be just random generated. But then I will have to also store them somewhere in addition to the key.
For me it gets a bit confusing here. I need key and IV to decrypt, but I would "store" them the same way. So wouldn't it be likely if one get compromised, that also the IV will be? Would it then make any difference if I generate the IV on the fly over a known procedure? Problem then, everyone could know the IV if they know their user id, as the code will be open source....
In the end, I need some direction guidance as how to handle key and best unique IV per user. Thank you very much for reading so far :-)
Does this plan sound solid or is there a major flaw?
The need to store use credentials is imho a flaw by design, at least we all appreciate you are aware of it.
Having a separate credential service with dedicated datastore seems to be best you can do under stated conditions. I don't like the option to store user credentials, but let's skip academic discussion to practical things.
AES requires an encryption key for all its work, ok that needs to be "secured" in some way.
Yes, there's the whole problem.
to generate a per user custom IV?
IV allows reusing the same key for multiple encryptions, so effectively it needs to be unique for each ciphertext (if a user has multiple passwords, you need an IV for each password). Very commonly IV is prepended to the ciphertext as it is needed to decrypt it.
Would it then make any difference if I generate the IV on the fly over a known procedure?
IV doesn't need to be secret itself.
Some encryption modes require the IV to be unpredictable (e.g. CBC mode), therefore it's best if you generate the IV as random. There are some modes that use IV as a counter to encrypt/decrypt only part of data (such as CTR or OFB), but still it is required the IV is unique for each key and encryption.

Encryption of user name

I need to encrypt user names that i receive from an external partners SSO. This needs to be done because the user names are assigned to school children. But we still need to be able to track each individual to prevent abuse of our systems, so we have decided to encrypt the user names in our logs etc.
This way, a breach of our systems will not compromise the identity of the children.
Heres my predicament. I have very limited knowledge in this area, so i am looking for advice on which algorithm to use.
I was thinking of using an asymmetrical algorithm, like PGP, and throwing away one of the keys so that we will not be able to decrypt the user name.
My questions:
Does PGP encryption always provide the same output given the same input?
Is PGP a good choice for this, or should we use an other algorithm?
Does anyone have a better suggestion for achieving the same thing - anonymization of the user
If you want a one-way function, you don't want encryption. You want hashing. The easiest thing to do is to use a hash like SHA-256. I recommend salting the username before hashing. In this case, I would probably pick a static salt like edu.myschoolname: and put that in front of the username. Then run that through SHA-256. Convert the result to Base-64 or hex encoding, and use the resulting string as the "username."
From a unix command line, this would look like:
$ echo -n "edu.myschoolname:robnapier#myschoolname.edu" | shasum -a 256
09356cf6df6aea20717a346668a1aad986966b192ff2d54244802ecc78f964e3 -
That output is unique to that input string (technically it's not "unique" but you will never find a collision, by accident or by searching). And that output is stable, in that it will always be the same for the given input. (I believe that PGP includes some randomization; if it doesn't, it should.)
(Regarding comments below)
Cryptographic hash algorithms are extremely secure for their purposes. Non-cryptographic hash algorithms are not secure (but also aren't meant to be). There are no major attacks I know of against SHA-2 (which includes SHA-256 and SHA-512).
You're correct that your system needs to be robust against someone with access to the code. If they know what userid they're looking for, however, no system will be resistant to them discovering the masked version of that id. If you encrypt, an attacker with access to the key can just encrypt the value themselves to figure out what it is.
But if you're protecting against the reverse: preventing attackers from determining the id when they do not already know the id they're looking for, the correct solution is a cryptographic hash, specifically SHA-256 or SHA-512. Using PGP to create a one-way function is using a cryptographic primitive for something it is not built to do, and that's always a mistake. If you want a one-way function, you want a hash.
I think that PGP is a good Idea, but risk to make usernames hard to memorize, why not simply make a list of usernames composed with user + OrderedNumbers where user can be wichever word you want and oredered number is a 4-5 digit number ordered by birth date of childrens?Once you have done this you only have to keep a list where the usernames are linked wit the corresponding child abd then you can encript this "nice to have" list with a key only you know.

Repeatedly encrypt(random padding + const int32) - compromise secret?

Will any encryption scheme safely allow me to encrypt the same integer repeatedly, with different random material prepended each time? It seems like the kind of operation that might get me in hot water.
I want to prevent spidering of items at my web application, but still have persistent item IDs/URLs so content links don't expire over time. My security requirements aren't really high for this, but I'd rather not do something totally moronic that obviously compromises the secret.
// performed on each ID before transmitting item search results to the client
public int64 encryptWithRandomPadding(int32 id) {
int32 randomPadding = getNextRandomInt32();
return encrypt(((int64)randomPadding << 32) + id), SECRET);
}
// performed on an encrypted/padded ID for which the client requests details
public int32 decryptAndRemoveRandomPadding(int64 idToDecrypt) {
int64 idWithPadding = decrypt(idToDecrypt, SECRET);
return (int32)idWithPadding;
}
static readonly string SECRET = "thesecret";
Generated IDs/URLs are permanent, the encrypted IDs are sparsely populated (less than 1 in uint32.Max are unique, and I could add another constant padding to reduce the likelyhood of a guess existing), and the client may run the same search and get the same results with different representative IDs each time. I think it meets my requirements, unless there's a blatant cryptographic issue.
Example:
encrypt(rndA + item1) -> tokenA
encrypt(rndB + item1) -> tokenB
encrypt(rndC + item2) -> tokenC
encrypt(rndD + item175) -> tokenD
Here, there is no way to identify that tokenA and tokenB both point to identical items; this prevents a spider from removing duplicate search results without retrieving them (while retrieving increments the usage meter). Additionally, item2 may not exist.
Knowing that re-running a search will return the same int32 padded multiple ways with the same secret, can I do this safely with any popular crypto algorithms? Thanks, crypto experts!
note: this is a follow-up to a question that didn't work out as I'd hoped: Encrypt integer with a secret and shared salt
If your encryption is secure, then random padding makes cracking neither easier nor harder. For a message this short, a single block long, either everything is compromised or nothing is. Even with a stream cipher, you'd still need the key to get any further; the point of good encryption is that you don't need extra randomness. Zero padding or other known messages at least a block long at the beginning are obviously to be avoided if possible, but that's not the issue here. It's pure noise, and once someone discovered that, they'd just skip ahead and start cracking from there.
Now, in a stream cipher, you can add all the randomness in the beginning and the later bytes will still be the same with the same key, don't forget that. This only actually does anything at all for a block cipher, otherwise you'd have to xor the random bits into the real value to get any use out of it.
However, you might be better off using a MAC as padding: with proper encryption, the encrypted mac won't give any information away, but it looks semi-randomish and you can use it to verify that there were no errors or malicious attacks during decryption. Any hash function you like can create the MAC, even a simple CRC-32, without giving anything away after encryption.
(A cryptographer might find a way to shave a bit or two off due to the relatedness, will tons of plaintexts if they knew beforehand how they were related, but that's still far beyond practicality.)
As you asked before, you can safely throw in an unecrypted salt in front of every message; a salt can only compromise an encrypted value if the implementation is broken or the key compromised, as long as the salt is properly mixed into the key, particularly if you can mix it into the expanded key schedule before decryption. Modern hash algorithms with lots of bits are really good at that, but even mixing into a regular input key will always have the same security as the key alone.

Crypting with a weak secret key using AES

I am using AES to encrypt some data, the problem is that I have to use a key that contains only 4 digits (like pin code), so anyone can loop 9999 times to find my key and decrypt my text. The data I am encrypting here is an SMS.
Is the any idea to avoid this?
No, there isn't. You can add salts and iteration counts to a PBKDF all you want, but in the end the attacker only has 10K tries to go through, and that's peanuts.
The only sensible way to do this is to have a separate entity that performs the decryption. It can add secret entropy of its own to the key seed, and use a strong key. The entity would then place restrictions on the authentication with the PIN.
You might want to take a good look at your system's security architecture and see if you can change something to avoid this problem (access control, other login credentials etc. etc.).
Edit: Removed my comment about adding a salt, everyone who pointed this out was correct. You could perhaps increase the time complexity of decryption, such that a brute-force attack would take a prohibitively long time.
Edit: read this: https://security.stackexchange.com/questions/6719/how-would-you-store-a-4-digit-pin-code-securely-in-the-database
You can take the same aproach as ATM machines: after someone enters an incorrect PIN three times, that account is temprorarily invalid (you can also set a along time-out) and that user will have to undertake some kind of action (e.g. click a confirmation link in an e-mail) in order to reactive his/her account.
You'll also have to salt the PIN with an unique property of that user (preferably a string that was randomly generated when that user was registered). I also recommend adding an additional salt to all hashes that is either hard-coded or read from a config file (usefull in case your database is compromised but the rest isn't).
This approach still leaves you vulnerable to an attack where someone chooses a single PIN and brute-forces usernames. You can take some countermeasures to this by applying the same policy to IP-adresses, but that's still far from optimal.
EDIT: If your goal is to encrypt traffic rather than to hash PIN's, you should use HTTPS or another protocol based on public-key cryptography, that way you won't have to use your PIN for encrypting these SMS's.
assuming that you can only enter 4 digits, pad the keylength in the application with either phone number of sender or something like that?

Passwords and different types of encryption

I know, I know, similar questions have been asked millions and billions of times already, but since most of them got a different flavor, I got one of my own.
Currently I'm working on a website that is meant to be launched all across my country, therefore, needs some kind of protection for user system.
I've been lately reading alot about password encryption, hashing, salting.. you name it, but after reading that much of articles, I get confused.
One says that plain SHA512 encryption is enough for a password, others say that you have to use "salt" no matter what you would do, and then there are guys who say that you should build a whole new machine for password encryption because that way no one will be able to get it.
For now I'm using hash_hmac(); with SHA512, plus, password gets random SHA1 salt and the last part, defined random md5(); key. For most of us it'll sound secure, but is it?
I recently read here on SO, that bcrypt(); (now known as crypt(); with Blowfish hashing) is the most secure way. After reading PHP manual about crypt(); and associated stuff, I'm confused.
Basicly, the question is, will my hash_hmac(); beat the hell out of Blowfished crypt(); or vice-versa?
And one more, maybe there are more secure options for password hashing?
The key to proper application of cryptography is to define with enough precision what properties you are after.
Usually, when someone wants to hash passwords, it is in the following context: a server is authenticating users; users show their password, through a confidential channel (HTTPS...). Thus, the server must store user passwords, or at least store something which can be used to verify a password. We do not want to store the passwords "as is" because an attacker gaining read access to the server database would then learn all passwords. This is our attack model.
A password is something which fits in the brain of the average user, hence it cannot be fully unguessable. A few users will choose very long passwords with high entropy, but most will select passwords with an entropy no higher than, say, 32 bits. This is a way of saying that an attacker will have to "try" on average less than 231 (about 2 billions) potential passwords before finding the right one.
Whatever the server stores, it is sufficient to verify a password; hence, our attacker has all the data needed to try passwords, limited only by the computing power he can muster. This is known as an offline dictionary attack.
One must assume that our attacker can crack one password. At that point we may hope for two properties:
cracking a single password should be difficult (a matter of days or weeks, rather than seconds);
cracking two passwords should be twice as hard as cracking one.
Those two properties call for distinct countermeasures, which can be combined.
1. Slow hash
Hash functions are fast. Computing power is cheap. As a data point, with SHA-1 as hash function, and a 130$ NVidia graphic card, I can hash 160 millions passwords per second. The 231 cost is paid in about 13 seconds. SHA-1 is thus too fast for security.
On the other hand, the user will not see any difference between being authenticated in 1µs, and being authenticated in 1ms. So the trick here is to warp the hash function in a way which makes it slow.
For instance, given a hash function H, use another hash function H' defined as:
H'(x) = H(x || x || x || ... || x)
where '||' means concatenation. In plain words, repeat the input enough times so that computing the H' function takes some non-negligible time. So you set a timing target, e.g. 1ms, and adjust the number of repetitions needed to reach that target. 10ms means that your server will be able to authenticate 10 users per second at the cost of only 10% of its computing power. Note that we are talking about a server storing a hashed password for its own ulterior usage, hence there is no interoperability issue here: each server can use a specific repetition count, tailored for its power.
Suppose now that the attacker can have 100 times your computing power; e.g. the attacker is a bored student -- the nemesis of many security systems -- and can use dozens of computers across his university campus. Also, the attacker may use a more thoroughly optimized implementation of the hash function H (you are talking about PHP but the attacker can do assembly). Moreover, the attacker is patient: users cannot wait for more than a fraction of a second, but a sufficiently bored student may try for several days. Yet, trying 2 billions passwords will still require about 3 full days worth of computing. This is not ultimately secure, but is much better than 13 seconds on a single cheap PC.
2. Salts
A salt is a piece of public data which you hash with the password in order to prevent sharing.
"Sharing" is what happens when the attacker can reuse his hashing efforts over several attacked passwords. This is what happens when the attacker has several hashed passwords (he read the whole database of hashed passwords): whenever he hashes one potential password, he can look it up against all hashed passwords he is trying to attack. We call that a parallel dictionary attack. Another instance of sharing is when the attacker can build a precomputed table of hashed passwords, and then use his table repeatedly (by simple lookups). The fabled rainbow table is just a special case of a precomputed table (that's just a time-memory trade-off which allows for using a precomputed table much bigger than what would fit on a hard disk; but building the table still requires hashing each potential password). Space-time wise, parallel attacks and precomputed tables are the same attack.
Salting defeats sharing. The salt is a public data element which alters the hashing process (one could say that the salt selects the hash function among a whole set of distinct functions). The point of the salt is that it is unique for each password. The attacker can no longer share cracking efforts because any precomputed table would have to use a specific salt and would be useless against a password hashed with a distinct salt.
The salt must be used to verify a password, hence the server must store, for each hashed password, the salt value which was used to hash that password. In a database, that's just an extra column. Or you could concatenate the salt and the hash password in a single blob; that's just a matter of data encoding and it is up to you.
Assuming S to be the salt (i.e. some bytes), the hashing process for password p is: H'(S||p) (with the H' function defined in the previous section). That's it!
The point of the salt is to be, as much as possible, unique to each hashed password. A simple way to achieve that is to use random salts: whenever a password is created or changed, use a random generator to get 16 random bytes. 16 bytes ought to be enough to make salt reuse highly improbable. Note that the salt should be unique for each password: using the user name as a salt is not sufficient (some distinct server instances may have users with the same name -- how many "bob"s exist out there ? -- and, also, some users change their password, and the new password should not use the same salt than the previous password).
3. Choice of hash function
The H' hash function is built over a hash function H. Some traditional implementations have used encryption algorithms twisted into hash functions (e.g. DES for Unix's crypt()). This has promoted the use of the "encrypted password" expression, although it is not proper (the password is not encrypted because there is no decryption process; the correct term is "hashed password"). It seems safer, however, to use a real hash function, designed for the purpose of hashing.
The most used hash functions are: MD5, SHA-1, SHA-256, SHA-512 (the latter two are collectively known as "SHA-2"). Some weaknesses have been found in MD5 and SHA-1. Those weaknesses have serious impact for some usages, but not for what is described above (the weaknesses are about collisions, whereas we work here on preimage resistance). However, it is better public relations to choose SHA-256 or SHA-512: if you use MD5 or SHA-1, you may have to justify yourself. SHA-256 and SHA-512 differ by their output size and performance (on some systems, SHA-256 is much faster than SHA-512, and on others SHA-512 is faster than SHA-256). However, performance is not an issue here (regardless of the hash function intrinsic speed, we make it much slower through input repetitions), and the 256 bits of SHA-256 output are more than enough. Truncating the hash function output to the first n bits, in order to save on storage costs, is cryptographically valid, as long as you keep at least 128 bits (n >= 128).
4. Conclusion
Whenever you create or modify a password, generate a new random salt S (16 bytes). Then hash the password p as SHA-256(S||p||S||p||S||p||...||S||p) where the 'S||p' pattern is repeated enough times to that the hashing process takes 10ms. Store both S and the hash result. To verify a user password, retrieve S, recompute the hash, and compare it with the stored value.
And you will live longer and happier.
This question raises multiple points, each of which need to be addressed individually.
Firstly you should not engineer your own encryption algorithm. The argument that something is secure because it is not mainstream is completely invalid. Any algorithm you might develop will only be as strong as your understanding of cryptography.
The average developer does not have a grasp on the mathematical concepts necessary to create a strong algorithm, should your application be compromised, then your completely untested algorithm will be the only thing standing between an attacker and your users personal information, and a suitably motivated attacker will probably defeat your custom encryption much faster than they could had you used a time tested algorithm.
Using a salt is a very good idea. Because the hash is generated using both the salt and password value, a brute force attack on the hashed data becomes excessively expensive because the dictionary of hashed passwords used by an attacker would not take into account the salt value used when generating the hashes.
I'm not the most qualified person to comment on algorithm selection, so I'll leave that to somebody else.
I'm not a PHP developer, but I have some experience with encryption. My first recommendation is as Crippledsmurf suggested, absolutely don't try to "roll your own" encryption. It will have disaster written all over it.
You say you're using hash_hmac() currently. If you're just protecting user accounts and some basic information (name, address, email etc.) and not anything important such as SSN, credit cards, I think you're safe to stick with what you have.
With encryption we'd all like the most secure, complex vault to secure our stuff, but the question is, why have a huge safe door to protect things no-one would realistically want? You have to balance the type and strength of encryption you use against what you are protecting and the risk of it being taken.
Currently, if you are encrypting your information, even at a basic level, you already beat the hell out of 90% of sites and applications out there - who still store in plain text. You're using a salt (excellent idea) and you're making it extremely difficult to decrypt the information (the md5 key is good).
Make a call - is this worth protecting further. If not, don't waste your time and move on.

Resources