We are considering using email to transmit PDFs containing personal health-related information ("PHI"). There is nothing of commercial value, no social security numbers or credit card numbers, or anything like that in these documents. Only recommendations for treatment of medical conditions.
The PDFs would be password-encrypted using Adobe Acrobat Pro's 256-bit password encryption.
Using very long passwords is not logistically desirable because the recipient of the emails with PDF attachment is the patient, not a technical person. We want to make the password easy-to-type, and yet not so short that any desktop PC has the CPU capacity to crack it in a few minutes.
If a password does not use any dictionary words but is simply a four-character random ASCII alphanumeric string, like DT4K (alphas all uppercase, not mixed), how long would it take a typical desktop business or home computer with no specialized hardware to crack the encryption? Does going to 5 characters significantly increase the cracking time?
Short answer: no, and no.
Longer answer: alphanumeric means A-Za-z0-9, right? That's 62 possible characters, or 5.95 bits of entropy per character. Since entropy is additive, 4 characters are roughly 24 bits, and 5 are about 30. (If the alphabet is only uppercase letters and digits, as in your DT4K example, that's 36 characters, or about 5.17 bits each, which is even less.) To put that into perspective, 10 bits mean the attacker has to try about a thousand possible keys, 20 bits about a million, 30 bits about a billion. That's almost nothing these days. 56-bit DES was cracked by brute force in 1998; today people worry that 128-bit AES might not be safe enough.
If I were you, I'd try to use something like Diceware. That's a list of 7776 easily pronounced words. You can use a random number generator to pick a passphrase from these words, and each word will have about 12.9 bits of entropy. So 5 words are about 65 bits, which for the kind of data you have might be an acceptable level of security, while being easily remembered or communicated via phone.
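The arithmetic can be checked with a few lines of Python. This is only a sketch: the function name is made up for illustration, and the 36-symbol row is an added assumption that reflects the uppercase-plus-digits alphabet described in the question.

```python
import math

def entropy_bits(alphabet_size: int, length: int) -> float:
    """Entropy in bits of a string whose characters are chosen uniformly at random."""
    return length * math.log2(alphabet_size)

for alphabet_size, label in [(62, "A-Za-z0-9"), (36, "A-Z0-9")]:
    for length in (4, 5):
        bits = entropy_bits(alphabet_size, length)
        candidates = alphabet_size ** length  # size of the brute-force search space
        print(f"{label:10} length {length}: {bits:4.1f} bits, {candidates:,} candidates")
```

Each extra character multiplies the search space by the alphabet size, which is why going from 4 to 5 characters adds only about 6 bits.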
Why 7776 words? Well, 7776 is 6*6*6*6*6, so you can roll a die five times and get a number, and just look up the corresponding word on the list.
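A rough sketch of the dice-to-word lookup, assuming you have some 7776-entry word list at hand. The placeholder list below is made up so the example runs; it is not a real Diceware list, and the function names are illustrative.

```python
import math
import secrets

WORDS_IN_LIST = 6 ** 5  # 7776 entries, one per sequence of five dice rolls

def roll_die() -> int:
    """Return a uniformly random die face, 1..6, from the OS random source."""
    return secrets.randbelow(6) + 1

def rolls_to_index(rolls: list[int]) -> int:
    """Map five die faces (each 1..6) to a list index 0..7775, reading them as base-6 digits."""
    if len(rolls) != 5 or any(r < 1 or r > 6 for r in rolls):
        raise ValueError("expected five rolls, each between 1 and 6")
    index = 0
    for r in rolls:
        index = index * 6 + (r - 1)
    return index

def pick_passphrase(wordlist: list[str], n_words: int = 5) -> str:
    """Pick n_words words, each selected by five simulated dice rolls."""
    if len(wordlist) != WORDS_IN_LIST:
        raise ValueError("a diceware-style list needs exactly 7776 words")
    words = [wordlist[rolls_to_index([roll_die() for _ in range(5)])]
             for _ in range(n_words)]
    return " ".join(words)

# Placeholder list (hypothetical) so the sketch runs; substitute a real 7776-word list.
placeholder_words = [f"word{i:04d}" for i in range(WORDS_IN_LIST)]
print(pick_passphrase(placeholder_words))
print(f"{5 * math.log2(WORDS_IN_LIST):.1f} bits of entropy for five words")
```

Real dice work just as well as the simulated ones here; the point is only that the words are chosen at random rather than by a person.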
My bank sends statements encrypted and uses a combination of my name and birth date. I'm not a huge fan of that idea, but provided you use information that's unlikely to be known to an attacker you'll get a greater level of security than from four or five character alphanumeric passwords.
This would take less than 25 seconds even with the most rudimentary tools. There are precompiled rainbow tables for passwords this short that can run in seconds on decent PCs. Password length, NOT complexity, is what makes a password difficult to crack. I would highly recommend giving them a longer password, but make it something easily recalled. Maybe your entire business name salted with your street address number at the end. Please take at least some precautions. Having a four-character password is barely better than not having one at all.
How Strong is your Password?
I have pairs of email addresses and hashes, can you tell what's being used to create them?
aaaaaaa#aaaaa.com
BeRs114JrR0sBpueyEmnOWZfnLuigYTA
and
aaaaaaaaaaaaa.bbbbbbbbbbbb#cccccccccccc.com
4KoujQHr3N2wHWBLQBy%2b26t8GgVRTqSEmKduST9BqPYV6wBZF4IfebJS%2fxYVvIvR
and
r.r#a.com
819kwGAcTsMw3DndEVzu%2fA%3d%3d
First, the obvious even if you know nothing about cryptography: the percent signs are URL encoding; decoding that gives
BeRs114JrR0sBpueyEmnOWZfnLuigYTA
4KoujQHr3N2wHWBLQBy+26t8GgVRTqSEmKduST9BqPYV6wBZF4IfebJS/xYVvIvR
819kwGAcTsMw3DndEVzu/A==
And that in turn is base64. The lengths of the encodings wrt the length of the original strings are
plaintext (chars)   encoding (bytes)
17                  24
43                  48
10                  16
More samples would give more confidence, but it's fairly clear that the encoding pads the plaintext to a multiple of 8 bytes. That suggests a block cipher (it can't be a hash since a hash would be fixed-size). The de facto standard block algorithm is AES, which uses 16-byte blocks; 24 is not a multiple of 16, so that's out. The most common block algorithm with a block size of 8 (which fits the data) is DES; 3DES or Blowfish or something even rarer is also a possibility, but DES is what I'd put my money on.
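The decoding steps can be reproduced with the Python standard library. This is a small sketch using the three tokens from the question; the helper name is made up for illustration.

```python
import base64
from urllib.parse import unquote

# The three tokens exactly as posted in the question (still URL-encoded).
tokens = [
    "BeRs114JrR0sBpueyEmnOWZfnLuigYTA",
    "4KoujQHr3N2wHWBLQBy%2b26t8GgVRTqSEmKduST9BqPYV6wBZF4IfebJS%2fxYVvIvR",
    "819kwGAcTsMw3DndEVzu%2fA%3d%3d",
]

def decoded_length(token: str) -> int:
    """Undo the URL encoding, then the base64, and return the raw byte count."""
    return len(base64.b64decode(unquote(token)))

lengths = [decoded_length(token) for token in tokens]
for n in lengths:
    print(f"{n:3d} bytes  multiple of 8: {n % 8 == 0}  multiple of 16: {n % 16 == 0}")
```

Every length is a multiple of 8 but not every one is a multiple of 16, which is what rules out a 16-byte block size.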
Since it's a cipher, there must be a key somewhere. It might be in a configuration file, or hard-coded in the source code. If all you have is the binary, you should be able to locate it with the help of a debugger. With DES, you could find the key by brute force (because a key is only 56 bits and that's doable by renting a bit of CPU time on Amazon) but finding it in the program would be easier.
If you want to reproduce the algorithm then you'll also need to figure out the mode of operation. Here one clue is that the encoding is never more than 7 bytes longer than the plaintext, so there's no room for an initialization vector. If the developers who made that software did a horrible job they might have used ECB. If they did a slightly less horrible job they might have used CBC or (much less likely) some other mode with a constant IV. If they did a slightly less horrible job again, then the IV may be derived from some other characteristic of the account. You can refine the analysis by testing some patterns:
If the encoding of abcdefghabcdefgh#example.com (starting with two identical 8-byte blocks) starts with two identical 8-byte blocks, it's ECB.
If the encoding of abcdefgh1#example.com and abcdefgh2#example.com (differing at the 9th character) have identical first blocks, it's CBC (probably) with a constant IV.
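Once you have the decoded ciphertexts for those test addresses, the two checks are simple block comparisons. A minimal sketch, assuming an 8-byte block size; the function names and the byte strings in the usage lines are invented for illustration and are not real output from the system in question.

```python
def split_blocks(data: bytes, block_size: int = 8) -> list[bytes]:
    """Cut a ciphertext into consecutive blocks of block_size bytes."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]

def looks_like_ecb(ciphertext: bytes, block_size: int = 8) -> bool:
    """True if the first two blocks are identical (plaintext began with two identical blocks)."""
    parts = split_blocks(ciphertext, block_size)
    return len(parts) >= 2 and parts[0] == parts[1]

def shares_first_block(ct_a: bytes, ct_b: bytes, block_size: int = 8) -> bool:
    """True if two ciphertexts start with the same block (hints at ECB, or CBC with a constant IV)."""
    return len(ct_a) >= block_size and ct_a[:block_size] == ct_b[:block_size]

# Synthetic stand-ins for decoded ciphertexts, only to show how the helpers are called.
fake_ecb_output = b"\x11" * 8 + b"\x11" * 8 + b"\x22" * 8
fake_other_output = b"\x11" * 8 + b"\x33" * 8 + b"\x22" * 8
print(looks_like_ecb(fake_ecb_output), looks_like_ecb(fake_other_output))
```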
Another thing you'll need to figure out is the padding mode. There are a few common ones. That's a bit harder to figure out as a black box except with ECB.
There are some tools online, and also some open source projects. For example:
https://code.google.com/archive/p/hash-identifier/
http://www.insidepro.com/
Which is the more secure method of storing passwords? I lack the mathematical background to determine the answer myself.
For the sake of argument, please assume that all passwords and usernames generated for each of the following methods are random, exactly six characters long, drawn from letters, digits and special characters, and that each method uses the same hashing algorithm and the same number of passes.
The standard way. UserName stored in plain text and only the password is to be discovered. Hash(PlaintextPassword + UniqueRecordSalt) = Password stored in DB.
One field recognized as LoginInfo = Hash(Encryption(UserName, Password) + Shared Salt). Neither the UserName nor the Password are ever stored in any other format EVER.
Does the forced cross attempting of username/password combinations offset the weakness of a shared salt as opposed to a unique record salt? This is of course completely IGNORING all effects on usability and focusing entirely on security.
Can anyone point me to any software to help me answer this question myself since I lack the cryptography and mathematical knowledge to arrive at the answer myself?
Please feel free to move this to a more appropriate forum. I didn't know where else to put it. However, I don't feel that it is a topic irrelevant to programmers overall doing their everyday job.
Please read How to securely hash passwords? first. To summarize:
Never use a single pass of any hashing algorithm.
Never roll your own, which is what your example 2 is (and example 1 as well, if + means concatenation).
Username stored in the clear
Salt generated per user, 8-16 random bytes, stored in the clear
in pure binary or encoded into Base64 or hex or whatever you like.
Use BCrypt, SCrypt, or PBKDF2
Until some time after the results of the Password Hashing Competition, at least.
Use as high a work factor/cost/iteration count as your CPUs can handle during expected future peak times.
For PBKDF2 in particular, do not ask for more binary output bytes than the native hash produces. I would say not less than 20 binary bytes, regardless.
SHA-1: output = 20 bytes (40 hex digits)
SHA-224: 20 bytes <= output <= 28 bytes (56 hex digits)
SHA-256: 20 bytes <= output <= 32 bytes (64 hex digits)
SHA-384: 20 bytes <= output <= 48 bytes (96 hex digits)
SHA-512: 20 bytes <= output <= 64 bytes (128 hex digits)
For PBKDF2 in particular, SHA-384 and SHA-512 have a comparative advantage on 64-bit systems for the moment, as the 2014-vintage GPUs many attackers will use have a smaller margin of advantage over your defensive CPUs for 64-bit operations than they would for 32-bit operations.
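As one possible illustration of the points above (per-user random salt, PBKDF2, output no longer than the native hash size), here is a minimal sketch using Python's standard library. The iteration count is an assumption and should be tuned to your own hardware; the function names are made up for the example.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

HASH_NAME = "sha512"   # native output is 64 bytes
OUTPUT_BYTES = 64      # do not ask for more than the native output size
SALT_BYTES = 16        # per-user salt, stored in the clear next to the hash
ITERATIONS = 200_000   # assumption: raise this as far as your servers can bear

def hash_password(password: str, salt: Optional[bytes] = None,
                  iterations: int = ITERATIONS) -> Tuple[bytes, bytes]:
    """Return (salt, derived_key); a fresh random salt is generated when none is given."""
    if salt is None:
        salt = os.urandom(SALT_BYTES)
    derived = hashlib.pbkdf2_hmac(HASH_NAME, password.encode("utf-8"),
                                  salt, iterations, dklen=OUTPUT_BYTES)
    return salt, derived

def verify_password(password: str, salt: bytes, expected: bytes,
                    iterations: int = ITERATIONS) -> bool:
    """Recompute the derived key and compare it in constant time."""
    _, derived = hash_password(password, salt, iterations)
    return hmac.compare_digest(derived, expected)
```

The salt, the iteration count and the derived key all need to be stored with the user record so the check can be repeated at login.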
If you want an example, then perhaps look at PHP source code, in particular the password_hash() and password_verify() functions, per the PHP.net Password Hashing FAQ.
Alternately, I have a variety of (currently very crude) password hashing examples at my github repositories. Right now it's almost entirely PBKDF2, but I will be adding BCrypt, SCrypt, and so on in the future.
As you say, option 1 is the standard way to store passwords. As long as you use a secure password hashing function (e.g. NIST recommends PBKDF2) with a unique salt, your passwords are secure. So I would recommend this option.
Option 2 doesn't really make sense. You can't 'undo' a hash function, so why encrypt its contents? You would then also have to store the encryption key somewhere, which is a different issue entirely.
Also what do you mean by a shared salt? If you always use the same salt then that defeats the point of salting your hashes. A unique salt per row is the way to go.
I would say that combining the username and password into a single hash is overcomplicating things, and limits your options in development, since you can't get a row from the DB given a username.
Say you want to lock out a user after 5 incorrect password attempts. With a standard plain-text username and hashed pw, you can just have a 'login_attempt_count' column and update the row for that user each time their password is incorrectly entered.
If your usernames and passwords are hashed together, you have no way of identifying which row to update with a login attempt count, since a hashed correct username with a wrong password won't match any hash.
I guess you could have some kind of mapping function to get a row_id given a username, but I would say it's just needlessly complicated, and with greater complication you have a bigger chance of security flaws.
As I said, I would just go with option 1. It's the industry standard way to store passwords, and it's secure enough for pretty much any application (as long as you use a modern secure hash function).
This is a question about whether my security process is adequate for the kind of information I am storing.
I am building a website using ASP.NET 4.0 with a SQL backend and need to know how my security would hold up with regards to passwords and hashes etc.
I don't store any critical information on someone - No real names, addresses, credit card details or anything like that... just email and username.
For now, I am deliberately leaving out some specifics as I am not sure if telling you them will weaken my security but if not I can reveal slightly more.
Here is how I do it:
The user registers with their email and a unique username up to 50 chars long
They create a password (minimum 6 chars) using any characters on the keyboard (I HTMLEncode the input and am using parameterized stored procedures so I don't restrict the chars)
I send them an email with a link to verify they are real.
I use FormsAuthentication to set an auth cookie but I'm not using SSL at the moment... I understand the implications of sending auth details across plain http but I have asked my host to add the cert so it should be ready shortly.
It's the hashing bit I need to be sure of!
I create a random 100 character salt from the following char set (I just use the System.Random class, nothing cryptographic) - abcdefghijkmnopqrstuvwxyzABCDEFGHJKLMNOPQRSTUVWXYZ0123456789!£$%^*()_{}[]#~#<,>.?
This is then merged with the password and then hashed using SHA-512 (SHA512Managed class) tens of thousands of times (takes nearly 2 seconds on my i7 laptop to generate the final hash).
This final hash is then converted to a base64 string and compared with the already-hashed password in the database (the salt is stored in another column in the DB too)
A few questions (ignore the lack of SSL for the moment, I just haven't bought the certificate yet but it will be ready in a week or so):
Does this strike you as secure enough? I understand there are degrees of security and that given enough time and resources anything is breakable but given that I don't store critical data, does it seem like enough?
Would revealing the actual number of times I hash the password weaken my security?
Does a 100 character salt make any difference over, say, a 20 character one?
By revealing how I join a password and salt together, would that weaken my security?
So, let's try to answer your questions one by one:
Does this strike you as secure enough? I understand there are degrees of security and that given enough time and resources anything is breakable but given that I don't store critical data, does it seem like enough?
No. It is definitely not "secure enough".
Without seeing code, it's hard to say more. But the fact that you're doing a straight SHA512 instead of doing a HMAC indicates one problem. Not because you need to be using a HMAC, but because most algorithms that are designed for this purpose use HMAC under the hood (for several reasons).
And it seems likely you're doing hash = SHA512(hash) (just from your wording) which is proven to be bad.
So without seeing code, it's hard to say for sure, but it's not pointing in the right direction...
Would revealing the actual number of times I hash the password weaken my security?
No, it shouldn't. If it does, you have a problem somewhere else in the algorithm.
Does a 100 character salt make any difference over, say, a 20 character one?
Nope. All the salt does is make the hash unique (forcing the attacker to attack each password separately). All you need is a salt long enough to be statistically unique. Thanks to the Birthday Problem, 128 bits is more than enough for a 1/10^12 chance of collision. Which is plenty for us. So that means that 16 characters is the upper bound on salt effectiveness.
That doesn't mean it's bad to use a longer salt. It just means that making it longer than 16 characters doesn't significantly increase the security it provides...
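The birthday-problem figure can be sanity-checked with the usual approximation p ≈ n^2 / 2^(bits+1) for n random salts. A rough sketch, with illustrative function names; the approximation is only valid while p is small.

```python
import math

def collision_probability(n_salts: int, salt_bits: int) -> float:
    """Birthday-bound approximation of the chance that any two of n random salts coincide."""
    return n_salts ** 2 / 2 ** (salt_bits + 1)

def salts_for_probability(p: float, salt_bits: int) -> float:
    """Roughly how many random salts can be issued before the collision chance reaches p."""
    return math.sqrt(p * 2 ** (salt_bits + 1))

n = salts_for_probability(1e-12, 128)
print(f"128-bit salts: about {n:.2e} salts before a 1-in-10^12 collision chance")
print(f"64-bit salts:  about {salts_for_probability(1e-12, 64):.2e}")
```

With 128 bits you can hand out tens of trillions of salts before the collision chance even reaches one in 10^12, which is why extra salt length beyond that buys nothing measurable.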
By revealing how I join a password and salt together, would that weaken my security?
If it does, your algorithm is severely flawed. If it does, it amounts to Security Through Obscurity.
The Real Answer
The real answer here is to not re-invent the wheel. Algorithms like PBKDF2 and BCRYPT exist for exactly this purpose. So use them.
Further Information (Note that these talk about PHP, but the concepts are 100% applicable to ASP.NET and C#):
YouTube Video - Password Storage and Hacking in PHP
Blog Post - The Rainbow Table Is Dead
Blog Post - Properly Salting Passwords
PHP password_hash RFC
Blog Post - Seven Ways To Screw Up BCrypt
In theory, your hashing scheme sounds ok. In practice, it sounds like you have rolled your own crypto, which is bad. Use bcrypt, scrypt, or pbkdf2. All of these are designed by security professionals.
Not really, but I don't think anyone needs to know that anyway.
No. It just needs to be unique to every user. The purpose of salt is to prevent precalculation of hashes/rainbow table attacks.
This doesn't apply once you make use of bcrypt (or scrypt or pbkdf2)
http://security.stackexchange.com has some topics on the subject, you should check them out.
Some extra notes - serious attackers will crack SHA-512 hashes way faster than your laptop. For example, you could rent a server with a few Tesla GPUs from Amazon or similar, and start cracking at a rate of a few billion hashes per second. Scrypt makes some effort to prevent this by using memory-intensive operations.
6 characters minimum for password is not enough, go with at least 8. A related image, I haven't verified the times but it gives a rough estimate and gives you the general idea (excluding dictionary attacks, which can target longer passwords):
I am learning about encryption methods and I have a question about MD5.
I have seen there are several websites that have 'rainbow tables' that will give you reverse MD5 lookup, but, they can't lookup all the combinations possible.
For knowledge's sake, my question is this :
Hypothetically, suppose a group of people were to pick an upper limit (e.g. 5 or 6 characters) and compute the MD5 hash of every value inside that range, storing the results in a database to use for reverse lookup.
1. Do you think such a thing is probable.
2. If you can speculate, what kind of scale of resources would this mean?
3. To your knowledge have there been any public or private attempts to do this?
I am not referring to tables that have select entries based on a dictionary, but mapping the entire range up to a certain number of characters.
(I have referred to This question already.)
It is possible. For a small number of characters, it has already been done. In the near future, it will be easy for larger numbers of characters. MD5 isn't getting any stronger.
That's a function of time. To reverse the entire 6-or-fewer-character alphanumeric space would require computing about 62^6 entries. That's roughly 57 billion MD5s. That's doable by a determined small group or easy for a government, right now. In the future, it will be doable on a home computer. Remember, though, that as the number of allowable characters or the maximum length increases, the difficulty increase is exponential.
People already have done it. But, honestly, it doesn't matter - because anyone with half an ounce of sense uses a random salt. If you precompute the entire MD5 space and reverse it, that doesn't mean jack dandy if someone is using key strengthening or a good salt! Read up on salting.
5 or 6 characters is easy. 6 bytes is doable (that's 2^48 combinations), even with limited hardware.
Namely, a simple Core2 CPU from Intel will be able to hash one password in about 150 clock cycles (assuming you use a SSE2 implementation, which will hash four passwords in parallel in 600 clock cycles). With a 2.4 GHz quad core CPU (that's my PC, not exactly the newest machine available), I can then try about 2^26 passwords per second. For that kind of job, a massively parallel architecture is fine, hence it makes sense to use a GPU. For maybe 200$, you can buy a NVidia video card which will be about four times faster (i.e. 2^28 passwords per second). 6 alphanumeric characters (uppercase, lowercase and digits) are close to 2^36 combinations; trying them all is then a matter of 2^(36-28) seconds, which is less than five minutes. With 6 random bytes, it will need 2^20 seconds, i.e. a bit less than a fortnight.
That's for the CPU cost. If you want to speed up the actual attack, you store the hash results: thus you will not need to recompute all those hashed passwords every time you attack a password (but you still have to do it once). 2^36 hash results (16 bytes each) mean 1 terabyte. You can buy a harddisk that big for 100$. 2^48 hash results imply 4096 times that storage space; in plain harddisks this will cost as much as a house: a bit expensive for the average bored student, but affordable for most kinds of governmental or criminal organizations.
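These estimates are easy to reproduce. The sketch below uses only the figures quoted in this answer (150 cycles per hash, a 2.4 GHz quad core, a GPU about four times faster, 16 bytes per stored MD5 result); the variable names are illustrative.

```python
import math

CYCLES_PER_HASH = 150
cpu_rate = 4 * 2.4e9 / CYCLES_PER_HASH  # four cores at 2.4 GHz, hashes per second
gpu_rate = 4 * cpu_rate                 # "about four times faster"

alnum6 = 62 ** 6    # six characters from A-Za-z0-9
bytes6 = 256 ** 6   # six arbitrary bytes, i.e. 2^48

print(f"CPU: about 2^{math.log2(cpu_rate):.1f} hashes/s, GPU: about 2^{math.log2(gpu_rate):.1f} hashes/s")
print(f"6 alphanumeric chars on the GPU: {alnum6 / gpu_rate / 60:.1f} minutes")
print(f"6 random bytes on the GPU: {bytes6 / gpu_rate / 86400:.1f} days")

# Storage for a plain lookup table: one 16-byte MD5 result per candidate.
TIB = 2 ** 40
print(f"2^36 results: {2 ** 36 * 16 / TIB:.0f} TiB, 2^48 results: {2 ** 48 * 16 / TIB:.0f} TiB")
```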
Rainbow tables are an optimization trick for the storage. In rough terms, you store only one in every t hash results, in exchange for having to do t lookups and t^2 hash computations for every attack. E.g., if you choose t=1000, you only have to buy four harddisks instead of four thousand, but you will need to make 1000 lookups and a million hashes every time you want to crack a password (this will need a dozen seconds at most, if you do it right).
Hence you have two costs:
The CPU cost is about computing hashes for the complete password space; with a table (rainbow or not) you have to do it once, and then can reuse that computational effort for every attacked password.
The storage cost is about storing the hash results in order to easily attack several passwords. Harddisks are not very expensive, as shown above. Rainbow tables help you lower storage costs.
Salting defeats cost sharing through precomputed tables (whether they are rainbow tables or just plain tables has no effect here: tables are about reusing precomputed values for several attacked passwords, and salts prevent such recycling).
The CPU cost can be increased by defining that the hash procedure is not just a single hash computation; for instance, you can define the "password hash" as applying MD5 over the concatenation of 10000 copies of the password. This will make each of the attacker's guesses one thousand times more expensive. It also makes legitimate password validation one thousand times more expensive, but most users will not mind (the user has just typed his password; he cannot really see whether the password verification took 10ms or 10µs).
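To see where the factor of about one thousand comes from: MD5 processes its input in 64-byte blocks, so 10000 copies of a 6-character password occupy roughly a thousand blocks instead of one. A small sketch with made-up function names; it illustrates the cost argument only and is not a recommended password hashing scheme (there is no salt here).

```python
import hashlib

def md5_block_count(message_length: int) -> int:
    """Number of 64-byte blocks MD5 processes, counting the 0x80 marker and 8-byte length padding."""
    return (message_length + 8) // 64 + 1

def repeated_md5(password: bytes, copies: int = 10_000) -> str:
    """MD5 over the concatenation of `copies` copies of the password."""
    return hashlib.md5(password * copies).hexdigest()

password = b"abc123"  # hypothetical 6-character password
single = md5_block_count(len(password))
repeated = md5_block_count(len(password) * 10_000)
print(f"{single} block vs {repeated} blocks, about {repeated // single}x the work per guess")
print(repeated_md5(password))
```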
Modern Unix-like systems (e.g. Linux) use "MD5" passwords which actually combine salting and iterated hashing, as described above. (Actually, a modern Linux system may use another hash function, such as SHA-256, but that does not change things much here.) So precomputed tables will not help, and the on-the-fly password cracking is expensive. A password with 6 alphanumeric characters can still be cracked within a few days, because 6 characters are kind of weak anyway. Also, many longer passwords are crackable because it turns out that human beings are bad at remembering passwords; hence they will not choose just any random sequence of characters, they will select passwords which have some "meaning". This reduces the space of possible passwords.
It's called a rainbow table, and it's easily defeated with salting.
Yes, it is not only probable, but it's probably been done before.
It depends on whether they are mapping the entire possible range or just a range of ASCII characters. Let's say you need 128 bits + 6 bytes to store each match. That's 22 bytes. You'd need:
6.32 GB to store all lowercase alphabetic combinations [a-z]
405 GB for all alphabetic combinations [a-zA-Z]
1.13 TB for all alphanumeric combinations [a-zA-Z0-9]
5.24 TB for all combinations that consist of letters, numbers and 18 symbols.
As you see, it increases exponentially, but even at 5.24 TB that's nothing to agencies like, say, the NSA or the CIA. They probably have done it.
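The sizes above can be recomputed as follows. This sketch assumes exactly 6 characters per entry, 22 bytes per entry as stated, and binary units (GiB and TiB), which is what the listed figures appear to use; the names are illustrative.

```python
ENTRY_BYTES = 16 + 6  # 128-bit MD5 result plus the 6-byte plaintext

def table_size_bytes(alphabet_size: int, length: int = 6) -> int:
    """Bytes needed to store one entry per possible string of the given length."""
    return alphabet_size ** length * ENTRY_BYTES

GIB, TIB = 2 ** 30, 2 ** 40
alphabets = [("[a-z]", 26), ("[a-zA-Z]", 52), ("[a-zA-Z0-9]", 62),
             ("letters, digits, 18 symbols", 80)]
for label, size in alphabets:
    total = table_size_bytes(size)
    print(f"{label:30} {total / GIB:10.2f} GiB  ({total / TIB:.2f} TiB)")
```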
As everyone else said, salting can easily defeat rainbow tables and that's almost as important as hashing. Read this: Just hashing is far from enough - How to position against dictionary and rainbow attacks
Duplicate:
Confused about hashes
How can SHA encryption create a unique 40 character hash for any string, when there are an infinite number of possible input strings but only a finite number of 40 character hashes?
SHA is not an encryption algorithm, it is a cryptographic hashing algorithm.
Check out this reference at Wikipedia
The simple answer is that it doesn't create a unique 40 character hash for any string - it's inevitable that different strings will have the same hash.
It does try to make sure that close-by strings will have very different hashes. 40 characters is a pretty long hash, so the chance of collision is quite low unless you're doing ridiculous numbers of them.
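A quick way to see both properties, the fixed 40-character output and the very different hashes for close-by strings, is to hash two inputs that differ in one character. A small sketch using Python's hashlib; the helper names are made up.

```python
import hashlib

def sha1_hex(text: str) -> str:
    """SHA-1 of the UTF-8 encoded text, as 40 hexadecimal characters (160 bits)."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()

def differing_bits(hex_a: str, hex_b: str) -> int:
    """Count the bit positions in which two hex digests differ."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")

a = sha1_hex("abc")
b = sha1_hex("abd")  # differs from "abc" in a single character
print(a, len(a))
print(b, len(b))
print(differing_bits(a, b), "of 160 bits differ")
```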
SHA doesn't create a unique 40 character hash for any string. If you create enough hashes, you'll get a collision (two inputs that hash to the same output) eventually. What makes SHA and other hash functions cryptographically useful is that there's no easy way to find two files that will have the same hash.
To elaborate on jdigital's answer:
Since it's a hash algorithm and not an encryption algorithm, there is no need to reverse the operation. This, in turn, means that the result does not need to be unique; there are (in theory) an infinite number of strings that will result in the same hash. Finding out which ones those are is practically impossible, though.
Hash algorithms like SHA-1 or the SHA-2 family are used as "one-way" hashes in support of password-based authentication. It is not computationally feasible to find a message (password) that hashes to a given value. So, if an attacker obtains the list of hashed passwords, they can't determine the original passwords.
You are correct that, in general, there are an infinite number of messages that hash to a given value. It's still hard to find one though.
It does not guarantee that two strings will have unique 40 character hashes. What it does is provide an extremely low probability that two strings will have conflicting hashes, and makes it very difficult to create two conflicting documents without just randomly trying inputs.
Generally, a low enough probability of something bad happening is as good as a guarantee that it never will. As long as it's more likely that the world will end when a comet hits it, the chance of a colliding hash isn't generally worth worrying about.
Of course, secure hash algorithms are not perfect. Because they are used in cryptography, they are very valuable things to try and crack. SHA-1, for instance, has been weakened (you can find a collision in 2000 times fewer guesses than just doing random guessing); MD5 has been completely cracked, and security researchers have actually created two certificates which have the same MD5 sum, and got one of them signed by a certificate authority, thus allowing them to use the other one as if it had been signed by the certificate authority. You should not blindly put your faith in cryptographic hashes; once one has been weakened (like SHA-1), it is time to look for a new hash, which is why there is currently a competition to create a new standard hash algorithm.
The function is something like:
hash1 = SHA1(plaintext1)
hash2 = SHA1(plaintext2)
Now, hash1 and hash2 can technically be the same. It's a collision. Not common, but possible, and not a problem.
The real magic is in the fact that it's impossible to do this:
plaintext1 = SHA1-REVERSE(hash1)
So you can never reverse it. Handy if you don't want to know what a password is, only that the user gave you the same one both times. Think about it. You have 1024 bytes of input. You get 40 hex characters (160 bits, i.e. 20 bytes) of output. How can you EVER reconstruct those 1024 bytes from the 20? You threw information away. It's just not possible (well, unless you design the algorithm to allow it, I guess....)
Also, if 160 bits isn't enough, use SHA256 or something with a bigger output. And salt it. Salt is good.
Oh, and as an aside: any website which emails you your password is not hashing its passwords. It's either storing them unencrypted (run, run screaming), or encrypting them with a 2-way encryption (DES, AES, public-private key et al - trust them a little more).
There are ZERO reasons for a website to be able to email you your password, or need to store anything but the hash. /rant.
Nice observation. Short answer: it can't, and that leads to collisions, which can be exploited in birthday attacks.
The simple answer is: it doesn't create unique hashes. Look at the pigeonhole principle. It's just so unlikely for there to be a collision that you will essentially never hit one by accident.
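The pigeonhole argument is easy to demonstrate if the hash is shortened artificially. The sketch below keeps only the first 2 bytes of SHA-1 (an assumption made purely for the demo), so there are just 65536 possible outputs and a collision must appear within 65537 distinct inputs. With the full 160 bits the same loop would never finish in practice.

```python
import hashlib
from typing import Dict, Tuple

def truncated_sha1(data: bytes, n_bytes: int = 2) -> bytes:
    """First n_bytes of the SHA-1 digest; a deliberately tiny hash for demonstration."""
    return hashlib.sha1(data).digest()[:n_bytes]

def find_collision(n_bytes: int = 2) -> Tuple[bytes, bytes]:
    """Hash "0", "1", "2", ... until two different inputs share a truncated digest."""
    seen: Dict[bytes, bytes] = {}
    counter = 0
    while True:
        message = str(counter).encode("ascii")
        tag = truncated_sha1(message, n_bytes)
        if tag in seen:
            return seen[tag], message
        seen[tag] = message
        counter += 1

first, second = find_collision()
print(first, second, truncated_sha1(first).hex())
```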