I was hoping someone could help me sort something out. I've been working on a shopping cart plugin for WordPress for quite a while now. I started coding it at the end of 2008 (and it's been one of those "work on it when I have time" projects, so the going is very slow, obviously!) and got pretty far with it. Even had a few testers take me up on it and give me feedback. (Please note that this plugin is also meant to be a free download - I have no intention of making it a premium plugin.)
Anyway, in 2010, when all the PCI/DSS stuff became standard, I shelved it, because the plugin was meant to retain certain information in the database, and I was not 100% sure what qualified as "sensitive data," and I didn't want to put anything out there that might compromise anyone, and possibly come back on me.
Over the last few weeks, some colleagues and I have been having a discussion about PCI/DSS compliance, and it's sparked a re-interest in finally finishing this plugin. I'm going to remove the storage of credit card numbers and any data of that nature, but I do like the idea of storing the names and shipping addresses of people who voluntarily might want to create an account with the site that might use this plugin so if they shop there again, that kind of info is retained. Keep in mind, the data stored would be public information - the kind of thing you'd find in a phone book, or a peek in the record room of a courthouse. So nothing like storing SS#'s, medical histories or credit card numbers. Just stuff that would maybe let someone see past purchases, and retain some info to make a future checkout process a bit easier.
One of my colleagues suggested I still do something to enhance security a bit, since the name and shipping address would likely be passed to whatever payment gateway the site owner would choose to use. They suggested I use "one-way encryption." Now, I'm not a huge security freak, but I'm pretty sure this involves (one aspect anyway) stuff like MD5 hashes with salts, or the like. So this confuses me, because I wouldn't have the slightest idea of where to look to see how to use that kind of thing with my code, and/or if it will work when passing that kind of data to PayPal or Google Checkout, or Mal's, or what have you.
So I suppose this isn't an "I need code examples" kind of question, but more of a "please enlighten me, because I'm sort of a dunce" kind of question. (which, I'm sure, makes people feel much better about the fact that I'm writing a shopping cart plugin LOL)
One way encryption is used to store information in the database that you don't need back out of the database again in its unencrypted stage (hence the one-way moniker). It could, in a more general sense, be used to demonstrate that two different people (or systems) are in possession of the same piece of data. Git, for instance, uses hashes to check if files (and indeed entire directory structures) are identical.
Generally in an e-commerce context, hashes are used for passwords (and sometimes credit cards) because, as the site owner, you don't need to retain the actual password; you just need a function to determine whether the password currently being sent by the user is the same as the one previously provided. So in order to authenticate a user, you would pass the password provided through the hashing algorithm (MD5, SHA, etc.) to get a 'hash'. If the hash matches the hash previously generated and stored in the database, you know the password is the same.
WordPress uses salted hashes to store its passwords. If you open up your wp_users table in the database you'll see the hashes.
Upside to this system is that if someone steals your database, they don't get the original passwords, just the hash values which the thief can't then use to log in to your users' Facebook, banking, etc sites (if your user has used the same password). Actually, they can't even use the hashes to log in to the site they were stolen from as hashing a hash produces a different hash.
The salt provides a measure of protection against dictionary attacks on the hash. There are databases available of mappings between common passwords and hash values where the hash values have been generated by regularly used one way hash functions. If, when generating the hash, you tack a salt value on to the end of your password string (eg my password becomes abc123salt), you can still do the comparison against the hash value you've previously generated and stored if you use the same salt value each time.
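To make the store-and-compare idea concrete, here is a minimal sketch in Python (the plugin itself would be PHP; this is only an illustration). It swaps the "append a salt and MD5 it" approach for PBKDF2 from the standard library, and the function names and iteration count are my own assumptions, not anything WordPress uses.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor; tune for your own hardware


def hash_password(password, salt=None):
    """Return (salt, digest). A fresh random salt is generated when none is given."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password, salt, stored_digest):
    """Re-hash the candidate with the stored salt and compare in constant time."""
    _, candidate = hash_password(password, salt)
    return hmac.compare_digest(candidate, stored_digest)


# Store `salt` and `stored` in the database; the password itself is never kept.
salt, stored = hash_password("abc123")
print(verify_password("abc123", salt, stored))  # True
print(verify_password("abc124", salt, stored))  # False
```

Only the salt and the digest need to be stored, and the same salt has to be reused when checking a login attempt.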
You wouldn't one way hash something like an address or phone number (or something along those lines) if you need to use it in the future again in its raw form, say to for instance pre-populate a checkout field for a logged in user.
Best practice would also be to not store data that you won't need again: if you don't need the phone number in the future, don't store it. If you store the response transaction number from the payment gateway, you can use this for fraud investigations and leave the storage of all of the other data up to the gateway.
I'll leave it to others to discuss the relative merits of MD5 vs. SHA vs ??? hashing systems. Note, there are functions built in to PHP to do the hashing.
I've created a web site for student management (martial arts schools). Which includes invoicing students. Currently the only way my users can do this is by printing the invoices and handing them to the students. I'd like to create a way for the students to go to their invoice online.
I've been considering using GUIDs for the students, and using that as the parameter for the query string to the invoice. (http://thesite.com/invoice.php?guid=E3D3D122-5AB6-4405-96EC-7C0579710813)
The invoice would be a read-only page, and allow no access to the rest of the site. So I'm not too worried about packet sniffing (I don't believe someone sniffing traffic in a coffee shop is a concern, if all they have access to is a random student invoice).
I am worried about someone being able to guess, or get to a specific set of invoices (i.e. all the invoices of a competitor).
I feel like I'm either crazy for considering it, or it's a relatively standard practice. I'm just not sure which. And SO is a great sanity check.
Thanks
That's actually a good, secure process; you lose the readability of the URL, of course, but if that's not much of a concern, that's a good solution. It's certainly not guessable.
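As a rough illustration of why it isn't guessable, here is a small Python sketch that generates a random (version 4) GUID per invoice and builds the link. The function names are hypothetical; the URL is the one from the question. The important assumption is that the GUID is random rather than time-based or sequential.

```python
import uuid


def new_invoice_token():
    """Return a random (version 4) GUID string to store alongside the invoice."""
    return str(uuid.uuid4()).upper()


def invoice_url(token, base="http://thesite.com/invoice.php"):
    """Build the read-only invoice link that is handed to the student."""
    return f"{base}?guid={token}"


token = new_invoice_token()
print(invoice_url(token))
```

A version 4 GUID carries 122 random bits, so enumerating a competitor's invoices by guessing is not practical.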
As an added security measure, you might want to put in place logging of invoice accesses.
I would take it one step further and store the invoice as a password protected pdf document. This achieves several things:
the document is read only (a web page is too, but a pdf is harder for the end user to change)
the student also requires a password to access the info in the document so even if someone guesses the GUID (or more likely gets a shortcut/url mailed to them) then they can't see what is in the document (they won't be able to see the amount, which school it is for, etc.)
even if the document is retrieved from a web cache it isn't viewable without the password
it is printer friendly
it should be easily viewable on other devices
Suppose you are writing a survey application and would like to somehow guarantee that results are secure from the user's standpoint. Put simply, I know what IP you came from, but I want to make sure you sleep well at night knowing I know nothing of your responses. I can't store the IP in raw form.
I do need to guarantee one thing though: that you answer the questions only once. So once your PC comes in with some data, I need to recognize that your PC has already responded to the survey.
Any suggestions on how to best handle it?
Thanks
-mac
Create a one-way hash of the IP address (and any other unique identifying attributes), and store the hash with the response. That way no one can look up the IP address for a response, but you can compare the hash to previously submitted responses to ensure people only submit the form once.
There's not much you can do to convince someone you're respecting their privacy. Just don't abuse the trust, and people will work it out.
(For an idea on how to create a hash in java see How can I generate an MD5 hash?)
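Here is a minimal sketch of the hash-and-compare step in Python. One caveat worth knowing: there are only about four billion IPv4 addresses, so a plain hash of an IP can be reversed by hashing them all. The sketch therefore assumes a keyed hash (HMAC) with a server-side secret; `SECRET_KEY`, the function names, and the in-memory set are all hypothetical stand-ins.

```python
import hashlib
import hmac

# Hypothetical server-side secret; keep it out of the database.
SECRET_KEY = b"replace-with-a-long-random-secret"

# Stand-in for a database column holding the fingerprints of past responses.
seen_fingerprints = set()


def ip_fingerprint(ip, survey_id):
    """Keyed one-way hash of the IP, scoped to a single survey."""
    message = f"{survey_id}:{ip}".encode("utf-8")
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def record_response(ip, survey_id):
    """Return True if this is the first response seen from this IP for the survey."""
    fingerprint = ip_fingerprint(ip, survey_id)
    if fingerprint in seen_fingerprints:
        return False
    seen_fingerprints.add(fingerprint)
    return True
```

Only the fingerprint is stored with the response, so the raw IP never reaches the database.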
You can't guarantee either of these. All you can do is raise the bar so it's harder to get around it. If someone really wants to get around your tracking they can if they know enough about your system. Good thing is most people either don't want to bother or don't know how.
You can generate a cryptographic hash and store that in a cookie in the person's browser if you want to avoid the proxy problem. Lots of websites do this when they create a session to track authentication. This is something like using an HMAC to generate a value that identifies the browser with a unique key that can't be faked. If they clear their cookies, though, you won't be able to track them.
A one-way hash of the IP address is a way to avoid storing the raw IP, but the same IP always hashes to the same value, so you can tell if someone is resubmitting. However, if they go to an internet cafe, voilà, they can resubmit. You'd use SHA1, MD5, etc. for that.
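The cookie idea could be sketched like this in Python: a random visitor ID plus an HMAC signature, so the server can tell whether the cookie value was issued by it. `COOKIE_KEY` and the function names are assumptions for illustration; setting and reading the actual cookie is left to whatever framework you use.

```python
import hashlib
import hmac
import secrets

# Hypothetical server-side signing key.
COOKIE_KEY = b"replace-with-a-long-random-secret"


def _sign(visitor_id):
    return hmac.new(COOKIE_KEY, visitor_id.encode("utf-8"), hashlib.sha256).hexdigest()


def make_cookie_value():
    """Create a random visitor ID and append a signature over it."""
    visitor_id = secrets.token_hex(16)
    return f"{visitor_id}.{_sign(visitor_id)}"


def read_cookie_value(value):
    """Return the visitor ID if the signature checks out, otherwise None."""
    visitor_id, separator, signature = value.partition(".")
    if not separator:
        return None
    expected = _sign(visitor_id)
    if hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8")):
        return visitor_id
    return None
```

The visitor ID returned by `read_cookie_value` can then be checked against the IDs that have already submitted.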
You can do the same thing with email address and hash it. To get people to want to participate send the results to their email address instead of displaying in the browser. People just have to trust you won't do nasty things with their email.
Another idea, if you know who you want to send the survey to: generate a random number that identifies each individual response, then email links containing those numbers to people. They submit under that number, and as long as you don't record the email -> random number mapping, you can't correlate the answers with the email address. Once a random number has been used, you don't let anyone submit with it again. Track responses once; display results many times.
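A small Python sketch of the one-time link idea, assuming an in-memory set where a real application would use a database table. The class and method names are made up for the example; note that nothing here records which email received which token.

```python
import secrets


class SurveyTokens:
    """Issues random single-use tokens without recording who received them."""

    def __init__(self):
        self._unused = set()

    def issue(self):
        """Create a token to embed in one emailed survey link."""
        token = secrets.token_urlsafe(16)
        self._unused.add(token)
        return token

    def redeem(self, token):
        """Accept a submission once per token; later attempts are refused."""
        if token in self._unused:
            self._unused.remove(token)
            return True
        return False


tokens = SurveyTokens()
link_token = tokens.issue()
print(tokens.redeem(link_token))  # True
print(tokens.redeem(link_token))  # False
```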
You can combine some of these together to try and work around the deficiencies of the other.
I am debating using user-names as a means to salt passwords, instead of storing a random string along with the names. My justification is that the purpose of the salt is to prevent rainbow tables, so what makes this realistically less secure than another set of data in there?
For example,
hash( md5(johnny_381#example.com), p4ss\/\/0rD)
vs
hash( md5(some_UUID_value), p4ss\/\/0rD)
Is there a real reason I couldn't just stick with the user name and simplify things? The only thing my web searching turned up was debates as to how a salt should be like a password, but they ended without any reasoning behind it, whereas I'm under the impression the salt is just there to stop something like a Cain & Abel cracker from running against the hashes in less than a million years. Thinking about the processing limitations of reality, I don't believe it is a big deal if people know the hash: they still don't know the password, and they've moved into the super-computer range to brute force each individual hash.
Could someone please enlighten me here?
You'll run into problems, when the username changes (if it can be changed). There's no way you can update the hashed password, because you don't store the unsalted, unhashed password.
I don't see a problem with utilizing the username as the salt value.
A more secure way of storing passwords involves using a different salt value for each record anyway.
If you look at the aspnet_Membership table of the asp.net membership provider you'll see that they have stored the password, passwordsalt, and username fields in pretty much the same record. So, from that perspective, there's no security difference in just using the username for the salt value.
Note that some systems use a single salt value for all of the passwords, and store that in a config file. The only difference in security here is that if they gained access to a single salt value, then they can more easily build a rainbow table to crack all of the passwords at once...
But then again, if they have access to the encrypted form of the passwords, then they probably would have access to the salt value stored in the user table right along with it... Which might mean that they would have a slightly harder time of figuring out the password values.
However, at the end of the day I believe nearly all applications fail on the encryption front because they only encrypt what is ostensibly one of the least important pieces of data: the password. What should really be encrypted is nearly everything else.
After all, if I have access to your database, why would I care if the password is encrypted? I already have access to the important things...
There are obviously other considerations at play, but at the end of the day I wouldn't sweat this one too much as it's a minor issue compared to others.
If you use the username as the salt and there are many instances of your application, people may create rainbow tables for specific users like "admin" or "system", as is the case with Oracle databases, or for a whole list of common names, as was done for WPA (coWPAtty).
You had better use a truly random salt; it is not that difficult, and it will not come back to haunt you.
This method was deemed secure enough for the working group that created HTTP digest authentication which operates with a hash of the string "username:realm:password".
I think you would be fine seeing as this decision is secret. If someone steals your database and source code to see how you actually implemented your hashing, well what are they logging in to access at that point? The website that displays the data in the database that they've already stolen?
In this case a salt buys your user a couple of security benefits. First, if the thief has precomputed values (rainbow tables) they would have to recompute them for every single user in order to do their attack; if the thief is after a single user's password this isn't a big win.
Second, the hashes for all users will always be different even if they share the same password, so the thief wouldn't get any hash collisions for free (crack one user get 300 passwords).
These two benefits help protect your users that may use the same password at multiple sites even if the thief happens to acquire the databases of other sites.
So while a salt for password hashing is best kept secret (which in your case the exact data used for the salt would be) it does still provide benefits even if it is compromised.
Random salting prevents comparison of two independently-computed password hashes for the same username. Without it, it would be possible to test whether a person's password on one machine matched the one on another, or whether a password matched one that was used in the past, etc., without having to have the actual password. It would also greatly facilitate searching for criteria like the above even when the password is available (since one could search for the computed hash, rather than computing the hash separately for each old password hash value).
As to whether such prevention is a good thing or a bad thing, who knows.
I know this is an old question but for anyone searching for a solution based on this question.
If you use a derived salt (as opposed to random salt), the salt source should be strengthened by using a key derivation function like PBKDF2.
Thus if your username is "theunhandledexception" pass that through PBKDF2 for x iterations to generate a 32 bit (or whatever length salt you need) value.
Make x pseudo random (as opposed to even numbers like 1,000) and pass in a static site specific salt to the PBKDF2 and you make it highly improbable that your username salt will match any other site's username salt.
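A minimal sketch of that derived-salt idea in Python, using PBKDF2 from the standard library. `SITE_SALT`, the iteration count, and the output length (given here in bytes) are illustrative assumptions, not recommended values.

```python
import hashlib

# Hypothetical static, site-specific value mixed into every derived salt.
SITE_SALT = b"example-site-specific-value"

# An arbitrary, non-round iteration count, as the answer suggests.
ITERATIONS = 1_337


def derived_salt(username, length=32):
    """Stretch the username into a fixed-length salt of `length` bytes."""
    return hashlib.pbkdf2_hmac(
        "sha256", username.encode("utf-8"), SITE_SALT, ITERATIONS, dklen=length
    )


salt = derived_salt("theunhandledexception")
print(salt.hex())
```

The same username always yields the same salt on this site, while another site with a different `SITE_SALT` or iteration count gets a different one.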
I'm using ASP.Net but my question is a little more general than that. I'm interested in reading about strategies to prevent users from fooling with their HTML form values and links in an attempt to update records that don't belong to them.
For instance, if my application dealt with used cars and had links to add/remove inventory, which included as part of the URL the userid, what can I do to intercept attempts to munge the link and put someone else's ID in there? In this limited instance I can always run a check at the server to ensure that userid XYZ actually has rights to car ABC, but I was curious what other strategies are out there to keep the clever at bay. (Doing a checksum of the page, perhaps? Not sure.)
Thanks for your input.
What you are describing is a vulnerability called "Insecure Direct Object References", and it is listed as A4 in the OWASP Top 10 for 2010.
"what can I do to intercept attempts to munge the link and put someone else's ID in there?"
There are a few ways that this vulnerability can be addressed. The first is to store the User's primary key in a session variable so you don't have to worry about it being manipulated by an attacker. For all future requests, especially ones that update user information like password, make sure to check this session variable.
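A rough Python sketch of the session approach, using an in-memory SQLite table as a stand-in for the real database and a plain dict as a stand-in for the server-side session. The table layout and function name are assumptions; the point is that the user ID comes from the session, never from the form or URL, and that it is passed as a query parameter rather than concatenated into the SQL.

```python
import sqlite3


def change_password(conn, session, new_pass_hash):
    """Update the password of the logged-in user only."""
    user_id = session["user_id"]  # set at login, stored server-side
    conn.execute(
        "UPDATE users SET password = ? WHERE user_id = ?",
        (new_pass_hash, user_id),
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (user_id INTEGER PRIMARY KEY, password TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "old1"), (2, "old2")])

session = {"user_id": 1}  # stand-in for the framework's session object
change_password(conn, session, "new_pass_hash")
print(conn.execute("SELECT user_id, password FROM users ORDER BY user_id").fetchall())
# [(1, 'new_pass_hash'), (2, 'old2')]
```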
Here is an example of the security system I am describing:
"update users set password='new_pass_hash' where user_id='"&Session("user_id")&"'";
Edit:
Another approach is a Hash-based Message Authentication Code (HMAC). This approach is much less secure than using the session, as it introduces a new attack pattern of brute force instead of avoiding the problem altogether. An HMAC allows you to see if a message has been modified by someone who doesn't have the secret key. The HMAC value could be calculated as follows on the server side and then stored as a hidden variable.
hmac_value=hash('secret'&user_name&user_id&todays_date)
The idea is that if the user tries to change his username or userid then the hmac_value will not be valid unless the attacker can obtain the 'secret', which can be brute forced. Again you should avoid this security system at all costs. Although sometimes you don't have a choice (You do have a choice in your example vulnerability).
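If you do end up in the no-choice situation, the pseudocode above could be sketched in Python roughly as follows. It uses the standard `hmac` module rather than hashing a concatenated string; `SECRET`, the field separator, and the function names are assumptions, and the sketch assumes the field values never contain the separator character.

```python
import hashlib
import hmac
from datetime import date

# Hypothetical server-side secret; it must be long and random to resist brute force.
SECRET = b"replace-with-a-long-random-secret"


def sign_fields(user_name, user_id, day):
    """Compute the value to place in the hidden form field."""
    message = "|".join([user_name, str(user_id), day.isoformat()]).encode("utf-8")
    return hmac.new(SECRET, message, hashlib.sha256).hexdigest()


def fields_are_valid(user_name, user_id, day, submitted_mac):
    """Recompute the HMAC on the server and compare it with what came back."""
    expected = sign_fields(user_name, user_id, day)
    return hmac.compare_digest(expected.encode("utf-8"), submitted_mac.encode("utf-8"))


today = date(2010, 6, 1)
mac = sign_fields("alice", 42, today)
print(fields_are_valid("alice", 42, today, mac))  # True
print(fields_are_valid("alice", 43, today, mac))  # False
```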
You want to find out how to use a session.
Sessions on tiztag.
If you keep track of the user session you don't need to keep looking at the URL to find out who is making a request/post.
Can anybody detail some approaches to saving private data on social websites like Facebook, etc.? They can't save all the updates and friends lists in clear text format because of privacy issues. So how do they actually save it?
Hashing all the data with user password so that only a valid session view it is one possibility. But I think there are some problem with this approach and there must be some better solution.
They can and probably do save it in plain text - it goes into a database on a server somewhere. There aren't really privacy issues there... and even if there were, Facebook has publicly admitted they don't care about privacy.
Most applications do not encrypt data like this in the database. The password will usually be stored as a salted hash, and the application architecture is responsible for limiting visibility based on appropriate rights/roles.
Most websites do in fact save updates and friends list in clear text format---that is, they save them in an SQL database. If you are a facebook developer you can access the database using FQL, the Facebook Query Language. Queries are restricted so that you can only look at the data of "friends" or of people running your application, or their friends, or what have you. (The key difference between SQL and FQL is that you must always include a WHERE X=id where the X is a keyed column.)
There are other approaches, however. You can store information in a Bloom filter or in some kind of hash. You might want to read Peter Wayner's book Translucent Databases---he goes into clever approaches for storing data so that you can detect if it is present or missing, but you can't do brute force searches.
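For a feel of how a Bloom filter lets you test for presence without storing the data itself, here is a toy Python sketch. The sizes and the class name are arbitrary assumptions; a real deployment would size the filter for its expected number of items, and this is not taken from Wayner's book.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: remembers that an item was added without storing it."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, item):
        # Derive several bit positions from salted SHA-256 digests of the item.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        """False means definitely absent; True means probably present."""
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item))


friends = BloomFilter()
friends.add("alice:bob")
print(friends.might_contain("alice:bob"))  # True
```

False positives are possible (an item that was never added can test as present), but false negatives are not.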