Saving private data - encryption

Can anybody detail some approaches to saving private data on social websites like Facebook? They can't save all the updates and friends lists in clear text because of privacy issues. So how do they actually save it?
Hashing all the data with the user's password, so that only a valid session can view it, is one possibility. But I think there are some problems with this approach, and there must be a better solution.

They can and probably do save it in plain text - it goes into a database on a server somewhere. There aren't really privacy issues there... and even if there were, Facebook has publicly admitted they don't care about privacy.

Most applications do not encrypt data like this in the database. The password will usually be stored as a salted hash, and the application architecture is responsible for limiting visibility based on appropriate rights/roles.

Most websites do in fact save updates and friends lists in clear text format; that is, they save them in an SQL database. If you are a Facebook developer you can access the database using FQL, the Facebook Query Language. Queries are restricted so that you can only look at the data of "friends" or of people running your application, or their friends, or what have you. (The key difference between SQL and FQL is that you must always include a WHERE X=id clause, where X is a keyed column.)
There are other approaches, however. You can store information in a Bloom filter or in some kind of hash. You might want to read Peter Wayner's book Translucent Databases---he goes into clever approaches for storing data so that you can detect if it is present or missing, but you can't do brute force searches.
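As a rough illustration of the Bloom filter idea, here is a minimal sketch in Python. The class name, the sizes, and the "alice|bob" pair encoding are all assumptions made up for this example, and it is not taken from Wayner's book. The filter records only bit positions, so it can answer "is this item probably present?" without holding the items themselves.

```python
import hashlib


class BloomFilter:
    """Set-membership sketch: stores bit positions, never the items themselves."""

    def __init__(self, size_bits=1024, num_hashes=4):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8 + 1)

    def _positions(self, item):
        # Derive num_hashes bit positions by hashing the item with a counter prefix.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        # False means definitely absent; True means probably present.
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item))


friends = BloomFilter()
friends.add("alice|bob")  # hypothetical encoding of a friendship pair
print(friends.might_contain("alice|bob"))
print(friends.might_contain("alice|mallory"))
```

Note that an unkeyed hash like this only hides the data from casual browsing. If the set of possible items is small, anyone holding the filter can still test guesses one by one, so a real design would likely mix a secret key into the hash.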

Related

Is there an inherent risk in publishing other users' ids?

I have a collection called Vouchers. A user can, if they know the unique number ID of a Voucher, "claim" that voucher, which will give it a user_id attribute, tying it to them.
I'm at a point where I need to check a user's ID query against the existing database, but I'm wondering if I can do so on the client instead of the server (the client would be much more convenient because I'm using utility functions to tie the query form to the database operation.... it's a long story). If I do so on the client, I'll have to publish the entire Vouchers collection with correct user_id fields, and although I won't be showing those ids through any templates, they would be available through the console.
Is there an inherent risk in publishing all of the IDs like this? Can they be used maliciously even if I don't leave any specific holes for them to be used in?
First, in general it sounds like a bad idea to publish all user_ids to the client. What would happen if you have 1 million users? That would be a lot of data.
Second, in specific, we cannot know if there is inherent risk in publishing your user_ids, because we do not know what could be done with it in your system. If you use a typical design of user_ids chosen by the user themselves (for instance email), then you MUST design your system to be safe even if an attacker has guessed the user_id.
Short Version: not so good idea.
I have a similar setup: a user can sign up if she knows the voucher code. You can only publish those vouchers where the user_id is identical to the logged-in user. All other checks, like "does the user input correspond to a valid voucher?", must be handled on the server.
Remember: client code is not trusted.
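To make the split concrete, here is a minimal sketch in Python rather than in the asker's framework. The dictionary standing in for the Vouchers collection and the names publish_vouchers and claim_voucher are hypothetical. The point it illustrates is that the client only ever sends a voucher code, and the server decides both what to publish and whether a claim succeeds.

```python
# Hypothetical in-memory stand-in for the Vouchers collection.
vouchers = {
    "V-1001": {"user_id": None},
    "V-1002": {"user_id": "user-a"},
}


def publish_vouchers(current_user_id):
    """Return only the vouchers owned by the logged-in user."""
    return {code: v for code, v in vouchers.items() if v["user_id"] == current_user_id}


def claim_voucher(current_user_id, code):
    """Server-side claim: the client sends only the code, never a user id."""
    voucher = vouchers.get(code)
    if voucher is None or voucher["user_id"] is not None:
        return False  # unknown or already claimed; same answer either way
    voucher["user_id"] = current_user_id
    return True


print(claim_voucher("user-b", "V-1001"))  # unclaimed voucher
print(claim_voucher("user-b", "V-1002"))  # already belongs to user-a
print(publish_vouchers("user-b"))
```

Returning the same result for "unknown" and "already claimed" avoids telling a guesser which codes exist.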

Value obfuscation information at value level in RavenDB

I am storing sensitive information within RavenDB relating to employee performance reviews.
As such, I need a simple first-line-of-defence against curious db admins, to prevent them from browsing the data.
I would class this as client-side encryption (although it need not be TNO) just really to obfuscate the data, however, in such a way that it obviously does not impact indexability.
Notes:
I am aware that indexed fields will remain unencrypted in Lucene.
I would really like to maintain document schema browsability if possible, so if someone were to use Raven Studio, they would see something like this (they can see the schema, not the data):
{
WhatIThinkOfMyManager: 'jfjsd83hfdljdf983nofs==',
AmIHappyWithMyPayLevel: false
}
Are there any facilities in Raven for this? And how do I go about it?
RavenDB 1.2 supports encryption of the data on disk (including in the indexes).
But an admin that has access to the data can see it in its decrypted form.
You might want to store the data inside RavenDB encrypted from your own code.

WordPress Plugin and One-Way Encryption

I was hoping someone could help me sort something out. I've been working on a shopping cart plugin for WordPress for quite a while now. I started coding it at the end of 2008 (and it's been one of those "work on it when I have time" projects, so the going is very slow, obviously!) and got pretty far with it. Even had a few testers take me up on it and give me feedback. (Please note that this plugin is also meant to be a free download - I have no intention of making it a premium plugin.)
Anyway, in 2010, when all the PCI/DSS stuff became standard, I shelved it, because the plugin was meant to retain certain information in the database, and I was not 100% sure what qualified as "sensitive data," and I didn't want to put anything out there that might compromise anyone, and possibly come back on me.
Over the last few weeks, some colleagues and I have been having a discussion about PCI/DSS compliance, and it's sparked a re-interest in finally finishing this plugin. I'm going to remove the storage of credit card numbers and any data of that nature, but I do like the idea of storing the names and shipping addresses of people who voluntarily might want to create an account with the site that might use this plugin so if they shop there again, that kind of info is retained. Keep in mind, the data stored would be public information - the kind of thing you'd find in a phone book, or a peek in the record room of a courthouse. So nothing like storing SS#'s, medical histories or credit card numbers. Just stuff that would maybe let someone see past purchases, and retain some info to make a future checkout process a bit easier.
One of my colleagues suggested I still do something to enhance security a bit, since the name and shipping address would likely be passed to whatever payment gateway the site owner would choose to use. They suggested I use "one-way encryption." Now, I'm not a huge security freak, but I'm pretty sure this involves (one aspect anyway) stuff like MD5 hashes with salts, or the like. So this confuses me, because I wouldn't have the slightest idea of where to look to see how to use that kind of thing with my code, and/or if it will work when passing that kind of data to PayPal or Google Checkout, or Mal's, or what have you.
So I suppose this isn't an "I need code examples" kind of question, but more of a "please enlighten me, because I'm sort of a dunce" kind of question. (which, I'm sure, makes people feel much better about the fact that I'm writing a shopping cart plugin LOL)
One-way encryption is used to store information in the database that you don't need to get back out of the database again in its unencrypted state (hence the one-way moniker). It could, in a more general sense, be used to demonstrate that two different people (or systems) are in possession of the same piece of data. Git, for instance, uses hashes to check if files (and indeed entire directory structures) are identical.
Generally, in an e-commerce context, hashes are used for passwords (and sometimes credit cards) because, as the site owner, you don't need to retain the actual password; you just need a function to determine whether the password currently being sent by the user is the same as the one previously provided. So in order to authenticate a user you would pass the password provided through the hashing algorithm (MD5, SHA, etc.) in order to get a 'hash'. If the hash matches the hash previously generated and stored in the database, you know the password is the same.
WordPress uses salted hashes to store its passwords. If you open up your wp_users table in the database you'll see the hashes.
Upside to this system is that if someone steals your database, they don't get the original passwords, just the hash values which the thief can't then use to log in to your users' Facebook, banking, etc sites (if your user has used the same password). Actually, they can't even use the hashes to log in to the site they were stolen from as hashing a hash produces a different hash.
The salt provides a measure of protection against dictionary attacks on the hash. There are databases available of mappings between common passwords and hash values where the hash values have been generated by regularly used one way hash functions. If, when generating the hash, you tack a salt value on to the end of your password string (eg my password becomes abc123salt), you can still do the comparison against the hash value you've previously generated and stored if you use the same salt value each time.
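The store-then-compare flow can be sketched in a few lines of Python. Two choices here are assumptions of this sketch, not something the answer specifies: it uses a random salt per password, stored next to the hash, rather than one shared salt, and it uses PBKDF2 (many iterations of SHA-256) rather than a single MD5 or SHA pass. The function names are hypothetical.

```python
import hashlib
import hmac
import os


def hash_password(password, salt=None):
    """Return (salt, digest). A fresh random salt is generated when none is given."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)
    return salt, digest


def verify_password(password, salt, stored_digest):
    """Re-hash the attempt with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, stored_digest)


# At sign-up: store both salt and digest in the user's row.
salt, stored = hash_password("abc123")

# At login: recompute from the submitted password and the stored salt.
print(verify_password("abc123", salt, stored))
print(verify_password("abc124", salt, stored))
```

Because the salt differs per user, two users with the same password end up with different stored hashes, which is what defeats the precomputed lookup tables mentioned above.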
You wouldn't one way hash something like an address or phone number (or something along those lines) if you need to use it in the future again in its raw form, say to for instance pre-populate a checkout field for a logged in user.
Best practice would also involve simply not storing data that you don't need again in the future: if you don't need the phone number in the future, don't store it. If you store the response transaction number from the payment gateway, you can use this for fraud investigations and leave the storage of all of the other data up to the gateway.
I'll leave it to others to discuss the relative merits of MD5 vs. SHA vs. ??? hashing systems. Note that there are functions built in to PHP to do the hashing.

Documents/links on preventing HTML form fiddling?

I'm using ASP.Net but my question is a little more general than that. I'm interested in reading about strategies to prevent users from fooling with their HTML form values and links in an attempt to update records that don't belong to them.
For instance, if my application dealt with used cars and had links to add/remove inventory, which included as part of the URL the userid, what can I do to intercept attempts to munge the link and put someone else's ID in there? In this limited instance I can always run a check at the server to ensure that userid XYZ actually has rights to car ABC, but I was curious what other strategies are out there to keep the clever at bay. (Doing a checksum of the page, perhaps? Not sure.)
Thanks for your input.
What you are describing is a vulnerability called "Insecure Direct Object References", and it is listed as A4 in the OWASP Top 10 for 2010.
"what can I do to intercept attempts to munge the link and put someone else's ID in there?"
There are a few ways that this vulnerability can be addressed. The first is to store the User's primary key in a session variable so you don't have to worry about it being manipulated by an attacker. For all future requests, especially ones that update user information like password, make sure to check this session variable.
Here is an example of the security system I am describing:
"update users set password='new_pass_hash' where user_id='"&Session("user_id")&"'";
Edit:
Another approach is a Hash-based Message Authentication Code (HMAC). This approach is much less secure than using the session, as it introduces a new attack pattern (brute force) instead of avoiding the problem altogether. An HMAC allows you to see if a message has been modified by someone who doesn't have the secret key. The HMAC value could be calculated as follows on the server side and then stored in a hidden form field.
hmac_value=hash('secret'&user_name&user_id&todays_date)
The idea is that if the user tries to change his username or userid, the hmac_value will no longer be valid unless the attacker can obtain the 'secret', which can be brute forced. Again, you should avoid this security system at all costs, although sometimes you don't have a choice (you do have a choice in your example vulnerability).
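For reference, a plain hash over 'secret' concatenated with the fields is not quite the HMAC construction. A minimal Python sketch using the standard hmac module might look like the following. The key value, the field separator, and the function names are assumptions for illustration only, and the caveats above about preferring the session still apply.

```python
import hashlib
import hmac

SECRET_KEY = b"replace-with-a-long-random-key"  # hypothetical; keep it server-side only


def sign(user_name, user_id, date):
    """Compute an HMAC-SHA256 tag over the fields that must not change."""
    message = "|".join([user_name, str(user_id), date]).encode("utf-8")
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def is_untampered(user_name, user_id, date, tag):
    """Recompute the tag from the submitted fields and compare in constant time."""
    return hmac.compare_digest(sign(user_name, user_id, date), tag)


# The server emits the tag in a hidden field next to the values it covers.
tag = sign("alice", 42, "2010-06-01")

print(is_untampered("alice", 42, "2010-06-01", tag))  # fields as issued
print(is_untampered("alice", 43, "2010-06-01", tag))  # user_id altered by the client
```

A long random key makes brute-forcing the secret impractical, which addresses the main weakness of a short guessable 'secret' string.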
You want to find out how to use a session.
Sessions on tiztag.
If you keep track of the user session you don't need to keep looking at the URL to find out who is making a request/post.
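A minimal sketch of that idea, in Python with an in-memory SQLite database rather than ASP.Net: the cars table, the session dictionary, and remove_car are all hypothetical names invented for the example. The user id comes from the server-side session, never from the URL or form, and the ownership check is part of the query itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cars (car_id INTEGER PRIMARY KEY, owner_id INTEGER, title TEXT)")
conn.executemany(
    "INSERT INTO cars (car_id, owner_id, title) VALUES (?, ?, ?)",
    [(1, 10, "Sedan"), (2, 20, "Pickup")],
)


def remove_car(session, car_id):
    """Delete a car only if it belongs to the user stored in the session."""
    cursor = conn.execute(
        "DELETE FROM cars WHERE car_id = ? AND owner_id = ?",
        (car_id, session["user_id"]),
    )
    # rowcount is 0 when the car does not exist or belongs to someone else.
    return cursor.rowcount == 1


session = {"user_id": 10}  # hypothetical server-side session for user 10
print(remove_car(session, 1))  # the user's own car
print(remove_car(session, 2))  # someone else's car
```

Even if the car id in the link is changed by hand, the delete matches no row unless the car belongs to the logged-in user.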

Am I going about this the wrong way?

This is my first MVC/Linq to SQL Application. I'm using the out of the box SQL Membership with ASP.NET to track users through my system.
As most of you know, the UserId is a guid, great. However, to link other user-created tables in the system, I decided to go with username instead of userid. The reason I did this was because:
Username is unique anyway
It prevents me from having to make an extra call when handling db functions.
So for example: I don't have to do a look up on the userid based on username to create a new story; I simply insert User.Identity.Name into the story table.
Now I did run into some nasty complication, which seems to be related to this. It worked fine on my local machine, but not on the host. I continually got an error that went something like this:
"System.InvalidCastException: Specified cast is not valid. at System.Data.Linq.IdentityManager.StandardIdentityManager.SingleKeyManager"...
This happened whenever an insert on the db occurred on the host. If I understand correctly, this is a bug currently that happens when you link a non integer field (in my case username) to another table of a non integer field (username in aspnet_user). Although the bug reported seems a little bit different, maybe they are similar?
https://connect.microsoft.com/VisualStudio/feedback/ViewFeedback.aspx?FeedbackID=351358
In any case, MS bug or not - is storing the username instead of the userid in my tables a bad idea? If it is, why?
Update
I just wanted to add some more context here. A good point people are bringing up is that this is dangerous if I want to allow users to change their username in the future. Perfectly valid!
However, this application relies heavily on the username. Each user creates one and only one story. They then link to their story by using: mysite/username. Therefore, the application will never allow them to change their username. It would cause a potential nightmare for people who follow the link only to see it no longer exists.
Be careful with your comment that usernames are unique. The minute Anita Takeabath gets married to Seymour Butts, suddenly atakeabath wants to be abutts.
Just a thought!
I've used the same approach as you and it works. Do you have a relationship between your application table and the table from the membership db? If so, you may want to remove that relationship.
My only thought would be in order to future proof your application, the userid would offer flexibility in users changing their username, as the userid would remain constant (like SO for instance).
But that is something that has to fit your application requirements. Then again, requirements often tend to change without a developer's control.
It's bad for the following reasons:
You mentioned avoiding extra database calls. However, by joining tables, there is no "extra" call to the database. You can argue that joining is more expensive than not joining at all. However, most likely, a store needs more user information than a user login name (note: user names are not unique, user login names are unique). So you need a join anyway for most database operations.
User login names vary in length, so they don't perform well when used as join keys.
Edit: modified format. I am still learning how to make my post look better:-)
If the reason you're implementing this is for easier access to the user's GUID, I suggest having your FormsAuthentication.SetAuthCookie use the user's GUID as the name property and use User.Identity.Name throughout your application.
Using username as the unique identifier could have bad consequences in the future. Should you want to allow users to change their username in the future, you will have a hard time implementing that.
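The trade-off can be seen in a small sketch, written in Python with SQLite instead of Linq to SQL. The table and column names are hypothetical and are not the aspnet membership schema. The story row points at a stable user id, so a rename touches one row and a lookup by username is a single joined query.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE users (user_id TEXT PRIMARY KEY, username TEXT UNIQUE NOT NULL)")
conn.execute(
    "CREATE TABLE stories (story_id INTEGER PRIMARY KEY, "
    "user_id TEXT NOT NULL REFERENCES users(user_id), title TEXT)"
)

user_id = str(uuid.uuid4())
conn.execute("INSERT INTO users VALUES (?, ?)", (user_id, "atakeabath"))
conn.execute("INSERT INTO stories (user_id, title) VALUES (?, ?)", (user_id, "My story"))

# Renaming touches one row in users; stories still points at the same user_id.
conn.execute("UPDATE users SET username = ? WHERE user_id = ?", ("abutts", user_id))

# Resolving mysite/<username> to a story is one query with a join, not two calls.
row = conn.execute(
    "SELECT s.title FROM stories s JOIN users u ON u.user_id = s.user_id "
    "WHERE u.username = ?",
    ("abutts",),
).fetchone()
print(row)
```

If usernames really are frozen forever, keying on them works too. The id-based layout mainly buys room for that requirement to change later.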
