I have an application that allows users to register for events. The database (SQL Server) has names, email addresses, addresses and phone numbers.
I pass id information via the query string (for example event.aspx?id=____). Presently I am using the "unique" identifier provided by NEWID() and the performance is great.
I was debating whether or not this is a reasonably secure approach. Should I encrypt the id values and pass that in the query string instead? For example instead of generating the unique id by using NEWID I would take the integer value that is in the primary key column and encrypt and decrypt that as needed in the application.
I have done this and noticed a performance hit. Any thoughts?
Passing an anonymous key is perfectly fine. SQL injection is the bigger risk, but if you are taking precautions (e.g. using stored procedures or parameterized SQL) you should be fine.
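To illustrate the parameterized-SQL precaution, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for SQL Server, and the `events` table and `find_event` helper are made up for the example. The point is only that the id from the query string is bound as data and never concatenated into the SQL text.

```python
import sqlite3

# In-memory database standing in for the real SQL Server instance (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id TEXT PRIMARY KEY, title TEXT)")
conn.execute("INSERT INTO events VALUES ('a1', 'Spring meetup')")
conn.execute("INSERT INTO events VALUES ('b2', 'Autumn meetup')")


def find_event(conn, event_id):
    """Look up an event by the id taken from the query string."""
    # The ? placeholder binds event_id as a value, so quotes inside it
    # cannot change the structure of the statement.
    return conn.execute(
        "SELECT title FROM events WHERE id = ?", (event_id,)
    ).fetchall()


normal = find_event(conn, "a1")
attack = find_event(conn, "a1' OR '1'='1")  # classic injection attempt
```

With the placeholder in place, the injection attempt is just an id that matches no row.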
If someone can change the value of id and still register (thus altering/viewing someone else's registration information), then yes, you have a problem.
If you are ensuring that a person can only modify their own registration, then you likely do not have an issue. In this case, you could use the SQL ID column unencrypted instead of generating a NEWID() (what is the threat if they know the ID?). If only the correct person can modify or register, then revealing the ID shouldn't be a concern. I'm curious, though, why a simple encrypt/decrypt is negatively impacting performance.
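One way to make sure a person can only modify their own registration is to put the owner into the WHERE clause, so a tampered id simply matches nothing. This is a rough sketch under assumptions: sqlite3 stands in for SQL Server, and the `registrations` table, `owner_id` column and `update_phone` helper are hypothetical names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE registrations ("
    "id INTEGER PRIMARY KEY, owner_id INTEGER NOT NULL, phone TEXT)"
)
conn.execute("INSERT INTO registrations VALUES (1, 100, '555-0100')")
conn.execute("INSERT INTO registrations VALUES (2, 200, '555-0200')")


def update_phone(conn, registration_id, current_user_id, phone):
    """Update a registration only if it belongs to the logged-in user."""
    cur = conn.execute(
        "UPDATE registrations SET phone = ? WHERE id = ? AND owner_id = ?",
        (phone, registration_id, current_user_id),
    )
    # rowcount is 0 when the id exists but belongs to someone else.
    return cur.rowcount == 1


own_ok = update_phone(conn, 1, 100, "555-0111")    # user 100 edits own row
tampered = update_phone(conn, 2, 100, "555-9999")  # user 100 guesses id 2
```

With a check like this, it matters little whether the id in the URL is a GUID, a plain integer or an encrypted value.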
Related
I use ASP.NET membership in my application.
I added a UserProfile table with a foreign key to the Users table (of ASP.NET membership).
As a foreign key I use Username because username is email and it's unique.
And anywhere where I need to reference user I use Username as foreign key.
From application when I need to get profile for example I pass Username to stored procedure to get data.
I just wonder if this is a good way to do this. Is there some potential security issue here?
The main issue I see is that a string foreign key takes up a lot of space, which makes the tables bigger and slower. The tables will also be joined by string comparison. The database may hash these strings and compare numbers behind the scenes, but that still adds a small overhead.
Just make the UserName unique and use a numeric foreign key to connect it with the rest of the tables.
The second issue arises when a user needs to change their email, or entered it wrong for any reason. In that case you need to update every reference in the database and make sure no other row already uses the same email.
One more issue is that the email, and therefore the foreign key, may or may not be compared case-sensitively. If for any reason it ends up case-sensitive, you have a mess.
As for security, you always need to query your database using parameters. This is the same for a numeric key or a string key, so it makes no difference in this case.
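The layout suggested in this answer (unique username plus a numeric foreign key) can be sketched roughly as follows. sqlite3 stands in for SQL Server here, and the table and column names are assumptions for illustration only. Note how an email change touches a single row, and how a case-insensitive unique constraint sidesteps the case-sensitivity mess.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute(
    "CREATE TABLE users ("
    "id INTEGER PRIMARY KEY, "
    "username TEXT NOT NULL UNIQUE COLLATE NOCASE)"  # case-insensitive uniqueness
)
conn.execute(
    "CREATE TABLE user_profile ("
    "user_id INTEGER NOT NULL REFERENCES users(id), zip TEXT)"
)
conn.execute("INSERT INTO users (id, username) VALUES (1, 'ann@example.com')")
conn.execute("INSERT INTO user_profile VALUES (1, '90210')")

# Changing the email updates one row; user_profile is untouched.
conn.execute("UPDATE users SET username = ? WHERE id = ?", ("ann@new.example", 1))

row = conn.execute(
    "SELECT u.username, p.zip FROM user_profile p "
    "JOIN users u ON u.id = p.user_id"
).fetchone()

# The same address in different case is rejected by the unique constraint.
try:
    conn.execute("INSERT INTO users (username) VALUES ('ANN@NEW.EXAMPLE')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```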
I would say no for one simple reason: many systems allow users to change their usernames. In your case, you link this to an email address, which users should be allowed to change.
If you use it as a foreign key, you have to run updates to keep your data in sync, and that is bad.
This is an old natural vs. surrogate key discussion. There are "fans" of either approach, but the simple truth is that both have pros and cons, and you'll have to make your own decision that best fits your particular situation.
For the specific case of e-mail as PK, you might want to take a look at this discussion.
This is my first MVC/Linq to SQL Application. I'm using the out of the box SQL Membership with ASP.NET to track users through my system.
As most of you know, the UserId is a guid, great. However, to link other user-created tables in the system, I decided to go with username instead of userid. The reason I did this was because:
Username is unique anyway
It prevents me from having to make an extra call when handling db functions.
So for example: I don't have to do a look up on the userid based on username to create a new story; I simply insert User.Identity.Name into the story table.
Now I did run into some nasty complication, which seems to be related to this. It worked fine on my local machine, but not on the host. I continually got an error that went something like this:
"System.InvalidCastException: Specified cast is not valid. at System.Data.Linq.IdentityManager.StandardIdentityManager.SingleKeyManager"...
This happened whenever an insert on the db occurred on the host. If I understand correctly, this is a bug currently that happens when you link a non integer field (in my case username) to another table of a non integer field (username in aspnet_user). Although the bug reported seems a little bit different, maybe they are similar?
https://connect.microsoft.com/VisualStudio/feedback/ViewFeedback.aspx?FeedbackID=351358
In any case, MS bug or not - is storing the username instead of the userid in my tables a bad idea? If it is, why?
Update
I just wanted to add some more context here. A good point people are bringing up is that this is dangerous if I want to allow users to change their username in the future. Perfectly valid!
However, this application relies heavily on the username. Each user creates one and only one story. They then link to their story by using: mysite/username. Therefore, the application will never allow them to change their username. It would cause a potential nightmare for people who follow the link only to see it no longer exists.
Be careful with your comment that usernames are unique. The minute Anita Takeabath gets married to Seymour Butts, suddenly atakeabath wants to be abutts.
Just a thought!
I've used the same approach as you and it works. Do you have a relationship between your application table and the table from the membership db? If so, you may want to remove that relationship.
My only thought would be in order to future proof your application, the userid would offer flexibility in users changing their username, as the userid would remain constant (like SO for instance).
But that is something that has to fit your application requirements. Then again, requirements often tend to change without a developer's control.
It's bad for the following reasons:
You mentioned avoiding extra database calls. However, by joining tables, there is no "extra" call to the database. You can argue that joining is more expensive than not joining at all. However, most likely a story needs more user information than the user login name (note: user names are not unique, user login names are unique). So you need a join anyway for most database operations.
User login names vary in length, so they do not perform well when used in joins.
Edit: modified format. I am still learning how to make my post look better:-)
If the reason you're implementing this is for easier access to the user's GUID, I suggest having your FormsAuthentication.SetAuthCookie use the user's GUID as the name property and use User.Identity.Name throughout your application.
Using username as the unique identifier could have bad consequences in the future. Should you want to allow users to change their username later, you will have a hard time implementing that.
One requirement is that when persisting my C# objects to the database I must decide the database ID (surrogate primary key) in code.
Second requirement is that the database type for the key must be int or char(x)... so no uniqueidentifier or binary(16) or the like.
These are unchangeable requirements.
What would be the best way to go about handling this?
One idea is the base64 encoded GUIDs looking like "XSiZtdXcKU68QWe7N96Dig". These are easily created in code and are to me acceptable in URLs if necessary. But will it be too expensive regarding performance (indexing, size) having all primary and foreign keys be char(22)? Off hand I really like this idea.
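As a rough sketch of the 22-character idea: a GUID is 16 bytes, Base64 turns that into 24 characters, and the last two are always `=` padding that can be dropped. The helper names below are made up, and the sketch is in Python with the URL-safe alphabet (`-` and `_` instead of `+` and `/`), which is an assumption on my part rather than what the question necessarily uses.

```python
import base64
import uuid


def new_short_id():
    """Return a random GUID encoded as a 22-character URL-safe string."""
    raw = uuid.uuid4().bytes  # 16 bytes
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def decode_short_id(short_id):
    """Restore the padding and turn the short string back into a UUID."""
    padded = short_id + "=" * (-len(short_id) % 4)
    return uuid.UUID(bytes=base64.urlsafe_b64decode(padded))


sid = new_short_id()
original = decode_short_id(sid)
```

Note that .NET's Guid.ToByteArray() orders the first three fields differently from Python's `uuid.bytes`, so ids generated in one runtime will not decode to the same textual GUID in the other unless the byte order is accounted for.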
Another idea would be to create a code version of a database sequence creating incremented integers for me. But I don't know if this is plausible and would need some guidance to secure the reliability. The sequencer must know how far it has come, and what about threads that I don't control etc.
I imagine that no table involved will ever exceed 1.000.000 rows... will probably be far less.
You could have a table called "sequences". For each table there would be a row with a counter. Then, when you need another number, fetch it from the counter table and increment it. Put it in a transaction and you will have uniqueness.
However this will suffer in terms of performance, of course.
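A minimal sketch of such a counter table, assuming sqlite3 as a stand-in for SQL Server and hypothetical names (`sequences`, `next_id`). The read and the increment happen inside one transaction that takes the write lock first, which is what gives the uniqueness and also why it costs some performance.

```python
import sqlite3

# isolation_level=None lets us issue BEGIN/COMMIT explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE sequences (table_name TEXT PRIMARY KEY, next_id INTEGER NOT NULL)"
)
conn.execute("INSERT INTO sequences VALUES ('events', 1)")


def next_id(conn, table_name):
    """Fetch the next id for table_name and advance the counter atomically."""
    # BEGIN IMMEDIATE acquires the write lock up front, so two callers
    # cannot both read the same counter value.
    conn.execute("BEGIN IMMEDIATE")
    try:
        (value,) = conn.execute(
            "SELECT next_id FROM sequences WHERE table_name = ?", (table_name,)
        ).fetchone()
        conn.execute(
            "UPDATE sequences SET next_id = next_id + 1 WHERE table_name = ?",
            (table_name,),
        )
        conn.execute("COMMIT")
        return value
    except Exception:
        conn.execute("ROLLBACK")
        raise


ids = [next_id(conn, "events") for _ in range(3)]
```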
A simple incrementing int would be the easiest way to ensure uniqueness. This is what the database will do if you let it. If you set the column to auto-increment (IDENTITY in SQL Server), the database will do this for you automatically.
There are no security issues with this, but since you will be handling it yourself instead of letting the database engine take care of it, you will need to ensure that you don't generate the same id twice. This should be simple if you are on a single threaded system, but if your program is distributed you will need to put some effort into ensuring the uniqueness.
Seeing that you have an ASP.NET app, you could do the following (hoping and assuming all users must authenticate themselves before using your app!):
Assign each user a unique "UserID" in your database (can be INT, or CHAR)
Assign each user a "HighestSequentialID" (INT) in your database
When the user logs on, read those values from the database and store them in e.g. a custom principal, or in a cookie, or something else
Whenever the user is about to insert a row, create a segmented ID: (UserID).(User's sequential number) and store it as "VARCHAR(20)" - e.g. your UserID is 15 and thus this user's entries would have unique IDs of "15.00001", "15.00002" and so on.
When the user logs off (or at any other time), write the user's new highest used sequential ID back to the database so that next time around, you'll know what this user has used last.
Again - you'll have to do a lot more housekeeping work yourself, and it's always prone to a mishap (assigning a duplicate user ID, or misinterpreting the highest sequential number for that user).
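The segmented-ID steps above could look roughly like this. It is only a sketch: the `segmented_id` function and `UserIdAllocator` class are hypothetical names, the five-digit width is taken from the "15.00001" example, and persisting `highest_used` back to the database is left out.

```python
def segmented_id(user_id, sequence_number, width=5):
    """Build an id such as '15.00001' from a user id and a per-user counter."""
    return f"{user_id}.{sequence_number:0{width}d}"


class UserIdAllocator:
    """Hands out ids for one logged-in user, starting after the last used one."""

    def __init__(self, user_id, highest_used):
        self.user_id = user_id
        self.highest_used = highest_used  # value read from the database at logon

    def next(self):
        self.highest_used += 1
        return segmented_id(self.user_id, self.highest_used)


allocator = UserIdAllocator(user_id=15, highest_used=0)
first = allocator.next()
second = allocator.next()
```

If the process dies before `highest_used` is written back, the next session will reissue ids that are already in use, which is exactly the kind of mishap the answer warns about.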
I would strongly recommend trying to get these requirements changed - with these in place, all solutions will be sub-optimal at best, while using the database to handle this would be totally painless.
Marc
For a table below 1.000.000 rows, I would not be too terribly concerned about a char(22) Primary key. Of course the ideal solution for a situation like this would be for each object to have something unique about it that you could leverage for the key, even if it is a multi-part key. The next ideal solution would be to have the requirements changed :)
Why does aspnet_users use guid for id rather than incrementing int?
Also, is there any reason not to use this in other tables as the primary key? It feels a bit odd, as I know most apps I've worked with in the past just use the normal int system.
I'm also about to start using this id to match against an extended details table for extra user prefs etc. I was also considering using a link table with a guid and an int in it, but decided against it, as I don't think I actually need to have the user id as a public int.
Although I would like to have the int (feels easier to do a user lookup etc. stackoverflow.com/users/12345/user-name), as I am just going to have the username, I don't think I need to carry this item around and incur the extra complexity of lookups when I need to find a user's int.
Thanks for any help with any of this.
It ensures uniqueness across disconnected systems. Any data store which may need to interface with another previously unconnected datastore can potentially encounter collisions - e.g. they both used int to identify users, now we have to go through a complex resolution process to choose new IDs for the conflicting ones and update all references accordingly.
The downside to using a standard uniqueidentifier in SQL (with newid()) as the primary key is that GUIDs are not sequential, so as new rows are created they are inserted at some arbitrary position in the physical database page, instead of appended to the end. This causes severe page fragmentation for systems that have any substantial insert rate. It can be corrected by using newsequentialid() instead. I discussed this in more detail here.
In general, it's best practice to either use newsequentialid() for your GUID primary key, or just not use GUIDs for the primary key. You can always have a secondary indexed column that stores a GUID, which you can use to keep your references unique.
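The difference between random and sequential keys can be illustrated with a toy model. This is not how SQL Server stores pages, and it compares raw bytes rather than using SQL Server's uniqueidentifier ordering; it just counts how many new keys would have to be placed somewhere before the end of a sorted index. The function name is made up for the example.

```python
import bisect
import uuid


def mid_index_inserts(keys):
    """Count keys that land before the current end of a sorted index."""
    index = []
    count = 0
    for key in keys:
        pos = bisect.bisect_left(index, key)
        if pos < len(index):
            count += 1  # would go into the middle of existing data
        index.insert(pos, key)
    return count


random_keys = [uuid.uuid4().bytes for _ in range(1000)]  # like NEWID()
sequential_keys = list(range(1000))                      # like IDENTITY

random_mid = mid_index_inserts(random_keys)
sequential_mid = mid_index_inserts(sequential_keys)
```

Sequential keys always append, while nearly every random key lands in the middle, which is the behaviour behind the page fragmentation described above.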
GUIDs as a primary key are quite popular with certain groups of programmers who don't really (don't want to or don't know to) care about their underlying database.
GUIDs are cool because they're (almost) guaranteed to be unique, and you can create them "ahead of time" on the client app in .NET code.
Unfortunately, those folks aren't aware of the terrible downsides (horrendous index fragmentation and thus performance loss) of those choices when it comes to SQL Server. Lots of programmers really just don't care one bit... and then blame SQL Server for being slow as a dog.
If you want to use GUIDs for your primary keys (and they do have some really good uses, as Rex M. pointed out - in replication scenarios mostly), then OK, but make sure to use an INT IDENTITY column as your clustering key in SQL Server to minimize index fragmentation and thus performance losses.
Kimberly Tripp, the "Queen of SQL Server Indexing", has a bunch of really good and insightful articles on the topic - see some of my favorites:
GUIDs as Primary and/or clustering key
The clustered index key debate continues....
The clustered index key debate....again!
Indexes in SQL Server 2005/2008 Best Practices
and basically anything she ever publishes on her blog is worth reading.
So in my simple learning website, I use the built in ASP.NET authentication system.
I am adding now a user table to save stuff like his zip, DOB etc. My question is:
In the new table, should the key be the user name (the string) or the user ID, which is that GUID-looking number they use in the asp_ tables?
If the best practice is to use that ugly guid, does anyone know how to get it? It does not seem to be as easily accessible as the name (System.Web.HttpContext.Current.User.Identity.Name).
If you suggest I use neither (not the guid nor the userName fields provided by ASP.NET authentication), then how do I do it with ASP.NET authentication? One option I like is to use the email address of the user as login, but how do I make the ASP.NET authentication system use an email address instead of a user name? (Or is there nothing to do there, and it is just me deciding I "know" userName is actually an email address?)
Please note:
I am not asking how to get a GUID in .NET; I am just referring to the userID column in the asp_ tables as a guid.
The user name is unique in ASP.NET authentication.
You should use some unique ID, either the GUID you mention or some other auto generated key. However, this number should never be visible to the user.
A huge benefit of this is that all your code can work on the user ID, but the user's name is not really tied to it. Then, the user can change their name (which I've found useful on sites). This is especially useful if you use email address as the user's login... which is very convenient for users (then they don't have to remember 20 IDs in case their common user ID is a popular one).
You should use the UserID.
It's the ProviderUserKey property of MembershipUser.
Guid UserID = new Guid(Membership.GetUser(User.Identity.Name).ProviderUserKey.ToString());
I would suggest using the username as the primary key in the table if the username is going to be unique. There are a few good reasons to do this:
The primary key will be a clustered index by default, and thus searching for a user's details via their username will be very quick.
It will stop duplicate usernames from appearing.
You don't have to worry about using two different pieces of information (username or guid).
It will make writing code much easier because of not having to lookup two bits of information.
I would use a userid. If you use the user name as the key, you are going to make the "change the username" feature very expensive.
I would say use the UserID so Usernames can still be changed without affecting the primary key. I would also set the username column to be unique to stop duplicate usernames.
If you'll mainly be searching on username rather than UserID, then make Username a clustered index and set the primary key to be non-clustered. This will give you the fastest access when searching for usernames. If, however, you will mainly be searching for UserIDs, then leave the primary key as the clustered index.
Edit : This will also fit better with the current ASP.Net membership tables as they also use the UserID as the primary key.
I agree with Palmsey,
Though there seems to be a little error in his code:
Guid UserID = new Guid(Membership.GetUser(User.Identity.Name)).ProviderUserKey.ToString());
should be
Guid UserID = new Guid(Membership.GetUser(User.Identity.Name).ProviderUserKey.ToString());
This is old but I just want people who find this to note a few things:
The aspnet membership database IS optimized when it comes to accessing user records. The clustered index seek (optimal) in sql server is used when a record is searched for using loweredusername and applicationid. This makes a lot of sense as we only have the supplied username to go on when the user first sends their credentials.
The guid userid will give a larger index size than an int, but this is not really significant because we often only retrieve 1 record (user) at a time. In terms of fragmentation, the number of reads usually greatly outweighs the number of writes and edits to a users table - people simply don't update that info all that often.
The regsql script that creates the aspnet membership tables can be edited so that, instead of using NEWID() as the default for userid, it uses NEWSEQUENTIALID(), which delivers better performance (I have profiled this).
Profile. Someone creating a "new learning website" should not try to reinvent the wheel. One of the websites I have worked for used an out-of-the-box version of the aspnet membership tables (excluding the horrible profile system), and the users table contained nearly 2 million user records. Even with that many records, selects were still fast because, as I said to begin with, the database indexes focus on loweredusername + applicationid to perform a clustered index seek for these records. Generally speaking, if SQL Server is doing a clustered index seek to find 1 record, you don't have any problems, even with huge numbers of records, provided that you don't add columns to the tables and start pulling back too much data.
Worrying about a guid in this system, to me, based on actual performance and experience of the system, is premature optimization. If you have an int for your userid but the system performs sub-optimal queries because of your custom index design etc. the system won't scale well. The Microsoft guys did a generally good job with the aspnet membership db and there are many more productive things to focus on than changing userId to int.
I would use an auto incrementing number usually an int.
You want to keep the size of the key as small as possible. This keeps your index small and benefits any foreign keys as well. Additionally, you are not tightly coupling the data design to external user data (this holds true for the aspnet GUID as well).
Generally GUIDs don't make good primary keys, as they are large and inserts can happen at potentially any data page within the table rather than at the last data page. The main exception to this is if you are running multiple replicated databases. GUIDs are very useful for keys in this scenario, but I am guessing you only have one database, so this is not a problem.
If you're going to be using LinqToSql for development, I would recommend using an Int as a primary key. I've had many issues when I had relationships built off of non-Int fields, even when the nvarchar(x) field had constraints to make it a unique field.
I'm not sure if this is a known bug in LinqToSql or what, but I've had issues with it on a current project and I had to swap out PKs and FKs on several tables.
I agree with Mike Stone. I would also suggest only using a GUID in the event you are going to be tracking an enormous amount of data. Otherwise, a simple auto incrementing integer (Id) column will suffice.
If you do need the GUID, .NET is lovely enough that you can get one by a simple...
Dim guidProduct As Guid = Guid.NewGuid()
or
Guid guidProduct = Guid.NewGuid();
I'm agreeing with Mike Stone also. My company recently implemented a new user table for outside clients (as opposed to internal users who authenticate through LDAP). For the external users, we chose to store the GUID as the primary key, and store the username as varchar with unique constraints on the username field.
Also, if you are going to store the password field, I highly recommend storing the password as a salted, hashed binary value in the database. This way, if someone were to hack your database, they would not have access to your customers' passwords.
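For the salted, hashed password suggestion, a minimal sketch using Python's standard library might look like the following. PBKDF2 with SHA-256 and the iteration count are my assumptions, not something this answer specifies, and the function names are made up. The salt and digest would be stored as binary columns next to the user row.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed work factor; tune for your hardware


def hash_password(password, salt=None):
    """Return (salt, digest); a fresh random salt is generated if none is given."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Re-hash the candidate with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)


salt, stored = hash_password("s3cret")
good = verify_password("s3cret", salt, stored)
bad = verify_password("guess", salt, stored)
```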
I would use the guid in my code and, as already mentioned, an email address as username. It is, after all, already unique and memorable for the user. Maybe even ditch the guid (very debatable).
Someone mentioned using a clustered index on the GUID if this was being used in your code. I would avoid this, especially if INSERTs are frequent: random GUID values land in the middle of the index, which causes page splits and fragmentation. Clustered indexes work well on auto-increment IDs, though, because new records are only appended.