Is it best practice to reference the IdentityUser primary throughout the application? - .net-core

I'm relatively new to .NET Core MVC / Web API and I'm currently implementing a multi user, multi role system.
The default IdentityUser that I extend from has a Guid primary key (which I know I can override).
In all other systems yet I pointed to the user's auto incremented PK int value to reference data that belongs to a certain user.
The question now is if this is Microsoft's intended use of this primary key for IdentityUser and if it's ok when I reference in all tables this (big) Guid, or I could create a separate Users table and create an (IdentityUser.Id, UserId auto increment) separate table and reference this UserId int. Or even another solution which I don't know yet.
I especially ask because I read multiple times that this Guid is supposed to be kept secret, but when I start to reference the Guid everywhere the likelihood of leakage increases.

Database is where all your secrets are stored, therefore, you don't need to worry if your GUIDs are used as foreign keys in other tables because they are still in your database. Maybe what you read about keeping it secret meant that you should not expose it to client-side apps, although I still disagree. Because one advantage of GUID is that even if a user gains one GUID, (unlike auto-incremented integers) he cannot guess the other ones.

Related

How to create a good primary key in DynamoDB

I have an application on AWS using DynamoDB with user sending messages to each other. I am not familiar with AWS and I a lacking best practice knowledge
My application has now started to get slow to retrieve messages for a user because I have more and more data in my database.
I am thinking that it is because of my primary key and I wonder what could be a good primary key in this case.
Currently I am using a random guid as a primary key.
I am looking to retrieve all messages corresponding to a user, I am doing a scan operation.
I would like to use a composite value based on username as a primary key but I wonder if it will be better. For instance if I need to retrieve the number of messages for a user and to increment it will probably be even longer to do the request to create the primary key.
What would be a good primary key here ?
Thanks!
It will be better since it appears you often query based on the userid. Scans are expensive and should be avoided where possible. AWS has a great article on best practices for choosing a partition key (primary key). The key takeaway is the following:
You should evaluate various approaches based on your data ingestion and access pattern, then choose the most appropriate key with the least probability of hitting throttling issues.
Using a guid for the partition/primary key is a waste if you never query the data using it. Since using the query operation (rather than using scan) requires querying using the partition/primary (and sort key), you want to ensure you choose a value that you use to retrieve the data often and also has the sufficient cardinality to ensure your data is distributed across a reasonable amount of partitions.
What other access patterns do you have in your application? From what you've mentioned so far, userid seems to be a reasonable choice.

Is there a way to enforce a schema constraint on an AWS DynamoDB table?

I'm a MSSQL developer who recently was tasked with building a new application using DynamoDB since we use AWS and we wanted a highly scaleable database service.
My biggest concern is data integrity. For example, I have a table for all my users where every row needs to have a username, email, and name field, all strings, with a verified field that's an int. Is there anyway to require all entries in that table to have those fields and to be of that particular type?
Since the application is in PHP I'm using Kettle as my ORM which should prevent me from messing up the data integrity but another developer voiced a concern about if we ever add another application or if someone manually changes some types via the console.
https://github.com/inouet/kettle
Currently, no, you are responsible for maintaining the integrity of your items with respect to the existence of attributes that are not keys on the base table. However, you can use LSI and GSI to enforce data types of attributes (notwithstanding my qualm that this is not a recommended pattern, as it could cause partition heat especially for attributes whose range of values is small). For example, verified seems like it might take only 0 or 1 as a value, so if you create a GSI with PK=verified where verified is a Number, writes to the base table may get throttled by the verified GSI.

Relationship Between ASPNET Membership Provider Tables and Custom Membership Tables

I went through a custom profile provider example a while ago and I am
now revisiting it.
My database has all the dbo.aspnet_* tables created when I ran the aspnet registration
wizard. In these tables I have aspnet_Profile which has a FK constraint pointing to aspnet_Users.
I also have two tables in MyDB: The first, dbo.ProfileData, has a foreign key constraint
pointing to dbo.Profile.
What I want to understand is how the tables in MyDB relate to
those in dbo.aspnet_*. Shouldn't there be a foreign key constraint (or some kind of
relationship) between the profile tables in MyDB and the aspnet tables? Some discussion
of how my custom tables relate to those provided by aspnet would be wonderful.
Thanks in advance.
There are two options I can see, both of which will yield basically the same result:
FK from dbo.aspnet_User.UserID to dbo.Profile.UserID, then define a unique key on dbo.Profile.UserID (unless you use it as the PK column for dbo.Profile)
FK from dbo.aspnet_Profile.ProfileID to dbo.Profile.ProfileID
dbo.aspnet_User is logically 1 - 1 with dbo.aspnet_Profile, so it doesn't really matter which approach you use as you will still get the same relational integrity.
If you are replacing the standard profile data table with your own implementation then it makes more sense to use the first suggestion, otherwise if you are extending the Profile schema then use the second suggestion.
EDIT
aspnet_Profile is the standard table - the standard SqlProfileProvider stores the user's profile data as a serialized property bag in aspnet_Profile, hence why there is no separate aspnet_ProfileData table as well.
This approach allows the profile schema to be customized easily for different applications without requiring any changes to the underlying database, and is the most optimal solution for a framework such as .NET. The drawback is that SQL Server does not have easy access to this data at all, so it is much more difficult to index, update and query the user's profile data using T-SQL and set-based logic.
The most common approach I have seen to remove this limitation is to extend the standard SqlProfileProvider to write to a custom profile data table which has specific columns for application-specific profile properties. This table naturally has a 1-1 relationship with the aspnet_Profile table, so it has a foreign key as indicated above.
The role of the extended provider is to promote specific profile properties to columns during profile writes, and read in the columns when the profile is retrieved.
This allows you to mix-and-match storage solutions on an as-needs basis, as long as your extended provider knows how to fall back to the standard implementation where it does not 'know' about a given property.
I always think it is best to leave the standard membership tables as-is, and extend where necessary using new tables with appropriate foreign keys, then subclass the appropriate provider and override the provider methods with your own implementation (calling into the base implementation wherever possible).

Why does aspnet_users use guid for id rather than incrementing int? bonus points for help on extending user fields

Why does aspnet_users use guid for id rather than incrementing int?
Also is there any reason no to use this in other tables as the primary key? It feels a bit odd as I know most apps I've worked with in the past just use the normal int system.
I'm also about to start using this id to match against an extended details table for extra user prefs etc. I was also considering using a link table with a guid and an int in it, but decided that as I don't think I actually need to have user id as a public int.
Although I would like to have the int (feels easier to do a user lookup etc stackoverflow.com/users/12345/user-name ) , as I am just going to have the username, I don't think I need to carry this item around, and incure the extra complexity of lookups when I need to find a users int.
Thanks for any help with any of this.
It ensures uniqueness across disconnected systems. Any data store which may need to interface with another previously unconnected datastore can potentially encounter collisions - e.g. they both used int to identify users, now we have to go through a complex resolution process to choose new IDs for the conflicting ones and update all references accordingly.
The downside to using a standard uniqueidentifier in SQL (with newid()) as the primary key is that GUIDs are not sequential, so as new rows are created they are inserted at some arbitrary position in the physical database page, instead of appended to the end. This causes severe page fragmentation for systems that have any substantial insert rate. It can be corrected by using newsequentialid() instead. I discussed this in more detail here.
In general, its best practice to either use newsequentialid() for your GUID primary key, or just don't use GUIDs for the primary key. You can always have a secondary indexed column that stores a GUID, which you can use to keep your references unique.
GUIDs as a primary key are quite popular with certain groups of programmers who don't really (don't want to or don't know to) care about their underlying database.
GUIDs are cool because they're (almost) guaranteed to be unique, and you can create them "ahead of time" on the client app in .NET code.
Unfortunately, those folks aren't aware of the terrible downsides (horrendous index fragmentation and thus performance loss) of those choices when it comes to SQL Server. Lots of programmer really just don't care one bit..... and then blame SQL Server for being slow as a dog.
If you want to use GUIDs for your primary keys (and they do have some really good uses, as Rex M. pointed out - in replication scenarios mostly), then OK, but make sure to use a INT IDENTITY column as your clustering key in SQL Server to minimize index fragmentation and thus performance losses.
Kimberly Tripp, the "Queen of SQL Server Indexing", has a bunch of really good and insightful articles on the topic - see some of my favorites:
GUIDs as Primary and/or clustering key
The clustered index key debate continues....
The clustered index key debate....again!
Indexes in SQL Server 2005/2008 Best Practices
and basically anything she ever publishes on her blog is worth reading.

Should I use the username, or the user's ID to reference authenticated users in ASP.NET

So in my simple learning website, I use the built in ASP.NET authentication system.
I am adding now a user table to save stuff like his zip, DOB etc. My question is:
In the new table, should the key be the user name (the string) or the user ID which is that GUID looking number they use in the asp_ tables.
If the best practice is to use that ugly guid, does anyone know how to get it? it seems to not be accessible as easily as the name (System.Web.HttpContext.Current.User.Identity.Name)
If you suggest I use neither (not the guid nor the userName fields provided by ASP.NET authentication) then how do I do it with ASP.NET authentication? One option I like is to use the email address of the user as login, but how to I make ASP.NET authentication system use an email address instead of a user name? (or there is nothing to do there, it is just me deciding I "know" userName is actually an email address?
Please note:
I am not asking on how get a GUID in .NET, I am just referring to the userID column in the asp_ tables as guid.
The user name is unique in ASP.NET authentication.
You should use some unique ID, either the GUID you mention or some other auto generated key. However, this number should never be visible to the user.
A huge benefit of this is that all your code can work on the user ID, but the user's name is not really tied to it. Then, the user can change their name (which I've found useful on sites). This is especially useful if you use email address as the user's login... which is very convenient for users (then they don't have to remember 20 IDs in case their common user ID is a popular one).
You should use the UserID.
It's the ProviderUserKey property of MembershipUser.
Guid UserID = new Guid(Membership.GetUser(User.Identity.Name).ProviderUserKey.ToString());
I would suggest using the username as the primary key in the table if the username is going to be unique, there are a few good reasons to do this:
The primary key will be a clustered index and thus search for a users details via their username will be very quick.
It will stop duplicate usernames from appearing
You don't have to worry about using two different peices of information (username or guid)
It will make writing code much easier because of not having to lookup two bits of information.
I would use a userid. If you want to use an user name, you are going to make the "change the username" feature very expensive.
I would say use the UserID so Usernames can still be changed without affecting the primary key. I would also set the username column to be unique to stop duplicate usernames.
If you'll mainly be searching on username rather than UserID then make Username a clustered index and set the Primary key to be non clustered. This will give you the fastest access when searching for usernames, if however you will be mainly searching for UserIds then leave this as the clustered index.
Edit : This will also fit better with the current ASP.Net membership tables as they also use the UserID as the primary key.
I agree with Palmsey,
Though there seems to be a little error in his code:
Guid UserID = new Guid(Membership.GetUser(User.Identity.Name)).ProviderUserKey.ToString());
should be
Guid UserID = new Guid(Membership.GetUser(User.Identity.Name).ProviderUserKey.ToString());
This is old but I just want people who find this to note a few things:
The aspnet membership database IS optimized when it comes to accessing user records. The clustered index seek (optimal) in sql server is used when a record is searched for using loweredusername and applicationid. This makes a lot of sense as we only have the supplied username to go on when the user first sends their credentials.
The guid userid will give a larger index size than an int but this is not really significant because we often only retrieve 1 record (user) at a time and in terms of fragmentation, the number of reads usually greately outweighs the number of writes and edits to a users table - people simply don't update that info all that often.
the regsql script that creates the aspnet membership tables can be edited so that instead of using NEWID as the default for userid, it can use NEWSEQUENTIALID() which delivers better performance (I have profiled this).
Profile. Someone creating a "new learning website" should not try to reinvent the wheel. One of the websites I have worked for used an out of the box version of the aspnet membership tables (excluding the horrible profile system) and the users table contained nearly 2 million user records. Even with such a high number of records, selects were still fast because, as I said to begin with, the database indexes focus on loweredusername+applicationid to peform clustered index seek for these records and generally speaking, if sql is doing a clustered index seek to find 1 record, you don't have any problems, even with huge numbers of records provided that you dont add columns to the tables and start pulling back too much data.
Worrying about a guid in this system, to me, based on actual performance and experience of the system, is premature optimization. If you have an int for your userid but the system performs sub-optimal queries because of your custom index design etc. the system won't scale well. The Microsoft guys did a generally good job with the aspnet membership db and there are many more productive things to focus on than changing userId to int.
I would use an auto incrementing number usually an int.
You want to keep the size of the key as small as possible. This keeps your index small and benefits any foreign keys as well. Additonally you are not tightly coupling the data design to external user data (this holds true for the aspnet GUID as well).
Generally GUIDs don't make good primary keys as they are large and inserts can happen at potentially any data page within the table rather than at the last data page. The main exception to this is if you are running mutilple replicated databases. GUIDs are very useful for keys in this scenario, but I am guessing you only have one database so this is not a problem.
If you're going to be using LinqToSql for development, I would recommend using an Int as a primary key. I've had many issues when I had relationships built off of non-Int fields, even when the nvarchar(x) field had constraints to make it a unique field.
I'm not sure if this is a known bug in LinqToSql or what, but I've had issues with it on a current project and I had to swap out PKs and FKs on several tables.
I agree with Mike Stone. I would also suggest only using a GUID in the event you are going to be tracking an enormous amount of data. Otherwise, a simple auto incrementing integer (Id) column will suffice.
If you do need the GUID, .NET is lovely enough that you can get one by a simple...
Dim guidProduct As Guid = Guid.NewGuid()
or
Guid guidProduct = Guid.NewGuid();
I'm agreeing with Mike Stone also. My company recently implemented a new user table for outside clients (as opposed to internal users who authenticate through LDAP). For the external users, we chose to store the GUID as the primary key, and store the username as varchar with unique constraints on the username field.
Also, if you are going to store the password field, I highly recommend storing the password as a salted, hashed binary in the database. This way, if someone were to hack your database, they would not have access to your customer's passwords.
I would use the guid in my code and as already mentioned an email address as username. It is, after all, already unique and memorable for the user. Maybe even ditch the guid (v. debateable).
Someone mentioned using a clustered index on the GUID if this was being used in your code. I would avoid this, especially if INSERTs are high; the index will be rebuilt every time you INSERT a record. Clustered indexes work well on auto increment IDs though because new records are appended only.

Resources