Best approach to having multiple users in one app - realm

This is a mobile app that can have different kinds of users. I'm using Realm only for offline storage. Say I have two users, A and B, and I have a List class. This class won't ever be shared, so each user has different data. How should I go about designing the schema, considering versioning and migration?
A. Add a primary key for the List and assign it differently to user A and B.
B. Use two different realms

There is no single right way to define your Realm schema; the best solution depends entirely on your exact scenario.
If you want your users' data to be completely independent of each other and you will never need a single query to retrieve both users' data or to access some common data, then using a separate Realm instance for each user seems like a good approach. It provides complete separation between your users' data.
However, if your users might have some shared data, or if you might end up computing statistics across all of your users even though their data is independent, using a single Realm instance is the way to go. In this case you should just create a one-to-many relationship between each of your users and whatever objects you want to store in your lists, like this:
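import RealmSwift

class User: Object {
    // One-to-many: each user owns its own list of Stuff objects (Stuff is another Object subclass).
    let stuff = List<Stuff>()
}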


CosmosDB/DocumentDB partitioning with multiple types in same collection

As far as I know, the team's official recommendation is to put all data types into a single collection and add something like a type=someType field to each document to distinguish the types.
Now, assume a large database with partitioning, where different object types can:
Have completely different fields (so no common field for partitioning)
Be related (through reference)
How do we organize things so that items that belong together end up in the same partition?
For example, let's say we have:
User
BlogPost
BlogPostComment
If we store them as separate types with type=user|blogPost|blogPostComment in the same collection, how do we ensure that a user, their blog posts and all the corresponding comments end up in the same partition?
Is there some best practice for this?
[UPDATE]
Can you ever avoid cross-partition queries completely? Should that be a goal? Or do you just try to minimize them?
For example, you can partition your data perfectly for 99% of cases/queries, but then you need some dashboard to show aggregates over all the data. Is that something you just accept as inevitable and try to minimize, or is it possible to avoid it completely?
I've written about this somewhat extensively in other similar questions regarding Cosmos.
Basically, when dealing with many different logical entity types in a single Cosmos collection the easiest option is to put a generic (or abstract, as you refer to it) partition key on all your documents. At this point it's the concern of the application to make sure that at runtime the appropriate value is chosen. I usually name this document property either partitionKey, routingKey or something similar.
This is extremely important when designing for optimal query efficiency as your choice of partition keys can have a huge impact on query and throughput performance. A generic key like this lets you design the optimal storage of your data as it benefits whatever application you're building.
Even something like tenant does not make sense on its own, as different tenants might have wildly different data sizes and access patterns. Instead you could include the tenantId at runtime as part of your partition key, as a kind of composite.
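To make the composite idea concrete, here is a minimal sketch of how an application might fill in a generic partitionKey property at runtime. It uses plain dictionaries rather than the Cosmos SDK, and the helper names (partition_key, make_document), the "|" separator and the choice of the blog owner's id as the second component are all assumptions made for illustration, not something the service prescribes.

```python
def partition_key(tenant_id: str, entity_group: str) -> str:
    """Build a composite partition key value such as 'tenant42|u7' (hypothetical format)."""
    return f"{tenant_id}|{entity_group}"


def make_document(doc_type: str, doc_id: str, tenant_id: str, owner_user_id: str, **fields) -> dict:
    """Return a document dict that carries a generic partitionKey property."""
    return {
        "id": doc_id,
        "type": doc_type,
        # The application, not the database, decides which value goes here.
        "partitionKey": partition_key(tenant_id, owner_user_id),
        **fields,
    }


# A user, that user's blog post, and a comment on that post all receive the same key.
user = make_document("user", "u7", "tenant42", "u7", name="Ann")
post = make_document("blogPost", "p1", "tenant42", "u7", title="Hello")
comment = make_document("blogPostComment", "c1", "tenant42", "u7", postId="p1", text="Nice")
```

In this sketch the comment is keyed by the blog owner rather than by the commenter, which keeps a post and its comments together at the cost of a cross-partition query when listing everything one commenter wrote.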
UPDATE:
For certain query patterns it might be possible to serve them entirely out of a single partition. It's definitely not the end of the world if things end up going cross-partition, though. The system is still quick. If possible, limiting the number of partitions that need to be touched for a given query is ideal, but you're never going to get away from it 100% of the time.
A partition should hold data related to a group that is expected to grow, for instance a tenant, which will group many documents (which can be of different types, as you mentioned). So the partition key in this instance should be the TenantId. Partitioning is more about which group the data relates to than about the type of data. If the data is related to a user, you could use the UserId; however, many users may comment on the same posts, so it doesn't seem like a good candidate for a partition key unless the user info is denormalized so that it doesn't have to relate back to the other users directly.

What performance can I expect from such a query in Firebase?

I'm wondering whether the strategy I explain below would be recommended in Firebase.
I will first explain what my goal is, since I'm sure tons of others have solved the same problem already, and maybe some of you can tell me how it's usually done.
The goal is to notify all users of an App when the friend in common "George" (based on their contacts) is now also a proud new user of the App.
So, my idea was to do so:
1- Build a collection with this structure:
{
  "contacts": {
    "user1": {
      "user239": true,
      "user23": false,
      "user732": true
    },
    "user2": {
      "user23": false,
      "user96": false,
      "user88": true
    }
  }
}
This saves a list of contacts for each user.
Then the new user would query a list of contacts like this:
fbRef.child('contacts').orderByChild('user23').equalTo(false).once('value', showResults, console.error);
Then the user would save the results in a map, change the value to true, and then updateChildren() using that map.
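The step just described (collect the matches, flip the flag, send one update) can be sketched without the Firebase SDK. The following is a pure-Python simulation over an in-memory dictionary; build_fanout_update is a hypothetical helper, and in a real app the loop would be replaced by the query result, with the resulting map handed to the client's multi-path update call.

```python
def build_fanout_update(contacts: dict, new_user_id: str) -> dict:
    """Return {path: True} for every contact list that holds new_user_id as False."""
    updates = {}
    for owner_id, owner_contacts in contacts.items():
        # Only lists that know this contact and still mark them as a non-user qualify.
        if owner_contacts.get(new_user_id) is False:
            updates[f"contacts/{owner_id}/{new_user_id}"] = True
    return updates


# In-memory stand-in for the /contacts node (sample data, not real users).
contacts = {
    "user1": {"user239": True, "user23": False, "user732": True},
    "user2": {"user23": False, "user96": False, "user88": True},
    "user3": {"user88": True},
}
updates = build_fanout_update(contacts, "user23")
```

Note that this simulation scans every contact list, which is exactly the cost the question is worried about; the sketch only shows the shape of the update map, not a way around that scan.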
Now, is this reasonable if we imagine that we aspire to have hundreds of thousands and even millions of users using the App?
How expensive would this be when we have 5M users and a few joining by the second?
Is there a known "best strategy" for this case?
Thanks
The real-time functions in Firebase are not only suited for, but designed for large data sets. The fact that records stream in real-time is perfect for this.
Performance is, as with any large data app, only as good as your implementation. So here are a few gotchas to keep in mind with large data sets.
What you can do is denormalise the data. Instead of nesting everything under each user (first approach):
/users/uid
/users/uid/profile
/users/uid/chat_messages
/users/uid/groups
/users/uid/audit_record
split the data into separate top-level paths (second approach):
/user_profiles/uid
/user_chat_messages/uid
/user_groups/uid
/user_audit_records/uid
The second approach is better than the first for iterating over large data sets, because reading one kind of record does not pull in everything else stored under each user.
Avoid listening for the value event on large data sets; use child events such as child_added instead.
Remember that Firebase can handle large amounts of data, but how well it does so depends on the approach you follow.
For example, if we want to store the 'last_logins' for a user, we can store it directly under that user's object. This makes it easy to read 'last_logins' for a particular user.
Maintaining Many to Many Relations
We have already seen that we cannot nest Users in Groups, as that would not represent a many-to-many relation and would leave redundant data. Instead, we can create an index of groups under a specific user, containing only the keys of the groups that the user belongs to. This lets us easily fetch the list of groups a user belongs to.
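As a rough illustration of such an index, the sketch below models the tree as nested Python dictionaries. The node names (users, groups, members) and the sample data are assumptions; the point is only that each user stores group keys rather than copies of the groups.

```python
# In-memory stand-in for the database tree (hypothetical layout and data).
db = {
    "users": {
        "alice": {"name": "Alice", "groups": {"admins": True, "writers": True}},
        "bob": {"name": "Bob", "groups": {"writers": True}},
    },
    "groups": {
        "admins": {"title": "Administrators", "members": {"alice": True}},
        "writers": {"title": "Writers", "members": {"alice": True, "bob": True}},
    },
}


def groups_for_user(db: dict, uid: str) -> list:
    """Look up the titles of a user's groups through the key index stored under the user."""
    group_keys = db["users"][uid].get("groups", {})
    return sorted(db["groups"][key]["title"] for key in group_keys)
```

Each membership is written in two places (under the user and under the group), so both entries have to be updated together when a user joins or leaves a group.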
Storing One to Many Relations or Static Lists
Here are links to some best practices to follow for a Firebase design:
Performance
Indexing

MVC3 + MongoDB Architecture: Store models directly to database?

I am currently developing an MVC3 application using MongoDB, and I am quite unsure how I should build the architecture. For example, my app has a page for managing the profile of a registered user (name, email, and some attributes exposed in enum combo boxes), so I have a ManageProfileModel.cs with all the properties to manage. What's the proper way to use this data with MongoDB? Should I store the ManageProfileModel data in MongoDB, or do I have to add an additional layer of domain classes like User.cs, Invoice.cs, ... and store those objects in MongoDB (these objects being used in the models created)?
I am asking because a model for managing a user profile does not necessarily resemble a user (domain) object. My first approach is to store my (view) models directly in MongoDB, but I am not sure whether it will be that easy to get my (consistent) data back at a later point.
Thanks!
I would store the models directly in Mongo as-is for most of your data. I'm sure you know this already, but Mongo focuses on denormalization, and so it's different than traditional relational databases that want you to normalize your data.
So for a profile, you might have a user, a set of invoices, a set of addresses etc. As you decide your data models, I would suggest the following:
Consider your UI. If you need user + profile + invoices, go ahead and make a document like that. Makes your life a lot easier.
Don't be afraid to have repeated information stored.
You will constantly be wondering if you should embed a document (adding addresses to user) or link to a document (put a list of references in an array referencing invoices). The rule I've heard that I think is good: If the data is constantly changing, make a link/reference. If it's immutable or slowly changing, embed it.
If your document will grow a lot over time, consider breaking it up. Mongo has to move your document in memory if it grows too big.
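The embed-versus-link rule of thumb can be pictured with plain dictionaries. This is only a sketch under assumed names: user_doc, invoices and load_profile are hypothetical, no MongoDB driver is involved, and addresses are treated as slowly changing while invoices are treated as frequently changing.

```python
# Slowly changing data (addresses) is embedded; frequently changing data (invoices) is linked.
user_doc = {
    "_id": "user-1",
    "name": "Ann",
    "addresses": [{"city": "Oslo", "street": "Main St 1"}],  # embedded sub-documents
    "invoice_ids": ["inv-1", "inv-2"],  # references into another collection
}

# Stand-in for a separate invoices collection, keyed by _id.
invoices = {
    "inv-1": {"_id": "inv-1", "total": 120, "status": "paid"},
    "inv-2": {"_id": "inv-2", "total": 80, "status": "open"},
}


def load_profile(user: dict, invoice_collection: dict) -> dict:
    """Resolve the referenced invoices and return a profile shaped for the UI."""
    return {
        "name": user["name"],
        "addresses": user["addresses"],
        "invoices": [invoice_collection[i] for i in user["invoice_ids"]],
    }


profile = load_profile(user_doc, invoices)
```

The embedded addresses come back with the user in a single read, while the linked invoices need a second lookup but can change without rewriting the user document.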

sql server database design

I am planning to create a website using ASP.NET and SQL Server. However, my plan for the database design leaves me wondering if there is a better way.
The website will serve as a repository of information for various users. I figure I would have two databases, a Membership and Profile database.
The profile database would contain user data for all users, where each user may have ~20 tables. I would create the tables when the user account is created and generate a key used to name the tables. The tables are not directly related.
For Example a set of tables for two different users could look like:
User1 Tables - TransactionTable_Key1, AssetTable_Key1, ResearchTable_Key1 ....;
User2 Tables - TransactionTable_Key2, AssetTable_Key2, ResearchTable_Key2 ....;
The Key1, Key2 etc.. values would be retrieved based on the MembershipID data when the account was created. This could result in a very large number of tables over time. I'm not sure if this will limit scalability by setting up the database in this way. Any recommendations?
Edit: I should mention that some of these tables would contain 20k+ rows.
Realistically it sounds like you only really need one database for this.
From the way you worded your question, it sounds like you're trying to dynamically create tables for users as they create accounts. I wouldn't recommend this method.
What you want to do is create a master table that contains a primary key for each individual user. I'm assuming this is the Membership table. Then create the ~20 tables that you need for the profiles of these members. Every record, no matter the number of users that you have, will go into these tables. These 20 tables would need to have a foreign key pointing to the unique identifier of the Membership table.
When you want to query a Member for their user information, just select from the tables where the membership table's primary Id matches the foreign key in the profile tables.
This would result in only a few tables in the end and is easily maintainable and follows better database design.
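A small sketch of that layout follows, using Python's built-in sqlite3 module as a stand-in for SQL Server. The table and column names (Membership, UserTransaction, MembershipId) are illustrative assumptions, and only one of the ~20 profile tables is shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
CREATE TABLE Membership (
    MembershipId INTEGER PRIMARY KEY,
    UserName     TEXT NOT NULL
);
-- One shared table for all users, instead of TransactionTable_Key1, _Key2, ...
CREATE TABLE UserTransaction (
    TransactionId INTEGER PRIMARY KEY,
    MembershipId  INTEGER NOT NULL REFERENCES Membership(MembershipId),
    Amount        REAL NOT NULL
);
""")
conn.executemany("INSERT INTO Membership VALUES (?, ?)", [(1, "user1"), (2, "user2")])
conn.executemany(
    "INSERT INTO UserTransaction (MembershipId, Amount) VALUES (?, ?)",
    [(1, 10.0), (1, 5.5), (2, 99.0)],
)


def transactions_for(conn, membership_id):
    """Return one member's amounts by filtering the shared table on the foreign key."""
    rows = conn.execute(
        "SELECT Amount FROM UserTransaction WHERE MembershipId = ? ORDER BY TransactionId",
        (membership_id,),
    )
    return [amount for (amount,) in rows]
```

Adding a new user is then a single insert into Membership, with no schema change at all.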
Your ORM layer (EF, LINQ, DAL code) will hate having to deal with one set of tables per tenant. It is much better to have either one set of tables for all tenants in a single database, or a separate database per tenant. The latter is only better if schema upgrades have to be vetted by each tenant (as with Salesforce.com). If you can afford to upgrade all tenants to a new schema at once, there is no reason for a database per tenant.
When you design a schema that holds multiple tenants, the important things to remember are:
don't use heaps; every table must have a clustered index
add the tenant ID as the leftmost key of every clustered index
add the tenant ID as the leftmost key of every non-clustered index too
add the left.TenantID = right.TenantID predicate to every join
add the table.TenantID = @currentTenantID predicate to every query
These are fairly simple rules, and if you obey them (with no exceptions) you will get perfect per-tenant partitioning of every query (no query will ever scan rows in the range of a different tenant), so you eliminate contention between tenants. To be more thorough, you can disable lock escalation to make sure no tenant escalates to a lock that blocks every other tenant.
This design also lends itself to table partitioning and to sharding the database for scale-out.
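The rules above can be sketched in miniature with Python's sqlite3 module. This is an approximation rather than SQL Server: SQLite's WITHOUT ROWID tables play the role of a clustered index here, and the Project/Task tables, the index name and the sample rows are all assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- TenantId is the leftmost column of every primary (clustered) key.
CREATE TABLE Project (
    TenantId  INTEGER NOT NULL,
    ProjectId INTEGER NOT NULL,
    Name      TEXT NOT NULL,
    PRIMARY KEY (TenantId, ProjectId)
) WITHOUT ROWID;
CREATE TABLE Task (
    TenantId  INTEGER NOT NULL,
    TaskId    INTEGER NOT NULL,
    ProjectId INTEGER NOT NULL,
    Title     TEXT NOT NULL,
    PRIMARY KEY (TenantId, TaskId)
) WITHOUT ROWID;
-- TenantId also leads the secondary (non-clustered) index.
CREATE INDEX IX_Task_Project ON Task (TenantId, ProjectId);
""")
conn.executemany("INSERT INTO Project VALUES (?, ?, ?)", [(1, 1, "Site"), (2, 1, "App")])
conn.executemany(
    "INSERT INTO Task VALUES (?, ?, ?, ?)",
    [(1, 1, 1, "Design"), (1, 2, 1, "Build"), (2, 1, 1, "Plan")],
)


def task_titles(conn, tenant_id):
    """Join on TenantId as well as ProjectId, and filter by the current tenant."""
    rows = conn.execute(
        """
        SELECT p.Name, t.Title
        FROM Project AS p
        JOIN Task AS t
          ON t.TenantId = p.TenantId AND t.ProjectId = p.ProjectId
        WHERE p.TenantId = ?
        ORDER BY t.TaskId
        """,
        (tenant_id,),
    )
    return rows.fetchall()
```

Both tenants use ProjectId 1 in the sample data, so leaving the TenantId predicate out of the join would mix their tasks; with it, each tenant sees only its own rows.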
You definitely don't want to create a set of tables for each user, and you want everything in a single database. Even with SQL Server 2008's large capacity for tables (note: the limit is really on total objects in a database), it would quickly become unmanageable. Your best bet is to use 20 tables and separate them into user areas via a column. You might consider partitioning the tables by this user value, but that should be tested for performance too.
Yes, since the tables only contain id, key, and value, why not make one single table?
Have the columns:
id, user ID, key, value
Put an Index on the user ID field.
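A minimal sketch of that single table follows, again using sqlite3 in place of SQL Server. The names UserData, AttrKey and AttrValue are assumptions (the answer only says id, user ID, key, value), chosen to avoid clashing with SQL keywords.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE UserData (
    Id        INTEGER PRIMARY KEY,
    UserId    INTEGER NOT NULL,
    AttrKey   TEXT NOT NULL,
    AttrValue TEXT
);
-- The index on the user ID column keeps per-user lookups cheap.
CREATE INDEX IX_UserData_UserId ON UserData (UserId);
""")
conn.executemany(
    "INSERT INTO UserData (UserId, AttrKey, AttrValue) VALUES (?, ?, ?)",
    [(1, "asset", "car"), (1, "research", "bonds"), (2, "asset", "house")],
)


def attributes_for(conn, user_id):
    """Return all key/value pairs stored for one user as a dictionary."""
    rows = conn.execute(
        "SELECT AttrKey, AttrValue FROM UserData WHERE UserId = ? ORDER BY Id",
        (user_id,),
    )
    return dict(rows.fetchall())
```

Every value is stored as text in this sketch, so type checking moves into the application, which is the trade-off of this kind of key/value layout.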
A key idea behind a relational database is that the table structure does not change. You create a solid set of tables, and these are the "bones" of your application.
Cheers,
Daniel
Neal,
The solution really depends on your requirements. If security and data access are a concern and you have only a handful of users, you can set up a different database for each user, with each user's access restricted to their own database.
Otherwise, what Daniel Williams suggested is a good alternative: one database, with tables laid out so that an indexed column partitions each user's data rows.
It's hard to tell from the summary, but it looks like you are designing for dynamic attribution by user. This design approach is called EAV (Entity-Attribute-Value) and consists of a simple base collection key (UserID, SiteID, ProductID...) and then rows consisting of name/value pairs. In a more complex version, categories are sometimes added as "super columns" to the tuple/row and provide sub-groupings for a set of name/value pairs.
Designing in this way moves responsibility for data type integrity, relational integrity and tuple integrity to the application layer.
The risk with doing this in a relational system involves the breaking of the tuple or row into a set of rows. Updates, deletes, missing values and the definition of a tuple are no longer easily accessible through human interaction. As your application evolves and the definition of a tuple changes, it becomes almost impossible to tell if a name/value pair is missing because it's part of an earlier-version tuple or because it was unintentionally deleted. Ad-hoc research as well becomes harder to manage as business analysts must keep an understanding of the virtual structure either in their heads or in documentation provided.
If you are looking to implement an EAV model, I would suggest you look at a non-relational solution (nosql) like MongoDB or CouchDB. These stores allow a developer to save and retrieve "documents" or json-formatted messages that are essentially made up of a collection of name/value pairs and can look very much like a serialized object. The advantage here is that you can store dynamic attribution without breaking your tuple. You always know that you have a complete tuple because you can store and retrieve it as a single "blob" of information that can be serialized and deserialized at-will. You can also update single attributes within the tuple, if that's a concern.
MongoDB also provides some database-like features such as multiple-attribute indexes, a query engine that is robust in comparison to other similar non-relational offerings and a sharding solution that is much less trouble than trying to do it with MySQL.
I hope this helps.

Create new database programmatically in Asp.Net MVC application?

I have worked on a timesheet application in MVC 2 for internal use in our company. Now other small companies have shown interest in the application. I hadn't considered this use of the application, but it got me interested in what it might imply.
I believe I could make it work for several clients by modifying the database (Sql Server accessed by Entity Framework model). But I have read some people advocating multiple databases (one for each client).
Intuitively, this feels like a good idea, since I wouldn't risk having the data of various clients mixed up in the same database (which shouldn't happen of course, but what if it did...). But how would a multiple database solution be implemented specifically?
I.e. with a single database I could just have a client register and all the data needed would be added by the application the same way it is now when there's just one client (my own company).
But with a multiple database solution, how would I create a new database programmatically when a user registers? Please note that I have done all database stuff using Linq to Sql, and I am not very familiar with regular SQL programming...
I would really appreciate a clear detailed explanation of how this could be done (as well as input on whether it is a good idea or if a single database would be better for some reason).
EDIT:
I have also seen discussions about the single database alternative, suggesting that you would then add a ClientId to each table... But wouldn't that be hard to maintain in the code? I assume I would have to add "where" conditions to a lot of LINQ queries... And I assume having a ClientId on each table would mean that each table would need to have a many-to-one relationship to the Client table? Wouldn't that be a very complex database structure?
As it is right now (without the Client table) I have the following tables (1 -> * designates one to many relationship):
Customer 1 -> * Project 1 -> * Task 1 -> * TimeSegment 1 -> * Employee
Also, Customer has a one to many relationship directly with TimeSegment, for convenience to simplify some queries.
This has worked very well so far. Wouldn't it be possible to simply have a Client table (or UserCompany or whatever one might call it) with a one to many relationship with Customer table? Wouldn't the data integrity be sufficient for the other tables since the rest is handled by the relationships?
As for whether to use a single database or multiple databases, it really all depends on the use cases. More databases means more management, potentially more disk space, and so on. There are a lot more things to consider here than just how to create the database, such as how you will automate setting up backups for each one. I personally would use one database with a good authentication system that filters the data for the appropriate client.
As for creating a database, check out this blog post. It describes how to use SMO (SQL Server Management Objects) in C#/.NET to create a database. They are a really neat tool, and you'll definitely want to familiarize yourself with them.
To deal with the follow-up question: yes, a single, top-level relationship between clients and customers should be enough to limit the new customers to their appropriate data.
Without any real knowledge of your application I can't say how complex adding that table will be, but assuming your data layer is up to snuff, I would expect you'd really only need to limit the customers class by the current client, and then get all the rest of your data based on the customers that are available.
Did that make any sense?
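To show what limiting everything through the customers could look like, here is a small sketch using Python's sqlite3 module instead of SQL Server and LINQ. The Client table, its link to Customer, and the sample rows are assumptions based on the question's description; only Customer and Project from the original chain are included.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Client   (ClientId INTEGER PRIMARY KEY, Name TEXT NOT NULL);
CREATE TABLE Customer (CustomerId INTEGER PRIMARY KEY,
                       ClientId INTEGER NOT NULL REFERENCES Client(ClientId),
                       Name TEXT NOT NULL);
CREATE TABLE Project  (ProjectId INTEGER PRIMARY KEY,
                       CustomerId INTEGER NOT NULL REFERENCES Customer(CustomerId),
                       Name TEXT NOT NULL);
""")
conn.executemany("INSERT INTO Client VALUES (?, ?)", [(1, "OwnCompany"), (2, "OtherCompany")])
conn.executemany("INSERT INTO Customer VALUES (?, ?, ?)", [(1, 1, "Acme"), (2, 2, "Globex")])
conn.executemany(
    "INSERT INTO Project VALUES (?, ?, ?)",
    [(1, 1, "Website"), (2, 1, "Intranet"), (3, 2, "Audit")],
)


def projects_for_client(conn, client_id):
    """Reach projects only through the customers that belong to the current client."""
    rows = conn.execute(
        """
        SELECT pr.Name
        FROM Project AS pr
        JOIN Customer AS cu ON cu.CustomerId = pr.CustomerId
        WHERE cu.ClientId = ?
        ORDER BY pr.ProjectId
        """,
        (client_id,),
    )
    return [name for (name,) in rows]
```

Only Customer carries a ClientId here; the deeper tables inherit their client through the existing relationships, so every query has to start from (or join back to) the customers.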
See my answer here, it applies to your case as well: c# database architecture
