I'm wondering whether the strategy I'm about to explain would be recommended in Firebase.
I will first explain what my goal is, since I'm sure tons of others have solved the same problem already, and maybe some of you can tell me how it's usually done.
The goal is to notify all users of an app when a friend in common, "George" (based on their contacts), is now also a proud new user of the app.
So my idea was this:
1- Build a collection with this structure:
{
  "contacts": {
    "user1": {
      "user239": true,
      "user23": false,
      "user732": true
    },
    "user2": {
      "user23": false,
      "user96": false,
      "user88": true
    }
  }
}
This saves a list of contacts for each user.
Then the new user would query a list of contacts like this:
fbRef.child('contacts').orderByChild('user23').equalTo(false).once('value', showResults, console.error);
Then the user would save the results in a map, change the value to true, and then updateChildren() using that map.
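The steps above can be sketched in plain JavaScript. This is a minimal sketch that works on an in-memory copy of the contacts tree rather than a live database. The helper name buildContactUpdates and the extra user3 entry are assumptions for illustration. The resulting map is the kind of multi-path object that would be passed to update() (or updateChildren() on Android).

```javascript
// In-memory copy of the "contacts" tree (user3 is added sample data).
const contacts = {
  user1: { user239: true, user23: false, user732: true },
  user2: { user23: false, user96: false, user88: true },
  user3: { user88: true },
};

// Hypothetical helper: find every owner whose contact list holds
// `newUserId` with the value false, and build a multi-path update map
// that flips each of those flags to true.
function buildContactUpdates(contactsTree, newUserId) {
  const updates = {};
  for (const [ownerId, list] of Object.entries(contactsTree)) {
    if (list[newUserId] === false) {
      updates[`contacts/${ownerId}/${newUserId}`] = true;
    }
  }
  return updates;
}

const updates = buildContactUpdates(contacts, "user23");
console.log(updates);
// With a live database, this map could be applied in one call, e.g.
// fbRef.update(updates);
```

Note that this sketch scans every contact list locally; against a real database the orderByChild query does the filtering on the server.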
Now, is this reasonable if we imagine that we aspire to have hundreds of thousands and even millions of users using the App?
How expensive would this be when we have 5M users and a few joining by the second?
Is there a known "best strategy" for this case?
Thanks
The real-time functions in Firebase are not only suited for, but designed for large data sets. The fact that records stream in real-time is perfect for this.
Performance is, as with any large data app, only as good as your implementation. So here are a few gotchas to keep in mind with large data sets.
One thing you can do is denormalize the data.
First approach (everything nested under each user):
/users/uid
/users/uid/profile
/users/uid/chat_messages
/users/uid/groups
/users/uid/audit_record
Second approach (each kind of data flattened into its own top-level node):
/user_profiles/uid
/user_chat_messages/uid
/user_groups/uid
/user_audit_records/uid
As the paths make clear, the second approach is better than the first for iterating over large data sets.
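To make the difference concrete, here is a minimal sketch that converts a nested user tree into the flattened layout. It uses plain in-memory objects; the function name flattenUsers and the sample fields are assumptions for illustration only.

```javascript
// Nested layout: everything about a user lives under /users/uid.
const nested = {
  users: {
    uid1: {
      profile: { name: "Ann" },
      chat_messages: { m1: "hi" },
      groups: { g1: true },
    },
    uid2: {
      profile: { name: "Bob" },
      chat_messages: {},
      groups: { g1: true, g2: true },
    },
  },
};

// Hypothetical helper: move each child of /users/uid to its own
// top-level node, e.g. /users/uid/profile -> /user_profiles/uid.
function flattenUsers(tree) {
  const flat = { user_profiles: {}, user_chat_messages: {}, user_groups: {} };
  for (const [uid, user] of Object.entries(tree.users)) {
    flat.user_profiles[uid] = user.profile;
    flat.user_chat_messages[uid] = user.chat_messages;
    flat.user_groups[uid] = user.groups;
  }
  return flat;
}

const flat = flattenUsers(nested);
// Iterating over profiles no longer touches messages or groups.
console.log(Object.keys(flat.user_profiles));
```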
Avoid listening for the value event on large data sets; use the child events (for example child_added) instead.
Denormalizing the data as shown above helps with this.
Remember that Firebase can handle large amounts of data, but how well it does so depends on the approach you follow.
For example, if we want to store the 'last_logins' for a user, we can store it directly under that user's object. This makes it easy to access 'last_logins' for a particular user.
Maintaining Many to Many Relations
We have already seen that we cannot nest Users in Groups, as that would not represent a many-to-many relation and would leave redundant data. Instead, we can create an index of groups under a specific user, containing only the keys of the groups that user belongs to. This lets us easily fetch the list of groups a user belongs to.
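A minimal sketch of such an index, using plain in-memory objects. The node names groups and userGroups, the sample data, and the helper groupsForUser are assumptions for illustration.

```javascript
// Groups are stored once, at the top level.
const groups = {
  g1: { name: "Admins" },
  g2: { name: "Editors" },
  g3: { name: "Viewers" },
};

// Index: for each user, only the keys of the groups they belong to.
const userGroups = {
  uid1: { g1: true, g3: true },
  uid2: { g2: true },
};

// Hypothetical helper: resolve a user's group keys to group records.
function groupsForUser(uid) {
  const keys = Object.keys(userGroups[uid] || {});
  return keys.map((key) => ({ id: key, ...groups[key] }));
}

console.log(groupsForUser("uid1").map((g) => g.name));
```

Because the index holds only keys, a group's own data is stored once and never copied into each user.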
Storing One to Many Relations or Static Lists
Here are the links to some of the best practices that must be used for a firebase design.
Performance
Indexing
I have an app that helps store owners manage their inventory through a simple API-driven interface.
My app stores all data on Firestore. My simplified database looks like this:
-users
  -name
  -email
  -uid
-products
  -atts
  ...
  -ownerId
-someOtherThing
  -atts
  ...
  -ownerId
The idea is that only documents with ownerId that matches the current user ID will be accessible to the user. User with ID=5 will only have access to items that match ownerId=5.
Is this a good way of storing this data? I am worried that I will eventually end up with thousands of documents in that collection and querying them by "ownerId" might not be the best way to tackle this. On the other hand, I might end up with hundreds of users too, which probably makes it bad design to introduce several new collections for each of them?
What would be a better approach design-wise?
While "a good way" is subjective and purely dependent on the use-cases of your app, what you're proposing is quite a common way to store data in Firestore.
Your concern about the number of users and other documents is unwarranted, as Firestore guarantees that the performance of returning the (say) products for a specific user depends solely on the number of products returned, not on the total number of products in the database.
So if you have 10 products that you're the ownerId for, then no matter how many other users/products there are, the amount of time it takes to retrieve your 10 products will always be the same.
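One way to picture this guarantee is an index keyed by ownerId. The sketch below is only an in-memory analogy in plain JavaScript, not a description of how Firestore is implemented; the names buildOwnerIndex and productsForOwner and the sample documents are hypothetical.

```javascript
// Sample "products" collection; field names follow the question.
const products = [
  { id: "p1", ownerId: "5", atts: {} },
  { id: "p2", ownerId: "7", atts: {} },
  { id: "p3", ownerId: "5", atts: {} },
];

// Build an index once: ownerId -> list of matching products.
function buildOwnerIndex(docs) {
  const index = new Map();
  for (const doc of docs) {
    if (!index.has(doc.ownerId)) index.set(doc.ownerId, []);
    index.get(doc.ownerId).push(doc);
  }
  return index;
}

const ownerIndex = buildOwnerIndex(products);

// The lookup goes straight to the owner's entries; the work done here
// grows with the number of results, not with the size of `products`.
function productsForOwner(ownerId) {
  return ownerIndex.get(ownerId) || [];
}

console.log(productsForOwner("5").map((p) => p.id));
```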
I am writing an app where there is not a lot of interaction with other users. Set and retrieve your own data only.
In Firebase Firestore, how could I model this so that everything fits under a user's UID?
Something that would look like this?
users/{uid}/user/
users/{uid}/settings/
users/{uid}/weather/
If I want to achieve something like this, then I need to create another UID:
users/{uid}/user/{uid}/{userInfo}
This feels a bit off to me.
Is this wrong? Would it be better if I moved every subcollection into its own collection?
Is this faster / more efficient?
Any help is appreciated!
The most common approaches for me:
1. Store the profile information, settings and weather in the user document (your {uid}) itself. This is most common for the profile information, but it's always worth considering for other types too: do they really need to be in their own documents?
2. Have a default name for a single subcollection for each user, and then have each information type as a document with a known name in there. So /users/$uid/documents/profile, /users/$uid/documents/settings, and /users/$uid/documents/weather. Each information type is now in a separate document, meaning you can, for example, secure access to them individually.
3. If the information for a certain type is repeated, I'd put that in documents in a known/named subcollection. So if there are many weathers, you'd get /users/$uid/weather/$weatherdocs. With this you can have an endless set of that specific type of information.
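The three layouts can be written out as paths. In Firestore, a document path always has an even number of segments (collection/document pairs) and a collection path an odd number, which is why users/{uid}/user/ on its own is a collection and cannot hold fields directly. The helpers below are hypothetical names in a minimal plain-JavaScript sketch.

```javascript
// Count the non-empty segments of a slash-separated path.
function segmentCount(path) {
  return path.split("/").filter((s) => s.length > 0).length;
}

// A path with an even number of segments points at a document;
// an odd number points at a collection.
function isDocumentPath(path) {
  const n = segmentCount(path);
  return n > 0 && n % 2 === 0;
}

// The three layouts described above, for a given uid.
function layoutsFor(uid) {
  return {
    // First layout: fields stored in the user document itself.
    fieldsInUserDoc: `users/${uid}`,
    // Second layout: one named document per information type.
    namedDocs: [
      `users/${uid}/documents/profile`,
      `users/${uid}/documents/settings`,
      `users/${uid}/documents/weather`,
    ],
    // Third layout: a subcollection holding many weather documents.
    repeatedDocs: `users/${uid}/weather`,
  };
}

const layouts = layoutsFor("abc123");
console.log(isDocumentPath(layouts.fieldsInUserDoc));   // true
console.log(layouts.namedDocs.every(isDocumentPath));   // true
console.log(isDocumentPath(layouts.repeatedDocs));      // false (a collection)
```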
None of these is inherently better or worse, as it all depends on the use-cases of your app.
There will be performance differences between these approaches, as they require a different number of network requests. If this is a concern for your app, I'd recommend testing all approaches above to measure their relative performance against your requirements.
I have seen videos and read the documentation of Cloud firestore, from Google Firebase service, but I can't figure this out coming from realtime database.
I have this web app in mind in which I want to store my providers for different categories of products. I want to perform a search query across all my products to find which providers I have for a given product, and eventually access that provider's info.
I am planning to use this structure for this purpose:
Providers ( Collection )
  Provider 1 ( Document )
    Name
    City
    Categories
  Provider 2
    Name
    City
Products ( Collection )
  Product 1 ( Document )
    Name
    Description
    Category
    Provider ID
  Product 2
    Name
    Description
    Category
    Provider ID
So my question is, is this approach the right way to access the provider info once I get the product I want?
I know this is possible in the Realtime Database: using the provider ID, I could look up that provider in the providers section. With Firestore, though, I am not sure if it's possible or if this is the right approach.
What is the correct way to structure this kind of data in Firestore?
You need to know that there is no "perfect", "the best" or "the correct" solution for structuring a Cloud Firestore database. The best and correct solution is the solution that fits your needs and makes your job easier. Bear also in mind that there is also no single "correct data structure" in the world of NoSQL databases. All data is modeled to allow the use-cases that your app requires. This means that what works for one app, may be insufficient for another app. So there is not a correct solution for everyone. An effective structure for a NoSQL type database is entirely dependent on how you intend to query it.
The way you are structuring your data looks good to me. In general, there are two ways to achieve the same thing. The first is to keep a reference to the provider in the product object (as you already do); the second is to copy the entire provider object into the product document. This second technique is called denormalization and is quite a common practice when it comes to Firebase. We often duplicate data in NoSQL databases to suit queries that may not be possible otherwise. For a better understanding, I recommend you see this video, Denormalization is normal with the Firebase Database. It's for Firebase Realtime Database but the same principles apply to Cloud Firestore.
Also, when you are duplicating data, there is one thing you need to keep in mind: just as you add the data, you also need to maintain it. In other words, if you want to update or delete a provider object, you need to do it in every place where it exists.
You might wonder now, which technique is best. In a very general sense, the best way in which you can store references or duplicate data in a NoSQL database is completely dependent on your project's requirements.
So you should ask yourself some questions about the data you want to duplicate or simply keep as references:
Is the data static or will it change over time?
If it does, do you need to update every duplicated instance of the data so they all stay in sync? This is what I have also mentioned earlier.
When it comes to Firestore, are you optimizing for performance or cost?
If your duplicated data needs to change and stay in sync at the same time, you might have a hard time keeping all those duplicates up to date in the future. It might also mean you spend a lot of money keeping all those documents fresh, as it will require a read and a write for each document for each change. In this case, holding only references will be the winning variant.
When you hold only references, you write very little duplicated data (pretty much just the Provider ID). That means your code for writing this data is going to be quite simple and quite fast. But when reading the data, you will need to load the data from both collections, which means an extra database call. This typically isn't a big performance issue for reasonable numbers of documents, but it definitely does require more code and more API calls.
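A minimal sketch of the reference approach, using in-memory objects as stand-ins for the two collections. The field name providerId, the sample data, and the helpers getDoc and getProductWithProvider are assumptions for illustration; a counter tracks the simulated reads.

```javascript
// In-memory stand-ins for the two collections in the question.
const providers = {
  prov1: { name: "Acme", city: "Lima" },
};
const products = {
  prod1: { name: "Widget", category: "tools", providerId: "prov1" },
};

let reads = 0; // counts simulated document reads

// Simulated document read: look up one document by collection and ID.
function getDoc(collection, id) {
  reads += 1;
  return collection[id];
}

// Reference approach: read the product, then follow providerId.
function getProductWithProvider(productId) {
  const product = getDoc(products, productId);
  const provider = getDoc(providers, product.providerId);
  return { ...product, provider };
}

const result = getProductWithProvider("prod1");
console.log(result.provider.name, reads); // prints: Acme 2
```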
If you need your queries to be very fast, you may prefer to duplicate more data so that the client only has to read one document per item queried, rather than multiple documents. But you may also be able to depend on local client caches to make this cheaper, depending on the data the client has to read.
When you duplicate, you copy all the data for a provider into each product document. This means that the code to write this data is more complex, and you're definitely storing more data: one more provider object for each product document. You'll also need to figure out if and how to keep it up to date in each document. On the other hand, reading a product document now gives you all the information about the provider in one read.
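A minimal sketch of the duplication approach, again with in-memory objects standing in for the products collection. The embedded provider field, the sample data, and the helpers getProduct and updateProvider are assumptions for illustration.

```javascript
// Each product carries its own copy of the provider data.
const products = {
  prod1: { name: "Widget", provider: { id: "prov1", name: "Acme", city: "Lima" } },
  prod2: { name: "Gadget", provider: { id: "prov1", name: "Acme", city: "Lima" } },
  prod3: { name: "Gizmo", provider: { id: "prov2", name: "Bolt", city: "Quito" } },
};

// Reading is a single lookup: the provider is already inside the product.
function getProduct(productId) {
  return products[productId];
}

// Writing is where the cost moves: a provider change must be copied
// into every product that embeds it. Returns the number of writes.
function updateProvider(providerId, changes) {
  let writes = 0;
  for (const product of Object.values(products)) {
    if (product.provider.id === providerId) {
      Object.assign(product.provider, changes);
      writes += 1;
    }
  }
  return writes;
}

const writes = updateProvider("prov1", { city: "Cusco" });
console.log(writes, getProduct("prod2").provider.city); // prints: 2 Cusco
```

One provider change here costs one write per product that embeds it, which is the maintenance burden described above.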
This is a common consideration in NoSQL databases: you'll often have to consider write performance and disk storage vs. reading performance and scalability.
Whether or not to duplicate some data depends heavily on your data and its characteristics. You will have to think that through on a case-by-case basis.
So in the end, remember that both are valid approaches, and neither of them is inherently better than the other. It all depends on what your use-cases are and how comfortable you are with this new technique of duplicating data. Data duplication is the key to faster reads, not just in Cloud Firestore or Firebase Realtime Database but in general. Any time you add the same data to a different location, you're duplicating data in favor of faster read performance. Unfortunately, in return you have more complex updates and higher storage/memory usage. But note that extra calls are not expensive in Firebase Realtime Database, whereas in Firestore they are. How much data duplication versus how many extra database calls is optimal for you depends on your needs and your willingness to let go of the "Single Point of Definition" mindset, which can be very subjective.
After finishing a few Firebase projects, I find that my reading code gets drastically simpler if I duplicate data. But of course, the writing code gets more complex at the same time. It's a trade-off between these two and your needs that determines the optimal solution for your app. Furthermore, to be even more precise you can also measure what is happening in your app using the existing tools and decide accordingly. I know that is not a concrete recommendation but that's software development. Everything is about measuring things.
Remember also that some database structures are easier to protect with security rules than others. So try to find a schema that can be easily secured using Cloud Firestore Security Rules.
Please also take a look at my answer from this post where I have explained more about collections, maps and arrays in Firestore.
This is a mobile app which can have different kinds of users. I'm using Realm only for offline storage. Say I have two users, A and B, and I have a List class. This class won't ever be shared, so the data is different for each user. How would I go about designing the schema, considering versioning and migration?
A. Add a primary key for the List and assign it differently to user A and B.
B. Use two different realms
There is no one good way of defining your Realm schema and the solution to choose completely depends on the exact scenario.
If you want your users' data to be completely independent of each other, and you will never need a single query to retrieve both users' data or to access some common data, then using a separate Realm instance for each user seems like a good approach. It provides complete separation between your users' data.
However, if your users might have some shared data, or if you might end up computing statistics across all of your users even though their data is independent, using a single Realm instance is the way to go. In this case you should just create a one-to-many relationship between each of your users and whatever objects you want to store in your lists, like this:
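import RealmSwift
class Stuff: Object {}  // placeholder for whatever type your lists hold
class User: Object {
    let stuff = List<Stuff>()  // one-to-many: each User owns its own Stuff objects
}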
I'm really new to Firebase and want to try out a simple mixed-client app on it (Android and JS). I have a users table and a tasks table. The very first question that comes to my mind is how to store them (and thus what the URLs should look like). For example, for the tasks table, should I use:
/tasks/{userid}/task1, /tasks/{userid}/task2, ...
Or
/{userid}/tasks/task1, /{userid}/tasks/task2, ...
The next question, based on the answer to the first one: why use one version over the other?
In my opinion, the first version is good because domains are separated.
The second approach is good because data is stored per-user which may make some of the operations easier.
Any ideas/suggestions?
Update: For the current case, let's say there are following features:
show list of tasks for each user
add new task to the list
edit/delete a task by user.
Simple operations.
This answer might come in late, but here's how I feel about the question after a year's experience with Firebase.
For your very first question, it totally depends on which data your application will mostly read, and how and in which order (kind of like sorting) you expect to read it.
1. Your first proposed data structure, that is "/tasks/{userid}/task1", "/tasks/{userid}/task2", ..., is good if the application will often read tasks per user, with the added advantage that you can sort the data by any task "attribute", if I may call it that.
Say each task has a priority attribute; then:
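// Get all of a user's tasks with a priority of 25.
// Assumes `auth` holds the signed-in user (for example firebase.auth().currentUser).
// Backticks are required here so that ${auth.uid} is interpolated.
var userTasksRef = firebase.database().ref(`tasks/${auth.uid}`);
userTasksRef.orderByChild("priority").equalTo(25).on(
  "value", // or whichever event you need, e.g. "child_added"
  (snapshot) => {
    // Do something important here.
  });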
2. I'd strongly advise against the second approach. Generally, most if not all of the data associated with a user will end up stored under the "/{userid}/" node, and with Firebase's mechanism, if you ever need a single datum at that path level, you have to fetch it together with all the other data under that user's node (tasks and everything else included). I wouldn't want that behavior in my database. That said, this approach still lets you store tasks per user, or make multiple RESTful requests and collect the required data datum by datum. I suggest fanning out the data structure if you run into this situation. It is a perfectly valid structure if your application never needs just one datum at the first level of the path, but always needs the whole block of data at that level together with everything at the deeper levels (2nd, 3rd, ...).
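The point about reading a node can be illustrated with a small in-memory sketch. Reading a node in the Realtime Database returns everything beneath it, so the sketch models a read as returning the whole subtree at a path. The tree contents and the helper readPath are assumptions for illustration.

```javascript
// Second layout from the question: everything nested under /{userid}.
const perUserTree = {
  u1: {
    tasks: { task1: { title: "Buy milk" }, task2: { title: "Call Bob" } },
    settings: { theme: "dark" },
    history: { h1: "signed in", h2: "edited task1" },
  },
};

// Fanned-out layout: each kind of data has its own top-level node.
const fannedOutTree = {
  tasks: { u1: { task1: { title: "Buy milk" }, task2: { title: "Call Bob" } } },
  settings: { u1: { theme: "dark" } },
};

// Simulated read: return the whole subtree at a slash-separated path.
function readPath(tree, path) {
  return path.split("/").filter(Boolean).reduce((node, key) => node?.[key], tree);
}

// Reading at the user level drags along tasks, settings and history.
console.log(Object.keys(readPath(perUserTree, "u1")));
// Reading one kind of data for the user returns only that data.
console.log(Object.keys(readPath(fannedOutTree, "settings/u1")));
```

In the nested layout you can still read "u1/tasks" on its own; the cost shows up whenever something is needed at the user level itself.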
Given the use cases you've described, and assuming the structure you've shown is your entire database structure, I'd say it isn't enough to cover them.
I suggest reading the docs here. Their documentation is great and exhaustive.
As a pick, the first approach is the better way to model this use case in NoSQL, and more specifically in Firebase's NoSQL database.