Best way to store user-specific data in Firestore

I have an app that helps store owners manage their inventory through a simple API-driven interface.
My app stores all data on Firestore. My simplified database looks like this:
-users
  -name
  -email
  -uid
-products
  -atts
  ...
  -ownerId
-someOtherThing
  -atts
  ...
  -ownerId
The idea is that only documents with ownerId that matches the current user ID will be accessible to the user. User with ID=5 will only have access to items that match ownerId=5.
Is this a good way of storing this data? I am worried that I will eventually end up with thousands of documents in that collection and querying them by "ownerId" might not be the best way to tackle this. On the other hand, I might end up with hundreds of users too, which probably makes it bad design to introduce several new collections for each of them?
What would be a better approach design-wise?

While "a good way" is subjective and purely dependent on the use-cases of your app, what you're proposing is quite a common way to store data in Firestore.
Your concern about the number of users and other documents is unwarranted: Firestore guarantees that the time to return (say) the products for a specific user depends solely on the number of products returned, not on the total number of products in the database.
So if you have 10 products that you're the ownerId for, then no matter how many other users/products there are, the amount of time it takes to retrieve your 10 products will always be the same.
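One way to picture why lookup time tracks the size of the result rather than the size of the collection is to think of the index as a map from ownerId to documents. The sketch below is a toy in-memory model, not Firestore's actual implementation; `Product`, `OwnerIndex`, and `productsFor` are hypothetical names used only for illustration.

```typescript
// Toy model of an index on ownerId. This is NOT how Firestore works internally;
// it only illustrates why a lookup touches the matching documents and nothing else.
interface Product {
  id: string;
  ownerId: string;
}

class OwnerIndex {
  private byOwner = new Map<string, Product[]>();

  add(product: Product): void {
    const bucket = this.byOwner.get(product.ownerId) ?? [];
    bucket.push(product);
    this.byOwner.set(product.ownerId, bucket);
  }

  // Returns the matching products plus how many documents were touched.
  productsFor(ownerId: string): { docs: Product[]; touched: number } {
    const docs = this.byOwner.get(ownerId) ?? [];
    return { docs, touched: docs.length };
  }
}

const index = new OwnerIndex();
// 10 products owned by user "5", 10000 owned by other (made-up) users.
for (let i = 0; i < 10; i++) index.add({ id: `mine-${i}`, ownerId: "5" });
for (let i = 0; i < 10000; i++) index.add({ id: `other-${i}`, ownerId: `u${i % 100}` });

const result = index.productsFor("5");
console.log(result.touched); // 10, regardless of the 10000 other products
```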

Related

Firestore data model for events planning app

I am new to Firestore and building an event planning app but I am unsure what the best way to structure the data is taking into account the speed of queries and Firestore costs based on reads etc. In both options I can think of, I have a users collection and an events collection
Option 1:
In the users collection, each user has an array of eventIds for events they are hosting and also events they are attending. Then I query the events collection for those eventIds of that user so I can list the appropriate events to the user
Option 2:
For each event in the events collection, there is a hostId and an array of attendeeIds. So I would query the events collection for events where the hostID === user.id and where attendeeIds.includes(user.id)
I am trying to figure out which is best from a performance and cost perspective, taking into account that there could be thousands of events to iterate through. Is it better to search the events collection by eventId, since it will stop iterating when all events are found, or is that slow since it will be searching for one eventId at a time? Maybe there is a better way to do this that I haven't mentioned above. Would really appreciate the feedback.

In addition to @Dharmaraj's answer, please note that neither solution is better than the other in terms of performance. In Firestore, query performance depends on the number of documents you request (read), not on the number of documents you are searching. It doesn't matter whether you request 10 documents from a collection of 100 documents or from a collection of 100 million documents; the response time will be the same.
From a billing perspective, yes, the first solution will imply an additional document to read, since you first need to actually read the user document. However, reading the array and getting all the corresponding events will also be very fast.
Please bear in mind that in the NoSQL world, we always structure a database according to the queries that we intend to perform. So if a query returns the documents that you're interested in and produces the fewest reads, then that's the solution you should go ahead with. Also remember that you'll always pay for a number of reads equal to the number of documents the query returns.
Regarding security, both solutions can be secured relatively easily. Now it's up to you to decide which one works better for your use case.
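The billing difference between the two options can be estimated with simple arithmetic. The sketch below is a rough model only: it assumes one billed read per document returned and ignores caching, realtime listeners, and any per-query minimum charge. `readsOption1` and `readsOption2` are hypothetical helper names.

```typescript
// Rough read-count model for the two options. Assumes one billed read per
// document returned; ignores caching, listeners, and per-query minimums.
function readsOption1(hostedIds: string[], attendingIds: string[]): number {
  const userDocRead = 1; // the user document that holds the eventId arrays
  const uniqueEvents = new Set([...hostedIds, ...attendingIds]);
  return userDocRead + uniqueEvents.size;
}

function readsOption2(hostedCount: number, attendingCount: number): number {
  // Two queries: hostId == uid, and attendeeIds array-contains uid.
  return hostedCount + attendingCount;
}

const hosted = ["e1", "e2"];
const attending = ["e3", "e4", "e5"];
console.log(readsOption1(hosted, attending)); // 6
console.log(readsOption2(hosted.length, attending.length)); // 5
```

Under these assumptions the gap is the single extra read of the user document, which matches the point above that the first solution implies one additional read.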

I would recommend going with option 2 because it might save you some reads:
You won't have to query the user's document in the first place and then run another query like where(documentId(), "in", [...userEvents]) or fetch each of them individually if you have many.
When trying to write security rules, you can directly check if an event belongs to the user trying to update the event by resource.data.hostId == request.auth.uid.
When using the first option, you'll have to query the user's document in security rules to check whether this eventId is present in that events array (which may cost you another read). Check out the documentation for more information on billing.
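If the first option is used anyway, note that an `in` filter accepts only a limited number of values per query, so a long eventId array has to be split into batches first. Below is a minimal sketch of that batching step. The batch size of 10 is an assumption (check the current Firestore documentation for the actual limit), `chunkIds` is a hypothetical helper, and the query call itself appears only as a comment.

```typescript
// Splits a list of ids into batches of at most `size` items.
function chunkIds<T>(items: T[], size: number): T[][] {
  if (size <= 0) throw new Error("size must be positive");
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// 25 made-up event ids standing in for the array stored on the user document.
const userEvents = Array.from({ length: 25 }, (_, i) => `event-${i}`);
const batches = chunkIds(userEvents, 10); // 10 is an assumed limit

// Each batch would feed one query, e.g. where(documentId(), "in", batch).
console.log(batches.map((b) => b.length)); // batch sizes: 10, 10, 5
```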

How to structure Firestore Security Rules & Data Structure for granular access

I am building a community-type app based on Firestore where users should have granular control over what kind of information they share with whom.
Users can have properties such as name, birthdate, etc., and for each of them they can decide to share it with one of the following groups/roles:
Private
Contacts
Admin (Admins of organizations that user is a member of)
Organization (Members of organizations that a user is a member of)
Public (All users of the app)
As documents in Firestore will always be retrieved as a whole, I already know that I somehow will have to segregate my user properties by access level.
I've got two approaches so far:
Approach 1
Store each user property in a separate document that contains a field access level
Store some metadata in, for example /user/12345/meta/roles, so that I can point the security rules to those documents to validate access
Benefits:
Easy structure
Flexible
(Almost) no data duplication
Drawbacks:
Lots of document reads for getting a user's profile
Approach 2
Store user profile in, for example /user/12345/profile/private and duplicate the public information into /user/12345/profile/public, and do the same for each access level
Benefits:
Reduced document reads
Drawbacks:
Complexity
It feels wrong to duplicate that much data
Does anyone have any experience with this and any suggestions or alternative approaches they can share?
Follow-up question:
Let's say I store the list of members of an organization in a subcollection that is only accessible to members of the organization (for privacy reasons). Doesn't that mean that when querying that list of members from the client side, I have to do it "blindly", meaning I can't know if the user can access that document until I actually try? The fact that the query fails would tell me that the user is not actually a member of that organization.
Would you consider this kind of query that is set up for failure bad practice? Are there any alternatives that still allow to keep the memberlist private?

I think you are moving from a SQL environment to NoSQL, which is why Approach 2 does not feel like the right way to proceed.
Actually, Approach 2 is the right way to proceed. There are a couple of advantages:
1.) Reduced document reads mean lower costs. Firestore charges by the number of reads and writes, so reducing them is generally the way to go. Also, the cost of the extra storage caused by duplication will usually be less than the cost of the extra reads as your application scales.
2.) In a NoSQL database you are allowed to duplicate data, provided it increases the read/search speed from the database.
I don't see the second approach as complex, because that's the tradeoff you make when choosing NoSQL over SQL.
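A minimal sketch of how Approach 2 could fan a profile out into one document per access level follows. It assumes the levels form a simple ordering from most restricted to most open, which the question does not state, so treat that as a simplification. `LEVELS`, `Property`, and `buildProfileDocs` are hypothetical names, and the sample values are made up.

```typescript
// Assumed ordering from most restricted to most open. The original question
// does not say the levels nest like this; it is a simplification for the sketch.
const LEVELS = ["private", "contacts", "admin", "organization", "public"] as const;
type Level = (typeof LEVELS)[number];

interface Property {
  value: string;
  level: Level; // the widest audience allowed to see this property
}

// Builds one document per level. The document for a level contains every
// property shared at that level or wider, so public data is duplicated into
// each more restricted document.
function buildProfileDocs(
  props: Record<string, Property>
): Record<Level, Record<string, string>> {
  const docs = {} as Record<Level, Record<string, string>>;
  LEVELS.forEach((level, levelRank) => {
    const doc: Record<string, string> = {};
    for (const [name, prop] of Object.entries(props)) {
      if (LEVELS.indexOf(prop.level) >= levelRank) doc[name] = prop.value;
    }
    docs[level] = doc;
  });
  return docs;
}

const docs = buildProfileDocs({
  name: { value: "Ada", level: "public" },
  birthdate: { value: "1990-01-01", level: "contacts" },
  phone: { value: "555-0100", level: "private" },
});
console.log(Object.keys(docs.public));   // only "name"
console.log(Object.keys(docs.contacts)); // "name" and "birthdate"
```

With this layout a client reads exactly one document per profile, at the price of rewriting several documents whenever a property or its access level changes.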

What is the best way to get multiple specific data from collections in firestore?

Is there a better way to get multiple specific documents from a collection in Firestore?
Let's say I have this collection:
--Feeds (collection)
  --feedA (doc)
    --comments (collection)
      --commentA (doc)
        users_in_conversation: [abcdefg, hijklmn, ...] // Field containing the list of all users in the conversation
Then I need to retrieve the user data (name and avatar) from the Users collection. Currently I run one query per user, but that will be slow when there are many people in the conversation.
What's the best way to retrieve specific users?
Thanks!

Retrieving the additional names is actually a lot faster than most developers expect, as the requests can often be pipelined over a single HTTP/2 connection. But if you're noticing performance problems, edit your question to show the code you use, the data you have, and the performance you're getting.
A common way to reduce the need to load additional documents is by duplicating data. For example, if you store the name and avatar of the user in each comment document, you won't need to look up the user profile every time you read a comment.
If you come from a background in relational databases, this sort of data duplication may be very unexpected. But it's actually quite common in NoSQL databases.
You will of course then have to consider how to deal with updates to the user profile, for which I recommend reading: How to write denormalized data in Firebase. While this is for Firebase's other database, the same concepts apply to Firestore. I also in general recommend watching Getting to know Cloud Firestore.
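The duplication idea and the update problem it creates can be sketched in plain TypeScript, without any Firebase calls. `UserProfile`, `CommentDoc`, `makeComment`, and `applyProfileUpdate` are hypothetical names, and the field names are assumptions rather than anything prescribed by Firestore.

```typescript
interface UserProfile {
  uid: string;
  name: string;
  avatarUrl: string;
}

interface CommentDoc {
  id: string;
  text: string;
  authorId: string;
  authorName: string;   // duplicated from the user profile
  authorAvatar: string; // duplicated from the user profile
}

// Builds a comment that embeds a snapshot of the author's name and avatar,
// so rendering the comment needs no extra user lookup.
function makeComment(id: string, text: string, author: UserProfile): CommentDoc {
  return {
    id,
    text,
    authorId: author.uid,
    authorName: author.name,
    authorAvatar: author.avatarUrl,
  };
}

// When a profile changes, returns updated copies of the comments that embed it.
// In a real app each changed comment would be one document write.
function applyProfileUpdate(comments: CommentDoc[], updated: UserProfile): CommentDoc[] {
  return comments.map((c) =>
    c.authorId === updated.uid
      ? { ...c, authorName: updated.name, authorAvatar: updated.avatarUrl }
      : c
  );
}

const ada: UserProfile = { uid: "abcdefg", name: "Ada", avatarUrl: "ada.png" };
const bob: UserProfile = { uid: "hijklmn", name: "Bob", avatarUrl: "bob.png" };
const comments = [makeComment("c1", "Hi", ada), makeComment("c2", "Hello", bob)];

const renamed = applyProfileUpdate(comments, { ...ada, name: "Ada L." });
console.log(renamed.map((c) => c.authorName)); // "Ada L." and "Bob"
```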

I have tried several solutions, and I think this one is the best for this case:
When a user posts a comment, add the feed/post ID to an array field named discussions in the user document.
When a user loads a feed/post, get all users whose discussions array contains that feed/post ID (using array-contains).
It's efficient and uses fewer requests.

Complicated data structuring in firebase/firestore

I need an optimal way to store a lot of individual fields in firestore. Here is the problem:
I get JSON data from some API. It contains a list of users. I need to tell if those users are active, i.e. have been online in the past n days.
I cannot query each user in the list from the api against firestore, because there could be hundreds of thousands of users in that list, and therefore hundreds of thousands of queries and reads, which is way too expensive.
There is no way to use a list as a map for querying as far as I know in firestore, so that's not an option.
What I initially did was have a cloud function go through and find all the active users maybe once every hour, and place them in firebase realtime database in the structure:
activeUsers {
  uid1: true
  uid2: true
  uid3: true
  etc...
}
and every time I need to check which users are active, I get all fields under activeUsers (which is constrained to a maximum of 100,000 fields, approx. 3~5 MB).
Now I was going to use that as my final mechanism, but I just realised that Firebase charges for the amount of bandwidth used, not the number of reads. It could therefore get very expensive doing this over and over whenever a user makes this request. And I cannot query every single result from the Firebase database because, while it does not charge per read (I think), it would be very slow to carry out hundreds of thousands of queries.
Now I have decided to use Cloud Firestore as my final hope, since it charges primarily for the number of reads and writes as opposed to data downloaded and uploaded. I am going to use Cloud Functions again to check the active users every hour, and I'm going to try to figure out the best way to store that data within a few documents. I was thinking 10,000 fields per document with all the active users; then, when a user needs to get the active users, they get all the documents (there would be 10 if there are 100,000 total active users) and map those client side to filter the active users.
So I really have 2 questions. 1, If I do it this way, what is the best way to store that data in firestore, is it the way I suggested? And 2, is there an all around better way to be performing this check of active users against the list returned from the api? Have I got it all wrong?
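The proposal above can be sketched as two steps: split the active uids into a few documents, then merge them client side and filter the API list. The code below is only an illustration of that logic; `shardActiveUsers` and `filterActive` are hypothetical helpers, the uids are made up, and whether 10,000 fields per document fits within Firestore's document limits is not verified here.

```typescript
// Splits active uids into documents of at most `fieldsPerDoc` fields each,
// mirroring the { uid: true } shape from the question.
function shardActiveUsers(uids: string[], fieldsPerDoc: number): Record<string, true>[] {
  if (fieldsPerDoc <= 0) throw new Error("fieldsPerDoc must be positive");
  const shards: Record<string, true>[] = [];
  for (let i = 0; i < uids.length; i += fieldsPerDoc) {
    const shard: Record<string, true> = {};
    for (const uid of uids.slice(i, i + fieldsPerDoc)) shard[uid] = true;
    shards.push(shard);
  }
  return shards;
}

// Client side: merge the shards into a Set, then filter the list from the API.
function filterActive(apiUids: string[], shards: Record<string, true>[]): string[] {
  const active = new Set<string>();
  for (const shard of shards) {
    for (const uid of Object.keys(shard)) active.add(uid);
  }
  return apiUids.filter((uid) => active.has(uid));
}

// Small numbers for readability: 25 active users, 10 fields per document.
const activeUids = Array.from({ length: 25 }, (_, i) => `uid${i}`);
const shards = shardActiveUsers(activeUids, 10);
const fromApi = ["uid3", "uid24", "uid99"];

console.log(shards.length); // 3 documents, so 3 reads per check
console.log(filterActive(fromApi, shards)); // uid3 and uid24 are active
```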

You could use firebase storage to store all the users in a text file, then download that text file every time?

Well, this is three years old, but I'll answer anyway.
What you have done is not efficient and not a good approach. What I would do is as follows:
Make a separate collection for all active users, and store each active user's unique field, such as the ID, there.
Then query that collection, and update it when needed.

Can Firebase Realtime Database effectively loop through billions of posts and retrieve them by the users that posted them?

I am developing an iOS app with Firebase Realtime Database. The app will potentially have billions of posts with a number of images and data that needs to be retrieved based on the people a specific user follows (something like Instagram).
I understand that the best practice in Firebase is to structure data as flat as possible, which would mean having a "Posts" node with potentially billions of entries, which I would then filter by a kind of 'posted_by' parameter. This raises two questions:
1) Will I be able to retrieve said posts with a query that returns posts by any of the users I follow? (By passing something like an array of the users I follow)
2) Will Firebase be effective enough to loop through potentially billions of posts to find the ones that match my criteria, or is there otherwise a better way to structure data so as to make the app as optimal as possible?
Thanks in advance for the answers.

Billions of entries are no problem.
You should check whether Firebase is the most cost-efficient solution if you have a huge volume of data.
1) Firebase can do that, but you probably don't want the user to wait for all entries (when there are a lot for a single user), but instead request them "page" by "page" and only request more pages on demand when the user scrolls up/down.
2) If you ensure you have an index on the user id, then it doesn't have to go through each one individually. Searching by index is efficient.
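The "page by page" idea can be sketched as cursor-based pagination: each request asks for the next few posts after the last key already shown. The example below is a toy in-memory version, not Firebase code. `Post` and `nextPage` are hypothetical names, and the linear filter merely stands in for the indexed lookup a real database would perform.

```typescript
interface Post {
  key: string;
  postedBy: string;
}

// Returns the next page of a user's posts after `afterKey` (exclusive).
// Assumes `posts` is already sorted by key, as an index would keep it.
function nextPage(posts: Post[], userId: string, pageSize: number, afterKey?: string): Post[] {
  return posts
    .filter((p) => p.postedBy === userId && (afterKey === undefined || p.key > afterKey))
    .slice(0, pageSize);
}

const posts: Post[] = [
  { key: "p1", postedBy: "alice" },
  { key: "p2", postedBy: "bob" },
  { key: "p3", postedBy: "alice" },
  { key: "p4", postedBy: "bob" },
  { key: "p5", postedBy: "alice" },
  { key: "p6", postedBy: "bob" },
];

const firstPage = nextPage(posts, "alice", 2);
// The last key of the previous page becomes the cursor for the next request.
const secondPage = nextPage(posts, "alice", 2, firstPage[firstPage.length - 1].key);

console.log(firstPage.map((p) => p.key));  // p1, p3
console.log(secondPage.map((p) => p.key)); // p5
```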
