Firestore data structure for two use cases - firebase

I would appreciate some guidance on how to structure data stored within an app. While there are some reasons for the first way, I'm concerned it wouldn't be able to operate efficiently for the second case.
Simplified, the app would contain a list of Places by State. The main use case would be viewing Places within a selected State. The second use case would be that individual users could save specific Places they liked into their profile and view them all at once (showing all state Places in one list).
Option 1- Places saved in one "places" collection, which has a field of "state."
Main use: To show these places by state, the app would query where the "state" field matches the state.
Secondary use: When a user saved the place, the app would save the docID for each place into the user's profile, each of which would need to be retrieved to show the list of places.
Option 2- Have one collection per state.
Main use: To show these places by state, the app would pull all documents within the selected state's collection and list them out.
Secondary use: When a user saved the place to the user's profile, the app would save the docID for each place into the user's profile, distributed across the different collections, each of which would need to be retrieved to show the list of places.
Goals:
Use the same place document to appear in both the State lists and the user's profile.
Minimize the number of calls/slowness as much as possible in the Secondary use case.
I have been reviewing Firestore data storage guidelines, but I would appreciate any thoughts from experienced developers regarding this data structure.

There is no "perfect", "the best" or "the correct" solution for structuring a Firestore database. We are usually structuring the database according to the queries that we intend to perform.
Regarding storing all the places in a single collection vs. having one collection per state, please note that there is no difference in terms of speed or cost. You'll always pay a number of reads equal to the number of documents that your query returns. However, if you need to display in your app, for example, all places of all states, then having a collection for each state will require a separate query for each state.
Furthermore, regarding saving full place objects in a user's profile vs. storing only the IDs, it's a matter of measurement. You should measure how often the details of a place change. Remember that if a place changes, you have to update that data everywhere it is duplicated. So if it doesn't change often, you can save the entire place object; otherwise, save only the ID.
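As a rough illustration of the single-collection option with IDs stored in the profile, here is a minimal in-memory sketch. The array stands in for the "places" collection; the type, function names and sample data are hypothetical, and no Firestore SDK calls are shown.

```typescript
// Hypothetical in-memory stand-in for a single "places" collection.
interface Place {
  id: string;    // document ID
  name: string;
  state: string; // the "state" field from Option 1
}

const places: Place[] = [
  { id: "p1", name: "Grand Canyon", state: "AZ" },
  { id: "p2", name: "Sedona", state: "AZ" },
  { id: "p3", name: "Yosemite", state: "CA" },
];

// Main use: a single filter on the "state" field.
function placesByState(all: Place[], state: string): Place[] {
  return all.filter((p) => p.state === state);
}

// Secondary use: the profile holds only document IDs, and each ID is
// resolved against the same place documents, in the saved order.
function savedPlaces(all: Place[], savedIds: string[]): Place[] {
  const byId = new Map(all.map((p) => [p.id, p] as const));
  return savedIds
    .map((id) => byId.get(id))
    .filter((p): p is Place => p !== undefined);
}

const arizona = placesByState(places, "AZ");
const profile = savedPlaces(places, ["p3", "p1"]);
```

Because the profile only references IDs, the same place document backs both the state list and the saved list, which is the first goal in the question.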

Related

Should I create a duplicate collection/document for each use-case? (Firebase/Firestore)

I'm trying to build an ecommerce app with firebase on the backend. I have a collection of 1000+ products, each of which is stored as a separate document, which have product specific info such as price, title etc.
document:{
title: 'Some Title',
price: '$99.99',
genres: ['Horror', 'Action']
}
So in my app I need to display these products in many places, such as product carousels(similar to a bookshelf with arrow buttons at the ends), and also in a search results page.
At any given page, I assume that I will need to display at least 50 products, either as search results, or multiple carousels. I understand that I can use queries to get this data from firebase. But since each document I retrieve counts as (at least)one firestore read, I assume that a typical user session would run into 100+ reads, if not thousands.
It seems a little inefficient to me that I need to read multiple documents to get this data, when I could just store all that data in a single array, as its own document. That would mean I get charged for one document read, not 50, per page.
Is this how it is expected to be done? Should I create a new document containing the data I need for each specific use case?
P.S. I'm pretty new to backend dev, let alone firebase.
TL;DR Yes, you should create a new document with the needed data for each specific use case, but it’s not recommended to make it as a document with nested objects like arrays with 1000+ elements.
From a technical point of view, Cloud Firestore is optimized for storing large collections of small documents.
Depending on the use case, you can select the most appropriate Cloud Firestore data structure.
For example, the 10 best-selling books of the month can be a document with nested complex objects like arrays or maps. This structure could be useful for use cases with a small or predefined number of elements, but, as the documentation notes, if your data expands over time with larger or growing lists, the document also grows, which can lead to slower document retrieval times.
With thousands of records, a better choice can be to structure your data as subcollections. That is, you can create collections within documents when you have data that might expand over time, with the main advantage that, as your lists grow, the size of the parent document doesn't change.
Cloud Firestore also has several features to help you manage queries that return a large number of results:
Cursors, which allow you to resume a long-running query.
Page tokens, which help you paginate the query results.
Limits, which specify how many results to retrieve.
Offsets, which allow you to skip a fixed number of documents.
There are no additional costs for using cursors, page tokens, and limits. In fact, these features can help you save money by reading only the documents that you actually need.
As a best practice, do not use offsets. Instead, use cursors. Using an offset only avoids returning the skipped documents to your application, but these documents are still retrieved internally. The skipped documents affect the latency of the query, and your application is billed for the read operations required to retrieve them.
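The difference between offsets and cursors can be sketched with an in-memory model. This is only an illustration under stated assumptions: the sorted array stands in for an indexed collection, `docsRead` is a hypothetical counter mimicking the billing behaviour described above, and none of the names come from the Firestore SDK.

```typescript
interface Product {
  id: string;
  title: string;
}

// Stand-in for a collection ordered by ID.
const products: Product[] = ["a", "b", "c", "d", "e"].map((id) => ({
  id,
  title: `Product ${id}`,
}));

interface Page {
  page: Product[];
  docsRead: number; // documents counted as read in this model
}

// Offset paging: the skipped documents are still retrieved internally,
// so they count toward the reads even though they are not returned.
function pageByOffset(docs: Product[], offset: number, limit: number): Page {
  const scanned = docs.slice(0, offset + limit);
  return { page: scanned.slice(offset), docsRead: scanned.length };
}

// Cursor paging: resume right after the last document already seen.
// (findIndex here models an index seek, not a billed scan.)
function pageByCursor(docs: Product[], afterId: string | null, limit: number): Page {
  const start = afterId === null ? 0 : docs.findIndex((d) => d.id === afterId) + 1;
  const page = docs.slice(start, start + limit);
  return { page, docsRead: page.length };
}

const viaOffset = pageByOffset(products, 2, 2); // returns c, d; 4 docs counted
const viaCursor = pageByCursor(products, "b", 2); // returns c, d; 2 docs counted
```

Both calls return the same page, but in this model the offset version counts four documents while the cursor version counts two.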

Single multi-tenanted firestore or many single tenanted firestores?

I'm building a SaaS system that allows users to define their own data models and enter data according to those models. It's a bit like airtable.
One user might model a bookshop, and would have a Book model, with title and ISBN fields. Another user might model medical records, and would have "date of last visit" as a field.
In the case of the bookshop, I want users to be able to search on title and ISBN. In the case of the medical records, I want users to be able to search on the date of the last visit.
I am using Firestore as my backend.
Firestore requires an index to enable a search, so that approach will not scale as the number of customers increases.
My thought therefore was to have a Firestore instance for each customer, and those specific instances would have the necessary indexes.
I'm sure there are downsides to doing this though.
What would folks recommend to best solve this need?
What you are trying to achieve is somewhat unusual, since you would not provide even a few standard, common properties for each user of your Bookshop.
When you want to perform a search in a Cloud Firestore database, you need the exact name of the property on which you want to search. Having dynamic properties might not help you solve the search feature. However, you can create a document with a property of type array that holds the names of all properties the users have chosen and perform a search on every property, but this solution will be much too expensive.
In my opinion, a possible solution might be to create at least a few common properties, so you can have the properties on which you can search. When someone creates, for example, a book shop you can display at the beginning all available properties a user can choose. Once you create a shop, you can have different users with different shop properties. This means that if a user does not choose a property, when you perform a search on that property, the results won't contain his/her products. This will work, only if you have predefined properties.

How to query Firestore collection for documents with field whose value is contained in a list

I have two Firestore collections, Users and Posts. Below are simplified examples of what the typical document in each contains.
*Note that the document IDs in the friends subcollection are equal to the document IDs of the corresponding user documents. Optionally, I could also add a uid field to the friends documents and/or the Users documents. Also, there is a reason, not relevant to this question, that we have friends as a subcollection of each user, but if need be we can change it into a unified root-level Friends collection.
This setup makes it very easy to query for posts, sorted chronologically, by any given user by simply looking for Posts documents whose owner field is equal to the document reference of that user.
I achieve this in iOS/Swift with the following, though we are building this app for iOS, Android, and web.
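// Resolve the signed-in user's ID; bail out if nobody is signed in.
guard let uid = Auth.auth().currentUser?.uid else {
    print("No UID")
    return
}
// Posts store their owner as a reference to the user's document.
let firestoreUserRef = firestore.collection("Users").document(uid)
// The 25 newest posts owned by this user.
firestorePostsQuery = firestore.collection("Posts")
    .whereField("owner", isEqualTo: firestoreUserRef)
    .order(by: "timestamp", descending: true)
    .limit(to: 25)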
My question is how to query Posts documents that have owner values contained in the user's friends subcollection, sorted chronologically. In other words, how to get the posts belonging to the user's friends, sorted chronologically.
For a real-world example, consider Twitter, where a given user's feed is populated by all tweets that have an owner property whose value is contained in the user's following list, sorted chronologically.
Now, I know from the documentation that Firestore does not support logical OR queries, so I can't just chain all of the friends together. Even if I could, that doesn't really seem like an optimal approach for anyone with more than a small handful of friends.
The only option I can think of is to create a separate query for each friend. There are several problems with this, however. The first is the challenge of presenting (in a smooth manner) the results from many asynchronous fetches. The second is that I can't merge the data into chronological order without re-sorting the set manually on the client every time one of the query snapshots is updated (i.e., real-time update).
Is it possible to build the query I am describing, or am I going to have to go with this less-than-optimal approach? This seems like a fairly common query use case, so I'll be surprised if there is not a way to do this.
Sorting chronologically is easy with the .orderBy method, provided you are storing a Unix timestamp, e.g. 1547608677790. However, that leaves you with a potential mountain of queries to iterate through (one per friend).
So, I think you want to rethink the data store schema.
Take advantage of Cloud Functions for Firebase triggers. When a new post is written, have a Cloud Function work out which users should see it. Each user could have an array-type property containing all unread posts, read posts, etc.
Something like that would be fast and the least taxing.
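The fan-out idea can be sketched in memory as follows. This is not Cloud Functions code: `onPostCreated` is a hypothetical handler standing in for a write trigger, and the `followers` and `feeds` maps stand in for whatever documents would hold that data.

```typescript
interface Post {
  id: string;
  owner: string;
  timestamp: number; // Unix timestamp, as suggested above
}

// followers.get(u): the users who should see u's posts (hypothetical data).
const followers = new Map<string, string[]>([
  ["alice", ["bob", "carol"]],
  ["bob", ["carol"]],
]);

// feeds.get(u): the precomputed feed for user u, newest first.
const feeds = new Map<string, Post[]>();

// Stand-in for a trigger that runs whenever a post is written:
// copy the post into the feed of every user who should see it.
function onPostCreated(post: Post): void {
  for (const reader of followers.get(post.owner) ?? []) {
    const feed = feeds.get(reader) ?? [];
    feed.push(post);
    feed.sort((a, b) => b.timestamp - a.timestamp);
    feeds.set(reader, feed);
  }
}

onPostCreated({ id: "p1", owner: "alice", timestamp: 100 });
onPostCreated({ id: "p2", owner: "bob", timestamp: 200 });
onPostCreated({ id: "p3", owner: "alice", timestamp: 150 });
```

Reading a feed then becomes a single lookup per user, at the cost of one extra write per follower for every post.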

Firestore : How to design a Data model to make querying documents that are not exist in an array possible?

I'm trying to find a way to properly design my data model with Firestore. I'm looking for something similar to what Tinder does: showing you people that you haven't swiped on yet, based on your location.
So I ended up with something like :
A User1 has an array of "met people"
A "haven't yet met" user (User2) is also a User with the same document model
They all belong in the same "Users" collection
I want to query all the users that User1 hasn't swiped on yet
I know that you can't do something like "array_not_contains" or "!=" because all fields that you query need to be indexed.
So I wonder: is it possible to model the data to make this work, or is the only solution to drop Firebase because this kind of query is not possible at all?
One alternative would be to store all the relationships (with their status) between all users in a collection. But that also means that whenever a user signs up, I have to create as many documents as I have users, which is really ugly and creates an enormous number of documents.
EDIT:
Thanks again for your answer, and sorry for my late reply.
There is no need to create a new database call since you already got all the users from that area in the first place.
Not if I have a large response set; in that case I will limit it to a number (5 in the example below).
And even if I don't limit the number, how can I know in the next db call that new people have been added, and how do I retrieve only those?
I will not remove them from the Users collection, as they can be shown to other users.
P.S.: I forgot User4 in the Users collection pictures.
For User1, get the first 5 matches, remove existing ones, show User5.
For User2, get the first 5 matches, remove existing ones, show User4, User5.
After the users' choices, users are added to their lists. The Users collection stays the same.
For User1, get the first 5 matches, remove existing ones: nothing to show, even if there are a User6 and a User7.
To fix that, I launch a second query to get the new ones, but the more a user uses the app, the more queries I may need to run to find existing users in their area to display.
Maybe I've misunderstood what you called the "initial list"; for me it is the list object retrieved from my db containing all users (with a limit).
EDIT 2:
You can check Alex Mamo's answers to see how to query for documents that do not exist in an array.
Let me explain my use case and why I think that won't work.
I want to be able to search for all users near me. To do that in Firebase, I store a Geopoint. Geopoints can't really be used out of the box with Firebase for now, so I use Geofirestore in a Cloud Function.
I store and update users' Geopoints based on their locations, which means a user's location changes over time.
I limit the number of users returned by this function.
In my initial state I retrieve the users near me (User1); I get 3 and 4.
Let's say that I store the last checked userId (User4) to use later as a cursor for my query.
Now my geopoint changes, and the users in this area change too.
I request the next batch of users near me, and I use my previous userId/document to "startAfter" (more on this here); see the image below, that won't work.
If I use the cursor (User4), I'll get 5 but not 2, because in the returned list, if I order by ID, 2 comes before 4.
Worse, as shown below, the returned list may not even contain User4, in which case the cursor is pointless.
My example is a bit simplified and does not take into account what is described in the first answer and my first edit (limited subset of users, data design).
A possible database structure for your app might be:
Firestore-root
   |
   --- users (collection)
         |
         --- uid (document)
              |
              --- acceptedUsers: ["uidOne", "uidTwo"]
              |
              --- declinedUsers: ["uidThree", "uidFour"]
              |
              --- //Other user properties
The mechanism is simple. When you first want to show a user profile to the current (authenticated) user, you have to create a query that will return all users in the user's area. According to the user's decision, you need to add the corresponding uid to either the acceptedUsers array or the declinedUsers array. When you want to show another user, use the same query, but this time you need an extra operation. Once the query returns the users within the user's location, add all those users to a list. Compare that list with your existing arrays and remove from it every user that appears in either array. In this way you'll have a list that contains only users that the current user hasn't seen. This extra step is needed to make sure the ID of the user does not exist in one of those arrays. In the end, simply choose a random user from the list and show the details to the user. That's it!
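The client-side filtering step could look roughly like this. It is a minimal sketch: `unseenUsers` and `pickRandom` are hypothetical helper names, and the `nearby` list is assumed to be the uids already returned by the area query.

```typescript
// Remove every uid that is already in acceptedUsers or declinedUsers.
function unseenUsers(
  nearby: string[],
  acceptedUsers: string[],
  declinedUsers: string[],
): string[] {
  const seen = new Set([...acceptedUsers, ...declinedUsers]);
  return nearby.filter((uid) => !seen.has(uid));
}

// Choose a random element, or undefined when nothing is left to show.
function pickRandom<T>(items: T[]): T | undefined {
  if (items.length === 0) return undefined;
  return items[Math.floor(Math.random() * items.length)];
}

const candidates = unseenUsers(
  ["uidOne", "uidThree", "uidFive"], // users returned for the area
  ["uidOne", "uidTwo"],              // acceptedUsers
  ["uidThree", "uidFour"],           // declinedUsers
);
const next = pickRandom(candidates); // "uidFive", the only unseen user
```

Building a `Set` first keeps the comparison linear in the size of the lists rather than scanning both arrays for every nearby user.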
One alternative can be to store in a collection all the relationships (with theirs status) between all users. But that also means that whenever a user signup, I have to create as many documents as I have users.....that's really ugly and make a enormous numbers of documents.
This is not an option. It means that each time a user joins your app you need to write an enormous amount of data, which will be very costly. Since everything in Firestore is about the number of reads and writes, I think you should think again about this approach. Please see Firestore usage and limits.
Edit:
Let's consider an initial list of users that has 10 records. In other words, there are 10 users in total within that area. You say that 7 users have already been seen, so the list contains only the 3 remaining users.
So I display the 3 (or I do another request to get some more) and he checks the 3.
Yes, you should display those 3 users and then remove them one by one from the initial list. There is no need to create a new database call since you already got all the users from that area in the first place. Once the list is empty, you should display a message to the user that there are no more users to swipe in that particular area.
When will I create another database call?
Only when needed, which means that you create another call once new users enter that area. Let's say 3 new users arrive: you now get a list of 3 users and use the same algorithm.
The more my user uses the app, the more difficult it is to show people that he hasn't seen, because his list becomes bigger.
If you think that the arrays will grow beyond what a document can hold, then you should consider storing the users in a collection and not in an array. The constraint here is that documents have limits on how much data you can put into them. According to the official documentation regarding usage and limits:
Maximum size for a document: 1 MiB (1,048,576 bytes)
As you can see, you are limited to 1 MiB total of data in a single document. When storing text (uids), you can store quite a lot, but as your array gets bigger, be careful about this limitation.
But if you stay within this limit, which I personally think you will, you have nothing to worry about.
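To put a rough number on that, here is a back-of-the-envelope estimate. It rests on assumptions that are not in the original answer: that each uid is a 28-character ASCII string, and that a stored string costs its UTF-8 byte length plus one byte, which is my reading of Firestore's storage size rules. The document name, field names and other fields are ignored unless passed in as `reservedBytes`.

```typescript
// Maximum document size quoted above: 1 MiB.
const MAX_DOC_BYTES = 1048576;

// Estimate how many uids fit in one document.
// uidLength: assumed number of ASCII characters per uid.
// reservedBytes: hypothetical allowance for everything else in the document.
function maxUidsPerDocument(uidLength: number, reservedBytes = 0): number {
  const bytesPerUid = uidLength + 1; // assumed: string bytes + 1
  return Math.floor((MAX_DOC_BYTES - reservedBytes) / bytesPerUid);
}

const estimate = maxUidsPerDocument(28); // roughly 36,000 uids
```

Under these assumptions a single document holds tens of thousands of uids across the two arrays, so the limit only matters for very heavy users.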
Edit2:
Not if I have a large response set; in that case I will limit it to a number (5 in the example below). And even if I don't limit the number, how can I know in the next db call that new people have been added, and how do I retrieve only those?
I will not remove them from the Users collection, as they can be shown to other users.
If you have a large amount of data (many users in a single area), yes, it's a good idea to limit the results, but a much better idea would be to load the data in smaller chunks. In short, get 5 users, remove them one by one until the list has zero users, load another 5 users, and so on. This can be done using my answer from the following post:
Is there a way to paginate queries by combining query cursors using FirestoreRecyclerAdapter?
The initial list is the list that you get when you first query the database. In this case, the initial list will contain 5 users.

Firebase: How flat should my data structure be?

I'm building an app that tracks the user's location and updates Firebase. I've read the documentation about structure data but still have a few questions.
I'm considering structuring the data in one of two ways, but can't determine which one.
users
  $id
    -position
    -other attr
vs:
user_position
  $id
users
  $id
    -other attr.
In what scenario would the first design work best, and in what scenario the second?
If you only keep one position per user (as seems to be the case, given that you use the singular user_position), there is no useful difference between the two structures. A user's position in that case is just another attribute, just one that happens to have two values (lat and lon).
But if you want to keep multiple positions per user, then your first structure is mixing entity types: users and user_positions. This is an anti-pattern when it comes to Firebase Database.
The two most common reasons are:
Say you want to show a list of user names (or any specific, single-value attribute). With the first structure you will also need to read the list of all positions of all users, just to get the list of names. With the second structure, you just read the user's attributes. If that is still much more data than you need, consider also keeping a list of /user_names for optimal read performance.
Many developers end up wanting different access rules for the user positions and the other user attributes. In the first structure that is only possible by pushing the read permission from the top /users down to lower in the tree. In the second structure, you can just give separate permissions to /users and /user_positions.
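The first reason can be illustrated with two in-memory trees. This is only a sketch with made-up sample data; `payloadSize` is a hypothetical helper that uses the JSON length as a crude proxy for how much data reading /users would download.

```typescript
interface Position {
  lat: number;
  lon: number;
}

// Structure 1: positions nested under each user.
const nestedUsers: Record<string, { name: string; positions: Position[] }> = {
  u1: { name: "Ann", positions: [{ lat: 1, lon: 2 }, { lat: 3, lon: 4 }] },
  u2: { name: "Bob", positions: [{ lat: 5, lon: 6 }] },
};

// Structure 2: users and positions kept in separate top-level nodes.
const flatUsers: Record<string, { name: string }> = {
  u1: { name: "Ann" },
  u2: { name: "Bob" },
};
const userPositions: Record<string, Position[]> = {
  u1: [{ lat: 1, lon: 2 }, { lat: 3, lon: 4 }],
  u2: [{ lat: 5, lon: 6 }],
};

// Listing names needs only the name attribute in either structure.
function listNames(users: Record<string, { name: string }>): string[] {
  return Object.keys(users).map((id) => users[id].name);
}

// Crude proxy for the amount of data read from a node.
function payloadSize(node: unknown): number {
  return JSON.stringify(node).length;
}

const nestedCost = payloadSize(nestedUsers); // includes every position
const flatCost = payloadSize(flatUsers);     // names only
```

Both structures yield the same list of names, but the nested one drags every stored position along with it, and that gap grows with each position written.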
