I'm quite new to NoSQL, that's why I come here to get your opinions.
I'm trying to understand whether it's better to use nested objects or subcollections in a specific case. I will try to explain my case.
I have to store several shops in my Db. Each shop will have an address, a phone number etc... So I have a Collection "Shop" and inside several Documents representing the shops.
Now, my shops have some contacts (2, 3 or 4 employees, for example). My question is, what should I do:
1) Store 2, 3 or 4 objects in my "shop" documents, like:
objectContact: {
  name: "Georges",
  age: 20....
}
2) Create a subcollection "Contact" inside my "shop" documents, and then insert 2, 3 or 4 documents in this subcollection.
Which is better? Does either of these two solutions disable some tools/queries in NoSQL? Is one of them faster when it comes to writing/reading the data?
Thanks in advance,
Currently, all queries in Cloud Firestore are shallow, which means you can't get a subcollection along with the document that contains it. You have to read it separately. So I would recommend storing the contacts as nested objects in the shop document. It has a few limitations, though. Check this link for a detailed explanation of modelling data in Cloud Firestore.
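As a rough sketch of the nested option (plain Python dictionaries rather than a Firestore SDK call; the shop name, the second contact and the `contact_names` helper are made up for illustration), the contacts travel with the shop document, so a single document read returns them:

```python
# Hypothetical shape of one "Shop" document with its contacts nested inside.
shop = {
    "name": "Corner Bakery",          # illustrative value, not from the question
    "contacts": [
        {"name": "Georges", "age": 20},
        {"name": "Marie", "age": 34},  # illustrative second contact
    ],
}


def contact_names(shop_doc):
    """Return the contact names found inside a single shop document."""
    return [contact["name"] for contact in shop_doc.get("contacts", [])]


names = contact_names(shop)
```

With a subcollection, the same list would need a second read against the "Contact" subcollection of that shop.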
I have an app that helps store owners manage their inventory through a simple API-driven interface.
My app stores all data on Firestore. My simplified database looks like this:
-users
   -name
   -email
   -uid
-products
   -atts
   ...
   -ownerId
-someOtherThing
   -atts
   ...
   -ownerId
The idea is that only documents with ownerId that matches the current user ID will be accessible to the user. User with ID=5 will only have access to items that match ownerId=5.
Is this a good way of storing this data? I am worried that I will eventually end up with thousands of documents in that collection and querying them by "ownerId" might not be the best way to tackle this. On the other hand, I might end up with hundreds of users too, which probably makes it bad design to introduce several new collections for each of them?
What would be a better approach design-wise?
While "a good way" is subjective and purely dependent on the use-cases of your app, what you're proposing is quite a common way to store data in Firestore.
Your concern about the number of users and other documents is unwarranted, as Firestore guarantees that the performance of returning the (say) products for a specific user depends solely on the number of products returned, not on the total number of products in the database.
So if you have 10 products that you're the ownerId for, then no matter how many other users/products there are, the amount of time it takes to retrieve your 10 products will always be the same.
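To make the intuition concrete, here is a toy in-memory model of an indexed lookup. It is only an illustration of "cost follows the result set", not how Firestore is actually implemented; the `ProductStore` class and its method names are hypothetical.

```python
from collections import defaultdict


class ProductStore:
    """Toy stand-in for a collection with an index on ownerId."""

    def __init__(self):
        self.docs = {}                     # doc_id -> document fields
        self.by_owner = defaultdict(list)  # ownerId -> doc_ids (the "index")

    def add(self, doc_id, owner_id, **fields):
        self.docs[doc_id] = {"ownerId": owner_id, **fields}
        self.by_owner[owner_id].append(doc_id)

    def products_for(self, owner_id):
        # Only the index entries for this owner are visited; documents
        # belonging to other owners are never scanned.
        return [self.docs[doc_id] for doc_id in self.by_owner.get(owner_id, [])]


store = ProductStore()
for i in range(10):
    store.add(f"mine-{i}", owner_id="5", title=f"product {i}")
for i in range(1000):
    store.add(f"other-{i}", owner_id=str(100 + i), title="someone else's product")

my_products = store.products_for("5")
```

The lookup for owner "5" walks 10 index entries whether the store holds 1,010 documents or many more.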
I am having trouble transitioning from Realtime Database to Cloud Firestore. I'm developing something like a story book app. So, I need to store Stories in my database. Each story contains a set of chapters, and each chapter contains a set of pages.
Back in Realtime Database I used to have a data structure in the likes of:
stories:
storyId:
title: "Isaac, the Toothless Dog"
description: "Isaac wanders Paris seeking for true love."
chapters:
title: "Into the Sewers"
published: true
chapterId:
pages:
pageId:
loose: false
occupied: false
body: "This is where the body of the page goes"
With Realtime Database I could just request the node storyId, and receive the whole map below with chapters and pages. With that map, in Dart, I would generate an object Story with all that information. That's what I want to achieve, using Cloud Firestore.
For Cloud Firestore, I have designed the following data structure:
Of course, the purple outer boxes represent Collections, and the blue inner boxes represent Documents. I use three dots (...) to show that there would be plenty more documents like the one next to it.
I like this data structure. It looks neat and clean, and I always know what I'm looking at.
However, Cloud Firestore queries are Shallow (a feature I actually like, serves me well in other points of the app), meaning I can only query from a single Collection. That means I can't generate that Story object (in Dart) without, at least, three queries to Firebase.
My question is: Is there a way to achieve this using only one query? If not, is there any way I can organize my data in order to achieve so?
I know I could put the pages and chapters information inside the Stories Collection (like what I had in Realtime Database), but Documents have a size limit of 1 MB and 20000 fields. If a story contains multiple chapters, each one with multiple pages, that limit may very well be exceeded. That's why I want to keep them in separate Collections.
Any ideas? Thank you.
I have 2 collections in Firestore:
In the first I have the "alreadyLoaded" user ids,
In the second I have all userIDs,
How can I exclude the elements of the first collection from the elements of the second with a Firestore query?
The goal is to get only the users that I haven't already loaded (optionally paginating the results).
Is there an easy way to achieve this using Firestore?
EDIT:
The number of documents I'm talking about will eventually become huge
This is not possible using a single query at scale. The only way to solve this situation as you've described is to fully query both collections, then write code in the client that removes the already-loaded IDs from the full set of user IDs.
In fact, it's not possible to involve two collections in the same query at the same time. There are no joins. The only way to exclude documents from a query is to use a filter on the contents of the documents in the single collection.
Firestore might not be the best database for this kind of requirement if the collections are large and you're not able to precompute or cache the results.
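The client-side step described above could look roughly like this, assuming both collections have already been read into plain lists of IDs (the function name `not_yet_loaded` and the sample IDs are illustrative only):

```python
def not_yet_loaded(all_user_ids, already_loaded_ids):
    """Return the IDs from all_user_ids that are not in already_loaded_ids.

    The input order of all_user_ids is preserved.
    """
    loaded = set(already_loaded_ids)  # set membership checks are O(1) on average
    return [uid for uid in all_user_ids if uid not in loaded]


remaining = not_yet_loaded(["a", "b", "c", "d"], ["b", "d"])
```

Note that this still reads every document in both collections, which is exactly why it gets expensive as they grow.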
Assume I have an application that shows a list of restaurants, and there are 1000 restaurants to show.
My first impression would be to create a collection of restaurants and each individual restaurant would be a document inside this collection.
The concern with the above approach is that for each user Cloud Firestore would register 1000 reads.
My question is if there is a better way of storing the restaurants to decrease the number of reads?
My first impression would be to create a collection of restaurants and each individual restaurant would be a document inside this collection.
Yes, that's the right way to do it.
The concern with the above approach is that for each user firestore would register 1000 reads
You'll be charged for 1000 read operations only if you read all the documents at once. But this is not the right way to do it; you need to limit the data that you get. On how you can achieve this, please check the official documentation regarding ordering and limiting data in Cloud Firestore.
Another, more appropriate approach is to load the data in smaller chunks. This practice is called pagination and can be used very simply in Cloud Firestore with the startAt() or startAfter() methods.
For Android, this is a recommended way in which you can paginate queries by combining query cursors with the limit() method. I also recommend you take a look at this video for a better understanding.
My question is if there is a better way of storing the restaurants to decrease the number of reads?
And to answer your question: the problem is not about how you store the data, it is about how you read it.
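The cursor-based pagination mentioned above can be sketched with an in-memory list. This only mimics the behaviour of ordering by a field, then combining startAfter() with limit(); it is not SDK code, and the `fetch_page` helper and the "name" field are assumptions made for the example.

```python
def fetch_page(docs, page_size, start_after=None):
    """Return (page, cursor): up to page_size docs ordered by "name".

    start_after is the "name" of the last document on the previous page,
    or None for the first page.
    """
    ordered = sorted(docs, key=lambda d: d["name"])
    if start_after is not None:
        ordered = [d for d in ordered if d["name"] > start_after]
    page = ordered[:page_size]
    cursor = page[-1]["name"] if page else None
    return page, cursor


restaurants = [{"name": f"r{i:04d}"} for i in range(1000)]

page1, cursor1 = fetch_page(restaurants, 25)           # first 25 only
page2, cursor2 = fetch_page(restaurants, 25, cursor1)  # next 25
```

Each call returns 25 documents, so a user who never scrolls past the first page costs 25 reads rather than 1000.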
The Firestore docs don't have an in-depth discussion of the tradeoffs involved in using sub-collections vs top-level collections, but do point out that sub-collections are less flexible and less 'scalable'. Given that you sacrifice flexibility in setting up your data in sub-collections, there must be some definite plus sides besides a mentally satisfying structure.
For example how does the time for a firestore query on a single key across a large collection compare with getting all items from a much smaller collection?
Say we want to query a large collection 'People' for all people in a family unit. Alternatively, partition the data by family in the first place into family units.
People -> person: {family: 'Smith'}
versus
Families -> family: {name:'Smith'} -> People -> person
I would expect the latter to be more efficient, but is this correct? Are there any big-O estimates for each?
Any other advantages of sub-collections (eg for transactions)?
I've got some key points about subcollections that you need to be aware of when modeling your database.
1 – Subcollections give you a more structured database.
2 - Queries are indexed by default: query performance is proportional to the size of your result set, not your data set. So the size of your collection does not matter.
3 – Each document has a max size of 1MB. For instance, if you have an array of orders in your customer document, it might be a good idea to create a subcollection of orders to each customer because you cannot foresee how many orders a customer will have. By doing this you don’t need to worry about the max size of your document.
4 – Pricing: Firestore charges you for document reads, writes and deletes. Therefore, when you create many subcollections instead of using arrays in the documents, you will need to perform more reads, writes and deletes, thus increasing your bill.
To answer the original question about efficiency:
Querying all people with the family 'Smith' from the people top-level collection really is not any slower than asking for all the people in the 'Smith' family sub-collection.
This is explained in the How to Structure Your Data episode of the Get to Know Cloud Firestore video series.
There are some trade-offs between top-level collections and sub-collections to be aware of. Depending on the specific queries you intend to use you may need to create composite indexes to query top-level collections or collection group indexes to query sub-collections. Both these index types count towards the 200 index exemptions limit.
These trade-offs are discussed in detail near the bottom of the Understanding Collection Group Queries blog post and in Maps, Arrays and Subcollections, Oh My! episode of the Get to Know Cloud Firestore video series.
I've linked to the relevant parts of both videos.
I was wondering about the same thing. The documentation mainly talks about arrays vs sub-collections. My conclusion is that there are no clear advantages of using a sub-collection over a top-level collection. Sub-collections had some clear technical limitations before, but I think those were removed with the recent introduction of collection group queries.
Here are some advantages of both approaches:
Sub collection:
Your database "feels" more structured as you will have fewer top-level collections listed.
No need to store a reference/foreign key/id of the parent document, as it is implied by the database structure. You can get to the parent via the sub collection document ref.
Top-level collection:
Documents are easier to delete. Using sub collections you need to make sure to first delete all sub collection documents before you delete the parent document. There is no API for this so you might need to roll your own helper functions.
Having the parent id directly in each (sub) document might make it easier to process query results, depending on the application.
Todd answered this in a Firebase YouTube video:
1) There's a limit to how many documents you can create per minute in a single collection if the documents have an always-increasing value (like a timestamp).
2) Very large collections don't do as well from a performance standpoint when you're offline. But they are generally good options to consider.