Firebase Firestore - Question about Reads/Writes - firebase

I have a question about reads/writes in a Firestore DB.
The scenario is:
I have a collection "City" (for example, 20 cities), and each city has a subcollection "Restaurants" (e.g. 500 restaurants).
Now my question is: when I want to get all restaurants in a city, how many reads would Firestore bill? 500?
And when I want to add a restaurant, would it need only 1 write to add this document to the subcollection?

As Andres said: you are charged for document reads and writes. It doesn't matter what collection or subcollection the document comes from, each time the server reads a document on your behalf, you're charged for that document read.
So if you read 500 restaurant documents out of the subcollection of a city, you'll be charged for 500 document reads. If you add a single document to that subcollection, you're charged for a single document write.
If you regularly find yourself reading the same set of documents (e.g. the same 500 restaurants for all users in that city), consider creating a data model that reduces the number of documents you need to read. For example: you'll probably need a subset of the information from each restaurant, so you could extract that for all restaurants in the city into a "top restaurants list" document. This type of data duplication is quite normal in NoSQL databases, and key to keeping great performance with a reasonable cost.
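A minimal sketch of that summary-document idea, in plain JavaScript with no Firestore calls. The helper `buildCitySummary`, the city id, and the field names (`name`, `rating`, `menu`) are hypothetical; the point is only that a small subset of each restaurant is copied into one document, so that showing the list costs one document read instead of one per restaurant.

```javascript
// Hypothetical helper: collapse many restaurant documents into one summary
// document that holds only the fields a list screen needs.
function buildCitySummary(cityId, restaurants, fields) {
  const items = restaurants.map((restaurant) => {
    const entry = { id: restaurant.id };
    for (const field of fields) {
      entry[field] = restaurant[field];
    }
    return entry;
  });
  return { cityId, count: items.length, items };
}

// Made-up example data standing in for documents of the subcollection.
const restaurants = [
  { id: 'r1', name: 'Trattoria', rating: 4.5, menu: ['pasta', 'pizza'] },
  { id: 'r2', name: 'Noodle Bar', rating: 4.1, menu: ['ramen'] },
];

// Keep only the name and rating; the bulky menu stays in the full document.
const summary = buildCitySummary('city-1', restaurants, ['name', 'rating']);
console.log(JSON.stringify(summary));
```

The trade-off is extra work on writes: the summary has to be rewritten whenever one of the underlying restaurants changes.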
Also see:
Getting to know Cloud Firestore, which covers this and many more scenarios.
NoSQL data modeling, which covers general data modeling for all kinds of NoSQL databases.
This answer I gave earlier today: Maxing out document storage in Firestore

One document read/write always costs the same, whether the document is in a collection, a subcollection, or a subcollection of a subcollection - or a sub of sub of sub... you get the idea :)

Related

Firestore Collection Write Rate

The article about Best practices for Cloud Firestore states that we should keep the rate of write operations for an individual collection under 1,000 operations/second.
But at the same time, the Firebase team says in Choose a data structure that root-level collections "offer the most flexibility and scalability".
What if I have a root-level collection (e.g. "messages") which expects to have more than 1,000 write operations/second?
A limit of 1,000 operations/second is quite a lot, but if you find yourself in a situation where you need more than that, you should consider changing your database schema to spread the writes over multiple collections. Having a single collection of messages to which every user can add messages doesn't sound like a good way to go, since you can reach that limit very soon. In this case you should split that collection into multiple collections. A possible schema is the one I explain in the following video:
https://www.youtube.com/watch?v=u3KwKQddPoo
As you can see at the end of that video, there is a collection named messages which in turn contains a roomId document. This document contains a subcollection named roomMessages, which contains all the messages from a chat room as documents. In this case, there is practically no chance of reaching that limit.
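To make that layout concrete, here is a small sketch that only builds the document path for one message under the schema messages/{roomId}/roomMessages/{messageId}. The helper name `roomMessagePath` and the example ids are hypothetical; the sketch shows how each room's messages end up in their own roomMessages subcollection.

```javascript
// Hypothetical helper: build the path of one chat message document under
// messages/{roomId}/roomMessages/{messageId}.
function roomMessagePath(roomId, messageId) {
  return ['messages', roomId, 'roomMessages', messageId].join('/');
}

// Messages from different rooms live in different subcollections.
console.log(roomMessagePath('roomA', 'msg1')); // messages/roomA/roomMessages/msg1
console.log(roomMessagePath('roomB', 'msg1')); // messages/roomB/roomMessages/msg1
```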
But at the same time, the Firebase team says in Choose a data structure that root-level collections "offer the most flexibility and scalability".
But also remember that Firestore can look up a collection at level 1 as quickly as one at level 100, so you don't need to worry about that.
The limit of 1,000 ops/sec per collection only applies to realtime updates, so as long as you don't have a snapshot listener this should be okay.
I asked the question on the Cloud Firestore Google Group.
The limit is 10,000 writes per second if no other limits apply first:
https://firebase.google.com/docs/firestore/quotas#writes_and_transactions
Also, keep in mind the best practices for scaling Cloud Firestore.

Firebase Firestore database structure

I'm building an app using Flutter and Firebase and was wondering what the best Firestore database structure is.
I want users to be able to post messages and then search by both the content of the post and the poster's username.
Does it make sense to create one collection for users, with each document storing the username and other info, and a separate collection for the posts, with each document containing the post and the username of the poster?
In the unlikely event that the number of posts exceeds a million, is there an additional cost of querying this kind of massive collection?
Would it make more sense to store each user's posts as a sub-collection under their user document? I believe this would require additional read operations to access each document's sub-collection. Would this be cheaper or more expensive if I end up getting a lot of traffic?
is there an additional cost of querying this kind of massive collection?
The cost and performance of reading from Firestore are purely based on the amount of data (number of documents and their size) you retrieve, and not in any way on the number of documents in the collection.
But what is limited in Firestore is the number of writes you can do to data that is "close to each other". That intentionally vague definition means that it's typically better for write scalability to spread the data over separate subcollections, if the data naturally lends itself to that (such as in your case).
To get a great introduction to Firestore, and to data modeling trade-offs, watch Getting to know Cloud Firestore.

Cloud Firestore Payments

I have a question regarding pricing in Cloud Firestore compared to the Realtime Database. In Firestore you pay per read/write per document, right? In other words: If I display a list of 1000 documents in a collection, do I pay for 1000 reads?
I have a few collections in my app with many (200-300) documents, which unfortunately all have to be displayed on one page. My app has about 10,000 active users. By my calculation I would definitely go broke... :-)
Therefore my question: do 300 elements still count as 300 reads if I save them in ONE document as an array and retrieve that document? Is only the one document counted as a read, or also the 300 elements in the array?
If I display a list of 1000 documents in a collection, do I pay for 1000 reads?
You only pay for documents that are read on/from the server. Most Firestore SDKs implement a client-side cache, which may significantly reduce the number of documents that are read on/from the server.
I have a few collections in my app with many (200-300) documents, which unfortunately all have to be displayed on one page
One way to reduce the number of read operations is to model the data for that one page into a separate single document. This document is essentially the data for a single page in your app, meaning that you update it whenever any of the underlying data updates. That leads to more code when you write updates to the database, but it saves you 299 document reads for every user accessing the page.
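A rough back-of-the-envelope sketch of that saving, using the numbers from the question (10,000 active users, 300 documents on the page). It assumes, purely for illustration, that every user opens the page once per day and that nothing is served from the client-side cache; `dailyPageReads` is a hypothetical helper, not part of any SDK.

```javascript
// Hypothetical helper: server reads per day for one page, assuming each view
// reads `docsPerView` documents and nothing comes from the local cache.
function dailyPageReads(users, docsPerView, viewsPerUser = 1) {
  return users * docsPerView * viewsPerUser;
}

// One document per list item versus one combined document for the page.
const readsWithSeparateDocs = dailyPageReads(10000, 300);
const readsWithSingleDoc = dailyPageReads(10000, 1);

console.log(readsWithSeparateDocs); // 3000000
console.log(readsWithSingleDoc); // 10000
console.log(readsWithSeparateDocs - readsWithSingleDoc); // 2990000 reads saved
```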
Also see:
Cloud Firestore Pricing | Get to Know Cloud Firestore #3
Firestore: How are "reads" calculated for the quota?
Firebase firestore pricing for querying
Understanding Firestore Pricing

Understanding Firestore Pricing

Before creating a new app I wanna make sure I get the pricing model correct.
For example in a phonebook app, I have a collection called userList that has a list of users which are individual documents.
I have 50k users on my list, which means I have 50k documents in my collection.
If I get the userList collection, it will read all 50k documents.
Firestore allows 50k free document reads. Does that mean 50k document reads in total, or 50k reads per document?
If it is 50k document reads in total, then in my phonebook example I will run out of the free limit in just one get call.
If you actually have to pull an entire collection of 50k documents, the question you likely should be asking is how to properly structure a Firestore Database.
More than likely you need to filter these documents based on some criteria within them by using the query WHERE clause. Having each client device hold 50k documents locally sounds like poor database planning and possibly a security risk.
Each returned document from your query counts as 1 read. If there are no matches to your query, 1 read is charged. If there are 50k matches, there are 50k reads charged.
For example, you can retrieve the logged in user's document and be charged 1 read with something like:
// Fetches only the logged-in user's document, so a single match is billed as 1 read
db.collection('userList').where('uid', '==', clientUID).get()
Note: As of 10/2018, Firestore charges 6 cents (USD) per 100k reads after the first 50k per day.
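As a worked example of that note, the sketch below applies the quoted figures (50k free reads per day, 6 cents per 100k reads after that). The rates are the 10/2018 numbers from the note and may no longer be current; `dailyReadCostCents` is a hypothetical helper.

```javascript
// Rates quoted in the note above (as of 10/2018); treat them as assumptions.
const FREE_READS_PER_DAY = 50000;
const CENTS_PER_100K_READS = 6;

// Hypothetical helper: cost in US cents for one day's document reads.
function dailyReadCostCents(readsPerDay) {
  const billableReads = Math.max(0, readsPerDay - FREE_READS_PER_DAY);
  return (billableReads * CENTS_PER_100K_READS) / 100000;
}

console.log(dailyReadCostCents(50000)); // 0 (one full read of 50k users uses up the free quota)
console.log(dailyReadCostCents(100000)); // 3 (a second full read the same day)
console.log(dailyReadCostCents(150000)); // 6
```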
The free quota is for your entire project. So you're allowed 50,000 document reads per day across the entire project.
Reading 50K user profile documents will indeed use that free quota in one go.
Reading large numbers of documents is in general something you should try to prevent when using NoSQL databases.
The client apps that access Firestore should only read data that they're going to immediately show to the user. And there's no way you'll fit 50K users on a screen.
So more likely you have a case where you're aggregating over the user collection. E.g. things like:
Count the number of users
Count the number of users named Frank
Calculate the average length of the user names
NoSQL databases are usually more limited in their query capabilities than traditional relational databases, because they focus on ensuring read-scalability. You'll frequently do extra work when something is written to the database, if in exchange you can get better performance when reading from the database.
For better performance you'll want to store these aggregation values in the database, and then update them whenever a user profile is written. So you'll have a "userCount", a document with "userCount for each unique username", and an "averageUsernameLength".
For an example of how to run such aggregation queries, see: https://firebase.google.com/docs/firestore/solutions/aggregation. For lower write volumes, you can also consider using Cloud Functions to update the counters.
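A minimal sketch of keeping those three aggregates up to date at write time, in plain JavaScript with no Firestore calls. The function `applyUserAdded`, the `countByName` map, and the extra `totalUsernameLength` field are hypothetical; in a real app this update would run wherever the user profile is written, for example inside a transaction or a Cloud Function.

```javascript
// Hypothetical helper: return the new aggregate values after one user is added.
// The total length is stored too, so the average can be recomputed exactly.
function applyUserAdded(stats, username) {
  const userCount = stats.userCount + 1;
  const totalUsernameLength = stats.totalUsernameLength + username.length;
  return {
    userCount,
    totalUsernameLength,
    averageUsernameLength: totalUsernameLength / userCount,
    countByName: {
      ...stats.countByName,
      [username]: (stats.countByName[username] || 0) + 1,
    },
  };
}

// Start from empty aggregates and add three users.
let stats = {
  userCount: 0,
  totalUsernameLength: 0,
  averageUsernameLength: 0,
  countByName: {},
};
for (const name of ['Frank', 'Ada', 'Frank']) {
  stats = applyUserAdded(stats, name);
}

console.log(stats.userCount); // 3
console.log(stats.countByName.Frank); // 2
```

With the aggregates stored like this, answering "how many users are named Frank" needs one document read rather than a scan of the whole user collection.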
Don't fetch all users in one go. You can limit your query to a small number of users and fetch more as the user scrolls. Since no one is going to scroll through 50k users, this avoids most of the cost. It is similar to how a RecyclerView saves memory.
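The paging idea could be sketched as below. It assumes `db` is a Firestore instance from the namespaced JavaScript SDK (or the Admin SDK), where queries expose `orderBy`, `startAfter` and `limit`; the helper `buildUserPageQuery` and the `name` field used for ordering are hypothetical.

```javascript
// Hypothetical helper: build a query for one page of users.
// `lastVisibleDoc` is the last document snapshot of the previous page,
// or null/undefined for the first page.
function buildUserPageQuery(db, pageSize, lastVisibleDoc) {
  let query = db.collection('userList').orderBy('name');
  if (lastVisibleDoc) {
    // Continue right after the last document the user has already seen.
    query = query.startAfter(lastVisibleDoc);
  }
  // Each execution of this query returns at most `pageSize` documents.
  return query.limit(pageSize);
}
```

Calling `.get()` on the returned query then reads at most `pageSize` documents, and the last entry of the resulting snapshot's `docs` array can be passed back in as the cursor when the user scrolls further.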

Advantages of firestore sub-collections

The firestore docs don't have an in depth discussion of the tradeoffs involved in using sub-collections vs top-level collections, but do point out that they are less flexible and less 'scalable'. Given that you sacrifice flexibility in setting up your data in sub-collections, there must be some definite plus sides besides a mentally satisfying structure.
For example how does the time for a firestore query on a single key across a large collection compare with getting all items from a much smaller collection?
Say we want to query a large collection 'People' for all people in a family unit. Alternatively, partition the data by family in the first place into family units.
People -> person: {family: 'Smith'}
versus
Families -> family: {name:'Smith'} -> People -> person
I would expect the latter to be more efficient, but is this correct? Are there any big-O estimates for each?
Any other advantages of sub-collections (eg for transactions)?
I've got some key points about subcollections that you need to be aware of when modeling your database.
1 - Subcollections give you a more structured database.
2 - Queries are indexed by default: query performance is proportional to the size of your result set, not your data set. So the size of your collection does not matter; performance depends on the size of your result set.
3 - Each document has a max size of 1MB. For instance, if you have an array of orders in your customer document, it might be a good idea to create a subcollection of orders for each customer, because you cannot foresee how many orders a customer will have. By doing this you don’t need to worry about the max size of your document.
4 - Pricing: Firestore charges you for document reads, writes and deletes. Therefore, when you create many subcollections instead of using arrays in the documents, you will need to perform more reads, writes and deletes, thus increasing your bill.
To answer the original question about efficiency:
Querying all people with the family 'Smith' from the people top-level collections really is not any slower than asking for all the people in the 'Smith' family sub-collection.
This is explained in the How to Structure Your Data episode of the Get to Know Cloud Firestore video series.
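For reference, the two lookups being compared could be written as below. This is only a sketch: `db` is assumed to be a Firestore instance from the namespaced JavaScript SDK (or the Admin SDK), the function names are hypothetical, and the second function assumes the family document's ID is 'Smith', which the question does not state.

```javascript
// Top-level layout: People -> person: {family: 'Smith'}
// Filters one large collection on the `family` field.
function smithsFromTopLevel(db) {
  return db.collection('People').where('family', '==', 'Smith');
}

// Nested layout: Families -> family -> People -> person
// Assumes the family document ID is 'Smith' (an illustration-only choice).
function smithsFromSubcollection(db) {
  return db.collection('Families').doc('Smith').collection('People');
}
```

Calling `.get()` on either result returns the same set of people, and in both cases the billed reads equal the number of documents returned.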
There are some trade-offs between top-level collections and sub-collections to be aware of. Depending on the specific queries you intend to use you may need to create composite indexes to query top-level collections or collection group indexes to query sub-collections. Both these index types count towards the 200 index exemptions limit.
These trade-offs are discussed in detail near the bottom of the Understanding Collection Group Queries blog post and in Maps, Arrays and Subcollections, Oh My! episode of the Get to Know Cloud Firestore video series.
I've linked to the relevant parts of both videos.
I was wondering about the same thing. The documentation mainly talks about arrays vs sub-collections. My conclusion is that there are no clear advantages of using a sub-collection over a top-level collection. Sub collections had some clear technical limitations before, but I think those are removed with the recent introduction of collection group queries.
Here are some advantages of both approaches:
Sub collection:
Your database "feels" more structured as you will have fewer top-level collections listed.
No need to store a reference/foreign key/id of the parent document, as it is implied by the database structure. You can get to the parent via the sub collection document ref.
Top-level collection:
Documents are easier to delete. Using sub collections you need to make sure to first delete all sub collection documents before you delete the parent document. There is no API for this so you might need to roll your own helper functions.
Having the parent id directly in each (sub) document might make it easier to process query results, depending on the application.
Todd answered this in a Firebase YouTube video:
1) There's a limit to how many documents you can create per minute in a single collection if the documents have an always-increasing value (like a timestamp).
2) Very large collections don't do as well from a performance standpoint when you're offline. But they are generally good options to consider.
