How to efficiently count the documents in a large Firestore collection.
Obviously, I do not want to fetch the entire collection and count it on the front end, as the cost would go through the roof. Is there really not a simple API such as db.collection('someCollection').count() or similar, but we need to hack around it?
(2022-10-20) Edit:
Starting from now, counting the documents in a collection or the documents that are returned by a query is actually possible without the need for keeping a counter. So you can count the documents using the new count() method which:
Returns a query that counts the documents in the result set of this query.
This new feature was announced at this year's Firebase summit. Keep in mind that this feature doesn't read the actual documents. So according to the official documentation:
For aggregation queries such as count(), you are charged one document read for each batch of up to 1000 index entries matched by the query. For aggregation queries that match 0 index entries, there is a minimum charge of one document read.
For example, count() operations that match between 0 and 1000 index entries are billed for one document read. For a count() operation that matches 1500 index entries, you are billed 2 document reads.
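To make the pricing rule quoted above concrete, here is a minimal sketch of the arithmetic. The helper name countQueryReads is hypothetical (it is not part of the Firebase SDK), and the function only restates the quoted rule; it does not query Firestore.

```typescript
// Estimate the document reads billed for one count() aggregation query,
// following the pricing rule quoted above.
// `countQueryReads` is a hypothetical helper, not a Firebase SDK function.
function countQueryReads(matchedIndexEntries: number): number {
  if (!Number.isInteger(matchedIndexEntries) || matchedIndexEntries < 0) {
    throw new RangeError("matchedIndexEntries must be a non-negative integer");
  }
  // One read per batch of up to 1000 index entries, with a minimum of one read.
  return Math.max(1, Math.ceil(matchedIndexEntries / 1000));
}

console.log(countQueryReads(0));      // 1
console.log(countQueryReads(1000));   // 1
console.log(countQueryReads(1500));   // 2
console.log(countQueryReads(100000)); // 100
```

So counting a collection of 100,000 documents this way would be billed as roughly 100 reads rather than 100,000.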
Is there really not a simple api such as db.collection('someCollection').count() or similar
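At the time this answer was first written, there was not. See the edit above for the count() method that has been added since.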
but we need to hack around it
Yes, we can use a workaround for counting the number of documents within a collection, which is to keep a separate counter that is updated every time a document is added to or removed from the collection.
This counter can be stored as a field inside a document in Firestore. However, if documents are added or deleted very frequently, this solution might be a little costly, in which case I highly recommend keeping the counter in the Realtime Database instead. There you pay nothing when you update the counter, only when you read (download) it. And since it's just a number, you'll pay almost nothing. I also wrote an article a couple of years ago about solutions for counting documents in Firestore:
How to count the number of documents in a Firestore collection?
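The counter workaround described above can be sketched as follows. This is only an illustration of the bookkeeping: an in-memory Map stands in for the Firestore collection, a plain number stands in for the counter field, and the class name CountedCollection is hypothetical.

```typescript
// In-memory stand-in for a collection plus a separate counter.
// `CountedCollection` is a hypothetical name used only for illustration;
// nothing here talks to Firestore or the Realtime Database.
class CountedCollection<T> {
  private docs = new Map<string, T>();
  private counter = 0; // plays the role of the counter field

  add(id: string, data: T): void {
    if (!this.docs.has(id)) {
      this.counter += 1; // increment only for a genuinely new document
    }
    this.docs.set(id, data);
  }

  remove(id: string): void {
    if (this.docs.delete(id)) {
      this.counter -= 1; // decrement only if a document was actually deleted
    }
  }

  // Reading the count is a single lookup, regardless of collection size.
  count(): number {
    return this.counter;
  }
}

const users = new CountedCollection<{ name: string }>();
users.add("u1", { name: "Ada" });
users.add("u2", { name: "Lin" });
users.remove("u1");
console.log(users.count()); // 1
```

In a real app, the document write and the counter update would typically be grouped in one batched write or transaction so that the two cannot drift apart.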
As the title suggests I would like to know how to get the total elements count of a paginated and filtered collection.
I have seen that many recommend, for the counting of the documents of the collection, to create a statistics document with the counter of the documents in the collection.
But if I need to implement a paged and filtered retrieval, how can I have the count of the total filtered items without having to retrieve them all?
Edit: October 20th, 2022
Starting from now, counting the documents in a collection or the documents that are returned by a query is actually possible without the need for keeping a counter. So you can count the documents using the new count() method which:
Returns a query that counts the documents in the result set of this query.
This new feature was announced at this year's Firebase summit. Keep in mind that this feature doesn't read the actual documents. So according to the official documentation:
For aggregation queries such as count(), you are charged one document read for each batch of up to 1000 index entries matched by the query. For aggregation queries that match 0 index entries, there is a minimum charge of one document read.
For example, count() operations that match between 0 and 1000 index entries are billed for one document read. For a count() operation that matches 1500 index entries, you are billed 2 document reads.
I have seen that many recommend, for the counting of the documents of the collection, creating a statistics document with the counter of the documents in the collection.
Yes, that is correct. Without count(), it's very costly to count the documents in a collection each time you need the total, because every document has to be read. So it's best to keep a field in a document that holds that number, incrementing it each time a new document is added and decrementing it each time a document is deleted.
But if I need to implement a paged and filtered retrieval, how can I have the count of the total filtered items without having to retrieve them all?
Apart from the count() method described in the edit above, there is no way to know ahead of time how many documents exist in a collection without reading them all or reading a document that contains that information, as explained above.
Pagination in NoSQL databases is a little different than in SQL databases. Many modern applications paginate data using an infinite scroll. If you understand Java, then you can take a look at my answer in the following post:
How to paginate Firestore with Android?
Here is also the official documentation regarding Firestore pagination that can be achieved using query cursors:
https://firebase.google.com/docs/firestore/query-data/query-cursors
If you understand Kotlin, I also recommend you check the following resource:
How to implement pagination in Firestore using Jetpack Compose?
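The cursor-based, infinite-scroll style of pagination mentioned above can be sketched like this. It is an in-memory illustration only: a sorted array stands in for an ordered query, and the names fetchPage and Page are hypothetical, not Firestore APIs. Each page starts right after the last item of the previous page instead of at a numeric offset.

```typescript
// A page of results plus the cursor needed to request the next page.
interface Page<T> {
  items: T[];
  nextCursor: string | null; // id of the last item, or null when the page is not full
}

// `fetchPage` is a hypothetical helper that mimics "start after the cursor,
// take at most `limit` items" over an already sorted in-memory array.
// An unknown cursor id restarts from the beginning.
function fetchPage<T extends { id: string }>(
  sorted: T[],
  cursor: string | null,
  limit: number
): Page<T> {
  const start = cursor === null ? 0 : sorted.findIndex((d) => d.id === cursor) + 1;
  const items = sorted.slice(start, start + limit);
  const last = items[items.length - 1];
  // A full page suggests there may be more data; a short page ends the scroll.
  const nextCursor = last !== undefined && items.length === limit ? last.id : null;
  return { items, nextCursor };
}

const data = ["a", "b", "c", "d", "e"].map((id) => ({ id }));
const first = fetchPage(data, null, 2);              // items a, b
const second = fetchPage(data, first.nextCursor, 2); // items c, d
const third = fetchPage(data, second.nextCursor, 2); // item e, end of list
console.log(third.items.map((d) => d.id), third.nextCursor);
```

Note that nothing in this pattern needs the total number of items, which is why infinite scroll pairs well with a database where counting is not free.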
Of course, I know how to get the number of docs with the following code:
// Requires: import 'package:cloud_firestore/cloud_firestore.dart';
Future<int> handledocsNumber() async {
  // Downloads every document in "users", so each one counts as a read.
  final QuerySnapshot<Map<String, dynamic>> snapshot =
      await FirebaseFirestore.instance.collection("users").get();
  final int docsNumber = snapshot.docs.length;
  return docsNumber;
}
But this seems like a horrifying approach if the collection holds a huge number of docs, because .get() counts every doc as a new read, especially if this method is called repeatedly on behalf of users. Imagine there are 100,000 docs: that means .get() will read all 100,000 docs as new reads every time a user needs to know the length.
Is there any good way to know the length by paying only for one query, the length query itself?
(2022-10-20) Edit:
Starting from now, counting the documents in a collection or the documents that are returned by a query is actually possible without the need for keeping a counter. So you can count the documents using the new count() method which:
Returns a query that counts the documents in the result set of this query.
This new feature was announced at this year's Firebase summit. Keep in mind that this feature doesn't read the actual documents. So according to the official documentation:
For aggregation queries such as count(), you are charged one document read for each batch of up to 1000 index entries matched by the query. For aggregation queries that match 0 index entries, there is a minimum charge of one document read.
For example, count() operations that match between 0 and 1000 index entries are billed for one document read. For a count() operation that matches 1500 index entries, you are billed 2 document reads.
Is there any good way to know the length by paying only for one query, the length query itself?
Yes, you can keep a counter in a document and increment the value once you add a new document to a collection. If you delete a document then simply decrement it. In this way, you can read the counter with a single document read.
Is there a way to determine a read count for each document in Firestore? I would like to limit read counts to 100,000 per document.
(2022-10-20) Edit:
Starting from now, counting the documents in a collection or the documents that are returned by a query is actually possible without the need for keeping a counter. So you can count the documents using the new count() method which:
Returns a query that counts the documents in the result set of this query.
This new feature was announced at this year's Firebase summit. Keep in mind that this feature doesn't read the actual documents. So according to the official documentation:
For aggregation queries such as count(), you are charged one document read for each batch of up to 1000 index entries matched by the query. For aggregation queries that match 0 index entries, there is a minimum charge of one document read.
For example, count() operations that match between 0 and 1000 index entries are billed for one document read. For a count() operation that matches 1500 index entries, you are billed 2 document reads.
Is there a way to determine a read count for each document in Firestore?
As @FrankvanPuffelen also mentioned in his answer, there is no API for doing that. If you need such a mechanism, you need to create it yourself. That means that each time a user reads a document, you should increment a counter. That's pretty simple to implement, since Firebase provides atomic server-side increments. In Firestore, you can increment a field in a document using FieldValue.increment(1); in the Realtime Database, the equivalent is ServerValue.increment(1), which is what the links below document.
Here are the docs for Android:
https://firebase.google.com/docs/database/android/read-and-write#atomic_server-side_increments
Here are the docs for iOS:
https://firebase.google.com/docs/database/ios/read-and-write#atomic_server-side_increments
And here are the docs for the web:
https://firebase.google.com/docs/database/web/read-and-write#atomic_server-side_increments
There is nothing built into Firestore to limit the number of reads for a specific document. There is a quota system (which, among other things, is used to enforce the quota on the free plan), but that doesn't apply per document.
You could do this through Cloud Functions with onRequest or onCall:
1. Read a value from the Realtime Database.
2. If the value is larger than 0, return the respective document.
3. Then decrement the value in the Realtime Database.
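The three steps above can be sketched as a small function. This is an in-memory illustration under stated assumptions: one Map stands in for the per-document budget kept in the Realtime Database, another Map stands in for the Firestore documents, and readWithQuota is a hypothetical name. Inside a real Cloud Function, the check and the decrement would need to happen atomically (for example, in a transaction) so that concurrent requests cannot overspend the budget.

```typescript
// `readWithQuota` is a hypothetical helper, not a Firebase API.
// `remainingReads` stands in for the budget stored in the Realtime Database,
// `documents` stands in for the Firestore documents.
function readWithQuota<T>(
  remainingReads: Map<string, number>,
  documents: Map<string, T>,
  docId: string
): T | null {
  // Step 1: read the remaining budget (a missing entry means no reads allowed).
  const remaining = remainingReads.get(docId) ?? 0;
  if (remaining <= 0) {
    return null; // budget exhausted, refuse the read
  }
  // Step 2: the budget is larger than 0, so look up the document.
  const doc = documents.get(docId);
  if (doc === undefined) {
    return null; // nothing to return, so do not spend budget
  }
  // Step 3: decrement the budget after a successful read.
  remainingReads.set(docId, remaining - 1);
  return doc;
}

const budget = new Map<string, number>([["post1", 2]]);
const docs = new Map<string, string>([["post1", "hello"]]);
console.log(readWithQuota(budget, docs, "post1")); // "hello"
console.log(readWithQuota(budget, docs, "post1")); // "hello"
console.log(readWithQuota(budget, docs, "post1")); // null
```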
Sources:
https://firebase.google.com/docs/database/web/read-and-write#atomic_server-side_increments
https://firebase.google.com/docs/functions/callable
I have 2 collections in Firestore:
In the first I have the "alreadyLoaded" user ids,
In the second I have all userIDs,
How can I exclude the elements of the first collection from those of the second when making a query in Firestore?
The goal is to get only the users that I haven't already loaded (optionally paginating the results).
Is there an easy way to achieve this using Firestore?
EDIT:
The number of documents I'm talking about will eventually become huge
This is not possible using a single query at scale. The only way to solve this situation as you've described is to fully query both collections, then write code in the client that removes from the full set of user IDs every ID that appears in the already-loaded set.
In fact, it's not possible to involve two collections in the same query at the same time. There are no joins. The only way to exclude documents from a query is to use a filter on the contents of the documents in the single collection.
Firestore might not be the best database for this kind of requirement if the collections are large and you're not able to precompute or cache the results.
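The client-side filtering described above amounts to a set difference. A minimal sketch, assuming both collections have already been fetched and reduced to arrays of ID strings (the function name excludeLoaded and the sample IDs are hypothetical):

```typescript
// `excludeLoaded` is a hypothetical helper: it returns the IDs from
// `allUserIds` that do not appear in `alreadyLoadedIds`, preserving order.
function excludeLoaded(allUserIds: string[], alreadyLoadedIds: string[]): string[] {
  // A Set makes each membership test constant time on average,
  // so the whole pass is linear in the size of the two lists.
  const loaded = new Set(alreadyLoadedIds);
  return allUserIds.filter((id) => !loaded.has(id));
}

const remaining = excludeLoaded(["u1", "u2", "u3", "u4"], ["u2", "u4"]);
console.log(remaining); // ["u1", "u3"]
```

The filtering itself is cheap; the cost at scale comes from having to read both collections in full before it can run.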
Are collection group query prices the same as a single collection query? Or is there a minimum of one document read per collection even if the query for that collection returns no documents? Is there anything I should watch out for when it comes to billing?
Collection group queries are billed exactly like normal collection queries, as described in the documentation. It stands to reason that if a normal query that returns no documents incurs at least one read, then collection group queries would behave the same way.
There is a minimum of one document-read per query. So if you query across a group of collections and no documents are returned, that will be charged as a single document read.
The easiest way I find to remember this is to think of the read as a charge for reading the index. Since a collection group query works from a single index, it is (at least) a single document read for reading that index.