Firestore query get size of results without reading the documents? - firebase

I have an app that returns a list of health foods. There will be approximately 10000-20000 foods (documents) in the product collection.
These foods are queried by multiple fields using arrayContains. This may be categories, subcategories and when the user searches in the search bar it is an arrayContains on the keywords array.
With so many products I plan to paginate the results of query as I get the documents. The issue is that I need to know the amount of results to display the total of results to the user.
I have read that for a query you are charged one read and then if you get the documents then they are further charged per document. Is there a way of getting the number of results for a query without getting all the documents.
I have seen this answer here:
Get size of the query in Firestore
But in this example they say to use a counter which doesn't seem practical as I am using a query on keyword when the user searches and I am using a mixture of categories, subcategories when the user filters.
Thanks

With so many products I plan to paginate the results of query as I get the documents.
That's a very good decision, since getting 10000-20000 foods (documents) at once is not an option. The first reason is cost, as it would be quite expensive; the second is that you may get an OutOfMemoryError when trying to load such an enormous amount of data.
The issue is that I need to know the amount of results to display the total of results to the user.
There is no way in Firestore to know the size of the result set in advance.
Is there a way of getting the number of results for a query without getting all the documents.
No, you have to page through all the results that are returned by your query to get the total size.
But in this example they say to use a counter which doesn't seem practical as I am using a query on keyword when the user searches
That's correct, that solution doesn't fit your needs since it solves the problem of storing the number of all documents in a collection and not the number of documents that are returned by a query. As far as I know, it's just not scalable to provide that information, in the way this cloud hosted, NoSQL, realtime database needs to "massively scale".

For any future lurker, a "solution" to this problem is to paginate results with an offset until the query doesn't return any more documents. When the query snapshot is empty, return undefined for the next offset and handle it from there:
const LIMIT = 100
// Query-string values arrive as strings, so parse the offset before doing arithmetic on it.
const offset = Number.parseInt(req.query.offset, 10) || 0
// get() returns a Promise, so it must be awaited (inside an async request handler).
const results = await db.collection(COLLECTION)
.offset(offset)
.limit(LIMIT)
.get()
const docs = results.docs.map(doc => doc.data())
res.status(200).send({
data: docs,
// Return the next offset or undefined if no docs are returned anymore.
offset: docs.length > 0 ? offset + LIMIT : undefined
})

Related

Firestore OR query improvement

I have a social network app.
I need to query the latest Posts, ordered by date, from all the people that I follow.
My working query so far is :
Query getRecentFriendsPost(List<String> followings, int postedDate) {
return getUsersCollection()
.where("uid", whereIn: followings)
.where("postedAt", isGreaterThanOrEqualTo: postedDate)
.orderBy("dateCreated", descending: true)
.limit(16);
}
This doesn't answer my needs unfortunately.
The followings list contains the user IDs that I follow, and it can grow to thousands of entries. Firestore supports only 10 items in a whereIn clause.
The second problem: if the list has 1000 entries, I have to split it into blocks of 10 uids, which costs 100 separate queries.
This is very expensive and slow.
I must use the whole followings list to be sure not to miss the most recent posts.
How can I perform such a query with Firestore?
The followings list allows me to query the User object associated with the Post, if a Post is retrieved by the query.
Should I go for Realtime Database in this situation? (If I should even keep Firebase products.)
I did not find a better architecture either; if anyone knows how to improve this, it would be much appreciated.
I have migrated my database to MongoDB and I am able to do this in one query.
Performance is not affected until I pass an array of about 6000 values; beyond that, results take more than 100ms, which remains perfectly fine for my use case.
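For reference, the batching cost described in the question can be sketched as follows. This is only an illustration in JavaScript (the question's own code is Dart); the helper name chunk and the limit of 10 are assumptions taken from the question, not part of any Firestore API.

```javascript
// Hypothetical helper: split a list of uids into blocks small enough
// for a single whereIn / 'in' filter (10 items, per the question).
function chunk(items, size = 10) {
  const blocks = [];
  for (let i = 0; i < items.length; i += size) {
    blocks.push(items.slice(i, i + size));
  }
  return blocks;
}

// 1000 followed users means 100 blocks, so 100 separate queries.
const followings = Array.from({ length: 1000 }, (_, i) => `uid-${i}`);
const blocks = chunk(followings);
console.log(blocks.length);

// Each block would then need its own query, for example (assumed usage):
// collection.where('uid', 'in', block).orderBy('dateCreated', 'desc').limit(16)
```

The results of all the per-block queries would still have to be merged and re-sorted on the client, which is the overhead the question is trying to avoid.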

Firebase Firestore Query Date and Status not equals. Cannot have inequality filters on multiple properties

I want to query by date and status. I created an index for both the fields, but when querying it throws an error:
Cannot have inequality filters on multiple properties: [created_at, status]
const docQuery = db.collection('documents')
.where('created_at', '<=', new Date()) // ex. created_at later than 2 weeks ago
.where('status', '!=', 'processed') // works when == is used
.limit(10);
'status' is a string and can be multiple things.
I read about query limitations, https://firebase.google.com/docs/firestore/query-data/queries but that is such a simple thing for any database to do, what is a good solution for such a problem? Loading entire records is not an option.
The issue isn't that it's a "simple" thing to do. The issue is that it's an unscalable thing to do. Firestore is fast and cheap because of the limitations it places on queries. Limiting queries to a single inequality/range filter allows it to scale massively without requiring arbitrarily large amounts of memory to perform lots of data transformations. It can simply stream results directly to the client without loading them all into enough memory to hold all of the results. While it's not necessary to understand how this works, it's necessary to accept the limitations if you want to use Firestore effectively.
For your particular case, you could make your data more scalable to query by changing your "status" field from a single string field to a set of boolean fields. For each one of the possible status values, you could instead have a boolean field that indicates the current status. So, for the query you show here, if you had a field called "isProcessed" with a boolean value true/false, you could find the first 10 unprocessed documents created before the current date like this:
const docQuery = db.collection('documents')
.where('created_at', '<=', new Date())
.where('isProcessed', '==', false)
.limit(10);
Yes, this means you have to keep each field up to date when there is a change in status, but you now have a schema that's possible to query at massive scale.
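One way to keep those boolean fields in step with the status is to derive them in a single place before every write. The sketch below is a minimal illustration under assumptions: the list of status values and the helper names statusFlags and withStatus are hypothetical, and only "processed" / "isProcessed" come from the answer above.

```javascript
// Hypothetical list of status values; only 'processed' appears in the question.
const STATUSES = ['pending', 'processing', 'processed', 'failed'];

// Build one boolean field per status, e.g. 'processed' -> 'isProcessed'.
function statusFlags(status) {
  const flags = {};
  for (const s of STATUSES) {
    const fieldName = 'is' + s[0].toUpperCase() + s.slice(1);
    flags[fieldName] = s === status;
  }
  return flags;
}

// Return a copy of the document data with the status and all flags updated together.
function withStatus(data, status) {
  return { ...data, status, ...statusFlags(status) };
}

console.log(withStatus({ title: 'report' }, 'processed'));
```

Writing the result of withStatus in one update means the string and the flags cannot drift apart within a single write.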

Optimizing the number of reads from firestore server using caching or snapshot listener

I am rendering the following view using Firebase. So basically the search is powered by a Firebase query.
I am using the following code:
Query query = FirebaseUtils.buildQuery(
fireStore, 'customers', filters, lastDocument, documentLimit);
print("query =" + query.toString());
QuerySnapshot querySnapshot = await query.getDocuments();
print("Got reply from firestore. No of items =" + querySnapshot.documents.length.toString());
Questions:
If the user hits the same query, again and again, it still hits the server. I checked this by using doc.metadata.isFromCache and it always returns false.
Will using query snapshots help reduce the number of reads for this search query? I guess not, as the user is changing the query again and again.
Any other way to limit the number of reads?
If the user hits the same query, again and again, it still hits the server. I checked this by using doc.metadata.isFromCache and it always returns false.
If you are online, it will always return false and that's the expected behavior since the listener is always looking for changes on the server. If you want to force the retrieval of the data from the cache while you are online, then you should explicitly specify this to Firestore by adding Source.CACHE to your get() call. If you're offline, it will always return true.
Will using query snapshots help reduce the number of reads for this search query? I guess not, as the user is changing the query again and again.
No, it won't. What does a query snapshot represent? It's basically an object that contains the results of your query. However, if you perform a query "again and again", as long as it's the same query and nothing has changed on the server, you will not be charged for any read operations. This happens because the second time you perform the query, the results come from the cache. If you perform a new search each time, you'll always be billed a number of read operations equal to the number of elements returned by your query. Furthermore, if you create new searches and some of the returned elements are already in your cache, you'll be billed a read operation only for the new ones.
Any other way to limit the number of reads?
The simplest method to limit the results of a query is to use a limit() call and pass as an argument the number of elements you want your query to return:
limit(10)
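Beyond limit(), an application can also avoid re-running an identical search by remembering recent results itself. The following is a rough sketch of that idea, not a Firestore feature: createQueryCache, runQuery, and the 60-second lifetime are all hypothetical names and values, and runQuery stands in for whatever function actually executes the query.

```javascript
// Wrap a query function so that identical filters within ttlMs reuse the
// earlier result instead of calling runQuery (and the server) again.
function createQueryCache(runQuery, ttlMs = 60000, now = Date.now) {
  const cache = new Map();
  return function cachedQuery(filters) {
    // Note: JSON.stringify is order-sensitive, so build filters consistently.
    const key = JSON.stringify(filters);
    const hit = cache.get(key);
    if (hit && now() - hit.storedAt < ttlMs) {
      return hit.result;
    }
    const result = runQuery(filters);
    cache.set(key, { result, storedAt: now() });
    // If the query fails, forget it so the next call retries.
    Promise.resolve(result).catch(() => cache.delete(key));
    return result;
  };
}
```

The trade-off is staleness: results can be up to ttlMs old, so this suits data that changes rarely.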

Maximum number of fields for a Firestore document?

Right now I have a products collection where I store my products as documents like the following:
documentID:
title: STRING,
price: NUMBER,
images: ARRAY OF OBJECTS,
userImages: ARRAY OF OBJECTS,
thumbnail: STRING,
category: STRING
NOTE: My web app has approximately 1000 products.
I'm thinking about doing full text search on client side, while also saving on database reads, so I'm thinking about duplicating my data on Firestore and save a partial copy of all of my products into a single document to send that to the client so I can implement client full text search with that.
I would create the allProducts collection, with a single document with 1000 fields. Is this possible?
allProducts: collection
Contains a single document with the following fields:
Every field would contain a MAP (object) with product details.
document_1_ID: { // Same ID as the 'products' collection
title: STRING,
price: NUMBER,
category: STRING,
thumbnail
},
document_2_ID: {
title: STRING,
price: NUMBER,
category: STRING,
thumbnail
},
// AND SO ON...
NOTE: I would still keep the products collection intact.
QUESTION
Is it possible to have a single document with 1000 fields? What is the limit?
I'm looking into this, because since I'm performing client full text search, every user will need to have access to my whole database of products. And I don't want every user to read every single document that I have, because I imagine that the costs of that would not scale very well.
NOTE2: I know that the maximum size for a document is 1mb.
According to this document, in addition to the 1MB limit per document, there is a limit of index entries per document, which is 40,000. Because each field appears in 2 indexes (ascending and descending), the maximum number of fields is 20,000.
I made a Node.js program to test it and I can confirm that I can create 20,000 fields but I cannot create 20,001.
If you try to set more than 20,000 fields, you will get the exception:
INVALID_ARGUMENT: too many index entries for entity
// Setting 20001 here throws "INVALID_ARGUMENT: too many index entries for entity"
const indexPage = Array.from(Array(20000).keys()).reduce((acc, cur) => {
acc[`index-${cur}`] = cur;
return acc;
}, {});
await db.doc(`test/doc`).set(indexPage);
I would create the allProducts collection, with a single document with 1000 fields. Is this possible?
There isn't quite a fixed limitation for that. However, the documentation recommends having fewer than 100 fields per document:
Limit the number of fields per document: 100
So the problem isn't the fact that you duplicate data, the problem is that the documents have another limitation that you should care about. So you're also limited to how much data you can put into a document. According to the official documentation regarding usage and limits:
Maximum size for a document: 1 MiB (1,048,576 bytes)
As you can see, you are limited to 1 MiB of data in total in a single document. When storing text you can fit quite a lot, but as your documents get bigger, be careful about this limitation.
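To get a feel for how close a combined document would come to that limit, one can build it locally and measure it. This is a rough sketch under assumptions: the helper names and sample products are made up, and the byte length of the JSON text is only an approximation of the size Firestore computes for a document, so leave a generous margin.

```javascript
const MAX_DOCUMENT_BYTES = 1048576; // 1 MiB, as quoted above

// Build the single "allProducts" document: one map field per product ID.
function buildAllProductsDocument(products) {
  const doc = {};
  for (const p of products) {
    doc[p.id] = {
      title: p.title,
      price: p.price,
      category: p.category,
      thumbnail: p.thumbnail,
    };
  }
  return doc;
}

// Approximation only: UTF-8 length of the JSON form, not Firestore's own calculation.
function approximateSizeInBytes(doc) {
  return Buffer.byteLength(JSON.stringify(doc), 'utf8');
}

// Hypothetical sample data: 1000 small products.
const products = Array.from({ length: 1000 }, (_, i) => ({
  id: `product-${i}`,
  title: `Product ${i}`,
  price: 9.99,
  category: 'snacks',
  thumbnail: `https://example.com/thumbs/${i}.png`,
}));

const size = approximateSizeInBytes(buildAllProductsDocument(products));
console.log(size, size < MAX_DOCUMENT_BYTES);
```

Longer titles or extra fields per product would eat into the margin quickly, so the check is worth re-running whenever the per-product map changes.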
If you are storing a large amount of data in your documents and those documents should be updated by lots of admins, there is another limitation that you need to take care of. So you are limited to 1 write per second on every document. So if you have a situation in which the admins are trying to write/update products in that same document all at once, you might start to see some of these writes fail. So, be careful about this limitation too.
And the last limitation is for index entries per document. So if you decide to get over the first limitation, please note that the maximum limit is set to 40,000. Because each field has associated two indexes (ascending and descending), the max number of fields is 20,000.
Is it possible to have a single document with 1000 fields?
It is possible, up to 20,000 fields (the 40,000 index entries mentioned above), but in your case there is no benefit. I say that because every time you perform a query (get the document), only a single document will be returned. So there is no way you can implement a search algorithm in a single document and expect to get Product objects in return.
And I don't want every user to read every single document that I have, because I imagine that the costs of that would not scale very well.
Downloading an entire collection to search for fields client-side isn't practical at all and is also very costly. That's the reason why the official documentation recommends a third-party search service like Algolia.
For Android, please see my answer in the following post:
Is it possible to use Algolia query in FirestoreRecyclerOptions?
Firebase has a limit of 20k fields per document.
https://www.youtube.com/watch?v=o7d5Zeic63s
According to the documentation, there is no stated limit placed on the number of fields in a document. However, a document can only have up to 40,000 index entries, which will grow as documents contain more fields that are indexed by default.

Firebase firestore collection count with angularFire 2

I want to get the total number of the documents that exist in firestore.
I don't want to get the data, only the total number of documents. My Products collection has 200,000 items. Is that possible with Angular 4-5 (not AngularJS)?
Can an expert tell me how I can achieve that?
My code so far, which does not work:
get_total_messages() {
this.messages_collection = this.afs.collection<MessageEntity>('messages');
return this.messages_collection.snapshotChanges();
}
And this is how I try to get the data, but it is not what I want:
this.firebase_Service.get_total_messages().subscribe( data => {
console.log(data);
});
There is no API to get the count of the number of documents in a Firestore collection. This means that the only ways to get the count are:
Get all documents and count them client-side.
Store the count as a separate property and update that as you add/remove documents.
Both approaches are quite common in NoSQL databases, with the second of course being a lot more efficient as the number of documents grows.
Firebase provides a sample of using Cloud Functions to keep a counter. While this sample is written for the Firebase Realtime Database, it can easily be modified to work on Cloud Firestore too.
Firestore also provides documentation on running aggregation queries and running distributed counters. Both seem slightly more involved than the first sample I linked though.
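The second approach (a stored count) can be sketched roughly like this. It is a minimal illustration, not the linked sample: it assumes db is a Firestore instance from the firebase-admin Node.js SDK and that increment is FieldValue.increment passed in by the caller; the "counters" collection and both function names are hypothetical.

```javascript
// Add a document and bump the stored count in the same batch, so the two
// writes succeed or fail together.
async function addDocumentAndCount(db, increment, collectionName, data) {
  const docRef = db.collection(collectionName).doc(); // auto-generated ID
  const counterRef = db.collection('counters').doc(collectionName);
  const batch = db.batch();
  batch.set(docRef, data);
  // merge: true creates the counter document on first use.
  batch.set(counterRef, { count: increment(1) }, { merge: true });
  await batch.commit();
  return docRef;
}

// Reading the count costs one document read, regardless of collection size.
async function readCount(db, collectionName) {
  const snapshot = await db.collection('counters').doc(collectionName).get();
  return snapshot.exists ? snapshot.data().count : 0;
}
```

A matching delete helper would apply increment(-1) in the same way. A single counter document is subject to the per-document write rate limit, which is why distributed counters exist for busy collections.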
this.firebase_Service.get_total_messages().subscribe( data=>this.totalnumber=data.length);
//now, you can get total number of messages
Luckily, I solved it with the code below.
Try this; it works well.
this.db.collection('User').valueChanges()
.subscribe( result => {
console.log(result.length);
})
