Does Cloud Firestore keep queries that are called several times in cache?

What happens if I call this query once, used in a StreamBuilder that wraps a ListView:
Firestore.instance
    .collection('users')
    .orderBy('createdAt', descending: true)
    .limit(3)
    .snapshots();
and then I run this same query a second time, but with a limit of 6:
Firestore.instance
    .collection('users')
    .orderBy('createdAt', descending: true)
    .limit(6)
    .snapshots();
Are the first 3 documents fetched a second time, or are they kept in cache?
Does the StreamBuilder rebuild the entire ListView?

The second query will fetch all 6 documents again. None of them will come from cache, so they will all be billed as reads on the server and take time to transfer. The only way query results will come from cache is if the client app is offline, or if you explicitly request the cache as the source, for example by calling getDocuments with a Source of cache.
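For illustration, an explicit cache read looks roughly like this in the JavaScript Web SDK (a minimal sketch; in the Flutter plugin the equivalent is passing Source.cache to getDocuments):

import { initializeApp } from 'firebase/app';
import {
  getFirestore, collection, query, orderBy, limit,
  getDocs, getDocsFromCache,
} from 'firebase/firestore';

// Placeholder config; replace with your project's values.
const app = initializeApp({ projectId: 'your-project-id' });
const db = getFirestore(app);

async function readUsers() {
  const first3 = query(collection(db, 'users'), orderBy('createdAt', 'desc'), limit(3));

  // Default read: served from the server whenever the client is online,
  // so every matching document is billed as a read.
  const serverSnap = await getDocs(first3);

  // Explicit cache read: served from local persistence with no billed reads,
  // but it can be stale or empty if the documents were never cached.
  const cacheSnap = await getDocsFromCache(first3);

  console.log(serverSnap.size, cacheSnap.size);
}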

Related

How can I save myself from a thousand-dollar Firebase bill?

I am building a hospital management app. The app has a bunch of features, the major ones being messaging, appointment booking, video appointments, and more.
For the messaging feature, I am loading all the previous messages from the cache using the code below:
QuerySnapshot<Map<String, dynamic>> localMessageDocs = await _db
    .collection("users")
    .doc(AuthService.uid())
    .collection("messages")
    .orderBy('time', descending: true)
    .get(GetOptions(source: Source.cache));
and after that I am querying the latest messages using the code below:
_db
    .collection('users')
    .doc(AuthService.uid())
    .collection('messages')
    .where('time', isGreaterThan: _lastLocalMessageTime)
    .orderBy('time', descending: true)
    .snapshots()
Still, I am afraid of the large number of reads that may occur if this system fails, for example because of the cache size allowed on Android and the Firestore cache limit (the default is 40 MB), since the messages for a single user can grow very quickly, which may cause a thousand-dollar bill.
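For reference, that default cache limit is configurable; a minimal sketch of raising it with the JavaScript Web SDK (the Flutter plugin exposes the same option as Settings.cacheSizeBytes):

import { initializeApp } from 'firebase/app';
import { initializeFirestore, CACHE_SIZE_UNLIMITED } from 'firebase/firestore';

// Placeholder config; replace with your project's values.
const app = initializeApp({ projectId: 'your-project-id' });

// Raise the local cache limit above the 40 MB default; CACHE_SIZE_UNLIMITED
// disables automatic garbage collection of cached documents entirely.
const db = initializeFirestore(app, { cacheSizeBytes: CACHE_SIZE_UNLIMITED });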
Any help will be appreciated!

Spurious MaxListenersExceededWarning EventEmitter memory leak when running consecutive firestore queries

I have a Firebase HTTP function which in turn calls some Firestore operations. If I call the HTTP function several times, letting each call finish before starting the next, I get the following error in the Firebase Functions log:
(node:2) MaxListenersExceededWarning: Possible EventEmitter memory leak detected. 11 close listeners added. Use emitter.setMaxListeners() to increase limit
The Firebase function is an import task which takes the data to import, checks for duplicates by running a Firestore query, and if there are none, adds the data to the Firestore DB with another write.
Here is the Firebase function, with parts removed for brevity:
module.exports = functions.https.onCall(async (obj, context) => {
  // To isolate where the problem is
  const wait = (ms: number) => new Promise(resolve => setTimeout(resolve, ms))
  try {
    const photo = getPhoto(obj)
    // Query to look for duplicates
    const query = db
      .collection(`/Users/${context.auth.uid}/Photos`)
      .where('date', '==', photo.date)
      .where('order', '==', photo.order)
      .limit(1)
    await wait(300)
    log.info('Before query')
    const querySnap = await query.get()
    log.info('After Query')
    await wait(300)
    // And then the rest of the code, removed for brevity
  } catch (error) {
    throw new functions.https.HttpsError('internal', error.message)
  }
})
I inserted a pause before and after the const querySnap = await query.get() to show that it really is this invocation that causes the error message.
I also set the firestore logger to output its internal logging to help debug the issue, by doing this:
import * as admin from 'firebase-admin'
admin.initializeApp()
admin.firestore.setLogFunction(log => {
  console.log(log)
})
So the more complete log output I get is this: (read it bottom to top)
12:50:10.087 pm: After Query
12:50:10.087 pm: Firestore (2.3.0) 2019-09-13T19:50:10.087Z RTQ7I [Firestore._initializeStream]: Received stream end
12:50:10.084 pm: Firestore (2.3.0) 2019-09-13T19:50:10.084Z RTQ7I [Firestore._initializeStream]: Releasing stream
12:50:10.084 pm: Firestore (2.3.0) 2019-09-13T19:50:10.084Z RTQ7I [Firestore.readStream]: Received response: {"document":null,"transaction":{"type":"Buffer","data":[]},"readTime":{"seconds":"1568404210","nanos":76771000},"skippedResults":0}
12:50:10.026 pm: (node:2) MaxListenersExceededWarning: Possible EventEmitter memory leak detected. 11 close listeners added. Use emitter.setMaxListeners() to increase limit
12:50:10.020 pm: Firestore (2.3.0) 2019-09-13T19:50:10.020Z RTQ7I [Firestore.readStream]: Sending request: {"parent":"[redacted]/documents/Users/SpQ3wTsFzofj6wcsF7efRrSMrtV2","structuredQuery":{"from":[{"collectionId":"Photos"}],"where":{"compositeFilter":{"op":"AND","filters":[{"fieldFilter":{"field":{"fieldPath":"date"},"op":"EQUAL","value":{"stringValue":"2019-06-26"}}},{"fieldFilter":{"field":{"fieldPath":"order"},"op":"EQUAL","value":{"integerValue":0}}}]}},"limit":{"value":1}}}
12:50:10.019 pm: Firestore (2.3.0) 2019-09-13T19:50:10.019Z RTQ7I [ClientPool.acquire]: Re-using existing client with 100 remaining operations
12:50:10.012 pm: Before query
The interesting thing is that I usually run these imports in batches of 10. I seem to only get the error during the first batch of 10. If I then quickly run more batches, I don't seem to get the error again. But if I wait some time, it returns. Also, it is not consistent which invocation within a batch triggers the error. It may be the 9th or the 2nd invocation, or any other.
Finally, the error doesn't stop execution. In fact, the imports seem to never fail. But I don't like having unaccounted-for errors in my logs! I won't be able to sleep at night with them there. :-)
I'm grateful for any help you can offer.
I got a useful response from the Firebase support team. They told me to try installing the latest version of firebase-admin (which upgraded it from 8.5.0 to 8.6.0), and that resolved the issue, even without the workaround of pinning grpc. So I think this should be the correct answer now.
Looks like this bug, MaxListenersExceededWarning: Possible EventEmitter memory leak detected #694, might be the problem here.
The workaround is to run npm install @grpc/grpc-js@0.5.2 --save-exact until the bug is fixed and the Firestore library starts using the fixed version.

Firestore Deadline Exceeded Node

I would like to load a collection that is ~30k records, i.e. load it via:
const db = admin.firestore();
let documentsArray: Array<{}> = [];
db.collection(collection)
  .get()
  .then(snap => {
    snap.forEach(doc => {
      documentsArray.push(doc);
    });
  })
  .catch(err => console.log(err));
This always throws a Deadline Exceeded error. I have searched for some sort of mechanism that would allow me to paginate through it, but I find it unbelievable not to be able to query that amount of data in one go.
I was thinking that I might be hitting the limit because of my rather slow machine, but then I deployed a simple Express app to App Engine to do the fetching and still had no luck.
Alternatively, I could also export the collection with gcloud beta firestore export, but it does not provide JSON data.
I'm not sure about Firestore, but on Datastore I was never able to fetch that much data in one shot; I'd always have to fetch pages of about 1000 records at a time and build the result up in memory before processing it. You said:
I have searched for some sort of mechanism that would allow me to paginate through
Perhaps you missed this page
https://cloud.google.com/firestore/docs/query-data/query-cursors
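Roughly, paging through a large collection with query cursors could look like this (a minimal sketch with the Node Admin SDK; the 1000-document page size and ordering by document ID are assumptions, not part of the original question):

import * as admin from 'firebase-admin';

admin.initializeApp();
const db = admin.firestore();

async function fetchAll(collectionPath: string): Promise<FirebaseFirestore.QueryDocumentSnapshot[]> {
  const pageSize = 1000;
  const docs: FirebaseFirestore.QueryDocumentSnapshot[] = [];
  let cursor: FirebaseFirestore.QueryDocumentSnapshot | undefined;

  while (true) {
    // Order by document ID so the cursor is stable, and start each page
    // right after the last document of the previous page.
    let page = db
      .collection(collectionPath)
      .orderBy(admin.firestore.FieldPath.documentId())
      .limit(pageSize);
    if (cursor) page = page.startAfter(cursor);

    const snap = await page.get();
    docs.push(...snap.docs);

    if (snap.size < pageSize) break; // last page reached
    cursor = snap.docs[snap.docs.length - 1];
  }
  return docs;
}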
In the end the issue was that the machine processing the 30k records from Firestore was not powerful enough to get the data it needed in time. Solved by using a GCE n1-standard-4 instance.

Flutter Firestore transaction running multiple times

Firestore documentation says:
"In the case of a concurrent edit, Cloud Firestore runs the entire transaction again. For example, if a transaction reads documents and another client modifies any of those documents, Cloud Firestore retries the transaction. This feature ensures that the transaction runs on up-to-date and consistent data."
I am using the cloud_firestore package and I noticed that doing
final TransactionHandler transaction = (Transaction tx) async {
  DocumentSnapshot ds = await tx.get(userAccountsCollection.document(id));
  return ds.data;
};
return await runTransaction(transaction).then((data) {
  return data;
});
the transaction may run multiple times but always returns after the first attempt. Now, in the case of concurrent edits, the data from the first attempt may be incorrect, so this is a problem for me.
How can I wait for the transaction to actually finish even if it will run multiple times and not return after the first one finished?
Your transaction code doesn't make much sense as written: it only reads a document and never modifies or writes anything. You only need to use a transaction if you intend to read, modify, and write at least one document.
The transaction function might only be run once anyway. It only needs to run multiple times if the server sees a lot of other transactions occurring on the same documents and has trouble keeping up with them all.
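For comparison, a read-modify-write transaction looks roughly like this (a minimal sketch with the Node Admin SDK; the accounts collection and balance field are made up for illustration):

import * as admin from 'firebase-admin';

admin.initializeApp();
const db = admin.firestore();

async function addToBalance(id: string, amount: number): Promise<number> {
  const ref = db.collection('accounts').doc(id);

  // runTransaction resolves with the value returned by the handler, and the
  // handler is re-run automatically if a concurrent edit invalidates the read.
  return db.runTransaction(async tx => {
    const snap = await tx.get(ref);
    const balance = (snap.get('balance') ?? 0) + amount;
    // The write is what makes the transaction worthwhile in the first place.
    tx.set(ref, { balance }, { merge: true });
    return balance;
  });
}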

Firebase: Cloud Functions, How to Cache a Firestore Document Snapshot

I have a Firebase Cloud Function that I call directly from my app. This cloud function fetches a collection of Firestore documents, iterates over each one, and then returns a result.
My question is: would it be best to keep the result of that fetch/get in memory (on the Node server), refreshed with .onSnapshot? It seems this would improve performance, as my cloud function would not have to wait for the Firestore response (it would already have the collection in memory). How would I do this? Is it as simple as populating a global variable? How do I set up an .onSnapshot realtime listener with cloud functions?
It might depend on how large these snapshots are and how many of them may be cached, because the local disk is a RAM disk, and without housekeeping it might only work for a limited time.
Always delete temporary files
Local disk storage in the temporary directory is an in-memory file-system. Files that you write consume memory available to your function, and sometimes persist between invocations. Failing to explicitly delete these files may eventually lead to an out-of-memory error and a subsequent cold start.
Source: Cloud Functions - Tips & Tricks.
It does not say there what exactly the hard limit would be, and caching elsewhere might not improve access time that much. It says 2048 MB per function by default, while one can raise the quotas under IAM & admin. It all depends on whether the quota per function can be raised far enough to handle the cache.
Here's an example of the .onSnapshot() listener:
// for a single document:
var doc = db.collection('cities').doc('SF');
// this also works for multiple documents:
// var docs = db.collection('cities').where('state', '==', 'CA');
var observer = doc.onSnapshot(docSnapshot => {
  console.log(`Received doc snapshot: ${docSnapshot}`);
}, err => {
  console.log(`Encountered error: ${err}`);
});
// unsubscribe, to stop listening for changes:
var unsub = db.collection('cities').onSnapshot(() => {});
unsub();
Source: Get realtime updates with Cloud Firestore.
Cloud Firestore Triggers might be another option.
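To illustrate the "global variable refreshed with .onSnapshot" idea from the question, here is a minimal sketch (assuming firebase-admin and firebase-functions; the cities collection and listCities function are made up, and the cache only survives while a given instance stays warm):

import * as admin from 'firebase-admin';
import * as functions from 'firebase-functions';

admin.initializeApp();
const db = admin.firestore();

// Module-scope cache, shared by invocations that land on the same warm instance.
let citiesCache: FirebaseFirestore.DocumentData[] = [];

// Keep the cache fresh with a realtime listener started at module load time.
db.collection('cities').onSnapshot(
  snap => { citiesCache = snap.docs.map(d => d.data()); },
  err => console.error('cities listener error', err)
);

export const listCities = functions.https.onCall(async () => {
  // On a cold start the listener may not have delivered data yet,
  // so fall back to a direct get() once.
  if (citiesCache.length === 0) {
    const snap = await db.collection('cities').get();
    citiesCache = snap.docs.map(d => d.data());
  }
  return citiesCache;
});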
