Prevent more than one write per second to a Firestore document when using a counter with a cloud function - firebase

Background:
I have a Firestore database with a users collection. Each user is a document which contains a contacts collection. Each document in that collection is a single contact.
Since Firestore does not have a "count" feature for all documents, and since I don't want to read all contacts just to count how many a user has, I trigger cloud functions when a contact is added or deleted; these increment or decrement numberOfContacts in the user document. To make the function idempotent, it has to do multiple reads and writes to avoid incrementing the counter more than once if it's called more than once for the same document. That means I need a separate collection of eventIDs I've already handled so I don't process an event twice. This in turn requires me to run another function once a month that goes through each user deleting all such documents (which is a lot of reads and some writes).
Issue
Now the challenge is that the user can import his/her contacts. So if a user imports 10,000 contacts, this function will get fired 10,000 times in quick succession.
How do I prevent that?
Current approach:
Right now I am adding a field to the contact document indicating that the addition was part of an import, which tells the cloud function not to increment the counter.
I perform the operation from the client 499 contacts at a time in a transaction, with the count increment as the 500th write. That way the count stays consistent even if something fails halfway.
Is this really the best way? It seems so complicated just to have a count of contacts available. I end up doing multiple reads and writes each time a single contact changes, plus I have to run a cleanup function every month.
I keep thinking there's gotta be a simpler way.
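The client-side batching described above can be sketched as pure planning logic. This is a sketch under the question's assumptions; the name planImportBatches is hypothetical, and the actual writes would go through the Firestore SDK's writeBatch() with FieldValue.increment() as the final write.

```typescript
// Firestore's classic limit is 500 writes per batch/transaction. The scheme
// above packs 499 contact writes plus one counter increment per batch, so
// the stored count never drifts from the contacts actually committed.
const MAX_BATCH_WRITES = 500;
const CONTACTS_PER_BATCH = MAX_BATCH_WRITES - 1; // reserve 1 write for the counter

interface BatchPlan {
  contactIds: string[];     // contact documents written in this batch
  counterIncrement: number; // applied as the batch's final write
}

// Hypothetical helper: split an import into batch plans, each of which
// would be committed atomically.
function planImportBatches(contactIds: string[]): BatchPlan[] {
  const plans: BatchPlan[] = [];
  for (let i = 0; i < contactIds.length; i += CONTACTS_PER_BATCH) {
    const slice = contactIds.slice(i, i + CONTACTS_PER_BATCH);
    plans.push({ contactIds: slice, counterIncrement: slice.length });
  }
  return plans;
}
```

Because each batch commits its contacts and its share of the counter together, a failed batch leaves both in their previous consistent state.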

For those who are curious, it seems the approach I was taking is the best one.
I add a field in the contact document that indicates the addition was part of an import (bulkAdd = true). This tells the cloud function not to increment the counter.
I have another cloud function add the contacts 200 at a time (I use FieldValue.serverTimestamp(), and that counts as another write, so it's 400 writes). I do this in a batch, and the 401st write in the batch is the counter increment. That way I can bulk import contacts without bombarding a single document with writes.
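Under the assumption stated above (each server-timestamp transform billed as an extra write), the batch layout can be checked with a small helper; writesPerBatch is a hypothetical name for illustration only.

```typescript
// As assumed in the text above, each contact costs 2 writes here: the
// document itself plus its FieldValue.serverTimestamp() transform.
// One extra write at the end increments the user's contact counter.
const WRITES_PER_CONTACT = 2;
const BATCH_LIMIT = 500; // classic cap on writes per Firestore batch

function writesPerBatch(contactsInBatch: number): number {
  return contactsInBatch * WRITES_PER_CONTACT + 1; // +1 for the counter increment
}
```

With 200 contacts per batch this comes to 401 writes, comfortably inside the limit; going much higher (250 or more contacts) would exceed it.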

Problem with increments
There are duplicate-safe operations like FieldValue.arrayUnion() and FieldValue.arrayRemove(). I wrote a bit about that approach here: Firebase function document.create and user.create triggers firing multiple times
With this approach, your user document contains a special array field of contact IDs. Once a contact is added to the subcollection and your function is triggered, the contact's ID is written to this field. If the function is triggered twice or more for one contact, only one instance of it is written into the master user doc. The actual count can then be fetched on the client, or by one more function triggered on updates to the user doc. This is a bit simpler than keeping eventIDs.
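The idempotency property of FieldValue.arrayUnion() can be modeled in plain code: re-applying the same union is a no-op, so a retried trigger cannot inflate the count. arrayUnionLocal below is a local model of the semantics, not the SDK call.

```typescript
// Local model of FieldValue.arrayUnion() semantics: elements are appended
// only if not already present, so repeated invocations are idempotent.
function arrayUnionLocal(current: string[], ...elements: string[]): string[] {
  const result = [...current];
  for (const el of elements) {
    if (!result.includes(el)) result.push(el);
  }
  return result;
}
```

The contact count is then simply the length of the array field, and a duplicate trigger delivery for the same contact leaves it unchanged.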
Problem with importing 10k+ contacts
This one is more philosophical.
If I understand correctly, the problem is that a user performs 10k writes, then those 10k writes trigger 10k functions, which perform an additional 10k writes to the master doc (and the same number of reads if they use an eventIDs document)?
You could make a special subcollection just for importing multiple contacts into your DB. Instead of writing 10k docs to the DB, the client would create one big document with 10k contact fields, which triggers a cloud function. That function would read it all and make the necessary 10k contact writes plus 1 write to the master doc with all the arrayUnions. You would just need to think about how to prevent the 10k per-contact function writes (by adding a special metadata field like your bulkAdd).
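A sketch of how the suggested import function could expand one big import document into individual, trigger-safe contact writes; all names here (expandImportDoc, PlannedWrite) are hypothetical.

```typescript
interface ContactData {
  name: string;
  phone?: string;
}

// A single "import" document holds all contacts keyed by contact ID.
type ImportDoc = Record<string, ContactData>;

interface PlannedWrite {
  path: string;                             // e.g. users/{uid}/contacts/{contactId}
  data: ContactData & { bulkAdd: boolean }; // bulkAdd tells the trigger to skip the increment
}

// Hypothetical expansion step: one big import doc becomes N contact writes,
// each tagged bulkAdd so the per-contact trigger does not touch the counter.
// The master doc would then receive a single arrayUnion of all the IDs.
function expandImportDoc(uid: string, doc: ImportDoc): PlannedWrite[] {
  return Object.entries(doc).map(([contactId, data]) => ({
    path: `users/${uid}/contacts/${contactId}`,
    data: { ...data, bulkAdd: true },
  }));
}
```

One caveat worth noting: a single Firestore document is capped at 1 MiB, so a very large import may need to be split across several import documents.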
This is just an opinion.

Related

Using Firestore Triggers to Manage User Document Count

If every document in a collection is a user resource that is limited, how can you ensure the user does not go over their assigned limit?
My first thought was to take advantage of Firestore triggers to avoid building a real backend, but the triggers sometimes fire more than once even if the input data has not changed. I was comparing the new doc to the old doc and taking action if certain keys did not match, but if GCP fires the same function twice I get double the result; in this case, the counts get incremented or decremented twice.
The Firestore docs state:
Events are delivered at least once, but a single event may result in multiple function invocations. Avoid depending on exactly-once mechanics, and write idempotent functions.
So in my situation the only solution I can think of is saving the event IDs somewhere and ensuring they did not fire already. Or, even worse, doing a read on each call to count the current docs and adjusting accordingly (increasing read costs).
Whats a smart way to approach this?
If re-invocations (which, while possible, are quite uncommon) are a concern for your use case, you could indeed store the ID of the invocation event, or something less fine-grained, like (depending on the use case) the source document ID.

What constitutes a write action in Firestore?

I'm currently developing a Flutter web application using Firestore for data persistence. The app is not live in production, so I'm the only one accessing this backend. There is only one collection, which holds a single document with many nested fields (6 levels deep). My understanding from looking at https://firebase.google.com/docs/firestore/pricing is that reads are counted per document, so every time I reload my app it should count as one read. Yet in the 4 hours since I started working today I have already hit 1.7K reads (as reported in the usage tab). I know I haven't reloaded the app that many times, and there's also no hidden loop that calls the collection multiple times.
This is the Flutter code that calls Firestore:
final sourceRef = FirebaseFirestore.instance.collection("source");
var data = await sourceRef.doc("stats").get();
What am I missing please?
According to the Firebase pricing documentation, reads, writes, and deletes are charged as follows:
You are charged for each document read, write, and delete that you perform with Cloud Firestore.
Charges for writes and deletes are straightforward. For writes, each set or update operation counts as a single write.
Meaning that creating one document is one write. If the same document is updated later, Firestore counts it as one more write.
Here is a more detailed table that you can use for billing, and an example.
It is recommended to check individual product usage in the "Usage" tab of the Firebase console, as this can narrow down which product is causing the elevated usage you are seeing.
I would also highly recommend adding read and write logging to your application; that way, you can monitor how many reads and writes you actually perform.

Trigger function on batch create with firebase

In my app, I have two ways of creating users.
One is a singular add, which triggers a cloud function onCreate to send an email and do some other logic.
The other is a batch add, which ultimately triggers the same function for each added document.
The question is: how can I trigger a different function when users are added by a batch?
I looked into the Firebase documentation and it doesn't seem to have this feature. Am I wrong?
This would greatly help reduce the number of reads, and I could bulk-send emails to the added users instead of sending them one by one.
Cloud Functions has only one trigger for document creation (onCreate).
What you can do is have two different functions with the same trigger and differentiate between the two creation methods in code.
This can be done by adding two more values to each document:
creation_method
batch
With creation_method you can evaluate the value on each document to decide whether execution continues or stops at that point.
batch can be used to identify the whole batch the document was created in.
For creation_method I recommend three different values:
singular
batch_normal
batch_final
and for batch, just a batchId.
The function for singular creation verifies that the method is singular, and that's it.
The batch function should continue only on batch_final and then fetch all the documents that share the same batchId.
Note that this approach will not reduce reads: reads are billed per document read, so unless you depend on additional documents the number of reads will be the same.
As a workaround, if you want to reduce what you are billed for reads, you could switch to the Realtime Database: the triggers you mentioned also exist there, and it has the advantage that it doesn't bill for reads.
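The dispatch described above can be sketched as a pair of pure decision functions. The field names mirror the suggested values; this is a sketch of the convention, not a Firebase API.

```typescript
type CreationMethod = "singular" | "batch_normal" | "batch_final";

interface UserDoc {
  creation_method: CreationMethod;
  batch?: string; // batchId shared by all documents of one bulk add
}

// The singular-creation function acts only on singularly created docs.
function shouldRunSingularLogic(doc: UserDoc): boolean {
  return doc.creation_method === "singular";
}

// The batch function acts once per batch: only the document marked
// batch_final proceeds, and it would then query all documents sharing
// the same batchId to process (e.g. email) them together.
function shouldRunBatchLogic(doc: UserDoc): boolean {
  return doc.creation_method === "batch_final";
}
```

Both functions still fire once per created document; the early return just keeps the skipped invocations cheap.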

Do Firestore Function Triggers count as reads?

I know what you are probably thinking, "why does it matter? Don't try to over-complicate it just to optimize pricing". In my case, I need to.
I have a collection with millions of records in Firestore, and each document gets updated quite often. Every time one gets updated, I need to do some data cleaning (and more), so I have a function triggered by onUpdate that does that. The function receives two parameters: the document before the update and the document after the update.
My question is:
Because the document is being passed as an argument, does that count as a database read?
The event generated by Cloud Firestore and sent to Cloud Functions should not count as an extra read beyond what was done by the client to initially trigger that event.

What is the most cost-efficient method of making document writes/reads from Firestore?

Firebase's Cloud Firestore gives you limits on the number of document writes, reads, and deletes. For example, the Spark plan (free) allows 50K reads and 20K writes a day. Estimating how many writes and reads you will perform is obviously important when developing an app, as you will want to know the potential costs incurred.
Part of this estimation is knowing exactly what counts as a document read/write. This part is somewhat unclear from searching online.
One document can contain many different fields, so if an app is designed such that user actions during a session require the fields within a single document to be updated, would it be more cost-efficient to update all the fields in one single document write at the end of the session, rather than writing the document every single time the user updates one field?
Similarly, would it not make sense to read the document once at the start of a session, getting the values of all fields, rather than reading them when each is needed?
I appreciate that this method would lead to the user seeing slightly out-of-date field values, and to the database not being updated promptly, but if such things aren't too much of a concern to you, couldn't it reduce your reads/writes by a large factor?
This all depends on what counts as a document write/read (does writing 20 fields within the same document in one go count as 20 writes?).
The cost of a write operation has no bearing on the number of fields you write. It's purely based on the number of times you call update() or set() on a document reference, whether independently, in a transaction, or in a batch.
If you choose to write each N fields using N separate updates, then you will be charged N writes. If you choose to write N fields using 1 update, then you will be charged 1 write.
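That billing rule can be stated as a tiny cost model: the charge depends only on how many update()/set() calls you make, not on how many fields each call touches. billedWrites is an illustrative name, not an SDK function.

```typescript
// Write cost model per the answer above: each set()/update() call is one
// billed write, regardless of how many fields it touches.
function billedWrites(updateCalls: { fieldsTouched: number }[]): number {
  return updateCalls.length; // fieldsTouched is intentionally ignored
}
```

So batching a session's 20 field changes into one update at the end costs 1 write, while flushing each field change as its own update costs 20.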
