Apparently some of onCreate/onDelete events our cloud function is triggered by are received more than once!
We've observed them arriving even 3 times a few seconds apart from each other spread between instances of the cloud function. Is it a normal behavior or there is something we've been doing wrong?
Have a look at the content of this "Firebase Google Group" post, that I paste below. It was written in August 2018 but is still fully valid at the time of writing this "copy/paste answer".
Cloud Functions typically
guarantees that your functions are run "at least once", meaning that
it's very possible (but typically rare) that an event may get
delivered to your function more than once. To deal with this, your
functions should be "idempotent", meaning that the receipt of the same
event multiple times should result in no further changes to whatever
it is you need to update. This can be kind of challenging, but it's
one of the properties of serverless systems that needs to be dealt
with, if it's problematic for your app.
https://cloud.google.com/functions/docs/bestpractices/tips#write_idempotent_functions
Related
Let's say I have a collection called persons and another collection called cities with a field population. When a Person is created in a City, I would like to increment the population field in the corresponding city.
I have two options.
Create a onCreate trigger function. Find the city document and increment using FieldValue.increment(1).
Create an HTTPS callable cloud function to create the person. The cloud function executes a transaction in which the person is created and the population is incremented.
The first one is simpler and I am using it right now. But, I am wondering if there could be cases where the onCreate is not called due to some glitch...
I am thinking of moving to the second option. I am wondering if there are any disadvantages. Does HTTPS callable function cost more?
The only problem I see with the HTTPS callables would be that if something fails you would need to handle that on your client side. That would be (at least for me) a little bit to much logic for the client side.
What I can recommend you after almost 4 years experience with exactly that problem is a solution with a virtual queue. I had a long dicussion on that theme here and even with the Firebase ppl on the last in person Google IO and Firebase Summit.
Our problem was that there where those glitches and even if they happend sometimes the changes and transaction failed due to too much requests. After trying every offical recommendation like the shard counters etc. we ended up creating a virtual queue where each onCreate adds an entry to just a Firestore or RTD list/collection and another function that runs eaither by crone or another trigger (that doesn't matter). That cloud function handles each entry in the queue one by one and starts again for each of them to awoid timouts and memeroy limits. We made sure one handler/calculation is enought for a single function to handle it.
This method was the only bullet proof one that could handle thousands of new entries in a second without having an issue. The only downside is that it takes more time than an usual trigger because each entries is calculated one by one. If your calculations are smaller you could do them in batches (that is how we started to).
I remember to have read an article where it was explained that Cloud Functions are not guaranteed to be executed and especially in the right order. I can't find any sources on this anymore.
Is this still recent information?
I am aware that the start of a function can take a couple seconds, especially when cold starting the function.
Could I reliably increment a number each time a document is created in a specific Firestore collection without getting my numbers mixed up? I know this is done often but I've never seen information on whether or not it is safe to do.
Following up on question one, are there red flags when using Cloud Functions for payment backend services?
Can I be sure that Cloud Functions are executed in the order that they were triggered i.e. are they queued or executed in parallel?
Could I reliably increment a number each time a document is created in a specific Firestore collection without getting my numbers mixed up?
You can certainly write code to do that. You will need to keep track of a running count of documents in another document, and use a transaction to keep it up to date.
I don't recommend doing this. It's kind an anti-pattern in Firestore to impose sequentially increasing numbers for documents in a collection. If you want time-based ordering, you should consider using a timestamp instead.
Can I be sure that Cloud Functions are executed in the order that they were triggered i.e. are they queued or executed in parallel?
Cloud Functions provides absolutely no guarantee that functions invocations will happen in any order. They are asynchronous and can execute in parallel on multiple server instances, depending on the load applied to the function.
I strongly suggest reading through the documentation to understand the execution environment provided by Cloud Functions.
So according to the Google Cloud docs:
Max inactivity time for background functions = 30 days
The maximum amount of time that a background function can be kept without any invocation. Functions that are not invoked even once during this time may enter a state in which new events will not trigger them anymore. If this happens, such functions have to be redeployed to start working again. Note: This inactive state is not reflected in the UI, CLI, or API in any way.
I have an application with close to 40 cloud functions and there doesn't seem to be an easy way for me to tell if a function is close to becoming inactive so I know to take action before it happens and I really don't want to filter through the logs in the console with each function every week to see when each was invoked last.
Outside of just doing a redeploy of my functions every month to insure they are fresh is there anyway to easily tell when a function is becoming stale in case it does happen so I only have to deploy said function before it becomes stale?
Also, for any firebase'r that might read this, is there any solutions possibly coming to this in the future?
Thanks in advance.
Sorry, there's no way to tell where your function is during its 30 day idle expiration. The good news is that the Cloud Functions team is working on removing this limitation, but there's no public timeline for that.
EDIT: This problem is resolved now for all newly deployed functions.
I tried a couple of my trigger functions as https.onCall and called them after promise return and so far they work really well and faster than the triggers.
What's the catch? Are they also affected by the cold starts too?
If not, then unless it's cron job or lack of support of app language, why should anyone use use a trigger function at all?
All Cloud Functions are affected by cold starts. This is how all serverless function architectures work. In order to scale down to zero (so you pay nothing if you use nothing), all server instances must be able to be decommissioned. Cold start up cost is paid when a new server instance is allocated, so going from zero to one will cost you one cold start.
You haven't defined what a "trigger function" is, so I'll assume you mean a "background function" which triggers in response to events that occur within your project.
Background functions are absolutely required when you want to have some work performed in response to those changes when you can't trust the client to perform that work directly. This is important to maintain data consistency, and also to prevent having to duplicate logic among all your different clients that are all doing the same thing. This also allows you to ship new features and bug fixes without having to ship new client code, which can be difficult and time-consuming.
After watching a fair amount of youtube videos, it seems that Google is advocating for multipath updates when changing data stored in multiple places, however, The more I've messed with cloud functions, it seems like they're and even more viable option as they can just sit in the back and listen for changes to a specific reference and push changes as needed to the other references in real time. Is there a con to going this route? Just curious as to why Google doesn't recommend them for this use case.
NEWER UPDATE: Literally as I was writing this, I received a response from Google regarding my issues. It's too late to turn our apps direction around at this point but it may be useful for someone else.
If your function doesn't return a value, then the server doesn't know how long to wait before giving up and terminating it. I'd wager a quick guess that this might be why the DB calls aren't getting invoked.
Note that since DatabaseReference.set() returns a promise, you can simply return that if you want.
Also, you may want to add a .catch() and log the output to verify the set() op isn't failing.
~firebase-support#google.com
UPDATE: My experience with cloud functions in the last month or so has been sort of a love-hate. A lot of our denormalized data relied on Cloud Functions to keep everything in sync. Unfortunately (and this was a bad idea from the start) we were dealing with transactional/money data and storing that in multiple areas was uncomfortable. When we started having issues with Cloud Functions, i.e. the execution of them on a DB listener was not 100% reliable, we knew that Firebase would not work at least for our transaction data.
Overall the concept is awesome. They work amazingly well when they trigger, but due to some inconsistencies in triggering the functions, they weren't reliable enough for our use case.
We're currently using SQL for our transactional data, and then store user data and other objects that need to be maintained real-time in Firebase. So far that's working pretty well for us.