Is Firebase Functions' execution speed consistent?

I'm interested in using Firebase Functions, but I can't find any reference on their execution speed.
So the question is: if I write a function (which is independent of the network and outside resources), will it take nearly the same execution time every time I run it? Is the execution speed consistent?

Each instance that Cloud Functions spins up to run your code will be an instance of the same container on the same hardware. Since there are no parallel executions of functions on the same instance, the functions all have access to the complete resources of their instance while they run.
The only way to change the performance is by changing the memory that is available in the container (which in turn also changes the CPU), but this is a configuration that you control and it applies to all instances of your function that start after you change it. For an overview of these instance types, see the table on the Cloud Functions pricing page.
And as Doug pointed out, if Cloud Functions needs to provision a new container for a function invocation there will be a delay that is known as a cold-start while the container is being set up.
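As a sketch of that configuration knob, assuming the firebase-functions v1 API (the function name and the values here are illustrative, not taken from the question), the memory class, and with it the CPU, is declared per function at deploy time:

```javascript
const functions = require('firebase-functions');

// Illustrative deployment configuration: raising memory also raises the CPU
// allocated to each instance. '2GB' and 'heavyTask' are example values.
exports.heavyTask = functions
  .runWith({ memory: '2GB', timeoutSeconds: 300 })
  .https.onRequest((req, res) => {
    res.send('done');
  });
```

The new setting applies only to instances started after the change is deployed.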

Related

How to store a sql database connection pool in Firebase functions?

On first invocation of the Firebase function I create a mysql connection pool (which is expensive) and store it in global scope. This works fine as long as there is a single instance serving requests. Assuming multiple functions instances can exist under load, what's a good practice to prevent numerous such pools from getting created?
There is a specific section in the Cloud Functions documentation: "Use global variables to reuse objects in future invocations".
There is no guarantee that the state of a Cloud Function will be preserved for future invocations. However, Cloud Functions often recycles the execution environment of a previous invocation. If you declare a variable in global scope, its value can be reused in subsequent invocations without having to be recomputed. This way you can cache objects that may be expensive to recreate on each function invocation.
Cloud Functions scales up to 1,000 instances per region.
SQL connections scale up to 32,000+.
Practically speaking, you're not going to run into upper limits on your db connection(s).
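A minimal sketch of that global-variables pattern (all names are illustrative; the pool object here is a stand-in for `mysql.createPool(...)`):

```javascript
// The expensive object lives in module scope, so a warm instance reuses it
// across invocations instead of rebuilding it on every request.
let pool = null;    // survives between invocations on the same instance
let creations = 0;  // counts how often the pool is actually built

function getPool() {
  if (!pool) {
    creations += 1;
    // Real code would be: pool = mysql.createPool({ connectionLimit: 10, ... });
    pool = { query: (sql) => `result of ${sql}` };
  }
  return pool;
}

// Simulated request handler, the way an onRequest function would use it:
function handler() {
  return getPool().query('SELECT 1');
}

handler();
handler();
handler();
console.log(creations); // 1: the pool is built once per instance, not per invocation
```

Each new instance (and each cold start) builds its own pool, which is why the instance limit times the pool size stays well under the connection ceiling.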

Firebase Cloud Functions min-instances setting seems to be ignored

Firebase announced in September 2021 that it is now possible to configure Cloud Functions autoscaling so that a certain number of instances is always kept running (https://firebase.google.com/docs/functions/manage-functions#min-max-instances).
I have tried to set this up, but I cannot get it to work:
First I set the number of minimum instances in the Google Cloud Console: Cloud Console screenshot
After doing this I expected one instance of that Cloud Function to be running at all times. The metrics of that function indicate that its instances were still scaled down to 0: Cloud Functions "Active Instances" metric
So to me it looks as if my setting is being ignored. Am I missing anything? The Google Cloud Console shows that the number of minimum instances has been set to 1, so it seems to know about the setting but to ignore it. Is this feature only available in certain regions?
I have also tried to set the number of minimum instances using the Firebase SDK for Cloud Functions (https://www.npmjs.com/package/firebase-functions). This gave me the same result; my setting is still ignored.
According to the documentation, the Active Instances metric shows the number of instances that are currently handling requests.
As stated in the documentation:
Cloud Functions scales by creating new instances of your function. Each of these instances can handle only one request at a time, so large spikes in request volume often cause longer wait times as new instances are created to handle the demand. Because functions are stateless, your function sometimes initializes the execution environment from scratch, which is called a cold start. Cold starts can take significant amounts of time to complete, so we recommend setting a minimum number of Cloud Functions instances if your application is latency-sensitive.
You can also refer to the Stack Overflow thread where it has been mentioned that:
Setting minInstances does not mean that there will always be that many active instances. Minimum instances are kept running idle (without CPU allocated), so they are not counted in Active Instances.
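For completeness, a sketch of setting the minimum through the Firebase SDK, assuming the firebase-functions v1 API (the function name and value are illustrative):

```javascript
const functions = require('firebase-functions');

// Illustrative deployment configuration: keeps one instance warm. Idle
// min-instances sit without CPU allocated, which is why they do not show
// up in the Active Instances metric.
exports.api = functions
  .runWith({ minInstances: 1 })
  .https.onRequest((req, res) => {
    res.send('ok');
  });
```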

Callback to cloud functions/cloud run if timeout happens?

I have some Cloud Run services and Cloud Functions that parse large numbers of files that users upload. Sometimes users upload an exceedingly large number of files, which causes these functions to time out even when I set them to their maximum runtime limits (15 minutes for Cloud Run and 9 minutes for Cloud Functions, respectively). I have a loading icon, backed by a database entry, that shows the progress of processing each uploaded batch of files; so if the function currently times out, the loading icon for that batch gets stuck in perpetuity, as the database is not updated after the function is killed.
Is there a way for me to create say a callback function to the Cloud Run/Functions to update the database and indicate that the parsing process failed if the Cloud Run/Functions timed out? There is currently no way for me to know a priori if the batch of files is too large to process, and clearly I cannot use a simple try/catch here as the execution environment itself will be killed.
One popular method is to expose a public-facing API endpoint that you can invoke with the remaining queued information. You should assume this endpoint can be discovered, so some sort of one-time token should be used to secure it. The details depend on factors such as how the files are uploaded and how the cloud trigger was handled, which may require storing that information in a database location to be retrieved later.
You can set a flag in the database before you start processing, then clear/delete the flag after processing succeeds. Then have another (scheduled) function regularly check the status of those flags.
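A minimal sketch of that flag idea (all names are illustrative; the in-memory Map stands in for Firestore/RTDB writes, and the sweep would run from a scheduled function):

```javascript
// Record when processing starts; a periodic sweeper marks any batch whose
// flag is older than the function's timeout as failed.
const db = new Map();

function startBatch(id, now) {
  db.set(id, { status: 'processing', startedAt: now });
}

function finishBatch(id) {
  db.set(id, { status: 'done' });
}

// Run this from a scheduled function, e.g. every few minutes.
function sweepStale(now, timeoutMs) {
  for (const [id, entry] of db) {
    if (entry.status === 'processing' && now - entry.startedAt > timeoutMs) {
      db.set(id, { status: 'failed' }); // unsticks the loading icon
    }
  }
}

startBatch('batch-1', 0);
startBatch('batch-2', 0);
finishBatch('batch-2');
sweepStale(10 * 60 * 1000, 9 * 60 * 1000); // 10 min elapsed, 9-min timeout
console.log(db.get('batch-1').status); // 'failed'
console.log(db.get('batch-2').status); // 'done'
```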
No such callback functionality exists for either product.
Serverless products are generally not meant to be used for batch processing where the batches can easily be larger than the limits of the system. They are meant for small bits of discrete work, such as simple API calls.
If you want to process larger amounts of data, consider first uploading it to Cloud Storage (which will accept files of any size), then sending a Pub/Sub message upon completion to a backend compute product that can handle the processing requirements (such as Compute Engine).
Direct answer. You might be able to achieve this by creating a filtered sink on the relevant Stackdriver (Cloud Logging) logs, where a Cloud Function timeout is recorded, so that matching log records are pushed into a Pub/Sub topic. On the other side of that topic you can have another Cloud Function that implements the desired failure handling.
Indirect answer. Without context, scope, and requirement details it is difficult to give a good suggestion, but based on some guesses I am not sure the design is optimal. Serverless services are meant to handle independent and relatively small chunks of data. If you have something large, you might use a first Cloud Function to divide it into reasonably small chunks, so they can be processed independently by a second Cloud Function. In your case, can you have one Cloud Function per file, for example? If a file is too large (a few GB, or dozens of GB), can it be saved to Cloud Storage and read/processed in chunks, so that the Cloud Functions are triggered from Cloud Storage? And so on. That approach should help, but it has a drawback: complexity increases, since you have to coordinate and monitor how the overall process is going.
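As a rough sketch of the sink approach (the sink name, project, topic, function name, and the exact filter text are all assumptions to adapt; verify the actual timeout log line in your own Cloud Logging entries before relying on the filter):

```shell
# Illustrative values throughout. Routes timeout log entries for one function
# into a Pub/Sub topic that a second function can subscribe to.
gcloud logging sinks create fn-timeout-sink \
  pubsub.googleapis.com/projects/my-project/topics/fn-timeouts \
  --log-filter='resource.type="cloud_function" AND resource.labels.function_name="parseFiles" AND textPayload:"timeout"'
```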

Is there any issue if Firebase Cloud Functions run multiple times (without writing to the Firestore db)?

I need a bit of expert advice. I'm using Cloud Functions to automate a few things, using this brilliant Node.js package: https://github.com/jdgamble555/adv-firestore-functions.
It runs on the onWrite trigger; as I understand it, when onWrite fires, the function executes for each document or child node that is created, updated, or deleted. The package takes care of many things, but my concern is: does executing the functions multiple times do any harm? I'm already using condition checks so that Firestore isn't hit when it's not required. So all the functions execute, as I can see in the log, but they don't write/update the Firestore db unnecessarily.
I'm worried that if all functions execute all the time, I will use up my limits quickly (right now I'm testing on the Firebase emulator), especially as the user base grows.
Can anything be done to reduce these calls, or is this normal?
As per the Firebase documentation, the first 2 million invocations per month are free; after that it is $0.40 per million invocations. There are also resource limits for each function call: https://cloud.google.com/functions/pricing#free_tier
In my knowledge and experience, this is normal. Just make sure that your code does not make any infinite function calls or database reads/writes.
I'm also using Cloud Functions for my social media platform, which likewise uses triggers to execute and write to the database based on conditions. It has never gone beyond the free quota.
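A small sketch of the kind of guard being described (all names are illustrative; `handled` is an in-memory Set here, whereas a real function would persist handled event IDs in Firestore, since trigger deliveries can be retried):

```javascript
// Skip work when nothing relevant changed, and record handled event IDs so
// a duplicate delivery of the same event doesn't repeat the billed write.
const handled = new Set();
let writes = 0;

function onWriteHandler(eventId, before, after) {
  if (handled.has(eventId)) return 'duplicate-skipped';
  handled.add(eventId);
  if (before.counter === after.counter) return 'no-op'; // condition check: no db hit
  writes += 1; // stand-in for a Firestore write
  return 'written';
}

console.log(onWriteHandler('e1', { counter: 1 }, { counter: 2 })); // 'written'
console.log(onWriteHandler('e1', { counter: 1 }, { counter: 2 })); // 'duplicate-skipped'
console.log(onWriteHandler('e2', { counter: 2 }, { counter: 2 })); // 'no-op'
console.log(writes); // 1
```

Invocations themselves are billed regardless, so this only trims the document reads/writes, which is usually where the cost is.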

How to invoke other Cloud Firebase Functions from a Cloud Function

Let's say I have a Firebase Cloud Function, called by a cron job, that produces 30+ tasks every time it's invoked.
These tasks are quite slow (5-6 seconds each on average) and I can't process them directly in the original function because it would time out.
So, the solution would be invoking another "worker" function, once per task, to complete the tasks independently and write the results in a database. So far I can think of three strategies:
Pubsub messages. That would be amazing, but it seems that you can only listen on pubsub messages from within a Cloud Function, not create one. Resorting to external solutions, like having a GAE instance, is not an option for me.
Call the worker http-triggered Cloud Function from the first one. That won't work, I think, because I would need to wait for a response from all the invoked worker functions after they finish, and my original function would time out.
Append tasks to a real time database list, then have a worker function triggered by each database change. The worker has to delete the task from the queue afterwards. That would probably work, but it feels there are a lot of moving parts for a simple problem. For example, what if the worker throws? Another cron to "clean" the db would be needed etc.
Another solution that comes to mind is firebase-queue, but its README explicitly states:
"There may continue to be specific use-cases for firebase-queue, however if you're looking for a general purpose, scalable queueing system for Firebase then it is likely that building on top of Google Cloud Functions for Firebase is the ideal route"
It's not officially supported, and they're practically saying that we should use Functions instead (which is what I'm trying to do). I'm a bit nervous about using a library in prod that might be abandoned tomorrow (if it isn't already) and would like to avoid going down that route.
Sending Pub/Sub messages from Cloud Functions
Cloud Functions are run in a fairly standard Node.js environment. Given the breadth of the Node/NPM ecosystem, the amount of things you can do in Cloud Functions is quite broad.
it seems that you can only listen on pubsub messages from within a Cloud Function, not create one
You can publish new messages to Pub/Sub topics from within Cloud Functions using the regular Node.js module for Pub/Sub. See the Cloud Pub/Sub documentation for an example.
Triggering new actions from Cloud Functions through Database writes
This is also a fairly common pattern. I usually have my subprocesses/workers clean up after themselves at the same moment they write their result back to the database. This works fine in my simple scenarios, but your mileage may of course vary.
If you're having a concrete cleanup problem, post the code that reproduces the problem and we can have a look at ways to make it more robust.
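A sketch of the fan-out half of this answer (names are illustrative; `publish` is injectable here so the pattern runs without GCP credentials, and in a real function it would be `pubsub.topic('worker-tasks').publishMessage(...)` from the @google-cloud/pubsub module):

```javascript
// Fan out: the cron-triggered function publishes one Pub/Sub message per
// task, and a separate pubsub-triggered worker function handles each one
// independently, so the original function returns immediately.
async function fanOutTasks(tasks, publish) {
  const ids = [];
  for (const task of tasks) {
    ids.push(await publish({ json: task })); // each resolves to a message ID
  }
  return ids;
}

// Stand-in publisher; swap in the real Pub/Sub client in production.
let counter = 0;
const fakePublish = async () => `msg-${++counter}`;

fanOutTasks([{ id: 1 }, { id: 2 }, { id: 3 }], fakePublish).then((ids) => {
  console.log(ids.length); // 3
});
```

Publishing is fast, so 30+ tasks fan out well within the original function's timeout, and each worker invocation gets its own 9-minute budget.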
