I'm interested in evaluating the performance of using Cloud Functions for Firebase, so I've set up a simple function with basically the same code as this example from Firebase. The function simply listens for new messages at a certain node in Firebase and then sends a push notification to the receiver of that message.
Next, I wrote a script that bombards the messages node by adding x new children there every second. I found that the function seems to be able to handle up to about 40 new messages per second; anything more than that, and the notifications start to lag further and further behind. This seems like pretty poor performance, too poor for us to use in production, and I wonder what the bottleneck might be here. Of course, Functions are still in beta, but according to the documentation, each function should be able to run 400 concurrent invocations.
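For reference, the function is roughly this shape (a simplified sketch; the database paths, token lookup, and notification payload stand in for the sample's real code):

    const functions = require('firebase-functions');
    const admin = require('firebase-admin');
    admin.initializeApp();

    // Fires once for every new child added under /messages.
    exports.sendNotification = functions.database
      .ref('/messages/{messageId}')
      .onCreate(async (snapshot) => {
        const message = snapshot.val();

        // Look up the receiver's device token (simplified; the real code
        // follows the Firebase sample).
        const tokenSnap = await admin.database()
          .ref(`/users/${message.receiverId}/fcmToken`)
          .once('value');

        // Send the push notification to that device.
        return admin.messaging().send({
          token: tokenSnap.val(),
          notification: { title: 'New message', body: message.text },
        });
      });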
I have a good old-fashioned piece of code that takes some inputs, runs some code, and produces outputs for a variety of tasks. This code runs in a function, and it should be triggered by a UI action. It is relevant to store a history of these runs per user, to list them in a specific view of the UI and, for returning users, to put them back where they left off.
Since the Firestore documentation and videos strongly encourage us to default to it for data exchange with the backend, I of course decided to entertain the idea before just moving on to simply implementing a callable.
First, it made sense to me to model the data as follows:
/users
  id
    /taskRuns
      id
        inputs
        outputs
        /history
          id
            date
            ...outputs
and make the data flow as follows:
The UI writes to the inputs field and listens to the document.
The function is triggered (sketched after this list). It runs happily and writes to outputs.
A second function listening to /taskRuns copies the inputs and outputs over to the corresponding /history subcollection.
The UI reacts to the change.
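For concreteness, I picture the triggered function looking roughly like this (just a sketch; doWork is a placeholder for my actual task code):

    const functions = require('firebase-functions');

    exports.runTask = functions.firestore
      .document('users/{uid}/taskRuns/{runId}')
      .onWrite(async (change) => {
        const after = change.after.exists ? change.after.data() : null;

        // Only act when inputs are present and outputs are not,
        // so our own write below doesn't re-trigger the work.
        if (!after || !after.inputs || after.outputs) return null;

        const outputs = await doWork(after.inputs); // placeholder for the real task
        return change.after.ref.update({ outputs });
      });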
At this point, a couple of red flags are raised in my mind:
Since the UI has to both read and write /taskRuns, a malicious user could technically trigger infinite loops here. I'm aware that the same thing is possible with a callable, but it somehow seems even easier in this scenario.
I lose the ability to do timeout and error handling the way one would with a simple API call.
However, I also understand that:
My function needs to write to the database anyway to store the outputs, even if it were returning them à la API.
The logic for who's responsible for what is somewhat naturally broken down into pieces.
So: am I looking at this wrong? Is there a world in which this approach would make sense over the callable?
The approach you're describing uses the database as a queue/cache/proxy for your backend functionality. I regularly use it myself with either Realtime Database or Firestore to expose backend functionality, because it:
Works more gracefully when there's intermittent loss of internet connection. Where a direct call to Cloud Functions would have to implement its own retries, the database SDKs already handle this scenario.
Uses the database as a proxy/cache for the result, so that repeated loads of the data don't cause additional calls to my Cloud Functions.
By setting a maximum number of instances on my Cloud Function, I can prevent overloading any legacy infrastructure my code calls, and the database becomes my queue of pending requests.
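As a rough sketch of that setup, assuming a Firestore requests collection and a hypothetical callLegacySystem helper:

    const functions = require('firebase-functions');

    // At most 5 instances run concurrently; events beyond that are held
    // and retried by the platform, so the documents already written to
    // Firestore effectively form the queue of pending requests.
    exports.processRequest = functions
      .runWith({ maxInstances: 5 })
      .firestore.document('requests/{requestId}')
      .onCreate(async (snapshot) => {
        const request = snapshot.data();
        const result = await callLegacySystem(request); // hypothetical legacy call
        return snapshot.ref.update({ result, status: 'done' });
      });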
The pattern is also used in this video from Cloud Next 2019: Serverless in real life: a case study in the travel industry and this talk from Google I/O: Architecting for Data Contention in a Realtime World with Firebase.
That said, many of my teammates only ever use callable functions, so it's definitely a matter of preference and the requirements of your use case.
I need a bit of expert advice. I'm using Firebase Cloud Functions to automate a few things, using this brilliant Node.js package: https://github.com/jdgamble555/adv-firestore-functions.
It runs on the onWrite trigger; as I understand it, when onWrite fires, the function executes for every document or child node that is created, updated, or deleted. The package has taken care of many things, but my concern is: does executing the functions multiple times do any harm? I'm already making sure, using condition checks, that the function doesn't hit Firestore when it isn't required, so all the functions execute (as I can see in the log) but don't write to or update the Firestore DB unnecessarily.
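For illustration, my condition checks look roughly like this (a simplified sketch with example paths and fields, not the package's actual code):

    const functions = require('firebase-functions');

    exports.onPostWrite = functions.firestore
      .document('posts/{postId}')
      .onWrite(async (change) => {
        const before = change.before.exists ? change.before.data() : null;
        const after = change.after.exists ? change.after.data() : null;

        // Bail out on deletes and on writes that didn't change the
        // field we care about, so we never hit Firestore needlessly.
        if (!after) return null;
        if (before && before.title === after.title) return null;

        // Only now do the actual Firestore write.
        return change.after.ref
          .collection('meta').doc('index')
          .set({ title: after.title }, { merge: true });
      });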
I'm worried that if all the functions execute all the time, I'll exhaust my limits quickly (right now I'm testing on the Firebase Emulator), especially when the user base increases.
Can anything be done to reduce these calls, or is this normal?
As per the Firebase documentation, the first 2 million invocations per month are free; after that, it's $0.40 for every million invocations, so, for example, 10 million invocations in a month would cost (10 - 2) × $0.40 = $3.20. There is also a resource limit for each function call. https://cloud.google.com/functions/pricing#free_tier
To my knowledge and in my experience, this is normal. Just make sure that your code does not make any infinite function calls or database reads/writes.
I'm also using Cloud Functions for my social media platform, which likewise uses triggers to execute and write to the database based on conditions. It has never gone beyond the free quota.
So, I'm making a multiplayer mobile game using Xamarin and Firebase. In the game there are many moments when I let players decide what to do and send their decision to the server (by putting a decision enum in a player-specific Firebase database node). The decision is time-limited (a short time, no longer than 20 s).
I set a listener on that specific node in my Firebase Functions to check whether all players have decided, or whether a decision arrives after the deadline. But I need to deal with the case where some players send their decision in time (so the server will not yet execute the next action) and one player simply never sends his decision (leaves the game or something), so the server is never poked again to check the deadline and invoke the function.
That's why I'm looking for something else. I found a method for scheduling functions using crontab, but the minimal time interval there seems to be minutes, which is far too long for me.
My second idea is to wait out that specific interval inside the previous function invocation, but that seems like a bad way to deal with this.
What is the best way to dynamically invoke short-interval scheduled Firebase Functions?
The best way to schedule a Cloud Function to run at a specific time is through Cloud Tasks. See Doug's blog post for a full description of this: How to schedule a Cloud Function to run in the future with Cloud Tasks (to build a Firestore document TTL).
That said, I regularly use setTimeout in my own Cloud Functions too when I need to delay an operation for a short period of time. Just keep in mind that you pay for the seconds the function is sleeping, so cost-wise you'll want to trade that time off against what another invocation would cost.
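For reference, creating such a task looks roughly like this (the project ID, queue name, region, and handler URL below are placeholders for your own values):

    const { CloudTasksClient } = require('@google-cloud/tasks');
    const client = new CloudTasksClient();

    // Schedule an HTTP-triggered function to run delaySeconds from now.
    async function scheduleDeadlineCheck(gameId, delaySeconds) {
      const parent = client.queuePath('my-project', 'us-central1', 'deadline-queue');
      return client.createTask({
        parent,
        task: {
          httpRequest: {
            httpMethod: 'POST',
            url: 'https://us-central1-my-project.cloudfunctions.net/checkDeadline',
            headers: { 'Content-Type': 'application/json' },
            body: Buffer.from(JSON.stringify({ gameId })).toString('base64'),
          },
          // Fires ~20s from now, well under cron's one-minute granularity.
          scheduleTime: { seconds: Math.floor(Date.now() / 1000) + delaySeconds },
        },
      });
    }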
So for now I've decided to use setTimeout. The free Firebase plan seems to limit only the number of function invocations, not their running time, so this shouldn't be a problem. Despite this, I'm still waiting for advice from you.
I am designing a system; one component of it gives me approximately 50 outputs. I then start up a VM instance for each of the 50 outputs, pass the outputs as inputs, and run a process, which can take 10-60 minutes, on each of the instances.
Currently, when I get my output data, I add each output to a message queue (RabbitMQ) and then send an HTTP request to a Cloud Function. This Cloud Function creates 'self-destructing' instances for each output. The HTTP request carries the "number_of_req_instances", and each instance acts as a consumer, picking one task from the queue.
I was wondering: is there any way to send the HTTP request from RabbitMQ? Or what's the best practice for handling this sort of use case? I'm not entirely happy that my HTTP request to create instances and the population of my queue are two separate steps.
I not only need to pass the outputs as inputs, but I also need to start up the instances. I also like the fact that RabbitMQ handles message acknowledgement quite well, so I'm keen to keep that as part of the system. I could use HTTP requests to pass all the information and feed it to the metadata of the instances, but that's not ideal, since the HTTP response would be direct and I wouldn't know whether any of the tasks failed, as opposed to using RabbitMQ.
Any suggestions?
You could look into a solution where a Cloud Function is triggered by a Pub/Sub message. The output would be sent to a Pub/Sub topic, and that topic is set as a trigger that launches the function whenever a message is published to it. The Cloud Function would ingest the Pub/Sub message containing the output and process it.
You may look into the documentation for Cloud Functions triggered by Pub/Sub. There are also some architecture references you might find interesting, e.g. the serverless event-driven pattern.
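With the Firebase SDK, such a trigger would look roughly like this (the topic name and processing logic are just examples):

    const functions = require('firebase-functions');

    // Runs once per message published to the 'instance-outputs' topic.
    exports.processOutput = functions.pubsub
      .topic('instance-outputs')
      .onPublish((message) => {
        const output = message.json; // the payload the producer published
        // ...start the instance / process the output here...
        console.log('Processing output', output);
      });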
Let's say I have a Firebase Cloud Function, called by a cron job, that produces 30+ tasks every time it's invoked.
These tasks are quite slow (5-6 seconds each on average), and I can't process them directly in the original function because it would time out.
So the solution would be to invoke another "worker" function, once per task, to complete the tasks independently and write the results to a database. So far I can think of three strategies:
Pub/Sub messages. That would be amazing, but it seems that you can only listen for Pub/Sub messages from within a Cloud Function, not create one. Resorting to external solutions, like having a GAE instance, is not an option for me.
Call the worker HTTP-triggered Firebase Cloud Function from the first one. That won't work, I think, because I would need to wait for a response from all the invoked worker functions after they finish and send it, and my original function would time out.
Append tasks to a Realtime Database list, then have a worker function triggered by each database change. The worker has to delete the task from the queue afterwards. That would probably work, but it feels like there are a lot of moving parts for a simple problem. For example, what if the worker throws? Another cron job to 'clean' the DB would be needed, etc.
Another solution that comes to mind is firebase-queue, but its README explicitly states:
"There may continue to be specific use-cases for firebase-queue,
however if you're looking for a general purpose, scalable queueing
system for Firebase then it is likely that building on top of Google
Cloud Functions for Firebase is the ideal route"
It's not officially supported, and they're practically saying that we should use Functions instead (which is what I'm trying to do). I'm a bit nervous about using a library in prod that might be abandoned tomorrow (if it isn't already), and I'd like to avoid going down that route.
Sending Pub/Sub messages from Cloud Functions
Cloud Functions are run in a fairly standard Node.js environment. Given the breadth of the Node/NPM ecosystem, the amount of things you can do in Cloud Functions is quite broad.
it seems that you can only listen on pubsub messages from within a Cloud Function, not create one
You can publish new messages to Pub/Sub topics from within Cloud Functions using the regular Node.js module for Pub/Sub. See the Cloud Pub/Sub documentation for an example.
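A minimal sketch with the official client (the topic name is an example):

    const { PubSub } = require('@google-cloud/pubsub');
    const pubsub = new PubSub();

    // Inside the cron-triggered function: fan the tasks out, one message each.
    async function fanOut(tasks) {
      const topic = pubsub.topic('worker-tasks');
      await Promise.all(tasks.map((task) => topic.publishMessage({ json: task })));
    }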
Triggering new actions from Cloud Functions through Database writes
This is also a fairly common pattern. I usually have my subprocesses/workers clean up after themselves at the same moment they write their result back to the database. This works fine in my simple scenarios, but your mileage may of course vary.
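For example, a worker along these lines removes its queue entry in the same write that stores its result (a sketch; processTask and the paths are placeholders):

    const functions = require('firebase-functions');
    const admin = require('firebase-admin');
    admin.initializeApp();

    exports.worker = functions.database
      .ref('/queue/{taskId}')
      .onCreate(async (snapshot, context) => {
        const task = snapshot.val();
        const result = await processTask(task); // placeholder for the slow work

        // Store the result and delete the queue entry in one multi-path
        // update, so a finished task can never linger in the queue.
        return admin.database().ref().update({
          [`/results/${context.params.taskId}`]: result,
          [`/queue/${context.params.taskId}`]: null,
        });
      });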
If you're having a concrete cleanup problem, post the code that reproduces the problem and we can have a look at ways to make it more robust.