Let's say there are 2 Firebase writes that go together, such as "liking a post" and "adding my uid to the list of people that liked the post."
Is it better to chain together the actions in my client code (adding the second write in the completion block of the first)?
Or is it better to use a cloud function that does the second write that is triggered on the first?
I'm asking on a cost standpoint, and how susceptible to hacking is client code? Is it easy for the client to disallow the second write after the first, especially in web applications?
The idea behind using cloud functions in this case would be to take as much processing as possible out of your app, not only for cost and security but also to make it more efficient and fast.
The second option is definitely more secure, since the Cloud Function would be a event triggered function, making it "uninvocable" by the malicious users.
As per costs, for Firestore either options would represent 2 writes, but the second option would represent also a cost of Compute Time on your Cloud Function, you can get more details here. There is a free tier for Cloud Functions, but as you app scale this may become significant, so that's something to be considered.
Related
I remember to have read an article where it was explained that Cloud Functions are not guaranteed to be executed and especially in the right order. I can't find any sources on this anymore.
Is this still recent information?
I am aware that the start of a function can take a couple seconds, especially when cold starting the function.
Could I reliably increment a number each time a document is created in a specific Firestore collection without getting my numbers mixed up? I know this is done often but I've never seen information on whether or not it is safe to do.
Following up on question one, are there red flags when using Cloud Functions for payment backend services?
Can I be sure that Cloud Functions are executed in the order that they were triggered i.e. are they queued or executed in parallel?
Could I reliably increment a number each time a document is created in a specific Firestore collection without getting my numbers mixed up?
You can certainly write code to do that. You will need to keep track of a running count of documents in another document, and use a transaction to keep it up to date.
I don't recommend doing this. It's kind an anti-pattern in Firestore to impose sequentially increasing numbers for documents in a collection. If you want time-based ordering, you should consider using a timestamp instead.
Can I be sure that Cloud Functions are executed in the order that they were triggered i.e. are they queued or executed in parallel?
Cloud Functions provides absolutely no guarantee that functions invocations will happen in any order. They are asynchronous and can execute in parallel on multiple server instances, depending on the load applied to the function.
I strongly suggest reading through the documentation to understand the execution environment provided by Cloud Functions.
I need to keep track of the number of photos I have in a Photos collection. So I want to implement an Aggregate Query as detailed in the linked article.
My plan is to have a Cloud Function that runs whenever a Photo document is created or deleted, and then increment or decrement the aggregate counter as needed.
This will work, but I worry about running into the 1 write/document/second limit. Say that a user adds 10 images in a single import action. That is 10 executions of the Cloud Function in more-or-less the same time, and thus 10 writes to the Aggregate Query document more-or-less at the same time.
Looking around I have seen several mentions (like here) that the 1 write/doc/sec limit is for sustained periods of constant load, not short bursts. That sounds reassuring, but it isn't really reassuring enough to convince an employer that your choice of DB is a safe and secure option if all you have to go on is that 'some guy said it was OK on Google Groups'. Is there any official sources stating that short write bursts are OK, and if so, what definitions are there for a 'short burst'?
Or are there other ways to maintain an Aggregate Query result document without also subjecting all the aggregated documents to a very restrictive 1 write / second limitation across all the aggregated documents?
If you think that you'll see a sustained write rate of more than once per second, consider dividing the aggregation up in shards. In this scenario you have N aggregation docs, and each client/function picks one at random to write to. Then when a client needs the aggregate, it reads all these subdocuments and adds them up client-side. This approach is quite well explained in the Firebase documentation on distributed counters, and is also the approach used in the distributed counter Firebase Extension.
I need to delete very large collections in Firestore.
Initially I used client side batch deletes, but when the documentation changed and started to discouraged that with the comments
Deleting collections from an iOS client is not recommended.
Deleting collections from a Web client is not recommended.
Deleting collections from an Android client is not recommended.
https://firebase.google.com/docs/firestore/manage-data/delete-data?authuser=0
I switched to a cloud function as recommended in the docs. The cloud function gets triggered when a document is deleted and then deletes all documents in a subcollection as proposed in the above link in the section on "NODE.JS".
The problem that I am running into now is that the cloud function seems to be able to manage around 300 deletes per seconds. With the maximum runtime of a cloud function of 9 minutes I can manage up to 162000 deletes this way. But the collection I want to delete currently holds 237560 documents, which makes the cloud function timeout about half way.
I cannot trigger the cloud function again with an onDelete trigger on the parent document, as this one has already been deleted (which triggered the initial call of the function).
So my question is: What is the recommended way to delete large collections in Firestore? According to the docs it's not client side but server side, but the recommended solution does not scale for large collections.
Thanks!
When you have too muck work that can be performed in a single Cloud Function execution, you will need to either find a way to shard that work across multiple invocations, or continue the work in a subsequent invocations after the first. This is not trivial, and you have to put some thought and work into constructing the best solution for your particular situation.
For a sharding solution, you will have to figure out how to split up the document deletes ahead of time, and have your master function kick off subordinate functions (probably via pubsub), passing it the arguments to use to figure out which shard to delete. For example, you might kick off a function whose sole purpose is to delete documents that begin with 'a'. And another with 'b', etc by querying for them, then deleting them.
For a continuation solution, you might just start deleting documents from the beginning, go for as long as you can before timing out, remember where you left off, then kick off a subordinate function to pick up where the prior stopped.
You should be able to use one of these strategies to limit the amount of work done per functions, but the implementation details are entirely up to you to work out.
If, for some reason, neither of these strategies are viable, you will have to manage your own server (perhaps via App Engine), and message (via pubsub) it to perform a single unit of long-running work in response to a Cloud Function.
I tried a couple of my trigger functions as https.onCall and called them after promise return and so far they work really well and faster than the triggers.
What's the catch? Are they also affected by the cold starts too?
If not, then unless it's cron job or lack of support of app language, why should anyone use use a trigger function at all?
All Cloud Functions are affected by cold starts. This is how all serverless function architectures work. In order to scale down to zero (so you pay nothing if you use nothing), all server instances must be able to be decommissioned. Cold start up cost is paid when a new server instance is allocated, so going from zero to one will cost you one cold start.
You haven't defined what a "trigger function" is, so I'll assume you mean a "background function" which triggers in response to events that occur within your project.
Background functions are absolutely required when you want to have some work performed in response to those changes when you can't trust the client to perform that work directly. This is important to maintain data consistency, and also to prevent having to duplicate logic among all your different clients that are all doing the same thing. This also allows you to ship new features and bug fixes without having to ship new client code, which can be difficult and time-consuming.
After watching a fair amount of youtube videos, it seems that Google is advocating for multipath updates when changing data stored in multiple places, however, The more I've messed with cloud functions, it seems like they're and even more viable option as they can just sit in the back and listen for changes to a specific reference and push changes as needed to the other references in real time. Is there a con to going this route? Just curious as to why Google doesn't recommend them for this use case.
NEWER UPDATE: Literally as I was writing this, I received a response from Google regarding my issues. It's too late to turn our apps direction around at this point but it may be useful for someone else.
If your function doesn't return a value, then the server doesn't know how long to wait before giving up and terminating it. I'd wager a quick guess that this might be why the DB calls aren't getting invoked.
Note that since DatabaseReference.set() returns a promise, you can simply return that if you want.
Also, you may want to add a .catch() and log the output to verify the set() op isn't failing.
~firebase-support#google.com
UPDATE: My experience with cloud functions in the last month or so has been sort of a love-hate. A lot of our denormalized data relied on Cloud Functions to keep everything in sync. Unfortunately (and this was a bad idea from the start) we were dealing with transactional/money data and storing that in multiple areas was uncomfortable. When we started having issues with Cloud Functions, i.e. the execution of them on a DB listener was not 100% reliable, we knew that Firebase would not work at least for our transaction data.
Overall the concept is awesome. They work amazingly well when they trigger, but due to some inconsistencies in triggering the functions, they weren't reliable enough for our use case.
We're currently using SQL for our transactional data, and then store user data and other objects that need to be maintained real-time in Firebase. So far that's working pretty well for us.