Cloud functions and Firebase Firestore with Idempotency - firebase

I'm using Firestore at beta version with Cloud Functions. In my app I need to trigger a function that listens for an onCreate event at /company/{id}/point/{id} and performs an insert (collection('event').add({...}))
My problem is: Cloud Functions with Firestore require an idempotent function. I don't know how to ensure that if my function triggers two times in a row with the same event, I won't add two documents with the same data.
I've found that context.eventId could handle that problem, but I can't see how to use it.
exports.creatingEvents = functions.firestore
  .document('/companies/{companyId}/points/{pointId}')
  .onCreate((snap, context) => {
    // some logic...
    return db.doc(`eventlog/${context.params.companyId}`).collection('events').add(data)
  })

Two things:
First check your collection to see if a document has a property with the value of context.eventId in it. If it exists, do nothing in the function.
If a document with the event id doesn't already exist, put the value of context.eventId in a property in the document that you add.
This should prevent multiple invocations of the function from adding more than one document for a given event id.
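A minimal sketch of that check, assuming `db` is an initialized Admin SDK Firestore instance and that each written document stores the event id in a field named `eventId` (a name chosen here for illustration). `addEventOnce` is a hypothetical helper, not part of any SDK. The check and the write are separate operations, so two overlapping invocations could in principle both pass the check.

```javascript
// Hypothetical helper: adds `data` under eventlog/{companyId}/events unless a
// document tagged with this eventId already exists.
// Assumption: `db` is an Admin SDK Firestore instance (admin.firestore()).
async function addEventOnce(db, companyId, eventId, data) {
  const events = db.doc(`eventlog/${companyId}`).collection('events');

  // Look for a document already written for this event.
  const existing = await events.where('eventId', '==', eventId).limit(1).get();
  if (!existing.empty) {
    return null; // Duplicate delivery: nothing to do.
  }

  // Store the event id alongside the data so later invocations can find it.
  return events.add({ ...data, eventId });
}

// Possible wiring inside the trigger from the question:
// .onCreate((snap, context) =>
//   addEventOnce(db, context.params.companyId, context.eventId, data));
```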

Why not set the document (indexing by the event id from your context) instead of creating it? This way if you write it twice, you'll just overwrite rather than create a new record.
https://firebase.google.com/docs/firestore/manage-data/add-data
This approach makes the write operation idempotent.
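As a rough sketch of this idea, assuming `db` is an initialized Admin SDK Firestore instance: the event id becomes the document id, so a repeated delivery of the same event writes to the same document. `setEventById` is a hypothetical helper name.

```javascript
// Hypothetical helper: writes the event under a document id equal to the
// event id, so a repeated delivery overwrites the same document.
// Assumption: `db` is an Admin SDK Firestore instance (admin.firestore()).
function setEventById(db, companyId, eventId, data) {
  return db
    .doc(`eventlog/${companyId}`)
    .collection('events')
    .doc(eventId)
    .set(data);
}

// Possible wiring inside the trigger from the question:
// .onCreate((snap, context) =>
//   setEventById(db, context.params.companyId, context.eventId, data));
```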

Related

Firebase cloud functions: How to wait for a document to be created and update it afterwards

Here is the situation:
1. I have collections 'lists', 'stats', and 'posts'.
2. From the frontend, there is a scenario where the user uploads content. The frontend function creates a document under 'lists', and after that document is created, it creates another document under 'posts'.
3. I have a CF that listens for the creation of a document under 'lists' and creates a new document under 'stats'.
4. I have a CF that listens for the creation of a document under 'posts' and updates the document created under 'stats'.
The intended order of things to happen is 2->3->4. However, apparently, step 4 is triggered before step 3, and so there is no relevant document under 'stats' to update, thus throwing an error.
Is there a way to make the function wait for the document creation under 'stats' and update only after it is created? I thought about using setTimeout() for the function in step 4, but guess there might be a better way.
Below is the code that I am using for steps 3 and 4. Can someone advise? Thanks!
// This listens to a creation of a document under 'lists' and creates a new document
// with the same document ID under 'stats'.
exports.statsCreate = functions.firestore
  .document('lists/{listid}').onCreate((snap, context) => {
    const listidpath = snap.ref.path;
    const pathfinder = listidpath.split('/');
    const listid = pathfinder[pathfinder.length - 1];
    return db.collection('stats').doc(listid).set({
      postcount: 0,
    })
  })
// This listens to a creation of a document under 'posts' and updates the corresponding
// document under 'stats'. There is a field under 'posts' with the list ID to make this possible.
// How do I make sure the update operation happens only after the document is actually there?
exports.statsUpdate = functions.firestore
  .document('posts/{postid}').onCreate((snap, context) => {
    const data = snap.data();
    return db.collection('stats').doc(data.listid).update({
      postcount: admin.firestore.FieldValue.increment(1)
    })
  })
I can see at least two "easy" solutions:
Solution #1: In your front end, set a listener on the to-be-created stat document (with onSnapshot()), and only create the post document once the stat one has been created. Note however that this solution will not work if the user does not have read access to the stats collection.
Solution #2: Use the "retry on failure" option for background Cloud Functions. Within your statsUpdate Cloud Function you intentionally throw an exception if the stat doc is not found => The CF will be retried until the stat doc is created.
A third solution would be to use a Callable Cloud Function, called from your front-end. This Callable Cloud Function would write the three docs in the following order: list, stat and post. Then the statsUpdate Cloud Function would be triggered in the background (or you could include its business logic in the Callable Cloud Function as well).
One of the drawbacks of this solution is that the Cloud Function may encounter some cold start effect. In this case, from an end-user perspective, the process may take more time than the above solutions. However note that you can specify a minimum number of container instances to be kept warm and ready to serve requests.
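Solution #2 could look roughly like the sketch below. It assumes `db` is an Admin SDK Firestore instance and that retries are enabled for the function; `incrementPostCount` and its `incrementBy` parameter are hypothetical names introduced for illustration.

```javascript
// Hypothetical helper for the statsUpdate function.
// Assumptions: `db` is an Admin SDK Firestore instance and `incrementBy`
// wraps admin.firestore.FieldValue.increment.
async function incrementPostCount(db, incrementBy, listid) {
  const statRef = db.collection('stats').doc(listid);
  const statSnap = await statRef.get();

  if (!statSnap.exists) {
    // Failing on purpose: with retries enabled, the platform runs the
    // function again, by which time statsCreate may have written the doc.
    throw new Error(`stats/${listid} not created yet`);
  }

  return statRef.update({ postcount: incrementBy(1) });
}

// Possible wiring (retries must be enabled for the function, for example
// with runWith({ failurePolicy: true }) in the v1 firebase-functions SDK):
// exports.statsUpdate = functions.runWith({ failurePolicy: true }).firestore
//   .document('posts/{postid}')
//   .onCreate((snap) => incrementPostCount(
//     db, (n) => admin.firestore.FieldValue.increment(n), snap.data().listid));
```

Because a retried function can run more than once for the same event, the increment itself is not idempotent; that trade-off is outside the scope of this sketch.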
PS: Note that in the statsCreate CF, you don't need to extract the listid with:
const listidpath=snap.ref.path;
const pathfinder=listidpath.split('/');
const listid=pathfinder[pathfinder.length-1];
Just do:
const listid = context.params.listid;
The context parameter provides information about the Cloud Function's execution.

Firebase cloud function use current value and not value when function was initially called

Not sure if this is even possible with firebase cloud functions.
Let's assume, I want to trigger a cloud function onCreate on all documents in a specific collection.
After creation, the cloud function should add another document in a different collection.
Passing a value from the manually created document.
Sure, that works!:
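export const createAutomaticInvoice = functions.firestore
  .document('users/{userId}/lessons/{lesson}')
  .onCreate((snap, context) => {
    const db = admin.firestore();
    // snap.data() returns the document's fields as they were at creation time
    const info = snap.data();
    // Return the promise so the function waits for the write to finish
    return db.collection('toAdd').add({
      info: info
    });
  });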
But if I create a document within users/{userId}/lessons/ and change the value of info directly afterwards, before the cloud function is triggered, the cloud function takes the old value of info as opposed to the one it was changed to.
Is this expected behaviour? For me it is definitely not, as I would assume that it takes the values at runtime.
How can I make my example work as expected?
This is the expected behavior - the function is going to execute as soon as possible after that document is created. The snapshot is always going to contain the contents of the document as it was originally created. It's not going to wait around to see if that document changes at some point in the future, and it's not going to try to query that document in case it might have changed.
If you want to handle updates to a document, you should also be using an onUpdate trigger to know if that happens.
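One way this might be arranged is sketched below: a shared helper called from both an onCreate and an onUpdate trigger. It assumes `db` is an Admin SDK Firestore instance. `mirrorLesson` is a hypothetical name, and using the lesson id as the target document id is a choice made for this sketch so that an update overwrites the earlier copy instead of adding a second one.

```javascript
// Hypothetical helper shared by an onCreate and an onUpdate trigger.
// Assumption: `db` is an Admin SDK Firestore instance (admin.firestore()).
function mirrorLesson(db, lessonId, lessonData) {
  // Keyed by lesson id, so a later update replaces the copied value.
  return db.collection('toAdd').doc(lessonId).set({ info: lessonData });
}

// Possible wiring:
// exports.onLessonCreate = functions.firestore
//   .document('users/{userId}/lessons/{lesson}')
//   .onCreate((snap, context) =>
//     mirrorLesson(admin.firestore(), context.params.lesson, snap.data()));
// exports.onLessonUpdate = functions.firestore
//   .document('users/{userId}/lessons/{lesson}')
//   .onUpdate((change, context) =>
//     mirrorLesson(admin.firestore(), context.params.lesson, change.after.data()));
```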

How to avoid loops when writing cloud functions?

When writing event based cloud functions for firebase firestore it's common to update fields in the affected document, for example:
When a document in the users collection is updated, a function triggers. Say we want to determine the state of the user's info and store it in a completeInfo: boolean property. The function then has to perform another update, which fires the trigger again. Unless we use a flag like needsUpdate: boolean to decide whether to execute the function, we end up with an infinite loop.
Is there any other way to approach this behavior? Or is the situation a consequence of how the database is designed? How could we avoid ending up in such a scenario?
I have a few common approaches to Cloud Functions that transform the data:
Write the transformed data to a different document than the one that triggers the Cloud Function. This is by far the easier approach, since there is no additional code needed - and thus I can't make any mistakes in it. It also means there is no additional trigger, so you're not paying for that extra invocation.
Use granular triggers to ensure my Cloud Function only gets called when it needs to actually do some work. For example, many of my functions only need to run when the document gets created, so by using an onCreate trigger I ensure my code only gets run once, even if it then ends up updating the newly created document.
Write the transformed data into the existing document. In that case I make sure to have the checks for whether the transformation is needed in place before I write the actual code for the transformation. I prefer to not add flag fields, but use the existing data for this check.
A recent example is where I update an amount in a document, which then needs to be fanned out to all users:
exports.fanoutAmount = functions.firestore.document('users/{uid}').onWrite((change, context) => {
  let old_amount = change.before && change.before.data() && change.before.data().amount ? change.before.data().amount : 0;
  let new_amount = change.after.data().amount;
  if (old_amount !== new_amount) {
    // TODO: fan out to all documents in the collection
  }
});
You need to take care to avoid writing a function that triggers itself infinitely. This is not something that Cloud Functions can do for you. Typically you do this by checking within your function if the work was previously done for the document that was modified in a previous invocation. There are several ways to do this, and you will have to implement something that meets your specific use case.
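Applied to the completeInfo scenario from the question, such a check might look like the sketch below. The list of required fields and the name `planCompleteInfoUpdate` are assumptions made for illustration. The function writes only when the computed value differs from the stored one, so the invocation caused by its own write exits without writing again.

```javascript
// Fields that must be present for the profile to count as complete.
// Assumption: these field names are illustrative, not from the original post.
const REQUIRED_FIELDS = ['name', 'email'];

// Returns the update to apply, or null when the stored flag is already right.
function planCompleteInfoUpdate(userData) {
  const complete = REQUIRED_FIELDS.every(
    (field) => userData[field] !== undefined && userData[field] !== ''
  );
  // Writing only when the value changes lets the follow-up invocation exit
  // without another write, which stops the loop.
  return userData.completeInfo === complete ? null : { completeInfo: complete };
}

// Possible wiring:
// exports.checkInfo = functions.firestore.document('users/{uid}')
//   .onUpdate((change) => {
//     const update = planCompleteInfoUpdate(change.after.data());
//     return update ? change.after.ref.update(update) : null;
//   });
```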
I would approach this from an execution-time perspective, which means the function runs twice for each document. Each time the function is triggered, it checks a lastUpdate timestamp field on the document and only updates the document if that timestamp is older than some threshold, e.g. 10 seconds.
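The timestamp check can be sketched as a small pure function. The name `shouldUpdate` and the millisecond representation are assumptions; the 10-second default mirrors the example figure above.

```javascript
// Returns true when the document was last touched by the function more than
// `minIntervalMs` ago (or never), and false for a write that just came from it.
// Assumption: `lastUpdateMs` is the stored lastUpdate timestamp in
// milliseconds, e.g. obtained via Timestamp.toMillis().
function shouldUpdate(lastUpdateMs, nowMs, minIntervalMs = 10000) {
  if (lastUpdateMs === undefined || lastUpdateMs === null) {
    return true; // No lastUpdate field yet: first run for this document.
  }
  return nowMs - lastUpdateMs > minIntervalMs;
}
```

A trigger would call this with the document's lastUpdate value and `Date.now()`, and write the new data together with a fresh lastUpdate only when it returns true.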

firebase functions manage multiple operations on a single database trigger

We are using firebase realtime DB and firebase functions. We have written a DB trigger for whenever a user is updated. In this case we have a referral system, so whenever a user adds a referrer to his account, this trigger gives some reward to the referrer. Hence the user_update update trigger does the job.
This works well. Now, we need to do one more unrelated activity whenever a user is updated. To be specific, we want to keep the total reward given so far to all the users for analytics purposes.
So, what is the best way to implement two independent operations on a single update trigger?
Technically we can embed one operation call into another, but that will make the code messy, especially if we need more operations like that in the future.
You have two options. Either use one realtime database trigger like now and put the logic in that function. You can keep it clean and tidy by putting all the logic in separate functions that this trigger just calls.
Or you can simply create another trigger exactly how you did this time and just change its export name, e.g. like below. With this method you have two functions being called, so you double the cost.
exports.userUpdate = functions.database.ref('/users/{uid}').onUpdate(async (change, context) => { /* LOGIC */ });
exports.userUpdateSecond = functions.database.ref('/users/{uid}').onUpdate(async (change, context) => { /* LOGIC */ });
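The first option (one trigger that delegates to separate functions) might be sketched as follows. The handler names and their placeholder bodies are hypothetical; `runHandlers` simply waits for all of them, and the trigger fails if any handler rejects.

```javascript
// Each concern lives in its own function; the bodies here are placeholders.
// Hypothetical names, not from the original post.
async function rewardReferrer(change, context) {
  /* referral reward logic */
  return 'rewardReferrer';
}

async function updateRewardTotals(change, context) {
  /* analytics logic */
  return 'updateRewardTotals';
}

// Runs every handler for one update event and waits for all of them.
function runHandlers(handlers, change, context) {
  return Promise.all(handlers.map((handler) => handler(change, context)));
}

// Possible wiring:
// exports.userUpdate = functions.database.ref('/users/{uid}')
//   .onUpdate((change, context) =>
//     runHandlers([rewardReferrer, updateRewardTotals], change, context));
```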

How to trigger onCreate in Firestore cloud functions shell without using an existing document

I am using the firebase-tools shell CLI to test Firestore cloud functions.
My functions respond to the onCreate trigger for all documents in a certain collection, by using a wildcard, and then mutate that document with an update call.
firestore
  .document(`myCollection/{documentId}`)
  .onCreate(event => {
    const ref = event.data.ref
    return ref.update({ some: "mutation"})
  })
In the shell I run something like this, (passing some fake auth data required by my database permissions):
myFunction({some: "data"}, { auth: { variable: { uid: "jj5BpbX2PxU7fQn87z10d4Ks6oA3" } } } )
However this results in an error, because the update tries to mutate a document that is not in the database.
Error: no entity to update
In the documentation about unit testing it is explained how you would create mocks for event.data in order to execute the function without touching the actual database.
However I am trying to invoke a real function which should operate on the database. A mock would not make sense, otherwise this is nothing more than a unit test.
I'm wondering what the strategy should be for invoking a function like this?
By using an existing id of a document the function can execute successfully, but this seems cumbersome because you need to look it up in the database for every test, and it might not be there anymore at some point.
I think it would be very helpful if the shell would somehow create a new document from the data you pass in, and run the trigger from that. Would this be possible maybe, or is there another way?
The Cloud Functions emulator can only emulate events that could happen within your project. It doesn't emulate the actual change to the database that would have triggered it.
As you're discovering, when your function depends on that actual change previously occurring, you can run into problems. The fact of the matter is that it's entirely possible that the created document may have already been deleted by the time you're handling the event in the function (imagine a user acts quickly to delete, but the event is delayed for whatever reason).
All that said, perhaps you want to use set() with SetOptions that indicate you want to merge instead of overwrite. Bear in mind that if the document was previously deleted (with good reason) before the event triggered, you'll unconditionally recreate the document, which may not be what the user wanted.
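A minimal sketch of that suggestion, assuming `ref` is the Firestore DocumentReference the function already has (for example `event.data.ref` in the question's code). `applyMutation` is a hypothetical helper name.

```javascript
// Hypothetical helper: writes the mutation with merge semantics, so the call
// succeeds whether or not the document currently exists.
// Assumption: `ref` is a Firestore DocumentReference.
function applyMutation(ref) {
  return ref.set({ some: 'mutation' }, { merge: true });
}

// Possible use in the trigger from the question:
// .onCreate(event => applyMutation(event.data.ref))
```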
