How to schedule a Firebase function to run a limited number of times

I have a Firebase function that should run every hour, but only X times.
I scheduled it this way:
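functions.pubsub.schedule('every 1 hour').onRun(async (context) => {
  // readCounter and saveCounter are placeholders for the database read and write
  let counter = await readCounter(); // read last counter value from db
  if (counter >= X) {
    return null; // limit reached, nothing left to do
  }
  counter++;
  await saveCounter(counter); // save counter in db
  // my job
  return null;
});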
But this method is not optimal, because the scheduler stays active even when it is no longer required. Is there a better method for this purpose?

What you're trying to do isn't supported by scheduled functions. What you'll have to do is schedule future invocations of the function with Cloud Tasks with the specific schedule you want. A complete list of instructions to do this is too long for Stack Overflow, but you can read this blog post to learn how to programmatically schedule future invocations of a Cloud Function.
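As a rough sketch of the Cloud Tasks approach, the snippet below only builds the payloads for a fixed number of hourly runs. The helper name buildHourlyTasks, the endpoint URL, and the start time are assumptions made for illustration, not part of the original answer. Each payload would then be handed to createTask on a CloudTasksClient from @google-cloud/tasks, which is left out here because it needs a real project and queue.

```javascript
// Build one Cloud Tasks payload per run, spaced one hour apart.
// functionUrl, runs and startSeconds are hypothetical parameters.
function buildHourlyTasks(functionUrl, runs, startSeconds) {
  const tasks = [];
  for (let i = 0; i < runs; i++) {
    tasks.push({
      httpRequest: {
        httpMethod: 'POST',
        url: functionUrl,
        headers: { 'Content-Type': 'application/json' },
        // Cloud Tasks expects the HTTP body as base64-encoded bytes.
        body: Buffer.from(JSON.stringify({ run: i + 1, of: runs })).toString('base64'),
      },
      // Run i fires i hours after the start time (Unix seconds).
      scheduleTime: { seconds: startSeconds + i * 3600 },
    });
  }
  return tasks;
}

// Example: three runs against a placeholder endpoint.
const tasks = buildHourlyTasks('https://example.com/myJob', 3, 1700000000);
console.log(tasks.map((task) => task.scheduleTime.seconds));
```

Because exactly X tasks are created up front, nothing keeps firing once the last one has run, which addresses the "scheduler is always active" concern from the question.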

Related

Periodically run a cloud function every second without client input

I am working on an app where I need to periodically (every second) write a new timestamp to a field in Firestore. The write should be performed while a specific property of the document equals true; otherwise the periodic execution should stop. How can I do that?
The easiest solution I can offer is to use Cloud Tasks. When a user creates a session, create a dedicated task queue with a rate limit of 1 per second and a batch of tasks in that queue (for instance, 3600 tasks for one hour).
Each task triggers an HTTP endpoint (typically a Cloud Functions or Cloud Run endpoint) that increments the counter.
My main question is about the choice of Firestore. As far as I understand, if you have 10 users in parallel, you have 10 counters and you write the same thing to Firestore 10 times, which is not really efficient.
I have 2 suggestions here:
Can you use a single counter and have several user objects share the same countdown?
Have you considered Cloud Memorystore, using an in-memory database to perform the per-user timestamp writes and saving only the result in the Firestore document?
As you see here in the doc, the maximum execution frequency for scheduled Cloud Functions is one execution per minute.
Since you need a higher frequency, you cannot fulfill your requirement with only a Cloud Function. You'll need to use another approach.
You can work around this limit by doing the following:
Create a Cloud Function that runs every minute
Add 4 timers that execute 15 seconds apart
NOTE: Depending on your logic, this might cost you... Don't do this without understanding what your usage costs will be, and don't forget about it. I take no responsibility for servers catching fire with this.
export const my15SecondTimer = functions.pubsub
  .schedule("* * * * *")
  .onRun(async (context) => {
    // Awaiting each delay keeps the function alive until every step has run
    const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));
    // Some logic (1)
    await sleep(15000);
    // Some logic (2)
    await sleep(15000);
    // Some logic (3)
    await sleep(15000);
    // Some logic (4)
  });

Mismatch in count even with FieldValue.increment

My use case is that I want to keep aggregating my firebase user count in the database for quick and easy access. For that, I have a cloud function listening on user.onCreate and it simply increments a field in a document using the atomic FieldValue.increment.
Here is the code:
exports.createProfile = functions.auth.user().onCreate(async user => {
  return Promise.all([
    addProfileToDatabase(),
    // Pass the promise from update() directly so Promise.all waits for it
    db.collection('someCollection').doc(docId).update({
      count: admin.firestore.FieldValue.increment(1)
    })
  ]);
});
Issue: the count in the database becomes higher than the number of authenticated users shown in the Authentication tab of Firebase. I regularly reset it to the correct number and then it slowly increases again.
I have read about write throttling on a document, but that should result in a lower count, if anything. So why does the count in the database always overshoot the actual count?
Without seeing your code, the only thing I can imagine is that your function isn't idempotent. It's possible that functions may be invoked more than once per triggering event. This would be an explanation why your count exceeds the expectation.
Read more about Cloud Functions idempotency in the documentation and also this video.

Is it a good idea to use admin.database().ref().on('child_added') in Cloud Functions?

I'm creating a general-purpose queue on Firebase Cloud Functions to run a huge list of tasks. I was wondering if I can use .on('child_added') to get new tasks pushed to the queue.
The problem I'm getting is that my queue breaks randomly in the middle, after 10 minutes or sometimes 15 minutes.
admin.database().ref('firebase/queues/').on('child_added', async snap => {
  let data = snap.val();
  console.log(data);
  try {
    await queueService.start(data);
  } catch (e) {
    console.log(e.message);
  }
  snap.ref.remove();
});
Or should I go back to using triggers?
functions.database.ref('firebase/queues/{queueId}').onCreate(event => {
  return firebaseQueueTrigger(event);
});
You can use child_added in Cloud Functions if you just want to retrieve data from the database.
onCreate() is a database trigger that fires whenever new data is added to the database.
More info here:
https://firebase.google.com/docs/functions/database-events
So when new data is added to the database at the location provided, onCreate() will trigger. Both can be used in Cloud Functions.
The problem is not in using child_added; it is in keeping an active on(... listener in Cloud Functions. The on(... method attaches a listener, which stays attached until you call off(). This by nature conflicts with the ephemeral nature of Cloud Functions, which are meant to have a trigger-execute-exit flow. Typically, if you need to read additional data from the database in your Cloud Function, you'll want to use once(... so that you can detect when the read is done.
But in your specific case: if you're creating a worker queue, then I'd expect all data to come in with event.data already. Your functions.database.ref('firebase/queues/{queueId}').onCreate(event is essentially the Cloud Functions equivalent of firebase.database().ref('firebase/queues').on('child_added'.

How to make idempotent aggregation in Cloud Functions?

I'm working on a Firebase Cloud Function that updates some aggregate information on some documents in my DB. It's a very simple function that just adds 1 to a total document count, much like the example function found in the Firestore documentation.
I just noticed that when creating a single new document, the function was invoked twice. See the screenshot below and note that the logged document ID (iDup09btyVNr5fHl6vif) appears twice:
After a bit of digging around I found this SO post that says the following:
Delivery of function invocations is not currently guaranteed. As the Cloud Firestore and Cloud Functions integration improves, we plan to guarantee "at least once" delivery. However, this may not always be the case during beta. This may also result in multiple invocations for a single event, so for the highest quality functions ensure that the functions are written to be idempotent.
(From Firestore documentation: Limitations and guarantees)
Which leads me to a problem with their documentation. Cloud Functions as mentioned above are meant to be idempotent (In other words, data they alter should be the same whether the function runs once or runs multiple times). However the example function I linked to earlier (to my eyes) is not idempotent:
exports.aggregateRatings = firestore
  .document('restaurants/{restId}/ratings/{ratingId}')
  .onWrite(event => {
    // Get value of the newly added rating
    var ratingVal = event.data.get('rating');
    // Get a reference to the restaurant
    var restRef = db.collection('restaurants').doc(event.params.restId);
    // Update aggregations in a transaction
    return db.runTransaction(transaction => {
      return transaction.get(restRef).then(restDoc => {
        // Compute new number of ratings
        var newNumRatings = restDoc.data().numRatings + 1;
        // Compute new average rating
        var oldRatingTotal = restDoc.data().avgRating * restDoc.data().numRatings;
        var newAvgRating = (oldRatingTotal + ratingVal) / newNumRatings;
        // Update restaurant info
        return transaction.update(restRef, {
          avgRating: newAvgRating,
          numRatings: newNumRatings
        });
      });
    });
  });
If the function runs once, the aggregate data is increased as if one rating is added, but if it runs again on the same rating it will increase the aggregate data as if there were two ratings added.
Unless I'm misunderstanding the concept of idempotence, this seems to be a problem.
Does anyone have any ideas of how to increase / decrease aggregate data in Cloud Firestore via Cloud Functions in a way that is idempotent?
(And of course doesn't involve querying every single document the aggregate data is regarding)
Bonus points: Does anyone know if functions will still need to be idempotent after Cloud Firestore is out of beta?
The Cloud Functions documentation gives some guidance on how to make retryable background functions idempotent. The bullet point you're most likely to be interested in here is:
Impose a transactional check outside the function, independent of the code. For example, persist state somewhere recording that a given event ID has already been processed.
The event parameter passed to your function has an eventId property on it that is unique, but will be the same when an event is retried. You should use this value to determine if an action taken by an event has already occurred, so you know to skip the action the second time, if necessary.
As for how exactly to check if an event ID has already been processed by your function, there are a lot of ways to do it, and that's up to you.
You can always opt out of making your function idempotent if you think it's simply not worthwhile, or it's OK to possibly have incorrect counts in some (probably rare) cases.
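One possible shape for the event ID check is sketched below. It is only an illustration under stated assumptions: processedEvents and restaurant are in-memory stand-ins for a Firestore collection of processed event IDs and for the aggregate document, and applyRatingOnce and the 'evt-…' IDs are hypothetical names. In a real function the lookup, the marker write, and the aggregate update would presumably all need to happen inside the same transaction.

```javascript
// In-memory stand-ins (hypothetical names, not Firestore APIs):
// processedEvents plays the role of a collection keyed by event ID,
// restaurant plays the role of the aggregate document.
const processedEvents = new Set();
const restaurant = { numRatings: 0, avgRating: 0 };

// Apply one rating unless this event ID has already been handled.
function applyRatingOnce(eventId, ratingVal) {
  if (processedEvents.has(eventId)) {
    return false; // duplicate delivery: skip the update
  }
  const newNumRatings = restaurant.numRatings + 1;
  const oldRatingTotal = restaurant.avgRating * restaurant.numRatings;
  restaurant.avgRating = (oldRatingTotal + ratingVal) / newNumRatings;
  restaurant.numRatings = newNumRatings;
  // Record the event ID so a retry of the same event is ignored.
  processedEvents.add(eventId);
  return true;
}

applyRatingOnce('evt-1', 4);
applyRatingOnce('evt-1', 4); // retried delivery of the same event
applyRatingOnce('evt-2', 2);
console.log(restaurant); // { numRatings: 2, avgRating: 3 }
```

The second call with 'evt-1' changes nothing, so the aggregate ends up the same whether that event is delivered once or several times.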

Does Realm support SELECT FOR UPDATE style read locking

I've spent a fair amount of time looking into the Realm database mechanics and I can't figure out if Realm is using row level read locks under the hood for data selected during write transactions.
As a basic example, imagine the following "queue" logic (assume the queue has an arbitrary number of jobs; we'll say 5 jobs):
async getNextJob() {
  let nextJob = null;
  this.realm.write(() => {
    let jobs = this.realm.objects('Job')
      .filtered('active == FALSE')
      .sorted([['priority', true], ['created', false]]);
    if (jobs.length) {
      nextJob = jobs[0];
      nextJob.active = true;
    }
  });
  return nextJob;
}
If I call getNextJob() twice concurrently and row-level read blocking isn't occurring, there's a chance that both calls will return the same job object when we query for jobs.
Furthermore, if I have outside logic that relies on up-to-date data in read logic (i.e. job.active == false when it actually is true at the current time), I need the read to block until update transactions complete. MVCC reads that return stale data do not work in this situation.
If read locks are being set in write transactions, I could make sure I'm always reading the latest data like so
let active = null;
this.realm.write(() => {
  const job = this.realm.pseudoQueryToGetJobByPrimaryKey();
  active = job.active;
});
// Assuming the above write transaction blocked the read until
// any concurrent updates touching the same job committed
// the value for active can be trusted at this point in time.
if (active === false) {
  // code to start job here
}
So basically, TL;DR does Realm support SELECT FOR UPDATE?
PostgreSQL
https://www.postgresql.org/docs/9.1/static/explicit-locking.html
MySQL
https://dev.mysql.com/doc/refman/5.7/en/innodb-locking-reads.html
So basically, TL;DR does Realm support SELECT FOR UPDATE?
Well if I understand the question correctly, the answer is slightly trickier than that.
If there is no Realm Object Server involved, then realm.write(() => disallows any other writes at the same time, and updates the Realm to its latest version when the transaction is opened.
If there is Realm Object Server involved, then I think this still stands locally, but the Realm Sync manages the updates from remote, in which case the conflict resolution rules apply for remote data changes.
Realm does not allow concurrent writes. There is at most one ongoing write transaction at any point in time.
If the async getNextJob() function is called twice concurrently, one of the invocations will block on realm.write().
SELECT FOR UPDATE then works trivially, since there are no concurrent updates.
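To make the consequence concrete, here is a small simulation that does not use the Realm API at all. A plain array stands in for the Job objects, and claimNextJob is a hypothetical name for the body of the write block. The only assumption carried over from the answers above is that two such blocks never overlap.

```javascript
// Plain in-memory stand-in for the Job objects; not the Realm API.
const jobs = [
  { id: 'a', priority: 1, created: 1, active: false },
  { id: 'b', priority: 5, created: 2, active: false },
  { id: 'c', priority: 5, created: 3, active: false },
];

// Mirrors the body of the write block: pick the highest-priority,
// oldest inactive job and mark it active in one uninterrupted step.
function claimNextJob() {
  const candidates = jobs
    .filter((job) => !job.active)
    .sort((x, y) => y.priority - x.priority || x.created - y.created);
  if (candidates.length === 0) {
    return null;
  }
  candidates[0].active = true;
  return candidates[0];
}

// Two claims executed one after the other, as serialized writes would be.
const first = claimNextJob();
const second = claimNextJob();
console.log(first.id, second.id); // b c
```

Since the second claim only starts after the first has marked its job active, the two calls cannot return the same job.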
