Firestore - Decrease number if greater than zero - firebase

Imagine 1 user that can press a button which resets a counter to 0.
On the other side, imagine multiple users (100k, for example) who can increase/decrease the same counter at the same time or whenever they want.
The counter can never be lower than 0.
What I have thought to do is run a transaction (read the value, then update it if necessary), but it seems that, if the counter is updated multiple times before a transaction finishes, the transaction will be retried again and again, and some increases might be ignored if the counter is updated 100k times in a short period and the transaction ultimately fails (because of too many retries, though maybe I am wrong).
Is the only way to handle this with a transaction?
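For reference, the read-then-update transaction being described would look roughly like this with the Node.js Admin SDK; the counters/main document path and value field are placeholders, not anything prescribed by the question:

```typescript
import { initializeApp } from "firebase-admin/app";
import { getFirestore } from "firebase-admin/firestore";

initializeApp();
const db = getFirestore();

// Decrement the counter only if it is currently greater than zero.
// "counters/main" and the "value" field are placeholder names.
async function decrementIfPositive(): Promise<void> {
  const ref = db.doc("counters/main");
  await db.runTransaction(async (tx) => {
    const snap = await tx.get(ref);
    const current = (snap.data()?.value as number) ?? 0;
    if (current > 0) {
      tx.update(ref, { value: current - 1 });
    }
  });
}
```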

What you're describing is known as a contention bottleneck, and is a common limit in multi-user systems.
If having 100k concurrent updates to the same data is a realistic scenario in your case, you'll want to look at a different way to solve it.
The first one that comes to mind, and a common solution in general, is to have the users write their increase/decrease to a separate "queue". This can be a collection in Firestore, but the most important thing is that these are append only operations: there is no contention between multiple users writing at the same time.
Then you'd have a Cloud Run instance, or Cloud Functions, process the increase/decrease actions from the users. You can limit this to at most one concurrent instance, or just a few, leading to either no contention or low contention when updating the final counter.
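A rough sketch of that queue idea, assuming a counterOps collection that clients append to and a worker (e.g. a scheduled function capped at one instance) that folds pending operations into the counter; all collection, document, and field names here are illustrative:

```typescript
import { getFirestore, FieldValue } from "firebase-admin/firestore";

const db = getFirestore();

// Client side: append-only, so 100k users writing at once don't contend.
export async function enqueueChange(delta: 1 | -1): Promise<void> {
  await db.collection("counterOps").add({
    delta,
    createdAt: FieldValue.serverTimestamp(),
  });
}

// Worker side: fold queued operations into the counter, clamping it at zero,
// then delete the processed queue documents.
export async function drainQueue(): Promise<void> {
  const pending = await db
    .collection("counterOps")
    .orderBy("createdAt")
    .limit(100)
    .get();
  if (pending.empty) return;

  const counterRef = db.doc("counters/main");
  await db.runTransaction(async (tx) => {
    const snap = await tx.get(counterRef);
    let value = (snap.data()?.value as number) ?? 0;
    for (const op of pending.docs) {
      value = Math.max(0, value + (op.get("delta") as number));
    }
    tx.set(counterRef, { value }, { merge: true });
    pending.docs.forEach((op) => tx.delete(op.ref));
  });
}
```

Because clients only ever add new documents, their writes never contend with each other; only the single worker touches the counter document.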

Related

Do you need to do consistent read after using a DynamoDB transaction to commit a change?

We need strong consistency (insert where not exists, check conditions, etc.) to keep things in order in a fast-moving DynamoDB store. However, we do far more reads than writes, and would prefer to send consistentRead = false because it is faster, more stable (when nodes are down), and (most importantly) less costly.
If we use a Transaction write items collection to commit changes, does this wait for all nodes to propagate before returning? If so, surely you don’t need to use a consistent read to query this… is that the case?
No. Transactional writes work like regular writes in that they are acknowledged when they have been written to at least 2 of the 3 nodes in the partition. One of those 2 nodes must be the leader node for the partition. The difference in a transaction is that all of the writes in that transaction succeed or none of them do.
If you do an eventually consistent read after the transaction, there is a 33% chance you will get the one node that was not required for the ack. Now then, if all is healthy that third node probably has the write anyhow.
All that said, if your workload needs a strongly consistent read like you indicate, then do it. Don't play around. There should not be a performance hit for a strongly consistent read, but as you pointed out, there is a cost implication.
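To illustrate, a sketch with the AWS SDK for JavaScript v3 (the table names, keys, and expressions are made up): the transactional write is acknowledged once the quorum has it, and the follow-up read opts into strong consistency per request only where the workload requires it.

```typescript
import {
  DynamoDBClient,
  TransactWriteItemsCommand,
  GetItemCommand,
} from "@aws-sdk/client-dynamodb";

const client = new DynamoDBClient({});

async function commitAndRead(): Promise<void> {
  // Both writes succeed or neither does; the condition guards the insert.
  await client.send(new TransactWriteItemsCommand({
    TransactItems: [
      {
        Put: {
          TableName: "Orders",
          Item: { pk: { S: "order#123" }, status: { S: "NEW" } },
          ConditionExpression: "attribute_not_exists(pk)",
        },
      },
      {
        Update: {
          TableName: "Counters",
          Key: { pk: { S: "orders" } },
          UpdateExpression: "ADD total :one",
          ExpressionAttributeValues: { ":one": { N: "1" } },
        },
      },
    ],
  }));

  // Opt into a strongly consistent read only where needed; eventually
  // consistent reads are cheaper and usually sufficient.
  const result = await client.send(new GetItemCommand({
    TableName: "Orders",
    Key: { pk: { S: "order#123" } },
    ConsistentRead: true,
  }));
  console.log(result.Item);
}
```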

Why wouldn't a small Firebase Functions app just use a single Function to handle logic?

...aside from the benefit in separate performance monitoring and logging.
For logging, I am confident I can get granularity through manually adding the name of the "routine" to each call. This is how it is now with several discrete Functions for different parts of the system:
There are multiple automatic logs: start and finish of the routine, for example. It would be more challenging to find out how expensive certain routines are, but it would not be impossible.
The reason I want the entire logic of the application handled by a single handler function is to reduce cold starts: one function means only one container, which can be kept persistently alive when there are very few users of the app.
If a month is ~2.6m seconds and we assume the system uses 1 GB RAM and 1 GHz CPU frequency at all times, that's:
2600000 * 0.0000025 + 2600000 * 0.000001042 = USD$9.21 a month
...for one minimum instance.
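Spelling that estimate out (using the per-GB-second and per-GHz-second rates assumed above):

```typescript
// One always-on 1 GB / 1 GHz instance for a ~2.6M-second month,
// at the rates assumed in the estimate above.
const secondsPerMonth = 2_600_000;
const memoryCost = secondsPerMonth * 0.0000025;  // ≈ $6.50
const cpuCost = secondsPerMonth * 0.000001042;   // ≈ $2.71
console.log((memoryCost + cpuCost).toFixed(2));  // "9.21"
```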
I should also state that all of my functions have the bare minimum amount of global scope code; it just sets up Firebase assets (RTDB and Firestore).
From a billing, performance (based on user wait time), and user/developer experience perspective, is there any reason why it would be smart to keep all my functions discrete?
I'd also accept an answer saying "one single function for all logic is reasonable" as long as there's a reason for it.
Thanks!
If you have a very small app with ~5 endpoints and very low traffic, sure, you could do something like this. But here is why you might not want to:
billing and performance
The important thing to realize is that each instance of your function handles only one request at a time, so concurrent requests spin up additional instances, which means there could be tens of them running at the same time.
If you would like to have just 1 instance handling all the traffic you should explore GCP Cloud run, where you have 1 container handling multiple requests and scaling only when it's not sufficient.
Imagine you have several endpoints, and each of them has different performance requirements:
One might need only 128 MB of RAM.
Another might need 1 GB of RAM.
(FYI: you can also control the CPU MHz of the function via the RAM setting, which can speed up execution in some cases.)
If you had only one function with 1 GB of RAM, every request would allocate that function, and in some cases most of the memory would go to waste.
But if you split it into multiple functions, some requests will require much fewer resources, which can save you money once you reach a larger number of executions per month (tens of thousands and up).
Let's imagine a function with a 3-second execution time and 10k executions/month:
128MB would cost you $0.0693
1024MB would cost you $0.495
As you can see, with a small app the difference could be negligible, but at scale it matters. (*The cost can vary based on the datacenter.)
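Reproducing those figures (the per-100 ms compute rates below are an assumption about what the numbers above are based on, namely Tier 1, 1st-gen pricing):

```typescript
// 10,000 invocations/month at 3 s each, priced per 100 ms of compute.
const unitsOf100ms = 10_000 * 3 * 10;        // 300,000
console.log(unitsOf100ms * 0.000000231);     // ≈ 0.0693 (128 MB tier)
console.log(unitsOf100ms * 0.00000165);      // ≈ 0.495  (1 GB tier)
```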
As for the logging, I don't think it matters. In bigger systems there are usually messages traveling through several functions, so you have to deal with that anyway.
As for the cold start: you just need a good UI to accommodate it. At first I was worried about it in our apps, but later on you get used to the fact that some actions can take ~2s to execute (cold start). And you should have a "loading" state in the UI regardless, because you don't know whether the function will take ~100ms or 3s due to a bad connection.

How can I implement a transaction of 50 writes in dynamoDB?

I’m aware there is a hard limit of 25 items per transaction. However, I’m sure there is a way of implementing transactions for more items from scratch. How might I go about it?
I’m thinking of something like keeping a version number on every item: fetch all the items up front, and during insert verify that the version number is the same, i.e. optimistic locking. If the condition fails, revert all failed items. Naturally, I can imagine that the revert could fail, and I'd need to do optimistic locking on the revert and end up in a deadlock of reverts.
The solution I found in the end was to implement pessimistic locking. It supports an arbitrary number of writes and reads and guarantees transactional consistency. The catch is if you're not careful, it's easy to run into deadlocks.
The idea is that you (see the sketch after these steps):
Create a lock table. Each row refers to a specific lock. The primary key of the lock table should be a string, which I'll refer to as the lock-key. Often you'll want to lock a specific entity, so {table_name}#{primary_key} is a reasonable format for the lock-key, but it might be more arbitrary; any string will do. Rows in the lock table should also auto-delete after a certain time period via a ttl field, i.e. a TimeToLiveSpecification.
Before starting the transaction, acquire the lock. You do this by creating the row with your arbitrary lock-key and a conditional check that the row doesn't already exist. If it does exist, the row creation fails, which means another process has already acquired the lock. You then need to poll, trying to recreate the lock row until the lock has been released.
Once you have acquired the lock, you need to keep it alive with a heartbeat to prevent other tasks from executing. The heartbeat process should update a heartbeat property on the lock row which reflects the last-active time of the lock. The ttl of the row should be greater than the heartbeat interval, normally about double, so that the lock is not purged erroneously. If your process dies, the lock will naturally be released by the ttl-based auto-deletion.
If your task completes successfully it should delete the lock row freeing it up for other tasks.
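A condensed sketch of that scheme with the AWS SDK for JavaScript v3; the table name, key format, and intervals are illustrative, and here the heartbeat simply pushes the ttl forward rather than tracking a separate last-active property:

```typescript
import {
  DynamoDBClient,
  PutItemCommand,
  UpdateItemCommand,
  DeleteItemCommand,
  ConditionalCheckFailedException,
} from "@aws-sdk/client-dynamodb";

const client = new DynamoDBClient({});
const LOCK_TABLE = "locks";      // illustrative table name
const HEARTBEAT_MS = 15_000;
const TTL_SECONDS = 30;          // ~2x the heartbeat interval

// Try to acquire the lock; returns false if another process holds it.
async function acquireLock(lockKey: string): Promise<boolean> {
  try {
    await client.send(new PutItemCommand({
      TableName: LOCK_TABLE,
      Item: {
        lockKey: { S: lockKey },  // e.g. "orders#123"
        ttl: { N: String(Math.floor(Date.now() / 1000) + TTL_SECONDS) },
      },
      ConditionExpression: "attribute_not_exists(lockKey)",
    }));
    return true;
  } catch (err) {
    if (err instanceof ConditionalCheckFailedException) return false;
    throw err;
  }
}

// Keep the lock alive while the task runs; the ttl keeps moving forward.
function startHeartbeat(lockKey: string): NodeJS.Timeout {
  return setInterval(() => {
    void client.send(new UpdateItemCommand({
      TableName: LOCK_TABLE,
      Key: { lockKey: { S: lockKey } },
      UpdateExpression: "SET #ttl = :ttl",
      ExpressionAttributeNames: { "#ttl": "ttl" },
      ExpressionAttributeValues: {
        ":ttl": { N: String(Math.floor(Date.now() / 1000) + TTL_SECONDS) },
      },
    }));
  }, HEARTBEAT_MS);
}

// Release the lock once the multi-item work is done.
async function releaseLock(lockKey: string, heartbeat: NodeJS.Timeout): Promise<void> {
  clearInterval(heartbeat);
  await client.send(new DeleteItemCommand({
    TableName: LOCK_TABLE,
    Key: { lockKey: { S: lockKey } },
  }));
}
```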

Is DynamoDb UpdateExpression with ADD to increment a counter transactional?

Do I need to use optimistic locking when updating a counter with ADD updateExpression to make sure that all increments from all the clients will be counted?
https://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_UpdateItem.html#API_UpdateItem_RequestSyntax
I'm not sure you would still call it a transaction if that is the only thing you are doing in DynamoDB; the terminology is a bit confusing.
IMO it is more correct to say it is atomic. You can combine the increment with other changes in DynamoDB under a condition that means nothing is written unless that condition is true. But if your only change is the increment, then other than hitting capacity limits (or an asteroid hitting a datacenter or something of the like), there is no reason your increment would fail, unless you put a condition on your request which turns out to be false at write time. If two clients increment at the same time, DynamoDB will handle it; one of them simply gets in first.
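For reference, a plain atomic increment looks roughly like this with the AWS SDK for JavaScript v3 (table, key, and attribute names are placeholders); no read-modify-write or optimistic locking is needed:

```typescript
import { DynamoDBClient, UpdateItemCommand } from "@aws-sdk/client-dynamodb";

const client = new DynamoDBClient({});

// ADD applies the increment atomically on the server, so concurrent
// clients never overwrite each other's updates.
async function incrementCounter(): Promise<void> {
  await client.send(new UpdateItemCommand({
    TableName: "Counters",            // placeholder table name
    Key: { pk: { S: "page#home" } },  // placeholder key
    UpdateExpression: "ADD #count :one",
    ExpressionAttributeNames: { "#count": "requestCount" },
    ExpressionAttributeValues: { ":one": { N: "1" } },
  }));
}
```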
But let's say you are incrementing a value many, many times a second, to the point where you may indeed be hitting a DynamoDB capacity limit. Consider batching the increments in a Kinesis stream, where you can set the maximum time the stream should wait after receiving a value before processing begins. This lets you achieve consistency in your aggregation within x seconds.
But outside of extremely high-traffic situations you should be fine, and in that case the standard way of approaching the problem is using Streams, which is very cost-effective and saves you capacity units.

Limitations of using sequential IDs in Cloud Firestore

I read in a Stack Overflow post (link here) that:
By using predictable (e.g. sequential) IDs for documents, you increase the chance you'll hit hotspots in the backend infrastructure. This decreases the scalability of the write operations.
I would appreciate it if anyone could explain in more detail the limitations that can occur when using sequential or user-provided IDs.
Cloud Firestore scales horizontally by allocating key ranges to machines. As load increases beyond a certain threshold on a single machine, it will split the range that machine is serving and assign it to 2 machines.
Let's say you are just starting to write to Cloud Firestore, which means a single server is currently handling the entire range.
When you are writing new documents with random IDs and we split the range into 2, each machine will end up with roughly the same load. As load increases, we continue to split into more machines, with each one getting roughly the same load. This scales well.
When you are writing new documents with sequential IDs, if you exceed the write rate a single machine can handle, the system will try to split the range into 2. Unfortunately, one half will get no load, and the other half the full load! This doesn't scale well, as you can never get more than a single machine to handle your write load.
In the case where a single machine is running more load than it can optimally handle, we call this "hot spotting". Sequential IDs mean we cannot scale to handle more load. Incidentally, this same concept applies to index entries too, which is why we warn against sequential index values, such as timestamps of "now", as well.
So, how much is too much load? We generally say 500 writes/second is what a single machine will handle, although this will naturally vary depending on a lot of factors, such as how big a document you are writing, number of transactions, etc.
With this in mind, you can see that smaller, more consistent workloads aren't a problem, but if you want something that scales with traffic, sequential document IDs or index values will naturally limit you to what a single machine in the database can keep up with.
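As a small illustration with the Node.js Admin SDK (the events collection and the ID format are made up): letting Firestore generate the ID gives the random distribution described above, while a monotonically increasing ID concentrates every new write at the tail of the key range.

```typescript
import { getFirestore } from "firebase-admin/firestore";

const db = getFirestore();

// Scales well: auto-generated IDs are random, so writes spread evenly
// across the key range as it is split between machines.
async function writeWithRandomId(data: Record<string, unknown>): Promise<void> {
  await db.collection("events").add(data);
}

// Hotspots: every new document lands at the tail of the key range,
// so a single machine ends up serving all of the write load.
async function writeWithSequentialId(seq: number, data: Record<string, unknown>): Promise<void> {
  await db.collection("events").doc(`event-${String(seq).padStart(12, "0")}`).set(data);
}
```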
