Firestore limit - firebase

Firestore offers 50,000 document read operations per day as part of its free tier.
However, in my application, the client fetches a collection containing price data. The price data is created over time, so starting from a specific timestamp the client can read up to 1,000 documents. Each document represents one timestamp with its price information.
This means that if the client refreshes his/her web browser 50 times, it will exhaust my quota immediately. And that is just for a single client.
That is exactly what happened, and I got this error:
Error: 8 RESOURCE_EXHAUSTED: Quota exceeded
The price data is static. Once it has been written, it is not supposed to change.
Is there a solution for this issue, or should I consider a database other than Firestore?

The error message indicates that you've exhausted the quota that is available. On the free plan the quota is 50,000 document reads per day, so you've read that number of documents already.
Possible solutions:
Upgrade to a paid plan, which has a much higher quota.
Wait until tomorrow to continue, since the quota resets every day.
Try in another free project, since each project has its own quota.

If you have a dataset that will never (or rarely) change, why not ship it as a JSON object in the app itself? You could make it a separate .js file and then import it for reading to build your table.
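As a minimal sketch of that approach, assuming a hypothetical prices.json bundled with the app (and, in TypeScript, the resolveJsonModule compiler option):

```ts
// prices.json is a hypothetical static file shipped with the app bundle:
// [{ "timestamp": 1609459200, "price": 29374.15 }, ...]
import prices from "./prices.json";

// Build the table straight from the bundle: zero Firestore reads per refresh.
for (const { timestamp, price } of prices) {
  console.log(new Date(timestamp * 1000).toISOString(), price);
}
```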
Alternatively, is there a reason your users would ever navigate through all 1,000 records? You can simulate a full table while limiting each call to, say, 10 documents, and then paginate to fetch more results if needed.
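A rough sketch of that pagination, using the Firestore v9 modular web API; the prices collection and timestamp field names are assumptions:

```ts
import {
  getFirestore,
  collection,
  query,
  orderBy,
  limit,
  startAfter,
  getDocs,
  QueryDocumentSnapshot,
} from "firebase/firestore";

const db = getFirestore();
const PAGE_SIZE = 10;

// Fetch one page of price documents ordered by timestamp. `cursor` is the
// last document of the previous page (undefined for the first page).
async function fetchPricePage(cursor?: QueryDocumentSnapshot) {
  const prices = collection(db, "prices"); // hypothetical collection name
  const q = cursor
    ? query(prices, orderBy("timestamp"), startAfter(cursor), limit(PAGE_SIZE))
    : query(prices, orderBy("timestamp"), limit(PAGE_SIZE));
  const snap = await getDocs(q);
  // A refresh now costs at most PAGE_SIZE reads instead of 1,000.
  return { docs: snap.docs, nextCursor: snap.docs[snap.docs.length - 1] };
}
```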

Related

Managing Firestore document read count in MV3 browser extension

In the MV2 extension, I attach a Firestore onSnapshot listener in the persistent background page. As I understand it: 1. Firestore downloads all documents when the listener is first attached, and afterwards, 2. only downloads the changed documents when they change. Since this listener persists over several hours, the total number of Firestore document reads (and hence the cost) is low.
But in the MV3 extension, the service worker (which houses the Firestore listener) is destroyed after five minutes. Therefore, the onSnapshot listener will be destroyed and re-attached several times in just a few hours. On every re-attachment, that listener would potentially re-download all of the user data. So, if the listener gets destroyed and attached five times, we incur five times as many document read counts (and hence the cost) in an MV3 extension as compared to an MV2 extension.
I'd like to understand:
Does using IndexedDB persistence help significantly reduce the document read counts, even when the service worker is restarted?
In case we are not using IDB persistence, how is the billing done? For example, the billing docs state that:
Also, if the listener is disconnected for more than 30 minutes (for example, if the user goes offline), you will be charged for reads as if you had issued a brand-new query.
If I re-attach the listener within 15-20 minutes, does it again incur document reads on all user data?
I tried writing a small example myself and monitored the results in Cloud Console Monitoring under "Firestore Instance - Document Reads". However, I was not able to get clear results from it.
Note: For the purpose of discussion, I will avoid the workaround for making service-workers persistent, and focus on a worst case assuming that this workaround does not work.
Does using IndexedDB persistence help significantly reduce the document read counts? Even when the service worker is restarted.
Yes. Upon a reconnect within the 30-minute interval, most documents can be read from the disk cache and won't have to be re-read from the server.
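As an illustration, here is roughly how persistence is enabled with the Firestore v9 modular web SDK, assuming persistence is available in the extension's worker context; the userData collection name is a placeholder:

```ts
import { initializeApp } from "firebase/app";
import {
  getFirestore,
  enableIndexedDbPersistence,
  collection,
  onSnapshot,
} from "firebase/firestore";

const app = initializeApp({ /* your project config */ });
const db = getFirestore(app);

// Persist the local cache to IndexedDB so that a restarted service worker
// can serve unchanged documents from disk instead of re-reading them
// from the server.
enableIndexedDbPersistence(db).catch((err) =>
  console.warn("Persistence unavailable:", err.code),
);

// On re-attachment, documents already in the IndexedDB cache are served
// locally; only changed documents are fetched from (and billed by) the server.
onSnapshot(collection(db, "userData"), (snap) => {
  console.log(`${snap.size} docs, fromCache=${snap.metadata.fromCache}`);
});
```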
In case we are not using IDB persistence, how is the billing done? ... If I re-attach the listener within 15-20 minutes, does it again incur document reads on all user data?
If there is no data in the disk cache and no existing listener on the data, the documents will have to be read from the server, and thus you will be charged for each document read there.

DynamoDB: Time frame to avoid stale read

I am writing to DynamoDB using AWS Lambda and using the AWS Console to read data from DynamoDB. But I have seen instances of stale reads, with the latest records not showing up when pulling data within a few minutes of the write. What is a safe time interval for a data pull that would ensure the latest data is available on read? Would 30 minutes be a safe interval?
The following is from the AWS site; I just want to understand how recent "recent" is here: "When you read data from a DynamoDB table, the response might not reflect the results of a recently completed write operation. The response might include some stale data."
If you must have a strongly consistent read, you can specify that in your read statement. That way the client will always read from the leader storage node for that partition.
In order for DynamoDB to acknowledge a write, the write must be durable on the leader storage node for that partition and one other storage node for the partition.
If you do an eventually consistent read (which is the default), there is a 1-in-3 chance that the read is served by the one storage node that was not part of the write acknowledgment, and an even smaller chance that the item has not yet been updated on that third storage node.
So, if you need a strongly consistent read, ask for one and you'll get the newest version of the item. There is no real performance degradation for doing a strongly consistent read.
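For illustration, a strongly consistent read with the AWS SDK for JavaScript v3 might look like the following sketch; the table name and key schema are hypothetical:

```ts
import { DynamoDBClient } from "@aws-sdk/client-dynamodb";
import { DynamoDBDocumentClient, GetCommand } from "@aws-sdk/lib-dynamodb";

const doc = DynamoDBDocumentClient.from(new DynamoDBClient({}));

// ConsistentRead: true routes the read through the leader storage node,
// so it reflects every write that was acknowledged before the read.
async function readLatest(id: string) {
  const { Item } = await doc.send(new GetCommand({
    TableName: "Records",   // hypothetical table name
    Key: { id },            // hypothetical key schema
    ConsistentRead: true,
  }));
  return Item;
}
```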

Firestore write stream exhausted when exceeding 15 writes per second

We recently started getting write stream exhausted errors:
@firebase/firestore: Firestore (8.10.0): FirebaseError: [code=resource-exhausted]: 8 RESOURCE_EXHAUSTED: Write stream exhausted maximum allowed queued writes.
@firebase/firestore: Firestore (8.10.0): Using maximum backoff delay to prevent overloading the backend.
We're sending a few million transactions, in batches of 500, to a Firestore collection every 15 seconds. If we run it any more often than every 15 seconds we get the error above, and if we exceed that too far, it eventually hard crashes. The advice we got from Firebase support was to write to a bunch of subcollections instead, but their limits clearly show 500 writes/second even when writing sequentially to a single collection. Our document IDs aren't quite sequential but are often close, like client-id:guid, where the client ID would often be the same for most of the writes.
Ideas on what we might be doing wrong and how to fix it? We've tried sending smaller, more frequent batches and larger, less frequent batches. We created a new project without any indexes to see if they were the problem. We've sent the requests from different clients, none of which are taxed for resources.
Cloud Firestore does not stop you from exceeding these thresholds, but doing so affects performance. The "batches of 500" figure comes from the hard limit of 500 operations per batched write, and the sustained write rate is governed by the soft limit of 500 writes per second.
The main issue here is hotspotting, since the document IDs are client-id:guid, where the client ID would often be the same for most of the writes. I'd advise following the documented best practices for document IDs; otherwise your requests will get queued up and eventually return RESOURCE_EXHAUSTED.
Our issue turned out to be the batch size. We started sending batches of 50 instead of 500 and were able to write about 400 documents/s. We still see the backoff errors, but we're not losing any data. I haven't had time to test parallel individual writes to see if we can gain a little more write speed, but I'll update this answer once I do.
Also, we tried to increase our batch write speeds with the 500/50/5 ramp-up method they recommend, and it still hit the resource-exhausted errors.
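A rough sketch of the smaller-batch approach, written against the v9 modular API (the error logs above show SDK 8.10.0, so treat this as illustrative); the transactions collection name is a placeholder:

```ts
import { getFirestore, writeBatch, doc } from "firebase/firestore";

const db = getFirestore();
const BATCH_SIZE = 50; // down from the 500-operation maximum

// Commit items in small sequential batches: each commit completes before
// the next begins, which throttles throughput and keeps the write stream
// from queueing up.
async function writeInBatches(items: Array<{ id: string; data: object }>) {
  for (let i = 0; i < items.length; i += BATCH_SIZE) {
    const batch = writeBatch(db);
    for (const item of items.slice(i, i + BATCH_SIZE)) {
      batch.set(doc(db, "transactions", item.id), item.data); // hypothetical path
    }
    await batch.commit();
  }
}
```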

Will Google Firestore always write to the server immediately in a mobile app with frequent writes?

Is it possible to limit the speed at which Google Firestore pushes writes made in an app to the online database?
I'm investigating the feasibility of using Firestore to store a data stream from an IoT device via a mobile device/bluetooth.
The main concern is battery cost. We receive a new data packet roughly every two minutes, and I'm concerned about the additional battery drain that an internet round-trip every two minutes, 24 hours a day, would cost. I would also want to limit updates to wifi connections only.
It's not important for the data to be available online in real time. However, it is possible for multiple sources to add to the same data stream in a two-way sync (i.e. the online DB and all devices end up with the merged data).
I'm currently handling that myself, but when I saw the offline capability of Firestore I hoped I could get all that functionality for free.
I know we can't directly control offline-mode in Firestore, but is there any way to prevent it from always and immediately pushing write changes online?
The only technical question I can see here has to do with how batch writes operate, and more importantly, cost. Simply put, a batch write of 100 writes costs the same as 100 individual writes; batching is not a way to avoid Firestore's write costs. The same goes for transactions, and for editing a document (that's a write too). If you really want to avoid those costs, you could store the values locally for thirty minutes and let the client send the aggregated data in a single document. You mentioned you need the data to be immediate, so I'm not sure that's an option for you; of course, that depends on what one interprets "immediate" as, relative to the timespan involved. In my opinion (I know those aren't really allowed here, but it's kind of part of the question), if the data is stored over months or years, 30 minutes is fairly immediate. Either way, batch writes aren't quite the solution I think you're looking for.
EDIT: You've updated your question, so I'll update my answer. You can build a local cache system and choose how and when to push updates however you wish; that's completely up to you and your own code. Writes aren't automatic. So if you only want to send a data packet every hour, send it at that time. You'll likely want to do this in a transaction if multiple devices write to the same stream, so one doesn't overwrite the other when they send at the same time. Other than that, I don't see Firestore being a problem for you.
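A minimal sketch of that idea, assuming the v9 modular web API and a hypothetical streams/device-1 document: buffer packets locally and flush them in one transactional write per interval.

```ts
import { getFirestore, doc, runTransaction } from "firebase/firestore";

const db = getFirestore();
const buffer: Array<{ t: number; value: number }> = [];
const FLUSH_MS = 30 * 60 * 1000; // flush every 30 minutes

// Called for each incoming packet (~every 2 minutes): cache locally, no network.
function onPacket(value: number) {
  buffer.push({ t: Date.now(), value });
}

// One round-trip per interval instead of one per packet. The transaction
// merges with readings that other devices may have pushed in the meantime.
setInterval(async () => {
  if (buffer.length === 0) return;
  const pending = buffer.splice(0);
  const ref = doc(db, "streams", "device-1"); // hypothetical path
  await runTransaction(db, async (tx) => {
    const snap = await tx.get(ref);
    const readings: unknown[] = snap.exists() ? snap.data().readings ?? [] : [];
    tx.set(ref, { readings: [...readings, ...pending] });
  });
}, FLUSH_MS);
```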

How to implement real time updates in ASP.NET

I have seen several websites that show you a real time update of what's going on in the database. An example could be
A stock ticker website that shows stock prices in real time
Showing data like "What other users are searching for currently.."
I'd assume this would involve some kind of polling mechanism that queries the database every few seconds and renders the result on a web page, but the thought scares me from a performance standpoint.
In an application I am working on, I need to display the real time status of an operation that a user has submitted. Users wait for the process to be completed. As and when an operation is completed, the status is updated by another process (could be a windows service). Should I query the database every second to get the updated status?
It's not necessarily done in the db. As you suggested, that's expensive. Although the db might be the backing store, a more efficient mechanism typically accompanies the polling operation, such as keeping the real-time status in memory in addition to eventually persisting it to the db. You can poll memory much more efficiently than running SELECT status FROM Table every second.
Also, as I mentioned in a comment, in some circumstances you can get a lot of mileage out of forging the appearance of status updates through animations and the like, employing estimation, and checking the data source less often.
Edit
(an optimization to use less db resources for real time)
Instead of polling the database per user to check job status every X seconds, slightly alter the behaviour. Each time a job is added to the database, read the database once to put metadata about all jobs in the cache. So, for example, the memory cache will reflect [user49 ... user3, user2, user1, userCurrent], i.e. 50 users' jobs at one job each. (Maybe I should have written it as [job49 ... job2, job1, job_current], but it's the same idea.)
Then individual users' web pages poll that cache, which is always kept current. In this example the db was read just 50 times into the cache (once per job submission). If those 50 users wait an average of 1 minute for job processing and poll for status every second, the user base polls the cache a total of 50 users x 60 secs = 3,000 times.
That's 50 database reads instead of 3,000 over a period of 50 minutes (on average, one per minute). The cache is always fresh and yet handles the load, which is much less scary than hitting the db every second for each user. You can store other stats and info in the cache to help with estimation and such. As long as a fresh cache provides more efficiency, it's a viable alternative to massive db hits.
Note: By cache I mean a global store like Application or 3rd party solution, not the ASP.NET page cache which goes out of scope quickly. Caching using ASP.NET's mechanisms might not suit your situation.
Another Note: The db will know when another job record is appended, no matter where it comes from, so a trigger can initiate the cache update.
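To make the idea concrete, here is a language-neutral sketch of such a global cache (written in TypeScript rather than ASP.NET; the names are made up):

```ts
// Hypothetical global job-status cache, refreshed once per job event
// instead of queried once per user poll.
const jobStatusCache = new Map<string, string>();

// Called whenever a job is added or its status changes (e.g. from a DB
// trigger or the processing service): one DB read per event.
async function refreshCache(
  readJobsFromDb: () => Promise<Array<{ id: string; status: string }>>,
) {
  for (const job of await readJobsFromDb()) {
    jobStatusCache.set(job.id, job.status);
  }
}

// Each user's page polls this every second; it never touches the DB.
function getStatus(jobId: string): string | undefined {
  return jobStatusCache.get(jobId);
}
```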
Despite a good database solution, so many users polling frequently is likely to create problems with web server connections and you might need a different solution at that level, depending on traffic.
Maybe keep a cache and work against it so you don't hit the database each time the data is modified, then flush to the database every few seconds or minutes, or whatever interval you like.
The problem touches many layers of a web application.
On the client, you either use an iframe whose content auto-refreshes every n seconds using the meta refresh tag (HTML), or JavaScript that is triggered by a timer and updates a named div (AJAX).
On the server, you have at least two places to cache your data:
One is in the Application object, where you keep a timestamp of the last update, and refresh the cached data as your refresh interval elapses.
If you want to present data from a database, keep aggregated values or cache relevant data for faster retrieval.
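On the client side, the AJAX variant could be as simple as the following sketch, assuming a hypothetical /job-status endpoint that serves the cached value:

```ts
// Poll the cache-backed endpoint (not the database) once per second
// and update a named div with the result.
const statusDiv = document.getElementById("status")!;

setInterval(async () => {
  const res = await fetch("/job-status?id=job-123"); // hypothetical URL
  statusDiv.textContent = await res.text();
}, 1000);
```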
