In the GEE documentation here there is limited information about the quota limits. Basically all it tells us is that there are separate limits for concurrent computation vs tile requests. I am hitting the 429 Too Many Requests often for computation requests.
To properly throttle my requests or add a queueing system, I would need to know details about the quota policy, e.g. "the quota is X concurrent computations" or "there's a rate limit of Y requests within a Z-minute window".
Does anyone have knowledge of the actual quota policy?
Earth Engine's request quota limits are a fairly complex topic (I know because I work on them) and there are not currently any documented guarantees about what is available. I recommend that you implement automatic backoff that adapts to observed 429s rather than attempting to hardcode a precisely matching policy.
Also note that fetching data in lots of small pieces is not the best use of the Earth Engine API — as much as possible, you should let Earth Engine do the computation, reducing large data sets into just the answers you actually need. This will reduce your need to worry about QPS as opposed to concurrency, and reduce the total amount of request processing and computation startup overhead.
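A minimal sketch of the adaptive backoff suggested above, assuming your client surfaces an HTTP 429 as an exception. The `TooManyRequests` class, the `AdaptiveBackoff` helper and all the default numbers are made up for illustration; they are not part of the Earth Engine API and do not reflect any actual quota value.

```python
import random
import time


class TooManyRequests(Exception):
    """Hypothetical stand-in for whatever your client raises on HTTP 429."""


class AdaptiveBackoff:
    """Grows the pause after each 429 and shrinks it again after successes."""

    def __init__(self, base=1.0, factor=2.0, max_delay=64.0, sleep=time.sleep):
        self.base = base            # first pause after a 429, in seconds
        self.factor = factor        # multiplier applied on each further 429
        self.max_delay = max_delay  # upper bound for the pause
        self.sleep = sleep          # injectable so the logic can be tested
        self.delay = 0.0            # current pause; 0 means "not throttled"

    def call(self, fn, max_attempts=8):
        for _ in range(max_attempts):
            if self.delay > 0:
                # Random jitter keeps parallel workers from retrying in lockstep.
                self.sleep(random.uniform(0, self.delay))
            try:
                result = fn()
            except TooManyRequests:
                # Throttled: raise the pause (at least `base`, at most `max_delay`).
                self.delay = min(self.max_delay,
                                 max(self.base, self.delay * self.factor))
                continue
            # Success: relax the pause step by step, down to zero.
            self.delay = self.delay / self.factor if self.delay > self.base else 0.0
            return result
        raise TooManyRequests(f"still throttled after {max_attempts} attempts")
```

Sharing one `AdaptiveBackoff` instance across all requests means the pause tracks what the service is currently willing to accept, rather than a hardcoded guess at the policy.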
Related
I developed an Android application where I use Firebase as my main service for storing data, authenticating users, storage, and more.
I recently went deeper into the service and wanted to see the API usage in my Google Cloud Platform.
In order to do so, I navigated to https://console.cloud.google.com/ to see what it has to show inside APIs and Services:
And by checking what might cause it I got:
Can someone please explain what is the meaning of "Latency" and what could be the reason that specifically this service has so much higher Latency value compared to the other API's?
Does this value have any impact on my application such as slowing the response or something else? If yes, are there any guidelines to lower this value?
Thank you
Latency is the "delay" until an operation starts. Cloud Functions, in particular, have to actually load and start a container (if they have paused), or at least load from memory (it depends on how often the function is called).
Can this affect your client? Holy heck, yes. But what you can do about it is a significant study in and of itself. For Cloud Functions, the biggest latency comes from starting the "container" (assuming a cold start, which your low request count suggests): it has to load and initialize modules before calling your code. The same issue applies here as for browser code: tight code, minimal module loads, etc.
Some latency is to be expected from Cloud Functions (I'm pretty sure a couple hundred ms is typical). Design your client UX accordingly. Cloud Functions' real power isn't instantaneous response; rather it's the compute power available IN PARALLEL with browser operations, and the ability to spin up multiple instances to respond to multiple browser sessions. Use it accordingly.
Listen and Write are long-lived streams. In this case an 8-minute latency should be interpreted as a connection that was open for 8 minutes. Individual queries or write operations on those streams will be faster (milliseconds).
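One way to act on the "minimal module loads" advice is to defer heavy imports until the code path that needs them actually runs. The sketch below is only an illustration in Python: the handler names are hypothetical, and `json` stands in for a genuinely expensive dependency.

```python
import functools
import importlib


@functools.lru_cache(maxsize=None)
def lazy_module(name):
    """Import a module on first use and reuse it for later calls in the same instance."""
    return importlib.import_module(name)


def handle_report(payload):
    # Only this code path pays the import cost of the heavy dependency.
    json = lazy_module("json")
    return json.dumps(payload, sort_keys=True)


def handle_ping(_payload):
    # Cheap path: nothing extra is loaded during a cold start.
    return "pong"
```

A warm instance keeps the cached module, so only the first request that reaches `handle_report` after a cold start pays for the import.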
I am using the Google Calendar API to preprocess events that are being added (adjust their content depending on certain values they may contain). This means that theoretically I need to update any number of events at any given time, depending on how many are created.
The Google Calendar API has usage quotas, especially one stating a maximum of 500 operations per 100 seconds.
To tackle this I am using a time-based trigger (every 2 minutes) that does up to 500 operations (and only updates sync tokens when all events are processed). The downside of this approach is that I have to run a check every 2 minutes, whether or not anything has actually changed.
I would like to replace the time-based trigger with a watch. I'm not sure though if there is any way to limit the amount of watch calls so that I can ensure the 100 seconds quota is not exceeded.
My research so far shows me that it cannot be done. I'm hoping I'm wrong. Any ideas on how this can be solved?
AFAIK, that is one of the best practices suggested by Google. Using watch and push notifications lets you eliminate the extra network and compute costs involved in polling resources to determine whether they have changed. Here are some tips from this blog on working within the quota:
Use push notifications instead of polling.
If you cannot avoid polling, make sure you only poll when necessary (for example, poll very seldom at night).
Use incremental synchronization with sync tokens for all collections instead of repeatedly retrieving all the entries.
Increase page size to retrieve more data at once by using the maxResults parameter.
Update events when they change, avoid re-creating all the events on every sync.
Use exponential backoff for error retries.
Also, if you cannot avoid exceeding your current limit, you can always request additional quota.
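Since push notifications can arrive at any rate, one option is to have the notification handler only enqueue work and let a worker drain the queue through a client-side limiter. Below is a minimal sliding-window sketch; the class is hypothetical, and the 500 operations per 100 seconds default is simply the figure quoted in the question, not a value read from the API.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allows at most `max_ops` operations in any `window`-second span."""

    def __init__(self, max_ops=500, window=100.0, clock=time.monotonic):
        self.max_ops = max_ops
        self.window = window
        self.clock = clock     # injectable so the limiter can be tested
        self.stamps = deque()  # timestamps of operations still inside the window

    def _expire(self, now):
        # Forget operations that have left the window.
        while self.stamps and now - self.stamps[0] >= self.window:
            self.stamps.popleft()

    def try_acquire(self):
        """Record one operation and return True, or return False if the window is full."""
        now = self.clock()
        self._expire(now)
        if len(self.stamps) < self.max_ops:
            self.stamps.append(now)
            return True
        return False

    def wait_time(self):
        """Seconds until the next operation would be allowed (0.0 if allowed now)."""
        now = self.clock()
        self._expire(now)
        if len(self.stamps) < self.max_ops:
            return 0.0
        return self.window - (now - self.stamps[0])
```

The worker would call `try_acquire()` before each Calendar API operation and sleep for `wait_time()` when it returns `False`, leaving the remaining events in the queue until capacity frees up.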
Background: I have a DynamoDB table which I interact with exclusively through a DAO class. This DAO class logs metrics on the number of insert/update/delete calls it makes to the boto library.
I noticed that the # of operations I logged in my code does correlate with the consumed read/write capacity shown in AWS monitoring, but the AWS consumption measurements are 2 - 15 times the # of operations I logged in my code.
I know for a fact that the only other process interacting with the table is my manual queries on the AWS UI (which is insignificant in capacity consumption). I also know that the size of each item is < 1 KB, which would mean each call should only consume 1 read.
I use strongly consistent reads, so I do not enjoy the 2x benefit of eventually consistent reads.
I am aware that boto auto-retries at most 10 times when throttled, but my throttling threshold is seldom reached, so retries are unlikely to explain the problem.
With that said, I wonder if anyone knows of any factor that may cause such a discrepancy between the # of calls to boto and the actual consumed capacity.
While I'm not sure of the support with the boto AWS SDK, in other languages it is possible to ask DynamoDB to return the capacity that was consumed as part of each request. It sounds like you are logging actual requests and not this metric from the API itself. The values returned by the API should accurately reflect what is consumed.
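As an illustration of logging the API's own figure next to the call count, here is a small sketch. It assumes responses shaped like those boto3 returns when a request is made with `ReturnConsumedCapacity="TOTAL"`, i.e. a `ConsumedCapacity` entry (a dict, or a list of dicts for batch operations) carrying `CapacityUnits`. The `CapacityLog` class itself is hypothetical.

```python
from collections import defaultdict


class CapacityLog:
    """Tracks calls and consumed capacity units per operation name."""

    def __init__(self):
        self.calls = defaultdict(int)
        self.units = defaultdict(float)

    def record(self, operation, response):
        self.calls[operation] += 1
        consumed = response.get("ConsumedCapacity") or []
        if isinstance(consumed, dict):
            # Single-table operations return one dict; batch operations a list.
            consumed = [consumed]
        for entry in consumed:
            self.units[operation] += entry.get("CapacityUnits", 0.0)

    def units_per_call(self, operation):
        calls = self.calls[operation]
        return self.units[operation] / calls if calls else 0.0
```

In the DAO this would look something like `log.record("query", table.query(..., ReturnConsumedCapacity="TOTAL"))`. An operation whose `units_per_call` is well above 1 is a likely source of the gap.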
One possible source for this discrepancy is if you are doing query/scan requests where you are performing server side filtering. DynamoDB will consume the capacity for all of the records scanned and not just those returned.
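The arithmetic behind the filtering point can be sketched roughly as follows. This assumes DynamoDB's documented read accounting as I understand it (the data evaluated by a request is summed and rounded up to 4 KB blocks, with eventually consistent reads costing half) and a one-block minimum per request. The function name and item sizes are made up for the example.

```python
import math

BLOCK_BYTES = 4096  # assumed size of one read capacity block (4 KB)


def read_units(evaluated_bytes, strongly_consistent=True):
    """Estimate read capacity units for one request from the bytes it evaluated."""
    blocks = max(1, math.ceil(evaluated_bytes / BLOCK_BYTES))
    return blocks * (1.0 if strongly_consistent else 0.5)


# A single GetItem on an item under 1 KB: one unit, matching the "1 call = 1 read" intuition.
single_get = read_units(900)

# One Query that evaluates 40 items of about 1,000 bytes each before a filter drops
# most of them is still billed for everything evaluated, not for what is returned.
filtered_query = read_units(40 * 1000)
```

Under these assumptions `single_get` is 1.0 while `filtered_query` is 10.0, so one logged call can account for ten consumed units.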
Another possible cause of a discrepancy is the actual metrics you are viewing in the AWS console. If you are viewing the CloudWatch metrics directly, make sure you are looking at the appropriate SUM or AVERAGE value, depending on which metric you are interested in. If you are viewing the metrics in the DynamoDB console, the interval you are looking at can dramatically affect the graph (e.g. short spikes that appear in a 5-minute interval would be smoothed out in a 1-hour interval).
I'm wondering if the use of ETags with the YouTube DATA API has any beneficial effect on quota usage? An answer to this question implies that they will not count against your quota, but I can't find anything like that in the official documentation.
Anyone have any insight?
As I understand the quota usage document, every API call (even an invalid one) incurs 1 point against your quota; additional quota charges are assessed based on the parts returned, the actions performed, etc. So it seems logical that using ETags will still use up 1 point per call, but won't incur any of the other read/write charges (since nothing else is returned) when the resource hasn't been modified. Of course, if the resource has been modified, you'll get the full resource back and incur the full quota charges again.
You're right, though, that nowhere is this explicitly stated.
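For reference, the conditional-request mechanics look roughly like the sketch below: send the stored ETag in an `If-None-Match` header and reuse the cached body on a 304 Not Modified. The `fetch` callable is a hypothetical transport hook (it would wrap whatever HTTP client you use), and the sketch makes no claim about how the request is charged.

```python
class ETagCache:
    """Caches response bodies by URL and revalidates them with If-None-Match."""

    def __init__(self, fetch):
        # fetch(url, headers) -> (status, etag, body); hypothetical transport hook.
        self.fetch = fetch
        self.entries = {}  # url -> (etag, body)

    def get(self, url):
        headers = {}
        cached = self.entries.get(url)
        if cached:
            # Ask the server to send the body only if it differs from our copy.
            headers["If-None-Match"] = cached[0]
        status, etag, body = self.fetch(url, headers)
        if status == 304 and cached:
            return cached[1]  # unchanged: reuse the cached body
        if status == 200:
            if etag:
                self.entries[url] = (etag, body)
            return body
        raise RuntimeError(f"unexpected HTTP status {status}")
```

Note that a request is still sent on every `get`, which is consistent with the reasoning above that each call would still count for at least the base charge.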
I’m after some thoughts on how people go about calculating database load for the purposes of capacity planning. I haven’t put this on Server Fault because the question is related to measuring just the application rather than defining the infrastructure. In this case, it’s someone else’s job to worry about that bit!
I’m aware there are a huge number of variables here but I’m interested in how others go about getting a sense of rough order of magnitude. This is simply a costing exercise early in a project lifecycle before any specific design has been created so not a lot of info to go on at this stage.
The question I’ve had put forward from the infrastructure folks is “how many simultaneous users”. Let’s not debate the rationale of seeking only this one figure; it’s just what’s been asked for in this case!
This is a web front end, SQL Server backend with a fairly fixed, easily quantifiable audience. To nail this down to actual simultaneous requests in a very rough fashion, the way I see it, it comes down to increasingly granular units of measurement:
Total audience
Simultaneous sessions
Simultaneous requests
Simultaneous DB queries
This doesn’t account for factors such as web app caching, partial page requests, record volume etc and there’s some creative license needed to define frequency of requests per user and number of DB hits and execution time but it seems like a reasonable starting point. I’m also conscious of the need to scale for peak load but that’s something else that can be plugged into the simultaneous sessions if required.
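A rough sketch of that funnel, for illustration only. It uses Little's law (items in flight = arrival rate x time each one takes), which is my choice of technique rather than something stated above, and every parameter name and input value is an assumption to be replaced with your own estimates.

```python
def estimate_load(total_audience, active_share, requests_per_user_per_min,
                  request_seconds, queries_per_request, query_seconds):
    """Walk from total audience down to simultaneous DB queries."""
    sessions = total_audience * active_share
    request_rate = sessions * requests_per_user_per_min / 60.0  # requests per second
    # Little's law: concurrent items = arrival rate * time each item is in flight.
    simultaneous_requests = request_rate * request_seconds
    simultaneous_queries = request_rate * queries_per_request * query_seconds
    return {
        "simultaneous_sessions": sessions,
        "requests_per_second": request_rate,
        "simultaneous_requests": simultaneous_requests,
        "simultaneous_db_queries": simultaneous_queries,
    }


# Made-up inputs: 6,000 users, 10% active at once, 2 requests per user per minute,
# 0.25 s per request, 3 queries per request at 0.05 s each.
estimate = estimate_load(6000, 0.1, 2, 0.25, 3, 0.05)
```

With these inputs the funnel narrows from 6,000 users to about 600 sessions, 20 requests per second, 5 simultaneous requests and 3 simultaneous DB queries. Peak load can be modelled by raising `active_share`.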
This is admittedly very basic and I’m sure there’s more comprehensive guidance out there. If anyone can share their approach to this exercise or point me towards other resources that might make the process a little less ad hoc, that would be great!
I will try, but obviously without knowing the details it is quite difficult to give precise advice.
First of all, the infrastructure guys might have asked this question from a licensing perspective (SQL Server can be licensed per user or per CPU).
Now back to your question. "Total audience" is important if you can predict/work out this number. This can give you the worst case scenario when all users hit the database at once (e.g. 9am when everyone logs in).
If you store session information you would probably have at least 2 connections per user (1 session + 1 main DB). But this number can be (sometimes noticeably) reduced by connection pooling (depends on how you connect to the database).
Use a worst-case scenario: 50 system connections + 2 * number of users.
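Written out as a tiny helper, the worst-case rule of thumb above looks like this. The function name is made up, and the defaults are just the figures from this answer.

```python
def worst_case_connections(users, per_user=2, system=50):
    """Worst-case connection count: system connections plus a fixed number per user."""
    return system + per_user * users
```

For an audience of 1,000 users this gives 2,050 connections before any reduction from connection pooling.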
Simultaneous requests/queries depend on the nature of the application. Need more details.
More simultaneous requests (to your front end) will not necessarily translate to more requests on the back end.
Having said all of that, for costing purposes you need to focus on the bigger picture.
A SQL Server license (if my memory serves me right) will cost ~128K AUD (dual Xeon). Hot/warm standby? Double the cost.
Disk storage - how much storage will you need? Disks are relatively cheap but if you are going to use SAN the cost might become noticeable. Also - the more disks the better from the performance perspective.