I have an application that's been running since 2015. It both reads and writes to approx 16 calendars via a service account, using the Google node.js library (calendar v3 API). We also have G Suite for Education.
The general process is:
Every 30 seconds it caches all calendar data via a list operation
Periodically a student will request an appointment "slot", it first checks to see if the slot is still open (via a list call) then an insert.
That's all it does. It's been running fine until the past few days, where API insert calls started failing:
{
"code": 403,
"errors": [{
"domain": "usageLimits",
"reason": "quotaExceeded",
"message": "Calendar usage limits exceeded."
}]
}
This isn't all that special - the documentation has three "solutions":
Read more on the Calendar usage limits in the G Suite Administrator
help.
If one user is making a lot of requests on behalf of many users
of a G Suite domain, consider using a Service Account with authority
delegation (setting the quotaUser parameter).
Use exponential backoff.
I'm not exceeding any of the stated limits as far as I can tell.
While I'm using a service account, it isn't making a request on behalf of a user. The service account has write access to the calendar and adds the user as an attendee
Finally, I do not think exponential backoff will help, although I do not have this implemented. The time between a request to insert and the next insert call is measured in seconds, not milliseconds. Additionally, just running calls directly on the command line with a simple script produce the same problem.
Some stats:
2015 - 2,466 inserts, 186 errors
2016 - 25,747 inserts, 237 errors
2017 - 42,815 inserts, 225 errors
2018 - 41,390 inserts, 1,074 errors (990 of which are in the past 3 days)
I have updated the code over the years, but it has remained largely untouched this term.
At this point I'm unsure what to do - there is no channel to reach Google, and while I have not implemented a backoff strategy, the way timings work with this application, subsequent calls are delayed by seconds, and processed in a queue that sequentially processes requests. The only concurrent requests would be list operations.
Related
I have been testing an application I am developing using the webapi, and I have started to get the following error message:
GCSP: Hello error: [1010] The Gracenote ODP 15822 [Name: *registered-name*]
[App: *registered-app*] application has reached is daily lookup limit with
Gracenote. You may try again tomorrow or may contact Gracenote support at
support#gracenote.com.
[Gracenote Error: <ERR>]
The application I am developing is looking up track details and cover artwork for songs being streamed from Mood/Pandora for Business service. It is making approximately one call for each song, so something like 15 searches per hour on average. I may have done more during testing, but not a lot more.
Once completed, I would expect this service to make fewer than 500 searches per day per location, and for it initially to be used at 4 locations.
What are the lookup limits I am running into?
What are my options to get a higher lookup limit?
Thanks
I am issuing GET requests as defined in the Google Measurement Protocol from our server to record offline conversions.
The following test request (tracking id obfuscated)
https://www.google-analytics.com/debug/collect?v=1&tid=xx&cid=111300&t=transaction&ti=1500000&tr=100
validates against the /debug Endpoint (using Postman)
{
"hitParsingResult": [ {
"valid": true,
"parserMessage": [ ],
"hit": "/debug/collect?v=1\u0026tid=xxu0026cid=111300\u0026t=transaction\u0026ti=1500000\u0026tr=100"
} ],
"parserMessage": [ {
"messageType": "INFO",
"description": "Found 1 hit in the request."
} ]
}
And shows up in the Sales Performance report in Google Analytics when submitted to the production endpoint using PostMan (i.e. without /debug/)
However I can't see any of the actual production data, submitted from the server in the Sales Performance report.
Any ideas?
This is kind of tricky, yes the transaction is valid,but the debuger only check the syntaxis, but your Google Analytics configuration has not enabled that type of hit (t=transaction, That is only for Standart E-commerce). In my test account, I run that hit and this work. In your case, if your account is enhanced e-commerce is being filtered on the processing.
So here is a screeshot of you hit on my test view running on classic ecommerce.
So you have 2 options to fix this, downgrade you e-commerce (not recommendable in all the cases)
Downgrade
If you do want to use that syntaxis, you have to uncheck the enhance e-commerce and that should work in your case. With your hit and with my configuration this works (a new account w/no filter and standard e-commerce enabled)
Attach information
The enhanced ecommerce was designed to be send attached with other hits (on event or pageview mainly).
For example, this hit is a no interaction event and it's valid for receive transaction and the purchase. Use no interaction events avoid fake sessions and allows to you import the data of the transaction without alter metrics as bounce rate.
https://www.google-analytics.com/collect?v=1&t=event&ni=1&ec=Ecommerce&ea=Transaction&cid=2.2&tid=UA-xxxxx-1&ti=T12345&tr=35.43&pa=purchase
There is a data latency with Google analytics. Officially its 24 - 72 hours before data shows up in the standard reports.
From my own experience I can say depending upon how much data there is in your account you can see it as early as 12 - 24 hours.
If the debug end point says its a valid it you can assume its working fine.
I've been using Firebase for one of my games and while it's been an extremely useful service and tool I happened to come across an issue that I didn't address during development which has allowed users to cheat.
As you would expect with Firebase that introduces most of the logic on the client side, I have client-side authorization in place, which was a mistake from the start to begin with. The issue that I'm running into is the following. Please note that this is not my structure, just an example to work from.
{
"user":
{
currently_training: false,
current_units: 673,
unit_cap: 1000
}
}
The client would take this and tell the user, "Okay, you're only allowed to train (1000 - 673) = 327 units. However, by bypassing this on the client side to change the unit_cap to lets say 10,000 the user can now send a request to the database to create 9,327 units, which will result in his units exceeding his unit cap.
How would I go about validating a query, such as..
- User requests to train 412 units.
- Insert is not executed as the amount of requested units + current_units > unit_cap
- Error is sent back to client to be handled.
OR
- User requests to train 300 units.
- 300 + current_units <= unit_cap
- Insert executes successfully.
OR
- User requests to train 300 units.
- 300 + current_units <= unit_cap
- currently_training is true, so the Insert fails with error.
I'm fairly worried that creating a middleware server is going to be required, which is the reason I went with firebase to begin with. (So I wouldn't have to worry about the scalability of my own servers)
In short, we are sometimes seeing that a small number of Cloud Bigtable queries fail repeatedly (for 10s or even 100s of times in a row) with the error rpc error: code = 13 desc = "server closed the stream without sending trailers" until (usually) the query finally works.
In detail, our setup is as follows:
We are running a collection (< 10) of Go services on Google Compute Engine. Each service leases tasks from a pair of PULL task queues. Each task contains an ID of a bigtable row. The task handler executes the following query:
row, err := tbl.ReadRow(ctx, <my-row-id>,
bigtable.RowFilter(bigtable.ChainFilters(
bigtable.FamilyFilter(<my-column-family>),
bigtable.LatestNFilter(1))))
If the query fails then the task handler simply returns. Since we lease tasks with a lease time between 10 and 15 minutes, a little while later the lease will expire on that task, it will be lease again, and we'll retry. The tasks have a max retry of 1000 so they can be retried many times over a long period. In a small number of cases, a particular task will fail with the grpc error above. The task will typically fail with this same error every time it runs for hours or days on end, before (seemingly out of the blue) eventually succeeding (or the task runs out of retries and dies).
Since this often takes so long, it seems unrelated to server load. For example right now on a Sunday morning, these servers are very lightly loaded, and yet I see plenty of these errors when I tail the logs. From this answer, I had originally thought that this might be due to trying to query for a large amount of data, perhaps near the max limit that cloud bigtable will support. However I now see that this is not the case; I can find many examples where tasks that have failed many times finally succeed and report only a small amount of data (e.g. <1 MB) was retrieved.
What else should I be looking at here?
edit: From further testing I now know that this is completely machine (client) independent. If I tail the log on one of the task leasing machines, wait for a "server closed the stream without sending trailers" error, and then try a one-off ReadRow query to the same rowId from another, unrelated, totally unused machine, I get the same error repeatedly.
This error is typically caused by having more than 256MB of data in your reply.
However, there is currently a bug in our server side error handling code that allows some invalid characters in HTTP/2 trailers which is not allowed by the spec. This means that some error messages that have invalid characters will be seen as this kind of error. This should be fixed early next year.
I am working on an asp.net mvc-5 web application, and I am facing a problem in using Hangfire tool to run long running background jobs. the problem is that if the job execution exceed 30 minutes, then hangfire will automatically initiate another job, so I will end up having two similar jobs running at the same time.
Now I have the following:-
Asp.net mvc-5
IIS-8
Hangfire 1.4.6
Windows server 2012
Now I have defined a hangfire recurring job to run at 17:00 each day. The background job mainly scan our network for servers and vms and update the DB, and the recurring job will send an email after completing the execution.
The recurring job used to work well when its execution was less than 30 minutes. But today as our system grows, the recurring job completed after 40 minutes instead of 22-25 minutes as it used to be. and I received 2 emails instead of one email (and the time between the emails was around 30 minutes). Now I re-run the job manually and I have noted that that the problem is as follow:-
"when the recurring job reaches 30 minutes of continuous execution, a
new instance of the recurring job will start, so I will have two
instances instead of one running at the same time, so that why I received 2 emails."
Now if the recurring job takes less than 30 minutes (for example 29 minute) I will not face any problem, but if the recurring job execution exceeds 30 minutes then for a reason or another hangfire will initiate a new job.
although when I access the hangfire dashboard during the execution of the job, I can find that there is only one active job, when I monitor our DB I can see from the sql profiler that there are two jobs accessing the DB. this happens after 30 minutes from the beginning of the recurring job (at 17:30 in our case), and that why I received 2 emails which mean 2 recurring jobs were running in the background instead of one.
So can anyone advice on this please, how I can avoid hangfire from automatically initiating a new recurring job if the current recurring job execution exceeds 30 minutes?
Thanks
Did you look at InvisibilityTimeout setting from the Hangfire docs?
Default SQL Server job storage implementation uses a regular table as
a job queue. To be sure that a job will not be lost in case of
unexpected process termination, it is deleted only from a queue only
upon a successful completion.
To make it invisible from other workers, the UPDATE statement with
OUTPUT clause is used to fetch a queued job and update the FetchedAt
value (that signals for other workers that it was fetched) in an
atomic way. Other workers see the fetched timestamp and ignore a job.
But to handle the process termination, they will ignore a job only
during a specified amount of time (defaults to 30 minutes).
Although this mechanism ensures that every job will be processed,
sometimes it may cause either long retry latency or lead to multiple
job execution. Consider the following scenario:
Worker A fetched a job (runs for a hour) and started it at 12:00.
Worker B fetched the same job at 12:30, because the default invisibility timeout was expired.
Worker C (did not fetch) the same job at 13:00, because (it
will be deleted after successful performance.)
If you are using cancellation tokens, it will be set for Worker A at
12:30, and at 13:00 for Worker B. This may lead to the fact that your
long-running job will never be executed. If you aren’t using
cancellation tokens, it will be concurrently executed by WorkerA and
Worker B (since 12:30), but Worker C will not fetch it, because it
will be deleted after successful performance.
So, if you have long-running jobs, it is better to configure the
invisibility timeout interval:
var options = new SqlServerStorageOptions
{
InvisibilityTimeout = TimeSpan.FromMinutes(30) // default value
};
GlobalConfiguration.Configuration.UseSqlServerStorage("<name or connection string>", options);
As of Hangfire 1.5 this option is now Obsolete. Jobs that are being worked on are invisible to other workers.
Say goodbye to confusing invisibility timeout with unexpected
background job retries after 30 minutes (by default) when using SQL
Server. New Hangfire.SqlServer implementation uses plain old
transactions to fetch background jobs and hide them from other
workers.
Even after ungraceful shutdown, the job will be available for other
workers instantly, without any delays.
I was having trouble finding documentation on how to do this properly for a Postgresql database, every example I was see is using sqlserver, I found how the invisibility timeout was a property inside the PostgreSqlStorageOptions object, I found this here : https://github.com/frankhommers/Hangfire.PostgreSql/blob/master/src/Hangfire.PostgreSql/PostgreSqlStorageOptions.cs#L36. Luckily through trial and error I was able to figure out that the UsePostgreSqlStorage has an overload to accept this object. For .Net Core 2.0 when you are setting up the hangfire postgresql DB in the ConfigureServices method in the startup class add this(the default timeout is set to 30 mins):
services.AddHangfire(config =>
config.UsePostgreSqlStorage(Configuration.GetConnectionString("Hangfire1ConnectionString"), new PostgreSqlStorageOptions {
InvisibilityTimeout = TimeSpan.FromMinutes(720)
}));
I had this problem when using Hangfire.MemoryStorage as the storage provider. With memory storage you need to set the FetchNextJobTimeout in the MemoryStorageOptions, otherwise by default jobs will timeout after 30 minutes and a new job will be executed.
var options = new MemoryStorageOptions
{
FetchNextJobTimeout = TimeSpan.FromDays(1)
};
GlobalConfiguration.Configuration.UseMemoryStorage(options);
Just would like to point out that even though, it is stated the thing below:
As of Hangfire 1.5 this option is now Obsolete. Jobs that are being worked on are invisible to other workers.
Say goodbye to confusing invisibility timeout with unexpected background job retries after 30 minutes (by default) when using SQL Server. New Hangfire.SqlServer implementation uses plain old transactions to fetch background jobs and hide them from other workers.
Even after ungraceful shutdown, the job will be available for other workers instantly, without any delays.
It seems that for many people using MySQL, PostgreSQL, MongoDB, InvisibilityTimeout is still the way to go: https://github.com/HangfireIO/Hangfire/issues/1197