We are building service that will synchronize with user Google Tasks data, so if user add/edit/delete task in GTask, so it will be added/edited/deleted in our service.
And there is a big problem with synchronization: as I see GTasks API does not provide any onUpdate/onChange event listeners. I mean, the perfect solution can be if there will be Google Tasks API method that can be used to set some callback URL that will be requested when user add/edit/delete tasks.
But I can't find such method in Google Tasks API, so now there is only one very bad way to sync with Google Tasks API - request all users tasks and compare them with service tasks. This is very bad way to sync, because if we have 10k users and want their tasks list be synchronizaed up to 1 minute, so we will need to make > 10k GTasks API requests per minute :(
I hope that I'm wrong and there is some way to set onChange/onUpdate callback for user tasks. Or may be there is some another way to receive actual notification of user GTasks changes(by email & etc).
Does anybody know it?
Thank you.
You could use updatedMin parameter to only get Tasks that have been updated since a given timestamp, as described in the documentation.
You should be able to rely on ETag and If-None-Match headers when querying user tasks lists to get a 304 Not Modified if the no tasks in the list have changed. (Not that should also works when polling individual tasks)
This way you can effectively poll for the tasks that have changed since the last time you synced.
Related
My application needs front-end searching. It searches an external API, for which I'm limited to a few calls per second.
So, I wanted to keep ALL queries, related to this external API, on the same Cloud Task queue, so I could guarantee the amount of calls per second.
That means the user would have to wait for second or two, most likely, when searching.
However, using Google's const { CloudTasksClient } = require('#google-cloud/tasks') library, I can create a task but when I go to check it's status using .getTask() it says:
The task no longer exists, though a task with this name existed recently.
Is there any way to poll a task until it's complete and retrieve response data? Or any other recommended methods for this? Thanks in advance.
No. GCP Cloud Tasks provides no way to gather information on the body of requests that successfully completed.
(Which is a shame, because it seems quite natural. I just wrote an email to my own GCP account rep asking about this possible feature. If I get an update I'll put it here.)
So suppose I have an API to create a cloud instance asynchronously. So after I made an API call it will just return the success response, but the cloud instance will not been initialized yet. It will take 1-2 minutes to create cloud instance and after that it will save the cloud instance information (ex. ip, hostname, os) to db which mean I have to wait 1-2 minutes so I can fetch the data again to show cloud information. At first I try making a loading component, but the problem is that I don't know when the cloud instance is initialized (each instance has different time duration for creating). I'm considering using websocket or using cron or should I redesign my API? Has anyone design asynchronous system before how do you handle such a case.
If the API that you call gives you no information on when it's done with its asynchronous processing, it seems to me that you'll have to check at intervals until you find that the resource is ready; i.e. to poll it.
This seems to me to roughly fit the description and intent of the Polling Consumer pattern. In general, for asynchronous systems design, I can't recommend Enterprise Integration Patterns enough.
As other noted you can either have a notification channel using WebSockets or poll the backend. Personally I'd probably go with the latter for this case and would actually create several APIs, one for initiating the work and get back a URL with "job id" in it where the status of the job can be polled.
RESTfully that would look something like POST /instances to initiate a job GET /instances see all the instances that are running/created/stopped and GET /instances/<id> to see the status of a current instance (initiating , failed , running or whatever)
WebSockets would work, but might be an overkill for this use case. I would probably display a status of 'creating' or something similar after receiving the success response from the API call, and then start polling the API to see if the creation process has finished.
While using google cloudVision services we have a need to track individual requests made and the response. Is there a way I can generate an ID as a memo field to every request I make to the CloudVision API & read it in the usage logs/ billing statement?
I can get close to this with timestamps- but with concurrent usages and latencies, mapping can get challenging. Along with reconciliation, this will also help track error codes and reasons
I am wondering whether you can log every request which has error yourself?
You may sending request in parallel in multiple thread, but in each thread you can just write the request and its result to a GCS bucket to record the log.
Can anyone clarify what is the purpose of using queue ?
What i understand is that a webhook is just a URL , you do a POST request to that URL and then do some stuff based on the body/data of the request. So why i need to queue the data and store it in a database then loop through the database again and perform the stuff.
The short answer is, you don't have to use a queue. A webhook is just an HTTP request (typically POST) notifying your application of some type of event. The reason you might want to consider a queue is because of typical issues you could run into.
One of these is because of response time back to the webhook requester (source). Many sources want a response (HTTP status 200) as quickly as possible so they can dequeue the request from their webhook system. If processing the webhook takes some time, a source will typically advise you to use a queue to defer the lengthier process asynchronous to the 200 response to the webhook.
Another possible reason could be for removing duplicate requests. There is no guarantee with webhooks that you will only receive a single request per event. A queue can be used to de-dupe these requests.
I would recommend you stick with a simple request handler if possible, then evolve a more sophisticated handler if you run into issues. Consider queues as a potential design approach if you run into issues like those above.
You need some way to prevent a conflict if the webhook is invoked multiple times very close together.
It doesn't necessarily have to be a queue, though. If the webhook performs database queries and updates, you can use a transaction to ensure that this is atomic for each invocation.
In this respect, it's little different from any other web utility. You should do something similar in scripts that process web forms.
I am thinking on the following approach but not sure if its the best way out:
step1 (server side): A TaskMangaer class creates a new thread and start a task.
step2 (server side): Store taskManager object reference into the cache for future reference.
step3 (client side): Use periodic Ajax call to check the status of the task.
Basically the intention is to have a framework to run a background task (5mins approx) and provide regular feedback on the web UI for the percentage of task completed.
Is there a neat way around this or any existing asp.net API that will be helpful ?
Edit 1#: I want to run the task in-proc with the app.
Edit 2#: Looks like badge implementation on stack overflow is also using the cache to track background task. https://blog.stackoverflow.com/2008/07/easy-background-tasks-in-aspnet/
I think the problem with storing the result in the cache is that ASP.NET might scavenge that cache entry for other purposes (ie if its short on memory, if its grumpy, etc). Something that is served from the cache should be something you can recreate on demand if its not found in the cache, the ASP.NET runtime is free to dump cache entries whenever it feels like it.
The usage of the cache in the badge discussion seems fundamentally different, in that case the task was shortlived. The cache was just being used as a hacky timer to fire off the task periodically.
Can you confirm this is a task that is going to take 5 minutes, and require its own thread that whole time? This is a performance concern in itself, you will only be able to support a limited number of such requests if each requires its own thread for so long. Only if thats acceptable would I let the task camp a thread for so long.
If its ok for these tasks to camp a thread, then I'd just go ahead and store the result in a dictionary global to the process. The key of the dictionary would correlate to the client request / AJAX callback series. The key should incorporate the user ID as well if security is at all important.
If you need to scale up to many users, then I think you need to break the task down into asynchronous steps, and in that case I'd probably use a DB table to store the results (again keyed per request / user).
Microsoft Message Queuing was built for scenarios like the one you try to solve:
http://www.microsoft.com/windowsserver2003/technologies/msmq/default.mspx
Windows Communicatio Foundation also has message queuing support.
Hope this helps.
Thomas
One approach for doing this is to use application state. When you spawn a worker thread, pass it a request ID that you generate, and return this to the client. The client will then pass that request ID back to the server in its AJAX calls. The server will then fetch the status using the request ID from application state. (The worker thread would be updating the application state based on its status).
I saw an approach to a similar problem somewhere. The solution was something like:
Start the background task on server.Return immediately with a url to the result.
Until the result is posted, this url will return 404.
The client checks periodically for this url.
The client reads the results when
they are finally posted.
The url will be something like http://mysite/myresults/cffc6c30-d1c2-11dd-ad8b-0800200c9a66.
The best document format is probably JSON.
If feedback on progress is important, modify the document to also contain status (inprogress/finish) and progress (42 %).