I've been writing hundreds of reports over the last few months. I have built a meta-language that schedules them on fleet using service and timer unit files. These jobs run from 10 seconds to 10 minutes. Some are one-time and others repeat daily, weekly, or monthly. Is there an idiomatic Deis way to implement that type of scheduling?
Deis supports cron semantics via the "clock process" model. The general idea is that you create a long-running process that implements its own cron, and you keep exactly one running at all times with deis scale clock=1.
For details see: https://devcenter.heroku.com/articles/scheduled-jobs-custom-clock-processes#custom-clock-processes
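The clock-process idea can be sketched in plain Python: a minimal, Deis-agnostic loop that wakes once a minute and dispatches whatever is due. The job table and the print-as-dispatch stand-in below are purely illustrative.

```python
import time
from datetime import datetime, timezone

# Hypothetical job table: name -> predicate deciding whether the job is due
# at a given minute. A real clock process would load this from configuration.
JOBS = {
    "weekly_report": lambda now: now.weekday() == 0 and now.hour == 9 and now.minute == 0,
    "daily_cleanup": lambda now: now.hour == 3 and now.minute == 30,
}

def due_jobs(now):
    """Return the names of all jobs that should fire at this minute."""
    return [name for name, is_due in JOBS.items() if is_due(now)]

def clock_loop():
    """Wake once per minute and dispatch any due jobs."""
    while True:
        now = datetime.now(timezone.utc)
        for name in due_jobs(now):
            print(f"dispatching {name}")  # in practice, enqueue real work here
        # sleep to roughly the top of the next minute
        time.sleep(60 - now.second)
```

The dispatch step would typically push work onto a queue rather than run it inline, so a slow job never delays the clock.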
I need to run a function in Firebase every 10 seconds (I call an external API and process data).
With a normal cron I can't, because cron is limited to at most once a minute. Using setTimeout is also not convenient, since Cloud Functions bills per second of use.
My only idea so far is to use Cloud Tasks, but I don't know whether it is the right tool for my purpose.
On Google Cloud Platform, Cloud Tasks is almost certainly the best (and arguably only) way to get this job done without using other products in ways they weren't intended. Cloud Functions alone is not a good fit, for the reasons you mentioned. With Cloud Tasks, you will need to have each task schedule the next task upon completion, as tasks do not repeat automatically like cron jobs.
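The self-rescheduling pattern might look like this in Python. Only the schedule-time arithmetic is executable here; the google-cloud-tasks enqueue is sketched in comments, and the project, region, and queue names there are placeholders.

```python
from datetime import datetime, timedelta, timezone

INTERVAL = timedelta(seconds=10)

def next_schedule_time(finished_at, interval=INTERVAL):
    """When the follow-up task should run, given when this run finished."""
    return finished_at + interval

def handle_task(now=None):
    """Task handler body: do the work, then enqueue the next task."""
    now = now or datetime.now(timezone.utc)
    # ... call the external API and process the data here ...
    run_at = next_schedule_time(now)
    # With the google-cloud-tasks client (names below are placeholders):
    #   client = tasks_v2.CloudTasksClient()
    #   parent = client.queue_path("my-project", "us-central1", "poll-queue")
    #   client.create_task(parent=parent,
    #                      task={"http_request": {...}, "schedule_time": run_at})
    return run_at
```

Scheduling from the completion time (rather than the start time) avoids drift compounding if a run overshoots its slot; use the start time instead if you want a fixed cadence.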
We are considering using Airflow for a project that needs to make thousands of calls a day to external APIs in order to download external data, where each call might take many minutes.
One option we are considering is to create a task for each distinct API call, but this would lead to thousands of tasks. Rendering all those tasks in the UI is going to be challenging, and we are also worried that the scheduler may struggle with so many tasks.
The other option is to have just a few parallel long-running tasks and implement our own scheduling within them. We could add custom code to a PythonOperator that queries the database and decides which API to call next.
Perhaps Airflow is not well suited to such a use case, and it would be easier and better to implement such a system outside of Airflow? Does anyone have experience running thousands of tasks in Airflow who can shed some light on the pros and cons of the options above?
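The second option, a few long-running tasks with their own internal scheduling, can be sketched as a small worker pool draining a queue of pending calls. The call_api callable and the worker count below are assumptions; inside Airflow this would live in a PythonOperator's callable.

```python
import queue
import threading

def run_calls(pending_calls, call_api, workers=4):
    """Process many API calls with a fixed number of parallel workers,
    instead of one Airflow task per call."""
    q = queue.Queue()
    for call in pending_calls:
        q.put(call)

    results = []
    lock = threading.Lock()

    def worker():
        while True:
            try:
                call = q.get_nowait()
            except queue.Empty:
                return  # queue drained; worker exits
            result = call_api(call)  # may take many minutes
            with lock:
                results.append(result)

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

In a real deployment, pending_calls would come from the database query the question describes, and each result would be written back rather than collected in memory.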
One task per call would kill Airflow, as it still needs to check the status of each task at every heartbeat, even if the processing of the task (the worker) is separate, e.g. on K8s.
Not sure where you plan on running Airflow, but if it's on GCP and a download takes no longer than 9 minutes, you could use the following:
task (PythonOperator) -> Pub/Sub -> Cloud Function (to retrieve) -> Pub/Sub -> Cloud Function (to save the result to the backend).
The latter function may not be required, but we (re)use a generic and simple "BigQuery streamer".
Finally, in a downstream Airflow task (a PythonSensor) you query the number of results in the backend and compare it with the number of requests published.
We do this quite efficiently for 100K API calls to a third-party system we host on GCP, as we maximize parallelism. The nice thing about Cloud Functions is that you can tune concurrency instead of provisioning a VM or container to run the tasks.
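The final sensor step could be a PythonSensor whose poke callable compares the two counts. A minimal sketch, where count_results stands in for a hypothetical backend query (e.g. a BigQuery row count):

```python
def make_poke(published_count, count_results):
    """Build a PythonSensor-style poke callable that returns True once the
    backend holds at least as many results as requests were published."""
    def poke():
        return count_results() >= published_count
    return poke
```

In the DAG, this would be wired as PythonSensor(python_callable=make_poke(n, query_fn), ...), with a generous poke_interval since the Cloud Functions may run for several minutes.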
I need to schedule certain daily tasks, for example payment notifications and payments. My question is whether I can achieve this with Cloud Functions.
An example of a task I need is to make a daily payment, Monday to Friday, for 8 months.
I can trigger this with a Cloud Function, and for the payment schedule I want to use node-schedule. The main reason is that I use Cloud Firestore, so combining Cloud Functions with the database fits the project well.
That's why I'm opening this post: to find out whether it is possible for a Cloud Function to load these tasks in memory and execute them when node-schedule fires them.
Thank you.
Cloud Functions have a maximum execution time of 9 minutes, and you are billed for CPU and memory usage the entire time an instance is running. Using an in-process scheduler like node-schedule isn't possible for long time periods and isn't generally recommended even for shorter ones due to the cost involved.
Instead, you can use scheduled functions to define an arbitrary cron-like repeating job that executes a function on a set schedule. It is entirely possible to implement a "Mon-Fri daily payment" as such a scheduled function.
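For instance, the scheduled function could fire on the cron expression 0 9 * * 1-5 (weekdays at 09:00) and gate the payment on the 8-month window inside the function body. A rough sketch of that gating logic; the start date and the 30-days-per-month approximation are assumptions:

```python
from datetime import date, timedelta

START = date(2024, 1, 1)                # assumed start of the payment window
END = START + timedelta(days=8 * 30)    # ~8 months; use dateutil for exact months

def should_pay(today):
    """True on weekdays (Mon-Fri) that fall inside the 8-month window."""
    return START <= today <= END and today.weekday() < 5

def scheduled_handler(today=None):
    """Body a scheduled function might run on each invocation."""
    today = today or date.today()
    if should_pay(today):
        pass  # trigger the payment here, e.g. write a record to Firestore
    return should_pay(today)
```

With the cron expression already restricting runs to weekdays, the weekday check is redundant but harmless; keeping it makes the function safe against schedule misconfiguration.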
I'm digging into creating a simple web interface for scheduling iOS push notifications to occur either at a specific time or periodically in the future.
For example, someone could use this notification data:
"This is a periodic push notification!" - Every Monday - Expires Oct 31
"This will only happen once!" - Sept 20
and have the first one execute every Monday until October 31st and the second occur on September 20th.
I've done some research on server software I could install for this, but I'm at a bit of a loss as to what's recommended. Is there ready-made software for scheduling push notifications?
If not, I'm also curious about software that would allow me to schedule my own tasks from input in a web form. Could I add/remove cron tasks through PHP? Or is it more appropriate to use something like Celery for this? Since I haven't got much web development experience, I'm unsure what the most appropriate approach and tools would be for this.
You don't have to create a cron task for each specific event; instead, have cron call a program at a defined (but high) frequency. IMHO, having a program update the crontab for the purpose you describe is a bad idea.
This program then sends notifications based on whatever business logic you want. I usually implement such programs in PHP and call them through the CLI or wget (the latter requires a bit more work to avoid security problems, since an attacker could call it too).
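The core of that frequently-invoked program is a "what is due now?" query. Sketched here in Python rather than PHP for brevity; the notification records mirror the examples from the question, and their exact shape is an assumption:

```python
from datetime import date

# Each notification is either one-shot on a date, or weekly on a given
# weekday until an expiry date (shape assumed for illustration).
NOTIFICATIONS = [
    {"text": "This is a periodic push notification!",
     "weekday": 0,                      # Monday
     "expires": date(2024, 10, 31)},
    {"text": "This will only happen once!",
     "on": date(2024, 9, 20)},
]

def due_now(notifications, today):
    """Return the notifications the cron-invoked program should send today."""
    due = []
    for n in notifications:
        if "on" in n and n["on"] == today:
            due.append(n)
        elif ("weekday" in n and today.weekday() == n["weekday"]
              and today <= n["expires"]):
            due.append(n)
    return due
```

The web form then only inserts and deletes rows in this table; cron and the program stay untouched, which is exactly why editing the crontab from PHP is unnecessary.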
We have a long-running data transfer process that is just an ASP.NET page that is called and run. It can take up to a couple of hours to complete. It seems to work all right, but I was wondering: what are some of the more popular ways to handle a long process like this? Do you create an application and run it through the Windows scheduler, a web service, or a custom handler?
In a project with long-running tasks in a web application, I made a Windows service.
Whenever the user had a time-consuming task, IIS would hand the task to the service, which would return a token (a temporary name for the task) and do the work in the background. At any time, the user could see the status of his/her task: pending in queue, processing, or completed. The service would run a fixed number of jobs in parallel and keep a queue for incoming tasks.
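The token-and-queue design described above can be sketched like this (a simplified Python model, not the original Windows service):

```python
import itertools
import queue
import threading

class JobService:
    """Accept jobs, hand back a token, run a fixed number in parallel,
    and report per-token status: pending, processing, or completed."""

    def __init__(self, work_fn, workers=2):
        self._work = work_fn
        self._q = queue.Queue()
        self._status = {}
        self._lock = threading.Lock()
        self._ids = itertools.count(1)
        for _ in range(workers):
            threading.Thread(target=self._worker, daemon=True).start()

    def submit(self, job):
        """Enqueue a job and return its token immediately."""
        token = f"job-{next(self._ids)}"
        with self._lock:
            self._status[token] = "pending"
        self._q.put((token, job))
        return token

    def status(self, token):
        with self._lock:
            return self._status[token]

    def wait(self):
        """Block until every submitted job has completed."""
        self._q.join()

    def _worker(self):
        while True:
            token, job = self._q.get()
            with self._lock:
                self._status[token] = "processing"
            self._work(job)
            with self._lock:
                self._status[token] = "completed"
            self._q.task_done()
```

The fixed worker count plays the role of the service's parallel job limit; everything beyond it simply waits in the queue as "pending".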
A Windows service is the typical solution. You do not want to use a web service or a custom handler, as both of those will fall prey to app pool recycling, which will kill your process.
Windows Workflow Foundation
What I find most appealing about WF is that workflows can be persisted to SQL Server without much complexity, so that if the server reboots in the middle of a process, the workflow can resume.
I use two types of processes depending on the needs of my BAs. For transfer processes that run on demand and can be scheduled regularly, I typically write a WinForms application (a personal preference) that accepts command-line parameters, so I can schedule the job with parameters or run it on demand through an interactive window. I've written enough of them over the last few years that I have my own basic generic shell for creating new applications of this kind.

For processes that must detect events (files appearing in folders, receiving CyberMation calls, or detecting SNMP traps), I prefer Windows services so that they are always available. That's a little trickier, simply because you have to be much more cautious about memory usage, leaks, recycling, security, etc. For me, a Windows application tends to run faster on long-duration jobs than the same work does when run through an IIS process. I don't know whether this is because it's attached to an IIS thread or because its memory/security is more constrained; I've never investigated it.
I do know that .NET applications provide a lot of flexibility and control over resources, and with some standards and practice, they can be banged out fairly quickly and produce very positive results.