One instance of Airflow or multiple?

I'm looking into Airflow, and I have upwards of 10 projects that could use it.
Would it be better to have one instance of Airflow for all these projects, or should each project have its own instance?
For context, each project has its own server setup.
Thanks in advance

Keep a central Airflow instance; that will save you maintenance and logging overhead. Airflow is designed to host any number of dependent and independent workflows, so keeping a separate instance per project doesn't make much sense.
If your team grows, you can make use of Airflow's security features for fine-grained access control.
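For example, a single central instance can host each project's workflows side by side, with every DAG tagged by project so they're easy to filter in the UI and to scope access to later. A minimal sketch of one project's DAG file, assuming Airflow 2.4+ (the DAG id, schedule, and commands are made up):

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

# One DAG per project, all deployed to the same central instance.
# Tags make each project easy to filter in the web UI (and they pair
# well with role-based access control as the team grows).
with DAG(
    dag_id="project_a_nightly_etl",  # hypothetical name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
    tags=["project-a"],
):
    extract = BashOperator(task_id="extract", bash_command="echo extract")
    load = BashOperator(task_id="load", bash_command="echo load")
    extract >> load
```

Each project keeps its own DAG files, but they all run on the shared scheduler and webserver.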

Related

Should I create a separate Application Insights instance per WebJob?

I'm wondering whether there are any best practices or recommendations out there on using Application Insights to monitor web jobs. At the moment I have all my app service and web job logs going to one AI instance, and there is a lot of noise in there.
Specifically should I:
create a separate AI instance for all the web jobs
or create a separate AI instance per web job.
Thanks
You should generally use one instance for each system in your environment, but separate out dev, test, and prod. This makes it simpler to track dependencies as work moves through the system. So with multiple web apps, one instance might group an API, a separate web app that serves the front-end content, and any web jobs that support those two apps. Another might contain just a single web app or web job that acts independently of the rest of your apps.
However, you should choose the number of Application Insights instances that best fit your situation. If it would work better for you to split each web job then you can certainly do that. You can query across App Insights instances, so you don't completely lose the ability to join the data from different services together if you choose to split them into separate App Insights instances.

Flyway concurrent migration

We have many projects running on many servers, all pointing at one database, and we're thinking of setting up Flyway in every project to control our database structure.
But we are worried about concurrent migrations if some projects redeploy at the same time. (Of course, we always take care with "IF EXISTS" guards in our SQL syntax.)
How does Flyway behave when there are concurrent changes to the same table or other structures?
It works as expected. See the answer in the FAQ: https://flywaydb.org/documentation/learnmore/faq.html#parallel
Can multiple nodes migrate in parallel?
Yes! Flyway uses the locking technology of your database to coordinate multiple nodes. This ensures that even if multiple instances of your application attempt to migrate the database at the same time, it still works. Cluster configurations are fully supported.
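The coordination idea — whichever node grabs the database lock first runs the migration, and the others wait — can be illustrated with SQLite's exclusive transactions. This is a hypothetical stand-in for the mechanism, not Flyway's actual code:

```python
import os
import sqlite3
import tempfile

# Two "nodes" (connections) share one database. Whichever takes the
# exclusive lock first gets to migrate; the other is locked out until
# the first commits (here it fails fast via a short busy timeout).
db = os.path.join(tempfile.mkdtemp(), "app.db")

node_a = sqlite3.connect(db, timeout=0.1)
node_b = sqlite3.connect(db, timeout=0.1)

node_a.execute("BEGIN EXCLUSIVE")  # node A acquires the migration lock
node_a.execute("CREATE TABLE IF NOT EXISTS schema_history (version INTEGER)")
node_a.execute("INSERT INTO schema_history VALUES (1)")

try:
    node_b.execute("BEGIN EXCLUSIVE")  # node B tries to migrate concurrently
    locked_out = False
except sqlite3.OperationalError:  # "database is locked"
    locked_out = True

node_a.commit()  # lock released; node B could now proceed and see version 1
```

In a real cluster the timeout would be generous, so a second node simply waits for the first migration to finish rather than failing.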

Network Scheduler Service

I am planning a project to schedule scripts on multiple Windows and Linux servers. I'm leaning towards building this from scratch because I have requirements that off-the-shelf software doesn't seem to meet (such as running tasks on completion or failure of other tasks, and scheduling at non-standard intervals).
I was thinking about having a web interface that lets users add/modify/delete schedules for each machine in a database.
A Windows service would then check the database for any jobs that need to run at that point and connect over SSH for Linux or PowerShell for Windows. All the scripts would write their progress back to the database so the user can check on them.
Basically I just wanted some advice from people who know better ways, or things I may need to look out for that could cause problems, because I don't have much experience.
Thanks.
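The design described in the question — a service that polls the database for due jobs, dispatches them, and writes status back — can be sketched like this. The schema and commands are hypothetical, and the dispatch is stubbed with a local echo where a real service would use an SSH or PowerShell remoting client:

```python
import sqlite3
import subprocess
from datetime import datetime, timezone

# Hypothetical jobs table; a real deployment would use a shared server DB.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE jobs (
    id INTEGER PRIMARY KEY,
    host TEXT, kind TEXT, command TEXT,
    next_run TEXT, status TEXT DEFAULT 'pending')""")
conn.execute("INSERT INTO jobs (host, kind, command, next_run) VALUES "
             "('linux01', 'ssh', 'echo backup', '2024-01-01T00:00:00')")
conn.commit()

def run_due_jobs(conn, now):
    """One polling pass: find due jobs, dispatch them, record the result."""
    due = conn.execute(
        "SELECT id, host, kind, command FROM jobs "
        "WHERE next_run <= ? AND status = 'pending'", (now,)).fetchall()
    for job_id, host, kind, command in due:
        # Stub: a real service would run the command remotely, e.g. via
        # an SSH library for Linux or PowerShell remoting for Windows.
        result = subprocess.run(["echo", f"[{kind}@{host}] {command}"],
                                capture_output=True, text=True)
        status = "done" if result.returncode == 0 else "failed"
        conn.execute("UPDATE jobs SET status = ? WHERE id = ?",
                     (status, job_id))
    conn.commit()
    return len(due)

ran = run_due_jobs(conn, datetime.now(timezone.utc).isoformat())
```

Chaining "run B after A completes/fails" then becomes another query: insert B with a status like 'waiting' and flip it to 'pending' when A's row reaches the right terminal state.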
Oracle Scheduler has all the options you are looking for, and probably more. See Overview of Oracle Scheduler for some global info. It comes down to having a central scheduler database that submits jobs to remote job agents, which do the work pretty much independently of the central scheduler repository. It reports back status etc. when the repository is accessible after a job has finished.
It's a very powerful tool and it takes away a lot of complex tasks for you by giving a framework that you can start using right out of the box.

Proper way for one website to keep another alive in IIS?

Basically I have a website that will run a background task to perform some "maintenance" duties while it's not idle. When it is idle, these processes do not need to run.
Now I have a secondary website (virtual directory under main site) that needs to execute these tasks as well. However they can't both run them at the same time or it will cause issues.
Now the more correct solution would probably be to either merge the sites, break out the tasks into a different application, or change the tasks so that they do not conflict with each other if they're both running at the same time. For one reason or another, these are not (currently) options.
So basically when the secondary website is active, what would be the best way to make sure the primary website is awake and running these tasks?
I'm thinking the easiest solution would be to include a reference to the main website from the secondary website, so that any page load on the secondary website would force the first website to serve a request. Something like a 1px image.
But would this be better solved through IIS? Should they share the same application pool? Both applications are relatively stable, so I'm not too worried about one website bringing down the other.
Your question is a little confused. At one point you say that the two sites can't run the task at the same time, but then you say that when the secondary is active the primary should also be active?
Assuming your goal is to have the task run in only one place at a time, you basically have a locking problem. Some solutions would be:
Maintain a lock (e.g. physical file, database entry, ...) and only run the task if the other site isn't holding the lock.
Make the task callable in site A and then have site B call it rather than running the task itself. Site A can then track if it is already running the task.
All the solutions you listed yourself (especially separating background tasks away from websites altogether) which are better solutions than the above.
Hope that helps.
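The first option — a physical lock file — is language-agnostic; here is the idea sketched in Python (the path and names are hypothetical, and the real sites would do this in .NET):

```python
import os
import tempfile

# Hypothetical lock path; in practice this must be a location
# both sites can reach (shared disk, or a database row instead).
LOCK_PATH = os.path.join(tempfile.gettempdir(), "maintenance.lock")

def try_acquire_lock(path):
    """Atomically create the lock file.
    O_CREAT | O_EXCL fails if the file already exists, so only one
    process can hold the lock at a time. Returns a fd, or None."""
    try:
        return os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return None

def release_lock(fd, path):
    os.close(fd)
    os.remove(path)

fd = try_acquire_lock(LOCK_PATH)
if fd is not None:
    try:
        pass  # run the maintenance task here
    finally:
        release_lock(fd, LOCK_PATH)
```

One caveat with file locks: if a process dies without releasing, the lock goes stale, which is why a database entry with a timestamp (so it can expire) is often the more robust variant.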
It's hard to depend on an automated task running inside a web application. If the application pool is recycled, that task is lost and the process could lose its integrity. There is no guarantee that the process will degrade gracefully when the application pool is recycled.
Running a dedicated task is better handled by a system which is set up to be dedicated. Namely, the server hosting this web service. I think you would be better suited to have the host service run directly on the server locally instead of exposed to the web as a web service.
However, if that is not an option, then you will probably benefit from merging them, as you state. That way, as long as the child application keeps the application pool alive, the parent application will be available. Similarly, you could have them share an application pool. The issue, as stated above, is that the integrity of the dedicated process could be compromised.
.NET is not really designed to run dedicated background tasks. That is better suited to a desktop application which supports the web service.

How do you handle scheduled tasks for your websites running on IIS?

I have a website that's running on a Windows server and I'd like to add some scheduled background tasks that perform various duties. For example, the client would like users to receive emails that summarize recent activity on the site.
If sending out emails was the only task that needed to be performed, I would probably just set up a scheduled task that ran a script to send out those emails. However, for this particular site, the client would like a variety of different scheduled tasks to take place, some of them always running and some of them only running if certain conditions are met. Right now, they've given me an initial set of things they'd like to see implemented, but I know that in the future there will be more.
What I am wondering is if there's a simple solution for Windows that would allow me to define the tasks that needed to be run and then have one scheduled task that ran daily and executed each of the scheduled tasks that had been defined. Is a batch file the easiest way to do this, or is there some other solution that I could use?
To keep life simple, I would avoid building one big monolithic exe and break the work to do into individual tasks and have a Windows scheduled task for each one. That way you can maintain the codebase more easily and change functionality at a more granular level.
You could, later down the line, build a windows service that dynamically loads plugins for each different task based on a schedule. This may be more re-usable for future projects.
But to be honest if you're on a deadline I'd apply the KISS principle and go with a scheduled task per task.
I would go with a Windows Service right out of the gates. This is going to be the most extensible method for your requirements, creating the service isn't going to add much to your development time, and it will probably save you time not too far down the road.
We use the Windows Task Scheduler, which launches a small console application that just passes parameters to the web service.
For example, if a user has scheduled reports #388 and #88, a scheduled task is created with a command line like this:
c:\launcher\app.exe report:388 report:88
When scheduler fires, this app just executes web method on web service, for example, InternalService.SendReport(int id).
Usually you already have all the required business logic available in your web application. This approach lets you reuse it with minimal effort, so there is no need to create a complex .exe or a Windows service with pluggable modules, etc.
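The launcher described above might look something like this sketch (the real one is a .NET exe, and `send_report` stands in for the call to `InternalService.SendReport(int id)`):

```python
# Hypothetical sketch of the launcher console app: parse the
# "kind:value" arguments and forward each one to the web service.

def parse_args(argv):
    """Turn ['report:388', 'report:88'] into [('report', 388), ('report', 88)]."""
    tasks = []
    for arg in argv:
        kind, _, value = arg.partition(":")
        tasks.append((kind, int(value)))
    return tasks

def send_report(report_id):
    # Stand-in for the web-service call; a real launcher would issue
    # an HTTP/SOAP request to the web application here.
    return f"sent report {report_id}"

def main(argv):
    return [send_report(value) for kind, value in parse_args(argv)
            if kind == "report"]

# Equivalent of: c:\launcher\app.exe report:388 report:88
results = main(["report:388", "report:88"])
```

The launcher stays trivial on purpose: all the real logic lives in the web application, and adding a new task kind is just another `kind:value` argument.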
The problem with doing the operations from the scheduled EXE, rather than from inside a web page, is that the operations may benefit from, or even outright require, resources that the web page would have -- IIS cache and an ORM cache are two things that come to mind. In the case of ORM, making database changes outside the web app context may even be fatal. My preference is to schedule curl.exe to request the web page from localhost.
Use the Windows Scheduled Tasks or create a Windows Service that does the scheduling itself.