Holistic and simplified view for Airflow job status - airflow

Sorry if this is a dumb question. I'm still a somewhat novice dev.
I'm interested in creating a holistic view that shows the current status of every Airflow job my team maintains. The point would be to simplify the view rather than having the user go into the Airflow UI to check the status. I would be interested in something along the lines of a front-end webpage that has a list of each of the DAGs and a kind of progress bar whose length depends on the number of tasks in each DAG. A task that is currently running would be light green, a successful one solid green, and a failed one red. Similar to the Airflow UI but a lot simpler. I would also want the home view to show the current day, with left and right arrows to step through other days if the user is interested. Essentially it would be an Airflow monitoring system for less technical users.
What would be a good way to go about this?
I'm also open to any other solutions anyone may have come up with that could help simplify monitoring a large number of Airflow jobs.
Kind of looking for some folks to help me brainstorm. Not sure if Stack is the right place for it. :)
I'll be the developer of this app so no need to pull punches as far as the technical end goes.
Currently, I'm thinking of using a standard web app where the screen will be populated by a log that I'll keep in a backend database that gets populated by a function that gets called whenever a task concludes within a DAG. The view will always show current day and whichever DAGs are scheduled to run during that day with whatever their progress is.
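The backend log described above could be sketched roughly as follows. This is a minimal sketch using SQLite, not a definitive implementation: the table name `task_log`, the helper names, and the state strings are assumptions made for illustration. In Airflow, `record_task_state` could be called from task callbacks such as `on_success_callback` and `on_failure_callback`, pulling the DAG and task identifiers from the callback context.

```python
import sqlite3
from datetime import date


def init_db(conn):
    # One row per (DAG, task, day); the latest state overwrites the previous one.
    conn.execute(
        """CREATE TABLE IF NOT EXISTS task_log (
               dag_id   TEXT NOT NULL,
               task_id  TEXT NOT NULL,
               run_date TEXT NOT NULL,
               state    TEXT NOT NULL,
               PRIMARY KEY (dag_id, task_id, run_date)
           )"""
    )


def record_task_state(conn, dag_id, task_id, run_date, state):
    # Hypothetical hook: call this whenever a task changes state.
    conn.execute(
        "INSERT OR REPLACE INTO task_log VALUES (?, ?, ?, ?)",
        (dag_id, task_id, run_date.isoformat(), state),
    )
    conn.commit()


def progress_for_day(conn, run_date):
    # Returns {dag_id: [(task_id, state), ...]} for the requested day,
    # which is the data a per-DAG progress bar would be drawn from.
    rows = conn.execute(
        "SELECT dag_id, task_id, state FROM task_log "
        "WHERE run_date = ? ORDER BY dag_id, task_id",
        (run_date.isoformat(),),
    )
    progress = {}
    for dag_id, task_id, state in rows:
        progress.setdefault(dag_id, []).append((task_id, state))
    return progress


conn = sqlite3.connect(":memory:")
init_db(conn)
day = date(2021, 8, 23)
record_task_state(conn, "daily_load", "extract", day, "running")
record_task_state(conn, "daily_load", "extract", day, "success")
record_task_state(conn, "daily_load", "transform", day, "failed")
print(progress_for_day(conn, day))
```

The left and right arrows on the home view would then simply call `progress_for_day` with the previous or next date.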

Airflow allows creating plugins to expose web views with FlaskAppBuilder, so you can create a view and add whatever you want in it, then add it to the Airflow UI.

Related

How to assign "fixed work" task to multiple resources taking vacations into account

Let's say you have a small project. The team has estimated all the tasks as 300 days of effort.
I have 5 developers on the team, and I want MS Project to tell me when the project will complete, taking into account the vacations and working schedules of my team members.
In order to do that:
I'm creating a Task "Development" with fixed work "300d", and task type "Fixed Work".
Then I create 5 resources, and specify a 2 week vacation for one of the developers somewhere in the middle of the schedule.
Then I assign my 5 development resources to this task.
The problem is that the 300d is distributed evenly across all 5 development resources. If one of them has a two-week vacation in between, that resource's share of the work finishes 2 weeks later, while the other 4 resources sit idle for 2 weeks. Total duration is 70 days.
what I get
What I want to get is: the work is distributed unevenly across all 5 resources so that the whole task finishes as early as possible, using as much of every developer's available time as it can.
That's how I would expect it to work. In that particular case I was distributing hours manually.
what i would expect
Is there a possibility in MS Project to do something like this? Or am I doing something wrong?
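The arithmetic behind the two schedules in the question can be sketched as follows. This is only an illustration of the numbers, not of how MS Project schedules work: days are counted as working days, the 2-week vacation is assumed to be 10 working days starting on day 30, and the function names are made up for this example.

```python
def finish_day(work_days, vacation):
    # Working day on which one resource completes `work_days` of effort,
    # skipping the day numbers listed in `vacation`.
    day = 0
    remaining = work_days
    while remaining > 0:
        day += 1
        if day not in vacation:
            remaining -= 1
    return day


def even_split_finish(total_work, vacations):
    # Every resource gets the same share; the task ends when the slowest one ends.
    share = total_work / len(vacations)
    return max(finish_day(share, v) for v in vacations)


def levelled_finish(total_work, vacations):
    # Work is done by whoever is available each day, so nobody sits idle.
    day = 0
    remaining = total_work
    while remaining > 0:
        day += 1
        remaining -= sum(1 for v in vacations if day not in v)
    return day


# Four developers with no time off, one away on working days 30..39.
vacations = [set(), set(), set(), set(), set(range(30, 40))]
print(even_split_finish(300, vacations))  # 70
print(levelled_finish(300, vacations))    # 62
```

With an even split, each developer carries 60 days and the one on vacation finishes on day 70. If the others absorb the missing 10 days, the task needs 62 working days, since 5 x 62 - 10 = 300.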
There are a couple issues with how you are approaching the problem.
1. Rather than just planning out the manpower hours estimated for the entire project on a single line item, you should plan out the tasks that will need to be done to accomplish "Small Project".
If you discretely plan out the tasks that need to be accomplished to satisfy the scope of "Small project", you can establish dependency (predecessor/successor) relationships between your tasks and figure out what tasks need to be done before you can move on to others. When you do this it will give you a good idea of how long the total duration of the project will take and likely be more accurate than just relying on an estimate based on the manpower hours estimate your developers give you. Find out what tasks they actually need to do, not just how many hours they think the whole project will take them. This will also allow you to plan out the utilization of your resources better because you'll be able to assign specific resources to specific tasks, and not all of your resources need to be on every task.
2. In general I would avoid using the Task Usage form.
I noticed you are altering resources in the Task Usage form, but unless you are really experienced with Microsoft Project I would avoid ever touching that, as it's really easy to set the period of performance of resources assigned to a task to be different from the actual period of performance of the task itself. This will cause MS Project to behave unusually, and it can be hard for an inexperienced user to understand why. This usually leads to pain and frustration. This leads me to my next bit of advice:
3. If you really want to specify a resource's vacation time, it's better to adjust the calendar associated with the resource to exclude those dates as working dates.
In your situation with only 5 resources on your project, this can be fairly easy to do. You can accomplish this 2 different ways (I'll start with the easiest option):
1. You can add resource specific exclusion dates to the default calendar in your project
You can accomplish this by opening the Resource Sheet table and then clicking the Project tab, then Change Working Times. If you have the Resource Sheet open instead of the Gantt chart, you can specify the resource that is going to be affected by the exceptions:
In this example you can see that I would be excluding (removing) 8/23/21 thru 9/3/21 as working days for the SW Engineer resource, without needing to change the calendar used by the resource completely.
2. You can completely change the calendar used by particular resources to be different from the default calendar set for the project.
You can accomplish this by going into the Resource Sheet and opening the Base Calendar column:
From here you can assign any calendar that exists in the project to the resource. Of course this means you would need to create the calendars and assign exclusion dates to them.
To create a calendar, click the Project tab then click Change Working Times. Click Create New Calendar on the form that opens up and give it a name:
From there you can add exclusion dates and all that.
Note: In a larger project with many resources, I would recommend not messing with the calendar for the resources at all. It just gets hard to deal with when there are a lot of resources.

Reassign user story during sprint?

If a story is in progress and the next swim lanes are code review and QA-ready, how should the assignment of stories work? Should a story remain assigned to the developer? And should the code review and QA tasks be created as sub-tasks in it? Or should the story be re-assigned by the developer when it is moved to code review, and then, when code review is done, moved to the QA lane and re-assigned to QA by the reviewer? It seems like an anti-pattern to re-assign tickets from in-progress to future states. It looks okay to re-assign tickets before they are brought into the sprint, but not after.
Scrum does not have anything to say about how the work is done nor how a board is managed. However, many teams look at Kanban's "pull" approaches to answer this. In that case, work is never assigned or given; it is only claimed or taken on. Therefore, work would be moved to "Code Review" by the reviewer when they began the work. Similarly, the work would be moved to QA by the tester when they started. "Ready" columns are a bit of a misnomer as they are not states. Rather, they are statuses of the previous state. If your order is Code Review - QA Ready - QA, then in fact, QA Ready is a possible designation on work in Code Review. This may seem minor, but it is very important to prevent pile-ups in your process where work stalls without owners.
There is no single answer, but one way of doing it is to think of a User Story as a container of tasks where each task is a small technical deliverable of any kind. With this mindset you can effectively stop thinking about who the assignee is, as each developer will make a small contribution towards the goal.
One of the problems with task re-assignment is that at some point you can lose traceability of who has done what, and of productivity on a per-developer basis. Having each team member do their own tasks and deliver towards the completion of a user story can solve this.
Then you can assign the User Story to the product owner, or you can assign it to a developer who holds ownership of its delivery to test, at which point the tester takes over. Assigning the user story to a developer does not mean that he owns the User Story; it just means that it is his responsibility to ensure the handover to test, nothing more, nothing less.
When a tester encounters a bug then you create a bug attached to the User story.
Not recommended. It's feasible, though. You have to assess your current work situation. If the user story is something that can make a whole difference, then it would be better to just stop the sprint, reassess your situation and make the necessary changes, then continue. Either way, when you are adding a new user story to the backlog, deadlines can hardly be met.
We use a slightly different approach. We have the following columns on our Jira board:
To-do
In_progress
Ready for Review
Ready for QA
In-Testing
Rework/Rejected
Done
A developer picks a task from To-do, assigns it to himself, and moves it to In_progress. Once he is done, he moves it to Ready for Review and leaves it unassigned. Someone else picks it up, assigns it to himself, and reviews it. After the review, that person moves the case to Ready for QA without assigning it to anyone. Whoever is free or planning to work on the case assigns it to himself and, when he starts working on it, moves it to In-Testing. As a result of testing, the case goes to either Rework/Rejected or Done. If it is moved to Rework/Rejected, the tester assigns it back to the person who originally worked on it, and that person moves the case to In_progress again when he starts the rework.

Best Practices using Firebase (Saving)

I'm currently making an online mobile game. It's like an online Idle Clicker.
In order to save the data I will use Firebase. I'm still deciding whether I should use the "Realtime Database" or "Cloud Firestore". (If any of you could help me with that too, I would appreciate it.)
My main question is: When should I save my data ?
Saving the data every second is crazy because I would spend millions of euros. Even saving the data every minute does not seem like a viable solution to me.
I have searched and found that I can save the game every time the user presses the Home button to leave the app. But what if the user is playing and the phone dies?
Is there a better solution that I am not thinking of?
Thank you very much, Gonçalo
I would use OnApplicationQuit() for this. It's called whenever your game is closed. It won't be called if the device loses power though, so if you're worried about that you could start a timer when the game is opened and do autosaves every 10 minutes or on certain scene switches (if the player exits to the main menu for example).
This thread has more info on the topic that you may find useful.
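The save policy suggested in the answer above (save on quit, plus a timed autosave as a safety net) can be sketched in a language-neutral way. The sketch below uses Python only to show the logic; in a Unity game the same idea would be written in C#, with `on_quit` called from `OnApplicationQuit()`. The class name, the `save_fn` callback, and the dirty flag are assumptions added for illustration.

```python
class AutoSaver:
    """Calls save_fn at most once per interval, and once more on quit."""

    def __init__(self, save_fn, interval_seconds=600.0):
        self.save_fn = save_fn            # hypothetical function that writes to the backend
        self.interval = interval_seconds  # 600 s = the 10 minutes suggested above
        self.last_save = None             # set on the first tick
        self.dirty = False                # True when there is unsaved progress

    def mark_dirty(self):
        # Call whenever the player's state changes.
        self.dirty = True

    def _save(self, now):
        self.save_fn()
        self.last_save = now
        self.dirty = False

    def tick(self, now):
        # Call regularly with the current time in seconds.
        if self.last_save is None:
            self.last_save = now
            return False
        if self.dirty and now - self.last_save >= self.interval:
            self._save(now)
            return True
        return False

    def on_quit(self, now):
        # Call when the app is closed or sent to the background.
        if self.dirty:
            self._save(now)
            return True
        return False


saves = []
saver = AutoSaver(lambda: saves.append("saved"))
saver.tick(0)
saver.mark_dirty()
saver.tick(300)     # too early, nothing is written
saver.tick(600)     # interval elapsed, one write
saver.mark_dirty()
saver.on_quit(650)  # unsaved progress is flushed on exit
print(len(saves))   # 2
```

Skipping the write when nothing has changed keeps the number of billed operations down for idle sessions.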

Run a DB-intensive query/calculation asynchronously

This question relates to WordPress's wp-cron function but is general enough to apply to any DB-intensive calculation.
I'm creating a site theme that needs to calculate a time-decaying rating for all content in the system at regular intervals. This rating determines the order of posts on the homepage, which is paged to allow visitors to potentially view all content. This rating value needs to be calculated frequently to make sure the site has fresh content listed in the proper order.
The rating calculation is not heavy but the rating needs to be calculated for, potentially, 1,000s of items and doing that hourly via wp-cron will start to cause problems for sites with lots of content. Ignoring the impact on page load (wp-cron processes requests on page loads once a certain interval has been reached), at some point the script will reach a time limit. Setting up the site to use "plain ol' cron" will solve the page loading issue but not the timeout one.
Assuming that I have no control over the sites that this will run on, what's the best way to handle this rating calculation on a regular basis? A few things that came to mind:
Only calculate the rating for the most recent 1,000 posts, assuming that the rest won't be seen much. I don't like the idea of ignoring all old content, though.
Calculate the first, say, 100 or so, then only calculate the rating for older groups if those pages are loaded. This might be hard to get right, though, and lead to incorrect listing and ratings (which isn't a huge problem for older content but something I'd like to avoid)
Batch process 100 or so at regular intervals, keeping track of the last one processed. This would cycle through the whole body of content eventually.
Any other ideas? Thanks in advance!
Depending on the host, you're in for a potentially sticky situation. Let me outline a couple of ideal cases and you can pick/choose where you need to.
Option 1
Mirror the database first and use a secondary app (WordPress or otherwise) to do the calculations asynchronously against that DB mirror. When they're done, they can update a static file in the project root, write data to a shared Memcached instance, trigger a POST to WordPress' admin_post endpoint to write some internal state, whatever.
The idea here is that you're removing your active site from the equation. The last thing you want to do is have a costly cron job lock the live site's database or cause queries to slow down as it does its indexing.
Option 2
Offload the calculation entirely to a separate application. Tracking ratings in real time with WordPress is a poor idea as it bypasses page caching and triggers an uncacheable request every time a new rating comes in. Pushing this off to a second server means your WordPress site is super fast, and it also means you can have the second server do the calculations for you in the first place.
If you're already using something like Elasticsearch on the site, you can add ratings as an added indexing facet. Then just update posts as ratings change, and use the ES API to query most popular posts later.
Alternatively, you can use a hosted service like Keen IO to record and aggregate ratings.
Option 3
Still use cron, but don't schedule it as a cron job in WordPress. Instead, write a WP-CLI routine that does the reindexing for you. Then, schedule real cron jobs to run that routine.
This has the advantage of using PHP's command line version, which can be configured to skip the timeouts and memory limits imposed on the FPM/CGI/whatever version used to serve the site. It also means you don't have to wait for site traffic to trigger the job - and a long-running job won't block other cron events within WordPress from firing.
If using this process, I would set the job to run hourly and, each hour, run a batch of 1/24th of the total posts in the database. You can keep track of offsets or even processed post IDs in the database; the point is just that you're silently re-indexing posts throughout the day.
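The offset bookkeeping for the hourly batches can be sketched as below. This is a minimal illustration in Python rather than PHP, under the assumption that post IDs are available as an ordered list and that the offset is persisted between runs (for example in the database); `next_batch` is a made-up helper, not a WordPress or WP-CLI function.

```python
import math


def next_batch(post_ids, offset, runs_per_cycle=24):
    # Returns (batch, new_offset). The batch size is chosen so that the whole
    # list is covered within `runs_per_cycle` runs; the offset wraps to 0 at the end.
    if not post_ids:
        return [], 0
    size = math.ceil(len(post_ids) / runs_per_cycle)
    batch = post_ids[offset:offset + size]
    new_offset = offset + size
    if new_offset >= len(post_ids):
        new_offset = 0
    return batch, new_offset


post_ids = list(range(1, 49))  # 48 posts, so each hourly run handles 2
offset = 0                     # would be loaded from storage in practice
seen = []
for _ in range(24):
    batch, offset = next_batch(post_ids, offset)
    seen.extend(batch)         # recalculate ratings for `batch` here

print(seen == post_ids, offset)  # True 0
```

Because the batch size is derived from the current post count on every run, the job keeps up as the site grows without any change to the cron schedule.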

Show workflow while running

I am trying out Workflow Foundation 4. Is it possible to show the workflow while it is running, with some sort of indicator of state (e.g. green box around activity = running)? The workflow would have to be read-only. However, I would also like to right click an activity and bring up info like how long the activity took to run, current logging state, etc.
Edit: I found the following links, but they are not for Workflow Foundation 4. Does anyone know what it has been replaced with?
WorkflowView: http://msdn.microsoft.com/en-us/library/ms617016.aspx
Workflow Monitor Sample: http://msdn.microsoft.com/en-us/library/ms741706.aspx
