We have a scenario where we need to update an Airflow Variable that looks like this:
'status': {'t1':'op1', 't2':'op2'}
Multiple tasks in the DAG need to update this variable, each adding its own entry to the dictionary. When the tasks run in sequence, the update works fine using Variable.set(). However, when tasks run in parallel, updates get lost. For example, when task9 and task10 ran in parallel, only the entry 't10' was added; 't9' was lost and never written to the variable.
It sounds like you have a race condition between the tasks: each task reads the variable, modifies its own copy, and writes it back, so parallel tasks overwrite each other's updates.
It might be better to add a single task downstream of all the parallel tasks that updates this variable once, using the return_value (XCom) of each upstream task, as sketched below.
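A minimal sketch of that fan-in pattern using the TaskFlow API; the DAG and task names are made up, and the parallel tasks are assumed to simply return their entries:

from airflow.decorators import dag, task
from airflow.models import Variable
import json
import pendulum

@dag(schedule=None, start_date=pendulum.datetime(2023, 1, 1), catchup=False)
def fan_in_variable_update():

    @task
    def t9():
        return {"t9": "op9"}  # the return value is stored as the task's XCom

    @task
    def t10():
        return {"t10": "op10"}

    @task
    def merge_and_save(*entries):
        # single writer, so there is no race condition
        status = json.loads(Variable.get("status", default_var="{}"))
        for entry in entries:
            status.update(entry)
        Variable.set("status", json.dumps(status))

    merge_and_save(t9(), t10())

fan_in_variable_update()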
I have a .create-or-alter function script that I am submitting to an ADX cluster every 5 minutes using Azure Data Factory (ADF).
I keep running the .show journal command to check whether the script was executed. The first time ADF submitted the script, when the function did not yet exist, the function was created and I could see its entry in the .show journal output. After that, however, I never saw another 'ADD-FUNCTION' event in the latest .show journal output, even though I kept checking for a long time and the pipeline kept succeeding.
I don't understand: if the pipeline is successfully submitting the same create function script without any change, why does ADX not let it go through?
If I open the existing function script in Kusto Explorer and re-execute it without any change, it shows up in .show journal. ADF is logically doing the same thing, yet that is not reflected in .show journal.
Just to experiment, I dropped the function using Kusto Explorer. The next time the ADF pipeline ran, it created the function again, and that entry did appear in the output of .show journal.
Does this mean that whenever we re-submit the create function script from ADF to ADX, ADF checks whether the function definition has changed and ignores the command if it hasn't?
But this check is apparently not performed when we do the same thing from Kusto Explorer, which is strange; ADX behavior should not change depending on how we submit commands.
Another interesting fact is that this behavior is unique to functions: I also tested re-creating the same update policy for a table through ADF, again and again without any change, and every time it shows up in the output of .show journal.
Is this behavior a feature or a bug in the case of functions?
From the ADX service's perspective, when you execute an .alter function or a .create-or-alter command that results in an existing function having the exact same body, parameters, folder, and docstring, the command does nothing, and therefore nothing is written to the journal.
If you're seeing differently, I would recommend that you open a support ticket.
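For illustration, this is easy to reproduce directly against the cluster (MyFunc is a made-up function name):

.create-or-alter function MyFunc() { print 1 }
// first run: an 'ADD-FUNCTION' event appears in .show journal
.create-or-alter function MyFunc() { print 1 }
// identical re-run: a no-op, so no new journal entry
.show journal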
I have two models, where model A references items in model B. When I delete items from model B, React re-renders before I can update model A to also delete the referenced items.
Example:
I have a feed. The feed is made up of different types of items. Each type of item is stored in its respective model.
The feed order is stored as an array of item IDs, referencing the items in their different substores.
I have made a minimal failing example here; the issue occurs after clicking the button three times:
https://codesandbox.io/s/deleting-referenced-items-kmw1e
The key line is
else console.error(">>>> feed order contains deleted item", id);
It's problematic that the feed order might contain deleted items, because it could mean there is a programming error that resulted in bad references. In this case it's not a programming error; the second store just hasn't updated yet.
Is there a way to batch createAndDeleteTodo so that listeners are not evaluated until the entire thunk and all sub-thunks have completed?
In the above example it's trivial enough to have one master action that updates both the feed order and the items, but that would feel cumbersome with more than one type of item, since each item type lives in its own respective model.
The same thunk action is triggering multiple actions, and as per the definition of useStoreState:
The useStoreState will execute any time an update to your store's state occurs
So, in effect, when you do this inside the thunk:
actions.setTodos({ newTodos: [newTodo], todoToDelete });
actions.updateFeedOrder({ newTodos: [newTodo], todoToDelete });
two actions are being dispatched, and each one fires the store's state-change listeners separately. You will get multiple render calls for the multiple store updates.
You have two options:
Either combine those actions into one, as in this fork (and as sketched after this list): https://codesandbox.io/s/so-deleting-referenced-items-forked-x4d7v
OR add a check around useStoreState so the component only renders when both store values match (e.g., when the item count and the feed-order count agree).
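A minimal sketch of the combined action, assuming a store shape with todos keyed by id and a feedOrder array (the names are hypothetical; easy-peasy actions may mutate state because it uses immer under the hood):

import { createStore, action } from "easy-peasy";

const store = createStore({
  todos: {},
  feedOrder: [],
  // one action = one state update = one render
  createAndDeleteTodo: action((state, { newTodo, todoToDelete }) => {
    delete state.todos[todoToDelete];
    state.todos[newTodo.id] = newTodo;
    state.feedOrder = state.feedOrder
      .filter((id) => id !== todoToDelete)
      .concat(newTodo.id);
  }),
});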
It seems the problem is that you're dispatching to the Redux store twice: first when you create new items and then when you delete them.
I'd suggest doing it this way:
Create a deep copy of the object/array you want to work with.
Make both operations on that copy.
Dispatch that value to the store.
This way nothing will re-render until both operations are finished; see the sketch below.
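A rough sketch of that pattern inside a thunk, assuming getState and dispatch are in scope; setFeedState is a hypothetical action that replaces both slices in one go:

// copy the slices we are going to change (use a deep copy
// if the items themselves will be mutated)
const next = {
  todos: { ...getState().todos },
  feedOrder: [...getState().feedOrder],
};

// perform both operations on the copy
delete next.todos[todoToDelete];
next.todos[newTodo.id] = newTodo;
next.feedOrder = next.feedOrder
  .filter((id) => id !== todoToDelete)
  .concat(newTodo.id);

// a single dispatch, so listeners only ever see a consistent state
dispatch(setFeedState(next));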
I'm working on a scenario where I have to compare a data record coming from a file with data from a table, as a validation check before loading the file into the staging table. I have come up with a couple of possible approaches that involve changes inside the load mapping, but my team suggested putting the change somewhere easy to notice, since this is a non-standard approach.
Is there any approach we can handle within the Workflow Manager, using any of the workflow tasks or session properties?
Create a mapping that reads the file, joins the data with the table, does the required validation, and writes nothing out (use a filter with a FALSE condition), setting a workflow variable to 0/1 to indicate whether the load should start.
Then run the loading session only if the validation passed.
This can be improved a bit if you want to store the validation errors in some audit table. Then you don't need a variable: the link condition can refer to the $PMTargetName#numAffectedRows built-in variable. If it's more than zero, meaning there were some errors, don't start the load.
Create a workflow with a Command task in which you write a script that pulls the data from the table (for example over a JDBC connection) and compares it with the data present in the file, then flags whether to load or not.
Based on this command-line output, either proceed with the staging workflow or stop.
awk works well for the comparison and gives you the flexibility to compare date parts within a column; a sketch follows the reference below.
FYR : http://www.cs.unibo.it/~renzo/doc/awk/nawkA4.pdf
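A rough sketch of such a comparison, assuming both inputs are comma-separated with a key in column 1 and a date in column 2 (the file names and layout are made up):

# exit 0 if every key/date pair in data_file.csv matches table_extract.csv, else exit 1
awk -F',' '
  NR == FNR { ref[$1] = $2; next }           # first file: remember the table rows
  !($1 in ref) || ref[$1] != $2 { bad = 1 }  # second file: flag any mismatch
  END { exit bad }
' table_extract.csv data_file.csv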
Below is a picture of my Firebase database tree. I have items a, b, and c, and I want totalresult = a + b + c.
My requirement: as the value of a, b, or c gets updated, the change should automatically be reflected in the totalresult value.
Is there a way to set this up in Firebase so it happens automatically, instead of running a piece of code every time to add the values and update totalresult?
I am able to run a piece of code that adds the values and updates totalresult, but I have to run it manually each time, which is not an ideal solution.
There isn't a built-in way to do this in the Firebase Realtime Database.
That said, while you still have to write code, you can write a Cloud Function for Firebase that triggers on updates to those fields and then applies the update to totalresult. This will be automatic instead of manual, as the trigger fires for every matching event on the database.
The Firebase documentation describes how to create such a trigger (probably using the "onWrite" event); a sketch follows the list below.
Of course, there are a few things to be aware of:
There will be a period of time, while the function is running, during which the total has not been updated yet. In other words, you should be tolerant of inconsistencies. (You will likely also want to do the actual write to the total inside a transaction.)
You need to be careful not to run the function (or to exit early) when totalresult itself is being updated, or you could get into an infinite loop of function invocations (it's best to keep the result object elsewhere in your tree).
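A minimal sketch of such a trigger, assuming the inputs live under /items/{key} and the sum is stored at /totalresult (both paths are hypothetical; this uses the first-generation functions API):

const functions = require("firebase-functions");
const admin = require("firebase-admin");
admin.initializeApp();

// fires for every create/update/delete under /items; /totalresult sits outside
// this path, so writing the total does not re-trigger the function
exports.updateTotal = functions.database
  .ref("/items/{key}")
  .onWrite((change) => {
    const before = change.before.val() || 0;
    const after = change.after.val() || 0;
    const delta = after - before;
    if (delta === 0) return null;
    // apply the difference in a transaction so concurrent item updates
    // don't clobber each other
    return admin
      .database()
      .ref("/totalresult")
      .transaction((current) => (current || 0) + delta);
  });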
From the Transactions doc, second paragraph:
The intention here is for the client to increment the total number of chat messages sent (ignore for a moment that there are better ways of implementing this).
What are some standard "better ways" of implementing this?
Specifically, I'm looking at trying to do things like retrieve the most recent 50 records. This requires that I start from the end of the list, so I need a way to determine what the last record is.
The options as I see them:
use a transaction to update a counter each time a record is added, use the counter value with setPriority() for ordering
forEach() the parent and read all records, do my own sorting/filtering at client
write server code to analyze Firebase tables and create indexed lists like "mostRecentMessages" and "totalNumberOfMessages"
Am I missing obvious choices?
To view the last 50 records in a list, simply call limit() as shown:
var data = new Firebase(...);
data.limit(50).on(...);
Firebase elements are ordered first by priority and, if priorities match (or none is set), lexicographically by name. The push() command automatically creates elements that are ordered chronologically, so if you're using push(), no additional work is needed to use limit().
To count the elements in a list, I would suggest adding a "value" callback and then iterating through the snapshot, as sketched below (or taking the transaction approach we mentioned). The note in the documentation actually refers to some upcoming features we haven't released yet, which will allow you to count elements without loading them first.
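For illustration, counting with the same legacy API as above (the database URL is a placeholder):

var data = new Firebase("https://example.firebaseio.com/messages");
data.on("value", function (snapshot) {
  // numChildren() returns the count directly; snapshot.forEach() also
  // works if you need to inspect each child while counting
  var count = snapshot.numChildren();
  console.log("total messages:", count);
});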