Flatten nested dictionary to key/value pairs with Ansible

---
- hosts: localhost
  vars:
    mydict:
      key1: val1
      key2: val2
      key3:
        subkey1: subval1
        subkey2: subval2
  tasks:
    - debug:
        msg: "{{ TODO }}"
How would I make the above debug message print out all key/value pairs from the nested dictionary? Assume the depth is unknown. I would expect the output to be something like:
{
  "key1": "val1",
  "key2": "val2",
  "subkey1": "subval1",
  "subkey2": "subval2"
}

Write a filter plugin and use pandas.json_normalize, e.g.
shell> cat filter_plugins/dict_normalize.py
from pandas import json_normalize


def dict_normalize(d):
    # Flatten the nested dictionary into a single-row DataFrame and
    # return a list of [column names, row values].
    df = json_normalize(d)
    return [df.columns.values.tolist()] + df.values.tolist()


class FilterModule(object):
    def filters(self):
        return {
            'dict_normalize': dict_normalize,
        }
The filter returns lists of keys and values
- set_fact:
    mlist: "{{ mydict|dict_normalize }}"
gives
mlist:
  - - key1
    - key2
    - key3.subkey1
    - key3.subkey2
  - - val1
    - val2
    - subval1
    - subval2
Create a dictionary, e.g.
- debug:
    msg: "{{ dict(mlist.0|zip(mlist.1)) }}"
gives
msg:
  key1: val1
  key2: val2
  key3.subkey1: subval1
  key3.subkey2: subval2
If the subkeys are unique, remove the path
- debug:
    msg: "{{ dict(_keys|zip(mlist.1)) }}"
  vars:
    _regex: '^(.*)\.(.*)$'
    _replace: '\2'
    _keys: "{{ mlist.0|map('regex_replace', _regex, _replace)|list }}"
gives
msg:
  key1: val1
  key2: val2
  subkey1: subval1
  subkey2: subval2
Notes
Install the pandas package, e.g. python3-pandas
The filter might be extended to support all parameters of json_normalize (a pandas-free alternative is sketched below)
The greedy regex also works in nested dictionaries
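If installing pandas is not an option, a small recursive filter can produce the flattened dictionary directly. This is only a sketch; the filter name dict_flatten and its behaviour are my own assumptions, not part of the answer above:
shell> cat filter_plugins/dict_flatten.py
def dict_flatten(d, parent=''):
    # Recursively walk the dictionary and build dotted keys,
    # e.g. {'key3': {'subkey1': 'subval1'}} -> {'key3.subkey1': 'subval1'}
    items = {}
    for k, v in d.items():
        key = '%s.%s' % (parent, k) if parent else k
        if isinstance(v, dict):
            items.update(dict_flatten(v, key))
        else:
            items[key] = v
    return items


class FilterModule(object):
    def filters(self):
        return {
            'dict_flatten': dict_flatten,
        }
Used as "{{ mydict|dict_flatten }}", it would return the flat dictionary in a single step, without the separate zip of keys and values.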
Q: "Turn the key3.subkey1 to get the original dictionary."
A: Use json_query. For example, given the dictionary created in the first step
- set_fact:
    mydict_flat: "{{ dict(mlist.0|zip(mlist.1)) }}"
gives
mydict_flat:
  key1: val1
  key2: val2
  key3.subkey1: subval1
  key3.subkey2: subval2
iterate the keys and retrieve the values from mydict
- debug:
    msg: "{{ mydict|json_query(item) }}"
  loop: "{{ mydict_flat|list }}"
gives
msg: val1
msg: val2
msg: subval1
msg: subval2
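For reference, the same lookups can be reproduced outside Ansible with the jmespath library, which the json_query filter relies on. A minimal sketch with the example data hard-coded:
import jmespath

mydict = {
    'key1': 'val1',
    'key2': 'val2',
    'key3': {'subkey1': 'subval1', 'subkey2': 'subval2'},
}
mydict_flat = {
    'key1': 'val1',
    'key2': 'val2',
    'key3.subkey1': 'subval1',
    'key3.subkey2': 'subval2',
}

# Each flattened key is a valid JMESPath expression into the original dict.
for key in mydict_flat:
    print(jmespath.search(key, mydict))
# val1, val2, subval1, subval2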

Related

Ansible: split a dictionary with list values to a list of dictionaries with a single item from the list as value

I need to convert a dictionary with list values into a list of dictionaries.
Given:
my_dict:
  key1: ["111", "222"]
  key2: ["444", "555"]
Desired output:
my_list:
  - key1: "111"
    key2: "444"
  - key1: "222"
    key2: "555"
What I've tried:
- set_fact:
    my_list: "{{ my_list | default([]) + [{item.0.key: item.1}] }}"
  loop: "{{ my_dict | dict2items | subelements('value') }}"
And what I've got:
[
  {
    "key1": "111"
  },
  {
    "key1": "222"
  },
  {
    "key2": "444"
  },
  {
    "key2": "555"
  }
]
Thankful for any help and suggestions!
Get the keys and values of the dictionary first
keys: "{{ my_dict.keys()|list }}"
vals: "{{ my_dict.values()|list }}"
gives
keys: [key1, key2]
vals:
- ['111', '222']
- ['444', '555']
Transpose the values
- set_fact:
    tvals: "{{ tvals|d(vals.0)|zip(item)|map('flatten') }}"
  loop: "{{ vals[1:] }}"
gives
tvals:
- ['111', '444']
- ['222', '555']
Create the list of the dictionaries
my_list: "{{ tvals|map('zip', keys)|
map('map', 'reverse')|
map('community.general.dict')|
list }}"
gives
my_list:
  - key1: '111'
    key2: '444'
  - key1: '222'
    key2: '555'
Notes
Example of a complete playbook
- hosts: localhost
  vars:
    my_dict:
      key1: ["111", "222"]
      key2: ["444", "555"]
    keys: "{{ my_dict.keys()|list }}"
    vals: "{{ my_dict.values()|list }}"
    my_list: "{{ tvals|map('zip', keys)|
                 map('map', 'reverse')|
                 map('community.general.dict')|
                 list }}"
  tasks:
    - set_fact:
        tvals: "{{ tvals|d(vals.0)|zip(item)|map('flatten') }}"
      loop: "{{ vals[1:] }}"
    - debug:
        var: my_list
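For readers who think in Python, here is a plain-Python sketch of the same transformation (illustrative only, not part of the playbook):
my_dict = {'key1': ['111', '222'], 'key2': ['444', '555']}

keys = list(my_dict.keys())
vals = list(my_dict.values())

# zip(*vals) transposes the value lists; each resulting column is paired
# with the keys to build one dictionary per column.
my_list = [dict(zip(keys, column)) for column in zip(*vals)]
print(my_list)
# [{'key1': '111', 'key2': '444'}, {'key1': '222', 'key2': '555'}]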
You can use a custom filter to transpose the matrix. For example,
shell> cat filter_plugins/numpy.py
# All rights reserved (c) 2022, Vladimir Botka <vbotka#gmail.com>
# Simplified BSD License, https://opensource.org/licenses/BSD-2-Clause

from __future__ import (absolute_import, division, print_function)
__metaclass__ = type

from ansible.errors import AnsibleFilterError
from ansible.module_utils.common._collections_compat import Sequence

import json
import numpy


def numpy_transpose(arr):
    if not isinstance(arr, Sequence):
        raise AnsibleFilterError('First argument for numpy_transpose must be list. %s is %s' %
                                 (arr, type(arr)))
    arr1 = numpy.array(arr)
    arr2 = arr1.transpose()
    return json.dumps(arr2.tolist())


class FilterModule(object):
    ''' Ansible wrappers for Python NumPy methods '''

    def filters(self):
        return {
            'numpy_transpose': numpy_transpose,
        }
Then you can avoid iteration. For example, the playbook below gives the same result
- hosts: localhost
  vars:
    my_dict:
      key1: ["111", "222"]
      key2: ["444", "555"]
    keys: "{{ my_dict.keys()|list }}"
    vals: "{{ my_dict.values()|list }}"
    tvals: "{{ vals|numpy_transpose()|from_yaml }}"
    my_list: "{{ tvals|map('zip', keys)|
                 map('map', 'reverse')|
                 map('community.general.dict')|
                 list }}"
  tasks:
    - debug:
        var: my_list
Transposing explained
Let's start with a 2x2 matrix
vals:
- ['111', '222']
- ['444', '555']
The task below
- set_fact:
    tvals: "{{ tvals|d(vals.0)|zip(item) }}"
  loop: "{{ vals[1:] }}"
gives step by step:
a) Before the iteration starts the variable tvals is assigned the default value vals.0
vals.0: ['111', '222']
b) The task iterates the list vals[1:]. These are all lines in the array except the first one
vals[1:]:
- ['444', '555']
c) The first, and only, iteration zips the first and the second line. This is the result
vals.0|zip(vals.1):
- ['111', '444']
- ['222', '555']
Let's proceed with a 3x3 matrix
vals:
- ['111', '222', '333']
- ['444', '555', '666']
- ['777', '888', '999']
The task below
- set_fact:
    tvals: "{{ tvals|d(vals.0)|zip(item)|map('flatten') }}"
  loop: "{{ vals[1:] }}"
gives step by step:
a) Before the iteration starts the variable tvals is assigned the default value vals.0
vals.0: ['111', '222', '333']
b) The task iterates the list vals[1:]
vals[1:]:
- ['444', '555', '666']
- ['777', '888', '999']
c) The first iteration zips the first and the second line, and assigns the result to tvals. The flatten filter has no effect on the lines
vals.0|zip(vals.1)|map('flatten'):
- ['111', '444']
- ['222', '555']
- ['333', '666']
d) The next iteration zips tvals and the third line
tvals|zip(vals.2):
  - - ['111', '444']
    - '777'
  - - ['222', '555']
    - '888'
  - - ['333', '666']
    - '999'
e) The lines must be flattened. This is the result
tvals|zip(vals.2)|map('flatten'):
- ['111', '444', '777']
- ['222', '555', '888']
- ['333', '666', '999']
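The same iterative procedure written in plain Python may make the steps easier to follow (a sketch for illustration only):
def flatten_one(x):
    # One-level flatten, similar in spirit to Jinja's flatten filter.
    out = []
    for item in x:
        if isinstance(item, list):
            out.extend(item)
        else:
            out.append(item)
    return out


vals = [['111', '222', '333'],
        ['444', '555', '666'],
        ['777', '888', '999']]

tvals = vals[0]                     # the default value before the loop
for row in vals[1:]:
    tvals = [flatten_one(pair) for pair in zip(tvals, row)]

print(tvals)
# [['111', '444', '777'], ['222', '555', '888'], ['333', '666', '999']]
In plain Python the whole transpose collapses to list(zip(*vals)), which is essentially what the numpy_transpose filter above computes.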

PyParsing parsing section with/without space

I want to parse the variables in the following code, with or without spaces around them, inside an existing line. If there is no space, I cannot distinguish the variable from a plain string
from pyparsing import *
Jinja_str_all = NotAny(Regex(r"{{"))+Word(printables)
Jinja_str_all1 = Word(printables)
Jinja_str = Word(alphas)
Jinja_Var_start = Regex(r"{{")
Jinja_Var_end = Regex(r"}}")
test1 = """
{{ variable }}
{{variable}}
aldkjflsdf {{ variable }}
aldkjflsdf{{ variable }}
aldkjflsdf {{ variable }} asdflskdfjlj {{ bbb }}
aldkjflsdf{{ variable }}asdflskdfjlj{{ bbb }}sdfsdfwerwr"""
test2 = "aldkjflsdf {{ variable }}"
line_Variable = ZeroOrMore(Jinja_str_all) + Group(Jinja_Var_start+OneOrMore(Jinja_str) + Jinja_Var_end) + ZeroOrMore(Jinja_str_all)
for a in test1.split("\n"):
    print(a)
    print(line_Variable.parseString(a))
It should be possible to parse out the variables in all of these variations.
This is the expression that is causing the run-on in parsing the names immediately before the '{' character:
Jinja_str_all = NotAny(Regex(r"{{"))+Word(printables)
Pyparsing, unlike regex, does no implicit backtracking. The Word class is especially greedy, so Word(printables) is going to consume every non-space character up to the end of the line, including the following '{' character.
If I understand your format, you do not want to include '{}' characters in these names, but to use them for grouping and delimiters, so you should exclude them from the list of characters the Word should match on.
This is easily done by adding the excludeChars argument to Word:
Jinja_str_all = NotAny(Regex(r"{{"))+Word(printables, excludeChars='{}')
line_Variable is also a bit more complicated than it needs to be. You just need to match one or more Jinja_str_all OR the grouped expression:
line_Variable = OneOrMore(Jinja_str_all
                          | Group(Jinja_Var_start
                                  + OneOrMore(Jinja_str)
                                  + Jinja_Var_end)
                          )
Using the new ellipsis notation, you can also write this as:
line_Variable = (Jinja_str_all
                 | Group(Jinja_Var_start
                         + Jinja_str[1, ...]
                         + Jinja_Var_end)
                 )[1, ...]
I chose to use runTests to test your input strings:
line_Variable[...].runTests(test1)
This runs the expression on each line in the test1 variable, and dumps the parsed result, or shows where the parsing goes astray:
{{ variable }}
[['{{', 'variable', '}}']]
[0]:
['{{', 'variable', '}}']
{{variable}}
[['{{', 'variable', '}}']]
[0]:
['{{', 'variable', '}}']
aldkjflsdf {{ variable }}
['aldkjflsdf', ['{{', 'variable', '}}']]
[0]:
aldkjflsdf
[1]:
['{{', 'variable', '}}']
aldkjflsdf{{ variable }}
['aldkjflsdf', ['{{', 'variable', '}}']]
[0]:
aldkjflsdf
[1]:
['{{', 'variable', '}}']
aldkjflsdf {{ variable }} asdflskdfjlj {{ bbb }}
['aldkjflsdf', ['{{', 'variable', '}}'], 'asdflskdfjlj', ['{{', 'bbb', '}}']]
[0]:
aldkjflsdf
[1]:
['{{', 'variable', '}}']
[2]:
asdflskdfjlj
[3]:
['{{', 'bbb', '}}']
aldkjflsdf{{ variable }}asdflskdfjlj{{ bbb }}sdfsdfwerwr
['aldkjflsdf', ['{{', 'variable', '}}'], 'asdflskdfjlj', ['{{', 'bbb', '}}'], 'sdfsdfwerwr']
[0]:
aldkjflsdf
[1]:
['{{', 'variable', '}}']
[2]:
asdflskdfjlj
[3]:
['{{', 'bbb', '}}']
[4]:
sdfsdfwerwr
(I just got tired of writing those little test loops for every parser demo...)
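Putting the pieces together, a self-contained version of the corrected parser could look like this (a sketch assuming pyparsing 2.4.1 or later for the [1, ...] notation):
from pyparsing import Group, NotAny, Regex, Word, alphas, printables

# Plain text: a run of non-space characters that is not the start of a
# Jinja variable and contains no braces.
Jinja_str_all = NotAny(Regex(r"{{")) + Word(printables, excludeChars="{}")
Jinja_str = Word(alphas)
Jinja_Var_start = Regex(r"{{")
Jinja_Var_end = Regex(r"}}")

line_Variable = (Jinja_str_all
                 | Group(Jinja_Var_start + Jinja_str[1, ...] + Jinja_Var_end)
                 )[1, ...]

test1 = """\
{{ variable }}
{{variable}}
aldkjflsdf {{ variable }}
aldkjflsdf{{ variable }}
aldkjflsdf {{ variable }} asdflskdfjlj {{ bbb }}
aldkjflsdf{{ variable }}asdflskdfjlj{{ bbb }}sdfsdfwerwr"""

line_Variable.runTests(test1)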

How to get an XCom value in a normal function, or how to use the PubSubPublishOperator

I am trying to pass an XCom value to a normal function, but it is passing the literal template string instead of the actual value.
I tried the sample code below:
def getArgsForPractice(practice, messageId, status_result):
    practice_args = dict()
    practice_args['practice_id'] = practice
    practice_args['message_id'] = messageId
    practice_args['status'] = status_result
    practice_args_json = json.dumps(practice_args)
    message = {'data': base64.b64encode(practice_args_json.encode('utf-8')).decode()}
    return message
PubSubPublishSuccess = PubSubPublishOperator(
    task_id='publish-messages_success',
    topic=PUB_SUB_TOPIC,
    project=PROJECT_ID,
    messages=[
        getArgsForPractice(
            "{{ task_instance.xcom_pull('get_practice_id_task', key='return_value')[0]}}",
            "{{ task_instance.xcom_pull('get_measure_id_task', key='return_value')[0]}}",
            "SUCCESS"
        )
    ],
    dag=dag)
The output value I see looks like this:
{"practice_id": "{{ task_instance.xcom_pull('get_practice_id_task', key='return_value')[0]}}", "message_id": "{{ task_instance.xcom_pull('get_measure_id_task', key='return_value')[0]}}", "status": "SUCCESS"} │ 599454601822320 │

How to replace string by its own part

I have one column in data.table in R which looks like this.
[1] "<= MSG: 'ACK', BODY: '{\"MessageRep\":{\"Parameters\":[\"UNIT_RESULT\",\"SK190400\",
[2] "=> MSG: 'MessageReq', BODY: '{\"MessageReq\":{\"Parameters\":[\"UNIT_CHECKIN\",\"SK190400\",
[3] "<= MSG: 'ACK', BODY: '{\"MessageRep\":{\"Parameters\":[\"UNIT_CHECKIN\",\"SK190400\",
[4] "=> MSG: 'MessageReq', BODY: '{\"MessageReq\":{\"Parameters\":[\"OEE_DATA\",
[5] "<= MSG: 'ACK', BODY: '{\"MessageRep\":{\"Parameters\":[\"PING\",\"SK190400\",
But the only thing I care about is whether it is "UNIT_RESULT", "UNIT_CHECKIN", "OEE_DATA" or "PING", so I would like to replace each row with the matching string ("UNIT_RESULT" etc.)
The result should look like:
[1] "UNIT_RESULT"
[2] "UNIT_CHECKIN"
[3] "UNIT_CHECKIN"
[4] "OEE_DATA"
[5] "PING"
I have spent many hours trying to find out how to replace a string by its own part, but nothing gave me a useful result.
Replace specific characters within strings
Reference - What does this regex mean?
Test if characters in string in R
In the beginning the function substring(x, 53, 63) looked like a solution to me, but it just picks characters at fixed positions in the string, so unless all rows are identical it is useless.
Any hints?
The str_match_all function will apply a regex to each element of a vector of strings and return only the match. So we can make a list of all the terms we want to extract and use paste0 to join them together with the | OR operator to make a single regular expression that matches any of the 4 desired terms.
Then we just run the str_match_all function and unlist the resulting list into a character vector.
strings <- c("<= MSG: 'ACK', BODY: '{\"MessageRep\":{\"Parameters\":[\"UNIT_RESULT\",\"SK190400\"",
"=> MSG: 'MessageReq', BODY: '{\"MessageReq\":{\"Parameters\":[\"UNIT_CHECKIN\",\"SK190400\"",
"<= MSG: 'ACK', BODY: '{\"MessageRep\":{\"Parameters\":[\"UNIT_CHECKIN\",\"SK190400\"",
"=> MSG: 'MessageReq', BODY: '{\"MessageReq\":{\"Parameters\":[\"OEE_DATA\"",
"<= MSG: 'ACK', BODY: '{\"MessageRep\":{\"Parameters\":[\"PING\",\"SK190400\""
)
items <- c('UNIT_RESULT', 'UNIT_CHECKIN', 'OEE_DATA', 'PING')
library(stringr)
unlist(str_match_all(strings, paste0(items,collapse = '|')))
[1] "UNIT_RESULT" "UNIT_CHECKIN" "UNIT_CHECKIN" "OEE_DATA" "PING"
An alternative is to use str_extract. You pass your string as the 'string' argument and the alternatives you gave as the 'pattern' argument, and it will return whatever of your alternatives is the first one to appear in the string.
library(stringr)
DT[, newstring := str_extract(string_column, "UNIT_RESULT|UNIT_CHECKIN|OEE_DATA|PING")]
I suggest
gsub("^.*?(UNIT_RESULT|UNIT_CHECKIN|OEE_DATA|PING).*","\\1",strings,perl=TRUE)
If you do not have a finite list of strings you are searching for, I would recommend using a regex pattern. Here is one that works based on the examples you provided:
# Code to create example data.table
library(data.table)
dt <- data.table(f1 = c("<= MSG: 'ACK', BODY: '{\"MessageRep\":{\"Parameters\":[\"UNIT_RESULT\",\"SK190400\"",
"=> MSG: 'MessageReq', BODY: '{\"MessageReq\":{\"Parameters\":[\"UNIT_CHECKIN\",\"SK190400\"",
"<= MSG: 'ACK', BODY: '{\"MessageRep\":{\"Parameters\":[\"UNIT_CHECKIN\",\"SK190400\"",
"=> MSG: 'MessageReq', BODY: '{\"MessageReq\":{\"Parameters\":[\"OEE_DATA\"",
"<= MSG: 'ACK', BODY: '{\"MessageRep\":{\"Parameters\":[\"PING\",\"SK190400\""
))
# Start of code to parse out values:
rex_pattern <- "(?<=(\"))[A-Z]{2,}_*[A-Z]+(?=(\"))"
dt[, .(parsed_val = regmatches(f1, regexpr(pattern = rex_pattern, f1, perl = TRUE)))]
This gives you:
parsed_val
1: UNIT_RESULT
2: UNIT_CHECKIN
3: UNIT_CHECKIN
4: OEE_DATA
5: PING
If you really want to "overwrite" the original field f1 with the new substring, you can use the following:
dt[, `:=`(f1 = regmatches(f1, regexpr(pattern = rex_pattern, f1, perl = TRUE)))]

Ansible - math operation, subtract

Trying to subtract a number from a variable, which is an int, in Ansible.
vars:
  number: 30
tasks:
  - set_fact: me={{ number - 1 }}
  - debug: var=me
Expectation: me = 29
Result:
fatal: [node1]: FAILED! => {"failed": true, "msg": "Unexpected templating type error occurred on ({{ number - 1 }}): unsupported operand type(s) for -: 'AnsibleUnicode' and 'int'"}
It is a known issue with Ansible/Jinja that you can't preserve numeric type after templating.
Use the int filter inside the {{..}} expression:
- set_fact: me={{ number | int - 1 }}
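To see why the cast is needed, the behaviour can be reproduced with plain Jinja2 outside Ansible (a minimal sketch; templated variables arrive as strings unless Jinja2 native types are enabled):
import jinja2

env = jinja2.Environment()
# The variable comes in as a string, so it must be cast before arithmetic.
print(env.from_string("{{ number | int - 1 }}").render(number="30"))  # 29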

Resources