How to recursively print a Python dictionary and its subdictionaries with whitespace alignment into columns

I want to create a function that can take a dictionary of dictionaries such as the following
information = {
    "sample information": {
        "ID": 169888,
        "name": "ttH",
        "number of events": 124883,
        "cross section": 0.055519,
        "k factor": 1.0201,
        "generator": "pythia8",
        "variables": {
            "trk_n": 147,
            "zappo_n": 9001
        }
    }
}
and then print it in a neat way such as the following, with alignment of keys and values using whitespace:
sample information:
 ID:               169888
 name:             ttH
 number of events: 124883
 cross section:    0.055519
 k factor:         1.0201
 generator:        pythia8
 variables:
  trk_n:   147
  zappo_n: 9001
My attempt at the function is the following:
def printDictionary(
    dictionary = None,
    indentation = ''
):
    for key, value in dictionary.iteritems():
        if isinstance(value, dict):
            print("{indentation}{key}:".format(
                indentation = indentation,
                key = key
            ))
            printDictionary(
                dictionary = value,
                indentation = indentation + ' '
            )
        else:
            print(indentation + "{key}: {value}".format(
                key = key,
                value = value
            ))
It produces the output like the following:
sample information:
 name: ttH
 generator: pythia8
 cross section: 0.055519
 variables:
  zappo_n: 9001
  trk_n: 147
 number of events: 124883
 k factor: 1.0201
 ID: 169888
As shown, it successfully prints the dictionary of dictionaries recursively; however, it does not align the values into a neat column. What would be a reasonable way of doing this for dictionaries of arbitrary depth?

Try using the pprint module. Instead of writing your own function, you can do this:
import pprint
pprint.pprint(my_dict)
Be aware that this will print characters such as { and } around your dictionary and [] around your lists, but if you can ignore them, pprint() will take care of all the nesting and indentation for you.
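Note that pprint handles the nesting but does not pad keys so that the values line up in a column. If that alignment matters, a small recursive helper can do it. The following is only a sketch, written for Python 3: the name format_dictionary and the step parameter are made up for this example, and it assumes every non-dictionary value can be rendered with str().

```python
def format_dictionary(dictionary, indentation="", step="    "):
    """Return output lines for a nested dict, padding keys so that
    the values at each nesting level start in the same column."""
    lines = []
    # Longest "key:" label among the non-dict values at this level.
    labels = [
        "{}:".format(key)
        for key, value in dictionary.items()
        if not isinstance(value, dict)
    ]
    width = max((len(label) for label in labels), default=0)
    for key, value in dictionary.items():
        label = "{}:".format(key)
        if isinstance(value, dict):
            # Sub-dictionary: print its name, then recurse one level deeper.
            lines.append(indentation + label)
            lines.extend(format_dictionary(value, indentation + step, step))
        else:
            lines.append(indentation + label.ljust(width) + " " + str(value))
    return lines


information = {
    "sample information": {
        "ID": 169888,
        "name": "ttH",
        "number of events": 124883,
        "cross section": 0.055519,
        "k factor": 1.0201,
        "generator": "pythia8",
        "variables": {"trk_n": 147, "zappo_n": 9001},
    }
}

lines = format_dictionary(information)
print("\n".join(lines))
```

Each level is aligned independently, so a long key deep in the structure does not push the columns of the outer levels to the right.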

Related

Airflow SqlToS3Operator adds an unwanted index column at the beginning

Recent versions of airflow-providers-amazon have deprecated MySQLToS3Operator and introduced SqlToS3Operator, which now adds an index column at the beginning of the CSV dump.
For example, if I run the following
sql_to_s3_task = SqlToS3Operator(
    task_id="sql_to_s3_task",
    sql_conn_id=conn_id_name,
    query="SELECT created_at, score FROM my_table",
    s3_bucket=bucket_name,
    s3_key=key,
    replace=True,
)
The S3 file has something like this:
,created_at,score
1,2023-01-01,5
2,2023-01-02,6
The output seems to be a direct dump from Pandas. How can I remove this unwanted preceding index column?

The operator uses a pandas DataFrame under the hood.
Use pd_kwargs, which lets you pass extra keyword arguments through to DataFrame.to_parquet(), .to_json() or .to_csv().
Since your output is CSV, the relevant pandas.DataFrame.to_csv parameters are:
header: bool or list of str, default True
    Write out the column names. If a list of strings is given it is assumed to be aliases for the column names.
index: bool, default True
    Write row names (index).
Thus you can do the following (pass only "index": False if you still want the header row with the column names):
sql_to_s3_task = SqlToS3Operator(
    task_id="sql_to_s3_task",
    sql_conn_id=conn_id_name,
    query="SELECT created_at, score FROM my_table",
    s3_bucket=bucket_name,
    s3_key=key,
    replace=True,
    file_format="csv",
    pd_kwargs={"index": False, "header": False},
)
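To see what these two options do without running Airflow, you can call to_csv directly on a small DataFrame. This is only a sketch: the DataFrame is made-up data shaped like the query result in the question, and it assumes the operator forwards pd_kwargs to to_csv unchanged, as described above.

```python
import pandas as pd

# Made-up rows shaped like the result of the query in the question.
df = pd.DataFrame(
    {"created_at": ["2023-01-01", "2023-01-02"], "score": [5, 6]}
)

# Default behaviour: the row index is written as an unnamed first column.
with_index = df.to_csv()

# index=False drops the index column but keeps the header row.
without_index = df.to_csv(index=False)

# index=False together with header=False leaves only the data rows.
data_only = df.to_csv(index=False, header=False)

print(with_index)
print(without_index)
print(data_only)
```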

Passing array as input variable into a Ruby file from zsh CLI

I am writing a Ruby file that is called from zsh and, among other things, I am trying to pass an array as an input variable like this:
ruby cim_manager.rb test --target=WhiteLabel --devices=["iPhone 8", "iPhone 12 Pro"]
Inside my ruby file I have a function:
# Generates a hash value from an array of arguments
#
# @param [Array<String>] The array of values. Each value of the array needs to separate the key and the value with "=". All "--" substrings will be replaced for empty substrings
#
# @return [Hash]
#
def generate_hash_from_arguemnts(args)
  hash = {}
  args.each{ |item|
    item = item.gsub("--", "")
    item = item.split("=")
    puts item.kind_of?(Array)
    hash[item[0].to_s] = item[1].to_s
  }
  return hash
end
So I can have a value like:
{"target": "WhiteLabel", "devices": ["iPhone 8", "iPhone 12 Pro"]}
The error I am getting when executing my Ruby file is:
foo@Mac-mini fastlane % ruby cim_manager.rb test --target=WhiteLabel --devices=["iPhone 8", "iPhone 12 Pro"]
zsh: bad pattern: --devices=[iPhone 8,
Any ideas?

@ReimondHill: I don't see how the error could be related to Ruby. Your zsh command line contains --devices=[.... You would get the same error from
echo --devices=["iPhone 8", "iPhone 12 Pro"]
An open square bracket is a zsh wildcard construct; for instance, [aeiou] is a wildcard that matches a vowel in a file name. Hence, this parameter tries to match files in your working directory whose names start with --devices=, so you would expect an error message like no matches found: --devices=.... However, there is one gotcha: the list of characters between [ ... ] must not contain an (unescaped) space, because the space ends the word and leaves the bracket unclosed. Therefore, you don't see no matches found, but bad pattern.
After all, you don't want filename expansion to occur; you want to pass the parameter to your program unchanged. Therefore, you need to quote it:
ruby .... '--devices=["iPhone 8", "iPhone 12 Pro"]'
Ronald
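The effect of the quotes can be illustrated with Python's shlex module, which splits a command line into words using POSIX shell quoting rules. This is only an approximation: shlex does not perform zsh pattern matching, so it cannot reproduce the bad pattern error itself, but it does show which words the shell builds before the pattern is examined.

```python
import shlex

# The command line from the question, without protective quotes.
unquoted = (
    'ruby cim_manager.rb test --target=WhiteLabel '
    '--devices=["iPhone 8", "iPhone 12 Pro"]'
)

# The same command line with the whole parameter in single quotes.
quoted = (
    "ruby cim_manager.rb test --target=WhiteLabel "
    "'--devices=[\"iPhone 8\", \"iPhone 12 Pro\"]'"
)

unquoted_words = shlex.split(unquoted)
quoted_words = shlex.split(quoted)

# Unquoted: the double quotes are removed and the space after the comma
# ends the word, so the parameter is broken into two words.
print(unquoted_words[-2:])

# Quoted: the parameter arrives as one word with its double quotes intact.
print(quoted_words[-1])
```

The first of the two unquoted words is exactly the text that zsh reports in its bad pattern message.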
Following the answer from @user1934428, I am extending my Ruby file like this:
# Generates a hash value from an array of arguments
#
# @param [Array<String>] The array of values. Each value of the array needs to separate the key and the value with "=". All "--" substrings will be replaced for empty substrings
#
# @return [Hash]
#
def generate_hash_from_arguemnts(args)
  hash = {}
  args.each{ |item|
    item = item.gsub("--", "")
    item = item.gsub("\"", "")
    item = item.split("=")
    key = item[0].to_s
    value = item[1]
    if value.include?("[") && value.include?("]") #Or any other pattern you decide
      value = value.gsub("[","")
      value = value.gsub("]","")
      value = value.split(",")
    end
    hash[key] = value
  }
  return hash
end
And then my zsh-line:
ruby cim_manager.rb test --target=WhiteLabel --devices='[iPhone 8,iPhone 12 Pro]'
The return value from generate_hash_from_arguemnts prints:
{"target"=>"WhiteLabel", "devices"=>["iPhone 8", "iPhone 12 Pro"]}

after match, get next line in a file using python

I have a file with multiple lines like this:
Port id: 20
Port Discription: 20
System Name: cisco-sw-1st
System Description:
Cisco 3750cx Switch
I want to get the next line when a match is found on the previous line. How would I do that?
with open("system_detail.txt") as fh:
    show_lldp = fh.readlines()
    data_lldp = {}
    for line in show_lldp:
        if line.startswith("System Name: "):
            fields = line.strip().split(": ")
            data_lldp[fields[0]] = fields[1]
        elif line.startswith("Port id: "):
            fields = line.strip().split(": ")
            data_lldp[fields[0]] = fields[1]
        elif line.startswith("System Description:\n"):
            # here i Want to get the next line and append it to the dictionary as a value and assign a
            # key to it
            pass
print()
print(data_lldp)
Iterate over the lines of the file and call next() when a match is found.
Example:
data_lldp = {}
with open("system_detail.txt") as fh:
    for line in fh: #Iterate each line
        if line.startswith("System Name: "):
            fields = line.strip().split(": ")
            data_lldp[fields[0]] = fields[1]
        elif line.startswith("Port id: "):
            fields = line.strip().split(": ")
            data_lldp[fields[0]] = fields[1]
        elif line.startswith("System Description:\n"):
            data_lldp['Description'] = next(fh).strip() #Use next() to get next line, without surrounding whitespace
print()
print(data_lldp)
Check out this question about getting the next value (in your case, the next line) in a loop:
Python - Previous and next values inside a loop
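The next() approach can be tried without a file on disk. The sketch below feeds the sample from the question through io.StringIO; the function name parse_lldp and the "System Description" key are choices made for this example, and it assumes the description is always exactly one line long.

```python
import io

# The sample from the question, held in memory instead of a file.
sample = io.StringIO(
    "Port id: 20\n"
    "Port Discription: 20\n"
    "System Name: cisco-sw-1st\n"
    "System Description:\n"
    "  Cisco 3750cx Switch\n"
)


def parse_lldp(lines):
    """Collect selected 'key: value' lines, plus the line that follows
    the 'System Description:' heading."""
    data = {}
    lines = iter(lines)  # one shared iterator, so next() advances the loop too
    for line in lines:
        line = line.strip()
        if line.startswith(("System Name: ", "Port id: ")):
            key, value = line.split(": ", 1)
            data[key] = value
        elif line == "System Description:":
            # Consume the following line; "" is used if the input ends here.
            data["System Description"] = next(lines, "").strip()
    return data


data_lldp = parse_lldp(sample)
print(data_lldp)
```

Passing a default to next() avoids a StopIteration error when the heading happens to be the last line of the input.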

How to pull out variable-length lists using PyParsing

I have the following data:
protocol:: DHCP Other items
following:: line
I'd like to pull the "protocol" data strings into an array of ['dhcp', 'other', 'items'], and then have the following line parsed as a separate thing. I've tried 'protocol::' + OneOrMore(Word(alphas)), but that is eating up the following lines as well.
I've tried several different variations of this, and nothing has worked. Is there a preferred way of doing this?
For example:
from pyparsing import Group, OneOrMore, Word, alphanums

text = """
protocol:: DHCP some other thing
following:: line
"""
delim = '::'
proto_line = Group('protocol' + delim + OneOrMore(Word(alphanums)))
following_line = Group('following' + delim + OneOrMore(Word(alphanums)))
grammar = OneOrMore(proto_line | following_line)
print(grammar.parseString(text).dump())
prints:
[['DHCP', 'some', 'other', 'thing', 'following']]
[0]:
['DHCP', 'some', 'other', 'thing', 'following']
and I would like [[protocol stuff], [following stuff]].
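One possible explanation is that pyparsing skips whitespace between tokens by default, and newlines count as whitespace, so OneOrMore(Word(alphanums)) keeps going onto the next line. A common way around this is a negative lookahead that refuses to treat a word as a value when it is followed by the delimiter. The following is a sketch of that idea, not a confirmed answer from this thread: the names key and value are introduced for the example, and Suppress is used so that the literals do not appear in the results.

```python
from pyparsing import Group, OneOrMore, Suppress, Word, alphanums, alphas

text = """
protocol:: DHCP some other thing
following:: line
"""

delim = Suppress("::")

# A key is any word directly followed by the delimiter.
key = Word(alphas) + delim

# A value is a word that is NOT the start of a key (negative lookahead),
# so the list of values stops before "following::".
value = ~key + Word(alphanums)

proto_line = Group(Suppress("protocol") + delim + OneOrMore(value))
following_line = Group(Suppress("following") + delim + OneOrMore(value))
grammar = OneOrMore(proto_line | following_line)

result = grammar.parseString(text).asList()
print(result)
```

An alternative, not shown here, is to make the grammar line-oriented by removing the newline from the characters pyparsing treats as skippable whitespace.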

IndexError: list index out of range, scores.append( (fields[0], fields[1]))

I'm trying to read a file and put its contents in a list. I have done this many times before and it has worked, but this time it throws the error "list index out of range".
The code is:
with open("File.txt") as f:
    scores = []
    for line in f:
        fields = line.split()
        scores.append( (fields[0], fields[1]))
print(scores)
The text file is in the format:
Alpha:[0, 1]
Bravo:[0, 0]
Charlie:[60, 8, 901]
Foxtrot:[0]
I can't see why it is giving me this problem. Is it because I have more than one value for each item? Or is it the fact that I have a colon in my text file?
How can I get around this problem?
Thanks

If I understand you correctly, this code will print the desired result:
import re
with open("File.txt") as f:
    # Let's make dictionary for scores {name:scores}.
    scores = {}
    # Define regular expressions to parse team name and team scores from line.
    patternScore = r'\[([^\]]+)\]'
    patternName = r'(.*):'
    for line in f:
        # Find value for team name and its scores.
        fields = re.search(patternScore, line).groups()[0].split(', ')
        name = re.search(patternName, line).groups()[0]
        # Update dictionary with new value.
        scores[name] = fields
    # Print the first score of each entry, followed by its name.
    for key in scores:
        print (scores[key][0] + ':' + key)
You will receive the following output:
60:Charlie
0:Alpha
0:Bravo
0:Foxtrot
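As for why the original code fails: line.split() with no arguments splits on whitespace only, so a line such as Foxtrot:[0], which contains no spaces, yields a single field and fields[1] does not exist. Splitting on the colon avoids this. The sketch below is an alternative to the regular expressions above, under the assumption that the part after the colon is always a valid Python list literal; the function name read_scores is made up for the example, and the file contents are held in memory so the block runs on its own.

```python
import ast
import io

# The sample from the question, held in memory instead of File.txt.
sample = io.StringIO(
    "Alpha:[0, 1]\n"
    "Bravo:[0, 0]\n"
    "Charlie:[60, 8, 901]\n"
    "Foxtrot:[0]\n"
)

# Whitespace splitting gives only one field for a line without spaces,
# which is why fields[1] raises IndexError in the original code.
print("Foxtrot:[0]".split())


def read_scores(lines):
    """Map each name to its list of integer scores."""
    scores = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        # Split at the first colon: name on the left, list text on the right.
        name, _, raw = line.partition(":")
        scores[name] = ast.literal_eval(raw)
    return scores


scores = read_scores(sample)
print(scores)
```

Using ast.literal_eval keeps the scores as integers rather than strings, and it only evaluates literals, so it is safer than eval for this kind of input.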
