split a key-value pair in Python - dictionary

I have a dictionary as follows:
{
"age": "76",
"Bank": "98310",
"Stage": "final",
"idnr": "4578",
"last number + Value": "[345:K]"}
I am trying to adjust the dictionary by splitting the last key-value pair and creating a new key ('Total data'). It should look like this:
"Total data": [
{
"last number": "345",
"Value": "K"
}]
}
Does anyone know if there is a split function based on ':' and '+' or a for loop to accomplish this?
Thanks in advance.

One option is to take the last key from the dict, split the key on + and the value on : (after removing the outer square brackets), and zip the two lists into a new dict. This assumes the format of the data is always the same.
If you want Total data to contain a list, you can wrap the resulting dict in [].
from pprint import pprint
d = {
"age": "76",
"Bank": "98310",
"Stage": "final",
"idnr": "4578",
"last number + Value": "[345:K]"
}
last = list(d.keys())[-1]
d["Total data"] = dict(
zip(
(key.strip() for key in last.split('+')),
d[last].strip("[]").split(':')
)
)
pprint(d)
Output (tested with Python 3.9.4)
{'Bank': '98310',
'Stage': 'final',
'Total data': {'Value': 'K', 'last number': '345'},
'age': '76',
'idnr': '4578',
'last number + Value': '[345:K]'}
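If the goal is exactly the shape from the question (a list under Total data, key names without surrounding spaces, and the combined pair removed), the same idea can be sketched as a small helper. The name split_combined_pair and its parameters are hypothetical, and the sketch assumes the value always looks like [a:b].

```python
def split_combined_pair(data, combined_key, new_key="Total data"):
    """Return a copy of data with combined_key split into a nested list entry."""
    # Copy every pair except the combined one, so the input is not mutated.
    result = {k: v for k, v in data.items() if k != combined_key}
    # "last number + Value" -> ["last number", "Value"]
    names = [part.strip() for part in combined_key.split("+")]
    # "[345:K]" -> ["345", "K"]
    values = [part.strip() for part in data[combined_key].strip("[]").split(":")]
    # Wrap the new dict in a list, as requested in the question.
    result[new_key] = [dict(zip(names, values))]
    return result


d = {
    "age": "76",
    "Bank": "98310",
    "Stage": "final",
    "idnr": "4578",
    "last number + Value": "[345:K]",
}
last_key = list(d)[-1]  # dicts keep insertion order in Python 3.7+
converted = split_combined_pair(d, last_key)
```

Returning a new dict instead of modifying d in place is a design choice for this sketch; drop the filtering step if the original pair should stay.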


OpenAI package leaving linebreak in response

I've started using the OpenAI API in R. I downloaded the openai package. I keep getting a double linebreak in the text response. Here's an example of my code:
library(openai)
vector = create_completion(
model = "text-davinci-003",
prompt = "Tell me what the weather is like in London, UK, in Celsius in 5 words.",
max_tokens = 20,
temperature = 0,
echo = FALSE
)
vector_2 = vector$choices[1]
vector_2$text
[1] "\n\nRainy, mild, cool, humid."
Is there a way to get rid of this without 'correcting' the response text using other functions?
No, it's not possible.
The OpenAI API returns the completion with a leading \n\n by default. There's no parameter for the Completions endpoint to control this.
You need to remove the linebreaks manually.
An example response looks like this:
{
"id": "cmpl-uqkvlQyYK7bGYrRHQ0eXlWi7",
"object": "text_completion",
"created": 1589478378,
"model": "text-davinci-003",
"choices": [
{
"text": "\n\nThis is indeed a test",
"index": 0,
"logprobs": null,
"finish_reason": "length"
}
],
"usage": {
"prompt_tokens": 5,
"completion_tokens": 7,
"total_tokens": 12
}
}
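Removing the line breaks manually is a one-line step once the response has been parsed. The sketch below is in Python rather than R and works on a plain dict shaped like the example response; the helper name completion_texts is hypothetical. In R, base trimws() plays the same role.

```python
def completion_texts(response):
    """Return the text of every choice with leading line breaks removed."""
    return [choice["text"].lstrip("\n") for choice in response["choices"]]


# A trimmed copy of the example response above.
response = {
    "object": "text_completion",
    "choices": [
        {
            "text": "\n\nThis is indeed a test",
            "index": 0,
            "logprobs": None,
            "finish_reason": "length",
        }
    ],
}
texts = completion_texts(response)
```

Using lstrip("\n") only touches the start of the string, so line breaks inside a longer completion are kept.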

I'm having trouble with parsing a JSON file

I am attempting to use a .json file I found online, but I'm starting to think that there is an underlying issue with the file. I am not very familiar with JSON files, so I am trying to convert it into a CSV file. I have yet to find a website that can do that for me.
The file is quite large, and I assume most websites have a size limit, so I've tried using R to convert it. I tried flattening it in R with this code:
library(jsonlite)
library(tidyr)
library(tidyverse)
json_string <- readLines("data.json")
json_data <- fromJSON(json_string)
json_data <- flatten(json_data)
df <- as_data_frame(json_data)
write_csv(df, "output.csv")
but it returns this error:
! Tibble columns must have compatible sizes.
* Size 2: Columns `A-Alrund, God of the Cosmos // A-Hakka, Whispering Raven`, `A-Blessed Hippogriff // A-Tyr's Blessing`, `A-Emerald Dragon // A-Dissonant Wave`, `A-Monster Manual // A-Zoological Study`, `A-Rowan, Scholar of Sparks // A-Will, Scholar of Frost`, and 484 more.
* Size 3: Column `Smelt // Herd // Saw`.
* Size 5: Column `Who // What // When // Where // Why`.
* Size 6: Columns `Everythingamajig`, `Garbage Elemental`, `Ineffable Blessing`, `Knight of the Kitchen Sink`, `Scavenger Hunt`, and 4 more.
i Only values of size one are recycled.
Backtrace:
1. tibble::as_data_frame(json_data)
3. tibble:::as_tibble.list(x, ...)
4. tibble:::lst_to_tibble(x, .rows, .name_repair, col_lengths(x))
5. tibble:::recycle_columns(x, .rows, lengths)
Here is what the first 2 items of the .json file look like
{"data": {"\"Ach! Hans, Run!\"": [{"colorIdentity": ["G", "R"], "colors": ["G", "R"], "convertedManaCost": 6.0, "foreignData": [], "identifiers": {"scryfallOracleId": "a2c5ee76-6084-413c-bb70-45490d818374"}, "isFunny": true, "layout": "normal", "legalities": {}, "manaCost": "{2}{R}{R}{G}{G}", "manaValue": 6.0, "name": "\"Ach! Hans, Run!\"", "printings": ["UNH"], "purchaseUrls": {"cardKingdom": "https://mtgjson.com/links/84dfefe718a51cf8", "cardKingdomFoil": "https://mtgjson.com/links/d8c9f3fc1e93c89c", "cardmarket": "https://mtgjson.com/links/b9d69f0d1a9fb80c", "tcgplayer": "https://mtgjson.com/links/c51d2b13ff76f1f0"}, "rulings": [], "subtypes": [], "supertypes": [], "text": "At the beginning of your upkeep, you may say \"Ach! Hans, run! It's the . . .\" and the name of a creature card. If you do, search your library for a card with that name, put it onto the battlefield, then shuffle. That creature gains haste. Exile it at the beginning of the next end step.", "type": "Enchantment", "types": ["Enchantment"]}], "\"Brims\" Barone, Midway Mobster": [{"colorIdentity": ["B", "W"], "colors": ["B", "W"], "convertedManaCost": 5.0, "foreignData": [], "identifiers": {"scryfallOracleId": "c64c31f2-c1be-414e-9dff-c3b77ba97545"}, "isFunny": true, "layout": "normal", "leadershipSkills": {"brawl": false, "commander": true, "oathbreaker": false}, "legalities": {}, "manaCost": "{3}{W}{B}", "manaValue": 5.0, "name": "\"Brims\" Barone, Midway Mobster", "power": "5", "printings": ["UNF"], "purchaseUrls": {"cardKingdom": "https://mtgjson.com/links/d1e320bd9d6813c0", "cardKingdomFoil": "https://mtgjson.com/links/18f86e8a04682c34", "cardmarket": "https://mtgjson.com/links/d5a3d8cfb60767d4", "tcgplayer": "https://mtgjson.com/links/980f45f2bc8c3733"}, "rulings": [], "subtypes": ["Human", "Rogue"], "supertypes": ["Legendary"], "text": "When \"Brims\" Barone, Midway Mobster enters the battlefield, put a +1/+1 counter on each other creature you control that has a hat.\n\"Brims\" Barone, 
Midway Mobster has menace as long as you're wearing a hat.", "toughness": "4", "type": "Legendary Creature — Human Rogue", "types": ["Creature"]}]}
I am hoping that the resulting CSV file has the keys as the column names, with the values assigned to the columns based on their keys.
EDIT:
I have now attached a screenshot of what the json_data structure looks like.
Assuming it's one of the JSON dumps from scryfall, try this:
library(jsonlite)
library(tidyr)
library(tidyverse)
todo <- list.files(pattern = "\\.json$")
json_data <- fromJSON(todo)
json_data_flat_jsl <- jsonlite::flatten(json_data)
df <- as_tibble(json_data_flat_jsl)
write_csv(df, "output.csv")
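The error in the question comes from the shape of the file: each card name maps to a list whose length varies (one entry per face), so the names cannot be used directly as equally sized columns. A minimal sketch of one way to get "keys as columns" is to emit one row per list entry and flatten nested objects into dotted column names. It is written in Python with only the standard library; the helper names and the ";" separator for list values are assumptions, and lists of objects (such as rulings) are simply stringified.

```python
import csv
import io


def flatten_record(record, parent_key="", sep="."):
    """Flatten nested dicts into dotted keys; join list values into one string."""
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten_record(value, full_key, sep))
        elif isinstance(value, list):
            flat[full_key] = ";".join(str(item) for item in value)
        else:
            flat[full_key] = value
    return flat


def cards_to_rows(payload):
    """Turn {"data": {name: [face, ...]}} into one flat row per face."""
    return [
        flatten_record(face)
        for faces in payload["data"].values()
        for face in faces
    ]


def write_rows(rows, handle):
    """Write rows as CSV; the header is the union of all keys."""
    fieldnames = sorted({key for row in rows for key in row})
    writer = csv.DictWriter(handle, fieldnames=fieldnames, restval="")
    writer.writeheader()
    writer.writerows(rows)


# A trimmed version of the two cards shown in the question.
payload = {
    "data": {
        '"Ach! Hans, Run!"': [
            {"colors": ["G", "R"], "manaValue": 6.0, "legalities": {},
             "name": '"Ach! Hans, Run!"', "type": "Enchantment"}
        ],
        '"Brims" Barone, Midway Mobster': [
            {"colors": ["B", "W"], "manaValue": 5.0,
             "leadershipSkills": {"brawl": False, "commander": True},
             "name": '"Brims" Barone, Midway Mobster', "power": "5"}
        ],
    }
}
rows = cards_to_rows(payload)
output = io.StringIO()
write_rows(rows, output)
```

For a real file, the payload would come from json.load() on the opened file and the handle from open("output.csv", "w", newline="").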

Incrementing a value progressively on each item

I have a Grafana dashboard that I'm building with jq from several files. The problem is that when you export the JSON produced by Grafana, it exports the panels the way it currently sees them. Example:
[
{
"gridPos": {
"h": 1,
"w": 24,
"x": 0,
"y": 22
},
"panels": []
},
{
"gridPos": {
"h": 1,
"w": 24,
"x": 0,
"y": 43
},
"panels": []
},
{
"gridPos": {
"h": 1,
"w": 24,
"x": 0,
"y": 17
},
"panels": []
}
]
But the problem is that the grid positions need to be properly incremented (the Y's) so that when you reload the Grafana dashboards, the panels nested under row panels get set to their proper locations. If you have a sub panel that has a gridPos.y that is lower than the row panel's gridPos.y then it will appear in a weird location.
I tried using reduce and foreach, but I'm not very good with these constructs yet. For example, I tried this:
[
1 as $currentY |
foreach .[] as $item (
[];
(. + [$item * {"gridPos": {"y": ($currentY + 1)}}]);
. | last
)
]
But I can't figure out how to increment $currentY within the loop so that the values increase properly. The objective would be to nest a second foreach/reduce to continue setting and incrementing $currentY in all panels and sub panels.
Can you help? Thanks!
Note: I know I should use reduce when using .|last; this was just my last attempt. Don't point that out, I want guidance on how to increment $currentY in the current approach.
With your existing approach, you need to reference the y field of each $item as it is processed and increment that value, rather than the predefined value of $currentY, i.e.
[
1 as $currentY |
foreach .[] as $item (
[];
(. + [$item * {"gridPos": {"y": ($currentY + $item.gridPos.y )}}]);
last
)
]
which again could be written as
[
1 as $currentY |
foreach .[] as $item (
[];
(. + [ $item | .gridPos.y += $currentY ]);
last
)
]
which again could be written with a simple walk expression
1 as $currentY |
walk ( if type == "object" and has("gridPos") then .gridPos.y += $currentY else . end )
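For readers less familiar with jq, the walk version corresponds to a plain recursive traversal: visit every value, and wherever an object has a gridPos, add the offset to its y. A minimal Python sketch of that behaviour follows; the function name shift_grid_positions is hypothetical, and the offset of 1 mirrors $currentY above.

```python
def shift_grid_positions(node, offset=1):
    """Return a copy of node with offset added to every gridPos.y found."""
    if isinstance(node, list):
        return [shift_grid_positions(item, offset) for item in node]
    if isinstance(node, dict):
        # Process children first, like jq's walk (bottom-up).
        shifted = {k: shift_grid_positions(v, offset) for k, v in node.items()}
        grid = shifted.get("gridPos")
        if isinstance(grid, dict) and "y" in grid:
            shifted["gridPos"] = {**grid, "y": grid["y"] + offset}
        return shifted
    return node


panels = [
    {"gridPos": {"h": 1, "w": 24, "x": 0, "y": 22},
     "panels": [{"gridPos": {"h": 8, "w": 12, "x": 0, "y": 23}}]},
    {"gridPos": {"h": 1, "w": 24, "x": 0, "y": 43}, "panels": []},
    {"gridPos": {"h": 1, "w": 24, "x": 0, "y": 17}, "panels": []},
]
shifted_panels = shift_grid_positions(panels)
```

Because the traversal is recursive, panels nested under a row panel are shifted as well, which is the part the foreach versions do not cover.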

extract value from JSON object using SQLite and the json_tree function

I have a table named patrons that contains a column named json_patron_varfields holding JSON data: an array of objects that looks something like this:
[
{
"display_order": 1,
"field_content": "example 1",
"name": "Note",
"occ_num": 0,
"varfield_type_code": "x"
},
{
"display_order": 2,
"field_content": "example 2",
"name": "Note",
"occ_num": 1,
"varfield_type_code": "x"
},
{
"display_order": 3,
"field_content": "some field we do not want",
"occ_num": 0,
"varfield_type_code": "z"
}
]
What I'm trying to do is to target the objects that contain the key named varfield_type_code and the value of x which I've been able to do with the following query:
SELECT
patrons.patron_record_id,
json_extract(patrons.json_patron_varfields, json_tree.path)
FROM
patrons,
json_tree(patrons.json_patron_varfields)
WHERE
json_tree.key = 'varfield_type_code'
AND json_tree.value = 'x'
My question is: how do I extract (or even filter on) the values of the field_content keys from the objects I'm extracting?
I'm struggling with the syntax. I was thinking it could be as simple as using json_extract(patrons.json_patron_varfields, json_tree.path."field_content"), but that doesn't appear to be correct.
You can build the path string with the || concatenation operator:
json_tree.path || '.field_content'
With the structure you've given, you can also use json_each() instead of json_tree(), which may simplify things.
extract:
SELECT
patrons.patron_record_id,
json_extract(value, '$.field_content')
FROM
patrons,
json_each(patrons.json_patron_varfields)
WHERE json_extract(value, '$.varfield_type_code') = 'x'
filter:
SELECT
patrons.patron_record_id,
value
FROM
patrons,
json_each(patrons.json_patron_varfields)
WHERE json_extract(value, '$.varfield_type_code') = 'x'
AND json_extract(value, '$.field_content') = 'example 2'
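To try the path concatenation end to end, the sketch below runs the original json_tree query with the concatenated path against an in-memory database through Python's sqlite3 module. It assumes the underlying SQLite build includes the JSON functions; the helper name field_contents_for_type and the sample patron_record_id of 1 are made up for the example.

```python
import json
import sqlite3

VARFIELDS = [
    {"display_order": 1, "field_content": "example 1", "name": "Note",
     "occ_num": 0, "varfield_type_code": "x"},
    {"display_order": 2, "field_content": "example 2", "name": "Note",
     "occ_num": 1, "varfield_type_code": "x"},
    {"display_order": 3, "field_content": "some field we do not want",
     "occ_num": 0, "varfield_type_code": "z"},
]


def field_contents_for_type(conn, type_code):
    """Return (patron_record_id, field_content) for objects with the given code."""
    query = """
        SELECT patrons.patron_record_id,
               json_extract(patrons.json_patron_varfields,
                            json_tree.path || '.field_content')
        FROM patrons, json_tree(patrons.json_patron_varfields)
        WHERE json_tree.key = 'varfield_type_code'
          AND json_tree.value = ?
    """
    return conn.execute(query, (type_code,)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE patrons (patron_record_id INTEGER, json_patron_varfields TEXT)"
)
conn.execute("INSERT INTO patrons VALUES (?, ?)", (1, json.dumps(VARFIELDS)))
rows = field_contents_for_type(conn, "x")
```

For the matching rows, json_tree.path is the path of the containing array element (for example $[0]), so appending '.field_content' addresses the sibling key in the same object.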

how to print recursively a Python dictionary and its subdictionaries with whitespace alignment into columns

I want to create a function that can take a dictionary of dictionaries such as the following
information = {
"sample information": {
"ID": 169888,
"name": "ttH",
"number of events": 124883,
"cross section": 0.055519,
"k factor": 1.0201,
"generator": "pythia8",
"variables": {
"trk_n": 147,
"zappo_n": 9001
}
}
}
and then print it in a neat way such as the following, with alignment of keys and values using whitespace:
sample information:
   ID:                169888
   name:              ttH
   number of events:  124883
   cross section:     0.055519
   k factor:          1.0201
   generator:         pythia8
   variables:
      trk_n:          147
      zappo_n:        9001
My attempt at the function is the following:
def printDictionary(
    dictionary = None,
    indentation = ''
    ):
    for key, value in dictionary.iteritems():
        if isinstance(value, dict):
            print("{indentation}{key}:".format(
                indentation = indentation,
                key = key
            ))
            printDictionary(
                dictionary = value,
                indentation = indentation + ' '
            )
        else:
            print(indentation + "{key}: {value}".format(
                key = key,
                value = value
            ))
It produces the output like the following:
sample information:
 name: ttH
 generator: pythia8
 cross section: 0.055519
 variables:
  zappo_n: 9001
  trk_n: 147
 number of events: 124883
 k factor: 1.0201
 ID: 169888
As shown, it successfully prints the dictionary of dictionaries recursively; however, it does not align the values into a neat column. What would be a reasonable way of doing this for dictionaries of arbitrary depth?
Try using the pprint module. Instead of writing your own function, you can do this:
import pprint
pprint.pprint(my_dict)
Be aware that this will print characters such as { and } around your dictionary and [] around your lists, but if you can ignore them, pprint() will take care of all the nesting and indentation for you.
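pprint handles nesting and indentation but does not pad values into a shared column. If that alignment matters, one possible approach is two passes: first find the widest "key:" label (including its indentation) across all levels, then print every value padded to that width. The sketch below targets Python 3; the name format_dictionary and the default indent of 3 spaces are assumptions.

```python
def _scalar_label_widths(node, depth, indent_step):
    """Yield the width of every indented 'key:' label that precedes a value."""
    for key, value in node.items():
        if isinstance(value, dict):
            yield from _scalar_label_widths(value, depth + 1, indent_step)
        else:
            yield depth * indent_step + len(f"{key}:")


def format_dictionary(dictionary, indent_step=3):
    """Format nested dicts with all values aligned in one column."""
    # One space beyond the widest label, shared by every nesting level.
    column = max(_scalar_label_widths(dictionary, 0, indent_step), default=0) + 1
    lines = []

    def walk(node, depth):
        for key, value in node.items():
            label = " " * (depth * indent_step) + f"{key}:"
            if isinstance(value, dict):
                lines.append(label)
                walk(value, depth + 1)
            else:
                lines.append(label.ljust(column) + str(value))

    walk(dictionary, 0)
    return "\n".join(lines)


information = {
    "sample information": {
        "ID": 169888,
        "name": "ttH",
        "number of events": 124883,
        "cross section": 0.055519,
        "k factor": 1.0201,
        "generator": "pythia8",
        "variables": {"trk_n": 147, "zappo_n": 9001},
    }
}
formatted = format_dictionary(information)
print(formatted)
```

Returning a string rather than printing inside the recursion keeps the function easy to test; keys appear in insertion order, which Python 3.7+ dicts preserve.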
