Parsing error when trying to import JSON to R

I've got a JSON file that looks like this
I am trying to import it into R using the jsonlite package.
#Load package for import
library(jsonlite)
df <- fromJSON("test.json")
But it throws an error:
Error in parse_con(txt, bigint_as_char) : parse error: trailing garbage
ome in at a later time." } { "id": "e5fa37f44557c62ee
(right here) ------^
I've looked through the solutions on Stack Overflow, but haven't been able to figure this out.
Any inputs would be very helpful.

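The JSON file you linked contains two JSON objects placed back to back, which is not a single valid JSON document; the parser reads the first object and reports everything after it as "trailing garbage". Perhaps you want an array: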
[
{
"id": "71bb8883780bb152e4bb4db976bedc62",
"metadata": {
"abc_bad_date": "true",
"abc_client": "Hydra Corp",
"abc_doc_id": 1,
"abc_file": "Hydra Corp 2016.txt",
"abc_interview_type": "Post Analysis",
"abc_interviewee_role": "Director Corporate Engineering; Greater Chicago Area; Global Procurement Director Facilities and MRO",
"abc_interviewer": "Piper Thomas",
"abc_services_provided": "Food",
"section": "on_expectations"
},
"text": "Gerrit: There were a number ...."
},
{
"id": "e5fa37f44557c62eef44baafb13128f0",
"metadata": {
"abc_bad_date": "true",
"abc_client": "Hydra Corp",
"abc_doc_id": 1,
"abc_file": "Hydra Corp 2016.txt",
"abc_interview_type": "Post Analysis",
"abc_interviewee_role": "Director Corporate Engineering; Greater Chicago Area; Global Procurement Director Facilities and MRO",
"abc_interviewer": "Piper Thomas",
"abc_services_provided": "Painting",
"section": "on_relationships"
},
"text": "Gerrit: I thought the ABC ..."
}
]
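To illustrate why the parser stops at the second object, and how back-to-back objects can be gathered into one array, here is a minimal sketch. It is written in Python rather than R, and the helper name parse_concatenated and the sample string are made up for illustration; it assumes the file holds only whitespace between the objects.

```python
import json


def parse_concatenated(text):
    """Yield each top-level JSON value from a string of back-to-back JSON documents."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so skip it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        # raw_decode parses one value and returns the index where it ended.
        value, pos = decoder.raw_decode(text, pos)
        yield value


# Made-up sample: two objects with no enclosing array and no separating comma.
sample = '{"id": "a", "text": "first"} { "id": "b", "text": "second" }'

records = list(parse_concatenated(sample))

# Written back out as a single array, this is valid JSON for any standard parser.
print(json.dumps(records, indent=2))
```

Saving the printed array to a file should give something that fromJSON can read in one call.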

Related

jq replace values based on external map

I would like to change a field in my json file as specified by another json file. My input file is something like:
{"id": 10, "name": "foo", "some_other_field": "value 1"}
{"id": 20, "name": "bar", "some_other_field": "value 2"}
{"id": 25, "name": "baz", "some_other_field": "value 10"}
I have an external override file that specifies how name in certain objects should be overridden, for example:
{"id": 20, "name": "Bar"}
{"id": 10, "name": "foo edited"}
As shown above, the override may be shorter than input, in which case the name should be unchanged. Both files can easily fit into available memory.
Given the above input and the override, I would like to obtain the following output:
{"id": 10, "name": "foo edited", "some_other_field": "value 1"}
{"id": 20, "name": "Bar", "some_other_field": "value 2"}
{"id": 25, "name": "baz", "some_other_field": "value 10"}
Being a beginner with jq, I wasn't really sure where to start. While there are some questions that cover similar ground (the closest being this one), I couldn't figure out how to apply the solutions to my case.
There are many possibilities, but probably the simplest efficient solution uses the built-in function INDEX/2, e.g. as follows:
jq -n --slurpfile dict f2.json '
(INDEX($dict[]; .id) | map_values(.name)) as $d
| inputs
| .name = ($d[.id|tostring] // .name)
' f1.json
This uses inputs with the -n option to read the first file so that each JSON object can be processed in turn.
Since the solution is so short, it should be easy enough to figure it out with the aid of the online jq manual.
Caveat
This solution assumes that there are no "collisions" between ids in the dictionary. Because of the use of tostring, a collision would occur if, for example, {"id": 10} and {"id": "10"} both appeared.
If the dictionary does or might have such collisions, then the above solution can be tweaked accordingly, but it is a bit tricky.
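For readers more comfortable outside jq, the same lookup-table idea can be sketched in Python. This is only an illustration under stated assumptions: the two files are represented by inline strings, and the helper name load_json_lines is made up. Python dictionaries keep 10 and "10" as distinct keys, so the tostring collision described above does not arise in this variant.

```python
import json


def load_json_lines(text):
    """Parse one JSON object per non-empty line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


# Inline stand-ins for the input file and the override file from the question.
input_text = """
{"id": 10, "name": "foo", "some_other_field": "value 1"}
{"id": 20, "name": "bar", "some_other_field": "value 2"}
{"id": 25, "name": "baz", "some_other_field": "value 10"}
"""
override_text = """
{"id": 20, "name": "Bar"}
{"id": 10, "name": "foo edited"}
"""

# Build the id -> name dictionary once (the role INDEX plays in the jq program).
overrides = {obj["id"]: obj["name"] for obj in load_json_lines(override_text)}

result = []
for obj in load_json_lines(input_text):
    # Keep the existing name when the id has no override.
    obj["name"] = overrides.get(obj["id"], obj["name"])
    result.append(obj)

for obj in result:
    print(json.dumps(obj))
```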

Re-create openstack artifacts from previous command output?

Is there an easy way to convert the output of OpenStack show commands into openstack commands?
The goal is to rebuild an OpenStack environment after a complete wipe.
(For example: run openstack network show myNet > out.txt,
then somehow generate the OpenStack CLI command, with the appropriate fields, that re-creates this exact network based on out.txt.)
Thanks!
You can write the output of the show commands to a file as a JSON-formatted string, so that a Python script can easily read the information and create and execute the commands you want.
To print the output of an openstack command as JSON, add -f json at the end of the command.
Example:
openstack server show cirros -f json
{
"OS-DCF:diskConfig": "MANUAL",
"OS-EXT-AZ:availability_zone": "nova",
"OS-EXT-SRV-ATTR:host": "test-system",
"OS-EXT-SRV-ATTR:hypervisor_hostname": "test-system",
"OS-EXT-SRV-ATTR:instance_name": "instance-00000001",
"OS-EXT-STS:power_state": "Shutdown",
"OS-EXT-STS:task_state": null,
"OS-EXT-STS:vm_state": "stopped",
"OS-SRV-USG:launched_at": "2020-07-22T08:41:06.000000",
"OS-SRV-USG:terminated_at": null,
"accessIPv4": "",
"accessIPv6": "",
"addresses": "test-network=192.168.62.207",
"config_drive": "",
"created": "2020-07-22T08:40:46Z",
"flavor": "f1 (273a2179-ac85-4c54-a40a-2c0121b338ff)",
"id": "6d302fcf-4de3-45a5-93c0-eb95650e5952",
"image": "cirros (86dded1f-8e0f-4342-906e-8ff9fbd854e2)",
"name": "cirros",
"project_id": "cbba4b1f3cb4460ca63e8ddb87c9b5fb",
"properties": "",
"security_groups": "name='default'",
"status": "SHUTOFF",
"updated": "2020-08-17T13:26:55Z",
"user_id": "b6505d6801e84fb98d77d2461f9719c2",
"volumes_attached": ""
}
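A minimal sketch of such a Python script follows, using a network as in the question. Everything specific in it is an assumption: the sample JSON is made up, the function name build_create_command is hypothetical, and the mapping from output fields to options (--description, --mtu) covers only two fields and should be checked against openstack network create --help for your client version. The script only prints the command; it does not run it.

```python
import json
import shlex

# Hypothetical mapping from fields in the "show -f json" output to CLI options.
# Assumption: verify each option against `openstack network create --help`.
FIELD_TO_OPTION = {
    "description": "--description",
    "mtu": "--mtu",
}


def build_create_command(shown):
    """Build an argument list that re-creates the network described by `shown`."""
    cmd = ["openstack", "network", "create"]
    for field, option in FIELD_TO_OPTION.items():
        value = shown.get(field)
        # Skip fields that are absent or empty in the saved output.
        if value not in (None, ""):
            cmd += [option, str(value)]
    cmd.append(shown["name"])
    return cmd


# Made-up stand-in for the contents of a file written by
# `openstack network show myNet -f json > out.json`.
sample_output = '{"name": "myNet", "mtu": 1450, "description": ""}'

command = build_create_command(json.loads(sample_output))

# shlex.join quotes arguments safely for pasting into a shell.
print(shlex.join(command))
```

Returning a list rather than a string means the result could also be passed to subprocess.run without going through a shell, once the mapping has been verified.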

Importing csv file but few columns are full of special symbols

I have imported a movie dataset in CSV format. A few of the columns are full of special symbols mixed in with the data I need (an example is attached below, along with an image of the movie dataset). Do I have to remove those special characters individually, or is there any way (a shortcut) to remove them while importing the file into R? Thanks
Movie.csv Image
GENRE
[{"id": 28, "name": "Action"}, {"id": 12, "name": "Adventure"}, {"id": 14, "name": "Fantasy"}, {"id": 878, "name": "Science Fiction"}]
Spoken Languages
[{"iso_639_1": "en", "name": "English"}, {"iso_639_1": "es", "name": "Espa\u00f1ol"}]
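The "special symbols" in these cells look like JSON arrays of objects, so one option is to parse each cell as JSON instead of stripping characters. The sketch below shows the idea in Python on the two sample cells; the function name names_from_cell is made up, and it assumes every cell is a valid JSON array whose objects have a "name" key. The same idea should carry over to R by applying jsonlite's fromJSON to each cell.

```python
import json


def names_from_cell(cell):
    """Parse one JSON-encoded cell and return the "name" of each object in it."""
    return [item["name"] for item in json.loads(cell)]


# The two sample cells from the question, kept as raw strings so that the
# \u00f1 escape is decoded by the JSON parser rather than by Python.
genre_cell = r'[{"id": 28, "name": "Action"}, {"id": 12, "name": "Adventure"}, {"id": 14, "name": "Fantasy"}, {"id": 878, "name": "Science Fiction"}]'
languages_cell = r'[{"iso_639_1": "en", "name": "English"}, {"iso_639_1": "es", "name": "Espa\u00f1ol"}]'

genres = names_from_cell(genre_cell)
languages = names_from_cell(languages_cell)

print(", ".join(genres))
print(", ".join(languages))
```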

Deleting multiple keys at once with jq

I need to delete multiple keys at once from some JSON (using jq), and I'm trying to learn whether there is a better way of doing this than calling map and del every time. Here's my input data:
test.json
[
{
"label": "US : USA : English",
"Country": "USA",
"region": "US",
"Language": "English",
"locale": "en",
"currency": "USD",
"number": "USD"
},
{
"label": "AU : Australia : English",
"Country": "Australia",
"region": "AU",
"Language": "English",
"locale": "en",
"currency": "AUD",
"number": "AUD"
},
{
"label": "CA : Canada : English",
"Country": "Canada",
"region": "CA",
"Language": "English",
"locale": "en",
"currency": "CAD",
"number": "CAD"
}
]
For each item, I want to remove the number, Language, and Country keys. I can do that with this command:
$ cat test.json | jq 'map(del(.Country)) | map(del(.number)) | map(del(.Language))'
That works fine, and I get the desired output:
[
{
"label": "US : USA : English",
"region": "US",
"locale": "en",
"currency": "USD"
},
{
"label": "AU : Australia : English",
"region": "AU",
"locale": "en",
"currency": "AUD"
},
{
"label": "CA : Canada : English",
"region": "CA",
"locale": "en",
"currency": "CAD"
}
]
However, I'm trying to understand if there is a jq way of specifying multiple labels to delete, so I don't have to have multiple map(del()) directives?
You can provide a stream of paths to delete:
$ cat test.json | jq 'map(del(.Country, .number, .Language))'
Also, consider that, instead of blacklisting specific keys, you might prefer to whitelist the ones you do want:
$ cat test.json | jq 'map({label, region, locale, currency})'
There is no need to use both map and del.
You can pass multiple paths to del, separated by commas.
Here is a solution using "dot-style" path notation:
jq 'del( .[] .Country, .[] .number, .[] .Language )' test.json
This form doesn't require quotation marks (which you may feel makes it more readable), but it doesn't group the paths, so you have to retype .[] once per path.
Here is an example using "array-style" path notation, which allows you to combine paths with a common prefix like so:
jq 'del( .[] ["Country", "number", "Language"] )' test.json
This combines the subpaths under the "last common ancestor" (which in this case is the top-level list iterator .[]).
peak's answer uses map and delpaths, though it seems you can also use delpaths on its own:
jq '[.[] | delpaths( [["Country"], ["number"], ["Language"]] )]' test.json
This requires both quotation marks and an array of singleton arrays, and you have to put the results back into a list (with the start and end square brackets).
Overall, here I'd go for the array-style notation for brevity, but it's always good to know multiple ways to do the same thing.
Here is a better compromise between the "array-style" and "dot-style" notations mentioned by Louis in his answer:
del(.[] | .Country, .number, .Language)
This form can also be used to delete a list of keys from a nested object (see russholio's answer):
del(.a | .d, .e)
Implying that you can also pick a single index to delete keys from:
del(.[1] | .Country, .number, .Language)
Or multiple:
del(.[2,3,4] | .Country,.number,.Language)
You can delete a range using the range() function (slice notation doesn't work):
del(.[range(2;5)] | .Country,.number,.Language) # same as targeting indices 2,3,4
Some side notes:
map(del(.Country,.number,.Language))
# Is by definition equivalent to
[.[] | del(.Country,.number,.Language)]
If the key contains special characters or starts with a digit, you need to surround it with double quotes like this: ."foo$", or else .["foo$"].
This question ranks very high in the Google results, so I'd like to note that at some point in the intervening years, del has apparently been altered so that you can delete multiple keys with just:
del(.key1, .key2, ...)
So don't tear your hair out trying to figure out the syntax work-arounds, assuming your version of jq is reasonably current.
In addition to @user3899165's answer, I found that the same approach can delete a list of keys from a "sub-object":
example.json
{
"a": {
"b": "hello",
"c": "world",
"d": "here's",
"e": "the"
},
"f": {
"g": "song",
"h": "that",
"i": "I'm",
"j": "singing"
}
}
$ jq 'del(.a["d", "e"])' example.json
delpaths is also worth knowing about, and is perhaps a little less mysterious:
map( delpaths( [["Country"], ["number"], ["Language"]] ))
Since the argument to delpaths is simply JSON, this approach is particularly useful for programmatic deletions, e.g. if the key names are available as JSON strings.
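As a point of comparison for the programmatic case, the two approaches discussed in this thread (deleting a list of keys, or keeping only a list of keys) can be sketched in Python. The function names drop_keys and keep_keys are made up for illustration, and the sample data is a shortened copy of test.json. One difference worth noting: this keep_keys skips keys that are missing from a record, whereas jq's {label, region, ...} construction emits null for them.

```python
import json


def drop_keys(records, keys):
    """Return copies of the records without the listed keys (like map(del(...)))."""
    unwanted = set(keys)
    return [{k: v for k, v in rec.items() if k not in unwanted} for rec in records]


def keep_keys(records, keys):
    """Return copies of the records with only the listed keys, in the given order."""
    return [{k: rec[k] for k in keys if k in rec} for rec in records]


# Shortened copy of test.json from the question.
records = json.loads("""
[
  {"label": "US : USA : English", "Country": "USA", "region": "US",
   "Language": "English", "locale": "en", "currency": "USD", "number": "USD"},
  {"label": "CA : Canada : English", "Country": "Canada", "region": "CA",
   "Language": "English", "locale": "en", "currency": "CAD", "number": "CAD"}
]
""")

# The key names are plain data here, so they could come from another JSON file.
dropped = drop_keys(records, ["Country", "number", "Language"])
kept = keep_keys(records, ["label", "region", "locale", "currency"])

print(json.dumps(dropped, indent=2))
```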

Why does Tableau log "[Vertica][ODBC] (11430) Catalog name not supported"?

I have been working on Tableau + Vertica solutions.
I have installed the relevant Vertica ODBC driver from the packages provided by Vertica.
While going through the tdeserver.txt log file, I stumbled upon the following error entry:
{
"ts": "2015-12-16T21:42:41.568",
"pid": 51081,
"tid": "23d247",
"sev": "warn",
"req": "-",
"sess": "-",
"site": "{759FD0DA-A1AB-4092-AAD3-36DA0923D151}",
"user": "-",
"k": "database-error",
"v": {
"retcode-desc": "SQL_ERROR",
"retcode": -1,
"protocol": "7fc6730d6000",
"line": 2418,
"file": "/Volumes/build/builds/tableau-9-2/tableau-9-2.15.1201.0018/modules/connectors/tabmixins/main/db/ODBCProtocolImpl.cpp",
"error-records": [{
"error-record": 1,
"error-desc": "[Vertica][ODBC] (11430) Catalog name not supported.",
"sql-state": "HYC00",
"sql-state-desc": "SQLSTATE_API_OPT_FEATURE_NOT_IMPL_ODBC3x",
"native-error": 11430
}]
}
}
This log entry is repeated several times.
The rest of the setup runs smoothly, as expected.
Below are the attributes from ~/Library/ODBC/odbc.ini:
[ODBC]
Trace = 1
TraceAutoStop = 0
TraceFile = ~/log
TraceLibrary =
ThreePartNaming=1
What am I missing here?
