Values from array foo not in array bar - jq

I'm trying to filter an array with another.
For example, given I have this input array:
['foo', 'bar', 'baz']
And this filter array:
['foo', 'baz']
I want to have this output:
['bar']
I feel like I should be able to do this by piping to select(inside()), but I can't get inside() to work; I get a "not defined" error.

You can use the convenient subtraction operator - as follows:
jq '. - ["foo", "baz"]'
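As a minimal sketch of the same idea with the filter array passed in from outside the program: the sample input below is the one from the question, written as valid JSON (double quotes), and the variable name $filter is just an illustrative choice. This assumes jq 1.5 or later, where --argjson is available.

```shell
# Input array from the question, as valid JSON.
input='["foo", "bar", "baz"]'

# --argjson binds a JSON value to the jq variable $filter;
# "-" then removes every element of $filter from the input array.
result=$(printf '%s' "$input" | jq -c --argjson filter '["foo", "baz"]' '. - $filter')
echo "$result"    # ["bar"]
```

Passing the filter array as a variable keeps the jq program itself fixed while the list of values to drop changes.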

Related

Different array types, for comparison in the set_difference() function

I'm having some issues getting expected results from set_difference(). I assumed I was comparing two dynamic arrays, but I'm not sure where the gap is. The only additional insight I have is that when I compare the two arrays using the gettype() function, I get the following:
First array
Created using a make_list aggregation, e.g.
| summarize inv_list = make_list(Date)
When I run gettype() on the array:
"type_inv_list": array
Second array
Created through a scalar function
let period_check_range = todynamic(range(make_datetime(start_date), datetime_add('day',8,make_datetime(start_date)),1d));
When I run gettype() on the array:
"type_range___scalar_90e56a216d8942f28e6797e5abc35dd9": array
Any guidance on how to make these arrays work so I can use the set_difference() function?
You're missing toscalar() (see the docs) for the first array. When you run | summarize ..., you get a table as the result, but what you actually want is a single scalar value, which is why toscalar() is needed.
Here's how to achieve what you want:
let StartDate = ago(10d);
let Array1 = toscalar(MyTable | summarize make_set(Timestamp));
let Array2 = todynamic(range(make_datetime(StartDate), datetime_add('day',8,make_datetime(StartDate)),1d));
print set_difference(Array2, Array1)
By the way, you probably want to use make_set and not make_list as you're not interested in duplicate values.

What is it called when you have multiple data structures, but not connected with json in jq?

For instance, I might have something coming out of my jq command like this:
"some string"
"some thing"
"some ping"
...
Note that there is no outer object or array and no commas between items.
Or you might have something like:
["some string"
"some thing"
"some ping"]
["some wing"
"some bling"
"some fing"]
But again, no commas or outer object or array and no commas between them to indicate that this is JSON.
I keep thinking the answer is that it is called "raw", but I'm uncertain about this.
I'm specifically looking for a term to search for in the documentation that covers processing the sorts of examples above, and I am at a loss as to how to proceed.
To start with, the jq manual.yml describes the behavior of filters this way:
Some filters produce multiple results, for instance there's one that
produces all the elements of its input array. Piping that filter
into a second runs the second filter for each element of the
array. Generally, things that would be done with loops and iteration
in other languages are just done by gluing filters together in jq.
It's important to remember that every filter has an input and an
output. Even literals like "hello" or 42 are filters - they take an
input but always produce the same literal as output. Operations that
combine two filters, like addition, generally feed the same input to
both and combine the results. So, you can implement an averaging
filter as add / length - feeding the input array both to the add
filter and the length filter and then performing the division.
It's also important to keep in mind that the default behavior of jq is to run the filter you specify once for each JSON value in the input (not only objects). In the following example, jq runs the identity filter four times, passing one value to it each time:
$ (echo 2;echo {}; echo []; echo 3) | jq .
2
{}
[]
3
What is happening here is similar to
$ jq -n '2, {}, [], 3 | .'
2
{}
[]
3
Since this isn't always what you want, the -s option can be used to tell jq to gather the separate values into an array and feed that to the filter:
$ (echo 2;echo {}; echo []; echo 3)| jq -s .
[
  2,
  {},
  [],
  3
]
which is similar to
$ jq -n '[2, {}, [], 3] | .'
[
  2,
  {},
  [],
  3
]
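For comparison, here is a small sketch of two ways to gather a stream of separate JSON values into one array: the -s option described above, and the inputs builtin combined with -n. It assumes jq 1.5 or later (where inputs exists); the shell variable names are only illustrative.

```shell
# The same four values as in the examples above, one per line.
stream=$(printf '%s\n' 2 '{}' '[]' 3)

# -s (slurp) collects all input values into a single array.
slurped=$(printf '%s\n' "$stream" | jq -c -s .)

# With -n, jq reads nothing by itself; [inputs] then collects every input value.
collected=$(printf '%s\n' "$stream" | jq -c -n '[inputs]')

echo "$slurped"      # [2,{},[],3]
echo "$collected"    # [2,{},[],3]
```

The -c option only makes the output compact; it does not change which values are produced.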
The jq manual.yml explains how the --raw-input/-R option can be included for even more control over input handling:
Don't parse the input as JSON. Instead, each line of text is passed to the filter as a string. If combined with --slurp, then the entire input is passed to the filter as a single long string.
You can see that using the -s and -R options together in this example produces a different result:
$ (echo 2;echo {}; echo []; echo 3)| jq -s -R .
"2\n{}\n[]\n3\n"

How to get the value of field

In Julia I can get a list of fields like so:
INPUT:
type Foobar
    foo::Int
    bar::String
end
baz = Foobar(5,"GoodDay")
fieldnames(baz)
OUTPUT:
2-element Array{Symbol,1}:
:foo
:bar
But how can I access the values of those fields, given the names that I am finding dynamically?
I know one way is to build the expression myself:
fieldvalue(v,fn::Symbol) = eval(Expr(:(.), v, QuoteNode(fn)))
That is kinda scary looking, so I think there is a better way.
Usecase:
INPUT:
function print_structure(v)
    for fn in fieldnames(v)
        println(fn, "\t", fieldvalue(v, fn))
    end
end
print_structure(baz)
OUTPUT:
foo 5
bar GoodDay
getfield(baz, :foo) will get the field foo from variable baz i.e. the result will be the same as baz.foo.
Note :foo has to be a symbol, therefore if you somehow get the field name in a string, it should be used as follows: getfield(varname, Symbol(fieldnamestring))
You can also use e.g. getfield(baz, 2) to get the 2nd field without needing to know its name.

How to extract keys in a nested json array object in Presto?

I'm using the latest (0.117) Presto and trying to execute CROSS JOIN UNNEST with a complex JSON array like this.
[{"id": 1, "value":"xxx"}, {"id":2, "value":"yy"}, ...]
To do that, first I tried to make an ARRAY with the values of id by
SELECT CAST(JSON_EXTRACT('[{"id": 1, "value":"xxx"}, {"id":2, "value":"yy"}]', '$..id') AS ARRAY<BIGINT>)
but it doesn't work.
What is the best JSON Path to extract the values of id?
This will solve your problem. It is a more generic cast to an ARRAY of JSON (less prone to errors given an arbitrary map structure):
select
TRANSFORM(CAST(JSON_PARSE(arr1) AS ARRAY<JSON>),
x -> JSON_EXTRACT_SCALAR(x, '$.id'))
from
(values ('[{"id": 1, "value":"xxx"}, {"id":2, "value":"yy"}]')) t(arr1)
Output in presto:
[1,2]
... I ran into a situation where a list of jsons was nested within a json. My list of jsons had an ambiguous nested map structure. The following code returns an array of values given a specific key in a list of jsons.
1. Extract the list using JSON_EXTRACT.
2. Cast the list as an array of JSON values.
3. Loop through the JSON elements in the array using the TRANSFORM function and extract the value of the key that you are interested in.
TRANSFORM(CAST(JSON_EXTRACT(json, '$.path.toListOfJSONs') AS ARRAY<JSON>),
x -> JSON_EXTRACT_SCALAR(x, '$.id')) as id
You can cast the JSON into an ARRAY of MAP, and use transform lambda function to extract the "id" key:
select
TRANSFORM(CAST(JSON_PARSE(arr1) AS ARRAY<MAP<VARCHAR, VARCHAR>>), entry->entry['id'])
from
(values ('[{"id": 1, "value":"xxx"}, {"id":2, "value":"yy"}]')) t(arr1)
output:
[1, 2]
You can now also use presto-third-functions. It provides a json_array_extract function, so you can extract JSON array info like this:
select
json_array_extract_scalar(arr1, '$.book.id')
from
(values ('[{"book":{"id":"12"}}, {"book":{"id":"14"}}]')) t(arr1)
output is:
[12, 14]
I finally gave up finding a simple JSON Path to extract them.
Instead, I wrote a redundant, dirty query like the following to get the task done.
SELECT
  ...
FROM
  (
    SELECT
      SLICE(ARRAY[
        JSON_EXTRACT(json_column, '$[0].id'),
        JSON_EXTRACT(json_column, '$[1].id'),
        JSON_EXTRACT(json_column, '$[2].id'),
        ...
      ], JSON_ARRAY_LENGTH(json_column)) ids
    FROM
      the.table
  ) t1
CROSS JOIN UNNEST(ids) AS t2(id)
WHERE
  ...
I still want to know the best practice if you know another good way to CROSS JOIN them!

Reducing JSON with jq

I have a JSON array of Objects:
[{key1: value},{key2:value}, ...]
I would like to reduce these into the following structure:
{key1: value, key2: value, ...}
Is this possible to do with jq?
I was trying:
cat myjson.json | jq '.[] | {(.key): value}'
This doesn't quite work as it iterates over each datum rather than reducing it to one Object.
Note that jq has a builtin function called 'add' that does the same thing that the first answer suggests, so you ought to be able to write:
jq add myjson.json
To expand on the other two answers a bit, you can "add" two objects together like this:
.[0] + .[1]
=> { "key1": "value", "key2": "value" }
You can use the generic reduce function to repeatedly apply a function between the first two items of a list, then between that result and the next item, and so on:
reduce .[] as $item ({}; . + $item)
We start with {}, add .[0], then add .[1] etc.
Finally, as a convenience, jq has an add function which is essentially an alias for exactly this function, so you can write the whole thing as:
add
Or, as a complete command line:
jq add myjson.json
I believe the following will work:
cat myjson.json | jq 'reduce .[] as $item ({}; . + $item)'
It takes each item in the array, and adds it to the sum of all the previous items.
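Two details are easy to miss when merging objects this way, so here is a small sketch using made-up sample data (the keys and values are purely illustrative). When the same key occurs in more than one object, the value from the later object wins, and on an empty array add yields null while the reduce form starting from {} yields an empty object. The -S option is used only to make the key order in the output predictable.

```shell
# key1 appears twice; the later value ("c") replaces the earlier one ("a").
merged=$(jq -c -S -n '[{"key1": "a"}, {"key2": "b"}, {"key1": "c"}] | add')
echo "$merged"          # {"key1":"c","key2":"b"}

# On an empty array the two forms give different results.
empty_add=$(jq -c -n '[] | add')
empty_reduce=$(jq -c -n '[] | reduce .[] as $item ({}; . + $item)')
echo "$empty_add"       # null
echo "$empty_reduce"    # {}
```

If downstream code expects an object even for an empty input array, the reduce form with {} as the starting value may be the safer choice.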
