How to produce the Cartesian square of an array in jq?
Input:
[0,1,2]
Output:
[[0,0],[0,1],[0,2],
[1,0],[1,1],[1,2],
[2,0],[2,1],[2,2]]
I found a simple way to make it work with arithmetic operations, but had no luck with the comma operator.
Cartesian product
One way to generate the array of pairs in the specified order would be as follows:
def data: [0,1,2];
data | [.[] as $i | .[] as $j | [$i, $j] ]
Alternatively, avoiding $-variables:
[range(0;3) | [.] + (range(0;3)|[.])]
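As a further sketch, and assuming a jq that ships the combinations builtin (it is documented for jq 1.5 and later), the same pairs can be produced without writing the two loops by hand. The shell variable name pairs is only illustrative.

```shell
# Cartesian square of [0,1,2] using combinations/1 (assumes jq 1.5 or later).
# combinations(2) emits every 2-element array drawn from the input, first index varying slowest.
pairs=$(jq -nc '[0,1,2] | [combinations(2)]')
echo "$pairs"
```

This should print the nine pairs in the same order as the requested output.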
Square matrix with m[i][j] = [i,j]
def Mij(n):
[ range(0;n) as $i
| [ range(0;n) as $j
| [$i, $j] ] ];
Mij(3)
produces:
[[[0,0],[0,1],[0,2]],[[1,0],[1,1],[1,2]],[[2,0],[2,1],[2,2]]]
Sample Input
[1,2,3,4,5,6,7,8,9]
My Solution
$ echo '[1,2,3,4,5,6,7,8,9]' | jq --arg g 4 '. as $l|($g|tonumber) as $n |$l|length as $c|[range(0;$c;($g|tonumber))]|map($l[.:.+$n])' -c
Output
[[1,2,3,4],[5,6,7,8],[9]]
Is there a shorter, handier way to do this?
Use a while loop to chop off the first 4 elements .[4:] until the array is empty []. Then, for each result array, consider only its first 4 items [:4]. Generalized to $n:
jq -c --argjson n 4 '[while(. != []; .[$n:])[:$n]]'
[[1,2,3,4],[5,6,7,8],[9]]
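To see why this works, it may help to look at the intermediate arrays that while emits before the [:4] slice is applied. The following is a small sketch with the chunk size hardcoded to 4; the variable names states and chunks are only illustrative.

```shell
# Each state emitted by while/2 is the input with another 4 leading items dropped.
states=$(jq -nc '[1,2,3,4,5,6,7,8,9] | [while(. != []; .[4:])]')
echo "$states"

# Keeping only the first 4 items of every state yields the chunks.
chunks=$(jq -nc '[1,2,3,4,5,6,7,8,9] | [while(. != []; .[4:])[:4]]')
echo "$chunks"
```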
There's an undocumented builtin function, _nwise/1, which you would use like this:
jq -nc --argjson n 4 '[1,2,3,4,5,6,7,8,9] | [_nwise($n)]'
[[1,2,3,4],[5,6,7,8],[9]]
Notice that using --argjson allows you to avoid the call to tonumber.
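The difference between the two options can be checked directly by asking jq for the type of the variable. This is only a quick illustration; the variable names as_arg and as_argjson are made up for it.

```shell
# --arg always binds a string; --argjson parses its value as JSON.
as_arg=$(jq -nr --arg n 4 '$n | type')
as_argjson=$(jq -nr --argjson n 4 '$n | type')
echo "$as_arg $as_argjson"
```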
One way is to use reduce over the chunk start indices, appending one sub-array of up to $g items at a time:
jq -c --argjson g 4 '. as $input |
reduce range(0; ( $input | length ) ; $g) as $r ( []; . + [ $input[ $r: ( $r + $g ) ] ] )'
The three-argument form range(from; upto; by) generates numbers starting at from and stopping before upto, in steps of by.
E.g. range(0; 9; 4) on your original input produces the indices 0, 4 and 8. reduce iterates over them and builds the final list by appending the slice taken at each index, i.e. [0:4], [4:8] and [8:12].
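The two steps can be looked at separately. This is a minimal sketch with the length 9 and the step 4 hardcoded; the names indices, slices and $xs are only illustrative.

```shell
# Start indices produced by the three-argument range.
indices=$(jq -nc '[range(0; 9; 4)]')
echo "$indices"

# Slicing at each start index; a slice running past the end is simply shorter.
slices=$(jq -nc '[1,2,3,4,5,6,7,8,9] as $xs | [range(0; 9; 4) as $r | $xs[$r:$r+4]]')
echo "$slices"
```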
Now, this is somewhat similar to "jq: select only an array which contains element A but not element B", but it somehow doesn't work for me (which is likely my fault)... ;-)
So here's what we have:
[ {
"employeeType": "student",
"cn": "dc8aff1",
"uid": "dc8aff1",
"ou": [
"4210910",
"4210910 #Abg",
"4210910 Abgang",
"4240115",
"4240115 5",
"4240115 5\/5"
]
},
{
"employeeType": "student",
"cn": "160f656",
"uid": "160f656",
"ou": [
"4210910",
"4210910 3",
"4210910 3a"
] } ]
I'd like to select all elements where ou does not contain a specific string, say "4210910 3a" or - which would be even better - where ou does not contain any member of a given list of strings.
Since the list of strings may change, you should pass it to your filter as a parameter rather than hardcoding it. Also, contains might not work for you in general: it is applied recursively, so even substrings will match, which might not be what you want.
For example:
["10", "20", "30", "40", "50"] | contains(["0"])
is true
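For comparison, here is a small sketch that puts the substring-matching behaviour of contains next to an exact-match test written with any/2. The variable names loose and strict are only illustrative.

```shell
# contains matches "0" as a substring of "10", "20" and "30".
loose=$(jq -nc '["10","20","30"] | contains(["0"])')

# An exact comparison against each element finds no match.
strict=$(jq -nc '["10","20","30"] | any(.[]; . == "0")')

echo "$loose $strict"
```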
I would write it like this:
$ jq --argjson ex '["4210910 3a"]' 'map(select(all(.ou[]; $ex[]!=.)))' input.json
This response addresses the case where .ou is an array and we are given another array of forbidden strings.
For clarity, let's define a filter, intersectq(a;b), that will return true iff the arrays have an element in common:
def intersectq(a;b):
any(a[]; . as $x | any( b[]; . == $x) );
This is effectively a loop-within-a-loop, but because of the semantics of any/2, the computation will stop once a match has been found.(*)
Assuming $ex is the list of exceptions, then the filter we could use to solve the problem would be:
map(select(intersectq(.ou; $ex) | not))
For example, we could use an invocation along the lines suggested by Jeff:
$ jq --argjson ex '["4210910 3a"]' -f myfilter.jq input.json
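Put together, a self-contained run might look like the sketch below. It inlines the filter instead of reading myfilter.jq, uses a shortened version of the sample data, and only prints the uid of each surviving element; the second entry in $ex ("no such ou") is a made-up value added to show that a list is accepted.

```shell
# Keep only the elements whose .ou shares no string with the exception list $ex.
kept=$(jq -nc --argjson ex '["4210910 3a", "no such ou"]' '
  def intersectq(a;b):
    any(a[]; . as $x | any(b[]; . == $x));
  [ {uid: "dc8aff1", ou: ["4210910", "4210910 Abgang", "4240115"]},
    {uid: "160f656", ou: ["4210910", "4210910 3", "4210910 3a"]} ]
  | map(select(intersectq(.ou; $ex) | not) | .uid)
')
echo "$kept"
```

Only the first element should remain, since the second one has "4210910 3a" in its ou array.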
Now you might ask: why use the any-within-any double loop rather than .[]-within-all double loop? The answer is efficiency, as can be seen using debug:
$ jq -n '[1,2,3] as $a | [1,1] as $b | all( $a[]; ($b[] | debug) != .)'
["DEBUG:",1]
["DEBUG:",1]
false
$ jq -n '[1,2,3] as $a | [1,1] as $b | all( $a[]; . as $x | all( $b[]; debug | $x != .))'
["DEBUG:",1]
false
(*) Footnote
Of course intersectq/2 as defined here is still O(m*n) and thus inefficient, but the main point of this post is to highlight the drawback of the .[]-within-all double loop.
Here is a solution that checks the .ou member of each element of the input using foreach and contains.
["4210910 3a"] as $list # adjust as necessary
| .[]
| foreach $list[] as $e (
.; .; if .ou | contains([$e]) then . else empty end
)
EDIT: I now realize a filter of the form foreach E as $X (.; .; R) can almost always be rewritten as E as $X | R so the above is really just
["4210910 3a"] as $list
| .[]
| $list[] as $e
| if .ou | contains([$e]) then . else empty end
Say I have a JSON like this:
{
"json": [
"a",
[
"b",
"c",
[
"d",
"foo",
1
],
[
[
42,
"foo"
]
]
]
]
}
And I want an array of jq index paths that contain foo:
[
".json[1][2][1]",
".json[1][3][0][1]"
]
Can I achieve this using jq and how?
I tried recurse | .foo to get the matches first but I receive an error: Cannot index array with string "foo".
First of all, I'm not sure what the purpose of obtaining an array of jq programs is. While there are ways to do this, they are seldom necessary; jq does not provide any sort of eval command.
jq has the concept of a path, which is an array of strings and numbers representing the position of an element in a JSON; this is equivalent to the strings on your expected output. As an example, ".json[1][2][1]" would be represented as ["json", 1, 2, 1]. The standard library contains several functions that operate with this concept, such as getpath, setpath, paths and leaf_paths.
We can thus obtain all paths in the given JSON and iterate through them, select those whose value in the input JSON is "foo", and generate an array out of them:
jq '[paths as $path | select(getpath($path) == "foo") | $path]'
This will return, for your given input, the following output:
[
["json", 1, 2, 1],
["json", 1, 3, 0, 1]
]
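The path representation can be checked in both directions. In the sketch below, getpath looks up the value stored at one of the paths, and paths(f), the one-argument form of paths, is used as a shorter way to select the paths whose value satisfies a condition. The shell variable names input, value and found are only illustrative.

```shell
# Compact copy of the sample document from the question.
input='{"json":["a",["b","c",["d","foo",1],[[42,"foo"]]]]}'

# getpath returns the value stored at a path array.
value=$(printf '%s' "$input" | jq -c 'getpath(["json",1,2,1])')
echo "$value"

# paths(f) emits only the paths whose value satisfies f.
found=$(printf '%s' "$input" | jq -c '[paths(. == "foo")]')
echo "$found"
```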
Now, although it should not be necessary, and it is most likely a sign that you're approaching whatever problem you are facing in the wrong way, it is possible to convert these arrays to the jq path strings you seek by transforming each path through the following script:
".\(map("[\(tojson)]") | join(""))"
The full script would therefore be:
jq '[paths as $path | select(getpath($path) == "foo") | $path | ".\(map("[\(tojson)]") | join(""))"]'
And its output would be:
[
".[\"json\"][1][2][1]",
".[\"json\"][1][3][0][1]"
]
Santiago's excellent program can be further tweaked to produce output in the requested format:
def jqpath:
def t: test("^[A-Za-z_][A-Za-z0-9_]*$");
reduce .[] as $x
("";
if ($x|type) == "string"
then . + ($x | if t then ".\(.)" else ".[" + tojson + "]" end)
else . + "[\($x)]"
end);
[paths as $path | select( getpath($path) == "foo" ) | $path | jqpath]
jq -f wrangle.jq input.json
[
".json[1][2][1]",
".json[1][3][0][1]"
]
I want to use jq to map my input
["a", "b"]
to output
[{name: "a", index: 0}, {name: "b", index: 1}]
I got as far as
0 as $i | def incr: $i = $i + 1; [.[] | {name:., index:incr}]
which outputs:
[
{
"name": "a",
"index": 1
},
{
"name": "b",
"index": 1
}
]
But I'm missing something.
Any ideas?
It's easier than you think.
to_entries | map({name:.value, index:.key})
to_entries takes an object and returns an array of key/value pairs. In the case of arrays, it effectively makes index/value pairs. You can then map those pairs to the items you want.
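To make the intermediate step visible, the sketch below prints what to_entries returns for the sample array before and after the map. The variable names entries and mapped are only illustrative.

```shell
# For an array, the "key" of each entry is the element's index.
entries=$(jq -nc '["a","b"] | to_entries')
echo "$entries"

# Renaming value/key to name/index gives the requested shape.
mapped=$(jq -nc '["a","b"] | to_entries | map({name: .value, index: .key})')
echo "$mapped"
```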
A more "hands-on" approach is to use reduce:
["a", "b"] | . as $in | reduce range(0;length) as $i ([]; . + [{"name": $in[$i], "index": $i}])
Here are a few more ways. Assuming input.json contains your data
["a", "b"]
and you invoke jq as
jq -M -c -f filter.jq input.json
then any of the following filter.jq filters will generate
{"name":"a","index":0}
{"name":"b","index":1}
1) using keys and foreach
foreach keys[] as $k (.;.;[$k,.[$k]])
| {name:.[1], index:.[0]}
EDIT: I now realize a filter of the form foreach E as $X (.; .; R) can almost always be rewritten as E as $X | R so the above is really just
keys[] as $k
| [$k, .[$k]]
| {name:.[1], index:.[0]}
which can be simplified to
keys[] as $k
| {name:.[$k], index:$k}
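The equivalence can be checked on the sample input by running the foreach form and the simplified form side by side. This is only a sketch: both results are collected into arrays so they can be compared as single strings, and the variable names with_foreach and simplified are made up for it.

```shell
# Original foreach form: the state is never updated, only the extract step does work.
with_foreach=$(jq -nc '["a","b"] | [foreach keys[] as $k (.;.;[$k,.[$k]]) | {name:.[1], index:.[0]}]')

# Simplified form: bind each key and build the object directly.
simplified=$(jq -nc '["a","b"] | [keys[] as $k | {name:.[$k], index:$k}]')

echo "$with_foreach"
echo "$simplified"
```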
2) using keys and transpose
[keys, .]
| transpose[]
| {name:.[1], index:.[0]}
3) using a function
def enumerate:
def _enum(i):
if length<1
then empty
else [i, .[0]], (.[1:] | _enum(i+1))
end
;
_enum(0)
;
enumerate
| {name:.[1], index:.[0]}
I have several lists which may or may not be empty. I want to find those elements which occur in all lists, but only for those lists that are not empty.
I have something like this:
let $results :=
$list1 intersect
$list2 intersect
$list3 intersect
$list4
But if any of the lists is empty this expression returns an empty list. Is there any way I can exclude a list from my intersection if it is empty?
SOLUTION:
This is the solution I ended up using, based on the answer provided by Ranon.
let $union := $list1 | $list2 | $list3 | $list4
let $results :=
(if ($list1) then $list1 else $union) intersect
(if ($list2) then $list2 else $union) intersect
(if ($list3) then $list3 else $union) intersect
(if ($list4) then $list4 else $union)
I would like to thank all who have contributed. Coming from an object-oriented and procedural background, and with XQuery being a functional language it doesn't come as naturally to me (yet).
Instead of using the builtin intersect directly, you can wrap it in a function and check the input lists:
declare function local:safe-intersect($xs, $ys) {
if(exists($xs) and exists($ys))
then $xs intersect $ys
else ($xs, $ys) (: at least one is empty :)
};
Then your example would look like this:
let $results :=
local:safe-intersect(
$list1,
local:safe-intersect(
$list2,
local:safe-intersect($list3, $list4)
)
)
...
You could check whether each list is empty and, if so, replace it with the union of all lists.
For my example I used the functx implementations of intersect and union so that sequences of atomic values can be intersected, too. Maybe one should write a function to avoid the redundant code, but this is fine for showing the idea:
import module namespace functx = "http://www.functx.com" at ".../functx-1.0-nodoc-2007-01.xq";
let $list1 := (1,2,3)
let $list2 := (1,3)
let $list3 := ()
let $union := functx:value-union($list1, functx:value-union($list2, $list3))
let $list1a := if (count($list1) != 0) then $list1 else $union
let $list2a := if (count($list2) != 0) then $list2 else $union
let $list3a := if (count($list3) != 0) then $list3 else $union
return functx:value-intersect($list1a, functx:value-intersect($list2a, $list3a))
$union could also be written as ($list1, $list2, $list3), but that will lead to duplicate elements, which slow down the intersection operations for large element counts.
Use:
( $vL1 | ($vL2 | $vL3 | $vL4)[not($vL1)] )
intersect
( $vL2 | ($vL3 | $vL4 | $vL1)[not($vL2)] )
intersect
( $vL3 | ($vL4 | $vL1 | $vL2)[not($vL3)] )
intersect
( $vL4 | ($vL1 | $vL2 | $vL3)[not($vL4)] )
In this expression every argument of intersect is either a $vLN or, if $vLN is empty, the union of the rest of the sets.
This can be written more compactly as:
let $vUniverse := $vL1 | $vL2 | $vL3 | $vL4
return
( $vL1 | $vUniverse [not($vL1)] )
intersect
( $vL2 | $vUniverse [not($vL2)] )
intersect
( $vL3 | $vUniverse [not($vL3)] )
intersect
( $vL4 | $vUniverse [not($vL4)] )
Here is a complete example:
let $vL1 := /*/*[. mod 2 eq 1],
$vL2 := /*/*[. mod 3 eq 1],
$vL3 := /*/*[. mod 4 eq 1],
$vL4 := /*/*[. mod 5 eq 1],
$vUniverse := $vL1 | $vL2 | $vL3 | $vL4
return
( $vL1 | $vUniverse [not($vL1)] )
intersect
( $vL2 | $vUniverse [not($vL2)] )
intersect
( $vL3 | $vUniverse [not($vL3)] )
intersect
( $vL4 | $vUniverse [not($vL4)] )
When this XQuery expression is evaluated (using Saxon 9.3.04 EE) on the following XML document:
<nums>
<num>01</num>
<num>02</num>
<num>03</num>
<num>04</num>
<num>05</num>
<num>06</num>
<num>07</num>
<num>08</num>
<num>09</num>
<num>10</num>
</nums>
the wanted, correct result is produced:
<num>01</num>