MariaDB JSON_ARRAYAGG gives wrong result - mariadb

I have 2 problems in MariaDB 15.1 when using JSON_ARRAYAGG:
1. The brackets [] are omitted
2. Incorrect result: values are duplicated or omitted
My database is the following:
user:
+----+------+
| id | name |
+----+------+
|  1 | Jhon |
|  2 | Bob  |
+----+------+
car:
+----+---------+-------------+
| id | user_id | model       |
+----+---------+-------------+
|  1 |       1 | Tesla       |
|  2 |       1 | Ferrari     |
|  3 |       2 | Lamborghini |
+----+---------+-------------+
phone:
+----+---------+----------+--------+
| id | user_id | company  | number |
+----+---------+----------+--------+
|  1 |       1 | Verzion  |      1 |
|  2 |       1 | AT&T     |      2 |
|  3 |       1 | T-Mobile |      3 |
|  4 |       2 | Sprint   |      4 |
|  5 |       1 | Sprint   |      2 |
+----+---------+----------+--------+
1. The brackets [] are omitted
For example, this query gets users with their list of cars:
SELECT
    user.id AS id,
    user.name AS name,
    JSON_ARRAYAGG(
        JSON_OBJECT(
            'id', car.id,
            'model', car.model
        )
    ) AS cars
FROM user
INNER JOIN car ON user.id = car.user_id
GROUP BY user.id;
Result: the brackets [] were omitted in cars (JSON_ARRAYAGG behaves like GROUP_CONCAT)
+----+------+-----------------------------------------------------------+
| id | name | cars                                                      |
+----+------+-----------------------------------------------------------+
|  1 | Jhon | {"id": 1, "model": "Tesla"},{"id": 2, "model": "Ferrari"} |
|  2 | Bob  | {"id": 3, "model": "Lamborghini"}                         |
+----+------+-----------------------------------------------------------+
However, when I add the filter WHERE user.id = 1, the brackets [] are not omitted:
+----+------+-------------------------------------------------------------+
| id | name | cars                                                        |
+----+------+-------------------------------------------------------------+
|  1 | Jhon | [{"id": 1, "model": "Tesla"},{"id": 2, "model": "Ferrari"}] |
+----+------+-------------------------------------------------------------+
2. Incorrect result: values are duplicated or omitted
This error is strange because all of the following conditions must be met:
The query involves more than 2 tables
The DISTINCT option is used
A user has at least 2 cars and at least 3 phones
Duplicate values
For example, this query gets users with their car list and their phone list:
SELECT
    user.id AS id,
    user.name AS name,
    JSON_ARRAYAGG( DISTINCT
        JSON_OBJECT(
            'id', car.id,
            'model', car.model
        )
    ) AS cars,
    JSON_ARRAYAGG( DISTINCT
        JSON_OBJECT(
            'id', phone.id,
            'company', phone.company,
            'number', phone.number
        )
    ) AS phones
FROM user
INNER JOIN car ON user.id = car.user_id
INNER JOIN phone ON user.id = phone.user_id
GROUP BY user.id;
I show the output in JSON format and keep only the relevant elements.
Result: the brackets [] were omitted and Verzion was duplicated
{
  "id": 1,
  "name": "Jhon",
  "phones": // [ Opening bracket expected
    {
      "id": 5,
      "company": "Sprint",
      "number": 2
    },
    {
      "id": 1,
      "company": "Verzion",
      "number": 1
    },
    {
      "id": 1,
      "company": "Verzion",
      "number": 1
    }, // Duplicate object with the DISTINCT option
    {
      "id": 2,
      "company": "AT&T",
      "number": 2
    },
    {
      "id": 3,
      "company": "T-Mobile",
      "number": 3
    }
  // ] Closing bracket expected
}
Omitted values
This error occurs when phone.id is left out of the query:
SELECT
    user.id AS id,
    user.name AS name,
    JSON_ARRAYAGG( DISTINCT
        JSON_OBJECT(
            'id', car.id,
            'model', car.model
        )
    ) AS cars,
    JSON_ARRAYAGG( DISTINCT
        JSON_OBJECT(
            -- 'id', phone.id,
            'company', phone.company,
            'number', phone.number
        )
    ) AS phones
FROM user
INNER JOIN car ON user.id = car.user_id
INNER JOIN phone ON user.id = phone.user_id
GROUP BY user.id;
Result: the brackets [] were omitted and Sprint was omitted.
Apparently this happens because DISTINCT applies an OR across the columns of the JSON_OBJECT: Sprint's company already exists in one row and its number (2) in another row, so the object is treated as a duplicate.
{
  "id": 1,
  "name": "Jhon",
  "phones": // [ Opening bracket expected
    //{
    //  "company": "Sprint",
    //  "number": 2
    //}, `Sprint` was omitted
    {
      "company": "Verzion",
      "number": 1
    },
    {
      "company": "AT&T",
      "number": 2
    },
    {
      "company": "T-Mobile",
      "number": 3
    }
  // ] Closing bracket expected
}
Using GROUP_CONCAT instead of JSON_ARRAYAGG solves the problem of duplicate or omitted objects.
However, adding the filter WHERE user.id = 1 to the JSON_ARRAYAGG query also keeps the brackets [] and solves the problem of duplicate or omitted objects:
{
  "id": 1,
  "name": "Jhon",
  "phones": [
    {
      "id": 1,
      "company": "Verzion",
      "number": 1
    },
    {
      "id": 2,
      "company": "AT&T",
      "number": 2
    },
    {
      "id": 3,
      "company": "T-Mobile",
      "number": 3
    },
    {
      "id": 5,
      "company": "Sprint",
      "number": 2
    }
  ]
}
What am I doing wrong?

So far my solution is the following, but I would like to use JSON_ARRAYAGG since the query is cleaner:
-- 1
SELECT
    user.id AS id,
    user.name AS name,
    CONCAT(
        '[',
        GROUP_CONCAT( DISTINCT
            JSON_OBJECT(
                'id', car.id,
                'model', car.model
            )
        ),
        ']'
    ) AS cars
FROM user
INNER JOIN car ON user.id = car.user_id
GROUP BY user.id;
-- 2
SELECT
    user.id AS id,
    user.name AS name,
    CONCAT(
        '[',
        GROUP_CONCAT( DISTINCT
            JSON_OBJECT(
                'id', car.id,
                'model', car.model
            )
        ),
        ']'
    ) AS cars,
    CONCAT(
        '[',
        GROUP_CONCAT( DISTINCT
            JSON_OBJECT(
                'id', phone.id,
                'company', phone.company,
                'number', phone.number
            )
        ),
        ']'
    ) AS phones
FROM user
INNER JOIN car ON user.id = car.user_id
INNER JOIN phone ON user.id = phone.user_id
GROUP BY user.id;

Related

Conditionally output a field?

In this example I only want isGreaterThanOne field to be shown if it's true. Here's what I started with (always shown)
echo '[{"a":5},{"a":1}]' | jq '[.[] | {value:.a, isGreaterThanOne:(.a>1)}]'
I inserted an if statement
echo '[{"a":5},{"a":1}]' | jq '[.[] | {value:.a, X:(if .a>1 then "Y" else "N" end) }]'
Then got stuck trying to move the field into the conditional. Also it seems like I must have an else with an if
echo '[{"a":5},{"a":1}]' | jq '[.[] | {value:.a, (if .a>1 then (K:"Y") else (L:"N") end) }]'
I want the below as the result (doesn't need to be pretty printed)
[
  {
    "value": 5,
    "X": "Y"
  },
  {
    "value": 1
  }
]
Using if, make one branch provide an empty object {} which wouldn't contain the extra field:
map({value: .a} + if .a > 1 then {X: "Y"} else {} end)
Alternatively, equip only selected items with the extra field:
map({value: .a} | select(.value > 1).X = "Y")
Output:
[
{
"value": 5,
"X": "Y"
},
{
"value": 1
}
]
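If the original boolean field itself should appear only when it is true, one possible variation (a sketch, not taken from the answers above) is to build the object with the field first and then delete it when it is false. The variable name result is only for illustration:

```shell
# Build each object with the boolean field, then drop the field when it is false.
# -c prints compact JSON on a single line.
result=$(echo '[{"a":5},{"a":1}]' | jq -c '
  map(
    {value: .a, isGreaterThanOne: (.a > 1)}
    | if .isGreaterThanOne then . else del(.isGreaterThanOne) end
  )
')
echo "$result"
# prints: [{"value":5,"isGreaterThanOne":true},{"value":1}]
```

The else branch is still required here; it simply returns the object without the field.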

jq: list users belonging to a specific group in array

input json:
[
{
"user": "u1"
},
{
"user": "u2",
"groups": [
{
"id": "100001",
"name": "G1"
},
{
"id": "100002",
"name": "G2"
}
]
},
{
"user": "u3",
"groups": [
{
"id": "100001",
"name": "G1"
}
]
}
]
I want to find all users belonging to specific group (searching by group name or group id in the groups array)
$ jq -r '.[]|select(.groups[].name=="G1" | .user)' json
jq: error (at json:27): Cannot iterate over null (null)
Desired output format when searching of example group G1 would be:
u2
u3
Additional question:
Is it possible to produce comma-separated output u2,u3 without using external utilities like tr?
It is better to pass your search data as parameters using --arg, and to use any to avoid duplicate outputs if both inputs match:
jq -r --arg id "" --arg name "G1" '
.[] | select(.groups | map(.id == $id or .name == $name) | any)? | .user
'
u2
u3
Using ? as the error-suppression operator (shorthand for try), you could do a select as below:
map(select(.groups[].name == "G1")? | .user)
and unwrap the results from the array by appending [] to the end of the filter. To combine multiple selection conditions, use the boolean operators and/or inside the select.
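For the additional question about comma-separated output, a minimal sketch (assuming jq 1.5 or later, and using the sample input from the question stored in a shell variable named json for illustration) is to collect the user names in an array and join them:

```shell
# Sample input from the question, stored in a shell variable.
json='[{"user":"u1"},{"user":"u2","groups":[{"id":"100001","name":"G1"},{"id":"100002","name":"G2"}]},{"user":"u3","groups":[{"id":"100001","name":"G1"}]}]'

# .groups[]? yields nothing for users without a groups array,
# any(...) tests each user once, and join(",") builds the final string.
result=$(printf '%s' "$json" | jq -r --arg name "G1" '
  map(select(any(.groups[]?; .name == $name)) | .user) | join(",")
')
echo "$result"
# prints: u2,u3
```

Because any returns a single boolean per user, a user is listed once even if several of their groups match.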

Double condition in array with JQ

My JSON is an array of one object like this:
[{
"id": 125650,
"status": "success",
"name": "build_job",
"artifacts": [
{
"file_type": "archive",
"size": 72720116,
"filename": "artifacts.zip",
"file_format": "zip"
},
{
"file_type": "metadata",
"size": 1406,
"filename": "metadata.gz",
"file_format": "gzip"
}
]
}]
I want to select only the object ID if the following conditions matches:
status == success
name == build_job
artifacts.size > 0 where file_type == archive
I'm stuck on the last condition, I can select artifacts with size > 0, OR artifacts where file_type = archive, but not both at the same time.
Here's my current query :
| jq '.[0] | select(.name == "build_job" and .status == "success" and .artifacts[].file_type == "archive") | .id'
Can you help me with that ?
For the last condition, you presumably mean something like:
all(.artifacts[];
if .file_type == "archive" then .size > 0 else true end)
which can also be written as:
all(.artifacts[] | select(.file_type == "archive");
.size > 0)
I’d recommend using either all or any, depending on your requirements.
Try this:
.[0] | select(
.name == "build_job" and .status == "success" and (
.artifacts[] | select(.file_type == "archive") | length > 0
)
) | .id
This selects successful build_jobs containing one or more archive artifacts. Unfortunately, the id is returned multiple times if there is more than one such artifact. Here's how to wrap the expression to fix that:
[
.[] | select(
.name == "build_job" and .status == "success" and (
.artifacts[] | select(.file_type == "archive") | length > 0
)
)
] | unique | .[].id
For the last condition, take the array .artifacts, reduce it to the elements matching your criterion with map(select(.file_type == "archive")), and test the resulting array's length with length > 0.
All together:
.[0] | select(
.name == "build_job" and
.status == "success" and (
(.artifacts | map(select(.file_type == "archive"))) | length > 0
)
)
| .id
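If the intent is "at least one archive artifact with a non-zero size", both tests can be placed inside a single any call. The following is a sketch under that reading of the question, run against the sample input (the shell variable names are only for illustration):

```shell
# Sample input from the question, stored in a shell variable.
json='[{"id":125650,"status":"success","name":"build_job","artifacts":[{"file_type":"archive","size":72720116,"filename":"artifacts.zip","file_format":"zip"},{"file_type":"metadata","size":1406,"filename":"metadata.gz","file_format":"gzip"}]}]'

# any(generator; condition) is true if at least one artifact
# is an archive AND has a size greater than zero.
result=$(printf '%s' "$json" | jq '
  .[0]
  | select(
      .name == "build_job" and
      .status == "success" and
      any(.artifacts[]; .file_type == "archive" and .size > 0)
    )
  | .id
')
echo "$result"
# prints: 125650
```

Since any yields exactly one boolean, the id is emitted at most once, however many artifacts match.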

jq - find duplicates in a value which is nested array of strings

Assuming the below input, how can I detect the presence of duplicates in the replicas list? ("replicas":[5,5,6])
{"version":1,
"partitions":
[{"topic":"mytopic1","partition":3,"replicas":[4,5],"log_dirs":["any","any"]},
{"topic":"mytopic1","partition":1,"replicas":[5,5,6],"log_dirs":["any","any"]},
{"topic":"mytopic2","partition":2,"replicas":[6,5],"log_dirs":["any","any"]}]
}
This one will give you an array of just the partitions with duplicates in the replicas field:
jq '[.partitions[] | select((.replicas | length) != (.replicas | unique | length))]' input.json
Pretty-printed example output:
[
{
"topic": "mytopic1",
"partition": 1,
"replicas": [
5,
5,
6
],
"log_dirs": [
"any",
"any"
]
}
]
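To also see which values are duplicated, one possible extension (a sketch, not part of the answer above) groups the replicas with group_by and keeps the groups that have more than one member. The output field name duplicates is made up for this example:

```shell
# Sample input from the question, stored in a shell variable.
json='{"version":1,"partitions":[{"topic":"mytopic1","partition":3,"replicas":[4,5],"log_dirs":["any","any"]},{"topic":"mytopic1","partition":1,"replicas":[5,5,6],"log_dirs":["any","any"]},{"topic":"mytopic2","partition":2,"replicas":[6,5],"log_dirs":["any","any"]}]}'

# group_by(.) puts equal values together; groups longer than 1 are duplicates.
result=$(printf '%s' "$json" | jq -c '
  .partitions[]
  | {topic, partition, duplicates: (.replicas | group_by(.) | map(select(length > 1) | .[0]))}
  | select(.duplicates | length > 0)
')
echo "$result"
# prints: {"topic":"mytopic1","partition":1,"duplicates":[5]}
```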

How do I select multiple fields in jq?

My input file looks something like this:
{
"login": "dmaxfield",
"id": 7449977,
...
}
{
"login": "dmaxfield",
"id": 7449977,
...
}
I can get all the login names with this : cat members | jq '.[].login'
but I have not been able to crack the syntax to get both the login and id?
You can use jq '.[] | .login, .id' to obtain each login followed by its id.
This works for me:
> echo '{"a":1,"b":2,"c":3}{"a":1,"b":2,"c":3}' | jq '{a,b}'
{
"a": 1,
"b": 2
}
{
"a": 1,
"b": 2
}
Here is one more example (jq-1.6):
Walk through an array and, for each element, select one of its fields together with a field of an object nested inside it:
echo '[{"id":1, "private_info": {"name": "Ivy", "age": 18}}, {"id":2, "private_info": {"name": "Tommy", "age": 18}}]' | jq ".[] | {id: .id, name: .private_info.name}" -
{
"id": 1,
"name": "Ivy"
}
{
"id": 2,
"name": "Tommy"
}
Without the example data:
jq ".[] | {id, name: .private_info.name}" -
.[]: walk through an array
{id, name: .private_info.name}: take .id and .private_info.name and wrap them into an object with the field names "id" and "name" respectively
To select values that are nested at different levels (i.e. both first and second level), you might use the following:
echo '[{"a":{"aa":1,"ab":2},"b":3,"c":4},{"a":{"aa":5,"ab":6},"b":7,"c":8}]' \
| jq '.[]|[.a.aa,.a.ab,.b]'
[
1,
2,
3
]
[
5,
6,
7
]
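When the selected fields should end up on one line per record, the array form can be piped into the @tsv formatter. This is a sketch assuming jq 1.5 or later; the second record (octocat) is made-up sample data, since the original input is elided:

```shell
# Hypothetical input: an array of member objects (second entry is invented).
json='[{"login":"dmaxfield","id":7449977},{"login":"octocat","id":1}]'

# Put the wanted fields in an array, then format each array as a
# tab-separated line; -r prints the raw string instead of a JSON string.
result=$(printf '%s' "$json" | jq -r '.[] | [.login, .id] | @tsv')
echo "$result"
```

Each output line holds a login and its id separated by a tab, which is convenient for further processing with tools such as cut or awk.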

Resources