jq accepts invalid JSON input when it should throw an error

I am having a problem validating a JSON string. I am using the code below:
if jq -e . >/dev/null 2>&1 <<<"$json_string"; then
    echo "Parsed JSON successfully and got something other than false/null"
else
    echo "Failed to parse JSON, or got false/null"
fi
This does not work for json_string='{"fruit":{"name":"app'; it still prints "Parsed JSON successfully and got something other than false/null" even though the JSON string is incomplete.

Apparently this is one of the known issues in jq-1.5: unterminated objects/arrays, missing their corresponding close character, are treated as valid and accepted by the parser. I can reproduce it in jq-1.5, but it is fixed in jq-1.6.
On jq-1.6:
jq -e . <<< '{"fruit":{"name":"app'
parse error: Unfinished string at EOF at line 2, column 0
echo $?
4
A minimal reproducible example is below; again, it is handled correctly in 1.6 but does not throw an error in 1.5:
jq -e . <<< '{'
parse error: Unfinished JSON term at EOF at line 2, column 0
jq -e . <<< '['
parse error: Unfinished JSON term at EOF at line 2, column 0
Suggest upgrading to jq-1.6 to make this work!
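If the check will run in scripts, a small wrapper keeps it reusable. A minimal sketch, assuming jq 1.6 or later is installed (so that truncated input is actually rejected):
validate_json() {
    # exit 0 only when stdin parses as JSON and evaluates to
    # something other than false/null (that is what -e checks)
    jq -e . >/dev/null 2>&1
}

if printf '%s' "$json_string" | validate_json; then
    echo "valid JSON"
else
    echo "invalid JSON, or false/null"
fi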

Can't print a field called "end" using jq

I'm trying to print out a field called "end" from a json file using jq but am running into the following error:
$ echo '{"start": 10, "end": 20}` > /tmp/out.json
$ jq .start /tmp/out.json
10
$ jq .end /tmp/out.json
error: syntax error, unexpected end, expecting $end
.end
^^^
1 compile error
This issue (https://github.com/stedolan/jq/issues/256) suggests using .["end"] as the selector, but that doesn't seem to work either.
$ jq .["end"] /tmp/out.json
error: syntax error, unexpected end
.[end]
^^^
1 compile error
Any ideas?
This was fixed in more recent versions of jq. I can do this:
$ jq --version
jq-1.6-1-g2f2d05b
$ jq .end <<< '{"start": 10, "end": 20}'
20
Your second attempt failed because the shell removes the double quotes. You have to protect them by quoting the whole thing:
jq '.["end"]'
The relevant issue describing your initial problem is "Reserved words should not generate errors when used as object keys"; the fix was in this commit, and it looks like it has been in jq since version 1.5rc2.
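For completeness, on a fixed version both spellings of the selector work, using the /tmp/out.json file from the question:
$ jq '.end' /tmp/out.json
20
$ jq '.["end"]' /tmp/out.json
20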

jq in CLI create error when I want to parse the output

Using Home Assistant 0.92 to test my CLI for creating automated backups. After a successful backup, the command responds with an output, and I need to catch that value. I'm trying to use jq to parse it but only get an error.
$ hassio snapshots new --name="Testbackup"
This gives the output slug: 07afd144, and I want to capture 07afd144.
I tried the following:
$ hassio snapshots new --name="Testbackup" | jq --raw-output '.data.slug'
This gives the output parse error: Invalid numeric literal at line 1, column 5.
The final result is planned to be:
slug=$(hassio snapshots new --name="${name}" | jq --raw-output '.data.slug')
where ${slug}=07afd144
What am I doing wrong?
jq is a tool for parsing and transforming JSON documents. What you have shown is not legal JSON. It is, however, a legal YAML document and can be transformed with yq. yq uses jq-like syntax but can handle JSON, YAML, XML, and CSV files.
slug=$(hassio snapshots new --name="${name}" | yq '.slug')
slug: 07afd144 isn't valid JSON and as such cannot be parsed with jq. Furthermore, it doesn't contain a data property anywhere, so .data.slug doesn't make sense.
If the format is always this simple (property name, colon, space, value), then the value can be easily extracted with other common tools generally available on GNU+Linux systems:
cut (different invocations possible):
cut -d' ' -f2-
cut -c7-
awk:
awk '{print $2}'
sed:
sed 's/^slug: //'
perl:
perl -lane 'print $F[1]'
or even grep (different invocations possible):
grep -o '[^ ]*$'
grep -o '[[:xdigit:]]*$'
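Putting one of these together with your planned command (a sketch using the awk variant; any of the extractors above would do):
# capture the second whitespace-separated field of the CLI output
slug=$(hassio snapshots new --name="${name}" | awk '{print $2}')
echo "$slug"    # 07afd144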

jq fails to parse when one of the values contains a backslash

I have JSON output that represents a Linux command in one of its values:
... ,"proc.cmdline":"sh -c pgrep -fl \"unicorn.* worker\[.*?\]\"", ...
In some cases the command contains a backslash, so the resulting JSON will contain a backslash too.
I need to parse the output with jq, but it fails with an error:
parse error: Invalid escape at line 1, column 373
It refers to this: \[
However, this is a part of the command, so it is expected to be there.
If I manually edit the line, converting \[ to \\[, then it passes. However, the resulting output contains both backslashes:
...
"proc.cmdline": "sh -c pgrep -fl \"unicorn.* worker\\[.*?\\]\"",
...
Now, I can't be there to edit it manually every time. This output is produced automatically by another piece of software, and I need to parse it with jq every time it comes in.
Also, even if I were able to edit every \[ to \\[ (say, with something like sed), the output would become a lie: the second \ is fake.
Any ideas on how to work around this?
EDIT: here is the full json for reference (received raw by the output of the program I'm using (falco)):
{"priority":"Debug","rule":"Run shell untrusted","time":"2019-05-15T07:32:36.597411997Z", "output_fields": {"evt.time":1557905556597411997,"proc.aname[2]":"gitlab-mon","proc.aname[3]":"runsv","proc.aname[4]":"runsvdir","proc.aname[5]":"wrapper","proc.aname[6]":"docker-containe","proc.aname[7]":"docker-containe","proc.cmdline":"sh -c pgrep -fl \"unicorn.* worker\[.*?\]\"","proc.name":"sh","proc.pcmdline":"reactor.rb:249 ","proc.pname":"reactor.rb:249","user.name":null}}
The JSON standard is quite explicit about which characters have to be escaped, and [ is not one of them (though the reverse solidus \ is). So it is your script/software generating the JSON that violates the JSON standard. You can validate it on any of the well-known online JSON validators, e.g. this one: https://jsoncompare.com/#!/simple/ - it will produce the error too.
If you cannot enhance/fix your script generating that JSON, then you'd need to ensure you double quote those non-compliant quotations before passing to JSON processor: e.g.
... | sed -E 's/\\([][])/\\\\\1/g' | ...
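For example, inserted between the producer and jq (a sketch; .output_fields["proc.cmdline"] is just one possible filter for the sample document below):
... | sed -E 's/\\([][])/\\\\\1/g' | jq -r '.output_fields["proc.cmdline"]'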
You'll need to fix whatever is generating that "json" string. Use something that produces compliant json.
If that's not an option for you, then you will have to modify the output so that it is valid JSON. Fortunately jq can handle that: read it in raw, fix the string, then parse it.
Assuming we just need to fix the \[ and \] sequence:
$ ... | jq -R 'gsub("\\\\(?<c>[[\\]])"; "\\\\\(.c)") | fromjson | "your filter"'
Remember, "sh -c pgrep -fl \"unicorn.* worker\\[.*?\\]\"" is a string with escapes... it represents the value:
sh -c pgrep -fl "unicorn.* worker\[.*?\]"
So it's absolutely correct to have "both backslashes."
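An end-to-end sketch with the sample falco document, assuming it arrives as a single line on stdin (the gsub is the one from above; -r makes jq print the decoded value, and the trailing filter is just an example):
... | jq -R -r 'gsub("\\\\(?<c>[[\\]])"; "\\\\\(.c)") | fromjson | .output_fields["proc.cmdline"]'
which prints the real command line:
sh -c pgrep -fl "unicorn.* worker\[.*?\]"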

ksh: syntax error near unexpected token `done'

I am trying to write a script that outputs lines fulfilling a certain criterion into a new .txt file, combining Unix tools and awk.
I have been googling, but I keep getting this error: syntax error near unexpected token `done'
Filename="bishan"
file="659.A"
while IFS= read line
do
cat $Filename.txt | awk '{ otherSubNo = substr($0,73,100);gsub(/
/,"",otherSubNo); if(length(otherSubNo)>8){ print "Subscriber Number is
",": ",substr($0,1,20)," Other Subscriber Number is ", " :
",substr($0,73,100) }}'| wc -l >> $Filename.txt
done <"$file"
An example of 659.A is as follows; this is the first line of the file (wrapped here for display):
6581264562 201611050021000000002239442239460000000019010000010081866368
00C0525016104677451 100C 0 0000
0111000 000000000000000000006598540021 01010000000000659619778001010000
000000659854000300000000000000000000 004700001
Please help; I have been googling about this but to no avail.
I was able to reproduce the specified error, albeit only approximately, by typing the script in Notepad (Windows) and testing it in Cygwin.
script.sh:
while read myline
do
echo $myline
done
In ksh:
~> /usr/bin/ksh ./script.sh
: not found
./script.sh[7]: syntax error: 'done' unexpected
In bash:
~> /usr/bin/bash ./script.sh
./script.sh: line 2: $'\r': command not found
./script.sh: line 6: syntax error near unexpected token `done'
./script.sh: line 6: `done'
The said error (at least in my case) is caused by CRLF line endings. When I copy-paste the code into Cygwin, the CRLF turns into LF (and the invisible control characters get lost along the way), which makes the error disappear.
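For completeness, this is how such a file is usually diagnosed and repaired (tr is a portable fallback in case dos2unix is not installed):
file script.sh          # reports "with CRLF line terminators" if affected
cat -v script.sh        # CRLF shows up as ^M at the end of each line
tr -d '\r' < script.sh > script.unix.sh    # strip the carriage returns
dos2unix script.sh                         # or fix in place, if available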

When is it allowed to omit the dot filter in jq?

I do not understand when it is allowed to omit the dot expression.
It is possible to convert every line of raw input into a JSON string:
$ echo -e "a\nb" | jq -Rc .
"a"
"b"
In that example it makes no difference when the dot expression is missing:
$ echo -e "a\nb" | jq -Rc
"a"
"b"
Next, I can read the output from the first jq and slurp it into an array:
$ echo -e "a\nb" | jq -Rc . | jq -sc .
["a","b"]
Here it also makes no difference when I omit the dot expression:
$ echo -e "a\nb" | jq -Rc . | jq -sc
["a","b"]
But when I omit both dot expressions, I get a usage error and an empty array as the result:
$ echo -e "a\nb" | jq -Rc | jq -sc
jq - commandline JSON processor [version 1.5]
Usage: jq [options] <jq filter> [file...]
...
[]
Why?
Before directly answering the question, I'd like to clarify that:
It is always acceptable to specify a filter explicitly.
Some versions of jq expect that a filter will be specified explicitly.
Different versions of jq behave differently in the absence of an explicit filter.
The main idea guiding jq's evolution with regard to interpreting the absence of a filter intelligently has been that if there's something to read on STDIN, and if a filter has not been specified explicitly, and if it looks like you meant ., then assume you did mean ..
The answer to the question, then, is that the perplexing behavior noted in the question is a bug in a particular version of jq.
(Or if you like, the perplexing behavior reflects the difficulties that arise when developers seek to endow software with the ability to read your mind.)
By the way, the bug has been fixed:
$ jq --version
jq-1.5rc2-150-g1740fd0
$ echo -e "a\nb" | jq -Rc | jq -sc
["a","b"]
The answer is in the rest of the usage text:
Usage: jq [options] <jq filter> [file...]
A filter should therefore be mandatory: a filter takes an input and produces an output. But in many cases you don't need to transform the input and just want the result printed, so . became the default filter (I believe this was introduced in 1.5; before that, you had to include the filter explicitly).
So omitting the filter should behave the same as writing . explicitly. Unfortunately, the complication is how jq decides whether stdin and stdout are pipes or terminals. You can read the details in the GitHub issue:
Maybe we should print the usage message only when the program is empty, and stdin and stdout are both terminals? That is, assume . when stdin is not a terminal or when stdout is not a terminal.
So the rule is:
If you want to be a perfectionist, always use a filter, even if . is the filter you want.
If you want the result of your command to be the input of another pipe (or a redirection), you must specify the filter, even when you just want the input passed through unchanged to the next command.
By the same token, echo -e "a\nb" | jq -Rc > test.txt will produce an error, but echo -e "a\nb" | jq -Rc . > test.txt will write the result of the command into the file.
