jq fails to parse when one of the values contains a backslash - jq

I have a json output, representing a linux command in one of its values:
... ,"proc.cmdline":"sh -c pgrep -fl \"unicorn.* worker\[.*?\]\"", ...
In some cases, the command contains a backslash, so the resulting json will contain a backslash too.
I need to parse the output with jq, but it fails with an error:
parse error: Invalid escape at line 1, column 373
It refers to this: \[
However, this is a part of the command, so it is expected to be there.
If I manually edit the line, converting \[ to \\[, then it passes. However, the resulting output contains both backslashes:
...
"proc.cmdline": "sh -c pgrep -fl \"unicorn.* worker\\[.*?\\]\"",
...
Now, I can't be there to manually edit every time. This output is produced automatically by another software, and I need to parse it with jq every time it comes in.
Also, even if I were able to edit every \[ to \\[ (say, with something like sed), the output becomes a lie: the second \ is fake.
Any ideas on how to work around this?
EDIT: here is the full json for reference (received raw from the output of the program I'm using, falco):
{"priority":"Debug","rule":"Run shell untrusted","time":"2019-05-15T07:32:36.597411997Z", "output_fields": {"evt.time":1557905556597411997,"proc.aname[2]":"gitlab-mon","proc.aname[3]":"runsv","proc.aname[4]":"runsvdir","proc.aname[5]":"wrapper","proc.aname[6]":"docker-containe","proc.aname[7]":"docker-containe","proc.cmdline":"sh -c pgrep -fl \"unicorn.* worker\[.*?\]\"","proc.name":"sh","proc.pcmdline":"reactor.rb:249 ","proc.pname":"reactor.rb:249","user.name":null}}

The JSON standard is quite explicit about which characters have to be escaped, and [ is not one of them (though the reverse solidus \ is). So it's your script/software generating the JSON that violates the JSON standard - you can validate it on any of the well-known online JSON validators, e.g. this one: https://jsoncompare.com/#!/simple/ - it will produce the error too.
If you cannot enhance/fix the script generating that JSON, then you'd need to double those non-compliant backslashes before passing the text to a JSON processor, e.g.:
... | sed -E 's/\\([][])/\\\\\1/g' | ...
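For example, with the sample event above (a sketch, assuming the events arrive one JSON object per line and that [ and ] are the only characters being wrongly escaped):
... | sed -E 's/\\([][])/\\\\\1/g' | jq -r '.output_fields["proc.cmdline"]'
sh -c pgrep -fl "unicorn.* worker\[.*?\]"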

You'll need to fix whatever is generating that "json" string. Use something that produces compliant json.
If that's not an option for you, then you will have to modify it so that it is valid json. Fortunately jq can handle that. Read it in raw, fix the string, then parse it.
Assuming we just need to fix the \[ and \] sequences:
$ ... | jq -R 'gsub("\\\\(?<c>[[\\]])"; "\\\\\(.c)") | fromjson | "your filter"'
Remember, "sh -c pgrep -fl \"unicorn.* worker\\[.*?\\]\"" is a string with escapes... it represents the value:
sh -c pgrep -fl "unicorn.* worker\[.*?\]"
So it's absolutely correct to have "both backslashes."
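For instance, applied to the sample event above (a sketch; the trailing filter is just an illustration):
$ ... | jq -R 'gsub("\\\\(?<c>[[\\]])"; "\\\\\(.c)") | fromjson | .output_fields["proc.cmdline"]'
"sh -c pgrep -fl \"unicorn.* worker\\[.*?\\]\""
Add -r if you want the decoded value (sh -c pgrep -fl "unicorn.* worker\[.*?\]") rather than its JSON representation.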

Related

Re-colorizing escaped ANSI sequences in shell

I have the following string:
\u001b[1m\u001b[36mDelayed::Backend::ActiveRecord::Job Load
It's coming back as part of a remote log fetch. I would like to reinterpret the escape codes so I get colorized output as originally intended.
The following works fine:
❯ echo '\u001b[1m\u001b[36mDelayed::Backend::ActiveRecord::Job Load' | sed 's/\\u001/\u001/'
However when I try the equivalent command using tail:
tail -f logfile | sed 's/\\u001/\u001/'
I do not get colorized output.
What is the best way to re-colorize text with escaped ANSI sequences?
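One way to approach this (a sketch, assuming GNU sed, which understands \xHH escapes in the replacement and has an unbuffered mode for use with tail -f):
tail -f logfile | sed -u 's/\\u001b/\x1b/g'
The -u keeps sed from buffering its output, so the re-colorized lines appear as they arrive.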

jq in CLI create error when I want to parse the output

Using Home Assistant 0.92 to test my CLI for creating automated backups. After a successful backup, the command responds with an output and I need to catch that value. I'm trying to use jq to parse it but only get an error.
$ hassio snapshots new --name"Testbackup"
This gives an output of slug: 07afd144 and I want to catch 07afd144
Tried following:
$ hassio snapshots new --name"Testbackup" | jq --raw-output '.data.slug'
This gives an output of parse error: Invalid numeric literal at line 1, column 5
The final result is planned to be:
slug=$(hassio snapshots new --name="${name}" | jq --raw-output '.data.slug')
where ${slug}=07afd144
What am I doing wrong?
jq is a tool for parsing and transforming JSON documents. What you have shown is not legal JSON. It is, however, a legal YAML document and can be transformed with yq. yq uses jq-like syntax, but can handle JSON, YAML, XML, and CSV files.
slug=$(hassio snapshots new --name="${name}" | yq '.slug')
slug: 07afd144 isn't valid JSON and as such cannot be parsed with jq. Furthermore, it doesn't contain a data property anywhere, so .data.slug doesn't make sense.
If the format is always this simple (property name, colon, space, value), then the value can be easily extracted with other common tools generally available on GNU+Linux systems:
cut (different invocations possible):
cut -d' ' -f2-
cut -c7-
awk:
awk '{print $2}'
sed:
sed 's/^slug: //'
perl:
perl -lane 'print $F[1]'
or even grep (different invocations possible):
grep -o '[^ ]*$'
grep -o '[[:xdigit:]]*$'
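Plugged into the intended command, any of these works the same way; for example, with the awk variant:
slug=$(hassio snapshots new --name="${name}" | awk '{print $2}')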

Passing variables to grep command in Tcl Script

I'm facing a problem while trying to pass a variable value to a grep command.
In essence, I want to grep out the lines which match my pattern, and the pattern is stored in a variable. I take in the input from the user, and parse through myfile to see if the pattern exists (no problem here).
If it exists I want to display the lines which have the pattern i.e grep it out.
My code:
if {$a==1} {
puts "serial number exists"
exec grep $sn myfile } else {
puts "serial number does not exist"}
My input: SN02
My result when I run grep in a shell terminal (grep "SN02" myfile):
serial number exists
SN02 xyz rtw 345
SN02 gfs rew 786
My result when I try to execute grep in Tcl script:
serial number exists
The lines which match the pattern are not displayed.
Your (horrible IMO) indentation is not actually the problem. The problem is that exec does not automatically print the output of the exec'ed command*.
You want puts [exec grep $sn myfile]
This is because the exec command is designed to allow the output to be captured in a variable (like set output [exec some command])
* in an interactive tclsh session, as a convenience, the result of commands is printed. Not so in a non-interactive script.
To follow up on the "horrible" comment, your original code has no visual cues about where the "true" block ends and where the "else" block begins. Due to Tcl's word-oriented nature, the language pretty well mandates the one-true-brace indentation style.
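A minimal sketch of the corrected block (note that exec raises a Tcl error when grep exits non-zero, i.e. when nothing matches, so you may want a catch around it):
if {$a == 1} {
    puts "serial number exists"
    # exec returns grep's output; print it explicitly
    puts [exec grep $sn myfile]
} else {
    puts "serial number does not exist"
}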

programmatic grep command output

Is there a way to get XML or equivalent output from the grep command that can be passed on to other programs?
For example, grep can give the file names, line numbers and context of the pattern matched.
Filename and line number extraction can be done with some split command using ':' as the delimiter. However, if the filename contains a ':' character (I know it is weird, but it is possible), it would need a lot more processing.
With the context (grep -C option), it becomes even more complex. If the contexts of two matches overlap, grep merges the output, and it becomes difficult to separate the matches.
So I am wondering if grep command can simply generate an XML or JSON like output that other programs can just load.
There is an option -Z (--null) to grep which produces unambiguous output by using NUL characters.
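For example (a sketch, assuming GNU grep and bash; -Z terminates each file name with a NUL byte instead of a colon, so file names containing ':' stay unambiguous):
grep -rnZ 'pattern' . | while IFS= read -r -d '' file && IFS=: read -r lineno text; do
    printf 'file=%s line=%s text=%s\n' "$file" "$lineno" "$text"
done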

Zsh completions with multiple repeated options

I am attempting to bend zsh, my shell of choice, to my will, and am completely at a loss on the syntax and operation of completions.
My use case is this: I wish to have completions for 'ansible-playbook' under the '-e' option support three variations:
Normal file completion: ansible-playbook -e vars/file_name.yml
Prepended file completion: ansible-playbook -e @vars/file_name.yml
Arbitrary strings: ansible-playbook -e key=value
I started out with https://github.com/zsh-users/zsh-completions/blob/master/src/_ansible-playbook which worked decently, but required modifications to support the prefixed file pathing. To achieve this I altered the following lines (the -e line):
...
"(-D --diff)"{-D,--diff}"[when changing (small files and templates, show the diff in those. Works great with --check)]"\
"(-e --extra-vars)"{-e,--extra-vars}"[EXTRA_VARS set additional variables as key=value or YAML/JSON]:extra vars:(EXTRA_VARS)"\
'--flush-cache[clear the fact cache]'\
to this:
...
"(-D --diff)"{-D,--diff}"[when changing (small files and templates, show the diff in those. Works great with --check)]"\
"(-e --extra-vars)"{-e,--extra-vars}"[EXTRA_VARS set additional variables as key=value or YAML/JSON]:extra vars:__at_files"\
'--flush-cache[clear the fact cache]'\
and added the '__at_files' function:
__at_files () {
    compset -P @; _files
}
This may be very noobish, but for someone that has never encountered this before, I was pleased that this solved my problem, or so I thought.
This fails me if I have multiple '-e' parameters, which is totally a supported model (similar to how docker allows multiple -v or -p arguments). What this means is that the first '-e' parameter will have my prefixed completion work, but any '-e' parameters after that point become 'dumb' and only allow for normal '_files' completion from what I can tell. So the following will not complete properly:
ansible-playbook -e key=value -e @vars/file
but this would complete for the file itself:
ansible-playbook -e key=value -e vars/file
Did I mess up? I see the same type of behavior for this particular completion plugin's '-M' option (it also becomes 'dumb' and does basic file completion). I may have simply not searched for the correct terminology or combination of terms, or perhaps in the rather complicated documentation missed what covers this, but again, with only a few days experience digging into this, I'm lost.
If multiple -e options are valid, the _arguments specification should start with *, so instead of:
"(-e --extra-vars)"{-e,--extra-vars}"[EXTR ....
use:
\*{-e,--extra-vars}"[EXTR ...
The (-e --extra-vars) part indicates a list of options that cannot follow the one being specified. That exclusion isn't wanted anymore, because it is presumably valid to do, e.g.:
ansible-playbook -e key=value --extra-vars @vars/file
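Combined with the __at_files helper from the question, the -e spec would then look something like:
\*{-e,--extra-vars}"[EXTRA_VARS set additional variables as key=value or YAML/JSON]:extra vars:__at_files"\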
