reduction of stream of object does not work for me - dictionary

I have written several reductions where I had an array to begin with. But when I read raw data and transform each line into an object, I have no luck reducing the objects together:
echo -e "1\n2\n\n\n3\n4\n5" | jq --raw-input '. | select (. != "") | {(.):123} | reduce . as $i ({}; . + $i)'
The reduction does nothing. Why? How can I correct it to produce a single object with the keys 1, 2, 3, 4, 5?

First, the initial .| is unnecessary.
Second, since your input is a stream, you will either need to use the -s option, or better, use the -n option with inputs.
So you could go with:
echo -e "1\n2\n\n\n3\n4\n5" |
jq -nR 'reduce (inputs|select(. != "")) as $i ({}; . + {($i): 123})'
though maybe {($i): null} might be more appropriate.

You were almost there. To convert from multiple results to a single object, you can run another jq in slurp mode:
echo -e "1\n2\n\n\n3\n4\n5" \
| jq --raw-input 'select (. != "") | {(.):123}' \
| jq --slurp 'reduce .[] as $o ({}; . + $o)'
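To see why the original reduction appears to do nothing, it may help to compare the per-line behaviour with the -n/inputs form. This is a minimal sketch, assuming jq 1.5 or later is on the PATH; the function names per_line and merged are illustrative and not part of jq.

```shell
#!/bin/sh
# Sketch only: assumes jq >= 1.5. Function names are hypothetical.

# Without -n, jq runs the whole program once per input line, so each
# `reduce . as $i` sees exactly one object and emits it unchanged.
per_line() {
  jq -cR 'select(. != "") | {(.): 123} | reduce . as $i ({}; . + $i)'
}

# With -n, the program runs once and `inputs` yields every line as a
# stream, so a single reduce can fold all lines into one object.
merged() {
  jq -cnR 'reduce (inputs | select(. != "")) as $i ({}; . + {($i): 123})'
}

printf '1\n2\n\n\n3\n' | per_line   # one single-key object per non-empty line
printf '1\n2\n\n\n3\n' | merged     # one object holding all the keys
```

The second form avoids both the extra jq process and the need to slurp the whole input into an array first.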


jq: how to check two conditions in the any filter?

I have this line jq 'map(select( any(.topics[]; . == "stackoverflow" )))'
Now I want to modify it (I didn't write the original) to add another condition to the any function.
Something like this jq 'map(select( any(.topics[]; . == "stackoverflow" and .archived == "false" )))'
But it gives me: Cannot index string with string "archived".
The archived field is on the same level as the topics array (it's repo information from the GitHub API).
It is part of a longer command, FYI:
repositoryNames=$(curl \
-H "Authorization: token $GITHUB_TOKEN" \
-H "Accept: application/vnd.github.v3+json" \
"https://api.github.com/orgs/organization/repos?per_page=100&page=$i" | \
jq 'map(select(any(.topics[]; . == "stackoverflow")))' | \
jq -r '.[].name')
The generator provided to any already descends to .topics[] from where you cannot back-reference two levels higher. Use the select statement to filter beforehand (also note that booleans are not strings):
jq 'map(select(.archived == false and any(.topics[]; . == "stackoverflow")))'
You should also be able to combine both calls to jq into one:
jq -r '.[] | select(.archived == false and any(.topics[]; . == "stackoverflow")).name'
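The scoping point can be checked on a small sample. The sketch below uses made-up repository data (the names a, b, c are not from the GitHub API response in the question) and a hypothetical helper active_with_topic. It assumes jq 1.5 or later, where any/2 is available.

```shell
#!/bin/sh
# Sketch only: the sample data and the function name are invented.
repos='[
  {"name":"a","archived":false,"topics":["stackoverflow","cli"]},
  {"name":"b","archived":true,"topics":["stackoverflow"]},
  {"name":"c","archived":false,"topics":["docs"]}
]'

# .archived is tested on the repository object itself; only the topic
# comparison runs inside any(), where `.` is a single topic string.
active_with_topic() {
  jq -r --arg t "$1" \
    '.[] | select(.archived == false and any(.topics[]; . == $t)) | .name'
}

printf '%s' "$repos" | active_with_topic stackoverflow
```

Only repository a should be printed: b is archived and c lacks the topic.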

tcsh passing a variable inside a shell script

I've defined a variable inside a shell script and I want to use it. For some reason, I cannot pass it into the command line that I need it in.
Here's my script which fails at the last lines
#! /usr//bin/tcsh -f
if ( $# != 2 ) then
echo "Usage: jump_sorter.sh <jump> <field to sort on>"
exit;
endif
set a = `cat $1 | tail -1` #prepares last row for check with loop
set b = $2 #this is the value last row will be checked for
set counter = 0
foreach i ($a)
if ($i == "$b") then
set bingo = $counter
echo "$bingo is the field to print from $a"
endif
set counter = `expr $counter + 1`
end
echo $bingo #this prints the correct value for using in the command below
cat $1 | awk '{print($bingo)}' | sort | uniq -c | sort -nr #but this doesn't work.
#when I use $9 instead of $bingo, it does work.
How can I pass $bingo into the final line correctly, please?
Update: following the accepted answer from Martin Tournoij, the correct way to handle the "$" sign in the command is:
cat $1 | awk "{print("\$"$bingo)}" | sort | uniq -c | sort -nr
The reason it doesn't work is because variables are only substituted inside double quotes ("), not single quotes ('), and you're using single quotes:
cat $1 | awk '{print($bingo)}' | sort | uniq -c | sort -nr
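Double quotes let the shell substitute $bingo, but awk then receives something like print(9) and prints the literal number, so a literal $ is still needed in front of it (as in the update to the question). Passing the value in with -v avoids the quoting problem altogether:
cat $1 | awk -v f="$bingo" '{print $f}' | sort | uniq -c | sort -nr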
You also have an error here:
#! /usr//bin/tcsh -f
That should be:
#!/usr/bin/tcsh -f
Note that csh isn't usually recommended for scripting; it has many quirks and lacks some features like functions. Unless you really need to use csh, it's recommended to use a Bourne shell (/bin/sh, bash, zsh) or a scripting language (Python, Ruby, etc.) instead.
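For comparison, here is a rough Bourne-shell sketch of the same idea. It assumes, as in the question, that the last row of the file holds the labels to search. The function name count_field is hypothetical, and the column index is 1-based because awk numbers fields from 1.

```shell
#!/bin/sh
# Sketch only: count_field FILE LABEL
# Finds which column of the last row equals LABEL, then counts the
# distinct values found in that column across the whole file.
count_field() {
  file=$1
  label=$2
  # Print the 1-based index of the first matching column in the last row.
  col=$(tail -n 1 "$file" |
    awk -v want="$label" '{ for (i = 1; i <= NF; i++) if ($i == want) { print i; exit } }')
  [ -n "$col" ] || { echo "label not found: $label" >&2; return 1; }
  # The shell variable reaches awk through -v, so the awk program can
  # stay in single quotes and $f means "field number f".
  awk -v f="$col" '{ print $f }' "$file" | sort | uniq -c | sort -nr
}
```

It would be called as count_field jump.txt somefield, where both arguments are placeholders.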

Passing Multiple Objects to jq for Recursive Filter Operation

I am trying to use jq 1.5 to develop a script that can take one or more user inputs that represent a key and recursively remove them from JSON input.
The JSON I am referencing is here:
https://github.com/EmersonElectricCo/fsf/blob/master/docs/Test.json
My script, which seems to work pretty well, is as follows.
def post_recurse(f):
def r:
(f | select(. != null) | r), .;
r;
def post_recurse:
post_recurse(.[]?);
(post_recurse | objects) |= del(.META_BASIC_INFO)
However, I would like to replace META_BASIC_INFO with one or more user inputs. How would I go about accomplishing this? I presume with --arg from the command line, but I am unclear on how to incorporate this into my .jq script?
I've tried replacing del(.META_BASIC_INFO) with del(.$module) and invoking with cat test.json | ./jq -f fsf_key_filter.jq --arg module META_BASIC_INFO to test but this does not work.
Any guidance on this is greatly appreciated!
ANSWER:
Based on a couple of suggestions I was able to arrive at the following, which works and uses jq.
Invocation:
cat test.json | jq --argjson delete '["META_BASIC_INFO","SCAN_YARA"]' -f fsf_module_filter.jq
Code:
def post_recurse(f):
def r:
(f | select(. != null) | r), .;
r;
def post_recurse:
post_recurse(.[]?);
(post_recurse | objects) |= reduce $delete[] as $d (.; delpaths([[ $d ]]))
It seems the name module is a keyword in 1.5 so $module will result in a syntax error. You should use a different name. There are other builtins to do recursion for you, consider using them instead of churning out your own.
$ jq '(.. | objects | select(has($a))) |= del(.[$a])' --arg a "META_BASIC_INFO" Test.json
You could also use delpaths/1. For example:
$ jq -n '{"a":1, "b": 1} | delpaths([["a"]])'
{
"b": 1
}
That is, modifying your program so that the last line reads like this:
(post_recurse | objects) |= delpaths([[ $delete ]] )
you would invoke jq like so:
$ jq --arg delete "META_BASIC_INFO" -f delete.jq input.json
(One cannot use --arg module ... as "$module" has some kind of reserved status.)
Here's a "one-line" solution using walk/1:
jq --arg d "META_BASIC_INFO" 'walk(if type == "object" then del(.[$d]) else . end)' input.json
If walk/1 is not in your jq, here is its definition:
# Apply f to composite entities recursively, and to atoms
def walk(f):
. as $in
| if type == "object" then
reduce keys[] as $key
( {}; . + { ($key): ($in[$key] | walk(f)) } ) | f
elif type == "array" then map( walk(f) ) | f
else f
end;
If you want to recursively delete a bunch of key-value pairs, then here's one approach using --argjson:
rdelete.jq:
def rdelete(key):
walk(if type == "object" then del(.[key]) else . end);
reduce $strings[] as $s (.; rdelete($s))
Invocation:
$ jq --argjson strings '["a","b"]' -f rdelete.jq input.json
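A variant that removes all requested keys in a single traversal, rather than one walk per key, might look like the sketch below. It assumes a jq where walk/1 is a builtin (1.6 or later; on 1.5 the definition above would have to be prepended). The function name delete_keys and the sample document are invented for illustration.

```shell
#!/bin/sh
# Sketch only: delete_keys JSON_ARRAY_OF_KEYS < input.json
delete_keys() {
  # $keys | map([.]) turns ["a","b"] into the path list [["a"],["b"]],
  # which delpaths removes from every object that walk visits.
  jq -c --argjson keys "$1" \
    'walk(if type == "object" then delpaths($keys | map([.])) else . end)'
}

printf '%s' '{"a":1,"keep":{"b":2,"c":[{"a":3,"d":4}]}}' | delete_keys '["a","b"]'
```

Because walk applies the filter bottom-up, a key nested inside another occurrence of the same key is removed as well.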

Terminating jq processing when condition is met

I am using jq to search for specific results in a large file. I do not care for duplicate entries matching this specific condition, and it takes a while to process the whole file. What I would like to do is print some details about the first match and then terminate the jq command on the file to save time.
I.e.
jq '. | if ... then "print something; exit jq" else ... end'
I looked into http://stedolan.github.io/jq/manual/?#Breakingoutofcontrolstructures but this didn't quite seem to apply
EDIT:
The file I am parsing contains multiple json objects, one after another. They are not in an array.
Here is an approach which uses a recent version of first/1 (currently in master)
def first(g): label $out | g | ., break $out;
first(inputs | if .=="100" then . else empty end)
Example:
$ seq 1000000000 | jq -M -Rn -f filter.jq
Output (followed by immediate termination)
"100"
Here I use seq in lieu of a large JSON dataset.
To do what is requested is possible using features that were added after the release of jq 1.4. The following uses foreach and inputs:
label $top
| foreach inputs as $line
# state: true means found; false means not yet found
(false;
if . then break $top
else if $line | tostring | test("goodbye") then true else false end
end;
if . then $line else empty end
)
Example:
$ cat << EOF | jq -n -f exit.jq
1
"goodbye"
3
4
EOF
Result:
"goodbye"
You can accomplish it using halt and the inputs builtin:
jq -n 'inputs | if ... then "something", halt else ... end'
Will print "something" and terminate gracefully when the condition matches.
For this to work (i.e. terminate when condition is true), jq needs the -n parameter. See this issue
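On jq 1.5 and later, first/1 is a builtin, so the early-exit pattern can be tried without defining it by hand. The sketch below again uses seq as a stand-in for a large stream of JSON values. The condition (the first multiple of 7 greater than 50) and the function name first_match are arbitrary choices for the demonstration.

```shell
#!/bin/sh
# Sketch only: assumes jq >= 1.5, where first/1 is implemented with
# label/break and therefore stops pulling from `inputs` after one hit.
first_match() {
  jq -n 'first(inputs | select(. > 50 and . % 7 == 0))'
}

# jq exits after the first match; seq is then stopped by the closed pipe.
seq 1 1000000 | first_match
```

As with the halt approach, the -n flag matters: without it jq would run the program once per input instead of letting inputs drive the stream.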

How to sort characters in a string?

I would like to sort the characters in a string.
E.g.
echo cba | sort-command
abc
Is there a command that will allow me to do this or will I have to write an awk script to iterate over the string and sort it?
echo cba | grep -o . | sort |tr -d "\n"
Please find the following useful methods:
Shell
Sort string based on its characters:
echo cba | grep -o . | sort | tr -d "\n"
String separated by spaces:
echo 'dd aa cc bb' | tr " " "\n" | sort | tr "\n" " "
Perl
print (join "", sort split //,$_)
Ruby
ruby -e 'puts "dd aa cc bb".split(/\s+/).sort'
Bash
With bash you have to enumerate each character from a string, in general something like:
str="dd aa cc bb";
for (( i = 0; i < ${#str}; i++ )); do echo "${str:i:1}"; done
For sorting array, please check: How to sort an array in bash?
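The enumeration can also be combined with sort in one small function. This sketch sticks to POSIX sh parameter expansion so it does not depend on bash. The name sort_chars is made up, and LC_ALL=C is an assumption chosen so the ordering is plain byte order regardless of locale.

```shell
#!/bin/sh
# Sketch only: sort_chars STRING prints the characters of STRING sorted.
sort_chars() (
  s=$1
  while [ -n "$s" ]; do
    rest=${s#?}                    # everything after the first character
    printf '%s\n' "${s%"$rest"}"   # the first character itself
    s=$rest
  done | LC_ALL=C sort | tr -d '\n'
  echo                             # finish with a single newline
)

sort_chars cba
```

The body runs in a subshell (parentheses instead of braces), so the helper variables s and rest do not leak into the caller.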
This is cheating (because it uses Perl), but works. :-P
echo cba | perl -pe 'chomp; $_ = join "", sort split //'
Another perl one-liner
$ echo cba | perl -F -lane 'print sort @F'
abc
$ # for reverse order
$ echo xyz | perl -F -lane 'print reverse sort @F'
zyx
$ # or
$ echo xyz | perl -F -lane 'print sort {$b cmp $a} @F'
zyx
This will add newline to output as well, courtesy -l option
See Command switches for doc on all the options
The input is basically split character wise and saved in @F array
Then sorted @F is printed
This will also work line wise for given input file
$ cat ip.txt
idea
cold
spare
umbrella
$ perl -F -lane 'print sort @F' ip.txt
adei
cdlo
aeprs
abellmru
This would have been more appropriate as a comment to one of the grep -o . solutions (my reputation's not quite up to that low bar alas, damn my lurking), but I thought it worth mentioning that separating letters can be done more efficiently within the shell. It's always worth avoiding code, but this letsep function is pretty small:
letsep ()
{
INWORD="$1"
while [ "$INWORD" ]
do
echo ${INWORD:0:1}
INWORD=${INWORD#?}
done
}
. . . and outputs one letter per line for an input string of arbitrary length. For example, once letsep is defined, populating an array FLETRS with the letters of a string contained in variable FRED could be done (assuming contemporary bash) as:
readarray -t FLETRS < <(letsep $FRED)
. . . which for word-size strings runs about twice as fast as the equivalent :
readarray -t FLETRS < <(echo $FRED | grep -o .)
Whether this is worth setting up depends on the application. I've only measured this crudely, but the slower procedural code seems to maintain an advantage over the context switch up to ~60 chars (grep is obviously more efficient, but loading it is relatively expensive). If the above operation is taking place in one or more steps of a loop over an indeterminate number of executions, the difference in efficiency can add up (at which point some might argue for switching tools and rewriting regardless, but that's another set of tradeoffs).
