I have a JSON file with multiple object IDs, and I need a query that excludes certain IDs based on naming conventions. These are essentially ORs. I thought I had it with this query, but the unwanted entries still appear in the output.
If I run the query with each string separately it works, but I need to add a large list.
Works
cat file.json | jq '.interface[] | select(.description | contains ("VLL") | not )'
Not working
cat file.json | jq '.interface[] | select(.description | contains ("VLL"|"2002089"|"otherstuff" ) | not )'
I've tried a few different ways with commas and quoting, but no luck.
Am I far off?
I also plan to run this in a bash script, if that helps (it probably makes things worse).
Thanks
Am I far off?
If you use test/1 instead of contains, and make corresponding adjustments, no:
.interface[]
| select(.description | test ("VLL|2002089|otherstuff" ) | not )
The argument of test is interpreted as a regex. There are of course alternatives, but if using a regex is appropriate, then test would be suitable.
Blacklist of strings
If you have a blacklist of strings and want to use string equality as the criterion, consider:
["VLL","2002089","otherstuff"] as $blacklist
| .interface[]
| select(.description | IN($blacklist[]) | not)
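If the list should keep the substring semantics of contains rather than whole-string equality, one possible sketch is to combine the list with any/2. The sample JSON below is made up for illustration, and the filter is only one way to express the idea, not the only one:

```shell
# Hypothetical sample input; only the .interface[].description shape comes from the question.
json='{"interface":[{"description":"VLL to site A"},{"description":"uplink 2002089"},{"description":"core link"}]}'

# Keep an interface only if none of the listed strings occurs in its description.
kept=$(jq -c '
  ["VLL", "2002089", "otherstuff"] as $blacklist
  | .interface[]
  | select(.description as $d
           | any($blacklist[]; . as $s | $d | contains($s))
           | not)
' <<< "$json")

printf '%s\n' "$kept"
```

Unlike the regex passed to test, the strings here are matched literally, so characters such as . or | in a list entry need no escaping.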
I have to distinguish between the following two paths.
shorter: https://www.example.com/
longer: https://www.example.com/foo/
In a Bash script, using Bash's built-in parameter expansion as follows returns a path segment only for the longer one.
$ url1=https://www.example.com/
$ url2=https://www.example.com/foo/
$ cut -d/ -f4 <<<${url1%/*} # this returns nothing
>$
$ cut -d/ -f4 <<<${url2%/*} # this returns last part of path
>$ foo
So the longer one can be identified in a Bash script,
but now I have to define the same filter for a JSON value handled in jq.
If jq could be written like the following, my goal would be achieved...
jq '. | select( .url | (cut -d/ -f4 <<< ${url2%/*})!=null) )'
But that is not possible. How can I do this?
jq has many string-handling functions -- one could do worse than checking the jq manual. For the task at hand, using a regex function would probably be best, but since you mentioned cut -d/ -f4, it might be of interest to note that much the same effect can be achieved by:
split("/")[3]
For the last non-trivial part you could consider:
sub("/ *$";"") | split("/")[-1]
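As a rough sketch of how split("/")[3] could be used as the filter asked about, the following keeps only objects whose .url has a non-empty first path segment. The two input objects are assumptions built from the URLs in the question:

```shell
# Hypothetical input: one JSON object per line, each with a .url key.
urls='{"url":"https://www.example.com/"}
{"url":"https://www.example.com/foo/"}'

# Index 3 is the first path segment; it is "" for the bare host with a
# trailing slash and null when there is no trailing slash at all.
with_path=$(jq -c 'select(.url | split("/")[3] | . != null and . != "")' <<< "$urls")

printf '%s\n' "$with_path"
```

The null check is there for a URL written without the trailing slash, where the array has no index 3.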
I have a log file and I would like to divide the result of one grep and count by another grep and count.
$ echo $((cat log2.txt | grep timed\|error\|Error | wc -l)/(cat log2.txt | grep Duration | wc -l))
zsh: bad math expression: operator expected at `log2.txt |...'
It's ugly, doesn't work and I can probably do it in a better way but I don't know how.
I would also like to know whether it is possible to do this incrementally on a log stream read by tail, for example.
First of all, you should know that both grep | wc -l pipelines count the number of matching lines rather than the number of occurrences; I hope this is what you really want.
Regarding your requirement: indeed, your approach is ugly (7 processes), apart from the mistakes. The job can be done by a single awk line:
awk '/timed|[Ee]rror/{a++}/Duration/{b++}END{printf "%.2f\n",a/b}' log2.txt
The above line calculates the result based on matched number of lines, same as your grep|wc -l.
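For the incremental part of the question, one possible approach is to keep the same two counters but print the ratio every time a Duration line arrives instead of once in END. The sketch below feeds awk a few made-up log lines from a helper function so that it terminates; with a live log, the sample_log call would be replaced by something like tail -f log2.txt.

```shell
# Hypothetical sample lines standing in for `tail -f log2.txt`.
sample_log() {
  printf '%s\n' \
    'Duration: 12ms' \
    'request timed out' \
    'Duration: 15ms' \
    'Error: connection reset' \
    'Duration: 11ms' \
    'Duration: 9ms'
}

# Same counters as the one-liner, but a running ratio is printed on each
# Duration line rather than a single value at end of input.
running=$(sample_log | awk '
  /timed|[Ee]rror/ { errors++ }
  /Duration/       { durations++; printf "%.2f\n", errors / durations }
')

printf '%s\n' "$running"
```

Dividing only inside the Duration rule also means the divisor is never zero. If the awk output is itself piped into another program, adding fflush() after the printf may be needed to see results promptly.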
You have several problems:
You are trying to run shell commands directly inside an arithmetic expression.
You aren't passing the correct regular expression to grep.
You need to make sure at least one of the operands is a floating-point value to trigger zsh's floating-point division.
Each pipeline can also be reduced to a single command: pass the file to grep directly (or use input redirection) instead of using cat, and use the -c option to get the number of lines that match the regular expression.
echo $(( 1.0 * $(grep -c 'timed\|error\|Error' log2.txt) / $(grep -c Duration log2.txt) ))
Basic regular expressions treat unescaped | as a literal character, not an alternation operator.
$ echo foo | grep foo\|bar
$ echo foo | grep foo\\\|bar # Pass a literal backslash as part of the regex
foo
$ echo foo | grep 'foo\|bar' # Use '...' instead of explicitly escaping \ and |
foo
$ echo foo | grep -E 'foo|bar' # Use extended regular expressions instead
foo
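The arithmetic above relies on zsh's floating-point support; bash's $(( )) is integer-only. A hedged sketch of a variant that should work in bash or any POSIX shell is to let grep -c do the counting and hand the division to awk. The log contents and variable names below are invented for the example:

```shell
# Hypothetical sample log written to a temporary file.
log=$(mktemp)
printf '%s\n' \
  'Duration: 12ms' 'request timed out' 'Duration: 15ms' \
  'Error: connection reset' 'Duration: 11ms' 'Duration: 9ms' > "$log"

# -E enables alternation with a plain |; -c counts matching lines.
errors=$(grep -Ec 'timed|[Ee]rror' "$log")
durations=$(grep -c 'Duration' "$log")

# awk performs the floating-point division and guards against a zero divisor.
ratio=$(awk -v a="$errors" -v b="$durations" \
  'BEGIN { if (b > 0) printf "%.2f", a / b; else print "n/a" }')

echo "$ratio"
rm -f "$log"
```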
I do not understand when it is allowed to omit the dot expression.
It is possible to convert every line of raw input into a JSON string:
$ echo -e "a\nb" | jq -Rc .
"a"
"b"
In that example it makes no difference when the dot expression is missing:
$ echo -e "a\nb" | jq -Rc
"a"
"b"
Next I can read the output from the first jq and slurp it into an array:
$ echo -e "a\nb" | jq -Rc . | jq -sc .
["a","b"]
Here it also makes no difference when I omit the dot expression:
$ echo -e "a\nb" | jq -Rc . | jq -sc
["a","b"]
But when I omit both dot expressions, I get a usage error and an empty array as the result:
$ echo -e "a\nb" | jq -Rc | jq -sc
jq - commandline JSON processor [version 1.5]
Usage: jq [options] <jq filter> [file...]
...
[]
Why?
Before directly answering the question, I'd like to clarify that:
It is always acceptable to specify a filter explicitly.
Some versions of jq expect that a filter will be specified explicitly.
Different versions of jq behave differently in the absence of an explicit filter.
The main idea guiding jq's evolution with regard to interpreting the absence of a filter intelligently has been that if there's something to read on STDIN, and if a filter has not been specified explicitly, and if it looks like you meant ., then assume you did mean ..
The answer to the question, then, is that the perplexing behavior noted in the question is a bug in a particular version of jq.
(Or if you like, the perplexing behavior reflects the difficulties that arise when developers seek to endow software with the ability to read your mind.)
By the way, the bug has been fixed:
$ jq --version
jq-1.5rc2-150-g1740fd0
$ echo -e "a\nb" | jq -Rc | jq -sc
["a","b"]
The answer is in the rest of the usage text:
Usage: jq [options] <jq filter> [file...]
So a filter is nominally mandatory: a filter takes an input and produces an output. In many cases, though, you don't need to transform anything and just want the result printed, so the default became . (see the issue; I believe this was introduced in 1.5, and before that you had to include the filter).
So the behavior should be the same if . is the default filter; unfortunately, the difference comes from how stdin and stdout are handled when jq runs in a pipe. You can read the details in the GitHub issue:
Maybe we should print the usage message only when the program is empty, and stdin and stdout are both terminals? That is, assume . when stdin is not a terminal or when stdout is not a terminal.
So the rule is:
If you want to be a perfectionist, always use a filter, even if . is the filter you want.
If you want the result of your command to be the input of another pipe, you must specify the filter, even if you just want the same result to be taken as input by the next command.
By the same rule,
echo -e "a\nb" | jq -Rc > test.txt will produce an error, but echo -e "a\nb" | jq -Rc . > test.txt will write the result of the command into the file.
I have the following variable. I want to search it for the pattern "/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/".
export str='16/02/02 11:29:22 INFO mortbay.log: State being saved: {"#class":"com.paypal.fpti.hadoop.copy.FPTICopyState","timestamp":0,"state":"Running","name":"com.paypal.fpti.hadoop.copy.FPTICopyState","id":"99c7cba7-d211-4845-97a1-c34168a91b22","subStates":{"com.paypal.fpti.hadoop.copy.CopyToLocalJob_fpti-raw-data-4_2016/02/02/10/":{"#class":"com.paypal.fpti.hadoop.copy.CopyToJobState","timestamp":0,"state":"Stopped","name":"com.paypal.fpti.hadoop.copy.CopyToJobState","id":"99034acb-cfad-41a0-89ed-e2731b1f82ec","subStates":null,"instanceState":"PostDone","window":"2016-02-02T10:00:00.000Z","datasetname":"fpti-raw-data-4","sourceDir":"/fpti/v2/hdfs_writer_4//2016/02/02/10/","localDir":"/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_4//2016/02/02/10//"},"com.paypal.fpti.hadoop.copy.CopyToLocalJob_fpti-raw-data_2016/02/02/10/":{"#class":"com.paypal.fpti.hadoop.copy.CopyToJobState","timestamp":0,"state":"Stopped","name":"com.paypal.fpti.hadoop.copy.CopyToJobState","id":"40325dec-0fe2-4025-8258-f896f957ddf0","subStates":null,"instanceState":"PostDone","window":"2016-02-02T10:00:00.000Z","datasetname":"fpti-raw-data","sourceDir":"/fpti/v2/hdfs_writer//2016/02/02/10/","localDir":"/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp//2016/02/02/10//"},"com.paypal.fpti.hadoop.copy.CopyToLocalJob_fpti-raw-data-1_2016/02/02/10/":{"#class":"com.paypal.fpti.hadoop.copy.CopyToJobState","timestamp":0,"state":"Stopped","name":"com.paypal.fpti.hadoop.copy.CopyToJobState","id":"5216f8c1-2cfa-4eac-a390-f4d2bcd6584f","subStates":{},"instanceState":"PostDone","window":"2016-02-02T10:00:00.000Z","datasetname":"fpti-raw-data-1","sourceDir":"/fpti/v2/hdfs_writer_1//2016/02/02/10/","localDir":"/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_1//2016/02/02/10//"},"com.paypal.fpti.hadoop.copy.CopyToLocalJob_fpti-raw-data-2_2016/02/02/10/":{"#class":"com.paypal.fpti.hadoop.copy.CopyToJobState","timestamp":0,"state":"Stopped","name":"com.paypal.fpti.hadoop.copy.CopyToJobState
","id":"5fcd0b6e-3df9-4f82-a76f-bc8ff1493623","subStates":{},"instanceState":"PostDone","window":"2016-02-02T10:00:00.000Z","datasetname":"fpti-raw-data-2","sourceDir":"/fpti/v2/hdfs_writer_2//2016/02/02/10/","localDir":"/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_2//2016/02/02/10//"},"com.paypal.fpti.hadoop.copy.CopyToLocalJob_fpti-raw-data-3_2016/02/02/10/":{"#class":"com.paypal.fpti.hadoop.copy.CopyToJobState","timestamp":0,"state":"Stopped","name":"com.paypal.fpti.hadoop.copy.CopyToJobState","id":"6ec9223a-fcf0-447a-b9ae-2020e3232f6d","subStates":{},"instanceState":"PostDone","window":"2016-02-02T10:00:00.000Z","datasetname":"fpti-raw-data-3","sourceDir":"/fpti/v2/hdfs_writer_3//2016/02/02/10/","localDir":"/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_3//2016/02/02/10//"},"com.paypal.fpti.hadoop.copy.CopyToLocalJob_fpti-raw-data-5_2016/02/02/10/":{"#class":"com.paypal.fpti.hadoop.copy.CopyToJobState","timestamp":0,"state":"Stopped","name":"com.paypal.fpti.hadoop.copy.CopyToJobState","id":"d123742c-8a55-4e25-bfa0-0a97f6ed25d7","subStates":{},"instanceState":"PostDone","window":"2016-02-02T10:00:00.000Z","datasetname":"fpti-raw-data-5","sourceDir":"/fpti/v2/hdfs_writer_5//2016/02/02/10/","localDir":"/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_5//2016/02/02/10//"}},"copystate":"CopyToLocalDone","start":"2016-02-02T11:21:24.678Z","end":null,"window":"2016-02-02T10:00:00.000Z","retryCount":0}'
I tried the following, but it returns only the first occurrence:
[ggangadharan#phxbastion2 ~]$ echo $str | awk '{match($0, "/x/home[/,a-z,0-9,_]+*", a)}END{print a[0]}'
/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_4//2016/02/02/10//
But I want output like the following:
/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_4//2016/02/02/10//
/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp//2016/02/02/10//
/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_1//2016/02/02/10//
/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_2//2016/02/02/10//
/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_3//2016/02/02/10//
/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_5//2016/02/02/10//
Can somebody help me with how to use awk for this scenario?
Thanks in advance.
I'm not sure how to hack this in awk, but you can safely use grep -E (egrep) with -o here. Quote the pattern so the shell does not try to glob-expand it:
$ echo "$str" | grep -Eo '/x/home[/a-z0-9_]+'
/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_4//2016/02/02/10//
/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp//2016/02/02/10//
/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_1//2016/02/02/10//
/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_2//2016/02/02/10//
/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_3//2016/02/02/10//
/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_5//2016/02/02/10//
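If awk is required, one way to get every occurrence rather than only the first is to call match() in a loop and cut off the consumed part of the line each time. This is a sketch that uses only standard awk features (match, RSTART, RLENGTH, substr); the input string is a shortened, made-up stand-in for $str:

```shell
# Shortened, hypothetical stand-in for $str.
s='{"localDir":"/x/home/a/tmp_4//2016/"},{"localDir":"/x/home/a/tmp//2016/"}'

paths=$(printf '%s\n' "$s" | awk '{
  rest = $0
  # match() sets RSTART and RLENGTH; print the hit, then continue after it.
  while (match(rest, "/x/home[/a-z0-9_]+")) {
    print substr(rest, RSTART, RLENGTH)
    rest = substr(rest, RSTART + RLENGTH)
  }
}')

printf '%s\n' "$paths"
```

The regex is given as a string rather than a /.../ literal so that the slashes do not need escaping.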
Using "significant splitting" in AWK:
$ awk -v RS="\"" '/\/x\/home\/pp_dt_fpti_batch\/stampy_copy_orchestration\//' <<< "$str"
which gives
/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_4//2016/02/02/10//
/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp//2016/02/02/10//
/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_1//2016/02/02/10//
/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_2//2016/02/02/10//
/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_3//2016/02/02/10//
/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_5//2016/02/02/10//
You specified /x/home/pp_dt_fpti_batch/stampy_copy_orchestration/ for your search pattern, so I used that. If you want something different, use something different.
This separates input into records by a quote " (set RS to ", escaped in the shell). Any record matching the regular expression is printed. Input is given from the shell with the string $str. Maybe this is more readable:
$ awk -v RS='"' '/regexp/' <<< "$str"
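To see what the record splitting does, it may help to print each record with its number before filtering. The sketch below uses a small invented JSON string instead of $str:

```shell
# Small, hypothetical stand-in for $str.
s='{"k":"/other/","localDir":"/x/home/a/tmp/"}'

# With RS set to a double quote, every quoted string becomes its own record.
records=$(printf '%s' "$s" | awk -v RS='"' '{ print NR ": " $0 }')
printf '%s\n' "$records"

# Filtering the records by regex then yields just the path.
matched=$(printf '%s' "$s" | awk -v RS='"' '/^\/x\/home\//')
printf '%s\n' "$matched"
```

Because each JSON string ends up as a separate record, anchoring the regex with ^ restricts the match to strings that start with the path prefix.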
Here are two approaches using a JSON-aware command-line tool, here jq.
In both cases we assume that the string of interest is embedded in the
JSON object contained in $str
(1) In the following, we simply pretty-print the JSON object and grep for
the string of interest in case it appears in a surprising spot. Further trimming of the result can easily be done (e.g. using sed) as desired:
$ sed 's/^[^{]*//' <<< "$str" | jq '.[]' | fgrep /x/home/pp_dt_fpti_batch/stampy_copy_orchestration/
"localDir": "/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_4//2016/02/02/10//"
"localDir": "/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp//2016/02/02/10//"
"localDir": "/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_1//2016/02/02/10//"
"localDir": "/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_2//2016/02/02/10//"
"localDir": "/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_3//2016/02/02/10//"
"localDir": "/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_5//2016/02/02/10//"
(2) The following query is appropriate if we are only interested in a
match if it occurs in an object as a value corresponding to the key "localDir":
sed 's/^[^{]*//' <<< "$str" |
jq -r '..
| select(.localDir?)
| .localDir
| select(test("/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/"))'
/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_4//2016/02/02/10//
/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp//2016/02/02/10//
/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_1//2016/02/02/10//
/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_2//2016/02/02/10//
/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_3//2016/02/02/10//
/x/home/pp_dt_fpti_batch/stampy_copy_orchestration/tmp_5//2016/02/02/10//