jq -c fails in AWS pipeline - jq

I use cat file.json | jq -c to print a minified JSON file to the logs in an AWS pipeline step. I've tested that it works locally, but in the pipeline it fails and prints the usage message instead:
jq - commandline JSON processor [version 1.5]
Usage: jq [options] <jq filter> [file...]
...
Why does it fail in the pipeline but not locally?

That's not the right question. jq -c (with no further argument) should fail. According to the very output you obtained from jq, jq always requires at least one argument: a program.[1] The fact that it sometimes doesn't fail when no program is provided is a bug, one that appears to have been fixed in 1.6.[2]
If you simply want to reformat the JSON, you can use the trivial program ..
cat input.json | jq -c .
If the input really is a file, the following is better:
jq -c . input.json
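For example, with a small hypothetical input.json, the result looks like this:
$ cat input.json
{
  "name": "demo",
  "items": [1, 2, 3]
}
$ jq -c . input.json
{"name":"demo","items":[1,2,3]}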
[1] Or the name of a file containing the program, if -f is used.
[2] Meaning I wasn't able to reproduce the lack of error in 1.6 after trying the Windows, Cygwin and Ubuntu builds.

The solution was to use cat file.json | jq -c . (note the added dot) in AWS.
After some experimenting, I found out that this happens only in jq 1.5 (I had jq 1.6 locally), and only when using jq -c instead of jq -c . in a detached terminal. My AWS pipeline uses jq 1.5.
I reproduced it locally by downloading jq 1.5 and running:
$ cat input.json | ./jq1.5 -c >output.txt &
[1] 822
jq - commandline JSON processor [version 1.5]
Usage: C:\Workspace\jq\jq1.5.exe [options] <jq filter> [file...]
Note that the ampersand executes it in the background which simulates a detached terminal.
What works:
cat input.json | ./jq1.6 -c >output.txt &
cat input.json | ./jq1.5 -c . >output.txt &
cat input.json | ./jq1.5 -c >output.txt
To sum up: if you use jq 1.5 in a detached terminal, also pass the dot.
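If you want to confirm which jq your pipeline image actually ships before relying on its behaviour, you can print the version in the step's logs (the version string looks like jq-1.5 or jq-1.6); the explicit identity program then works on either version:
jq --version        # prints e.g. "jq-1.5" or "jq-1.6"
jq -c . file.json   # explicit '.' works on both versions, attached terminal or not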

Related

msys / mingw command line requires or rejects exe extension based on whether stdout is redirected with ConEmu console

The underlying cause of this problem is described elsewhere, with partial workarounds provided.
For example: stdout is not a tty and stdin is not a tty
An example of a command line I'm having problems with in MSYS2 or MINGW64 environments is this:
# psql -c '\d' | grep public
stdout is not a tty
Here's the output, if issued without piping to grep:
# psql -c '\d'
        List of relations
 Schema |      Name      |   Type   |  Owner
--------+----------------+----------+----------
 public | tagname        | table    | postgres
 public | tagname_id_seq | sequence | postgres
(2 rows)
In order to redirect stdout, it's apparently necessary to edit the command line, changing psql to psql.exe. This modification does the trick:
# psql.exe -c '\d' | grep public
 public | tagname        | table    | postgres
 public | tagname_id_seq | sequence | postgres
This version works, whether or not stdout is redirected:
# psql.exe -c '\d'
        List of relations
 Schema |      Name      |   Type   |  Owner
--------+----------------+----------+----------
 public | tagname        | table    | postgres
 public | tagname_id_seq | sequence | postgres
(2 rows)
Note that the problem only exists for some programs. For example, /usr/bin/find is indifferent to whether the .exe extension is specified. Also, the cygwin version of psql.exe does not suffer from this limitation.
The workaround of appending .exe could be hidden with an alias if you could always call psql as psql.exe, but there are problems with interactive sessions when called with the extension. (in ConEmu terminal under MSYS_NT-10.0-19042, for example)
So my question is this: is it possible to create a wrapper program (for example, in golang or C, or a shell script) to hide this problem? The goal would be to support optionally redirectable command lines without requiring the .exe extension.
Assuming the wrapper is named "fixtty", it would spawn a command line that could be redirected or not. In other words, neither of these command lines would fail with an error message:
# fixtty psql -c '\d'
# fixtty psql -c '\d' | grep public
A wrapper script (call it fixtty) that hides the problem:
#!/bin/bash
if [ -t 1 ]; then
  # stdout is a terminal
  "$@"
else
  # stdout is a file or pipe
  PROG=$1 ; shift
  set -- "$PROG.exe" "$@"   # append .exe suffix
  "$@"
fi
Usage:
$ fixtty psql -c '\d' # unfiltered stdout
$ fixtty psql -c '\d' | grep public # filtered
$ fixtty psql # launch interactive session
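To make the workaround transparent, you could hide the wrapper behind an alias for each affected command (a sketch, assuming fixtty is on your PATH; psql is just the example from above):
# ~/.bashrc (or the startup file of the shell you use)
alias psql='fixtty psql'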

fetch the value of number of active threads in a process

I am trying to fetch the number of threads of a process on UNIX from the command line. After going through the man page of the ps command, I learnt that the following command:
ps -o nlwp <pid>
returns the number of threads spawned in a process.
Whenever I execute the above command, it returns:
NLWP
7
Now, I want to drop the NLWP header and the space before the 7. That is, I am just interested in the value, as I will be using it in a script that I am writing for unit testing.
Is it possible to fetch only the value and ignore everything else (the NLWP title and the whitespace)?
You can always use the --no-headers option in ps to get rid of the headers.
In that case, use awk to just print the first value:
ps --no-headers -o nlwp <pid> | awk '{print $1}'
Or tr to remove the spaces:
ps --no-headers -o nlwp <pid> | tr -d ' '
If --no-headers is not supported in your version of ps, either of these will do it:
ps -o nlwp <pid> | awk 'END {print $1}'
ps -o nlwp <pid> | tail -1 | tr -d ' '
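Another portable way to drop the header is to give the column an empty label with nlwp=. A hypothetical helper for the kind of unit-test script mentioned above might look like this (check_threads, pid and expected are made-up names):
#!/bin/sh
# check_threads <pid> <expected> -- exit non-zero if the thread count differs
pid=$1
expected=$2
# 'nlwp=' sets an empty column header, so only the value is printed
actual=$(ps -o nlwp= -p "$pid" | tr -d ' ')
if [ "$actual" -ne "$expected" ]; then
    echo "expected $expected threads, got $actual" >&2
    exit 1
fi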

Unix grep command

I have a utility script that displays information about a deployed Java app. Here is an example of its output:
Name: TestAPP
Version : SNAPSHOT
Type : ear, ejb, webservices, web
Source path : /G/bin/app/TESTAPP_LIVE_1.1.9.1.1.ear
Status : enabled
Is it possible to grep the Version and Source path values using the grep command? Right now I'm able to do this using the following command:
| grep Version
But it outputs the whole line (e.g. Version : SNAPSHOT) when I need only the value (e.g. SNAPSHOT) to use in further script commands.
grep Version | cut -d ':' -f 2
Here is a pure grep solution.
Use the -P option for Perl-compatible regex mode, and the -o option to print only the matching part.
grep -Po "(?<=^Version : ).*"
Here is what you would do for Source path:
grep -Po "(?<=^Source path : ).*"
It uses a positive lookbehind.
Here's a solution using awk if you're interested:
grep Version | awk '{print $3}'
$3 means to print the third word from that line.
Note that:
This displays one word only
This assumes you have spaces around the colon (and therefore the version is actually the third "word"). If you don't, use $2 instead.
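If you want both values in a single pass, one awk invocation can do it by splitting on the colon (a sketch; show_app_info stands in for your utility script, and the ' *: *' separator tolerates missing spaces around the colon):
show_app_info | awk -F' *: *' '/^Version/ {print $2} /^Source path/ {print $2}'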

Pipe output of cat to cURL to download a list of files

I have a list of URLs in a file called urls.txt. Each line contains one URL. I want to download all of the files at once using cURL. I can't seem to get the right one-liner down.
I tried:
$ cat urls.txt | xargs -0 curl -O
But that only gives me the last file in the list.
This works for me:
$ xargs -n 1 curl -O < urls.txt
I'm on FreeBSD. Your xargs may work differently.
Note that this runs sequential curls, which you may view as unnecessarily heavy. If you'd like to save some of that overhead, the following may work in bash:
$ mapfile -t urls < urls.txt
$ curl ${urls[@]/#/-O }
This saves your URL list to an array, then expands the array with options to curl to cause the targets to be downloaded. The curl command can take multiple URLs and fetch all of them, recycling the existing connection (HTTP/1.1), but it needs the -O option before each one in order to download and save each target. Note that characters within some URLs may need to be escaped to avoid interacting with your shell.
Or if you are using a POSIX shell rather than bash:
$ curl $(printf ' -O %s' $(cat urls.txt))
This relies on printf's behaviour of repeating the format pattern to exhaust the list of data arguments; not all stand-alone printfs will do this.
Note that this non-xargs method also may bump up against system limits for very large lists of URLs. Research ARG_MAX and MAX_ARG_STRLEN if this is a concern.
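On most systems you can check that limit with getconf:
$ getconf ARG_MAX    # maximum combined size of arguments and environment passed to exec, in bytes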
A very simple solution would be the following:
If you have a file 'file.txt' like
url="http://www.google.de"
url="http://www.yahoo.de"
url="http://www.bing.de"
Then you can use curl and simply do
curl -K file.txt
And curl will fetch all the URLs contained in your file.txt!
So if you have control over your input-file-format, maybe this is the simplest solution for you!
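If you only have a plain list of URLs, you could also generate that config file on the fly (a sketch; it assumes one URL per line with no embedded double quotes, and uses curl's --remote-name-all so each entry is saved under its remote filename):
sed 's/^/url = "/; s/$/"/' urls.txt > file.txt
curl --remote-name-all -K file.txt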
Or you could just do this:
cat urls.txt | xargs curl -O
You only need to use the -I parameter when you want to insert the cat output in the middle of a command.
xargs -P 10 with curl
GNU xargs -P can run multiple curl processes in parallel. E.g. to run 10 processes:
xargs -P 10 -n 1 curl -O < urls.txt
This will speed up the download 10x if your maximum download speed is not reached and if the server does not throttle IPs, which is the most common scenario.
Just don't set -P too high or your RAM may be overwhelmed.
GNU parallel can achieve similar results.
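For reference, a roughly equivalent GNU parallel invocation might look like this (a sketch; -a reads the URL list from the file and {} is the per-line placeholder):
parallel -j 10 -a urls.txt curl -O {}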
The downside of those methods is that they don't use a single connection for all files, which is what curl does if you pass multiple URLs to it at once, as in:
curl -o out1.txt http://example.com/1 -o out2.txt http://example.com/2
as mentioned at https://serverfault.com/questions/199434/how-do-i-make-curl-use-keepalive-from-the-command-line
Maybe combining both methods would give the best results? But I imagine that parallelization is more important than keeping the connection alive.
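A sketch of combining the two, assuming GNU xargs and a curl that supports --remote-name-all: give each curl process a batch of URLs so it can reuse one connection within its batch, while still running a few processes in parallel.
# 3 curl processes in parallel, each handed up to 100 URLs to fetch over a reused connection
xargs -P 3 -n 100 curl --remote-name-all < urls.txt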
See also: Parallel download using Curl command line utility
Here is how I do it on a Mac (OSX), but it should work equally well on other systems:
What you need is a text file that contains your links for curl
like so:
http://www.site1.com/subdirectory/file1-[01-15].jpg
http://www.site1.com/subdirectory/file2-[01-15].jpg
.
.
http://www.site1.com/subdirectory/file3287-[01-15].jpg
In this hypothetical case, the text file has 3287 lines and each line is coding for 15 pictures.
Let's say we save these links in a text file called testcurl.txt on the top level (/) of our hard drive.
Now we have to go into the terminal and enter the following command in the bash shell:
for i in `cat /testcurl.txt` ; do curl -O "$i" ; done
Make sure you are using back ticks (`)
Also make sure the flag (-O) is a capital O and NOT a zero
with the -O flag, the original filename will be taken
Happy downloading!
As others have rightly mentioned:
cat urls.txt | xargs -n1 curl -O
However, this paradigm is a very bad idea, especially if all of your URLs come from the same server -- you're not only going to be spawning another curl instance, but will also be establishing a new TCP connection for each request, which is highly inefficient, and even more so with the now ubiquitous https.
Please use this instead:
cat urls.txt | wget -i/dev/fd/0
Or, even simpler:
wget -i/dev/fd/0 < urls.txt
Simplest yet:
wget -iurls.txt

Grep a path containing an environment variable and using it

I'm using tcsh, and I'm trying to grep a path from a file with several IDs. I'm doing:
grep I241149 $ENV_CASTRO/ALL_CMD_LINES.BAK | grep -o \$"ENV_CASTRO.*.asm"
that gets me:
$ENV_CASTRO/central/WS678/test_do_all.asm
but if I try
cp `grep I241149 $ENV_CASTRO/ALL_CMD_LINES.BAK | grep -o \$"ENV_CASTRO.*.asm"` .
it prompts
cp: cannot stat `$ENV_CASTRO/central/WS678/test_do_all.asm': No such file or directory
How do I tell tcsh that the output of grep contains a $ that means it is an environment variable and is not plain text?
Thanks in advance.
eval is your friend ....
eval cp `grep I241149 $ENV_CASTRO/ALL_CMD_LINES.BAK | grep -o \$"ENV_CASTRO.*.asm"` .
I don't have the time to create files to test this.
I hope this helps.
The problem is that the output of the grep command is not being evaluated by the shell, and so variable substitution is not happening.
One way to solve this would be to execute the desired command within another shell, for example,
sh -c "cp `grep I241149 $ENV_CASTRO/ALL_CMD_LINES.BAK | grep -o '$ENV_CASTRO.*.asm'` ."
