Output on a single line - unix

The following code works as expected, but I cannot format the output.
It will print something like this:
mysql
test
someDB
I want the output on a single line:
mysql test someDB
I tried using sed in the script but it did not work.
#!/bin/sh
for dbName in `mysqlshow -uroot -pPassWord | awk '{print $2}'`
do
echo "$dbName" | egrep -v 'Databases|information_schema';
done

Whenever you want to combine all lines of output into one, you can also use xargs.
e.g.
find
.
./zxcv
./fdsa
./treww
./asdf
./ewr
becomes:
find | xargs echo
. ./zxcv ./fdsa ./treww ./asdf ./ewr
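Applied to the script from the question (just a sketch, not tested against a real mysqlshow), you can pipe the whole loop into xargs echo:
#!/bin/sh
for dbName in `mysqlshow -uroot -pPassWord | awk '{print $2}'`
do
  echo "$dbName" | egrep -v 'Databases|information_schema'
done | xargs echo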

you can use tr to get your output to one line
<output from somewhere> | tr "\n" " "

To do a variation combining naumcho's and rsp's answers that will work for small numbers of results:
echo $(mysqlshow -uroot -pPassWord | awk '{print $2}' | egrep -v 'Databases|information_schema')

The newlines are most likely generated by the echo command; the following should do the same without them (not tested):
mysqlshow -uroot -pPassWord | awk '{print $2}' | egrep -v 'Databases|information_schema'
and has the added bonus of spawning just one grep process instead of one per database name.
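If awk's output still arrives one name per line, a sketch (untested) that combines this with the tr suggestion above would force everything onto a single line:
mysqlshow -uroot -pPassWord | awk '{print $2}' | egrep -v 'Databases|information_schema' | tr "\n" " "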

Related

Unix piping in system() does not work in R

I am using the system() command to run a Bash command in R, but every time I try to pipe the results of one command into the next (using '|'), I get some error.
For example:
system('grep ^SN bam_stats.txt | cut -f 2- | sed -n 8p | awk -F "\t" '{print $2}'')
returns the error:
Error: unexpected '{' in "system('grep ^SN bam_stats.txt | cut -f 2- | sed -n 8p | awk -F "\t" '{"
If I try to remove awk -F "\t" '{print $2}' so that I'm left with
system('grep ^SN bam_stats.txt | cut -f 2- | sed -n 8p')
I get the following:
/usr/bin/grep: 2-: No such file or directory
[1] 2
I have to keep removing parts of it until I am left with only system('grep ^SN bam_stats.txt'), i.e. no pipes at all, for it to work.
Here is a sample from the file 'bam_stats.txt' from which I'm extracting information:
SN filtered sequences: 0
SN sequences: 137710356
SN is sorted: 1
SN 1st fragments: 68855178
SN last fragments: 68855178
SN reads mapped: 137642653
SN reads mapped and paired: 137602018 # paired-end technology bit set + both mates mapped
SN reads unmapped: 67703
SN percentage of properly paired reads (%): 99.8
Can someone tell me why piping is not working? Apologies if this is a stupid question. Please let me know if I should provide more information.
Thank you in advance.
I don't know R, but IF R's implementation of system() just passes its argument to a shell then, in terms of standard Unix quoting, your example
system('grep ^SN bam_stats.txt | cut -f 2- | sed -n 8p | awk -F "\t" '{print $2}'')
contains 2 strings within quotes and a string in the middle that's outside of quotes:
Inside: grep ^SN bam_stats.txt | cut -f 2- | sed -n 8p | awk -F "\t"
Outside: {print $2}
Inside: <a null string>
because the 2 quotes in the middle around '{print $2}' are ending the first quoted string then later starting a second quoted string.
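One way to keep the pipeline intact (a minimal sketch, assuming system() hands the string to a shell, and not tested against your file) is to use double quotes for the R string so the single quotes around the awk program survive; note that R turns \t inside the string into a literal tab, which awk's -F accepts:
system("grep '^SN' bam_stats.txt | cut -f 2- | sed -n 8p | awk -F '\t' '{print $2}'")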
You don't need sed, grep, or cut if you're using awk anyway though so try just this:
system('awk -F"\t" "/^SN/ && (++cnt==8){print \$3}" bam_stats.txt')

using awk to get column values and then running another command on values and printing them

I've always used Stack Overflow to get help with issues, but this is my first post. I am new to UNIX scripting and I was given a task to get the values of column two and then run a command on them. The command I am supposed to run is 'echo -n "$2" | openssl dgst -sha1;', which is a function to hash a value. My problem is not hashing one value, but hashing them all and then printing them. Can someone maybe help me figure this out? This is how I am starting, but I think the path I am going down is wrong.
NOTE: this is a CSV text file and I know I need to use the awk command for this.
awk 'BEGIN { FS = "," } ; { print $2 }'
while [ "$2" != 0 ];
do
echo -n "$2" | openssl dgst -sha1
done
This prints the second column in its entirety and also prints some type of hashed value.
Sorry for the long first post, just trying to be as specific as possible. Thanks!
You don't really need awk just for extracting the second column. You can do it by using the bash read builtin and setting IFS to the delimiter.
while IFS=, read -ra line; do
[[ ${line[1]} != 0 ]] && echo -n "${line[1]}" | openssl dgst -sha1
done < inputFile
You should probably post some sample input data and the error you are getting so that someone can debug your existing code better.
This will do the trick:
$ awk '{print $2}' file | xargs -n1 openssl dgst -sha1
Use awk to print the second field in the file and xargs with the -n1 to pass each record separately to openssl.
If by CSV you mean each field is separated by a comma then you need to add -F, to awk.
$ awk -F, '{print $2}' file | xargs -n1 openssl dgst -sha1
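Note that openssl dgst treats its non-option arguments as file names to hash, so if the column values are strings rather than files, you probably want to feed each value on stdin, as in the echo -n form from the question. A minimal sketch, assuming the comma-separated input is named file:
awk -F, '{print $2}' file | while IFS= read -r value; do
    printf '%s' "$value" | openssl dgst -sha1
done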

Passing output from one command as argument to another [duplicate]

I have this for loop:
for i in `ls -1 access.log*`; do tail $i |awk {'print $4'} |cut -d: -f 1 |grep - $i > $i.output; done
ls will give access.log, access.log.1, access.log.2 etc.
tail will give me the last line of each file, which looks like: 192.168.1.23 - - [08/Oct/2010:14:05:04 +0300] etc. etc. etc
awk+cut will extract the date (08/Oct/2010 - but different in each access.log), which will allow me to grep for it and redirect the output to a separate file.
But I cannot seem to pass the output of awk+cut to grep.
The reason for all this is that those access logs include lines with more than one date (06/Oct, 07/Oct, 08/Oct) and I just need the lines with the most recent date.
How can I achieve this?
Thank you.
As a side note, tail displays the last 10 lines by default.
A possible solution would be to grep this way:
for i in `ls -1 access.log*`; do grep $(tail $i | awk '{print $4}' | cut -d: -f 1 | sed 's/\[/\\[/') $i > $i.output; done
Why don't you break it up into steps?
for file in access.log*
do
    what=$(tail "$file" | awk '{print $4}' | cut -d: -f 1)
    grep "$what" "$file" >> output
done
You shouldn't use ls that way. Also, ls -l gives you information you don't need. The -f option to grep will allow you to pipe the pattern to grep. Always quote variables that contain filenames.
for i in access.log*; do awk 'END {sub(":.*","",$4); print substr($4,2)}' "$i" | grep -f - "$i" > "$i.output"; done
I also eliminated tail and cut since AWK can do their jobs.
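To see what that AWK program extracts, here it is run on the sample last line from the question (only the bracketed timestamp field matters to it):
echo '192.168.1.23 - - [08/Oct/2010:14:05:04 +0300]' | awk 'END {sub(":.*","",$4); print substr($4,2)}'
# prints 08/Oct/2010, which grep -f - then reads as the pattern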
Umm...
Use xargs or backticks.
man xargs
or
http://tldp.org/LDP/Bash-Beginners-Guide/html/sect_03_04.html , section 3.4.5. Command substitution
you can try:
grep "$(stuff to get piped over to be grep-ed)" file
I haven't tried this, but my answer applied here would look like this:
grep "$(for i in `ls -1 access.log*`; do tail $i |awk {'print $4'} |cut -d: -f 1 |grep - $i > $i.output; done)" $i

zcat to grep with file name

ls -ltr | grep 'Mar 4' | awk '{print $9}' | xargs zcat -fq | grep 12345
I'm now using this command to list the records that contain my numeric string. How can I also get this command to print the name of the file the strings were found in?
Thanks.
Use zgrep.
BTW, what you're trying to do can be done with find:
find -newermt 'Mar 4' -and ! -newermt 'Mar 5' -exec zgrep -l '12345' {} \;
If you use zgrep instead of zcat+grep (which does the same), you do it like this:
ls -ltr | grep 'Mar 4' | awk '{print $9}' | xargs zgrep 12345
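One caveat: if xargs happens to pass only a single file, grep (and hence zgrep) omits the file name, so you may want to force it with the standard -H option (a small addition, assuming GNU zgrep, which accepts grep's options):
ls -ltr | grep 'Mar 4' | awk '{print $9}' | xargs zgrep -H 12345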
Pass the -t option to xargs, causing it to print the command it is running (the zcat command, including the filename) before running it. The command is printed to stderr, so it will not interfere with your pipe.

How to keep a file's format if you use the uniq command (in shell)?

In order to use the uniq command, you have to sort your file first.
But in the file I have, the order of the information is important, so how can I keep the original format of the file but still get rid of the duplicate content?
Another awk version:
awk '!_[$0]++' infile
This awk keeps the first occurrence. Same algorithm as other answers use:
awk '!($0 in lines) { print $0; lines[$0]; }'
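For example, with a hypothetical input containing the lines b, a, b, c, a, the first occurrence of each line is kept in its original position:
printf 'b\na\nb\nc\na\n' | awk '!_[$0]++'
# b
# a
# c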
Here's one that only needs to store duplicated lines (as opposed to all lines) using awk:
sort file | uniq -d | awk '
FNR == NR { dups[$0] }
FNR != NR && (!($0 in dups) || !lines[$0]++)
' - file
There's also the "line-number, double-sort" method.
nl -n ln | sort -u -k 2 | sort -k 1n | cut -f 2-
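For instance, to de-duplicate while keeping the original order (a sketch; infile and outfile are placeholder names):
nl -n ln infile | sort -u -k 2 | sort -k 1n | cut -f 2- > outfile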
You can run uniq -d on the sorted version of the file to find the duplicate lines, then run some script that says:
if this_line is in duplicate_lines {
    if not i_have_seen[this_line] {
        output this_line
        i_have_seen[this_line] = true
    }
} else {
    output this_line
}
Using only uniq and grep:
Create d.sh:
#!/bin/sh
sort "$1" | uniq > "${1}_uniq"
for line in $(cat "$1"); do
    cat "${1}_uniq" | grep -m1 "$line" >> "${1}_out"
    cat "${1}_uniq" | grep -v "$line" > "${1}_uniq2"
    mv "${1}_uniq2" "${1}_uniq"
done
rm "${1}_uniq"
Example:
./d.sh infile
You could use some horrible O(n^2) thing, like this (Pseudo-code):
file2 = EMPTY_FILE
for each line in file1:
if not line in file2:
file2.append(line)
This is potentially rather slow, especially if implemented at the Bash level. But if your files are reasonably short, it will probably work just fine, and would be quick to implement (not line in file2 is then just grep -v, and so on).
Otherwise you could of course code up a dedicated program, using some more advanced data structure in memory to speed it up.
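A rough shell rendering of that pseudo-code (just a sketch: it assumes the input is file1, writes to file2, and uses grep -qxF as the "not line in file2" test):
: > file2                                   # start with an empty output file
while IFS= read -r line; do
    # append the line only if an identical line is not already in file2
    grep -qxF -- "$line" file2 || printf '%s\n' "$line" >> file2
done < file1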
for line in $(sort file1 | uniq); do
    grep -n -m1 "$line" file1 >> out
done
sort -n out
First do the sort,
then for each unique value grep for the first match (-m1)
and preserve the line numbers,
then sort the output numerically (-n) by line number.
You could then remove the line numbers with sed or awk.
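A sketch of that last step: grep -n prefixes each line with its number and a colon, so sed can strip that prefix after the numeric sort:
sort -n out | sed 's/^[0-9]*://'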
