Unix search command - unix

I need Unix search command which is used to find a particular word in a line and that line should not contain particular word.
For eg,
Line 1: Java is my World.
Line 2: Java is OOPs language.
I need unix command which returns lines contains "Java" not "World" in that line.
Expected Output:
Java is OOPs language.
Thanks,
Srinivasan R.

Use grep:
cat <File> | grep 'Java' | grep -v 'World'
The -v option inverts selection (i.e. lines NOT containing pattern)
You could also do it with awk and give both conditions together:
cat <File> | awk '/Java/ && !/World/'

First, list out all the files that have the word 'Java' in them and then inversely match 'World' i.e. select those lines which DO NOT have 'World' in them:
cat file | grep 'Java' | grep --invert-match 'World'

Command for particular word
grep -R dirname
grep -R dirname > filename (to append output)
to find a particular word in a line and also search file name where that word is present.

Related

how to list a specific string or number in a file in Unix

for Example if your file has following lines
1=10200|2=2343i|3=otit|5=89898|54=9546i96i|10=2459
1=10200|54=9546i96i|10=2459|2=2343i|3=otit|5=8
1=10200|5=IGY|14=897|459=122|132=1|54=9546i96i|10=2459
1=10200|2=2343i|5=0|54=9546i96i
The output should be
5=89898
5=8
5=IGY
5=0
You could use grep with the -o flag to return only the regexp matches.
Assuming you have a file.txt that you want to parse:
cat file.txt | grep -o -E "(\||^)5=[^|]*" | grep -o "5=[^|]*"
This will match anything that starts with 5= up until the first |.
By running this command on the input you provided I get:
5=89898
5=8
5=IGY
5=0
Cheers
Edit: as Walter A suggested, my previous solution did not cover all cases.
I have added an extra parsing step: first, you get all strings that match 5=... at the start of a line, or |5=..., and then you remove the |.
Use (^|[|]) for matching start of field (start of line or |) and remember/match string until next | or end-of-line.
sed -nr 's/.*(^|[|])(5=[^|]*).*/\2/p' file

unix combine grep w and v command

I want to search a file and include the text #!/bin/bash, but exclude any other line that has a # sign. These two commands: grep -w '#!/bin/bash' file and grep -v '^#' file each do one part of this job. I would like this to be a single command, so here's what I've tried.
grep -w '#!/bin/bash' | grep -v '^#' file
This excludes lines beginning with #, but doesn't include the line #!/bin/bash
grep -w '#!/bin/bash' -v '^#' file
This just prints every line but #!/bin/bash
grep "^[^#]\|^#\!/bin/bash$" test.sh
Explanation:
^[^#] means starts by something different that #
\| is a or
^#\!/bin/bash$ is the exact line #!/bin/bash
So .. it looks as if you're trying to strip comments from bash files without removing their shebang.
The grep command can search for regular expressions, but isn't so good at applying rules of logic. You could do something like this:
grep -v '^#[^!]' input.sh
But you'd fail to strip comments that are affixed to the ends of lines. Note that I'm being a little more liberal with this regex, since it's entirely possible that a script might use something other than /bin/bash for its shebang. :-)
Another possibility would be to use awk. This lets you apply logic that cannot be expressed within a regular expression. For example, if you want to keep the commented line only if it is a shebang on the first line of the file, and remove all other comments, awk can express that as follows:
awk '
NF==1 && /^#!/; # if we're on the first line and find shebang, print.
/^#/ { next } # if this is a comment line, skip it.
1 # print everything else.
' input.sh

ls and xargs to output specific file extentions

I am trying to use ls and xargs to print specific file extensions .bam and .vcf witout the path. The below is close but when I | the two ls commands I get the error below. Separated it works fine except each file is printed on a newline (my actual data has hundreds of files and make it easier to read). Thank you :).
files in directory
1.bam
1.vcf
2.bam
2.vcf
command with error
ls /home/cmccabe/Desktop/NGS/test/R_folder/*.bam | xargs -n1 basename | ls /home/cmccabe/Desktop/NGS/test/R_folder/*.vcf | xargs -n1 basename >> /home/cmccabe/Desktop/NGS/test/log
xargs: basename: terminated by signal 13
desired output
1.bam 1.vcf
2.bam 2.vcf
You cannot pipe output into ls and have it print that with its other output. You should give the parameters to the first one and it will output everything.
ls *.a *.b *.c | xargs ...q
ls isn't really doing anything for you currently, it's the shell that's listing all your files. Since you're piping ls's output around, you're actually vulnerable to dangerous file names.
basename can take multiple arguments with the -a option:
basename -a "path/to/files/"*.{bam,vcf}
To print that in two columns, you could use printf via xargs, with sort for... sorting. The -z or -0 flags throughout cause null bytes to be used as the filename separators:
basename -az "path/to/files/"*.{bam,vcf} | sort -z | xargs -0n 2 printf "%b\t%b\n"
If you're going to be doing any more processing after printing to columns, you may want to replace the %bs in the printf format with %qs. That will escape non-printable characters in the output, but might look a bit ugly to human eyes.

grepping lines from a document using xargs

Let's say I have queries.txt.
queries.txt:
cat
dog
123
now I want to use them are queries to find lines in myDocument.txt using grep.
cat queries.txt | xargs grep -f myDocument.txt
myDocument has lines like
cat
i have a dog
123
mouse
it should return the first 3 lines. but it's not. instead, grep tries to find them as file names. what am i doing wrong?
Here, you just need:
grep -f queries.txt myDocument.txt
This causes grep to read the regular expressions from the file queries.txt and then apply them to myDocument.txt.
In the xargs version, you were effectively writing:
grep -f myDocument.txt cat dog 123
If you absolutely must use xargs, then you'll need to write:
xargs -I % grep -e % myDocument.txt < queries.txt
This avoids a UUOC — Useless Use of cat – award by redirecting standard input from queries.txt. It uses the -I % option to specify where the replacement text should go in the command line. Using the -e option means that if the pattern is, say --help, you won't run into problems with (GNU) grep treating that as an argument (and therefore printing its help message).
The grep -e option will take a pattern string as an argument. -f treats the argument as a file name of a file with patterns in it.

Unix Pipes for Command Argument [duplicate]

This question already has answers here:
How to pass command output as multiple arguments to another command
(5 answers)
Read expression for grep from standard input
(1 answer)
Closed last month.
I am looking for insight as to how pipes can be used to pass standard output as the arguments for other commands.
For example, consider this case:
ls | grep Hello
The structure of grep follows the pattern: grep SearchTerm PathOfFileToBeSearched. In the case I have illustrated, the word Hello is taken as the SearchTerm and the result of ls is used as the file to be searched. But what if I want to switch it around? What if I want the standard output of ls to be the SearchTerm, with the argument following grep being PathOfFileToBeSearched? In a general sense, I want to have control over which argument the pipe fills with the standard output of the previous command. Is this possible, or does it depend on how the script for the command (e.g., grep) was written?
Thank you so much for your help!
grep itself will be built such that if you've not specified a file name, it will open stdin (and thus get the output of ls). There's no real generic mechanism here - merely convention.
If you want the output of ls to be the search term, you can do this via the shell. Make use of a subshell and substitution thus:
$ grep $(ls) filename.txt
In this scenario ls is run in a subshell, and its stdout is captured and inserted in the command line as an argument for grep. Note that if the ls output contains spaces, this will cause confusion for grep.
There are basically two options for this: shell command substitution and xargs. Brian Agnew has just written about the former. xargs is a utility which takes its stdin and turns it into arguments of a command to execute. So you could run
ls | xargs -n1 -J % grep -- % PathOfFileToBeSearched
and it would, for each file output by ls, run grep -e filename PathOfFileToBeSearched to grep for the filename output by ls within the other file you specify. This is an unusual xargs invocation; usually it's used to add one or more arguments at the end of a command, while here it should add exactly one argument in a specific place, so I've used -n and -J arguments to arrange that. The more common usage would be something like
ls | xargs grep -- term
to search all of the files output by ls for term. Although of course if you just want files in the current directory, you can this more simply without a pipeline:
grep -- term *
and likewise in your reversed arrangement,
for filename in *; do
grep -- "$#" PathOfFileToBeSearched
done
There's one important xargs caveat: whitespace characters in the filenames generated by ls won't be handled too well. To do that, provided you have GNU utilities, you can use find instead.
find . -mindepth 1 -maxdepth 1 -print0 | xargs -0 -n1 -J % grep -- % PathOfFileToBeSearched
to use NUL characters to separate filenames instead of whitespace

Resources