Show unique filename only on command grep result [duplicate] - unix

I'm trying to show only unique filenames when I grep for a certain string. Currently I get the same filename several times if the string appears several times inside a file.
For example:
If the string "string12345" appears on 3 lines of filename1.txt and on 3 lines of filename2.txt, then grep 'string12345' *.txt shows 3 occurrences each of filename1.txt and filename2.txt.
What I'm trying to achieve is to show only 1 occurrence each of filename1.txt and filename2.txt. Thank you in advance.

Use the -l flag. For example, given a file test.txt that contains these two lines:
Hello, World!
Hello, World!
Grep search:
$ grep ./ -re "Hello"
./test.txt:Hello, World!
./test.txt:Hello, World!
$ grep ./ -re "Hello" -l
./test.txt
From the manual:
-l, --files-with-matches
Suppress normal output; instead print the name of each input file from which output would normally have been printed. The scanning will stop on the first match.
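Applied to the situation in the question, a minimal sketch could look like the following. The temporary directory and the file contents are made up for illustration; only the file names and the search string come from the question.

```shell
# Hypothetical sample data: two files with three matching lines each, one without.
dir=$(mktemp -d)
printf 'string12345\nother\nstring12345\nstring12345\n' > "$dir/filename1.txt"
printf 'string12345\nstring12345\nstring12345\n' > "$dir/filename2.txt"
printf 'no match here\n' > "$dir/filename3.txt"

# -l prints each matching file name once, however many lines match.
matches=$(cd "$dir" && grep -l 'string12345' *.txt)
printf '%s\n' "$matches"

rm -rf "$dir"
```

Each of filename1.txt and filename2.txt should be listed once, and filename3.txt not at all.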


Unix substitute multiple strings using a reference file [duplicate]

I have a reference file, and I want to use it to replace strings in multiple files in a directory. I am using awk's gsub for that; however, it does not replace only the exact word but every occurrence of the string, including inside longer words. How can I stop that behaviour and replace just the whole word? In this case the word is "IT".
My reference file
$ cat dev_to_prod.config
IT Business
My current data file
$ cat file.txt
Output with current code
awk 'FNR==NR{A[$1]=$2;next}{for(i in A)gsub(i,A[i])}1' dev_to_prod.config file.txt
man awk says:
\< matches the empty string at the beginning of a word.
\> matches the empty string at the end of a word.
Then would you please try:
awk 'FNR==NR{A[$1]=$2;next}{for(i in A)gsub("\\<"i"\\>",A[i])}1' dev_to_prod.config file.txt
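Note that \< and \> are GNU awk extensions. For an awk that lacks them, one possible alternative (a different technique from the answer above, not a drop-in equivalent) is to compare whole fields against the reference table instead of using a regular expression. The sketch below uses made-up contents for file.txt, since the original data is not shown.

```shell
tmp=$(mktemp -d)
printf 'IT Business\n' > "$tmp/dev_to_prod.config"
# Hypothetical data file: "IT" appears as a word and inside longer words.
printf 'IT and ITEM are in the SUITE\n' > "$tmp/file.txt"

# First file: build the lookup table. Second file: replace a field only
# when the whole field is a key in the table.
out=$(awk 'FNR==NR { A[$1] = $2; next }
           { for (i = 1; i <= NF; i++) if ($i in A) $i = A[$i] }
           1' "$tmp/dev_to_prod.config" "$tmp/file.txt")
printf '%s\n' "$out"

rm -rf "$tmp"
```

This approach has trade-offs: assigning to a field makes awk rebuild the line with single spaces between fields, and a word followed directly by punctuation (such as "IT,") is not treated as a match.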

Unix - sed get value from a line after a first colon [duplicate]

I have a file (newline.txt) that contains the following line
Footer - Count: 00034300, Facility: TRACE, File Created: 20160506155539
I am trying to get the value after Count: up to the comma (in the example 00034300) from this line.
I tried the command below, but all I get is every number on the line concatenated into one long string:
grep -i "Count:" newline.txt | sed 's/[^0-9]//g'
How do I get just the digits after Count: up to the first non-digit character?
I just need 00034300.
Using sed
$ sed '/[Cc]ount/ s/[^:]*: *//; s/,.*//' newline.txt
How it works:
/[Cc]ount/ restricts the first substitution to lines containing Count or count, so no separate grep is needed for this one-line file.
s/[^:]*: *// removes everything up to the first colon including any spaces after the colon.
In what remains, s/,.*// removes everything after the first comma.
Using awk
$ awk -F'[[:blank:],]' '/[Cc]ount/ {print $4}' newline.txt
How it works:
-F'[[:blank:],]' tells awk to treat spaces, tabs, and commas as field separators.
/[Cc]ount/ selects lines that contain Count or count.
print $4 prints the fourth field on the selected lines.
Using grep
$ grep -oiP '(?<=Count: )[[:digit:]]+' newline.txt
This looks for any numbers following Count: and prints them.
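Since grep -P is specific to GNU grep, a further option is a sed capture group. This is only a sketch, assuming the line has the shape shown in the question; the temporary file stands in for newline.txt.

```shell
tmp=$(mktemp)
printf 'Footer - Count: 00034300, Facility: TRACE, File Created: 20160506155539\n' > "$tmp"

# Capture the run of digits after "Count:". With -n and the p flag, sed
# prints only the lines on which the substitution succeeded.
count=$(sed -n 's/.*[Cc]ount: *\([0-9][0-9]*\).*/\1/p' "$tmp")
printf '%s\n' "$count"

rm -f "$tmp"
```

Unlike the sed command above, this variant prints nothing for lines that do not contain Count: followed by digits.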

AWK to check a string pattern and extract it from a file [duplicate]

Below are the file contents:
I am trying to accomplish the following using awk:
First, I want to search for the string pattern "EDXFB*.xsd".
If it exists, then extract the strings that start with "EDXFB" and end with ".xsd".
The basic awk pattern to extract the expression and print the matched data is the following:
gawk 'match($0, /EDXFB.+\.xsd/, a) { print a[0] }'
Though you should really spend some time reading the awk manual.
The regular expression could be changed to /EDXFB[a-z_]+\.xsd/ if the name contains only lowercase letters and _.
[EDIT]: Updated with cleaner code from #JID. Thanks :)
Here is one way to do it:
awk -F/ '/EDXFB.*\.xsd/ {split($NF,a,"|");print a[1]}' file
It splits the line on /, then prints the last field up to the first |.
In your example, probably grep would do what you want:
grep -o 'EDXFB.*\.xsd'
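The three-argument match() used earlier is a gawk feature. A possible portable variant uses the two-argument match() together with RSTART and RLENGTH. The input line below is invented for illustration, because the file contents are not shown in the question, and the character class is an assumption about what the names may contain.

```shell
# Hypothetical input line; adjust the character class to the real file names.
line='xsi:schemaLocation=/opt/schemas/EDXFB_sample_v1.xsd|other data'

# match() sets RSTART and RLENGTH; substr() then cuts out the matched text.
name=$(printf '%s\n' "$line" |
  awk 'match($0, /EDXFB[A-Za-z0-9_]*\.xsd/) { print substr($0, RSTART, RLENGTH) }')
printf '%s\n' "$name"
```

Restricting the characters between EDXFB and .xsd avoids the greedy behaviour of .*, which would run on to the last .xsd on the line.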

passing variable to this particular sed command [duplicate]

I first search for a keyword, and once that keyword is found in a file, I am supposed to delete everything from that line to the end of the file.
#! /bin/csh -f
set sa = `grep -n -m 1 "^Pattern" file`
set s = `echo "$sa" | cut -d':' -f1`
set m = `sed '$s,$d' file | tee see > /dev/null`
So the first line gives me the matching line with its line number, the second line extracts the line number, and the third line tries to delete from line $s (say 20) to the last line, but it is not working. I have tried all combinations, but sed does not pick up the variable $s. Please help.
But you can do it much more easily with a single sed command:
sed -n '/SEARCHPATTERN/q;p' file
-n tells sed not to print lines automatically
/SEARCHPATTERN/q quits at the first line that matches the search pattern
;p otherwise prints the line
You need to take $s out of the quotes so it will be expanded.
set m = `sed $s',$d' file | tee see > /dev/null`
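For comparison, the same quoting idea in a Bourne-style shell (sh or bash rather than csh) might look like the sketch below. The sample file is hypothetical, and grep -m is assumed to be available as in the question.

```shell
# Hypothetical sample file; the line starting with "Pattern" marks where deletion begins.
tmp=$(mktemp)
printf 'keep 1\nkeep 2\nPattern here\ndrop 1\ndrop 2\n' > "$tmp"

# Line number of the first line that starts with "Pattern".
s=$(grep -n -m 1 '^Pattern' "$tmp" | cut -d: -f1)

# Double quotes let the shell expand $s; \$ keeps sed's "last line" address literal.
result=$(sed "${s},\$d" "$tmp")
printf '%s\n' "$result"

rm -f "$tmp"
```

If no line matches, s is empty and the sed address becomes invalid, so a real script would want to check for that case first.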

Using grep to search DNA sequence files

I am trying to use Unix's grep to search for specific sequences within files. The files are usually very large (~1 GB) and consist of 'A's, 'T's, 'C's, and 'G's. They also span many, many lines, with each line being a word of about 60 characters. The problem I am having is that when I search for a specific sequence within these files, grep returns results for occurrences of the pattern on a single line, but not if the pattern spans a line (has a line break somewhere in the middle). For example:
$ grep -i -n "GACGGCT" grep3.txt
To search the file grep3.txt (I put the target 'GACGGCT's in double stars)
So, my problem here is that grep does not find the GACGGCT that spans the end of line 2 and the beginning of line 3.
How can I use grep to find target sequences that may or may not include a linebreak at any point in the string? Or how can I tell grep to ignore linebreaks in the target string? Is there a simple way to do this?
pcregrep -nM "G[\n]?A[\n]?C[\n]?G[\n]?G[\n]?C[\n]?T" grep3.txt
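Another possibility, sketched here under assumptions, is to delete the line breaks before searching. The three-line sample is made up, and grep -o is assumed to be available (it is in GNU and BSD grep). This counts matches rather than reporting line numbers.

```shell
# Hypothetical sample: one match spans lines 1 and 2, another sits on line 3.
tmp=$(mktemp)
printf 'AAAGACG\nGCTTTTT\nGACGGCT\n' > "$tmp"

# A line-by-line search only sees the match that fits on a single line.
per_line=$(grep -c -i 'GACGGCT' "$tmp")

# Delete every newline first, then print each match on its own line and count them.
joined=$(tr -d '\n' < "$tmp" | grep -o -i 'GACGGCT' | wc -l | tr -d ' ')

printf 'line-by-line: %s, newlines removed: %s\n' "$per_line" "$joined"
rm -f "$tmp"
```

With the newlines removed, the whole file reaches grep as one very long line, so for inputs around 1 GB memory use may become a concern.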
I assume that each of your lines is 60 characters long. Then the command below should work:
tr '\n' ' ' < grep3.txt | sed -e 's/ //g' -e 's/.\{60\}/&^/g' | tr '^' '\n' | grep -i -n "GACGGCT"
output :
