Compare columns from file1 and extract matching lines from file2 in Unix

File 1 contains
a,b,c,d,e
1,2,3,4,5
0,0,0,1,2
file 2 contains
12,12,11,a,b,c,d,e,f,22,33,22
11,22,22,1,2,3,4,5,33,22,33,ww
I would like the entire line from file 2 to be printed whenever a pattern from file 1 is found in it.
So far I have tried
grep -f file 1 file 2
grep -F
but they do not seem to work.

$ grep -Ff file1 file2
12,12,11,a,b,c,d,e,f,22,33,22
11,22,22,1,2,3,4,5,33,22,33,ww
Because your patterns appear to be fixed strings, not regular expressions, I added the -F flag.
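To check this yourself, here is a minimal sketch that recreates the sample data from the question in a scratch directory and runs the same command. The `tmpdir` and `matches` names are only illustrative. Note that the file names contain no spaces: if `file 1` were typed literally as in the attempt above, the shell would pass `file` and `1` as two separate arguments.

```shell
# Recreate the sample data from the question in a scratch directory.
tmpdir=$(mktemp -d)
printf '%s\n' 'a,b,c,d,e' '1,2,3,4,5' '0,0,0,1,2' > "$tmpdir/file1"
printf '%s\n' '12,12,11,a,b,c,d,e,f,22,33,22' '11,22,22,1,2,3,4,5,33,22,33,ww' > "$tmpdir/file2"

# -F: treat each pattern as a fixed string; -f: read the patterns from file1.
matches=$(grep -Ff "$tmpdir/file1" "$tmpdir/file2")
printf '%s\n' "$matches"

rm -rf "$tmpdir"
```

The third pattern, 0,0,0,1,2, occurs nowhere in file2, so it simply contributes no output.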


Linux - Get Substring from 1st occurence of character

FILE1.TXT
0020220101
or
01 20220101
I need to extract the date part from the file, where the date starts at the digit 2.
Options tried:
t_FILE_DT1='awk -F"2" '{PRINT $NF}' FILE1.TXT'
t_FILE_DT2='cut -d'2' -f2- FILE1.TXT'
echo "$t_FILE_DT1"
echo "$t_FILE_DT2"
1st output : 0101
2nd output : 0220101
Expected Output: 20220101
I'm new to Linux scripting. Could someone help me see where I'm going wrong?
Use grep like so:
printf '0020220101\n01 20220101\n' | grep -P -o '\d{8}\b'
20220101
20220101
Here, GNU grep uses the following options:
-P : Use Perl regexes.
-o : Print only the matched parts, each on its own output line, not the entire lines.
SEE ALSO:
grep manual
perlre - Perl regular expressions
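-P is a GNU grep extension and is not available in every grep build. As a hedged alternative, assuming the date is always the last eight digits on the line, an extended regex anchored at the end of the line gives the same result on the sample input. The `dates` variable is just an illustrative name, and -o is itself an extension, although both GNU and BSD grep provide it.

```shell
# -E: extended regex; -o: print only the matched part.
# [0-9]{8}$ matches exactly eight digits at the end of the line.
dates=$(printf '0020220101\n01 20220101\n' | grep -Eo '[0-9]{8}$')
printf '%s\n' "$dates"
```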
Using any awk:
$ awk '{print substr($0,length()-7)}' file
20220101
20220101
The above was run on this input file:
$ cat file
0020220101
01 20220101
Regarding PRINT $NF in your question - PRINT != print. Get out of the habit of using all-caps unless you're writing Cobol. See correct-bash-and-shell-script-variable-capitalization for some reasons.
The 2 in your scripts tells awk and cut to use the character 2 as the field separator, so each will carve up the input into substrings everywhere a 2 occurs.
The 's in your question are single quotes, which make strings literal. You were intending to use backticks, `cmd`, but those are deprecated in favor of $(cmd) anyway.
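Putting those points together, a minimal sketch of capturing the result in a variable with $(cmd) might look like this. The scratch file stands in for FILE1.TXT, the lowercase variable name `t_file_dt` is my own choice, and only the first line is read so that the variable holds a single date.

```shell
# Sample input from the question, written to a scratch file.
tmpfile=$(mktemp)
printf '0020220101\n01 20220101\n' > "$tmpfile"

# $(...) runs the command and captures its output in the variable.
# substr($0, length($0) - 7) is the last 8 characters of the line.
t_file_dt=$(awk 'NR == 1 { print substr($0, length($0) - 7) }' "$tmpfile")
echo "$t_file_dt"

rm -f "$tmpfile"
```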
Instead of looking for what comes "after" the 2 (and having to worry about whether a space is involved as well), think about extracting the last 8 characters, which you know for a fact is your date.
input="/path/to/txt/file/FILE1.TXT"
while IFS= read -r line
do
# read in the last 8 characters of $line .. You KNOW this is the date ..
# No need to worry about exact matching at that point, or spaces ..
myDate=${line: -8}
echo "$myDate"
done < "$input"
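Note that ${line: -8} is a bash extension, and the space before the minus sign is required. If the script has to run under a plain POSIX sh, one possible sketch uses only prefix and suffix removal; it assumes the line is at least 8 characters long.

```shell
line='01 20220101'

# ${line%????????} is the line with its last 8 characters removed;
# stripping that prefix from the line leaves just those 8 characters.
myDate=${line#"${line%????????}"}
echo "$myDate"
```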
About the cut and awk commands that you tried:
Using awk -F"2" '{print $NF}' file (with print in lowercase) sets the field separator to 2, and $NF is the last field, so the value printed is 0101
Using cut -d'2' -f2- file uses a delimiter of 2 as well, and then prints all fields starting at the second field, which is 0220101
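To see the splitting for yourself, here is a small sketch that prints each field in brackets for the first sample line. The `fields` and `rest` variables are only illustrative names.

```shell
# Print every field awk sees when "2" is the separator, in brackets.
fields=$(printf '0020220101\n' | awk -F2 '{ for (i = 1; i <= NF; i++) printf "[%s]", $i; print "" }')
echo "$fields"   # [00][0][][0101]

# cut -f2- rejoins fields 2..N with the delimiter, hence 0220101.
rest=$(printf '0020220101\n' | cut -d2 -f2-)
echo "$rest"     # 0220101
```

The empty third field comes from the two adjacent 2s in the input.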
If you want to match the 2 followed by 7 digits until the end of the string:
awk '
match ($0, /2[0-9]{7}$/) {
print substr($0, RSTART, RLENGTH)
}
' file
Output
20220101
The accepted answer shows how to extract the first eight digits, but that's not what you asked.
grep -o '2.*' file
will extract from the first occurrence of 2, and
grep -o '2[0-9]*' file
will extract all the digits after every occurrence of 2. If you specifically want eight digits, try
grep -Eo '2[0-9]{7}' file
maybe also with a -w option if you want to only accept a match between two word boundaries. If you specifically want only digits after the first occurrence of 2, maybe try
sed -n 's/[^2]*\(2[0-9]*\).*/\1/p' file

Sort multiple file unix and save

How do I sort multiple files using unix and save the result in the respective files?
Example:
If I have two files abc.txt and xyz.txt
cat abc.txt
3
2
1
cat xyz.txt
100
99
98
Is it possible to sort both files and save the result in each of them without writing two commands?
That is:
sort abc.txt -o abc.txt
Is this possible for both files in a single command?
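Do not overcomplicate. If a single combined, sorted listing is all you need (this prints to standard output and does not save anything back into the individual files), it's as easy as:
cat file1.txt file2.txt | sort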
Regards. :)
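If each file really has to be sorted and saved back into itself, as the question asks, a short loop is one way to do it on a single command line. This is a minimal sketch on the sample data; the scratch directory is illustrative, and -n is my own addition so that 98, 99 and 100 sort numerically rather than as text.

```shell
# Recreate the two sample files in a scratch directory.
tmpdir=$(mktemp -d)
printf '3\n2\n1\n' > "$tmpdir/abc.txt"
printf '100\n99\n98\n' > "$tmpdir/xyz.txt"

# Sort each file in place; sort -o may safely name the input file.
for f in "$tmpdir"/abc.txt "$tmpdir"/xyz.txt; do
    sort -n -o "$f" "$f"
done

abc_sorted=$(cat "$tmpdir/abc.txt")
xyz_sorted=$(cat "$tmpdir/xyz.txt")
rm -rf "$tmpdir"
```

On your own files this collapses to: for f in abc.txt xyz.txt; do sort -o "$f" "$f"; done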

Unix- Using Grep to get unmatched lines

I am new to Unix. I want to grep the lines of file1 that do not match any of the patterns listed in file2. The real files have more than 1000 lines.
Example:
File1:
Hi(Everyone)
How(u)people(are)doing?
ThanksInadvance
File2:
Hi(Every
ThanksI
Required Result:
How(u)people(are)doing?
I want only the pattern (for example "Hi(Every") to be used for the grep. It should return the unmatched line from file1.
This line works for the given example:
grep -Fvf file2 file1
The 3 options used above:
-F makes grep do fixed-string match
-v invert matching
-f get patterns from file
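Here is a self-contained sketch of that command on the sample data from the question, using a scratch directory and an illustrative `unmatched` variable. With -F, characters such as ( in the patterns are taken literally instead of being read as regex syntax.

```shell
tmpdir=$(mktemp -d)
printf '%s\n' 'Hi(Everyone)' 'How(u)people(are)doing?' 'ThanksInadvance' > "$tmpdir/file1"
printf '%s\n' 'Hi(Every' 'ThanksI' > "$tmpdir/file2"

# Print the lines of file1 that contain none of the strings in file2.
unmatched=$(grep -Fvf "$tmpdir/file2" "$tmpdir/file1")
echo "$unmatched"

rm -rf "$tmpdir"
```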
The grep flag -v inverts the match.
grep -Fv 'Hi(Every' File1
should return all lines from File1 that do not contain Hi(Every.
best regards,
Jan

Using inverse grep to compare two .txt files

I have two .txt files "test1.txt" and "test2.txt" and I want to use inverse grep (UNIX) to find out all lines in test2.txt that do not contain any of the lines in test1.txt
test1.txt contains only user names, while test2.txt contains longer strings of text. I only want the lines in test2.txt that DO NOT contain the usernames found in test1.txt
Would it be something like?
grep -v test1.txt test2.txt > answer.txt
You were almost there; you just missed one option in your command (i.e. -f).
Your solution should use the -f flag; see the sample session below.
Demo Session
$ # first file
$ cat a.txt
xxxx yyyy
kkkkkk
zzzzzzzz
$ # second file
$ cat b.txt
line doesnot contain any name
This person is xxxx yyyy good
Another line which doesnot contain any name
Is kkkkkk a good name ?
This name itself is sleeping ...zzzzzzzz
I can't find any other name
Lets try the command now
$ # -i is used to ignore the case while searching
$ # output contains only lines from second file not containing text for first file lines
$ grep -v -i -f a.txt b.txt
line doesnot contain any name
Another line which doesnot contain any name
I can't find any other name
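One pitfall worth knowing with -f: an empty line in the pattern file is an empty pattern, which matches every line, so together with -v nothing at all would be printed. A hedged sketch of a more defensive variant follows. It strips blank lines from the name list first and adds -F so the names are treated as plain strings; the file contents and the `a.clean.txt` and `kept` names are made up for illustration.

```shell
tmpdir=$(mktemp -d)
# Name list with a stray blank line in the middle.
printf 'xxxx yyyy\n\nkkkkkk\n' > "$tmpdir/a.txt"
printf 'no name here\nIs kkkkkk a good name ?\n' > "$tmpdir/b.txt"

# Drop empty lines from the pattern file first, then filter as before.
grep -v '^$' "$tmpdir/a.txt" > "$tmpdir/a.clean.txt"
kept=$(grep -v -i -F -f "$tmpdir/a.clean.txt" "$tmpdir/b.txt")
echo "$kept"

rm -rf "$tmpdir"
```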
There are probably better ways to do this (i.e. without grep), but here's a solution that will work:
grep -v -P "($(sed ':a;N;$!ba;s/\n/)|(/g' test1.txt))" test2.txt > answer.txt
To explain this:
$(sed ':a;N;$!ba;s/\n/)|(/g' test1.txt) is an embedded sed command that outputs test1.txt as a single string with each newline replaced by )|(. That output is inserted into a Perl-style regex (-P), so grep searches test2.txt for every line of test1.txt. Because of the -v option, it returns only the lines of test2.txt that do not contain any line of test1.txt.
What flavor of Unix are you using? Knowing this would give us a better understanding of what is available to you from the command line. What you currently have will not work. You may be looking for the diff command, which compares two files.
You can do the following on OS X 10.6; I have tested this at home.
diff -i -y FILE1 FILE2
diff compares the files. -i ignores case, if that does not matter to you, so Hi and HI are treated as the same. Finally, -y outputs the results side by side. If you want to write the output to a file, you could do diff -i -y FILE1 FILE2 >> /tmp/Results.txt

Merge files in Unix per lines

I have two txt files, first one contains:
000
111
222
333
444
and the second one contains:
.
How can I merge these two text files in the Unix terminal, so I can get another file that contains:
.000
.111
.222
.333
.444
Thanks for your answers
The paste command is generally what you're looking for, but it pairs lines up one-to-one, so both inputs should have the same number of lines. You can produce the second file's line repeated that many times with something like yes "$(cat file2)" | head -n $(wc -l < file1)
So the whole thing, using bash process substitution:
paste -d "" <(yes "$(cat file2)" | head -n $(wc -l < file1)) file1
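If bash is not available, a possible alternative is a two-pass awk that needs no process substitution. This is a minimal sketch, assuming file2 holds a single prefix line as in the question; the scratch directory and the `merged` and `prefix` names are illustrative.

```shell
tmpdir=$(mktemp -d)
printf '000\n111\n222\n333\n444\n' > "$tmpdir/file1"
printf '.\n' > "$tmpdir/file2"

# First pass (file2): remember the prefix. Second pass (file1): prepend it.
merged=$(awk 'NR == FNR { prefix = $0; next } { print prefix $0 }' "$tmpdir/file2" "$tmpdir/file1")
echo "$merged"

rm -rf "$tmpdir"
```

Redirect the awk output to a new file to save the merged result.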
