Compare two files in Unix using awk - unix

I need to compare two files, File1.txt and File2.txt, in Unix. The values present in File1.txt but not in File2.txt have to be written to diff.txt. I guess this can be done with awk alone. Can anyone please guide me?
File1.txt
apple
bat
cat
File2.txt
apple
cat
diff.txt
bat

Try this one-liner. It reads file2 into an array first, then prints every line of file1 that is not in that array:
awk 'NR==FNR{a[$0];next}!($0 in a)' file2 file1 > diff.txt

diff file2 file1 | perl -lne 'print $1 if(/^\> (.*)/)'

This is the job that "comm" was created to do:
comm -23 file1 file2
man comm for details. The caveat is that the input files have to be sorted, as yours are.
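To try the awk and comm answers side by side, here is a minimal sketch. It assumes the sample data from the question, written to a scratch directory; the names workdir, seen, result_awk and result_comm are made up for illustration. Sorting inside the script means comm also works when the inputs are not already sorted.

```shell
#!/bin/sh
# Scratch directory with the sample data from the question (assumed).
workdir=$(mktemp -d) && cd "$workdir" || exit 1
printf '%s\n' apple bat cat > file1
printf '%s\n' apple cat > file2

# awk: remember every line of file2, then print file1 lines not seen.
result_awk=$(awk 'NR==FNR { seen[$0]; next } !($0 in seen)' file2 file1)

# comm: -23 suppresses columns 2 and 3, leaving lines unique to the first file.
sort file1 > file1.sorted
sort file2 > file2.sorted
result_comm=$(comm -23 file1.sorted file2.sorted)

printf '%s\n' "$result_awk" > diff.txt
cat diff.txt   # prints: bat
```

One caveat with the awk version: if file2 is empty, NR==FNR is also true while reading file1, so nothing is printed.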

Related

awk sub ++count every 4 matches instead of every match

Let's say I have the following 1.txt file below:
one file.txt
two file.txt
three file.txt
four file.txt
five file.txt
sixt file.txt
seven file.txt
eight file.txt
nine file.txt
I usually use the following command to number the file names listed in 1.txt sequentially:
awk '/\.txt/{sub(".txt",++count"&")} 1' 1.txt > 2.txt
The output is 2.txt:
one file1.txt
two file2.txt
three file3.txt
four file4.txt
five file5.txt
sixt file6.txt
seven file7.txt
eight file8.txt
nine file9.txt
But I would like to number only every fourth match of the pattern .txt.
To clarify, the pseudocode would be something like:
awk '/\.txt/{sub(".txt",++count"&")} 1 | <change every 4 matches> ' 1.txt > 3.txt
such that 3.txt is as below:
one file.txt
two file.txt
three file.txt
four file1.txt <-here
five file.txt
sixt file.txt
seven file.txt
eight file2.txt <- here
nine file.txt
I have searched both the web and my own notes and cannot find anything like this, and I am having difficulty getting started.
Note: Maybe I just need to continue the command below:
awk -v n=0 '/\.txt/{if (n++==4) sub(".txt",++count"&")} 1'
Adding one more awk variant here, based on your shown samples only. Simple explanation: if the line is not empty and the count variable has reached 4, substitute .txt with count1 (increased by 1 each time) followed by .txt itself, reset count to 0, and print the line.
awk 'NF && ++count==4{sub(/\.txt/,++count1"&");count=0} 1' 1.txt > 2.txt
You are almost there. Would you please try:
awk '/\.txt/ {if (++n%4==0) sub(".txt",++count"&")} 1' 1.txt > 2.txt
The condition ++n%4==0 is true on every fourth matching line.
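Another option is to pass the interval as a variable (nr=4) and use it for both the modulo test and the division that produces the number.
The string can be passed the same way (str=".txt"): index() checks whether it is present, and sub() uses it in the replacement. Note that sub() treats str as a regular expression, so the dot matches any character.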
awk -v str=".txt" -v nr=4 'index($0,str){if(!(++i%nr)) sub(str,i/nr"&")}1' 1.txt > 2.txt
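The modulo approach can be checked against the sample list with a sketch like the one below. It assumes the nine-line 1.txt from the question; the variable names every, seen and count are only illustrative. The dot is escaped and the pattern anchored with $ so that only a literal trailing .txt is matched.

```shell
#!/bin/sh
# Scratch directory with the sample list from the question (assumed).
workdir=$(mktemp -d) && cd "$workdir" || exit 1
printf '%s\n' 'one file.txt' 'two file.txt' 'three file.txt' \
  'four file.txt' 'five file.txt' 'sixt file.txt' \
  'seven file.txt' 'eight file.txt' 'nine file.txt' > 1.txt

# Count matching lines in "seen"; on every 4th one, insert the next
# value of "count" in front of the extension. The trailing 1 prints every line.
awk -v every=4 '
  /\.txt$/ && ++seen % every == 0 { sub(/\.txt$/, ++count "&") }
  1
' 1.txt > 3.txt

cat 3.txt
```

With this input, only the fourth and eighth lines change, becoming "four file1.txt" and "eight file2.txt".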

Copy content of a file to multiple files using CAT command in UNIX

I have 3 files: File1, File2 and File3.
I want to copy the content of File1 to File2 and File3 in a single command.
Is this possible with the cat command?
If yes, how? If not, which command should be used for this task?
Maybe this code can help you. Note that >> appends to the target files; use > instead if they should be overwritten.
cat file1.txt >> file2.txt && cat file1.txt >> file3.txt
Use tee:
$ cat file1 | tee file2 > file3
man tee:
NAME
tee - read from standard input and write to standard output and files
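As a small sketch of the tee approach, the script below copies one file to two targets in a single command. The file contents are made up for illustration. tee accepts several file names, so both copies can be named as arguments, and its standard output is sent to /dev/null because only the files are wanted.

```shell
#!/bin/sh
# Scratch directory with a hypothetical source file.
workdir=$(mktemp -d) && cd "$workdir" || exit 1
printf 'first line\nsecond line\n' > file1

# tee writes its input to every named file and to stdout.
tee file2 file3 < file1 > /dev/null

# cmp -s is silent and succeeds only if the files are identical.
cmp -s file1 file2 && cmp -s file1 file3 && echo "copies match"
```

Like the > redirection, tee overwrites existing targets; tee -a appends instead.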

combine two files -- one after another

Say, I have two files:
file1:
0
0
-3.44785
-2.15069
5.70183
17.8715
and file2:
31.9812
50.5646
72.361
96.8705
119.893
144.409
To combine them side by side, I use:
paste -d" " file1 file2
or I use awk command to do such thing.
If I want to combine these two files one after another, what should I write? I know how to do this using "cat". I have tried different things to modify the "paste" command but they don't give desired output.
Could you please help? Thanks.
cat (short for concatenate) is your friend:
cat file1 file2
That's pretty basic; most people are aware of cat long before they learn to deal with awk, so kudos for mastering the latter!
Normally I would use cat file1 file2, but you could do it like:
awk '{print $0}' file1 file2
or
awk '1' file1 file2
(Note: the '1' is an always-true pattern with no action, so it does the same thing as print.)
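To confirm that the cat and awk variants give the same result, here is a minimal sketch using a shortened version of the sample data. The output file names combined_cat.txt and combined_awk.txt are made up for illustration.

```shell
#!/bin/sh
# Scratch directory with a shortened version of the sample data (assumed).
workdir=$(mktemp -d) && cd "$workdir" || exit 1
printf '%s\n' 0 0 -3.44785 > file1
printf '%s\n' 31.9812 50.5646 72.361 > file2

# One after another: all of file1, then all of file2.
cat file1 file2 > combined_cat.txt
awk '1' file1 file2 > combined_awk.txt

# Both outputs should be byte-for-byte identical.
cmp -s combined_cat.txt combined_awk.txt && echo "same output"
```

One difference worth knowing: if file1 does not end in a newline, cat glues its last line to the first line of file2, while awk prints each record followed by a newline.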

How to find unique files using the md5sum command?

I am using the md5sum command, which gives me a checksum of each file's content.
I want the result to exclude files whose content also appears in another file.
for example
$ md5sum file1 file2 file3 file4
c8675a129a538248bf9b0f8104c8e817 file1
9d3df2c17bfa06c6558cfc9d2f72aa91 file2
9d3df2c17bfa06c6558cfc9d2f72aa91 file3
2e7261df11a2fcefee4674fc500aeb7f file4
I want the output to list only the files that do not match any other file, which means
I need file1 and file4:
c8675a129a538248bf9b0f8104c8e817 file1
2e7261df11a2fcefee4674fc500aeb7f file4
I only need the files whose content is not duplicated in another file.
Thanks in advance.
You can say:
md5sum file1 file2 file3 file4 | uniq -u -w33
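in order to get the unique files. This relies on identical checksums being on adjacent lines, as they are here; otherwise pipe the output through sort first.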
Quoting man uniq:
-u, --unique
only print unique lines
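Here is a minimal sketch of that pipeline with a sort step added, so duplicates are found even when they are not listed next to each other. It assumes GNU coreutils (md5sum, and uniq with -w); the file contents and the variable name unique are made up for illustration. It compares only the first 32 characters, which is the length of the MD5 hash itself.

```shell
#!/bin/sh
# Scratch directory; file2 and file3 share the same content (assumed data).
workdir=$(mktemp -d) && cd "$workdir" || exit 1
printf 'alpha\n' > file1
printf 'beta\n'  > file2
printf 'beta\n'  > file3
printf 'gamma\n' > file4

# sort groups equal hashes together; uniq -u keeps lines whose first
# 32 characters (the hash) occur only once.
unique=$(md5sum file1 file2 file3 file4 | sort | uniq -u -w32)

printf '%s\n' "$unique"
```

The result lists file1 and file4 only, in hash order rather than in the order the files were given.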
EDIT: You seem to be looking for alternatives. Try
md5sum ... | sed ':a;$bb;N;/^\(.\).*\n\1[^\n]*$/ba;:b;s/^\(.\).*\n\1[^\n]*\n*//;ta;/./P;D'
Try this (Bash):
find -type f -exec md5sum '{}' ';' | sort | uniq --all-repeated=separate -w 33 | cut -c 35-
Explanation:
Find all files, calculate their MD5SUM, find duplicates by comparing the MD5SUM, print the names

How to append one file to the other, with first file being edited, hence can't use usual cat command

Suppose I have two files, each with a header on the first line and records on the remaining lines. I want to concatenate the two files into one without including the header twice.
I tried the following commands, which I found while googling for an answer (so this may not be the best approach).
cat awk 'NR!=1 {printf "%s\n", $1}' file2.csv >| file.csv
However, I got the following error.
cat: awk: No such file or directory
cat: NR!=1 {printf "%s\n",$1}: No such file or directory
It looks like cat treated awk and the script as file names, not as a command. I want the output of awk to become file content, so I also tried redirecting it to cat.
awk 'NR!=1 {printf "%s\n", $1}' file2.csv > cat file.csv
However, this created a file named cat that contains the output of awk.
So how can I solve it?
Thanks.
You need some grouping:
{
cat file1
sed '1d' file2
} > file.csv
As one line
{ cat file1; sed '1d' file2; } > file.csv
The semicolon before the ending brace is required.
{ cat file1; tail -n +2 file2; } > out
Print the first line of the first file, then lines 2 onward of every file (the names in parentheses stand for optional additional files):
awk 'NR==1||FNR>1' file1 file2 (file3 file4 ..) > outfile
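Both approaches can be tried on a pair of small CSV files with the sketch below. The file names and records are hypothetical. Note the space after the opening brace and the semicolon before the closing brace, which the shell requires for a command group.

```shell
#!/bin/sh
# Scratch directory with two hypothetical CSV files sharing a header.
workdir=$(mktemp -d) && cd "$workdir" || exit 1
printf 'id,name\n1,ann\n' > file1.csv
printf 'id,name\n2,bob\n' > file2.csv

# Command group: all of file1, then file2 starting from line 2.
{ cat file1.csv; tail -n +2 file2.csv; } > merged_group.csv

# awk: NR counts lines across all files, FNR restarts for each file,
# so the header is printed only once.
awk 'NR==1 || FNR>1' file1.csv file2.csv > merged_awk.csv

cat merged_awk.csv
```

Both output files contain the header once, followed by the records 1,ann and 2,bob.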
