awk sub ++count every 4 matches unlike every 1 match - unix

Let's say I have the following 1.txt file below:
one file.txt
two file.txt
three file.txt
four file.txt
five file.txt
sixt file.txt
seven file.txt
eight file.txt
nine file.txt
I usually use the following command below to sequentially rename the files listed at 1.txt:
awk '/\.txt/{sub(".txt",++count"&")} 1' 1.txt > 2.txt
The output is 2.txt:
one file1.txt
two file2.txt
three file3.txt
four file4.txt
five file5.txt
sixt file6.txt
seven file7.txt
eight file8.txt
nine file9.txt
But I would like to rename only every fourth match of the pattern .txt.
To clarify, the pseudocode would be something like:
awk '/\.txt/{sub(".txt",++count"&")} 1 | <change every 4 matches> ' 1.txt > 3.txt
such that 3.txt is as below:
one file.txt
two file.txt
three file.txt
four file1.txt <-here
five file.txt
sixt file.txt
seven file.txt
eight file2.txt <- here
nine file.txt
I have searched the web and my own notes, but I do not recall anything like this and I am having difficulty getting started.
Note: maybe I just need to extend the command below:
awk -v n=0 '/\.txt/{if (n++==4) sub(".txt",++count"&")} 1'

Here is one more awk variant, based only on the samples shown. For every non-empty line, count is incremented; when it reaches 4, .txt is replaced with the next value of count1 followed by .txt itself, and count is reset to 0. Every line is printed.
awk 'NF && ++count==4{sub(/\.txt/,++count1"&");count=0} 1' 1.txt > 2.txt

You are almost there. Would you please try:
awk '/\.txt/ {if (++n%4==0) sub(".txt",++count"&")} 1' 1.txt > 2.txt
The condition ++n%4==0 is true on every fourth matching line.
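A minimal sketch of this approach run against a few sample lines, assuming a POSIX awk. The input is generated with printf rather than read from 1.txt, the pattern is written as /\.txt$/ so that the dot is matched literally (a small variation on the command above), and the variable names input and out are only illustrative.

```shell
# Build nine sample lines similar to those in 1.txt
input=$(printf '%s file.txt\n' one two three four five six seven eight nine)

# n counts matching lines; every 4th match gets the next value of count
out=$(printf '%s\n' "$input" |
  awk '/\.txt$/ { if (++n % 4 == 0) sub(/\.txt$/, ++count "&") } 1')

printf '%s\n' "$out"
```

Only the fourth and eighth lines are changed; all other lines pass through untouched because of the trailing 1.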

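Another option is to pass the interval as a variable (nr=4 here) and use it for both the modulo test and the division that yields the number to insert.
The string can be passed as well (str=".txt"): index() checks whether it is present, and sub() uses it as the pattern to replace. Note that sub() treats str as a regular expression, so the dot matches any character.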
awk -v str=".txt" -v nr=4 'index($0,str){if(!(++i%nr)) sub(str,i/nr"&")}1' 1.txt > 2.txt

Related

Count lines below grep and include filename in output

I would like to count the number of lines below a grep match and append the filename to the output.
Sample file.txt
Aaaaaaa
Bbbbbbb
Ccccccc
Ddddddd
I would like to grep Bbbbbbb, find the number of lines below it, and output that number plus the filename.
I tried this: cat ${samplename}.txt | sed -n '/Bbbbbbb/,$p' | wc -l, but the filename is not in the output.
In order to know the line where "Bbbbb" is found:
grep -n "Bbbbb" file.txt | cut -d ':' -f 1
// grep -n adds line number in front of the search result, this is followed by a colon.
// You get that number by splitting over that colon and take the first field.
In order to know the number of lines in a file:
wc -l file.txt
In order to perform calculations:
echo $((43 - 7))
Just combine everything :-)
Have fun
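One way to combine those pieces, as a rough sketch. It assumes the pattern occurs in the file and uses the first match only; the file is created in a temporary directory just for the demonstration, and the variable names are illustrative.

```shell
# Demo file in a temporary directory (illustrative; use your own file instead)
dir=$(mktemp -d)
file="$dir/file.txt"
printf '%s\n' Aaaaaaa Bbbbbbb Ccccccc Ddddddd > "$file"

# Line number of the first match, and the total number of lines
match=$(grep -n 'Bbbbbbb' "$file" | head -n 1 | cut -d ':' -f 1)
total=$(wc -l < "$file")

# Lines below the match, printed together with the file name
below=$((total - match))
result="$(basename "$file") $below"
echo "$result"

rm -r "$dir"
```

If the pattern is not found, match is empty and the arithmetic fails, so a real script would need to check for that case first.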

Split data in a line using unix

How do you use Unix tools to create a csv file where each field is a column?
My data is:
>A::LOLLLL rank=1 x=2 y=9 length=10
The output I want is:
Column 1 Column 2 Column 3
>A LOLLLL 10
I tried using awk '{print $1}'input_file to separate the fields, but the terminal reports command not found. My plan was to write each field I am interested in to a separate .txt file, change the extension to .csv, and combine the files manually. Is there an easier way to do this?
Using awk you can do this:
echo ">A::LOLLLL rank=1 x=2 y=9 length=10" | awk -F"[: =]" '{print $1,$3,$NF}' OFS="\t"
>A LOLLLL 10
To get to separate files:
awk -F"[: =]" '{print $1 >"c1.csv";print $3 >"c2.csv";print $NF >"c3.csv"}' file
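If a single comma-separated file is the goal, the same field splitting can write all three columns in one pass. A small sketch, assuming the fields of interest are always the first, the third and the last; line and csv are illustrative names.

```shell
line='>A::LOLLLL rank=1 x=2 y=9 length=10'

# Split on ':', ' ' or '='; the empty field between the two colons is $2
csv=$(printf '%s\n' "$line" | awk -F'[: =]' -v OFS=',' '{ print $1, $3, $NF }')

echo "$csv"
```

For a real input file, the same awk program can be given the file name and redirected to a .csv file instead of being captured in a variable.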

need some help on awk command

I need help with awk. I am reading a csv file and substituting some of the columns: the 9th column (string type) should be replaced by its own value followed by the value of the 4th column (integer), the 15th column by $15+$12, and the 26th column by $26+$23. The same has to be done line by line for all records. Suggestions please.
Below is sample input and output. The first line, which is the description (header), must be left as is.
sample Input
EmpID|Empname|Empadd|roleId|roleDesc|Dept
100|mst|Del|20|SD|DA
101|ms|Del|21|XS|DA
Sample output
EmpID|Empname|Empadd|roleId|roleDesc|Dept
100|mst100|Del|20|SD20|DA
101|ms101|Del|21|XS21|DA
That is, Empname has been concatenated with EmpID, and roleDesc with roleId. Hope that's helpful :)
This will perform the needed transformation:
$ awk 'NR>1{$2=$2$1;$5=$5$4}1' FS='|' OFS='|' file
EmpID|Empname|Empadd|roleId|roleDesc|Dept
100|mst100|Del|20|SD20|DA
101|ms101|Del|21|XS21|DA
If you have to do this for many columns, you can use a for loop like so (provided the columns follow an arithmetic or geometric step size):
$ awk 'NR>1{for(i=2;i<=5;i+=3)$i=$i$(i-1)}1' FS='|' OFS='|' file
EmpID|Empname|Empadd|roleId|roleDesc|Dept
100|mst100|Del|20|SD20|DA
101|ms101|Del|21|XS21|DA
When you say +, I'm assuming you mean string concatenation. In awk, there is no specific concatenation operator; you just put two strings side by side.
awk -F, -v OFS=, '{$9 = $9 $4; $15=$15$12; $26=$26$23; print}' file.csv
Also assuming that by "csv", you actually mean comma-separated.
If you want to edit the file in-place, you need to do this:
awk ... file.csv > newfile && mv file.csv file.csv.bak && mv newfile file.csv
Edit: to leave the first line untouched:
awk -F, -v OFS=, 'NR>1 {$9 = $9 $4; $15=$15$12; $26=$26$23} {print}' file.csv
Now the columns are modified for the 2nd and subsequent lines, but every line is printed.
You'll sometimes see that written this way:
awk -F, -v OFS=, 'NR>1 {$9 = $9 $4; $15=$15$12; $26=$26$23} 1' file.csv

How to append one file to the other, with first file being edited, hence can't use usual cat command

Suppose that I have two files, each with a header in the first line and records in the remaining lines. I want to concatenate the two files into one without including the header twice.
I tried the following commands while googling for the answer (so I may not be approaching this in an optimal way).
cat awk 'NR!=1 {printf "%s\n", $1}' file2.csv >| file.csv
However, I got the following error.
cat: awk: No such file or directory
cat: NR!=1 {printf "%s\n",$1}: No such file or directory
It looks like cat treated awk and the awk program as files, not as a command. I want the output of awk to become the content of the file, so I also tried to redirect it to cat.
awk 'NR!=1 {printf "%s\n", $1}' file2.csv > cat file.csv
However, this way I got a file named cat containing the output of awk...
So how can I solve it?
Thanks.
You need some grouping:
{
cat file1
sed '1d' file2
} > file.csv
As one line
{ cat file1; sed '1d' file2; } > file.csv
The semicolon before the ending brace is required.
{ cat file1; tail -n +2 file2; } > out
Print the first line of the first file, then lines 2 through the end of every file (more files can be listed after file2):
awk 'NR==1||FNR>1' file1 file2 > outfile
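To illustrate how that condition handles more than two files, here is a small sketch with three made-up CSV files in a temporary directory. The file contents and the variable names dir and merged are assumptions for the demonstration only.

```shell
# Three demo files that share the same header line
dir=$(mktemp -d)
printf '%s\n' 'id,name' '1,a' '2,b' > "$dir/file1"
printf '%s\n' 'id,name' '3,c' > "$dir/file2"
printf '%s\n' 'id,name' '4,d' > "$dir/file3"

# NR==1 keeps the very first line read; FNR>1 skips the first line of each file
merged=$(awk 'NR==1 || FNR>1' "$dir/file1" "$dir/file2" "$dir/file3")
printf '%s\n' "$merged"

rm -r "$dir"
```

The header appears once, followed by the data lines of all three files in order.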

File1 + (File2 - first line) > File3

I have two csv/text files that I'd like to join. Both contain the same first line. I'm trying to figure out how to use sed and cat to produce a merged file, but with only one copy of the first line, and I'm having a hard time with the syntax. Any help would be greatly appreciated :-D!
Thanks,
Andrew
Another option with awk:
awk 'NR==FNR || FNR>1' file1.txt file2.txt .. fileN.txt
This prints all lines in the first file, OR any line in subsequent files after the first line.
This will combine files data1.txt and data2.txt in file merged.txt, skipping the first line from data2.txt. It uses awk if you are ok with it:
(cat data1.txt; awk 'NR>1' data2.txt) > merged.txt
awk outputs every line of data2.txt whose line number is greater than 1, and the combined output of the subshell is written to merged.txt.
NR is a built-in awk variable holding the number of input records (lines) read so far; with a single input file it is the current line number. If the expression NR > 1 is true, awk prints the line implicitly.
If you didn't care about keeping data1.txt intact, you could just append your 2nd file (minus its first line) and reduce to just this:
awk 'NR>1' data2.txt >> data1.txt
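To see how NR and FNR behave when awk reads more than one file, here is a small sketch. The file contents are made up for the demonstration, and dir and counts are illustrative names.

```shell
# Two tiny demo files, each with a header line "h"
dir=$(mktemp -d)
printf '%s\n' h a > "$dir/data1.txt"
printf '%s\n' h b > "$dir/data2.txt"

# Print the file name, NR and FNR for every line of both files
counts=$(cd "$dir" && awk '{ print FILENAME, NR, FNR }' data1.txt data2.txt)
printf '%s\n' "$counts"

rm -r "$dir"
```

NR keeps counting across files while FNR restarts at 1 for each file, which is why NR>1 is enough for a single file but FNR>1 is needed when several files are passed to one awk call.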
I'd say the most straightforward solution is:
( cat file1.txt ; tail -n +2 file2.txt ) > file3.txt
It has the advantage of stating clearly just what you're doing: print the entire first file, then print all but the first line of the second file, writing the output to the third file.
This can be solved with one line.
'1 d' tells sed to delete the first line of file2.
The following command appends the result to file1:
sed '1 d' file2 >> file1
