Remove certain characters at the beginning of a line with awk - unix

I would like to use awk to remove certain characters that may appear at the beginning of a line. The characters to remove are # and/or =.
Here is an example file:
#word <= Remove #
=word <= Remove =
#=word <= Remove # AND =
=#word <= Remove = AND #
At the moment I use sub(/^\#/, "", $0) to remove a # at the beginning of a line. How can I change it so that it removes #, =, or both when they appear together?

With awk:
awk '{sub(/^[#=]+/, "")}1' File
With sed:
sed -r 's/^[#=]+//' File

In awk:
$ awk 'gsub(/^[=#]+/,"")||1' file
word <= Remove #
word <= Remove =
word <= Remove # AND =
word <= Remove = AND #
Explained:
awk '
gsub(/^[=#]+/,"") || 1 # strip the leading run of =s and #s; "|| 1" keeps the pattern true, so the line is printed even when nothing was replaced
' file
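A quick way to check the character-class approach, as a minimal sketch. The sample lines are made up for illustration; the last one is there to show that only the leading run is stripped, while a = or # later in the line is left alone.

```shell
# Hypothetical sample input: leading '#' and '=' in various orders,
# plus one line with '=' and '#' further in that must survive.
result=$(printf '%s\n' '#word' '=word' '#=word' '=#word' '##a=b#' |
  awk '{ sub(/^[#=]+/, "") } 1')   # strip only the leading run of # and =
printf '%s\n' "$result"
```

Since the pattern is anchored with ^, it can match at most once per line, so sub and gsub behave the same here.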

Related

How to remove the tab delimiter after the last column by using unix

I have a tab separated file. I am using the below code:
awk -v var="MAS_CONTROL_WL_column_nmbr.dat" 'BEGIN{RS="\n"}
{ while(getline line < var){ printf("%s\t",$line)};close(var);
printf( "\n") }' MAS_CONTROL_WL.tsv > test.tsv
This code prints the columns whose numbers are listed in the column-number file, but a \t is also printed after the last column.
How can I remove it?
First a test file:
$ cat > foo
1
2
3
And the awk:
$ awk -v var=foo '
BEGIN { RS="\n" }
{
out="" # introducing output buffer
while((getline line < var) > 0) { # > 0: getline returns -1 if the file cannot be read
out=out sprintf("%s%s",(out==""?"":"\t"),line) # controlling tabs
}
close(var)
print out # output output buffer
}' foo | cat -T # useful use of cat
Output:
1^I2^I3
1^I2^I3
1^I2^I3
Instead of printing "field-tab" for every field, print the first field without a tab, then append the rest as "tab-field":
awk -v var="MAS_CONTROL_WL_column_nmbr.dat" '
BEGIN{RS="\n"}
{
if ((getline line < var) > 0) printf("%s",$line);
while ((getline line < var) > 0) printf("\t%s",$line);
close(var);
printf( "\n");
}
' MAS_CONTROL_WL.tsv > test.tsv
In case you still need an answer to your original question (removing the \t after the last column): sed -i 's/[[:space:]]$//' your_file.tsv removes a single trailing whitespace character from each line of your file.
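Re-reading the column-number file for every input record works, but the list can also be loaded once. The following is a hedged alternative sketch, not the method from the answers above: the file names cols.txt and data.tsv and their contents are invented for the example, and the column file is assumed to be non-empty. The separator is written only between fields, so no tab ever follows the last column.

```shell
# Work in a scratch directory so the made-up files do not clobber anything.
tmp=$(mktemp -d)
printf '3\n1\n' > "$tmp/cols.txt"                 # hypothetical column numbers, one per line
printf 'a\tb\tc\nd\te\tf\n' > "$tmp/data.tsv"     # hypothetical tab-separated data

result=$(awk -F '\t' '
  NR == FNR { col[++n] = $1; next }   # first file: remember the column numbers
  {
    for (i = 1; i <= n; i++)          # second file: print the chosen columns
      printf "%s%s", (i > 1 ? "\t" : ""), $(col[i])
    printf "\n"                       # the tab goes before fields 2..n only
  }' "$tmp/cols.txt" "$tmp/data.tsv")
printf '%s\n' "$result"
rm -r "$tmp"
```

With the sample files this selects column 3 followed by column 1 from each row.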

Replace column in header of a large .txt file - unix

I need to replace the date in the header of a large file. The header has multiple columns separated by | (pipe), like this:
A|B05|1|xxc|2018/06/29|AC23|SoOn
I need the same header but with the date (5th column) updated: A|B05|1|xxc|2018/08/29|AC23
Are there any solutions? I tried awk and sed, but both gave me errors I could not resolve. I'm new to this and really want to understand the solution. Could you please help me?
You can use the command below, which replaces the 5th column of every line with the content of the newdate variable:
awk -v newdate="2018/08/29" 'BEGIN{FS=OFS="|"}{ $5 = newdate }1' infile > outfile
Explanation
awk -v newdate="2018/08/29" ' # call awk, and set variable newdate
BEGIN{
FS=OFS="|" # set input and output field separator
}
{
$5 = newdate # assign fifth field with a content of variable newdate
}1 # 1 at the end does default operation
# print current line/row/record, that is print $0
' infile > outfile
If the file has a header and you want to skip the first line, use FNR>1:
awk -v newdate="2018/08/29" 'BEGIN{FS=OFS="|"}FNR>1{ $5 = newdate }1' infile > outfile
If you want to replace the 5th column in the 1st row only, use FNR==1:
awk -v newdate="2018/08/29" 'BEGIN{FS=OFS="|"}FNR==1{ $5 = newdate }1' infile > outfile
If you still have a problem, frame your question with sample input and expected output so that it is easier to interpret.
Short sed solution:
sed -Ei '1s~\|[0-9]{4}/[0-9]{2}/[0-9]{2}\|~|2018/08/29|~' file
-i - modify the file in-place
1s - substitute only in the 1st(header) line
[0-9]{4}/[0-9]{2}/[0-9]{2} - date pattern
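To try the header-only awk variant without touching a real file, here is a small sketch. Only the header line comes from the question; the two data rows are made up to show that they pass through unchanged.

```shell
# Header from the question followed by two invented data rows.
result=$(printf '%s\n' 'A|B05|1|xxc|2018/06/29|AC23|SoOn' \
                       'r1|x|y|z|2018/06/29|k' \
                       'r2|x|y|z|2018/06/30|k' |
  awk -v newdate='2018/08/29' 'BEGIN { FS = OFS = "|" } FNR == 1 { $5 = newdate } 1')
printf '%s\n' "$result"
```

Setting OFS to | matters: assigning to $5 makes awk rebuild the line, and it joins the fields with OFS when it does.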

Appending whitespace to a variable in AWK script

I have an AWK script that receives an input variable from another script.
The length of the input variable is checked: if the length is 3, two spaces are added in front of the variable; if the length is 4, one space is added in front. I can compare the length, but I am not able to prepend the spaces.
I tried the following in AWK script
if (length(input_variable) == 3) {
input_variable = "  " input_variable
} else if (length(input_variable) == 4) {
input_variable = " " input_variable
}
print input_variable
Output: no value is printed. Please help.
You should use printf:
awk '{printf "%5s", $1}'
This pads with spaces on the left to the desired width, so there is no need to reinvent it.
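As a minimal sketch of the same idea applied to a variable rather than a field: the value is passed in with -v, as another script might do it, and the square brackets are only there to make the padding visible. The three test values are made up.

```shell
# input_variable is supplied from the shell via -v; the loop values are samples.
result=$(for v in abc abcd abcde; do
  awk -v input_variable="$v" 'BEGIN { printf "[%5s]\n", input_variable }'
done)
printf '%s\n' "$result"
```

This prints [  abc], [ abcd] and [abcde]. A value longer than 5 characters is not truncated by %5s; it is printed in full.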

Merging the rows in a file using awk

Can somebody explain the meaning of the below script please?
awk -F "," 's != $1 || NR ==1{s=$1;if(p){print p};p=$0;next}
{sub($1,"",$0);p=p""$0;}
END{print p}' file
The file has the following data:
2,"URL","website","aaa.com"
2,"abc","text","some text"
2,"Password","password","12345678"
3,"URL","website","10.0.10.75"
3,"uname","text","some text"
3,"Password","password","password"
3,"ulang","text","some text"
4,"URL","website","192.168.2.110"
4,"Login","login","admin"
4,"Password","password","blah-blah"
and the output is:
2,"URL","website","aaa.com","abc","text","some text","Password","password","12345678"
3,"URL","website","10.0.10.75","uname","text","some text","Password","password","password","ulang","text","some text"
An awk program has this structure:
pattern {action}
For your script, let's analyze the elements. The first pattern:
s != $1 || NR == 1 # if the variable s is not equal to first field
# or we're operating on first row
The first action:
s = $1 # assign variable s to first field
if (p) { # if p is not empty, print
print p
}
p = $0 # assign the current line to p variable
next # move to next, skip the rest
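The next pattern is missing, so its action applies to every row that reaches it (rows handled by the first action are skipped because of next):
sub($1, "", $0) # remove the first match of $1 (used as a regex) from the line, i.e. the leading key; not the same as $1="", which would rebuild $0 with OFS
p = ((p "") $0) # append the rest of the line to p
The last pattern is the special reserved word END; its action runs only after all rows have been consumed (its counterpart BEGIN runs before any input is read):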
END {
print p # print the value of p
}
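The same grouping logic can be sketched with longer variable names. This is an illustrative rewrite, not the original script: it swaps sub($1, "", $0) for substr, so the key is never interpreted as a regular expression, and the names key and merged are made up. The input is a shortened copy of the sample data.

```shell
result=$(printf '%s\n' '2,"URL","website","aaa.com"' \
                       '2,"abc","text","some text"' \
                       '3,"URL","website","10.0.10.75"' \
                       '3,"uname","text","some text"' |
  awk -F ',' '
    NR == 1 || $1 != key {        # first row, or the key in field 1 changed
      if (NR > 1) print merged    # flush the previous group
      key = $1
      merged = $0
      next
    }
    { merged = merged substr($0, length($1) + 1) }   # append everything after the key
    END { if (NR > 0) print merged }                 # flush the last group
  ')
printf '%s\n' "$result"
```

Each output line starts with the key once, followed by the remaining fields of every row that shares it.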

how to print second row as second column unix

How can I print every second row as a tab-delimited second column, as shown below? Thanks in advance.
input
wex
2
cr_1.b
4
output
wex 2
cr_1.b 4
Here's another option that doesn't depend on the length of lines:
awk '{ if (NR % 2 == 1) tmp=$0; else print tmp, $0; }' <filename>
If you really want a tab separator, use printf "%s\t%s\n",tmp,$0; instead.
Assuming you have no blank lines in your input file, this should do the trick (next stops the second rule from re-storing the line that was just printed):
awk 'length(f) > 0 { print f "\t" $0; f = ""; next } { f = $0 }' file
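Another way to pair the lines, shown here only as a sketch, is to print every line immediately and alternate the terminator: a tab after odd-numbered lines and a newline after even-numbered ones. The input is the sample from the question.

```shell
result=$(printf '%s\n' wex 2 cr_1.b 4 |
  awk '{ printf "%s%s", $0, (NR % 2 ? "\t" : "\n") }')   # odd line: tab, even line: newline
printf '%s\n' "$result"
```

This version needs no buffer variable. If the input has an odd number of lines, the last output line ends with a tab and no newline.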
