I am trying to remove the first and last characters from two separate columns prior to them being saved to a file. The characters I need to remove are the hyphens. Due to hyphens in the results, I am unable to just remove all of them. Is there a more effective way to use awk for this?
My current thinking is something similar to this command:
cat file.txt | awk -F '|' '{print $2, $4}' | sed 's/.//;s/.$//' > newfile.txt
file example
1-|-40939-23-|-column-3-|-column-4-|
2-|-9832651-23-|-column-3-|-column-4-|
current output
40939-23- -column-4
9832651-23- -column-4
desired output
40939-23 column-4
9832651-23 column-4
$ awk -F'-[|](-|$)' '{print $2, $4}' file
40939-23 column-4
9832651-23 column-4
Could you please try the following and let me know if it helps?
awk -F"|" '{gsub(/^-|-$/,"",$2);gsub(/^-|-$/,"",$(NF-1));print $2,$(NF-1)}' Input_file
2nd solution: use fixed field numbers, assuming your Input_file always has the same layout.
awk 'BEGIN{FS="[-|]";OFS="-"}{print $4 OFS $5 " " $12 OFS $13}' Input_file
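The same trimming can also be wrapped in a small helper, which leaves the hyphens inside the values alone. This is only a sketch: the function name strip is made up for illustration, and the sample lines are the ones from the question.

```shell
# strip() is a hypothetical helper name, not something awk provides.
result=$(printf '%s\n' \
  '1-|-40939-23-|-column-3-|-column-4-|' \
  '2-|-9832651-23-|-column-3-|-column-4-|' |
awk -F'|' '
  # Remove at most one leading and one trailing hyphen from s.
  function strip(s) { sub(/^-/, "", s); sub(/-$/, "", s); return s }
  { print strip($2), strip($4) }
')
printf '%s\n' "$result"
```

Because sub() is anchored with ^ and $, only the outer hyphens are touched.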
Related
I am reading a file and writing the first 2 columns into an output file.
I want to write the output with "|" as the column separator.
I tried this:
awk -F"," -OFS"|" '{print $1 , $2}' filename
The output file doesn't have | separator
Thanks
Pratik
Yes, it will not print the separator, since OFS was not set properly. Following are the ways to set OFS in an awk program.
1st way: Use -v OFS="|" to set it as a variable.
awk -F"," -v OFS="|" '{print $1,$2}' filename
2nd way: Set it in the BEGIN section of awk (which is recommended too).
awk 'BEGIN{FS=",";OFS="|"}{print $1,$2}' filename
3rd way: As per ghoti's comment, here is one more way of assigning OFS. It can be assigned on the command line before the Input_file names. This also lets us set different OFS values for different Input_file(s), which helps because awk can read multiple files in one run. For example:
awk '{print $1,$2}' FS="," OFS="|" Input_file1 FS=":" OFS=";" Input_file2
In the above command, FS is , and OFS is | for Input_file1, while FS is : and OFS is ; for Input_file2. Thanks to ghoti for mentioning this in the comments :)
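A quick way to check the per-file behaviour is to run it on two throwaway files. This is a sketch only: the file contents and the temporary directory are assumptions made for the demo.

```shell
# Throwaway sample files; their contents are made up for this demo.
workdir=$(mktemp -d)
printf 'a,b,c\n' > "$workdir/Input_file1"
printf 'x:y:z\n' > "$workdir/Input_file2"

# Each FS=/OFS= assignment takes effect just before the next file is read.
per_file=$(awk '{print $1,$2}' \
  FS="," OFS="|" "$workdir/Input_file1" \
  FS=":" OFS=";" "$workdir/Input_file2")
printf '%s\n' "$per_file"
rm -rf "$workdir"
```

The first line should come out joined with | and the second with ;.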
I have to write an awk command to find the number of delimiters in a file.
I tried this:
awk -F '|' 'NF >0 {print NR, $0} ' file_name
but it's not working.
If you want to know the total number of fields in a line, you could use steffen's code. If you need the number of delimiters in the current line, you could use the following.
awk -F'|' 'NF{print (NF-1), $0} ' Input_file
NR is the number of records; you're looking for NF, the number of fields:
awk -F '|' 'NF >0 {print NF, $0} ' file_name
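If the goal is the delimiter count itself rather than the field count, another option is to let gsub() do the counting, since it returns the number of matches it replaced. A minimal sketch on made-up input, which also keeps a running total for the whole file:

```shell
report=$(printf '%s\n' 'a|b|c' 'x' '' 'p|q' |
awk '
  # Replacing each | with itself ("&") leaves the line unchanged;
  # the return value of gsub() is the number of delimiters found.
  { n = gsub(/[|]/, "&"); total += n; print NR ": " n }
  END { print "total: " total }
')
printf '%s\n' "$report"
```

Unlike NF-1, this reports 0 rather than -1 for an empty line.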
I want to combine these two commands and invoke them as a single command.
In the first command, I store the 4th column of the x.csv file (separator: ,) in the z.csv file.
awk -F, '{print $4}' x.CSV > z.csv
In the second command, I want to find the unique first-column values of the z.csv file (separator: space).
awk -F' ' '{print $1}' z.csv|sort|uniq
How can I combine these two commands into a single command?
Pipe the output of the first awk to the second awk:
awk -F, '{print $4}' x.CSV | awk -F' ' '{print $1}' |sort|uniq
or, as Avinash Raj suggested,
awk -F, '{print $4}' x.CSV | awk -F' ' '{print $1}' | sort -u
Assuming that the content of z.csv is actually wanted, rather than just an artefact of the way you're currently implementing your program, then you can use:
awk -F, '{ print $4 > "z.csv"
           split($4, f, " ")
           f4[f[1]] = 1
         }
         END { for (i in f4) print i }' x.CSV
The split function breaks field 4 on spaces, and (associative) array f4 records the key value. The loop at the end prints out the distinct values, unsorted. If you need them sorted, you can either use GNU awk's built-in sort functions or (if you don't have an awk with built-in sort functions) write your own in awk, or pipe the output to sort.
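With GNU awk, you can replace the END block with the following. asorti() stores the sorted keys as values under the indices 1..n, so the loop has to walk those indices in order:
END { n = asorti(f4); for (i = 1; i <= n; i++) print f4[i] }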
If you don't want the z.csv file, then (a) you could have used a pipe in the first place, and (b) you can simply remove the print $4 > "z.csv" line.
awk '{split($4,b," "); a[b[1]]=1} END { for( i in a) print i }' FS=, x.CSV
This does not sort the data, but it's not clear if you actually want it sorted or merely needed that to get unique entries. If you do want it sorted, pipe it to sort.
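If the order of first appearance matters more than sorting, the common !seen[key]++ idiom prints each key the first time it shows up. A sketch with made-up rows; the array names seen and words are just conventions chosen here:

```shell
unique=$(printf '%s\n' \
  'a,b,c,red apple' \
  'a,b,c,green pear' \
  'a,b,c,red plum' |
awk -F, '
  {
    split($4, words, " ")        # break field 4 on spaces
    if (!seen[words[1]]++)       # true only on the first occurrence
      print words[1]
  }
')
printf '%s\n' "$unique"
```

This avoids the END block entirely, at the cost of not sorting.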
I am new to awk. I know it can list the contents of a text file based on a condition, but I have no idea how to list the columns when there is a "," between them. How do you get the text before the "," counted as $1?
When I try to print the email column, the email doesn't show for some reason. I am thinking maybe I should include the ","? I am not sure how to solve the problem, and I don't know what the problem is.
For example, to show customerid and customersname, I use:
awk '{print $1,$2}'
Customerid, customersname, email
12312322, MIKE, example#gmail.com
51231221, CALVIN, example2#gmail.com
91234232, LISA, example3#gmail.com
12359432, DICK, example4#gmail.com
94123432, ORAN, example5#gmail.com
63242333, KEVIN, example6#gmail.com
You want to use the comma as the separator? Use -F like this:
awk -F, '{print $1,$2}'
If you want a comma followed by optional whitespace as the separator, you can use a regex:
awk -F',[[:space:]]*' '{print $1,$2}'
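To see what the regex separator changes, the two variants can be compared on one row from the question. The brackets in the output are only there to make a leading space visible.

```shell
row='12312322, MIKE, example#gmail.com'

# Comma only: the space after each comma stays in the next field.
plain=$(printf '%s\n' "$row" | awk -F, '{ print "[" $2 "]" }')

# Comma plus optional whitespace: the fields come out trimmed.
trimmed=$(printf '%s\n' "$row" |
  awk -F',[[:space:]]*' '{ print "[" $2 "][" $3 "]" }')

printf '%s\n%s\n' "$plain" "$trimmed"
```

With the regex separator, the third field holds the email without any leading space.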
I'm not sure whether I understood your question properly. You can specify the input field separator using the -F command line option:
awk -F, '{print $1, $2}' your.csv
Output (each second field keeps the space that follows the comma):
Customerid  customersname
12312322  MIKE
51231221  CALVIN
91234232  LISA
12359432  DICK
94123432  ORAN
63242333  KEVIN
simply using FS:
awk 'BEGIN { FS="," } {print $1,$2}'
from man awk:
7. Builtin-variables
The following variables are built-in and initialized before program execution.
...
FS splits records into fields as a regular expression.
...
Here is the code needed
awk -F "," '{print $1,$2}' input.txt
Output (the commas are consumed as separators; the second field keeps its leading space):
Customerid  customersname
12312322  MIKE
51231221  CALVIN
91234232  LISA
12359432  DICK
94123432  ORAN
63242333  KEVIN
Explanation:
-F = Field separator
"," = using comma because columns are separated by ,
'{print $1,$2}' = display first and second column
input.txt = the file you want to pass
Hope this helps.
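If the output should stay comma-separated, setting the input separator alone is not enough, because print joins fields with OFS, which defaults to a single space. A minimal sketch, assuming a comma followed by optional spaces on input and ", " on output:

```shell
csv_out=$(printf '%s\n' \
  'Customerid, customersname, email' \
  '12312322, MIKE, example#gmail.com' |
awk -F', *' -v OFS=', ' '{ print $1, $2 }')
printf '%s\n' "$csv_out"
```

Here -F strips the comma and any spaces after it, and OFS puts a comma and one space back between the printed fields.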
I need help with awk. I am reading a CSV file and doing some substitution on some of the columns. The 9th column (a string) should be replaced by the value of the 9th column itself plus the value of the 4th column (an integer), then the 15th column by $15+$12, and the 26th column by $26+$23. The same has to be done line by line for all the records. Suggestions, please.
Below is the sample I/O. The first line, which is the description, must be left as is.
sample Input
EmpID|Empname|Empadd|roleId|roleDesc|Dept
100|mst|Del|20|SD|DA
101|ms|Del|21|XS|DA
Sample output
EmpID|Empname|Empadd|roleId|roleDesc|Dept
100|mst100|Del|20|SD20|DA
101|ms101|Del|21|XS21|DA
Here empname has been concatenated with empid, and roleDesc with roleId. Hope that's helpful :)
This will perform the needed transformation:
$ awk 'NR>1{$2=$2$1;$5=$5$4}1' FS='|' OFS='|' file
EmpID|Empname|Empadd|roleId|roleDesc|Dept
100|mst100|Del|20|SD20|DA
101|ms101|Del|21|XS21|DA
If you have to do this for many columns, you can use a for loop like so (provided the column numbers follow an arithmetic or geometric step):
$ awk 'NR>1{for(i=2;i<=5;i+=3)$i=$i$(i-1)}1' FS='|' OFS='|' file
EmpID|Empname|Empadd|roleId|roleDesc|Dept
100|mst100|Del|20|SD20|DA
101|ms101|Del|21|XS21|DA
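When the columns do not follow a regular step, as with 9/4, 15/12 and 26/23 in the question, one option is to pass the column pairs in as a variable. This is a sketch only: the variable name pairs and its target:source format are assumptions made for this example.

```shell
updated=$(printf '%s\n' \
  'EmpID|Empname|Empadd|roleId|roleDesc|Dept' \
  '100|mst|Del|20|SD|DA' \
  '101|ms|Del|21|XS|DA' |
awk -v pairs='2:1,5:4' '
  BEGIN {
    FS = OFS = "|"
    n = split(pairs, p, ",")           # "2:1" and "5:4"
    for (k = 1; k <= n; k++) {
      split(p[k], q, ":")
      dst[k] = q[1]; src[k] = q[2]     # target column, source column
    }
  }
  # Skip the header, then append each source column to its target.
  NR > 1 { for (k = 1; k <= n; k++) $(dst[k]) = $(dst[k]) $(src[k]) }
  1
')
printf '%s\n' "$updated"
```

For the columns in the question, the value would presumably be pairs='9:4,15:12,26:23'.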
When you say +, I'm assuming you mean string concatenation. In awk, there is no specific concatenation operator; you just put two strings side by side.
awk -F, -v OFS=, '{$9 = $9 $4; $15=$15$12; $26=$26$23; print}' file.csv
Also assuming that by "csv", you actually mean comma-separated.
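The difference between the two readings of + is easy to check on a single made-up record: juxtaposition concatenates, while + converts both operands to numbers first, so a non-numeric string counts as 0.

```shell
both=$(printf '20,SD\n' | awk -F, '{
  print $2 $1      # concatenation
  print $2 + $1    # numeric addition: "SD" is treated as 0
}')
printf '%s\n' "$both"
```

So if a literal sum were wanted instead, $9 = $9 + $4 would be the form to use.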
If you want to edit the file in-place, you need to do this:
awk ... file.csv > newfile && mv file.csv file.csv.bak && mv newfile file.csv
Edit: to leave the first line untouched:
awk -F, -v OFS=, 'NR>1 {$9 = $9 $4; $15=$15$12; $26=$26$23} {print}' file.csv
Now the columns are modified for the 2nd and subsequent lines, but every line is printed.
You'll sometimes see that written this way:
awk -F, -v OFS=, 'NR>1 {$9 = $9 $4; $15=$15$12; $26=$26$23} 1' file.csv
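Putting the in-place recipe and the header-preserving program together, a run on a throwaway file might look like this. The two-column sample and the temporary directory are assumptions for the demo, and column 2 stands in for the real columns 9, 15 and 26.

```shell
# Made-up sample data in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' 'id,name' '100,mst' '101,ms' > "$workdir/file.csv"

# Write to a new file first; the original is replaced only if awk succeeded,
# and the untouched copy is kept as file.csv.bak.
awk -F, -v OFS=, 'NR>1 { $2 = $2 $1 } 1' "$workdir/file.csv" > "$workdir/newfile" &&
  mv "$workdir/file.csv" "$workdir/file.csv.bak" &&
  mv "$workdir/newfile" "$workdir/file.csv"

edited=$(cat "$workdir/file.csv")
backup=$(cat "$workdir/file.csv.bak")
printf '%s\n' "$edited"
rm -rf "$workdir"
```

Chaining with && means a failed awk run leaves file.csv as it was.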