Formatting file rows using UNIX commands

I have a file whose contents are:
file1.txt|file2.txt|file2.txt|.............................fileN.txt
log1.txt|log2.txt|log3.txt|................logN.txt
I want to print it from right to left for each row as:
Output:
fileN.txt|fileN-1.txt|fileN-2.txt|.............................file1.txt
logN.txt|logN-1.txt|logN-2.txt|................log1.txt
Please help or let me know if it is not clear.

Here's one way using awk:
awk -F "|" '{ for (i=NF;i>=1;i--) printf "%s", $i (i==1 ? "\n" : FS) }' file
Testing:
Contents of file:
file1.txt|file2.txt|file3.txt|file4.txt|file5.txt|file6.txt
log1.txt|log2.txt|log3.txt|log4.txt|log5.txt
Results:
file6.txt|file5.txt|file4.txt|file3.txt|file2.txt|file1.txt
log5.txt|log4.txt|log3.txt|log2.txt|log1.txt

You can write a simple Perl script: as you read each line of the file, split it on the "|" separator into an array, then write the array out again in reverse order.
I hope the algorithm is clear.
Regards,
Vinay

If you prefer Python (3):
with open("file.txt") as f:
    for line in f:
        print("|".join(line.rstrip("\n").split("|")[::-1]))

Related

read parameter from a url

I am trying to read a parameter from a URL. I can do it for a single line, but I don't know how to loop in awk. Can someone help?
I have a file with 1000+ entries like:
http://projectreporter.nih.gov/project_info_details.cfm?aid=7714687&icde=0
http://projectreporter.nih.gov/project_info_description.cfm?aid=7896503&icde=0
http://projectreporter.nih.gov/project_info_details.cfm?aid=7895320&icde=0
http://projectreporter.nih.gov/project_info_details.cfm?aid=2675186&icde=9195637
I am trying to retrieve only "aid=xxxxxxx". I used the following command, but I only get the "aid" from the last line:
awk '{match($0,"aid=([^ &]+)",a)}END{print a[1]}' file1.txt > outputFile.txt
How do I do the same in a loop so I can get all the occurrences?
Any help would be appreciated.
This should work; it's a little fine-tuning of your attempted code.
awk 'match($0,/aid[^&]*/){print substr($0,RSTART,RLENGTH)}' Input_file
In case a single line can have multiple occurrences of aid and you want to print them all, try the following.
awk '
{
  while (match($0, /aid[^&]*/)) {
    print substr($0, RSTART, RLENGTH)
    $0 = substr($0, RSTART+RLENGTH)
  }
}
' Input_file
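As a side note, if grep with the -o option is available (GNU and BSD grep have it, though it is not strictly POSIX), it gives the same one-match-per-line output without a loop:

```shell
# Print every "aid=..." match, one per line (sketch).
printf 'http://example/p.cfm?aid=7714687&icde=0\n' |
grep -o 'aid=[^&]*'
# prints: aid=7714687
```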

Awk command to perform action on lines excluding 1st and last

I have multiple MS excel files in csv format in a particular directory.
I want to update the value of one particular column in all the rows of the csv files.
Also, the action should not be applied to the first and last lines.
So far I have come up with the below code:
awk -F, 'NR>2{$2=300;}1' OFS=, test.csv
But I am facing difficulty excluding the last line.
I also need to perform the same change for all the files in the directory.
So far my attempts to replace that string value using awk have not succeeded.
This may do it:
awk -F, 't{print t} {a=t=$0} NR>1{$2=300;t=$0} END {print a}' OFS=, test.csv
$ cat file
1,a,b
2,c,d
3,e,f
$ awk 'BEGIN{FS=OFS=","} NR>1{print (NR>2 ? chgd : orig)} {orig=$0; $2=300; chgd=$0} END{print orig}' file
1,a,b
2,300,d
3,e,f
You could simplify the script a bit by reading the file twice:
awk 'BEGIN{FS=OFS=","} NR==FNR {c=NR;next} !(FNR==1||FNR==c){$2=200} 1' file file
This uses the NR==FNR section merely to count lines, giving you a simple expression for determining whether to update the field in question.
And if you have GNU awk available, you might save a few CPU cycles by not reassigning the c variable for every line, using something like this:
gawk 'BEGIN{FS=OFS=","} ENDFILE {c=FNR} NR==FNR{next} !(FNR==1||FNR==c){$2=200} 1' file file
This still reads the file twice, but assigns c only after each file is read.
If you want, you can emulate the ENDFILE condition in non-GNU awk using NR>FNR && FNR==1 if you only have two files, then set c=NR-1. It won't perform as well.
I haven't tested the speed difference between these two, but I suspect it would be negligible except in cases of truly obscenely large files.
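The non-GNU emulation just mentioned can be sketched as follows: on the first line of the second pass, NR-1 equals the first pass's line count, so c ends up holding the last line number.

```shell
# Emulate gawk's ENDFILE for the two-pass trick in any awk (sketch).
printf '1,a,b\n2,c,d\n3,e,f\n' > test.csv
awk 'BEGIN{FS=OFS=","}
     NR>FNR && FNR==1 {c=NR-1}
     NR==FNR {next}
     !(FNR==1 || FNR==c) {$2=200}
     1' test.csv test.csv
# prints:
# 1,a,b
# 2,200,d
# 3,e,f
```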
Thanks all,
I got it to work. Below is the command:
awk -v sq="" -F, 't{print t} {a=t=$0} NR>2{$3=sq"ops_data"sq;t=$0} END {print a}' OFS=, test1.csv
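To cover the "all the files in the directory" part of the question, the edit can be wrapped in a shell loop (sketch, using the earlier answer's command): write to a temporary file first, then replace the original only if awk succeeded.

```shell
# Apply the first/last-line-preserving edit to every .csv here (sketch).
for f in *.csv; do
  [ -e "$f" ] || continue   # skip if the glob matched nothing
  awk -F, 't{print t} {a=t=$0} NR>1{$2=300;t=$0} END{print a}' OFS=, "$f" > "$f.tmp" &&
  mv "$f.tmp" "$f"
done
```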

How to read a value from recursive xml attribute in Unix using sed/awk/grep only

I have a config.xml from which I need to retrieve the value at the XPath
/domain/server/name
I can only use grep/sed/awk. Need help.
The content of the xml is below; I need to retrieve the server names only.
<domain>
<server>
<name>AdminServer</name>
<port>1234</port>
</server>
<server>
<name>M1Server</name>
<port>5678</port>
</server>
<machine>
<name>machine01</name>
</machine>
<machine>
<name>machine02</name>
</machine>
</domain>
The output should be :
AdminServer
M1Server
I tried to do,
sed -ne '/<\/name>/ { s/<[^>]*>(.*)<\/name>/\1/; p }' config.xml
sed is only for simple substitutions on individual lines; doing anything else with sed is strictly for mental exercise, not for real code. That's not what you are trying to do, so you shouldn't even be considering sed. Just use awk:
$ awk -F'[<>]' 'p=="server" && $2=="name"{print $3} {p=$2}' file
AdminServer
M1Server
That will work with any awk on any UNIX box. If that's not all you need then edit your question to provide more truly representative sample input and expected output.
Try this command. Name your xml file and supply it as the input.
awk '/<server>/,/<\/server>/' < name.xml | grep "name" | cut -d ">" -f2 | cut -d "<" -f1
Output:
AdminServer
M1Server
Based on your sample Input_file shown, could you please try the following.
awk -F"[><]" '/<\/server>/{a="";next} /<server>/{a=1;next} a && /<name>/{print $3}' Input_file
sed -n '/<server>/{n;s/\s*<[^>]*>//gp}' config.xml
For example, for the first match:
1. /<server>/
matches the line that contains "<server>".
2. n
The "n" command reads the next line; after "n" the pattern space holds "  <name>AdminServer</name>".
3. s/\s*<[^>]*>//gp
replaces every match of "\s*<[^>]*>" with nothing, then prints the pattern space.
Type "info sed" for more on sed commands.
You can get the desired output with just sed:
sed -n 's:.*<name>\(.*\)</name>.*:\1:p' config.xml
I feel dirty parsing XML in awk.
The following finds the correct depth of entry with the right tag name. It does not verify the path, though it depends on the elements you specified. While this works on your example data, it makes certain ugly assumptions and it's not guaranteed to work elsewhere:
awk -F'[<>]' '$2~/^(domain|server|name)$/{n++} $1~/\// {n--} n==3&&$2=="name"{print $3}' input.xml
A better solution would be to parse the XML itself.
$ awk -F'[<>]' -v check="domain.server.name" '$2~/^[a-z]/ { path=path "." $2; closex="</"$2">" } $0~closex { sub(/\.[^.]$/,"",path) } substr(path,2)==check {print path " = " $3}' input.xml
.domain.server.name = AdminServer
Here it is split out for easier commenting.
$ awk -F'[<>]' -v check="domain.server.name" '
# Split fields around pointy brackets. Supply a path to check.
$2~/^[a-z]/ { # If we see an open tag,
path=path "." $2 # append the current tag to our path,
closex="</"$2">" # compose a close tag which we'll check later.
}
$0~closex { # If we see a close tag,
sub(/\.[^.]$/,"",path) # truncate the path.
}
substr(path,2)==check { # If we match the given path,
print path " = " $3 # print the result.
}
' input.xml
Note that this solution barfs horribly if you feed it badly formatted XML. The recognition of tags could be improved, but may be sufficient if you have consistently formatted XML. It may barf horribly for other reasons too. Do not do this. Install the correct tools to parse XML properly.
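For comparison, with a real XML parser (outside the question's grep/sed/awk constraint) the XPath can be evaluated directly. This sketch assumes xmllint from libxml2 is installed; note that older xmllint versions print adjacent text() results run together without separators:

```shell
# Evaluate the XPath with a proper parser (sketch; requires xmllint).
xmllint --xpath '/domain/server/name/text()' config.xml
```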

How to extract this numerical value from the text file in unix

I want to extract the value of VALUE_ID in the below text and store it in a variable.
MSG : SUCCESS! ABCDEFGHIJK
VALUE_ID: 775
Please note that there is a space after : in VALUE_ID.
Can we use awk for this or is there any easier way?
Here is a possible solution:
awk '$1 == "VALUE_ID:" {id=$2}' input_file
On its own this just stores the value inside awk, which seems fairly pointless to me. If you describe your needs more precisely, then I could help you better.
With awk:
var=$(awk '$1 == "VALUE_ID:" {print $2}' File)
Inside the awk script, we check whether the first field of the line is VALUE_ID:. If so, we print the second field, which is separated from it by a space. The output is saved in the bash variable var, which will contain 775.
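An equivalent without awk, using sed (sketch; assumes the literal prefix "VALUE_ID: " with exactly one space):

```shell
# Strip the "VALUE_ID: " prefix and print only matching lines (sketch).
var=$(printf 'MSG : SUCCESS! ABCDEFGHIJK\nVALUE_ID: 775\n' |
      sed -n 's/^VALUE_ID: //p')
echo "$var"
# prints: 775
```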

How do you split a file based on a token?

Let's say you have a file containing texts (from 1 to N) separated by a $.
How can I split the file so the end result is N files?
text1 with newlines $
text2 $etc... $
textN
I'm thinking of something with awk or sed, but is there an available Unix app that already performs that kind of task?
awk 'BEGIN{RS="$"; ORS=""} { textNumber++; print $0 > "text"textNumber".out" }' fileName
Thanks to Bill Karwin for the idea.
Edit: Added ORS="" to avoid printing a newline at the end of each file.
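A quick sanity check of the one-liner (sketch): three $-separated chunks should yield three output files. Note that records after the first keep the newline that followed the previous $.

```shell
# Demonstrate the RS="$" splitter on a three-chunk sample (sketch).
printf 'text1 $\ntext2 $\ntext3\n' > fileName
awk 'BEGIN{RS="$"; ORS=""} { textNumber++; print $0 > "text"textNumber".out" }' fileName
ls text*.out
# creates: text1.out text2.out text3.out
```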
Maybe split -p pattern?
Hmm. That may not be exactly what you want. It doesn't split a line, it only starts a new file when it sees the pattern. And it seems to be supported only on BSD-related systems.
You could use something like:
awk 'BEGIN {RS = "$"} { ... }'
edit: You might find some inspiration for the { ... } part here:
http://www.gnu.org/manual/gawk/html_node/Split-Program.html
edit: Thanks to comment from dmckee, but csplit also seems to copy the whole line on which the pattern occurs.
If I'm reading this right, the UNIX cut command can be used for this.
cut -d $ -f 1- filename
I might have the syntax slightly off, but that should tell cut that you're using $ separated fields and to return fields 1 through the end.
You may need to escape the $.
awk -vRS="$" '{ print $0 > "text"t++".out" }' ORS="" file
The split command splits by line count or byte size, but the csplit command will allow you to split files based on regular expressions as well.
