extract string from rows using awk based on string value - unix

I have a text file with data as:
(832555,488012,0,17:31:32.541,2014-08-06 17:31:32.000,0,0,NULL,FBCD,"-6484620512517810993"etcetcetc
I want to extract the string post FBCD so my output should be:
FBCD,"-6484620512517810993"etcetc
I am able to find the position of FBCD using awk:
awk '{print substr("FBCD",1,200)}' file.txt
but I cannot extract the remaining values.

Using awk with substr:
awk '{print substr($0,index($0,"FBCD"),200)}' file
Using sed
sed -e 's/^.*\(FBCD.\{0,196\}\).*/\1/' file
If you want everything up to the end of the line:
awk '{print substr($0,index($0,"FBCD"))}' file
sed -e 's/^.*\(FBCD\)/\1/' file

Your code: substr("FBCD",1,200)
takes characters 1 to 200 of the string it is given. But you passed the literal "FBCD" as the input string, and that is only 4 characters long, which is why you got only FBCD back.
grep is built for extracting matches; would this help you?
grep -oE 'FBCD.{1,196}' file
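To see the two calls side by side, here is a minimal sketch. The sample line is shortened from the question, its trailing text is made up for illustration, and the variable names are hypothetical.

```shell
# Sample line modelled on the question; ",tail" is invented filler.
line='(832555,488012,0,NULL,FBCD,"-6484620512517810993",tail'

# substr() applied to the literal "FBCD" can only ever return "FBCD".
literal=$(printf '%s\n' "$line" | awk '{print substr("FBCD",1,200)}')

# index() locates FBCD inside the record; substr() returns the rest.
# The "if (i)" guard skips lines that do not contain FBCD at all.
rest=$(printf '%s\n' "$line" |
    awk '{ i = index($0, "FBCD"); if (i) print substr($0, i) }')

printf '%s\n%s\n' "$literal" "$rest"
```

Without the guard, index() returns 0 for a line that lacks FBCD and substr($0,0) prints the whole line, which is probably not what you want.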

Related

Unix command to parse string

I'm trying to figure out a command to parse the following file content:
Operation=GET
Type=HOME
Counters=CacheHit=0,Exception=1,Validated=0
I need to extract Exception=1 into its own line. I'm fiddling with awk, sed and grep but not making much progress. Does anyone have any tips on using any unix command to perform this?
Thanks
Since your file is close to bash syntax, there is a fun little trick you can use to make bash itself parse the file. First, use a program like tr to transform the input into something bash can parse, then "source" the result, which creates shell variables you can expand later to get the values.
source <(tr , $'\n' < file_name_goes_here)
echo $Exception
Many ways to do this. Here is one assuming the file is called "file.txt". Grab the line you want, replace everything from the start of the line up to Except with just Except, then pull out the first field using comma as the delimiter.
$ grep Exception file.txt | sed 's/.*Except/Except/g' | cut -d, -f 1
Exception=1
If you wanted to use gawk:
$ grep Exception file.txt | sed 's/.*Except/Except/g' | gawk -F, '{print $1}'
Exception=1
or just using grep and sed:
$ grep Exception file.txt | sed 's/.*\(Exception=[0-9]*\).*/\1/g'
Exception=1
or as @sheltter reminded me:
$ egrep -o "Exception=[0-9]+" file.txt
Exception=1
No need to use a mix of commands.
awk -F, 'NR==2 {print RS$1}' RS="Exception" file
Exception=1
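Here the input is split into records on the keyword we are looking for, using RS="Exception".
A second record exists only when the keyword is found. In that case,
print its first comma-separated field, prefixed with the record separator.
PS: This only works if there is exactly one Exception field. A multi-character RS is also an extension (supported by gawk and mawk, for example); POSIX leaves its behavior unspecified.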
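If a multi-character RS is not available, a field loop does the same job in plain awk. This is only a sketch: it assumes the counters always sit on a line starting with Counters=, and the temporary file and variable names are made up for the example.

```shell
# Recreate the sample file from the question in a temporary file.
tmp=$(mktemp)
printf '%s\n' 'Operation=GET' 'Type=HOME' \
    'Counters=CacheHit=0,Exception=1,Validated=0' > "$tmp"

# Strip the "Counters=" prefix, split on commas, and print every
# field that starts with "Exception=".
result=$(awk -F, '{
    sub(/^Counters=/, "")
    for (i = 1; i <= NF; i++)
        if ($i ~ /^Exception=/) print $i
}' "$tmp")

rm -f "$tmp"
printf '%s\n' "$result"
```

Unlike the RS approach, this prints one line per match, so it keeps working if several Exception fields appear.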

How to read nth line and mth field of text file in unix

Suppose I have a |-delimited file:
Line1: 1|2|3|4
Line2: 5|6|7|8
Line3: 9|9|1|0
Now I need to read the third field of the second line, which is 7 in the example above. How can I do that using cut or sed? I'm new to Unix, please help.
A job for awk:
awk -F '|' 'NR==2{print $3}' file
or
awk -F '|' -v row=2 -v col=3 'NR==row{print $col}' file
Output:
7
This should work:
sed -n '2p' file |awk -F '|' '{print $3}'
This might work for you (GNU sed):
sed -rn '2s/^(([^|]*)\|?){3}.*/\2/p' file
The -n option turns off automatic printing and the -r option enables extended regular expressions. Pattern matching and back references replace the whole of the second line with its third field, and the p flag prints the result.
The 2 address limits the substitution command to the second line only.
The outer group matches a run of non-delimiter characters followed by an optional delimiter, exactly three times. The inner (second) group holds only the non-delimiter characters. Each repetition overwrites the previous capture, so after three repetitions \2 holds the third field; the .* consumes the remainder of the line, so only the third field is printed.
N.B. no delimiter follows the final column, which is why the delimiter is optional: \|?
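The awk variant generalizes easily to any row, column and delimiter. The sketch below wraps it in a small shell function; field_at is a hypothetical name, and the sample data assumes the "Line1:" labels in the question are not part of the file.

```shell
# field_at ROW COL DELIM FILE  (hypothetical helper, not a standard tool)
# Prints field COL of line ROW, then stops reading the file.
field_at() {
    awk -F "$3" -v row="$1" -v col="$2" \
        'NR == row { print $col; exit }' "$4"
}

# Sample data from the question, without the "LineN:" labels.
tmp=$(mktemp)
printf '%s\n' '1|2|3|4' '5|6|7|8' '9|9|1|0' > "$tmp"

result=$(field_at 2 3 '|' "$tmp")
rm -f "$tmp"
printf '%s\n' "$result"
```

The exit is there so awk does not scan the rest of a large file after the requested line has been printed.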

Replace string pattern having space and line feeds using sed

Below I am setting a variable read from a file:
$ replace=$(cat file.txt)
and I am trying to use sed as below
$ sed -i 's|old|'$replace'|g'
However, I get the below error:
sed: -e expression #1, char 7: unterminated `s' command
Note - In my case the replace string is read from a file which has space and line feed like below.
file.txt
line 1
line 2
Can't sed handle patterns that have new lines?
Is it possible? Probably, perhaps with some non-portable GNU extensions. But that is not what sed is for (simple substitutions on individual lines), so don't do it with sed; just use awk:
awk 'NR==FNR{new = new $0 ORS; next} {sub(/old/,new)}1' file.txt targetFile
Note you don't need a shell variable with cat to do it but you can use one if you like:
replace=$(cat file.txt)
awk -v new="$replace" '{sub(/old/,new)}1' targetFile
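Here is a self-contained sketch of the two-file awk approach. The file contents are invented for the example, and it joins the replacement lines with a newline between them rather than after each one, so no trailing newline is inserted mid-line.

```shell
# Replacement text with a space and a line feed, as in the question.
repl=$(mktemp)
printf '%s\n' 'line 1' 'line 2' > "$repl"

# Hypothetical target file containing the word to replace.
target=$(mktemp)
printf '%s\n' 'before old after' > "$target"

# First file: build the replacement string in "new".
# Second file: substitute every "old" and print each line.
result=$(awk '
    NR == FNR { new = (FNR == 1) ? $0 : new "\n" $0; next }
    { gsub(/old/, new) }
    1
' "$repl" "$target")

rm -f "$repl" "$target"
printf '%s\n' "$result"
```

One caveat: an & in the replacement text stands for the matched string in sub() and gsub(), so replacement text containing & or backslashes would need escaping first.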

Remove all text starting from a specific character

I have a file setup like
TEXT1:TEXT2
Basically lines of text separated by a :
I would like all text to the right of the : gone,
so TEXT1:TEXT2 would turn into just TEXT1
Using cut
We tell cut that our field separator is a colon, -d:, and that we want to select the first field, -f1:
$ cut -d: -f1 file
TEXT1
Using sed
We tell sed to remove the first colon on a line and everything after:
$ sed 's/:.*//' file
TEXT1
Using grep
We tell grep to select the first part of each line, up to but not including the first colon:
$ grep -o '^[^:]*' file
TEXT1
awk -F: '{$0=$1}1' infile
TEXT1
This makes ":" the field delimiter and then sets the first column as the whole record.
The script below
awk -v FS=":" '{print $1}' file
would also give you the same result.
In awk, replace the first : and everything after it with nothing (lines without a : are not printed, because sub() returns 0 for them):
$ awk 'sub(/:.*/,"",$0)' test
TEXT1
Using sed
sed -i.bkp 's/:.*//' infile.txt
This also changes the file in place and creates a backup file named infile.txt.bkp
Using grep
grep -oP '.*(?=:)' infile.txt
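The answers above agree when a line holds a single colon, but they differ once there are several: the grep -P pattern is greedy and keeps everything up to the last colon, while cut and sed stop at the first. The sketch below shows both behaviors with a made-up three-part sample, and adds shell parameter expansion as an alternative when the text is already in a variable.

```shell
# Hypothetical sample with two colons.
sample='TEXT1:TEXT2:TEXT3'

# cut and sed both keep only what precedes the FIRST colon.
cut_first=$(printf '%s\n' "$sample" | cut -d: -f1)
sed_first=$(printf '%s\n' "$sample" | sed 's/:.*//')

# Parameter expansion needs no external process:
# %% removes the longest suffix matching ":*" (cut at the first colon),
# %  removes the shortest suffix matching ":*" (cut at the last colon).
pe_first=${sample%%:*}
pe_last=${sample%:*}

printf '%s\n' "$cut_first" "$sed_first" "$pe_first" "$pe_last"
```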

using sed -n with variables

I have a log file a.log and I need to extract a piece of information from it.
To locate the start and end line numbers of the pattern I am using the following:
start=$(sed -n '/1112/=' file9 | head -1)
end=$(sed -n '/true/=' file9 | head -1)
I need to use the variables (start, end) in the following command:
sed -n '16q;12,15p' orig-data-file > new-file
so that the above command appears something like:
sed -n '($end+1)q;$start,$end'p orig-data-file > new-file
I am unable to replace the line numbers with the variables. Please suggest the correct syntax.
Thanks,
Rosy
When I worked out how to do this, I was looking for a way to get the line number of the requested info in a file, and to display the file from that line to EOF.
This was my approach.
With
PATTERN="pattern"
INPUT_FILE="file1"
OUTPUT_FILE="file2"
the line number of the first match of $PATTERN in $INPUT_FILE can be retrieved with
LINE=$(grep -n "${PATTERN}" "${INPUT_FILE}" | awk -F':' '{ print $1 }' | head -n 1)
and the output file will hold the text from $LINE to EOF, like this:
sed -n "${LINE},\$p" "${INPUT_FILE}" > "${OUTPUT_FILE}"
The point here is how variables can be used with sed -n:
first without using variables
sed -n 'N,$p' <file name>
then using variables
LINE=<N>; sed -n ${LINE},\$p <file name>
Move the variables out of the single quotes. Single quotes turn off shell expansion of the string, and you need the shell to substitute the variables. sed cannot do arithmetic itself, so let the shell compute end+1 with $(( )):
sed -n "$((end+1))q;${start},${end}p" orig-data-file > new-file
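Putting the pieces together, here is a small end-to-end sketch. The five-line sample file is invented; only the patterns 1112 and true come from the question. The shell computes end+1 with arithmetic expansion, because sed has no arithmetic of its own.

```shell
# Hypothetical log: the start pattern is on line 2, the end pattern on line 4.
tmp=$(mktemp)
printf '%s\n' 'alpha' 'id 1112' 'beta' 'ok true' 'gamma' > "$tmp"

# Line numbers of the first match of each pattern.
start=$(sed -n '/1112/=' "$tmp" | head -n 1)
end=$(sed -n '/true/=' "$tmp" | head -n 1)

# Double quotes let the shell expand the variables; $(( )) does the
# arithmetic, so sed sees the script "5q;2,4p".
result=$(sed -n "$((end + 1))q;${start},${end}p" "$tmp")

rm -f "$tmp"
printf '%s\n' "$result"
```

The leading q command only stops sed from reading the rest of the file once the range has been printed; the range address alone would give the same output.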
