I've come across a bunch of files that I need to import into my database, and they use an awful time format:
A09:13:08C
I'm not even sure what it stands for.
Is there a fast way to use sed to replace the 'A' with a space and delete the 'C'?
sed -r 's/A(.*)C/ \1/' filename
This simply captures everything between the A and the C in a group and reuses it as \1.
A more careful expression would be:
sed -r 's/A([0-9:]+)C/ \1/'
Presumably, there is other data on the line, so using a casual .* is likely to mangle things. I'd use a rather verbose but restrictive pattern:
sed -e 's/A\([012][0-9]:[0-5][0-9]:[0-5][0-9]\)C/ \1/'
This looks for an A followed by a 24-hour clock time value and C, preserving the time portion. It would accept some invalid times (25-29 as the hour; indeed, 24:00:01 is not normally valid either, but 24:00:00 can be); it would be your judgement call whether it is worth refining these patterns (frankly, I doubt it, but it depends on how well you know your data).
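To see how the restrictive pattern behaves on a line that has other data around the time value, here is a small sketch. The sample line and the function name fix_time are made up for illustration; only the sed expression comes from the answer above.

```shell
# Hypothetical helper: rewrite A<hh:mm:ss>C as " <hh:mm:ss>" on each input line.
fix_time() {
    sed -e 's/A\([012][0-9]:[0-5][0-9]:[0-5][0-9]\)C/ \1/'
}

# Made-up sample line: the A in ALICE and the C in CA must survive.
printf '%s\n' 'ALICE,A09:13:08C,CA' | fix_time
# prints: ALICE, 09:13:08,CA
```

With the greedy A(.*)C version the match would run from the first A to the last C on the line, which is exactly the mangling the answer warns about.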
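If this is all that is in the file, then:
grep -o '[^AC]\+' file
If there are other fields, I would use gawk (the three-argument form of match() is a gawk extension):
awk '{match($1,/([^AC]+)/,x)}$1=x[1]' file
Here the time is assumed to be in field 1; for another field, use $N in place of $1, where N is the field number.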
This looks much simpler, as long as no other A or C appears in the file:
tr A ' ' < filename | tr -d C
I have a requirement in Unix. I have a file that contains the end time of a workflow, like this:
End Time:[Thu oct 05:12:12:12 2017]
I have to convert this to mm-dd-yyyy hh24:mi:ss and write it to another file.
Please let me know how I can do that.
Thanks,
Amit
Something like
grep input_filename -e "End Time:" | sed -E 's/End Time:\[\w{3}\s(\w{3})\s([0-9]{1,2}):([0-9]{1,2}):([0-9]{1,2}):([0-9]{1,2})\s([0-9]{4}).*$/\1-\2-\6 \3:\4:\5/' | sed 's/jan/01/ig; s/feb/02/ig; s/mar/03/ig; s/apr/04/ig; s/may/05/ig; s/jun/06/ig; s/jul/07/ig; s/aug/08/ig; s/sep/09/ig; s/oct/10/ig; s/nov/11/ig; s/dec/12/ig' > output_filename
This is broken into three commands. First is grep to extract the line with the time you need. Second is sed to rearrange the date into the output format, but note that the month is still a string. The third command is another sed call and this replaces the month name with a number.
Caveats:
the above regexes are rather strict and will fail on simple deviations
there is no need to split the regex work into two sed invocations; one could handle it
or, for that matter, the whole process could be handled more gracefully with python, perl, etc.
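As a rough sketch of the single-invocation idea from the caveats, the rearranging and the month lookup can live in one sed script. The function name end_time_to_numeric is made up, and the script assumes the exact "End Time:[Day mon dd:hh:mi:ss yyyy]" layout from the question, with the month written in lower case or with a capital first letter.

```shell
# Hypothetical helper: print each "End Time:" line from stdin as mm-dd-yyyy hh:mi:ss.
# Lines that do not fit the assumed layout are dropped.
end_time_to_numeric() {
    sed -n -E '
        s/.*End Time:\[[A-Za-z]{3} ([A-Za-z]{3}) ([0-9]{1,2}):([0-9]{2}):([0-9]{2}):([0-9]{2}) ([0-9]{4})\].*/\1-\2-\6 \3:\4:\5/
        t month
        d
        :month
        s/^[Jj]an/01/; s/^[Ff]eb/02/; s/^[Mm]ar/03/; s/^[Aa]pr/04/
        s/^[Mm]ay/05/; s/^[Jj]un/06/; s/^[Jj]ul/07/; s/^[Aa]ug/08/
        s/^[Ss]ep/09/; s/^[Oo]ct/10/; s/^[Nn]ov/11/; s/^[Dd]ec/12/
        p
    '
}

printf '%s\n' 'End Time:[Thu oct 05:12:12:12 2017]' | end_time_to_numeric
# prints: 10-05-2017 12:12:12
```

The t command jumps to the month lookup only when the first substitution succeeded, so the grep step is no longer needed. Usage would be something like end_time_to_numeric < input_filename > output_filename.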
I am new to Unix and currently I have a large file of various data. In this file there are lines that are now redundant and will need to be removed.
The lines to be removed follow this pattern:
<contact contact_id="<number>" txn="D">
</contact>
Edit: There are also lines that look similar to the ones to be removed, for example:
<contact contact_id="<number>" txn="N">
</contact>
I have attempted to use grep -A 1 to pick up the pattern and remove the next line, but I am on an old version of Solaris and -A is reported as an illegal option.
I have also tried sed -e '12442,+1d', which just gives the output:
sed: command garbled: 12442,+1d
Please can you help me with a new solution?
Use awk? Something like:
/<contact contact_id=.* txn="D">/ { got_contact = 1; next }
got_contact == 1 { got_contact = 0; next }
{ print }
even the ancient awk should be able to handle that. (There might be a more compact solution, but this isn't code golf)
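For anyone who wants to try the awk script before pointing it at the real file, here is a small sketch that wraps it in a shell function and feeds it made-up input. The name drop_deleted_contacts and the sample contact ids are only for illustration.

```shell
# Hypothetical wrapper around the awk script above; reads the file on stdin.
drop_deleted_contacts() {
    awk '
        /<contact contact_id=.* txn="D">/ { got_contact = 1; next }  # skip the txn="D" opening line
        got_contact == 1 { got_contact = 0; next }                    # skip the line right after it
        { print }                                                     # keep everything else
    '
}

# Made-up sample: the txn="D" pair should go, the txn="N" pair should stay.
printf '%s\n' \
    '<contact contact_id="1" txn="D">' \
    '</contact>' \
    '<contact contact_id="2" txn="N">' \
    '</contact>' | drop_deleted_contacts
```

Note that the script drops whatever line follows a txn="D" line, so it relies on </contact> always coming directly after it, as in the question.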
Can you use GNU sed ?
For those who want to write portable sed scripts, be aware that some implementations have been known to limit line lengths (for the pattern and hold spaces) to be no more than 4000 bytes. The POSIX standard specifies that conforming sed implementations shall support at least 8192 byte line lengths. GNU sed has no built-in limit on line length; as long as it can malloc() more (virtual) memory, you can feed or construct lines as long as you like.
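The following solution starts by converting the file into one long line, so it needs a sed without a line-length limit that also understands \r and \n, such as GNU sed: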
tr '\n' '\r' < your_file |
sed 's#<contact contact_id=[^ ]* txn="D">\r</contact>\r##g;
s#\r#\n#g'
I have a parameter name like
PAR="DBS_OUT"
and I have a text file (Repla.txt) with the values below:
DB_TECH
DB_ADMIN
DB_TERA
DB_APS
The values in the file can differ, but the parameter value will remain the same.
Now I have a Unix shell script in which I need to find all the values listed in the file (Repla.txt)
and replace them with the parameter (PAR). Since the values in Repla.txt are not fixed, I am not able to use a plain sed command such as:
sed 's/old/new/g' input.txt > output.txt
Can anyone please help me.
Thanks
I'm not sure I completely understand what you are trying to do, but if you want to use the values in Repla.txt as the strings to replace in other files, the following bash line will do it:
PAR="DBS_OUT"; for FIND in $(cat Repla.txt); do find /path/to/files -name 'test?.txt' -exec sed -i "s/$FIND/$PAR/g" '{}' \; ; done
It will replace the strings contained in Repla.txt with the string DBS_OUT in all files that match test?.txt in the dir (and subdirs) /path/to/files. You will need to understand how find works.
Also note that I am not telling sed to make a backup, so you probably want to try this on some test files before you run it for real. Hopefully you also have your scripts in source control, so it's not a big deal if you mess things up.
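A variation that avoids editing files in place is to turn the list into a sed script first and run it once over the input. This is only a sketch: the function name replace_listed_values is made up, and it assumes the list has one value per line, no empty lines, and no characters that are special to sed (such as / or &).

```shell
PAR="DBS_OUT"

# Hypothetical helper: replace every value listed in the file $1 with $PAR in the text on stdin.
replace_listed_values() {
    list_file=$1
    # Turn each line of the list into its own s/VALUE/$PAR/g command ...
    script=$(sed "s|.*|s/&/$PAR/g|" "$list_file")
    # ... and apply all of them in a single sed pass over stdin.
    sed -e "$script"
}

# Example use (writes the result to a new file instead of editing in place):
#   replace_listed_values Repla.txt < input.txt > output.txt
```

Because the result goes to a separate output file, the original stays untouched until you have checked the output.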
If the values to replace always consist of DB_ followed by capital letters only, one pattern covers them all:
sed 's/DB_[A-Z]*/DBS_OUT/g' input.txt > output.txt
or, to print the result to the terminal:
sed 's/DB_[A-Z]*/DBS_OUT/g' input.txt
I use grep to cut a big log file down to a small one, but the output still contains a long directory path that is the same on every line. I have to do a find-and-replace every time.
Isn't there a way to do something like grep -r "format" log.log | <find and replace>?
Sed will do what you want. Basic syntax to replace all the matches of foo with bar in-place in $file is:
sed -i 's/foo/bar/g' "$file"
If you're just wanting to delete rather than replace, simply leave out the 'bar' (so s/foo//g).
See this tutorial for a lot more detail, such as regex support.
sed -n '/match/s/pattern/repl/p'
This prints every line that matches the regex match and contains pattern, with the first instance of pattern on the line replaced by repl (add the g flag to replace all instances). Since your lines may contain paths, you will probably want to use a different delimiter. / is customary, but any other character works, for example #:
sed -n '\#match#s##repl#p'
In the second case, omitting pattern will cause match to be used for the pattern to be replaced.
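Applied to the log question, a sketch could look like the following. The function name strip_prefix, the path /var/log/app/ and the sample lines are all made up, and the prefix is assumed to contain no | and nothing that needs escaping in a regex.

```shell
# Hypothetical helper: keep only the lines that contain the prefix given in $1,
# and print them with that prefix removed.
strip_prefix() {
    # "|" is the delimiter, so the slashes in the path need no escaping.
    # The empty pattern in s||| reuses the address regex.
    sed -n "\\|$1|s|||p"
}

printf '%s\n' '/var/log/app/a.log: format error' 'unrelated line' |
    strip_prefix '/var/log/app/'
# prints: a.log: format error
```

This does the filtering and the find-and-replace in one step, so no separate grep is needed in front of it.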
I want to remove all lines except the line(s) containing matching pattern.
This is how I did it:
sed -n 's/matchingpattern/matchingpattern/p' file.txt
But I'm curious, because I am replacing the matching pattern with itself. That looks like a waste.
Is there a better way to do this?
sed '/pattern/!d' file.txt
But you're reinventing grep here.
grep is certainly better...because it's much faster.
e.g. using grep to extract all genome sequence data for chromosome 6 in a data set I'm working with:
$ time grep chr6 seq_file.in > temp.out
real 0m11.902s
user 0m9.564s
sys 0m1.912s
compared to sed:
$ time sed '/chr6/!d' seq_file.in > temp.out
real 0m21.217s
user 0m18.920s
sys 0m1.860s
I repeated it three times and got roughly the same values each time.
This might work for you:
sed -n '/matchingpattern/p' file.txt
/.../ is an address, which may have commands attached; in this case, p (print).
Instead of using sed, which is complicated, use grep.
grep matching_pattern file
This should give you the desired result.
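To convince yourself that the variants in this thread really select the same lines, here is a quick check on made-up data (the sample lines are invented; chr6 is borrowed from the timing example above):

```shell
# Made-up sample input.
sample=$(printf '%s\n' 'chr1 ACGT' 'chr6 TTGA' 'chr6 CCAT')

# Each of these prints only the two chr6 lines.
printf '%s\n' "$sample" | sed -n '/chr6/p'
printf '%s\n' "$sample" | sed '/chr6/!d'
printf '%s\n' "$sample" | grep 'chr6'
```

The only practical differences are readability and speed, which is why grep is the usual choice for plain filtering.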