Replacing all occurrences of a string within a file - unix

I'm producing many files with numeric fields. Every field ends with a '.', like 1234.
I need to replace all occurrences of '.,' with ',' in all the files.

Assuming all the files are in the same directory, and you're using a new-ish GNU sed that supports the -i (in-place) option, you can do
cd /path/to/dataDir
for f in * ; do
sed -i 's/\([0-9]\)\.,/\1,/g' "$f"
done
If you're using Mac OS X, whose BSD sed requires an argument to -i, you can either supply a backup-file extension like
sed -i".bak" ....
or request overwriting with no backup by passing an empty extension as a separate argument:
sed -i "" ....
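If the same script has to run on both Linux and Mac OS X, one option is to hide the difference behind a small wrapper. This is only a sketch: sed_i is a made-up name, and the check relies on the assumption that GNU sed accepts --version while BSD sed rejects it. It does not help on systems whose sed has no -i at all.

```shell
#!/bin/sh
# sed_i SCRIPT FILE...   (hypothetical wrapper, not part of sed)
# Edits the files in place without keeping a backup.
sed_i() {
    if sed --version >/dev/null 2>&1; then
        # Assumed to be GNU sed: -i takes an optional, attached suffix.
        sed -i "$@"
    else
        # Assumed to be BSD/macOS sed: -i needs a separate (here empty) suffix.
        sed -i '' "$@"
    fi
}
```

Usage would look like sed_i 's/\([0-9]\)\.,/\1,/g' "$f" inside the loop.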
If you're in a vendor-supplied Unix environment whose sed lacks -i, you may need to manage the output yourself. Then you can replace the body of the loop with
sed 's/\([0-9]\)\.,/\1,/g' "$f" > "$f".new && /bin/mv "$f".new "$f"
IHTH

Related

Combine two sed commands in one line

I'm looking for an in place command to change all file lines which end with :.:.
From
chr01 1453173 . C T 655.85 PASS . GT:AD:DP:PGT:PID 0/1:25,29:54:.:.
To
chr01 1453173 . C T 655.85 PASS . GT:AD:DP 0/1:25,29:54
In words, I'm basically deleting :PGT:PID and :.:. from any line ending with :.:.
With GNU sed and Solaris sed:
sed '/:\.:\.$/{s///;s/:PGT:PID//;}' file
If you want to edit your file with GNU sed "in place" use option -i.
With awk that'd be:
awk 'sub(/:\.:\.$/,""){sub(/:PGT:PID/,"")} 1' file
chr01 1453173 . C T 655.85 PASS . GT:AD:DP 0/1:25,29:54
For in-place editing with gawk you can add the -i inplace option; with any awk you can append > tmp && mv tmp file.
You are looking for something like:
sed -i.bak "/^.*:\.:\.$/ {s/:PGT:PID//g; s/:\.:\.//g;}" file
This edits file in place and keeps a backup as file.bak.
The address /^.*:\.:\.$/ restricts the s commands to lines ending in :.:. (the dots are escaped because . is a special character in regular expressions).
Each s command replaces the matched string with the empty string.
As a one-liner:
sed -i -e '/:\.:\.$/!b' -e 's///' -e 's/:PGT:PID//g' "$file"
Expanded:
/:\.:\.$/!b # skip the remaining commands unless the line ends in ":.:."
s/// # replace the matched ":.:." with nothing
s/:PGT:PID//g # replace ":PGT:PID" with nothing, everywhere
We use the -i flag to perform in-place edits. We pass each command as a separate -e expression for portability: some sed implementations allow several commands to be concatenated with ;, but that is not required by the POSIX standard.
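Before running any of these variants with -i, it can be worth checking the rule on a couple of lines first. The following is a minimal sketch that writes to standard output only; strip_trailing_fields is a hypothetical name, and the second sample line is made up purely to show a line that must stay unchanged.

```shell
#!/bin/sh
# strip_trailing_fields [FILE...]   (hypothetical helper)
# On lines ending in ":.:.", drop that suffix and the first ":PGT:PID".
strip_trailing_fields() {
    sed '/:\.:\.$/{
        s///
        s/:PGT:PID//
    }' "$@"
}

# First line comes from the question; the second is invented for illustration.
printf '%s\n' \
    'chr01 1453173 . C T 655.85 PASS . GT:AD:DP:PGT:PID 0/1:25,29:54:.:.' \
    'chr01 1453999 . A G 100.00 PASS . GT:AD:DP:PGT:PID 0/1:10,12:22:x:y' |
    strip_trailing_fields
```

The empty pattern in s/// reuses the address regex, so the suffix is only written once.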

Unix command to find and replace a string and to list all the files where the string has been replaced

I need to replace a string in all files in my current directory whose names start with file_.
For example:
cat file_1
test1
test2
test3
cat file_2
test4
test2
test3
I want to replace test2 with test100 in both files in my current directory.
The following command finds and replaces the string but does NOT list the files that have been modified.
find . -name '*file_*'|xargs sed -i 's/test2/test100/g'
Can someone help me to solve this issue? I want to display all the file names that have been modified.
Thanks!
To only list files that have actually been modified (assumes Linux):
find . -type f -name '*file_*' -exec sh -c \
'md5Before=$(md5sum "$1");
sed -i "s/test2/test100/g" "$1";
[ "$(md5sum "$1")" != "$md5Before" ] && echo "$1"' sh {} \;
On OSX, replace md5sum with md5 and use -i "" rather than just -i.
Note that comparing last-modified timestamps (stat -c %Y on Linux, stat -f %m on OSX) is NOT an option in this case, because sed -i will rewrite ALL files, even if their content wasn't modified.
Update: @Jonathan Leffler suggests a more concise and elegant alternative:
find . -type f -name '*file_*' -exec sh -c \
'grep -l "test2" "$1" && sed -i "s/test2/test100/g" "$1"' sh {} \;
grep -l lists (outputs) the input filename, but only if a match was found (and exits as soon as the first match is found)
the (silent) sed -i command is then only invoked if a match was actually found (thanks to &&)
Aside from being shorter (and most likely faster), the added advantage is that not ALL files are rewritten -- only those that actually need it.
(The only slight disadvantage is that the search term is duplicated, but you could assign it to a shell variable and splice it into the sed program in both locations; if the search term were a sophisticated regex, things could get trickier, because regex support differs across utilities).
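As a rough sketch of that last idea, the search term can live in one variable that both grep and sed use. replace_and_list is a hypothetical helper name. It assumes the term is a plain basic regular expression with no / in it and that file names contain no newlines, and it writes through a temporary file instead of sed -i so that it also runs where -i is unavailable.

```shell
#!/bin/sh
# replace_and_list DIR SEARCH REPLACEMENT   (hypothetical helper, not from the thread)
# Prints the path of every file it changed.
replace_and_list() {
    dir=$1
    search=$2
    replacement=$3
    tmp=$(mktemp) || return 1
    # grep -l prints only the names of files that contain a match.
    find "$dir" -type f -name 'file_*' -exec grep -l -- "$search" {} + |
    while IFS= read -r path; do
        # Rewrite only files that matched; cat keeps the original inode and mode.
        sed "s/$search/$replacement/g" "$path" > "$tmp" && cat "$tmp" > "$path"
        printf '%s\n' "$path"
    done
    rm -f "$tmp"
}
```

For the example in the question this would be called as replace_and_list . test2 test100.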
find . -type f -name '*file_*' | while IFS= read -r rs
do
echo "$rs"
sed -i 's/test2/test100/g' "$rs"
done

redirecting in a shell script

I'm trying to write a script to swap out text in a file:
sed s/foo/bar/g myFile.txt > myFile.txt.updated
mv myFile.txt.updated myFile.txt
I invoke sed, which swaps out text in myFile.txt and redirects the changed text to a second file. mv then moves the .updated file over myFile.txt, overwriting it. Those commands work in the shell.
I wrote:
#!/bin/sh
#First, I set up some descriptive variables for the arguments
initialString="$1"
shift
desiredChange="$1"
shift
document="$1"
#Then, I evoke sed on these (more readable) parameters
updatedDocument=`sed s/$initialString/$desiredChange/g $document`
#I want to make sure that was done properly
echo updated document is $updatedDocument
#then I move the output in to the new text document
mv $updatedDocument $document
I get the error:
mv: target `myFile.txt' is not a directory
I understand that it thinks my new file's name is the first word of the string that was sed's output. I don't know how to correct that. I've been trying since 7am and every quotation, creating a temporary file to store the output in (disastrous results), IFS...everything so far gives me more and more unhelpful errors. I need to clear my head and I need your help. How can I fix this?
Maybe try the following (the quotes keep the line breaks in the output intact):
echo "$updatedDocument" > "$document"
Change
updatedDocument=`sed s/$initialString/$desiredChange/g $document`
to
updatedDocument=${document}.txt
sed "s/$initialString/$desiredChange/g" "$document" > "$updatedDocument"
Backticks put the entire output of the sed command into your variable, not the name of a file.
An even faster way would be to not use updatedDocument or mv at all by doing an in-place sed:
sed -i "s/$initialString/$desiredChange/g" "$document"
The -i flag tells sed to do the replacement in-place. This basically means creating a temp file for the output and replacing your original file with the temp file once it is done, pretty much exactly as you are doing.
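Putting the pieces together, the script from the question could be sketched as a function like the one below. swap_text is a made-up name, and the sketch assumes the two strings contain no characters that are special to sed (such as / or &).

```shell
#!/bin/sh
# swap_text INITIAL DESIRED DOCUMENT   (hypothetical helper)
swap_text() {
    initialString=$1
    desiredChange=$2
    document=$3
    # A file NAME for the output, not the output itself.
    updatedDocument=$document.updated
    # Only replace the original if sed finished without error.
    sed "s/$initialString/$desiredChange/g" "$document" > "$updatedDocument" &&
        mv "$updatedDocument" "$document"
}
```

Called as swap_text foo bar myFile.txt, this does the same as the two commands that already worked at the prompt.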
#!/bin/sh
#First, I set up some descriptive variables for the arguments,
#escaping the characters that are special to sed
initialString=$(printf '%s\n' "$1" | sed 's/[][\.*^$/]/\\&/g')
desiredChange=$(printf '%s\n' "$2" | sed 's|[\&/]|\\&|g')
document="$3"
#Then, I invoke sed, writing to a temporary file first
#(piping into tee on the same file can truncate it before sed has read it)
sed "s/${initialString}/${desiredChange}/g" "${document}" > "${document}.new" && mv "${document}.new" "${document}"
Don't forget that initialString is interpreted as a regular expression and desiredChange as a sed replacement, so both need escaping first.
The first sed call escapes the characters that are special in a basic regular expression (. * [ ] ^ $ \) plus the / delimiter; this kind of escaping is discussed in several posts on the site.

sed -i option is not working on solaris

I am using sed to blank out a line in a file. The command I used is
sed -i "s/.*shayam.*//g" FILE
This works fine on Linux: the line containing shayam is replaced with an empty line in FILE. But when I use it on Solaris, it shows an error:
sed: illegal option -- i
How can I get the -i functionality of sed on Solaris? Kindly help.
The -i option is GNU-specific. The Solaris version does not support the option.
You will need to install the GNU version, or rename the new file over the old one:
sed 's/.*shayam.*//g' FILE > FILE.new && mv FILE.new FILE
I just answered a similar question sed -i + what the same option in SOLARIS, but for those who find this thread instead (I saw it in the related thread section):
The main problem I see with most answers given is that it doesn't work if you want to modify multiple files. The answer I gave in the other thread:
It isn't exactly the same as sed -i, but i had a similar issue. You
can do this using perl:
perl -pi -e 's/find/replace/g' file
doing the copy/move only works for single files. if you want to
replace some text across every file in a directory and
sub-directories, you need something which does it in place. you can do
this with perl and find:
find . -exec perl -pi -e 's/find/replace/g' '{}' \;
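If perl is not an option either, the copy-and-move idiom can still cover several files when it is wrapped in a loop. The following is a sketch using only POSIX sed; edit_files is a hypothetical name, and the .new suffix is assumed not to clash with existing files.

```shell
#!/bin/sh
# edit_files SCRIPT FILE...   (hypothetical helper)
# Applies one sed script to each file, replacing the file only on success.
edit_files() {
    script=$1
    shift
    for f in "$@"; do
        if sed "$script" "$f" > "$f.new"; then
            mv "$f.new" "$f"
        else
            # Leave the original untouched and stop at the first failure.
            rm -f "$f.new"
            return 1
        fi
    done
}
```

For example: edit_files 's/.*shayam.*//' FILE1 FILE2.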
sed doesn't have an -i option.
You are probably using some vendor-specific variant of sed. If you want to use the vendor-specific non-standardized extensions of your vendor-specific non-standardized variant of sed, you need to make sure that you install said vendor-specific non-standardized variant and need to make sure that you call it and don't call the standards-compliant version of sed that is part of your operating environment.
Note that as always when using non-standardized vendor-specific extensions, there is absolutely no guarantee that your code will be portable, which is exactly the problem you are seeing.
In this particular case, however, there is a much better solution: use the right tool for the job. sed is a stream editor (that's why it is called "sed"), i.e. it is for editing streams, not files. If you want to edit files, use a file editor, such as ed:
ed FILE <<-HERE
,s/.*shayam.*//g
w
q
HERE
See also:
Unable to use SED to edit files fast
How can I replace a specific line by line number in a text file?
Either cat the file or use input redirection (<).
Then send the output of sed to a temp file (>) and, if all goes well (&&), mv the temp file over the original file.
Example:
cat my_file | sed 's!A!B!' > my_temp_file && mv my_temp_file my_file

Shell script - search and replace text in multiple files using a list of strings

I have a file "changesDictionary.txt" containing (a variable number of) pairs of key-value strings.
e.g.
"textToSearchFor" = "theReplacementText"
(The format of the dictionary is unimportant, and can be changed as required.)
I need to iterate through the contents of a given directory, including sub-directories. For each file encountered with the extension ".txt", we search for each of the keys in changesDictionary.txt, replacing each found instance with the replacement string value.
i.e. a search and replace over multiple files, but using a list of search/replace terms rather than a single search/replace term.
How could I do this? (I have studied single search/replace examples, but do not understand how to do multiple searches within a file.)
The implementation (bash, perl, whatever) is not important as long as I can run it from the command line in Mac OS X. Thanks for any help.
I'd convert your changesDictionary.txt file to a sed script, with... sed:
$ sed -e 's/^"\(.*\)" = "\(.*\)"$/s\/\1\/\2\/g/' \
changesDictionary.txt > changesDictionary.sed
Note that any characters in your dictionary that are special in regular expressions or in sed expressions will be interpreted by sed rather than matched literally. So either your dictionary may contain only the most primitive search-and-replacements, or you'll need to maintain the sed file by hand with valid expressions. Unfortunately, there's no easy way in sed to switch off regular expressions and use plain string matching, or to quote your searches and replacements as "literals".
With the resulting sed script, use find and xargs -- rather than find -exec -- to convert your files with the sed script as quickly as possible, by processing them more than one at a time.
$ find somedir -type f -print0 \
| xargs -0 sed -i -f changesDictionary.sed
Note, the -i option of sed edits files "in-place", so be sure to make backups for safety, or use -i~ to create tilde-backups.
Final note, using search and replaces can have unintended consequences. Will you have searches that are substrings of other searches? Here's an example.
$ cat changesDictionary.txt
"fix" = "broken"
"fixThat" = "Fixed"
$ sed -e 's/^"\(.*\)" = "\(.*\)"$/s\/\1\/\2\/g/' changesDictionary.txt \
| tee changesDictionary.sed
s/fix/broken/g
s/fixThat/Fixed/g
$ mkdir subdir
$ echo fixThat > subdir/target.txt
$ find subdir -type f -name '*.txt' -print0 \
| xargs -0 sed -i -f changesDictionary.sed
$ cat subdir/target.txt
brokenThat
Should "fixThat" have become "Fixed" or "brokenThat"? Order matters in a sed script. Similarly, text produced by one replacement can be replaced again: "a" changed to "b" may later be changed from "b" to "c" by another search-and-replace.
Perhaps you've already considered both of these, but I mention because I've tried what you were doing before and didn't think of it. I don't know of anything that simply does the right thing for doing multiple search and replacements at once. So, you need to program it to do the right thing yourself.
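One partial workaround for the substring problem is to emit the longest search strings first when generating the sed script. The sketch below assumes the "key" = "value" format from the question and keys without regex metacharacters or slashes; dict_to_sed is a hypothetical name. It does nothing about chained replacements (a to b, then b to c).

```shell
#!/bin/sh
# dict_to_sed DICTFILE   (hypothetical helper)
# Prints one s/key/value/g command per dictionary line, longest key first.
dict_to_sed() {
    awk -F'" = "' '{
        key = substr($1, 2)                    # drop the leading quote
        val = substr($2, 1, length($2) - 1)    # drop the trailing quote
        # Prefix each command with the key length so sort can order by it.
        printf "%d\ts/%s/%s/g\n", length(key), key, val
    }' "$1" | sort -rn | cut -f2-
}
```

With the dictionary from the example above, s/fixThat/Fixed/g is emitted before s/fix/broken/g, so fixThat is handled before the shorter fix can touch it.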
Here are the basic steps I would do
1. Copy the changesDictionary.txt file.
2. In the copy, replace each "a"="b" pair with the equivalent sed line, e.g. (using $1 for the file name)
sed -e 's/a/b/g' "$1"
(you could write a script to do this or just do it by hand, if you just need to do this once and it's not too big).
3. If the files are all in one directory, then you can do something like:
ls *.txt | xargs -n 1 scriptFromStep2.sh
If they are in subdirectories, use find to call that script on each of the files, something like
find . -name '*.txt' -exec scriptFromStep2.sh {} \;
These aren't exact, do some experiments to make sure you get it right -- it's just the approach I would use.
(but, if you can, just use perl, it would be a lot simpler)
Use this tool, which is written in Perl - with quite a lot of bells and whistles - oldie, but goodie:
http://unixgods.org/~tilo/replace_string/
Features:
do multiple search-replace or query-search-replace operations
search-replace expressions can be given on the command line or read from a file
processes multiple input files
recursively descend into directory and do multiple search/replace operations on all files
user defined perl expressions are applied to each line of each input file
optionally run in paragraph mode (for multi-line search/replace)
interactive mode
batch mode
optionally backup files and backup numbering
preserve modes/owner when run as root
ignore symbolic links, empty files, write protected files, sockets, named pipes, and directory names
optionally replace lines only matching / not matching a given regular expression
This script has been used quite extensively over the years with large data sets.
#!/bin/bash
f="changesDictionary.txt" # assumed to hold one key=value pair per line
find /path -type f -name "*.txt" | while IFS= read -r FILE
do
awk 'BEGIN{ FS="=" }
FNR==NR{ s[$1]=$2; next }
{
for(i in s){
if( $0 ~ i ){ gsub(i,s[i]) }
}
print $0
}' "$f" "$FILE" > temp
mv temp "$FILE"
done
for i in /script/arq*.sh
do
echo "FILE ${i}"
sed -i 's|/$file_path1|/file_path2|g' "${i}"
done

Resources