Remove pattern from filenames


How can I remove the word "myfile" from a list of filenames with this structure?
mywork_myfile_XSOP.txt
mywork_myfile_ATTY.txt
mywork_myfile_ATPY.txt
Desired output:
mywork_XSOP.txt
mywork_ATTY.txt
mywork_ATPY.txt

The simplest method is to use the rename command, which is available on most Unix-like systems.
rename 's/^mywork_myfile_/mywork_/' *
This expects you to be in the directory that holds the files. It will not overwrite existing files; pass the -f option if you want that. Also note that there are several versions of rename with different syntax and options: the command above is for the Perl-based rename, while the util-linux rename takes plain strings instead of a Perl expression.

Based on this answer to "Rename all files in directory from $filename_h to $filename_half?", this is one way:
for file in mywork_myfile*txt
do
mv "$file" "${file/_myfile/}"
done
Note that it uses bash string substitution, as follows:
$ file="mywork_myfile_XSOP.txt"
$ echo "${file/_myfile/}"
mywork_XSOP.txt

This should work in any POSIX shell:
#!/bin/sh
for i in mywork_myfile_XSOP.txt \
         mywork_myfile_ATTY.txt \
         mywork_myfile_ATPY.txt; do
    set -x    # trace the mv command as it runs
    mv "$i" "$(echo "$i" | sed -e 's/myfile_//')"
    set +x
done
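The ${file/_myfile/} substitution shown earlier is a bash extension, which is why the POSIX version pipes the name through sed. As a sketch of a sed-free alternative, the prefix and suffix removal operators that POSIX does define can do the same job, assuming "myfile_" should be removed at its first occurrence. The function name strip_myfile is made up for illustration.

```shell
#!/bin/sh
# strip_myfile NAME: print NAME with the first "myfile_" removed.
# Hypothetical helper; uses only POSIX parameter expansion.
strip_myfile() {
    case $1 in
        *myfile_*)
            # ${1%%myfile_*} = text before the first "myfile_"
            # ${1#*myfile_}  = text after the first "myfile_"
            printf '%s\n' "${1%%myfile_*}${1#*myfile_}"
            ;;
        *)
            printf '%s\n' "$1"   # nothing to strip
            ;;
    esac
}

# Dry run: print the mv commands instead of executing them.
for f in mywork_myfile_*.txt; do
    [ -e "$f" ] || continue      # skip the literal pattern if nothing matches
    echo mv "$f" "$(strip_myfile "$f")"
done
```

Remove the echo once the printed commands look right.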

Related

Rename files based on pattern in path

I have thousands of files named "DOCUMENT.PDF" and I want to rename them based on a numeric identifier in the path. Unfortunately, I don't seem to have access to the rename command.
Three examples:
/000/000/002/605/950/ÐÐ-02605950-00001/DOCUMENT.PDF
/000/000/002/591/945/ÐÐ-02591945-00002/DOCUMENT.PDF
/000/000/002/573/780/ÐÐ-02573780-00002/DOCUMENT.PDF
To be renamed as, without changing their parent directory:
2605950.pdf
2591945.pdf
2573780.pdf

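Use find to locate the files, build each new name from the path, and then use the mv command:
find . -name DOCUMENT.PDF | while IFS= read -r file
do
    # parent directory name -> second "-"-separated field -> strip leading zeros
    num=$(printf '%s\n' "$file" | awk -F "/" '{print $(NF-1)}' | cut -d "-" -f2 | sed 's/^0*//')
    mv "$file" "$(dirname "$file")/$num.pdf"
done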

You could do this with globstar in Bash 4.0+:
cd _your_base_dir_
shopt -s globstar
for file in **/DOCUMENT.PDF; do # loop picks only DOCUMENT.PDF files
# here, we assume that the serial number is extracted from the 7th component in the directory path - change it according to your need
# and we don't strip out the leading zero in the serial number
new_name=$(dirname "$file")/$(cut -f7 -d/ <<< "$file" | cut -f2 -d-).pdf
echo "Renaming $file to $new_name"
# mv "$file" "$new_name" # uncomment after verifying
done
See this related post that talks about a similar problem: How to recursively traverse a directory tree and find only files?
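Counting path components depends on where the search starts, so here is a sketch that takes the number from the parent directory's name instead. It assumes every parent directory is named <prefix>-<number>-<sequence>, as in the three examples, and it strips leading zeros to match the requested names. The function name pdf_target is hypothetical, and the loop assumes no file name contains a newline.

```shell
#!/bin/sh
# pdf_target PATH: print the new path for PATH, keeping the parent directory.
# Assumes the parent directory is named <prefix>-<number>-<sequence>.
pdf_target() {
    dir=$(dirname "$1")
    id=$(basename "$dir")      # e.g. XX-02605950-00001
    id=${id#*-}                # drop the prefix:   02605950-00001
    id=${id%%-*}               # drop the sequence: 02605950
    # Strip leading zeros to match the names asked for in the question.
    while [ "${id#0}" != "$id" ]; do id=${id#0}; done
    printf '%s/%s.pdf\n' "$dir" "$id"
}

# Dry run over the tree below the current directory.
find . -type f -name DOCUMENT.PDF | while IFS= read -r f; do
    echo mv "$f" "$(pdf_target "$f")"
done
```

This only uses POSIX features, so it does not need globstar or Bash 4.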

In-place processing with grep

I've got a script that calls grep to process a text file. Currently I am doing something like this.
$ grep 'SomeRegEx' myfile.txt > myfile.txt.temp
$ mv myfile.txt.temp myfile.txt
I'm wondering if there is any way to do in-place processing, as in store the results to the same original file without having to create a temporary file and then replace the original with the temp file when processing is done.
Of course I welcome comments as to why this should or should not be done, but I'm mainly interested in whether it can be done. In this example I'm using grep, but I'm interested about Unix tools in general. Thanks!

sponge (from the moreutils package on Debian/Ubuntu) reads its input until EOF and only then writes it to the file, so you can grep a file and write the result back to the same file.
Like this:
grep 'pattern' file | sponge file

Perl has the -i switch, and so do sed and Ruby:
sed -i.bak -n '/SomeRegex/p' file
ruby -i.bak -ne 'print if /SomeRegex/' file
But note that all they do is create temporary files behind the scenes, where you don't see them.
Other ways, besides grep:
awk
awk '/someRegex/' file > t && mv t file
bash (note that the case pattern is a shell glob, not a regular expression)
while IFS= read -r line; do case "$line" in *someregex*) echo "$line";; esac; done < file > t && mv t file

No, in general it can't be done in Unix like this. You can only create/truncate (with >) or append to a file (with >>). Once truncated, the old contents would be lost.
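Since the temporary file cannot really be avoided, one option is to hide it in a small helper. The following is only a sketch: in_place is a made-up name, mktemp is widely available but not required by POSIX, and treating exit status 1 as success is an assumption that fits grep (where 1 means "no lines matched") but not every tool.

```shell
#!/bin/sh
# in_place FILE CMD [ARGS...]: run CMD with FILE on stdin and replace FILE
# with its output. Hypothetical helper.
in_place() {
    file=$1
    shift
    # Create the temp file next to the target so mv stays on one filesystem.
    tmp=$(mktemp "$file.XXXXXX") || return 1
    "$@" < "$file" > "$tmp"
    status=$?
    # grep exits 1 when nothing matches; only 2 or higher signals an error.
    if [ "$status" -le 1 ]; then
        mv "$tmp" "$file"
    else
        rm -f "$tmp"
        return "$status"
    fi
}

# Usage: keep only the lines that match the pattern.
# in_place myfile.txt grep 'SomeRegEx'
```

Because the original is only replaced after the command has finished, a failure leaves it untouched. The replaced file gets the temp file's permissions, which may differ from the original's.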

In general, this can't be done. But Perl has the -i switch:
perl -i -ne 'print if /SomeRegEx/' myfile.txt
Writing -i.bak will cause the original to be saved in myfile.txt.bak.
(Of course internally, Perl just does basically what you're already doing -- there's no special magic involved.)

To edit file in-place using vim-way, try:
$ ex -s +'%!grep foo' -cxa myfile.txt
Alternatively use sed or gawk.

Most installations of sed can do in-place editing, check the man page, you probably want the -i flag.

Store the output in a variable and then write it back to the original file (the whole result is held in memory, so this suits small files):
A=$(grep 'Something' aux.log) && printf '%s\n' "$A" > aux.log

Take a look at my slides "Field Guide To the Perl Command-Line Options" at http://petdance.com/perl/command-line-options.pdf for more ideas on what you can do in place with Perl.

cat myfile.txt | grep 'sometext' > myfile.txt
This is meant to find sometext in myfile.txt and save the result back to myfile.txt, but it is not reliable: the shell truncates myfile.txt when it sets up the > redirection, usually before cat has read anything, so the file typically ends up empty.

batch rename to change only single character

How can I rename all the files in one directory using the mv command? The directory has thousands of files, and the requirement is to change the last character of each file name (before the extension) to a specific character. Example: the files are
abc.txt
asdf.txt
zxc.txt
...
ab_.txt
asd.txt
it should change to
ab_.txt
asd_.txt
zx_.txt
...
ab_.txt
as_.txt

You have to watch out for name collisions, but this should work okay:
for i in *.txt ; do
    j=$(echo "$i" | sed 's/.\.txt$/_.txt/')
    echo mv \"$i\" \"$j\"
    #mv "$i" "$j"
done
Uncomment the mv once the output looks right (I left it commented out so you can safely see what it would do). The quotes are for handling file names with spaces (evil, vile things in my opinion :-).

If all files end in ".txt", you can use mmv (Multiple Move) for that:
mmv "*[a-z].txt" "#1_.txt"
Plus: mmv will tell you when this generates a collision (in your example: abc.txt becomes ab_.txt which already exists) before any file is renamed.
Note that you must quote the file names, else the shell will expand the list before mmv sees it (but mmv will usually catch this mistake, too).

If your files all have a .txt suffix, I suggest the following script:
for i in *.txt
do
    r=$(basename "$i" .txt | sed 's/.$//')
    mv "$i" "${r}_.txt"
done

Is it a definite requirement that you use the mv command?
The Perl rename utility was written for this sort of thing. It's standard on Debian-based Linux distributions, but according to this page it can be added really easily to any other.
If it's already there (or if you install it) you can do:
rename -v 's/.\.txt$/_\.txt/' *.txt
The page included above has some basic info on regex and things if it's needed.

find is an alternative to for file in *.txt, which makes the shell expand all of your 1000 file names into one long word list. Example (updated to use the bash replacement approach):
find . \( -type d ! -name . -prune \) -o \( -name "*.txt" -print \) | while IFS= read -r file
do
    mv "$file" "${file%%?.txt}_.txt"
done

I'm not sure if this will work with thousands of files, but in bash:
for i in *.txt; do
    j=$(echo "$i" | sed 's/.\.txt$/_.txt/')
    mv "$i" "$j"
done

You can use bash's ${parameter%%word} operator like this:
for FILE in *.txt; do
    mv "$FILE" "${FILE%%?.txt}_.txt"
done
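Several answers mention name collisions, and a plain mv silently overwrites the target. As a sketch, the loop can skip any rename whose target already exists. The function name rename_last_char is made up for illustration, and it uses ${f%?.txt}, which gives the same result as %% here because the pattern has a fixed length.

```shell
#!/bin/sh
# rename_last_char DIR: in DIR, replace the last character before ".txt"
# with "_", skipping any rename that would overwrite an existing file.
rename_last_char() {
    for f in "$1"/*.txt; do
        [ -e "$f" ] || continue           # no match: the glob stayed literal
        new=${f%?.txt}_.txt               # drop one character plus ".txt"
        [ "$f" = "$new" ] && continue     # already ends in "_.txt"
        if [ -e "$new" ]; then
            echo "skip: $f ($new exists)" >&2
        else
            mv -- "$f" "$new"
        fi
    done
}

# Usage: rename_last_char .
```

With the example files from the question, abc.txt would be skipped because ab_.txt already exists, while asdf.txt and zxc.txt would be renamed.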

Shell script - search and replace text in multiple files using a list of strings

I have a file "changesDictionary.txt" containing (a variable number of) pairs of key-value strings.
e.g.
"textToSearchFor" = "theReplacementText"
(The format of the dictionary is unimportant, and can be changed as required.)
I need to iterate through the contents of a given directory, including sub-directories. For each file encountered with the extension ".txt", we search for each of the keys in changesDictionary.txt, replacing each found instance with the replacement string value.
i.e. a search and replace over multiple files, but using a list of search/replace terms rather than a single search/replace term.
How could I do this? (I have studied single search/replace examples, but do not understand how to do multiple searches within a file.)
The implementation (bash, perl, whatever) is not important as long as I can run it from the command line in Mac OS X. Thanks for any help.

I'd convert your changesDictionary.txt file to a sed script, with... sed:
$ sed -e 's/^"\(.*\)" = "\(.*\)"$/s\/\1\/\2\/g/' \
changesDictionary.txt > changesDictionary.sed
Note that any characters in your dictionary that are special in regular expressions or sed expressions will be interpreted by sed rather than matched literally, so either your dictionary can contain only the most primitive search-and-replacements, or you'll need to maintain the sed file by hand with valid expressions. Unfortunately, there's no easy way in sed to either turn off regular expressions and use plain string matching, or to quote your searches and replacements as "literals".
With the resulting sed script, use find and xargs -- rather than find -exec -- to convert your files with the sed script as quickly as possible, by processing them more than one at a time.
$ find somedir -type f -print0 \
| xargs -0 sed -i -f changesDictionary.sed
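Note that the -i option of sed edits files "in-place", so be sure to make backups for safety, or use -i~ to create tilde-backups. On Mac OS X, the BSD sed requires a suffix argument after -i, so -i~ works there while a bare -i does not.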
Final note, using search and replaces can have unintended consequences. Will you have searches that are substrings of other searches? Here's an example.
$ cat changesDictionary.txt
"fix" = "broken"
"fixThat" = "Fixed"
$ sed -e 's/^"\(.*\)" = "\(.*\)"$/s\/\1\/\2\/g/' changesDictionary.txt \
| tee changesDictionary.sed
s/fix/broken/g
s/fixThat/Fixed/g
$ mkdir subdir
$ echo fixThat > subdir/target.txt
$ find subdir -type f -name '*.txt' -print0 \
| xargs -0 sed -i -f changesDictionary.sed
$ cat subdir/target.txt
brokenThat
Should "fixThat" have become "Fixed" or "brokenThat"? Order matters in a sed script. Similarly, a replacement can itself be replaced again: text changed from "a" to "b" may later be changed from "b" to "c" by another rule.
Perhaps you've already considered both of these, but I mention because I've tried what you were doing before and didn't think of it. I don't know of anything that simply does the right thing for doing multiple search and replacements at once. So, you need to program it to do the right thing yourself.
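One way to handle the substring problem, sketched here under the assumption that the dictionary keeps the "key" = "value" layout, is to emit the longest keys first so that "fixThat" is tried before "fix". The same pass can escape the characters that sed would otherwise treat specially. The function name dict_to_sed is made up, and this does nothing about chained replacements (a to b, then b to c).

```shell
#!/bin/sh
# dict_to_sed FILE: convert lines of the form  "key" = "value"  into sed
# substitution commands, longest key first, escaping special characters.
dict_to_sed() {
    awk '
    # Put a backslash in front of every character of s that occurs in chars.
    function esc(s, chars,    out, i, c) {
        out = ""
        for (i = 1; i <= length(s); i++) {
            c = substr(s, i, 1)
            if (index(chars, c) > 0) out = out "\\"
            out = out c
        }
        return out
    }
    {
        line = $0
        sub(/^"/, "", line)
        sub(/"$/, "", line)
        n = index(line, "\" = \"")
        if (n == 0) next                      # not a key/value line
        raw = substr(line, 1, n - 1)
        key = esc(raw, "\\.*[^$/")            # special in a sed pattern
        val = esc(substr(line, n + 5), "\\&/") # special in a replacement
        printf "%d\ts/%s/%s/g\n", length(raw), key, val
    }' "$1" | sort -rn | cut -f2-
}

# Usage: dict_to_sed changesDictionary.txt > changesDictionary.sed
```

The key length is printed as a leading sort field and removed again by cut, so the resulting file contains only the s/…/…/g lines.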

Here are the basic steps I would take:
1. Copy the changesDictionary.txt file.
2. In the copy, replace each "a"="b" pair with the equivalent sed line, e.g. (use $1 for the file name):
sed -e 's/a/b/g' "$1"
(You could write a script to do this, or just do it by hand if you only need to do this once and the file is not too big.)
3. If the files are all in one directory, you can do something like this (-n 1 passes one file per call, since the script only reads $1):
ls *.txt | xargs -n 1 scriptFromStep2.sh
If they are in subdirectories, use find to call that script on each of the files, something like:
find . -name '*.txt' -exec scriptFromStep2.sh {} \;
These aren't exact, do some experiments to make sure you get it right -- it's just the approach I would use.
(but, if you can, just use perl, it would be a lot simpler)

Use this tool, which is written in Perl - with quite a lot of bells and whistles - oldie, but goodie:
http://unixgods.org/~tilo/replace_string/
Features:
do multiple search-replace or query-search-replace operations
search-replace expressions can be given on the command line or read from a file
processes multiple input files
recursively descend into directory and do multiple search/replace operations on all files
user defined perl expressions are applied to each line of each input file
optionally run in paragraph mode (for multi-line search/replace)
interactive mode
batch mode
optionally backup files and backup numbering
preserve modes/owner when run as root
ignore symbolic links, empty files, write protected files, sockets, named pipes, and directory names
optionally replace lines only matching / not matching a given regular expression
This script has been used quite extensively over the years with large data sets.

#!/bin/bash
f="changesDictionary.txt"
find /path -type f -name "*.txt" | while IFS= read -r FILE
do
    # first file: load key=value pairs; second file: apply every replacement
    awk 'BEGIN{ FS="=" }
    FNR==NR{ s[$1]=$2; next }
    {
        for(i in s){
            if( $0 ~ i ){ gsub(i,s[i]) }
        }
        print $0
    }' "$f" "$FILE" > temp
    mv temp "$FILE"
done

for i in /script/arq*.sh
do
    echo -e "FILE ${i}"
    sed -i 's|/$file_path1|/file_path2|g' "${i}"
done

How to do a mass rename?

I need to rename files with names like this
transform.php?dappName=Test&transformer=YAML&v_id=XXXXX
to just this
XXXXX.txt
How can I do it?
I understand that I need more than one mv command because there are at least 25000 files.

Easiest solution is to use "mmv"
You can write:
mmv "long_name*.txt" "short_#1.txt"
Where the "#1" is replaced by whatever is matched by the first wildcard.
Similarly #2 is replaced by the second, etc.
So you do something like
mmv "index*_type*.txt" "t#2_i#1.txt"
To rename index1_type9.txt to t9_i1.txt
mmv is not installed by default on many Linux distributions, but it is easy to find on the net.

If you are using zsh you can also do this:
autoload zmv
zmv 'transform.php?dappName=Test&transformer=YAML&v_id=(*)' '$1.txt'

You write a fairly simple shell script in which the trickiest part is munging the name.
The outline of the script is easy (bash syntax here):
for i in 'transform.php?dappName=Test&transformer=YAML&v_id='*
do
mv $i <modified name>
done
Modifying the name has many options. I think the easiest is probably an awk one-liner like
`echo $i | awk -F'=' '{print $4}'`
so...
for i in 'transform.php?dappName=Test&transformer=YAML&v_id='*
do
    mv "$i" "$(echo "$i" | awk -F'=' '{print $4}').txt"
done
update
Okay, as pointed out below, this won't necessarily work for a large enough list of files; the * will overrun the command line length limit. So, then you use:
$ find . -name 'transform.php?dappName=Test&transformer=YAML&v_id=*' -prune -print |
while IFS= read -r
do
    mv "$REPLY" "$(echo "$REPLY" | awk -F'=' '{print $4}').txt"
done
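Another way to sidestep both the argument-list limit and the read loop is to let find hand the names to a child shell in batches with -exec ... {} +, and to cut the id out with parameter expansion instead of awk. This is a sketch under the assumption that the id is everything after the last "v_id="; the function name rename_by_id is made up for illustration.

```shell
#!/bin/sh
# rename_by_id DIR: rename every file below DIR whose name contains "v_id="
# to <id>.txt in the same directory, where <id> follows the last "v_id=".
rename_by_id() {
    find "$1" -type f -name '*v_id=*' -exec sh -c '
        for f do
            dir=${f%/*}               # directory part of the path
            id=${f##*v_id=}           # everything after the last "v_id="
            if [ -e "$dir/$id.txt" ]; then
                echo "skip: $f" >&2   # do not overwrite an existing file
            else
                mv "$f" "$dir/$id.txt"
            fi
        done
    ' sh {} +
}

# Usage: rename_by_id .
```

find passes each name as a separate argument, so characters such as ? and & in the names need no special handling.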

Try the rename command
Or you could pipe the results of an ls into a perl regex.

You may use whatever you want to transform the name (perl, sed, awk, etc.). I'll use a Python one-liner:
for file in 'transform.php?dappName=Test&transformer=YAML&v_id='*; do
    mv "$file" "$(echo "$file" | python3 -c "print(input().split('=')[-1])").txt"
done
Here's the same script entirely in Python:
import glob, os
PATTERN = "transform.php?dappName=Test&transformer=YAML&v_id=*"
for filename in glob.iglob(PATTERN):
    # the id is whatever follows the last "="
    newname = filename.split('=')[-1] + ".txt"
    print(filename, '==>', newname)
    os.rename(filename, newname)
Side note: you would have had an easier life saving the pages with the right name while grabbing them...

find . -name '*v_id=*' | perl -lne 'rename($_, qq($1.txt)) if /v_id=(\S+)/'

vimv lets you rename multiple files using Vim's text editing capabilities.
Entering vimv opens a Vim window which lists down all files and you can do pattern matching, visual select, etc to edit the names. After you exit Vim, the files will be renamed.
[Disclaimer: I'm the author of the tool]

I'd use ren-regexp, which is a Perl script that lets you mass-rename files very easily.
21:25:11 $ ls
transform.php?dappName=Test&transformer=YAML&v_id=12345
21:25:12 $ ren-regexp 's/transform.php.*v_id=(\d+)/$1.txt/' transform.php*
transform.php?dappName=Test&transformer=YAML&v_id=12345
1 12345.txt
21:26:33 $ ls
12345.txt

This should also work:
prfx='transform.php?dappName=Test&transformer=YAML&v_id='
ls "$prfx"* | sed "s/$prfx//" | xargs -Ipsx mv "$prfx"psx psx

this renamer command would do it:
$ renamer --regex --find 'transform\.php\?dappName=Test&transformer=YAML&v_id=(\w+)' --replace '$1.txt' *

OK, you need to be able to run a Windows binary for this.
But if you can run Total Commander, do this:
Select all files with *, and hit ctrl-M
In the Search field, paste "transform.php?dappName=Test&transformer=YAML&v_id="
(Leave Replace empty)
Press Start
It doesn't get much simpler than that.
You can also rename using regular expressions via this dialog, and you see a realtime preview of how your files are going to be renamed.
