How to do a mass rename? - unix

I need to rename files names like this
transform.php?dappName=Test&transformer=YAML&v_id=XXXXX
to just this
XXXXX.txt
How can I do it?
I understand that i need more than one mv command because they are at least 25000 files.

Easiest solution is to use "mmv"
You can write:
mmv "long_name*.txt" "short_#1.txt"
Where the "#1" is replaced by whatever is matched by the first wildcard.
Similarly #2 is replaced by the second, etc.
So you do something like
mmv "index*_type*.txt" "t#2_i#1.txt"
To rename index1_type9.txt to t9_i1.txt
mmv is not standard in many Linux distributions but is easily found on the net.

If you are using zsh you can also do this:
autoload zmv
zmv 'transform.php?dappName=Test&transformer=YAML&v_id=(*)' '$1.txt'

You write a fairly simple shell script in which the trickiest part is munging the name.
The outline of the script is easy (bash syntax here):
for i in 'transform.php?dappName=Test&transformer=YAML&v_id='*
do
mv $i <modified name>
done
Modifying the name has many options. I think the easiest is probably an awk one-liner like
`echo $i | awk -F'=' '{print $4}'`
so...
for i in 'transform.php?dappName=Test&transformer=YAML&v_id='*
do
mv $i `echo $i | awk -F'=' '{print $4}'`.txt
done
update
Okay, as pointed out below, this won't necessarily work for a large enough list of files; the * will overrun the command line length limit. So, then you use:
$ find . -name 'transform.php?dappName=Test&transformer=YAML&v_id=*' -prune -print |
while read
do
mv $reply `echo $reply | awk -F'=' '{print $4}'`.txt
done

Try the rename command
Or you could pipe the results of an ls into a perl regex.

You may use whatever you want to transform the name (perl, sed, awk, etc.). I'll use a python one-liner:
for file in 'transform.php?dappName=Test&transformer=YAML&v_id='*; do
mv $file `echo $file | python -c "print raw_input().split('=')[-1]"`.txt;
done
Here's the same script entirely in Python:
import glob, os
PATTERN="transform.php?dappName=Test&transformer=YAML&v_id=*"
for filename in glob.iglob(PATTERN):
newname = filename.split('=')[-1] + ".txt"
print filename, '==>', newname
os.rename(filename, newname)
Side note: you would have had an easier life saving the pages with the right name while grabbing them...

find -name '*v_id=*' | perl -lne'rename($_, qq($1.txt)) if /v_id=(\S+)/'

vimv lets you rename multiple files using Vim's text editing capabilities.
Entering vimv opens a Vim window which lists down all files and you can do pattern matching, visual select, etc to edit the names. After you exit Vim, the files will be renamed.
[Disclaimer: I'm the author of the tool]

I'd use ren-regexp, which is a Perl script that lets you mass-rename files very easily.
21:25:11 $ ls
transform.php?dappName=Test&transformer=YAML&v_id=12345
21:25:12 $ ren-regexp 's/transform.php.*v_id=(\d+)/$1.txt/' transform.php*
transform.php?dappName=Test&transformer=YAML&v_id=12345
1 12345.txt
21:26:33 $ ls
12345.txt

This should also work:
prfx='transform.php?dappName=Test&transformer=YAML&v_id='
ls $prfx* | sed s/$prfx// | xargs -Ipsx mv "$prfx"psx psx

this renamer command would do it:
$ renamer --regex --find 'transform.php?dappName=Test&transformer=YAML&v_id=(\w+)' --replace '$1.txt' *

Ok, you need to be able to run a windows binary for this.
But if you can run Total Commander, do this:
Select all files with *, and hit ctrl-M
In the Search field, paste "transform.php?dappName=Test&transformer=YAML&v_id="
(Leave Replace empty)
Press Start
It doesn't get much simpler than that.
You can also rename using regular expressions via this dialog, and you see a realtime preview of how your files are going to be renamed.

Related

Unix - Using ls with grep

How can I use ls (or other commands) and grep together to search from specific files for a certain word inside that file?
Example I have a file - 201503003_315_file.txt and I have other files in my dir.
I only want to search files that have a file name that contains _315_ and inside that file, search for the word "SAMPLE".
Hope this is clear and thanks in advance for any help.
You can do:
ls * _315_* | xargs grep "SAMPLE"
The first part: ls * _315_* will list only files that have 315 as part of the file name, this list of files is piped to grep which will scan each one of them and look for "SAMPLE"
UPDATE
A bit easier (and actually safer) approach was mentioned by David in the comments bellow:
grep "SAMPLE" *_315_*.txt
The reason why it's safer is that ls doesn't handle well special characters.
Another option, as mentioned by Charles Duffy in the comments below:
printf '%s\0' *_315_* | xargs -0 grep
Change to that directory (using cd dir) and try:
grep SAMPLE *_315_*
If you really MUST use ls AND grep try this:
ls *_315_* | xargs grep SAMPLE
The first example, however, requires less typing...

In-place processing with grep

I've got a script that calls grep to process a text file. Currently I am doing something like this.
$ grep 'SomeRegEx' myfile.txt > myfile.txt.temp
$ mv myfile.txt.temp myfile.txt
I'm wondering if there is any way to do in-place processing, as in store the results to the same original file without having to create a temporary file and then replace the original with the temp file when processing is done.
Of course I welcome comments as to why this should or should not be done, but I'm mainly interested in whether it can be done. In this example I'm using grep, but I'm interested about Unix tools in general. Thanks!
sponge (in moreutils package in Debian/Ubuntu) reads input till EOF and writes it into file, so you can grep file and write it back to itself.
Like this:
grep 'pattern' file | sponge file
Perl has the -i switch, so does sed and Ruby
sed -i.bak -n '/SomeRegex/p' file
ruby -i.bak -ne 'print if /SomeRegex/' file
But note that all it ever does is creating "temp" files at the back end which you think you don't see, that's all.
Other ways, besides grep
awk
awk '/someRegex/' file > t && mv t file
bash
while read -r line;do case "$line" in *someregex*) echo "$line";;esac;done <file > t && mv t file
No, in general it can't be done in Unix like this. You can only create/truncate (with >) or append to a file (with >>). Once truncated, the old contents would be lost.
In general, this can't be done. But Perl has the -i switch:
perl -i -ne 'print if /SomeRegEx/' myfile.txt
Writing -i.bak will cause the original to be saved in myfile.txt.bak.
(Of course internally, Perl just does basically what you're already doing -- there's no special magic involved.)
To edit file in-place using vim-way, try:
$ ex -s +'%!grep foo' -cxa myfile.txt
Alternatively use sed or gawk.
Most installations of sed can do in-place editing, check the man page, you probably want the -i flag.
Store in a variable and then assign it to the original file:
A=$(cat aux.log | grep 'Something') && echo "${A}" > aux.log
Take a look at my slides "Field Guide To the Perl Command-Line Options" at http://petdance.com/perl/command-line-options.pdf for more ideas on what you can do in place with Perl.
cat myfile.txt | grep 'sometext' > myfile.txt
This will find sometext in myfile.txt and save it back to myfile.txt, this will accomplish what you want. Not sure about regex, but it does work for text.

How do I perform a recursive directory search for strings within files in a UNIX TRU64 environment?

Unfortunately, due to the limitations of our Unix Tru64 environment, I am unable to use the GREP -r switch to perform my search for strings within files across multiple directories and sub directories.
Ideally, I would like to pass two parameters. The first will be the directory I want my search is to start on. The second is a file containing a list of all the strings to be searched. This list will consist of various directory path names and will include special characters:
ie:
/aaa/bbb/ccc
/eee/dddd/ggggggg/
etc..
The purpose of this exercise is to identify all shell scripts that may have specific hard coded path names identified in my list.
There was one example I found during my investigations that perhaps comes close, but I am not sure how to customize this to accept a file of string arguments:
eg: find etb -exec grep test {} \;
where 'etb' is the directory and 'test', a hard coded string to be searched.
This should do it:
find dir -type f -exec grep -F -f strings.txt {} \;
dir is the directory from which searching will commence
strings.txt is the file of strings to match, one per line
-F means treat search strings as literal rather than regular expressions
-f strings.txt means use the strings in strings.txt for matching
You can add -l to the grep switches if you just want filenames that match.
Footnote:
Some people prefer a solution involving xargs, e.g.
find dir -type f -print0 | xargs -0 grep -F -f strings.txt
which is perhaps a little more robust/efficient in some cases.
By reading, I assume we can not use the gnu coreutil, and egrep is not available.
I assume (for some reason) the system is broken, and escapes do not work as expected.
Under normal situations, grep -rf patternfile.txt /some/dir/ is the way to go.
a file containing a list of all the strings to be searched
Assumptions : gnu coreutil not available. grep -r does not work. handling of special character is broken.
Now, you have working awk ? no ?. It makes life so much easier. But lets be on the safe side.
Assume : working sed ,one of od OR hexdump OR xxd (from vim package) is available.
Lets call this patternfile.txt
1. Convert list into a regexp that grep likes
Example patternfile.txt contains
/foo/
/bar/doe/
/root/
(example does not print special char, but it's there.) we must turn it into something like
(/foo/|/bar/doe/|/root/)
Assuming echo -en command is not broken, and xxd , or od, or hexdump is available,
Using hexdump
cat patternfile.txt |hexdump -ve '1/1 "%02x \n"' |tr -d '\n'
Using od
cat patternfile.txt |od -A none -t x1|tr -d '\n'
and pipe it into (common for both hexdump and od)
|sed 's:[ ]*0a[ ]*$::g'|sed 's: 0a:\\|:g' |sed 's:^[ ]*::g'|sed 's:^: :g' |sed 's: :\\x:g'
then pipe result into
|sed 's:^:\\(:g' |sed 's:$:\\):g'
and you have a regexp pattern that is escaped.
2. Feed the escaped pattern into broken regexp
Assuming the bare minimum shell escape is available,
we use grep "$(echo -en "ESCAPED_PATTERN" )" to do our job.
3. To sum it up
Building a escaped regexp pattern (using hexdump as example )
grep "$(echo -en "$( cat patternfile.txt |hexdump -ve '1/1 "%02x \n"' |tr -d '\n' |sed 's:[ ]*0a[ ]*$::g'|sed 's: 0a:\\|:g' |sed 's:^[ ]*::g'|sed 's:^: :g' |sed 's: :\\x:g'|sed 's:^:\\(:g' |sed 's:$:\\):g')")"
will escape all characters and enclose it with (|) brackets so a regexp OR match will be performed.
4. Recrusive directory lookup
Under normal situations, even when grep -r is broken, find /dir/ -exec grep {} \; should work.
Some may prefer xargs instaed (unless you happen to have buggy xargs).
We prefer find /somedir/ -type f -print0 |xargs -0 grep -f 'patternfile.txt' approach, but since
this is not available (for whatever valid reason),
we need to exec grep for each file,and this is normaly the wrong way.
But lets do it.
Assume : find -type f works.
Assume : xargs is broken OR not available.
First, if you have a buggy pipe, it might not handle large number of files.
So we avoid xargs in such systems (i know, i know, just lets pretend it is broken ).
find /whatever/dir/to/start/looking/ -type f > list-of-all-file-to-search-for.txt
IF your shell handles large size lists nicely,
for file in cat list-of-all-file-to-search-for.txt ; do grep REGEXP_PATTERN "$file" ;
done ; is a nice way to get by. Unfortunetly, some systems do not like that,
and in that case, you may require
cat list-of-all-file-to-search-for.txt | split --help -a 4 -d -l 2000 file-smaller-chunk.part.
to turn it into smaller chunks. Now this is for a seriously broken system.
then a for file in file-smaller-chunk.part.* ; do for single_line in cat "$file" ; do grep REGEXP_PATTERN "$single_line" ; done ; done ;
should work.
A
cat filelist.txt |while read file ; do grep REGEXP_PATTERN $file ; done ;
may be used as workaround on some systems.
What if my shell doe not handle quotes ?
You may have to escape the file list beforehand.
It can be done much nicer in awk, perl, whatever, but since we restrict our selves to
sed, lets do it.
We assume 0x27, the ' code will actually work.
cat list-of-all-file-to-search-for.txt |sed 's#['\'']#'\''\\'\'\''#g'|sed 's:^:'\'':g'|sed 's:$:'\'':g'
The only time I had to use this was when feeding output into bash again.
What if my shell does not handle that ?
xargs fails , grep -r fails , shell's for loop fails.
Do we have other things ? YES.
Escape all input suitable for your shell, and make a script.
But you know what, I got board, and writing automated scripts for csh just seems
wrong. So I am going to stop here.
Take home note
Use the tool for the right job. Writing a interpreter on bc is perfectly
capable, but it is just plain wrong. Install coreutils, perl, a better grep
what ever. makes life a better thing.

Unix [Homework]: Get a list of /home/user/ directories in /etc/passwd

I'm very new to Unix, and currently taking a class learning the basics of the system and its commands.
I'm looking for a single command line to list off all of the user home directories in alphabetical order from the /etc/passwd directory. This applies only to the home directories, and not the contents within them. There should be no duplicate entries. I've tried many permutations of commands such as the following:
sort -d | find /etc/passwd /home/* -type -d | uniq | less
I've tried using -path, -name, removing -type, using -prune, and changing the search pattern to things like /home/*/$, but haven't gotten good results once. At best I can get a list of my own directory (complete with every directory inside it, which is bad), and the directories of the other students on the server (without the contained directories, which is good). I just can't get it to display the /home/user directories and nothing else for my own account.
Many thanks in advance.
/etc/passwd is a file. the home directory is usually at field/column 6, where ":" is the delimiter. When you are dealing with file structure that has distinct characters as delimiters, you should use a tool that can break your data down into smaller chunks for easier manipulation using fields and field delimiters. awk/cut etc, even using the shell with IFS variable set can do the job. eg
awk -F":" '{print $6}' /etc/passwd | sort
cut -d":" -f6 /etc/passwd |sort
using the shell to read the file
while IFS=":" read -r a b c d e home_dir g
do
echo $home_dir
done < /etc/passwd | sort
I think the tools you want are grep, tr and awk. Grep will give you lines from the file that actually contain home directories. tr will let you break up the delimiter into spaces, which makes each line easier to parse.
Awk is just one program that would help you display the results that you want.
Good luck :)
Another hint, try ls --color=auto /etc, passwd isn't the kind of file that you think it is. Directories show up in blue.
In Unix, find is a command for finding files under one or more directories. I think you are looking for a command for finding lines within a file that match a pattern? Look into the command grep.
sed 's|\(.[^:]*\):\(.[^:]*\):\(.*\):\(.[^:]*\):\(.[^:]*\)|\4|' /etc/passwd|sort
I think all this processing could be avoided. There is a utility to list directory contents.
ls -1 /home
If you'd like the order of the sorting reversed
ls -1r /home
Granted, this list out the name of just that directory name and doesn't include the '/home/', but that can be added back easily enough if desired with something like this
ls -1 /home | (while read line; do echo "/home/"$line; done)
I used something like :
ls -l -d $(cut -d':' -f6 /etc/passwd) 2>/dev/null | sort -u
The only thing I didn't do is to sort alphabetically, didn't figured that yet

batch rename to change only single character

How to rename all the files in one directory to new name using the command mv. Directory have 1000s of files and requirement is to change the last character of each file name to some specific char. Example: files are
abc.txt
asdf.txt
zxc.txt
...
ab_.txt
asd.txt
it should change to
ab_.txt
asd_.txt
zx_.txt
...
ab_.txt
as_.txt
You have to watch out for name collisions but this should work okay:
for i in *.txt ; do
j=$(echo "$i" | sed 's/..txt$/_.txt/')
echo mv \"$i\" \"$j\"
#mv "$i" "$j"
done
after you uncomment the mv (I left it commented so you could see what it does safely). The quotes are for handling files with spaces (evil, vile things in my opinion :-).
If all files end in ".txt", you can use mmv (Multiple Move) for that:
mmv "*[a-z].txt" "#1_.txt"
Plus: mmv will tell you when this generates a collision (in your example: abc.txt becomes ab_.txt which already exists) before any file is renamed.
Note that you must quote the file names, else the shell will expand the list before mmv sees it (but mmv will usually catch this mistake, too).
If your files all have a .txt suffix, I suggest the following script:
for i in *.txt
do
r=`basename $i .txt | sed 's/.$//'`
mv $i ${r}_.txt
done
Is it a definite requirement that you use the mv command?
The perl rename utility was written for this sort of thing. It's standard for debian-based linux distributions, but according to this page it can be added really easily to any other.
If it's already there (or if you install it) you can do:
rename -v 's/.\.txt$/_\.txt/' *.txt
The page included above has some basic info on regex and things if it's needed.
Find should be more efficient than for file in *.txt, which expands all of your 1000 files into a long list of command line parameters. Example (updated to use bash replacement approach):
find . \( -type d ! -name . -prune \) -o \( -name "*.txt" \) | while read file
do
mv $file ${file%%?.txt}_.txt
done
I'm not sure if this will work with thousands of files, but in bash:
for i in *.txt; do
j=`echo $i |sed 's/.\.txt/_.txt/'`
mv $i $j
done
You can use bash's ${parameter%%word} operator thusly:
for FILE in *.txt; do
mv $FILE ${FILE%%?.txt}_.txt
done

Resources