Rename files based on sorted creation date? - unix

I have a directory filled with files with random names. I'd like to rename them 'file 1', 'file 2', etc. based on chronological order, i.e. file creation date. I could write a short Python script, but then I wouldn't learn anything. I was wondering if there's a clever one-line command that can solve this. If anyone could point me in the right direction, that would be great.
I'm using zsh.
Thanks!

For zsh:
saveIFS="$IFS"; IFS=$'\0'; while read -A line; do mv "${line[2]}" "${line[1]%.*}.${line[2]}"; done < <(find -maxdepth 1 -type f -printf "%T+\0%f\n"); IFS="$saveIFS"
For Bash (note the differences in the option to read and zero-based indexing instead of one-based):
saveIFS="$IFS"; IFS=$'\0'; while read -a line; do mv "${line[1]}" "${line[0]%.*}.${line[1]}"; done < <(find -maxdepth 1 -type f -printf "%T+\0%f\n"); IFS="$saveIFS"
These rename files by adding the modification date to the beginning of the original filename, which is retained to prevent name collisions.
A filename resulting from this might look like:
2009-12-15+11:08:52.original.txt
Because a null is used as the internal field separator (IFS), filenames with spaces should be preserved.
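If you literally want file 1, file 2, ... in chronological order instead, here is a minimal zsh sketch using glob qualifiers (note that Unix doesn't generally track creation time, so modification time stands in for it, and collisions with any existing "file N" names aren't handled):
i=1
for f in *(.Om); do    # . = plain files only, Om = sorted by mtime, oldest first
    mv -- "$f" "file $i"
    (( i++ ))
done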

How to find files that match names in a list and copy them to a directory?

I have a list of 50 names that look like this:
O8-E7
O8-F2
O8-F6
O8-F8
O8-H2
O9-A5
O9-B8
O9-D8
O9-E2
O9-F5
O9-H12
S37-A5
S37-B11
S37-B12
S37-C12
S37-D12
S37-E8
S37-G2
I want to look inside a specific directory for all the subdirectories whose name contains one of these elements.
For example, the directory Sample_S37-G2-from-Specimen-001 would be a match.
Inside those subdirectories, there is a file called accepted_hits.bam (unfortunately named the same way in all of them). I want to find these files and copy them into a single folder, with the name of the sample subdirectory that they came from.
For example, I would copy the accepted_hits.bam file from the subdirectory Sample_S37-G2-from-Specimen-001 to the new_dir as S37-G2_accepted_hits.bam
I tried using find, but it's not working and I don't really understand why.
cat sample.list | while read FILENAME; do find /path/to/sampleDirectories -name "$FILENAME" -exec cp '{}' new_dir \; ; done
Any ideas? Thanks!
You are looking for dirs that are exactly the same as the lines in your input.
The first improvement would be using wildcards
cat sample.list | while read FILENAME; do
find /path/to/sampleDirectories -name "*${FILENAME}*" -exec cp '{}' new_dir \; ; done
Your new problem is that now you will be matching the directories, not the files. What you actually want are the files named accepted_hits.bam inside those directories.
So your next try would be parsing the output of
find /path/to/sampleDirectories -name accepted_hits.bam | grep "${FILENAME}"
but you do not want to call find for each entry in sample.list.
You need to start with one find command and get the relevant subdirs from it.
A complication is that you want the matched substring from the source path in your destination file name. Look at the grep options -o and -f; they help!
find /path/to/sampleDirectories -name accepted_hits.bam | while read orgfile; do
    matched_part=$(echo "${orgfile}" | grep -o -f sample.list)
    if [ -n "${matched_part}" ]; then
        cp "${orgfile}" "new_dir/${matched_part}_accepted_hits.bam"
    fi
done
This will only work when your sample.list has no additional whitespace. If it does have whitespace and you cannot change the file, you need to copy/parse sample.list to another file.
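For instance, a minimal sketch that strips trailing whitespace into a cleaned copy (the name sample.clean is made up here):
sed 's/[[:space:]]*$//' sample.list > sample.clean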
When one of your 50 entries in sample.list is a substring of "accepted_hits.bam", you need to do some extra work.
Edit: if [ -n "${matched_part}" ] was missing the $.
Try using egrep with alternation:
build a text file with a single line of patterns: (pat1|pat2|pat3)
call find to list all of the regular files
use egrep to select the ones matching the patterns in the pattern file
awk 'BEGIN { printf("(") } FNR==1 {printf("%s", $0)} FNR>1 {printf("|%s", $0)} END{printf(")\n") } ' sample.list > t.sed
find /path/to/sampleDirectories -type f | egrep -f t.sed > filelist
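From there, a hedged sketch to finish the copy step (assuming sample.list is clean as noted above; head -n 1 guards against a path matching more than one pattern):
mkdir -p new_dir
while read -r f; do
    id=$(printf '%s\n' "$f" | grep -o -f sample.list | head -n 1)
    [ -n "$id" ] && cp "$f" "new_dir/${id}_accepted_hits.bam"
done < filelist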

shell script to find number of unique files in a directory as well as its subdirectories?

I'm trying to find the number of unique files in a directory as well as its subdirectories. Is this possible?
Say for example there is a directory with 100 files. How would I count the number of unique files under that directory?
Assuming you're asking about file names, you can
First, list all the files in the directory tree
Second, get the unique values from the list
To list all the files, you can use find. Normally find prints the entire path name of each result, but since you just want to compare the base file names, you will have to customize its output:
find directoryName -type f -printf '%f\n'
This will print each base file name, one per line. Now you can get only the unique file names by sorting and then collapsing all adjacent entries that share a name into a single entry. The sort command with the -u switch does this for you:
find directoryName -type f -printf '%f\n' | sort -u
If you want to get a count of the number of repetitions of each unique file name, then just use sort by itself and use uniq -c to handle the collapsing and the counting:
find directoryName -type f -printf '%f\n' | sort | uniq -c
Note that the above solution will get confused by file names that contain newline (\n) characters. If you have any such file names, you should read in the find manual about null-terminating (instead of newline-terminating) your output.
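For example, a hedged NUL-terminated variant of the counting pipeline (GNU find, sort, and uniq all support zero-terminated records):
find directoryName -type f -printf '%f\0' | sort -z | uniq -zc | tr '\0' '\n'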
Finally, if you're simply looking for a numeric output, pipe the whole thing through "wc -l" to count it.
find directoryName -type f -printf '%f\n' | sort | uniq -c | wc -l
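Note that this counts distinct names. If by "unique" you instead mean names that occur exactly once in the tree, uniq -u keeps only the unrepeated lines:
find directoryName -type f -printf '%f\n' | sort | uniq -u | wc -l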

unix command to change directory name

Hi, this is a simple question but the solution eludes me at the moment...
I can find out the name of the folder I want to rename, and I know the command to rename a folder is mv.
So from the current directory, if I run
ls ~/relevant.directory.containing.directory.name.i.want.to.change
I get that the directory is called, say, lorem-ipsum-v1-3.
The directory name may change in the future, but it is the only directory in:
~/relevant.directory.containing.directory.name.i.want.to.change
How do I programmatically change it to a specific name like correct-files?
I can do it manually by just doing something like
mv lorem-ipsum-v1-3 correct-files
but I want to start automating this so that I don't need to keep copying and pasting the directory name...
Any help would be appreciated!
Something like:
find . -depth -maxdepth 1 -type d | head -n 1 | xargs -I '{}' mv '{}' correct-files
should work fine, as long as it is run from inside the parent directory and only one subdirectory needs to be renamed.
If you are absolutely certain that relevant.directory.containing.directory.name.i.want.to.change only contains the directory you want to rename, then you can simply use a wildcard:
mv ~/relevant.directory.containing.directory.name.i.want.to.change/*/ ~/relevant.directory.containing.directory.name.i.want.to.change/correct-files
This can also be simplified further, using bash brace expansion, to:
mv ~/relevant.directory.containing.directory.name.i.want.to.change/{*/,correct-files}
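To see what that builds, prefix the command with echo as a dry run (assuming the only subdirectory is lorem-ipsum-v1-3):
echo mv ~/relevant.directory.containing.directory.name.i.want.to.change/{*/,correct-files}
# prints: mv ~/.../lorem-ipsum-v1-3/ ~/.../correct-files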
cd ~/relevant.directory.containing.directory.name.i.want.to.change
find . ! -name . -type d -print | while read a
do
    mv "$a" correct-files
done
Caveats:
No error handling
There may be a way of reversing the parameters to mv so you can use xargs instead of a while loop, but that's not standard (as far as I'm aware)
Not parameterised
If there are any subdirectories it won't work. The depth options on find (-mindepth/-maxdepth) would help, but they are (again, AFAIK) not standard. They do exist on GNU versions but seem to be missing on Solaris
Probably others...
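For what it's worth, here is a hedged bash sketch that addresses the error-handling and parameterisation caveats (rename_only_subdir is a name invented for this example):
rename_only_subdir() {
    # $1 = parent directory, $2 = new name for its single subdirectory
    local parent=$1 newname=$2 dirs
    dirs=( "$parent"/*/ )    # the trailing / makes the glob match directories only
    if (( ${#dirs[@]} != 1 )) || [ ! -d "${dirs[0]}" ]; then
        echo "expected exactly one subdirectory in $parent" >&2
        return 1
    fi
    mv -- "${dirs[0]}" "$parent/$newname"
}
rename_only_subdir ~/relevant.directory.containing.directory.name.i.want.to.change correct-files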

Using mtime other than with FIND

I am trying to write a script which will move files older than 1 day to an archive directory. I used the following find command:
for filename in `find /file_path/*.* -type f -mtime +1`
This fails since the argument list produced by the shell expanding /file_path/*.* is too big to pass to find. I got the following error:
/usr/bin/find: arg list too long
Is it possible to use find in an IF-ELSE statement? Can someone provide some examples of using mtime other than in find?
Edit: added the for loop of which the find is a part.
find /file_path -name '*.*' -mtime +1 -type f |
while read filename
do ...move operation...
done
That assumes your original code was acceptable in the way it handled spaces etc in file names,
and that there is no sensible way to do the move in the action of find. It also avoids problems with overlong argument lists.
Why not just use the -exec part of find?
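For example, a hedged GNU-flavoured version of the move (the + terminator batches arguments the way xargs does, and GNU mv's -t names the target directory first):
find /file_path -name '*.*' -mtime +1 -type f -exec mv -t /usr/local/archived {} +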
If you just want to move the files, you could use
find /file_path -name "*.*" -mtime +1 -type f | xargs -I {} mv {} /usr/local/archived

Diff files present in two different directories

I have two directories with the same list of files. I need to compare all the files present in both the directories using the diff command. Is there a simple command line option to do it, or do I have to write a shell script to get the file listing and then iterate through them?
You can use the diff command for that:
diff -bur folder1/ folder2/
This will output a recursive diff that ignores whitespace, with unified context:
-b means ignore changes in the amount of whitespace
-u means unified context (3 lines before and after)
-r means recursive
If you are only interested in seeing which files differ, you may use:
diff -qr dir_one dir_two | sort
The -q option shows only which files differ, not the content that differs, and sort arranges the output alphabetically.
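For illustration, the output of the -q form looks like this (the file names here are made up):
Files dir_one/a.txt and dir_two/a.txt differ
Only in dir_two: b.txt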
Diff has an option -r which is meant to do just that.
diff -r dir1 dir2
diff can not only compare two files, it can, by using the -r option, walk entire directory trees, recursively checking differences between subdirectories and files that occur at comparable points in each tree.
$ man diff
...
-r --recursive
Recursively compare any subdirectories found.
...
Another nice option is the über-diff-tool diffoscope:
$ diffoscope a b
It can also emit diffs as JSON, HTML, Markdown, and more.
If you specifically don't want to compare the contents of files and only want to check which ones are not present in both directories, you can compare lists of files generated by another command.
diff <(find DIR1 -printf '%P\n' | sort) <(find DIR2 -printf '%P\n' | sort) | grep '^[<>]'
-printf '%P\n' tells find to not prefix output paths with the root directory.
I've also added sort to make sure the order of files will be the same in both calls of find.
The grep at the end removes information about identical input lines.
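If you prefer two columns over diff markers, comm gives an equivalent view using the same find/sort pipelines (comm -3 hides lines common to both lists; column 1 is only-in-DIR1, the indented column 2 is only-in-DIR2):
comm -3 <(find DIR1 -printf '%P\n' | sort) <(find DIR2 -printf '%P\n' | sort)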
If it's GNU diff then you should just be able to point it at the two directories and use the -r option.
Otherwise, try using
for i in $(\ls -d ./dir1/*); do diff "${i}" dir2; done
N.B. As pointed out by Dennis in the comments section, you don't actually need the command substitution on the ls. I've been doing this for so long that I'm pretty much on autopilot, substituting in whatever command I need to get my list of files for comparison.
Also, I forgot to add that I use '\ls' to temporarily disable my alias of ls to GNU ls, so that I lose the colour-formatting info from the listing returned by GNU ls.
When working with git/svn, or with multiple git/svn instances on disk, this has been one of the most useful things for me over the past 5-10 years; somebody else might find it useful too:
diff -burN /path/to/directory1 /path/to/directory2 | grep '+++'
or:
git diff /path/to/directory1 | grep '+++'
It gives you a snapshot of the different files that were touched without having to "less" or "more" the output. Then you just diff on the individual files.
In practice, the question often arises together with some constraints. In that case, the following solution template may come in handy.
cd dir1
find . \( -name '*.txt' -o -iname '*.md' \) | xargs -I{} diff -u '{}' '../dir2/{}'
Here is a script to show differences between files in two folders. It works recursively. Change dir1 and dir2.
(
  search() {
    for i in "$1"/*; do
      [ -f "$i" ] && ( diff "$1/${i##*/}" "$2/${i##*/}" || echo "files: $1/${i##*/} $2/${i##*/}" )
      [ -d "$i" ] && search "$1/${i##*/}" "$2/${i##*/}"
    done
  }
  search "dir1" "dir2"
)
Try this:
diff -rq /path/to/folder1 /path/to/folder2
