Diff files present in two different directories - unix

I have two directories with the same list of files. I need to compare all the files present in both the directories using the diff command. Is there a simple command line option to do it, or do I have to write a shell script to get the file listing and then iterate through them?

You can use the diff command for that:
diff -bur folder1/ folder2/
This will output a recursive diff that ignore spaces, with a unified context:
b flag means ignoring whitespace
u flag means a unified context (3 lines before and after)
r flag means recursive

If you are only interested to see the files that differ, you may use:
diff -qr dir_one dir_two | sort
Option "q" will only show the files that differ but not the content that differ, and "sort" will arrange the output alphabetically.

Diff has an option -r which is meant to do just that.
diff -r dir1 dir2

diff can not only compare two files, it can, by using the -r option, walk entire directory trees, recursively checking differences between subdirectories and files that occur at comparable points in each tree.
$ man diff
...
-r --recursive
Recursively compare any subdirectories found.
...
Another nice option is the über-diff-tool diffoscope:
$ diffoscope a b
It can also emit diffs as JSON, html, markdown, ...

If you specifically don't want to compare contents of files and only check which one are not present in both of the directories, you can compare lists of files, generated by another command.
diff <(find DIR1 -printf '%P\n' | sort) <(find DIR2 -printf '%P\n' | sort) | grep '^[<>]'
-printf '%P\n' tells find to not prefix output paths with the root directory.
I've also added sort to make sure the order of files will be the same in both calls of find.
The grep at the end removes information about identical input lines.

If it's GNU diff then you should just be able to point it at the two directories and use the -r option.
Otherwise, try using
for i in $(\ls -d ./dir1/*); do diff ${i} dir2; done
N.B. As pointed out by Dennis in the comments section, you don't actually need to do the command substitution on the ls. I've been doing this for so long that I'm pretty much doing this on autopilot and substituting the command I need to get my list of files for comparison.
Also I forgot to add that I do '\ls' to temporarily disable my alias of ls to GNU ls so that I lose the colour formatting info from the listing returned by GNU ls.

When working with git/svn or multiple git/svn instances on disk this has been one of the most useful things for me over the past 5-10 years, that somebody might find useful:
diff -burN /path/to/directory1 /path/to/directory2 | grep +++
or:
git diff /path/to/directory1 | grep +++
It gives you a snapshot of the different files that were touched without having to "less" or "more" the output. Then you just diff on the individual files.

In practice the question often arises together with some constraints. In that case following solution template may come in handy.
cd dir1
find . \( -name '*.txt' -o -iname '*.md' \) | xargs -i diff -u '{}' 'dir2/{}'

Here is a script to show differences between files in two folders. It works recursively. Change dir1 and dir2.
(search() { for i in $1/*; do [ -f "$i" ] && (diff "$1/${i##*/}" "$2/${i##*/}" || echo "files: $1/${i##*/} $2/${i##*/}"); [ -d "$i" ] && search "$1/${i##*/}" "$2/${i##*/}"; done }; search "dir1" "dir2" )

Try this:
diff -rq /path/to/folder1 /path/to/folder2

Related

Unix Find directories containing more than 2 files

I have a directory containing over a thousand subdirectories. I only want to 'ls' all the directories that contain more than 2 files. I don't need the directories that contain less than 2 files. This is in C-shell, not bash. Anyone know of a good command for this?
I tried this command but it's not giving the desired output. I simply want the full list of directories with more than 2 files. A reason it isn't working is because it will go into sub dirs in those dirs to find if they have more than 2 files. I don't want a recursive search. Just a list of first level directories in the main directory they are in.
$ find . -type f -printf '%h\n' | sort | uniq -c | awk '$1 > 2'
My mistake, I was thinking bash rather than csh. Although I don't have a csh to test with, I think this is the csh syntax for the same thing:
foreach d (*)
if (d "$d" && `ls -1 "$d"|wc -l` > 2) echo "$d"
end
I've added a guard so that non-directories aren't unnecessarily processed, and I've included double-quotes in case there are any "funny" file or directory names (containing spaces e.g.).
One possible problem (I don't know what your exact task is): any immediate subdirectories will also count as files.
Sorry, I was working in bash here:
for d in *;do if [ $(ls -1 $d|wc -l) -gt 2 ];then echo $d;fi;done
For a faster solution, you could try "cheating" by deconstructing the format of the directories themselves if you're on pure Unix. They're just files themselves with contents that can be analyzed in that case. Needless to say that is NOT PORTABLE, to e.g. any bash running on Windows, so not recommended.

terminal command to act on filenames that don't contain text

I have a directory full of files with names such as:
file_name_is_001
file_name_001
file_name_is_002
file_name_002
file_name_is_003
file_name_003
I want to copy only the files that don't contain 'is'. I'm not sure how to do this. I have tried to search for it, but can't seem to google the right phrase to find the results.
Details depend on operating system, shell, etc.
For a unix system a quite verbose but easy to understand approach could look like this (please mind that I didn't test it):
mkdir some_temporary_directory
mv *_is_* some_temporary_directory
cp * where_ever_you_want_to_copy_it
mv some_temporary_directory/* .
rmdir some_temporary_directory
You can do this using bash. First, here's a command to get you a list of files that don't contain the text _is_:
ls | grep -v "_is_"
This takes the output of ls and matches all values with DO NOT contain _is_ using grep -v.
In order to then copy these files, we need to turn the lines output by grep into arguments of cp. We can do this using xargs:
ls | grep -v "_is_" | xargs -J % cp % new_folder
From the xargs man page, it is a tool to "build and execute command lines from standard input".

unix command to change directory name

Hi this is a simple question but the solution eludes me at the moment..
I can find out the folder name that I want to change the name of, and I know the command to change the name of a folder is mv
so from the current directory if i go
ls ~/relevant.directory.containing.directory.name.i.want.to.change
to which i get the name of the directory is called say lorem-ipsum-v1-3
but the directory name may change in the future but it is the only directory in the directory:
~/relevant.directory.containing.directory.name.i.want.to.change
how to i programmatically change it to a specific name like correct-files
i can do it normally by just doing something like
mv lorem-ipsum-v1-3 correct-files
but I want to start automating this so that I don't need to keep copying and pasting the directory name....
any help would be appreciated...
Something like:
find . -depth -maxdepth 1 -type d | head -n 1 | xargs -I '{}' mv '{}' correct-files
should work fine as long as only one directory should be moved.
If you are absolutely certain that relevant.directory.containing.directory.name.i.want.to.change only contains the directory you want to rename, then you can simply use a wildcard:
mv ~/relevant.directory.containing.directory.name.i.want.to.change/*/ ~/relevant.directory.containing.directory.name.i.want.to.change/correct-files
This can can also be simplified further, using bash brace expansion, to:
mv ~/relevant.directory.containing.directory.name.i.want.to.change/{*/,correct-files}
cd ~/relevant.directory.containing.directory.name.i.want.to.change
find . -type d -print | while read a ;
do
mv $a correct-files ;
done
Caveats:
No error handling
There may be a way of reversing the parameters to mv so you can use xargs instead of a while loop, but that's not standard (as far as I'm aware)
Not parameterised
If there any any subdirectories it won't work. The depth parameters on the find command are (again, AFAIK) not standard. They do exist on GNU versions but seem to be missing on Solaris
Probably others...

Efficient way of getting listing of files in large filesystem

What is the most efficient way to get a "ls"-like output of the most recently created files in a very large unix file system (100 thousand files +)?
Have tried ls -a and some other varients.
You can also use less to search and scroll it easily.
ls -la | less
If I'm understanding your question correctly try
ls -a | tail
More information here
If the files are in a single directory, then you can use:
ls -lt | less
the -t option to ls will sort the files by modification time and less will let you scroll through them
If the want recent files across an entire file system --- i.e., in different directories, then you can use the find command:
find dir -mtime 1 -print | xargs ls -ld
Substitute the directory where you want to start the search for "dir". The find command will print the names of all of the files that have been modified in the last day (-mtime 1 means modified in the last one day) and the xargs command will take that list of files and feed it to ls, giving you the ls-like output you want

Unix [Homework]: Get a list of /home/user/ directories in /etc/passwd

I'm very new to Unix, and currently taking a class learning the basics of the system and its commands.
I'm looking for a single command line to list off all of the user home directories in alphabetical order from the /etc/passwd directory. This applies only to the home directories, and not the contents within them. There should be no duplicate entries. I've tried many permutations of commands such as the following:
sort -d | find /etc/passwd /home/* -type -d | uniq | less
I've tried using -path, -name, removing -type, using -prune, and changing the search pattern to things like /home/*/$, but haven't gotten good results once. At best I can get a list of my own directory (complete with every directory inside it, which is bad), and the directories of the other students on the server (without the contained directories, which is good). I just can't get it to display the /home/user directories and nothing else for my own account.
Many thanks in advance.
/etc/passwd is a file. the home directory is usually at field/column 6, where ":" is the delimiter. When you are dealing with file structure that has distinct characters as delimiters, you should use a tool that can break your data down into smaller chunks for easier manipulation using fields and field delimiters. awk/cut etc, even using the shell with IFS variable set can do the job. eg
awk -F":" '{print $6}' /etc/passwd | sort
cut -d":" -f6 /etc/passwd |sort
using the shell to read the file
while IFS=":" read -r a b c d e home_dir g
do
echo $home_dir
done < /etc/passwd | sort
I think the tools you want are grep, tr and awk. Grep will give you lines from the file that actually contain home directories. tr will let you break up the delimiter into spaces, which makes each line easier to parse.
Awk is just one program that would help you display the results that you want.
Good luck :)
Another hint, try ls --color=auto /etc, passwd isn't the kind of file that you think it is. Directories show up in blue.
In Unix, find is a command for finding files under one or more directories. I think you are looking for a command for finding lines within a file that match a pattern? Look into the command grep.
sed 's|\(.[^:]*\):\(.[^:]*\):\(.*\):\(.[^:]*\):\(.[^:]*\)|\4|' /etc/passwd|sort
I think all this processing could be avoided. There is a utility to list directory contents.
ls -1 /home
If you'd like the order of the sorting reversed
ls -1r /home
Granted, this list out the name of just that directory name and doesn't include the '/home/', but that can be added back easily enough if desired with something like this
ls -1 /home | (while read line; do echo "/home/"$line; done)
I used something like :
ls -l -d $(cut -d':' -f6 /etc/passwd) 2>/dev/null | sort -u
The only thing I didn't do is to sort alphabetically, didn't figured that yet

Resources