Compare two folders which have many files inside contents - unix

I have two folders, each with approx. 150 Java property files.
In a shell script, how can I compare the two folders to see whether either of them contains a new property file, and what the differences between the property files are?
The output should be in a report format.

To get a summary of new/missing files, and of which files differ:
diff -arq folder1 folder2
-a treats all files as text, -r recursively searches subdirectories, and -q reports briefly, stating only whether files differ rather than how.

diff -r will do this, telling you both if any files have been added or deleted, and what's changed in the files that have been modified.
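Since the question asks for a report, here is a minimal sketch that combines the two commands above into one. The function name compare_dirs is made up for illustration, and the grep patterns assume the English "Only in ..." and "Files ... differ" messages printed by GNU and BSD diff (hence LC_ALL=C).

```shell
#!/usr/bin/env bash
# Sketch: print a three-part report comparing two folders.
# compare_dirs is a hypothetical helper name, not part of diff.
compare_dirs() {
    local dir1=$1 dir2=$2

    echo "== Files present in only one folder =="
    LC_ALL=C diff -rq "$dir1" "$dir2" | grep '^Only in' || true

    echo "== Files whose contents differ =="
    LC_ALL=C diff -rq "$dir1" "$dir2" | grep '^Files .* differ$' || true

    echo "== Line-by-line differences =="
    # -u prints unified diffs; the "Only in" lines are already listed above.
    LC_ALL=C diff -ru "$dir1" "$dir2" | grep -v '^Only in' || true
}
```

Calling compare_dirs folder1 folder2 > report.txt would write the report to a file.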

I used
diff -rqyl folder1 folder2 --exclude=node_modules
in my nodejs apps.

Could you use dircmp?

The diff command in Unix is used to find the differences between files (of all types). Since a directory is also a type of file, the differences between two directories can easily be figured out by using diff. For more options, run man diff on your Unix box.
-b Ignores trailing blanks (spaces and tabs) and treats other strings of blanks as equivalent.
-i Ignores the case of letters. For example, `A' will compare equal to `a'.
-t Expands <TAB> characters in output lines. Normal or -c output adds character(s) to the front of each line that may adversely affect the indentation of the original source lines and make the output lines difficult to interpret. This option will preserve the original source's indentation.
-w Ignores all blanks (<SPACE> and <TAB> characters) and treats all other strings of blanks as equivalent. For example, `if ( a == b )' will compare equal to `if(a==b)'.
There are many more.
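To see what these options change in practice, a small sketch along these lines may help. The wrapper name files_match is hypothetical; it simply passes its arguments to diff and relies on diff's exit status (0 when no differences are found).

```shell
#!/usr/bin/env bash
# files_match [diff options] FILE1 FILE2
# Hypothetical wrapper: succeeds when diff reports no differences.
files_match() {
    diff "$@" >/dev/null
}

# Possible usage (file names are placeholders):
#   files_match -w old.properties new.properties && echo "same apart from blanks"
#   files_match -i old.properties new.properties && echo "same apart from case"
```

The same options can be combined with -r when comparing whole folders, for example diff -rw folder1 folder2.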

Related

UNIX command to move multiple files to multiple subdirectories?

I work in an X11 window on a Mac OS X machine. I have hundreds of files in one directory, each file name containing a substring such as "1970", "1971", ..., "2014", indicating that the file is for that year. I have just created subdirectories named "1970", "1971", ..., "2014".
What is the one-line UNIX command that would move all the files into the subdirectories corresponding to their years?
If the year-name sub-directories are in the single directory that currently contains all the files, then you should be able to use something like this, assuming that the current directory is that single directory:
shopt -s nullglob
for year in {1970..2014}
do
mv *?${year}* $year
mv *${year}?* $year
done
The globbing insists on at least one more character in the name to be moved than just the year, either before or after the year, to prevent an attempt to move 1970 into itself (which would fail). You need two mv commands to prevent a-1970-b from matching both glob expressions in a single command (which would cause the second move to fail, as the file would already have been moved). Using globbing like this preserves spaces etc. in file names correctly. (Using command substitution, etc., does not.)
The shopt -s nullglob command means that if no files match a given glob, the pattern expands to nothing instead of being passed to mv literally. mv then receives only the directory name and reports a usage error (a nuisance), but that is otherwise harmless. You could decide to filter such error messages if you really want to; you probably don't want to send all error messages to /dev/null, though.
Since you're on a Mac, you don't have GNU mv with the very useful -t target option.
You said you need a single line command; replace each newline except the one after do with a semicolon; replace the newline after do with a space.
If you know that the year is never at the beginning or end of the file name, you can use a single mv *?${year}?* $year command.
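If the usage errors from mv are a nuisance, one alternative is to loop over the files rather than over the years. This is only a sketch under a few assumptions: the function name move_by_year is invented, the year range 1970 to 2014 is hard-coded in the regular expression, and the first year found in a file name decides where the file goes. It uses Bash's =~ operator, which the stock Bash on a Mac should support, and it quotes every expansion so that spaces in file names survive.

```shell
#!/usr/bin/env bash
# move_by_year [DIR]: move each regular file in DIR into DIR/<year>,
# where <year> is the first 1970..2014 substring in the file name.
# Hypothetical helper; files without a matching year are left alone.
move_by_year() {
    local dir=${1:-.}
    local re='(19[7-9][0-9]|20(0[0-9]|1[0-4]))'
    local path name year

    for path in "$dir"/*; do
        [ -f "$path" ] || continue          # skip the year directories themselves
        name=${path##*/}
        if [[ $name =~ $re ]]; then
            year=${BASH_REMATCH[1]}
            if [ -d "$dir/$year" ]; then    # only move into directories that exist
                mv -- "$path" "$dir/$year/"
            fi
        fi
    done
    return 0
}
```

This is not a one-liner, but it never calls mv without a source file.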

unix command line ...how to grep and show only file names that contain a string?

I know I can search for a string with:
grep -n -d recurse 'snoopy' *
and then it shows every file name and instance that contains that string, like:
file/name.txt:23 some snoopy here
file/name2.txt:59 another snoopy there
file/name2.txt:343 some more snoopy
etc...
The problem is that with many occurrences, the list is huge. How do I make it show only the actual file names that contain the string, without duplicates and without the occurrence?
Only like:
file/name1.txt
file/name52.txt
file/name28293.txt
Thanks a lot for any help :)
The -l flag (or, in both BSD and GNU grep, --files-with-matches) does what you want.
From the POSIX spec:
Write only the names of files containing selected lines to standard output. Pathnames shall be written once per file searched. If the standard input is searched, a pathname of "(standard input)" shall be written, in the POSIX locale. In other locales, "standard input" may be replaced by something more appropriate in those locales.
Both BSD and GNU also explicitly guarantee that this will be more efficient. (Older BSD versions say "… grep will only search a file until a match has been found, making searches potentially less expensive", newer BSD and GNU say "The scanning will stop on the first match".) If you don't know which grep you have and which options it has, just type man grep at the shell and you should get the manpage.
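A minimal sketch of how this might look for the example in the question. The function name list_matching_files is made up, and -r is used here in place of -d recurse (both GNU and BSD grep accept it, though it is not in the POSIX spec quoted above).

```shell
#!/usr/bin/env bash
# list_matching_files PATTERN [DIR]
# Hypothetical wrapper: print each file under DIR that contains PATTERN, once.
list_matching_files() {
    local pattern=$1 dir=${2:-.}
    grep -rl -- "$pattern" "$dir"
}

# Possible usage:
#   list_matching_files snoopy .
```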

How to make the glob() function also match hidden dot files in Vim?

In a Linux or Mac environment, Vim’s glob() function doesn’t match dot files such as .vimrc or .hiddenfile. Is there a way to get it to match all files including hidden ones?
The command I’m using:
let s:BackupFiles = glob("~/.vimbackup/*")
I’ve even tried setting the mysterious {flag} parameter to 1, and yet it still doesn’t return the hidden files.
Update: Thanks ib! Here’s the result of what I’ve been working on: delete-old-backups.vim.
That is due to how the glob() function works: A single-star pattern
does not match hidden files by design. In most shells, the default
globbing style can be changed to do so (e.g., via shopt -s dotglob
in Bash), but it is not possible in Vim, unfortunately.
However, there are still several ways to solve the problem.
The first and most obvious is to glob hidden and non-hidden files
separately and then concatenate the results:
:let backupfiles = glob(&backupdir..'/*').."\n"..glob(&backupdir..'/.[^.]*')
(Be careful not to fetch the . and .. entries along with hidden files.)
Another, perhaps more convenient but less portable way is to use
the backtick expansion within the glob() call:
:let backupfiles = glob('`find '..&backupdir..' -maxdepth 1 -type f`')
This forces Vim to execute the command inside backticks to obtain
the list of files. The find shell command lists all files (-type f)
including the hidden ones, in the specified directory (-maxdepth 1
forbids recursion).
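For comparison, the Bash behaviour mentioned above (shopt -s dotglob) can be sketched as follows. The function name list_all_entries is hypothetical; its body runs in a subshell so that the changed globbing options and the cd do not leak into the calling shell. Even with dotglob set, Bash does not match the . and .. entries.

```shell
#!/usr/bin/env bash
# list_all_entries DIR: print every entry in DIR, hidden ones included.
# Hypothetical helper; the ( ... ) body keeps shopt and cd local to the call.
list_all_entries() (
    shopt -s dotglob nullglob
    cd -- "$1" || exit 1
    for entry in *; do
        printf '%s\n' "$entry"
    done
)
```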

Unix wildcard selectors? (Asterisks)

In Ryan Bates' Railscast about git, his .gitignore file contains the following line:
tmp/**/*
What is the purpose of using the double asterisks followed by an asterisk as such: **/*?
Would using simply tmp/* instead of tmp/**/* not achieve the exact same result?
Googling the issue, I found an unclear IBM article about it, and I was wondering if someone could clarify the issue.
It says to descend into all the subdirectories below tmp, rather than matching only the immediate contents of tmp.
e.g. I have the following:
$ find tmp
tmp
tmp/a
tmp/a/b
tmp/a/b/file1
tmp/b
tmp/b/c
tmp/b/c/file2
matched output:
$ echo tmp/*
tmp/a tmp/b
matched output:
$ echo tmp/**/*
tmp/a tmp/a/b tmp/a/b/file1 tmp/b tmp/b/c tmp/b/c/file2
This is a default feature of zsh. To get it to work in bash 4, run:
shopt -s globstar
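A small sketch of the bash 4 behaviour, assuming a directory laid out like the tmp example above. The function name list_recursive is invented for illustration, and the subshell body keeps the globstar setting from affecting the rest of the session.

```shell
#!/usr/bin/env bash
# list_recursive DIR: print everything under DIR at any depth,
# using ** (requires bash 4 or later with globstar).
# Hypothetical helper name.
list_recursive() (
    shopt -s globstar nullglob
    for path in "$1"/**/*; do
        printf '%s\n' "$path"
    done
)

# Possible usage:
#   list_recursive tmp
```

Without globstar, bash treats ** like a single *, so tmp/**/* would only match entries exactly two levels down.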
From http://blog.privateergroup.com/2010/03/gitignore-file-for-android-development/:
(kwoods)
"The double asterisk (**) is not a git thing per say, it’s really a linux / Mac shell thing.
It would match on everything including any sub folders that had been created.
You can see the effect in the shell like so:
# ls ./tmp/* = should show you the contents of ./tmp (files and folders)
# ls ./tmp/** = same as above, but it would also go into each sub-folder and show the contents there as well."
According to the documentation of gitignore, this syntax is supported since git version 1.8.2.
Here is the relevant section:
Two consecutive asterisks (**) in patterns matched against full pathname may have special meaning:
A leading ** followed by a slash means match in all directories. For example, **/foo matches file or directory foo anywhere, the same as pattern foo. **/foo/bar matches file or directory bar anywhere that is directly under directory foo.
A trailing /** matches everything inside. For example, abc/** matches all files inside directory abc, relative to the location of the .gitignore file, with infinite depth.
A slash followed by two consecutive asterisks then a slash matches zero or more directories. For example, a/**/b matches a/b, a/x/b, a/x/y/b and so on.
Other consecutive asterisks are considered invalid.

To restrict a node from `tree` by Tree or Git

How can you restrict a node from the command tree?
#1
I need to give a tree of my project files regularly to my supervisor.
These files contain some third-party components which I do not want to show in the tree.
So far I have solved this problem by copying the project files to tmp, removing the third-party files, and then running tree.
However, this procedure is becoming cumbersome.
I would like to get a better way to give tree of my files to my supervisor.
#2
I have the files which I want to show in Git so Git may solve this problem.
I ran, unsuccessfully:
git ls-files --with-tree
You can specify the files you want to match and avoid using general patterns. From the tree manpage:
-P pattern
List only those files that match the wild-card pattern. Note: you must use the -a option to also consider those files beginning with a dot '.' for matching. Valid wildcard operators are '*' (any zero or more characters), '?' (any single character), '[...]' (any single character listed between brackets (optional - (dash) for character range may be used: ex: [A-Z]), and '[^...]' (any single character not listed in brackets) and '|' separates alternate patterns.
-I pattern
Do not list those files that match the wild-card pattern.
In your specific case, running
tree -I '3rd*'
should hide a directory called '3rd_party', including subdirs and files, while still allowing matches like 'party_3rd'. Obviously, other files and directories not containing '3rd' in the name will also display as normal. I've verified this behaviour with tree v1.5.2.1 on Linux.
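As a rough sketch, the exclusion could be wrapped in a small function so the listing for the supervisor is one command. The function name project_listing and the default directory name 3rd_party are assumptions for illustration. The find fallback is not part of tree; it is a different technique (pruning the excluded directory) that produces a flat listing on machines where tree is not installed.

```shell
#!/usr/bin/env bash
# project_listing [ROOT] [EXCLUDE]
# Hypothetical helper: list ROOT while hiding entries named EXCLUDE.
project_listing() {
    local root=${1:-.} exclude=${2:-3rd_party}
    if command -v tree >/dev/null 2>&1; then
        tree -I "$exclude" "$root"
    else
        # Fallback: skip the excluded directory entirely, print everything else.
        find "$root" -name "$exclude" -prune -o -print
    fi
}

# Possible usage:
#   project_listing . 3rd_party > project-tree.txt
```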
You can put the third-party tools in a separate subdirectory.
Then you only have to eliminate one node.
Instead of changing the tree command it might be better to place the 3rd-party files in a sibling folder of, not in a child folder of, your own source.

Resources