Zsh glob: objects recursively under subdirectories, excluding current/base directory - zsh

I am trying to generate the file names of all objects (files, directories, and so on) recursively under all subdirectories of the current directory, excluding the objects in the current directory itself.
In other words, given:
--dir1 --dir2.1
|      | dir2.2 --file3.1
|      --file2.1
--file1
I want to generate:
./dir2.1
./dir2.2
./dir2.2/file3.1
./file2.1
I have set the EXTENDED_GLOB option, and I assumed that the following pattern would do the trick:
./**/*~./*
But it returns:
zsh: no matches found: ./**/*~./*
I don't know what the problem is, it should work.
./**/* gives:
./dir1
./dir2.1
./dir2.2
./dir2.2/file3.1
./file2.1
./file1
And ./* gives:
./dir1
./file1
How come ./**/*~./* fails? And more important, how can I generate the name of the elements recursively in the subdirectories excluding the elements in current/base directory?
Thanks.

The x~y glob operator (1) treats y as an ordinary shell pattern that is matched against each name generated by x, not as a separate file name generation, so ./**/*~./* gives "no matches found":
% print -l ./**/*~./*
;# ./dir1 # <= './*' matches, so exclude this entry
;# ./dir2.1 # <= './*' matches, so exclude this entry
;# .. # ditto...
;# => finally, no matches found
The exclusion pattern ./* matches everything generated by the glob ./**/*, because in the exclusion pattern * is allowed to match / as well, so zsh ends up with "no matches found". (zsh does not do filename generation for the ~y part.)
We can make the exclusion pattern more precise so that it excludes only the elements in the current directory: it should start with ./ and continue with one or more characters other than /.
% print -l ./**/*~./[^/]## ;# use '~./[^/]##' rather than '~./*'
./dir1/dir2.1
./dir1/dir2.2
./dir1/dir2.2/file3.1
./dir1/file2.1
Then, to strip the current-dir-component /dir1, we could use the (2)estring glob qualifier, such that it removes the first occurrence of /[^/]## (for example /dir1):
# $p for avoiding repetitive use of the exclusion pattern.
% p='./[^/]##'; print -l ./**/*~${~p}(e:'REPLY=${REPLY/${~p[2,-1]}}':)
./dir2.1
./dir2.2
./dir2.2/file3.1
./file2.1
Or, to strip it using an ordinary array replacement rather than the estring glob qualifier:
% p='./[^/]##'; a=(./**/*~${~p}) ; a=(${a/${~p[2,-1]}}); print -l $a
./dir2.1
./dir2.2
./dir2.2/file3.1
./file2.1
Finally, iterating over the current directory's subdirectories could do the job, too:
a=(); dir=;
for dir in *(/); do
pushd "$dir"
a+=(./**/*)
popd
done
print -l $a
#=> ./dir2.1
#   ./dir2.2
#   ./dir2.2/file3.1
#   ./file2.1
Here are the relevant excerpts from the zsh documentation.
(1)x~y glob operator:
x~y
(Requires EXTENDED_GLOB to be set.) Match anything that matches the pattern x but does not match y. This has lower precedence than any operator except ‘|’, so ‘*/*~foo/bar’ will search for all files in all directories in ‘.’ and then exclude ‘foo/bar’ if there was such a match. Multiple patterns can be excluded by ‘foo~bar~baz’. In the exclusion pattern (y), ‘/’ and ‘.’ are not treated specially the way they usually are in globbing.
--- zshexpn(1), x~y, Glob Operators
(2)estring glob qualifier:
estring
+cmd
...
During the execution of string the filename currently being tested is available in the parameter REPLY; the parameter may be altered to a string to be inserted into the list instead of the original filename.
--- zshexpn(1), estring, Glob Qualifiers

Related

Zsh glob: Get everything except stuff in a certain folder

Trying to find all files except those inside vendor/ folders, but why is this failing?
setopt extendedglob
for file in **/*~vendor/; do
done
See if this does what you're looking for:
setopt extendedglob
print -l ^vendor/**/*(.)
The *~ negation syntax usually needs parentheses in order to determine where the expression after the tilde ends. Your pattern is requesting all files and folders except those where the glob result name ends with vendor/. The glob result never includes the trailing slash, so you end up with all of the files and folders.
Adding parens will change the behavior of that pattern, but probably not in a useful way. This will result in a list of all of the directories where the last component is not vendor:
print -l **/(*~vendor)/
so x/y, x/y/vendor/z, and vendor/a will be included, but x/y/vendor will not.
The parentheses limit the 'not' pattern to just one piece of the path. In order to exclude matches at the top-level, the tested component needs to be at the front of the pattern:
print -l (*~vendor)/**/*
The very first pattern above uses the ^ syntax to produce the same results. The (.) glob qualifier in that pattern limits the globbing to plain files, so directories are not included.
Another variation that may be useful - this will exclude directories that have any component named vendor. It is similar to find -prune:
print -l (^vendor/)#*(.)
This will produce a list of all files except those in subdirectories with names like vendor/x, a/vendor and a/vendor/b.

What is the meaning of each parameter for *(*ocNY1) from the shell command `echo`?

I could not find the proper place to look up for the parameter explanation for the below command.
echo *(*ocNY1)
After some tests, I discovered that *(*oc) prints the executable files (files with the x permission) in the current directory, and that NY1 limits this to the first such item. But I cannot find the manual for these options. Where can I find the definition of each of them?
Where can I look up the explanation of each parameter used in the pattern matching?
Is this glob pattern or regex that echo is using?
Sometimes it is really hard to take the first step if you do not know where you are heading.
*(*ocNY1) is a zsh glob pattern - see man zshexpn.
* is a glob operator that matches any string, including the null string.
The trailing (...) contains glob qualifiers:
* to match executable plain files
oc sort by time of last inode change, youngest first
N sets the nullglob option for the current pattern
Yn expand to at most n filenames

Unexpected behavior of "**/*" (double star) on grep

I'm experiencing issue with the double star associated to grep.
I'm using ubuntu 16.04.
In my understanding (and after a lot of research):
grep 'a' **/* should find any occurrence of 'a' in all files in my directory and all sub-directories (recursively).
However, it doesn't work like that in my system.
Here's a test:
My file directory
.a (file containing an "a")
ba/a (file containing "in ba")
ba/ca/a (file containing "in ca in ba")
grep 'a' *
a:a
ba is a directory
grep 'a' **/*
ba/a:in ba
grep: ba/ca: is a directory
The first case is obvious, but I was expecting from the second case to see the three files...
What is the explanation behind that?
Thanks,
Bob
grep is not "associated" with the double star. Your shell expands these.
Depending on which shell you use and which settings you have in that shell, the double star may mean to expand any number of subdirectory levels.
If you want to recursively grep, use grep -r like grep -r a .

applying zsh qualifiers on array elements or directly on a result of a command substitution

I did
a=( $(pacman -Qlq packagename) )
to put files belonging to package into array
Why is this printing only the first match, and how to print them all in zsh:
print -l ${a[(r)*i*]}
Also, how to apply zsh qualifiers on all array elements, say to list files
only via (.)
Is there an easier way to skip the intermediary array in this process,
so that the qualifier is applied directly to the result of a command substitution?
As per documentation the subscript flag (r) will only return the first matching array element.
In order to get all matching elements you can use the ${name:#pattern} parameter expansion, which removes any element matching pattern from the expansion. In order to remove the non-matching elements instead, you can either use the flag (M) or negate the pattern with ^ (this requires the EXTENDED_GLOB option to be enabled):
print -l ${(M)a:#*i*}
setopt extendedglob
print -l ${a:#^*i*}
You can skip explicitly creating an intermediary array by just using the parameter expansion on the command substitution ($(...)) directly:
print -l ${(M)$(pacman -Qlq packagename):#*i*}
It seems that glob qualifiers do not work with patterns inside parameter expansions. But you can enable the RC_EXPAND_PARAM option, which makes a word containing an array expansion expand once per array element instead of once for the whole array. So foo${xx}bar with xx=(a b c) will be expanded to fooabar foobbar foocbar instead of fooa b cbar. You can enable it either globally with setopt rcexpandparam or for a specific expansion by writing it as ${^...}. This way you can add a glob qualifier to each element of the filtered array. To print only elements that are paths to files, you can use
print -l ${^${(M)$(pacman -Qlq packagename):#*i*}}(.N)
This essentially takes each path and attaches (.N) as a glob qualifier (which works even though the word contains no glob characters). The resulting patterns are then evaluated as part of filename generation. The . qualifier tells zsh to match only plain files. N enables the NULL_GLOB option for these patterns; without it, the command would abort with a "no matches found" error as soon as it encounters an element that is not a plain file (e.g. /usr is a directory, so /usr(.) does not match any plain file on your system).

How to edit path variable in ZSH

In my .bash_profile I have the following lines:
PATHDIRS="
/usr/local/mysql/bin
/usr/local/share/python
/opt/local/bin
/opt/local/sbin
$HOME/bin"
for dir in $PATHDIRS
do
if [ -d $dir ]; then
export PATH=$PATH:$dir
fi
done
However I tried copying this to my .zshrc, and the $PATH is not being set.
First I put echo statements inside the "if directory exists" function and I found that the if statement was evaluating to false, even for directories that clearly existed.
Then I removed the directory-exists check, and the $PATH was being set incorrectly like this:
/usr/bin:/bin:/usr/sbin:/sbin:
/usr/local/bin
/opt/local/bin
/opt/local/sbin
/Volumes/Xshare/kburke/bin
/usr/local/Cellar/ruby/1.9.2-p290/bin
/Users/kevin/.gem/ruby/1.8/bin
/Users/kevin/bin
None of the programs in the bottom directories were being found or executed.
What am I doing wrong?
Unlike other shells, zsh does not perform word splitting or globbing after variable substitution. Thus $PATHDIRS expands to a single string containing exactly the value of the variable, and not to a list of strings containing each separate whitespace-delimited piece of the value.
Using an array is the best way to express this (not only in zsh, but also in ksh and bash).
pathdirs=(
/usr/local/mysql/bin
…
~/bin
)
for dir in $pathdirs; do
if [ -d $dir ]; then
path+=$dir
fi
done
Since you probably aren't going to refer to pathdirs later, you might as well write it inline:
for dir in \
/usr/local/mysql/bin \
… \
~/bin
; do
if [[ -d $dir ]]; then path+=$dir; fi
done
There's even a shorter way to express this: add all the directories you like to the path array, then select the ones that exist.
path+=/usr/local/mysql/bin
…
path=($^path(N))
The N glob qualifier selects only the matches that exist. Add the -/ to the qualifier list (i.e. (-/N) or (N-/)) if you're worried that one of the elements may be something other than a directory or a symbolic link to one (e.g. a broken symlink). The ^ parameter expansion flag ensures that the glob qualifier applies to each array element separately.
You can also use the N qualifier to add an element only if it exists. Note that you need globbing to happen, so path+=/usr/local/mysql/bin(N) wouldn't work.
path+=(/usr/local/mysql/bin(N-/))
You can put
setopt shwordsplit
in your .zshrc. Then zsh will perform word splitting like all Bourne shells do. That the default appears to be noshwordsplit is a misfeature that causes a lot of head scratching. I'd be surprised if it wasn't a FAQ. Let's see... yup:
http://zsh.sourceforge.net/FAQ/zshfaq03.html#l18
3.1: Why does $var where var="foo bar" not do what I expect?
Still not sure what the problem was (maybe newlines in $PATHDIRS)? but changing to zsh array syntax fixed it:
PATHDIRS=(
/usr/local/mysql/bin
/usr/local/share/python
/usr/local/scala/scala-2.8.0.final/bin
/opt/local/Library/Frameworks/Python.framework/Versions/2.6/bin
/opt/local/Library/Frameworks/Python.framework/Versions/2.7/bin
/opt/local/etc
/opt/local/bin
/opt/local/sbin
$HOME/.gem/ruby/1.8/bin
$HOME/bin)
and
path=($path $dir)

Resources