Loop over space-separated string in zsh - zsh

What controls the environment to know to split by space in zsh?
I'm sure it's something simple, but in all my searching I have yet to figure out what controls it.
Trying to loop over items in a space-separated string like so:
s='foo bar baz'
for i in $s; do
echo "$i END"
done
# foo bar baz END
# ---
s='foo bar baz'
a=( $s )
echo ${a[0]} # (empty)
echo ${a[1]} # foo bar baz
# ---
s='foo bar baz'
IFS=' ' read a <<< $s
for i in "${a[@]}"; do
echo "$i END"
done
# foo bar baz END
The different methods work via sh and bash, but in a shell with oh-my-zsh I'm unable to separate by space, getting the results above. May not be oh-my-zsh - but looking to understand what drives this.
Working example from bash:
s='foo bar baz'
for i in $s; do
echo "$i END"
done
# foo END
# bar END
# baz END

Zsh and bash are two different programming languages. They're similar, but not identical. In bash, and more generally in Bourne-style shells (sh, dash, ksh, …), an unquoted variable expansion $foo does the following:
Take the value of the variable foo, which is a string. (If there is no variable foo, take the empty string.)
Split the string into whitespace-separated parts. (More generally, the value of the IFS variable determines how the string is split; I won't go into all the details here.) The result is a list of strings.
For every element in the list, if it is a globbing pattern, i.e. if it contains at least one wildcard character *?\[ (and possibly more depending on some shell options), and that pattern matches at least one file name, then the element is replaced by the list of matching file names. Elements that don't contain any wildcard character, and elements that contain a wildcard character but don't match any file name, are left alone. The result is again a list of strings.
Zsh is mostly a Bourne-style shell, but it has some differences, and this is the main one: $foo has the following, simpler behavior.
Take the value of the variable foo, which is a string. (If there is no variable foo, take the empty string.)
If this results in an empty word, this word is eliminated. (So for example $foo$bar is only eliminated if both foo and bar are empty or unset.)
Note that in sh or bash, $foo only works to split a string if it doesn't contain any wildcard character or if globbing is disabled with set -f.
To split a string at whitespace in zsh, there are two simple methods:
Use the = parameter expansion form to apply IFS word splitting. For example, $=foo splits at whitespace as determined by IFS.
Use the p parameter expansion flag. For example ${(p: :)foo} splits at spaces (not tabs or newlines).
This has nothing to do with oh-my-zsh, which is a plugin to configure zsh for interactive use.

Just ${(p: :)foo} didn't work for me; it gave a zsh: error in flags error. After reading the Parameter Expansion documentation, I see that it should be ${(ps: :)foo}. Even the explanation of the p flag in the documentation uses the additional s flag.
The p flag doc says:
p  : 
Recognize the same escape sequences as the print builtin
in string arguments to any of the flags described below
that follow this argument.
So what I ended up using was just ${(s: :)foo}. See the example for the behavior where only space is used as the separator, contiguous spaces are treated as one, while tabs and newlines are preserved as is:
> FRUITS="apple\tbanana grapes orange passion_fruit\nwatermelon"
> for F in ${(ps: :)FRUITS}; do echo "GOT: <$F>"; done
GOT: <apple banana>
GOT: <grapes>
GOT: <orange>
GOT: <passion_fruit
watermelon>

Related

zsh - how does the indexing for this associative array work?

The code below seems to be working correctly:
#!/bin/zsh
zparseopts -D -E -A opts f -foo
if [[ -n ${opts[(ie)-f]} || -n ${opts[(ie)--foo]} ]]; then
echo "foo is set."
else
echo "foo is not set."
fi
~/tmp > ./args.sh
foo is not set.
~/tmp > ./args.sh -f
foo is set.
~/tmp > ./args.sh --foo
foo is set.
What does the syntax for the index of opts mean i.e. (ie)-f? Is there some documentation where I can learn more about this? I don't even know what to search for to learn more about this kind of indexing.
My bad - I found it in the zsh manual here. It's explained in section 5.4.2 Using Associative Arrays.
To explain, it seems like this is a part of parameter substitutions in zsh. I don't know if it applies to bash as well.
It lets you use certain flags to change how a parameter is expanded.
The syntax is to put the flags in parentheses immediately before the part they apply to: the parameter name, or the subscript.
For example, taking opts in my question,
echo ${opts}
prints the values of the associative array.
We have the flag k, which signifies the keys, and v, which signifies the values. They can be used as follows:
echo ${(k)opts} # print only the keys
echo ${(kv)opts} # print the keys and values
To answer the main part of my question, what (ie)-f means: these are flags that apply to the subscript of the associative array. Looking at the manual I had linked to, here is what i does: it searches for the key and returns the key instead of the value.
Explanation from the manual:
If instead of an ordinary subscript you use a subscript preceded by the flag (i), the shell will search for a matching key (not value) with the pattern given and return that. This is deliberately the same as searching an ordinary array to get its key (which in that case is just a number, the index), but note this time it doesn't match on the value, it really does match, as well as return, the key
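And with regard to e: as a subscript flag, it makes the subscript match as a plain string instead of a pattern, so characters such as * in the key are taken literally. This is not the same as the parameter expansion flag (e), which goes before the parameter name and performs a further round of expansion on the value.
Here is an example of that expansion flag:
bar=4
foo='$bar'
> echo $foo
$bar
> echo ${(e)foo}
4
So combining the two together, (ie) in my question searches for a key that exactly equals the given string and returns that key, or an empty string if there is no such key.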

Sort list of files in different directories with zsh

Assume I have the following directory/file structure
dirA/1fileAA.zsh
dirA/99fileAB.zsh
dirB/2fileBA.zsh
dirB/50fileBB.zsh
dirB/subdirA/20fileBAA.zsh
which I want to have ordered by the numbers the filenames begin with, ignoring any directories, so I get
dirA/1fileAA.zsh
dirB/2fileBA.zsh
dirB/subdirA/20fileBAA.zsh
dirB/50fileBB.zsh
dirA/99fileAB.zsh
using just built-in zsh functionality.
What would be the best way to achieve this?
I could imagine rewriting the strings, sorting them, and writing them back?
Or, better, creating an associative array and sorting by keys?
I'm still new to zsh and want to avoid digging too far in the wrong direction.
Here is one way to accomplish this using only zsh builtins. The function prepends the filename to the front of each path for sorting and then removes it:
function sortByFilename {
local -a ary
printf -v ary '%s/%s' ${${argv:t}:^argv}
print -l ${${(n)ary}#*/}
}
With your example directory setup, it can be invoked from the parent directory of dirA and dirB with:
sortByFilename **/*.zsh
Testing it:
sortByFilename \
dirA/1fileAA.zsh \
dirA/99fileAB.zsh \
dirB/2fileBA.zsh \
dirB/50fileBB.zsh \
'/leadslash/42 and spaces' \
dirB/subdirA/20fileBAA.zsh
Result:
dirA/1fileAA.zsh
dirB/2fileBA.zsh
dirB/subdirA/20fileBAA.zsh
/leadslash/42 and spaces
dirB/50fileBB.zsh
dirA/99fileAB.zsh
The pieces:
printf -v ary <fmt> ...: runs printf with the format string, and assigns the results to the ary array. Each iteration of the format string will become another element in the array.
%s/%s: the format string. This will concatenate two strings with a slash separator.
If there are more values in the input than specifiers in the format string, printf will repeat the format pattern. So here, it will pull pairs (of filename/pathname) from the input array.
${${argv:t}:^argv}: this will produce an array alternating with filenames and full paths, i.e. (file1 path1 file2 path2 ...)
${ :^ }: zsh parameter expansion that will zip two arrays to create the alternating filenames and paths.
${argv:t}: array of filenames. Built using the function positional parameters in argv, and the :t modifier, which returns the filename component for each element in the array.
argv: array of full paths.
print -l: print each element of the input on a separate line.
${${(n)ary}#*/}: the final sorted list of paths.
${(n)ary}: Returns the array sorted numerically, using the n parameter expansion flag. At this point, each element in ary is the concatenation of the filename, a slash, and the input path.
The n flag works here because of the filename pattern; it will sort by decimal value instead of lexically within a common / empty prefix, e.g. foo1 foo3 foo12.
${ #*/}: Removes the pattern */ from the front of each element in the array. This deletes the prefix that was being used for sorting, leaving the original path.
local -a ary: declares an array variable. This is used as an indicator to printf -v to split its output.
It's possible to eliminate this line and make the function shorter and a bit more cryptic by (re-/mis-/ab)using the pre-declared array argv.
function sortByFilename {
printf -v argv %s/%s ${${argv:t}:^argv}
print -l ${${(n)argv}#*/}
}
Edit - a single-line version:
(){print -l ${"${(n0)$(printf '%s/%s\0' ${${argv:t}:^argv})}"#*/}} **/*.zsh
Including this simply because one-liners are fun to create, not because it's recommended. With the anonymous function, command substitution, and additional parameter expansion flags, this is less readable and possibly less efficient than the function above.

zsh command line processing - separating the last arguments from the previous ones

I am writing a zsh script, which is invoked with a variable number of arguments, such as
scriptname a b c d filename
Inside the script, I want first to loop over the arguments (except the last one) and process them, and finally do something with the processed data and the last argument (filename).
I got this working, but am not entirely happy with my solution. Here is what I came up with (where process and apply are some other scripts not relevant to my problem):
#!/bin/zsh
set -u
x=""
filename=$@[-1]
# Process initial arguments
for ((i=1; i<$#; i++))
do
x+=$(process ${@[$i]})
done
apply $x $filename
I find the counting loop too cumbersome. If filename were the first argument, I would do a shift and then could simply loop over the arguments, after having saved the filename. However, I want to keep the filename as the last argument (for consistency with other tools).
Any ideas how to write this neatly without counting loop?
You can slice off the last argument from the original list and save the rest in an array, if that's an option:
args=("${@:1:$# -1}")
for arg in "${args[@]}"; do # iterate over all, except the last
printf '%s\n' "$arg"
done
Using the array as a placeholder is optional, as you can iterate over the argument slice directly, i.e. for arg in "${@:1:$# -1}"; do. This syntax is also available in bash.
As pointed out in chepner's comment, you could also use a zsh-specific syntax:
for arg in $@[1,-2]; do
printf '%s\n' "$arg"
done

noglob function then use ls with param?

I just want to pass a glob through and then use it against ls directly. The simplest example would be:
test() { ls -d ~/$1 }
alias test="noglob test"
test D*
If I simply run ls D* in my home directory, it outputs three files, but if I run the snippet provided, I get "/Users/jubi/D*": No such file or directory. What should I be doing? Thanks!
The authoritative and complete documentation of Zsh expansion mechanism is located at http://zsh.sourceforge.net/Doc/Release/Expansion.html.
Here's the reason your version doesn't work:
If a word contains an unquoted instance of one of the characters ‘*’, ‘(’, ‘|’, ‘<’, ‘[’, or ‘?’, it is regarded as a pattern for filename generation, unless the GLOB option is unset.
Emphasis mine, on "unquoted". Your glob operator, generated by parameter expansion, isn't considered unquoted.
You need the GLOB_SUBST option to evaluate the parameter expansion result as a glob pattern. A setopt globsubst, unsetopt globsubst pair works, of course, but the easiest way is to use the following expansion form, which exists specifically for this purpose:
${~spec}
Turn on the GLOB_SUBST option for the evaluation of spec; if the ‘~’ is doubled, turn it off. When this option is set, the string resulting from the expansion will be interpreted as a pattern anywhere that is possible, such as in filename expansion and filename generation and pattern-matching contexts like the right hand side of the ‘=’ and ‘!=’ operators in conditions.
In nested substitutions, note that the effect of the ~ applies to the result of the current level of substitution. A surrounding pattern operation on the result may cancel it. Hence, for example, if the parameter foo is set to *, ${~foo//\*/*.c} is substituted by the pattern *.c, which may be expanded by filename generation, but ${${~foo}//\*/*.c} substitutes to the string *.c, which will not be further expanded.
So:
t () { ls -d ~/${~1} }
alias t="noglob t"
By the way, test is a POSIX shell builtin (aka [). Don't shadow it.
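A variation that is easier to try out in a scratch directory, assuming only what the quoted documentation says about ${~spec}. The function name list_matching and its two-argument shape are invented for this sketch; print stands in for ls -d.

```shell
# List entries in directory $1 whose names match the glob pattern in $2.
list_matching() {
  local dir="$1" pattern="$2"
  # ${~pattern} lets the expanded value act as a glob pattern again.
  print -rl -- $dir/${~pattern}
}
```

The pattern has to reach the function unexpanded, either quoted as in list_matching ~ 'D*' or through a noglob alias as in the answer above. If nothing matches, zsh reports "no matches found" unless the NOMATCH option is turned off.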

Something similar to $1 but gathers all input regardless of whitespace

Is there something similar to $1, but that gathers all input from the terminal input, including whitespace characters? This would be used to collect a pasted directory path that may have whitespaces - I need the whole string.
Thanks In Advance
Thankfully, I've received the answer to my first question. In execution, however, I can't get it to work. Here is my code. Can anyone explain what I'm doing wrong? Thanks.
alias finder='cd $* && open .'
It's returning segmented returns - every time it hits a space, it treats it as a separate entry.
Try $* or $@.
$* All of the positional parameters, seen as a single word
$@ Same as $*, but each parameter is a quoted string, that is, the
parameters are passed on intact, without interpretation or expansion.
Normally you'd just refer to the first argument as "$1", including the quotation marks. If you want to use a directory name as an argument, and the name has spaces in it, you'd typically quote it on the command line:
alias finder='cd "$1" && open .'
...
finder "/some/dir/with spaces/in its name"
That also works well with tab completion, which escapes whitespace for you. And in this particular case, you probably might as well use the open command directly.
But if you want the finder alias to concatenate multiple arguments into a single string, separated by spaces, that actually turns out to be harder. I've tried some possibilities using $* and $@, but they don't work correctly. For testing, I'm using my own command echol, which prints each of its arguments on a separate line.
$ echol foo bar
foo
bar
$ alias e='echol "$*"'
$ e foo bar
foo
bar
$ alias e='eval echo \""$*"\"'
$ e foo bar
foo bar
That last one is the closest I've come, but it adds an extra leading space.
I think you're better off just quoting the directory name.
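One likely reason the alias experiments above misbehave is that an alias is plain text substitution: the words typed after it are appended to the expanded command, and $* inside the alias does not refer to them. A shell function does receive its own arguments, so a sketch along these lines may be closer to what was asked. The name goto_dir is hypothetical, and printing $PWD stands in for open ., which is specific to macOS.

```shell
# Hypothetical replacement for the finder alias.
goto_dir() {
  # "$*" joins all arguments into one word, separated by single spaces.
  cd "$*" && print -r -- "$PWD"
}
```

Runs of several spaces in an unquoted path are still collapsed before the function sees them, so quoting the directory name remains the more reliable habit.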
