Filtering zsh array by wildcard - zsh

Given a Zsh array myarray, I can make out of it a subset array
set -A subarray
for el in $myarray
do
if [[ $el =~ *x*y* ]]
then
subarray+=($el)
fi
done
which, in this example, contains all elements from myarray which have somewhere an x and an y, in that order.
Question:
Given the plethora of array operations available in zsh, is there an easier way to achieve this? I checked the man page and the zsh-lovers page, but couldn't find anything suitable.

This should do the trick
subarray=(${(M)myarray:#*x*y*z})
You can find the explanation in the [section about Parameter Expansion] in the zsh manpage. It is a bit hidden as ${name:#pattern} without the flag (M) does the opposite of what you want:
${name:#pattern}
If the pattern matches the value of name, then substitute the empty string; otherwise, just substitute the value of name. If name is an array the matching array elements are removed (use the (M) flag to remove the non-matched elements).

Related

Find key that matches value in zsh associative array?

In a regular array, I can use (i) or (I) to search for the index of entries matching a given value (first match from the start or end of the array, respectively):
list=(foo bar baz)
echo $list[(i)bar]
# => 2
This doesn't work for associative arrays, to get (one of) the key(s) where a value is found:
declare -A hash=([foo]=bar [baz]=zoo)
echo $hash[(i)bar]
# => no output
Is there another mechanism for doing this, other than manually looping through?
The (r) subscript flag combined with the (k) parameter flag should give you
what you want:
declare -A hash=([foo]=bar [baz]=zoo)
echo ${(k)hash[(r)bar]}
# => foo
The man page section on the (r) subscript flag only talks about returning
values and ignores this usage, so it's hard to find.
Here is something completely disgusting:
% declare -A hash=([foo]=bar [baz]=zoo)
% echo ${${(kA)hash}[${${(A)hash[#]}[(i)bar]}]}
foo
Basically, it consists of two parts:
${${(A)hash[#]}[(i)bar]}, which computes the index of bar in an anonymous array consisting of the values of the associative array.
${${(kA)hash}[...]}, which indexes the anonymous array consisting of the keys of the associative array using the numerical index computed by the previous expansion.
I'm not aware of a short equivalent to the I flag, and I too am surprised that the seemingly obvious extension to associative arrays doesn't exist.

filename expansion on assigning a non-array variable

This is about Zsh 5.5.1.
Say I have a glob pattern which expands to exactly one file, and I would like to assign this file to a variable. This works:
# N: No error if no files match. D: Match dot files. Y1: Expand to exactly one entry.
myfile=(*(NDY1))
and echo $myfile will show the file (or directory). But this one does not work:
myfile=*(NDY1)
In the latter case, echo $myfile holds the pattern, i.e. *(NDY1).
Of course I could do some cheap trick, such as creating a chilprocess via
myfile=$(echo *(NDY1))
but is there a way to do the assinment without such tricks?
By default, zsh does not do filename expansion in scalar assignment, but the option GLOB_ASSIGN could help. (This option is provided as for backwards compatibility only.)
local myfile=''
() {
setopt localoptions globassign
myfile=*(NDY1)
}
echo $myfile
;#>> something
Here are some descriptions in zsh docs:
The value of a scalar parameter may also be assigned by writing:
name=value
In scalar assignment, value is expanded as a single string, in which the elements of arrays are joined together; filename expansion is not performed unless the option GLOB_ASSIGN is set.
--- zshparam(1), Description, zsh parameters
GLOB_ASSIGN <C>
If this option is set, filename generation (globbing) is performed on the right hand side of scalar parameter assignments of the form 'name=pattern (e.g. foo=*'). If the result has more than one word the parameter will become an array with those words as arguments. This option is provided for backwards compatibility only: globbing is always performed on the right hand side of array assignments of the form name=(value) (e.g. foo=(*)) and this form is recommended for clarity; with this option set, it is not possible to predict whether the result will be an array or a scalar.
--- zshoptions(1), GLOB_ASSIGN, Expansion and Globbing, Description Of Options, zsh options

What is the zsh equivalent for $BASH_REMATCH[]?

What is the equivalent in zsh for $BASH_REMATCH, and how is it used?
Alternatively, one could simply use
$match[1]
in place of
$BASH_REMATCH[1]
To make zsh behave the same as bash, use:
setopt BASH_REMATCH
Or within a function consider:
setopt local_options BASH_REMATCH
(this will only set the option within the scope of the function)
Then just use $BASH_REMATCH as you would in bash.
The manual says about BASH_REMATCH:
When set, matches performed with the =~ operator will set the BASH_REMATCH array variable, instead of the default MATCH and match variables. The first element of the BASH_REMATCH array will contain the entire matched text and subsequent elements will contain extracted substrings. This option makes more sense when KSH_ARRAYS is also set, so that the entire matched portion is stored at index 0 and the first substring is at index 1. Without this option, the MATCH variable contains the entire matched text and the match array variable contains substrings.
Then =~ will behave like in bash, but if you want the full behaviour as described in the manual:
string =~ regexp
true if string matches the regular expression regexp. If the option RE_MATCH_PCRE is set regexp is tested as a PCRE regular expression using the zsh/pcre module, else it is tested as a POSIX extended regular expression using the zsh/regex module. Upon successful match, some variables will be updated; no variables are changed if the matching fails.
If the option BASH_REMATCH is not set the scalar parameter MATCH is set to the substring that matched the pattern and the integer parameters MBEGIN and MEND to the index of the start and end, respectively, of the match in string, such that if string is contained in variable var the expression ‘${var[$MBEGIN,$MEND]}’ is identical to ‘$MATCH’. The setting of the option KSH_ARRAYS is respected. Likewise, the array match is set to the substrings that matched parenthesised subexpressions and the arrays mbegin and mend to the indices of the start and end positions, respectively, of the substrings within string. The arrays are not set if there were no parenthesised subexpresssions. For example, if the string ‘a short string’ is matched against the regular expression ‘s(...)t’, then (assuming the option KSH_ARRAYS is not set) MATCH, MBEGIN and MEND are ‘short’, 3 and 7, respectively, while match, mbegin and mend are single entry arrays containing the strings ‘hor’, ‘4’ and ‘6’, respectively.
If the option BASH_REMATCH is set the array BASH_REMATCH is set to the substring that matched the pattern followed by the substrings that matched parenthesised subexpressions within the pattern.

Replacing a specific part

I have a list like this:
DEL075MD1BWP30P140LVT
AN2D4BWP30P140LVT
INVD0P7BWP40P140
IND2D6BWP30P140LVT
I want to replace everything in between D and BWP with a *
How can I do that in unix and tcl
Do you have the whole list available at the same time, or are you getting one item at a time from somewhere?
Should all D-BWP groups be processed, or just one per item?
If just one per item, should it be the first or last (those are the easiest alternatives)?
Tcl REs don't have any lookbehind, which would have been nice here. But you can do without both lookbehinds and lookaheads if you capture the goalpost and paste them into the replacement as back references. The regular expression for the text between the goalposts should be [^DB]+, i.e. one or more of any text that doesn't include D or B (to make sure the match doesn't escape the goalposts and stick to other Ds or Bs in the text). So: {(D)[^DB]+(BWP)} (braces around the RE is usually a good idea).
If you have the whole list and want to process all groups, try this:
set result [regsub -all {(D)[^DB]+(BWP)} $lines {\1*\2}]
(If you can only work with one line at a time, it's basically the same, you just use a variable for a single line instead of a variable for the whole list. In the following examples, I use lmap to generate individual lines, which means I need to have the whole list anyway; this is just an example.)
Process just the first group in each line:
set result [lmap line $lines {
regsub {(D)[^DB]+(BWP)} $line {\1*\2}
}]
Process just the last group in each line:
set result [lmap line $lines {
regsub {(D)[^DB]+(BWP[^D]*)$} $line {\1*\2}
}]
The {(D)[^DB]+(BWP[^D]*)$} RE extends the right goalpost to ensure that there is no D (and hence possibly a new group) anywhere between the goalpost and the end of the string.
Documentation:
lmap (for Tcl 8.5),
lmap,
regsub,
set,
Syntax of Tcl regular expressions

What is meaning of ##*/ in unix?

I found syntax like below.
${VARIABLE##*/}
what is the meaning of ##*/ in this?
I know meaning of */ in ls */ but not aware about what above syntax does.
This example will make it clear:
VARIABLE='abcd/def/123'
echo "${VARIABLE#*/}"
def/123
echo "${VARIABLE##*/}"
123
##*/ is stripping out longest match of anything followed by / from start of input.
#*/ is stripping out shortest match of anything followed by / from start of input.
PS: Using all capital variable names is not considered very good practice in Unix shell. Better to use variable instead of VARIABLE.
From man bash:
${parameter#word}
${parameter##word}
Remove matching prefix pattern. The word is expanded to produce
a pattern just as in pathname expansion. If the pattern matches
the beginning of the value of parameter, then the result of the
expansion is the expanded value of parameter with the shortest
matching pattern (the ``#'' case) or the longest matching pat‐
tern (the ``##'' case) deleted. If parameter is # or *, the
pattern removal operation is applied to each positional parame‐
ter in turn, and the expansion is the resultant list. If param‐
eter is an array variable subscripted with # or *, the pattern
removal operation is applied to each member of the array in
turn, and the expansion is the resultant list.

Resources