Generate an associative array from command output - zsh

I am writing a zsh completion function to complete IDs from a database. There is a program listnotes which outputs a list like this:
bf848bf6-63d2-474b-a2c0-e7e3c4865ce8 Note Title
aba21e55-22c6-4c50-8bf6-bf3b337468e2 Another one
09ead915-bf2d-449d-a943-ff589e79794a yet another "one"
...
How do I generate an associative array note_ids from the output of the listnotes command such that I get an associative array like this?
( bf848bf6-63d2-474b-a2c0-e7e3c4865ce8 "Note Title" aba21e55-22c6-4c50-8bf6-bf3b337468e2 "Another one" 09ead915-bf2d-449d-a943-ff589e79794a "yet another \"one\"" )
Note that there may be whitespace in the values (the note titles). I tried to generate something with sed:
note_ids=($(listnotes | sed 's/^\(.*\) \(.*\)$/\1 "\2"/'))
but quoting strings like this doesn’t seem to work, and double quotes in the title make it even more difficult.

Try something like
typeset -A note_ids
for line in ${(f)"$(listnotes)"}; do
    note_ids+=(${line%% *} ${line#* })
done
${(f)PARAM}: split the result of the expansion of $PARAM at newlines
"$(listnotes)": put the output of listnotes verbatim into the expansion.
for line in LIST: iterate over the items in LIST as split by ${(f)…}.
note_ids+=(key value): add a key-value pair to the associative array note_ids
${line%% *}: cut the largest portion matching " *" (a space followed by anything) from the end of the expansion of $line. This removes everything from the first space onward, leaving only the key.
${line#* }: cut the smallest portion matching "* " (anything followed by a space) from the beginning of the expansion of $line. This removes the key and the space used as separator, leaving only the title.
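To see the two expansions in isolation, here is a minimal sketch that uses one sample line from the question instead of calling listnotes. It only relies on the %% and # operators, which behave the same in zsh and other POSIX-style shells; the variable names key and value are chosen just for illustration.

```shell
# One sample line in the format shown in the question: ID, a space, then the title.
line='09ead915-bf2d-449d-a943-ff589e79794a yet another "one"'

# %% removes the longest suffix matching " *", i.e. everything from the first space on.
key=${line%% *}

# # removes the shortest prefix matching "* ", i.e. the ID and the separating space.
value=${line#* }

printf 'key:   %s\n' "$key"
printf 'value: %s\n' "$value"
```

The double quotes inside the title need no special handling, because the title is never re-parsed as shell syntax.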
Instead of using the parameter expansion flag (f) you could also read the output of listnotes line by line with read:
# -r keeps any backslashes in the titles literal
listnotes | while read -r; do
    note_ids+=(${REPLY%% *} ${REPLY#* })
done
Unless specified otherwise read puts the read values into the REPLY parameter.

Related

Is there a way to split a string into an array with space(" ") separated values in ZSH?

I have this code:
carp="2 hello 192.180.00.00"
array=( ${carp} )
echo "\n${array[2]}"
and I am trying to split carp into an array named array. My output should be "hello", but it just prints a blank line. Is there a simple way to split carp into an array?
zsh has a parameter expansion syntax for that: ${=spec}. To modify your example, it would be:
carp="2 hello 192.180.00.00"
array=( ${=carp} )
echo "\n${array[2]}"
The description is:
Perform word splitting using the rules for SH_WORD_SPLIT during the evaluation of spec, but regardless of whether the parameter appears in double quotes; if the ‘=’ is doubled, turn it off. This forces parameter expansions to be split into separate words before substitution, using IFS as a delimiter. This is done by default in most other shells.

In ruamel how to start a string with '*' with no quotes like any other string

import ruamel.yaml

yaml = ruamel.yaml.YAML()
yaml.indent(mapping=4)
with open("test.yaml") as test_yaml_file:
    test_file = yaml.load(test_yaml_file)
# test = LiteralScalarString('*clvm')
test = "*testing"
test_file['test_perf'] = test
with open("test.yaml", 'w') as changed_file:
    yaml.dump(test_file, changed_file)
The expected output was
test_perf: *testing
but the actual output is
test_perf: '*testing'
How can I achieve the unquoted form using ruamel?
Your scalar starts with a *, which is used in YAML to indicate an alias node. To prevent *testing from being interpreted as an alias during loading (even though the corresponding anchor, &testing, is not specified in the document), the scalar must be quoted or represented as a literal or folded block scalar.
So there is no way to avoid the quotes apart from representing the scalar as a literal or folded block scalar (where you don't get the quotes, but do get the | or > indicator).
You should not worry about these quotes, because after loading you'll again have the string *testing and not something that all of a sudden has extra (unwanted) quotes.
There are other characters that have special meaning in YAML (&, !, etc.), and when they appear at the beginning of a scalar they cause the scalar to be quoted. What the dump routine actually does is dump the string and read it back; if that results in a different value, the dumper knows that quoting is needed. This also works with strings like 2022-01-28, which result in a date when read back: such strings get quoted automatically when dumped as well (the same goes for strings that look like floats, integers, or true/false values).

Zsh filling associative array with read from file leads to strange separation

I have a textfile generated by
find Path -printf '%s\t%p\n' > textfile
When I do
declare -A DICT;
while IFS='\t' read -r SIZE PFAD
do DICT[$SIZE]=$PFAD
done < ../Listen/textfile
the content of DICT surprises me:
print "${(@k)DICT}"
shows that the keys of DICT are not just the SIZE of the files, but consist of
SIZE\tRoot_of_PFAD/2_letters_of_following_directory.
The values contain the rest of the line = Rest of the path with the filename.
Looks to me as if read separates the lines by '\t+9 characters'
IFS=$(printf '\t')
seems to have done the trick.
@Gairfowl pointed me in the right direction.
I hadn't grasped that the tenth character in the path was a t.
Thank you very much!
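The reason is that inside single quotes '\t' is not a tab but the two characters backslash and t, so read splits at the first literal t in the line. A small sketch contrasts the two settings; the sample line (size and path) is made up for illustration, and here-documents are used only so that the variables stay in the current shell.

```shell
# Hypothetical sample line: a size, a real tab, then a path containing the letter "t".
tab=$(printf '\t')
line=$(printf '4096\tPath/to/some/file.log')

# IFS='\t' is backslash plus "t": the first field ends at the "t" in "Path".
IFS='\t' read -r size rest <<EOF
$line
EOF
wrong_key=$size

# IFS set to a real tab: the first field is just the size.
IFS=$tab read -r size rest <<EOF
$line
EOF
right_key=$size
right_path=$rest

printf '[%s]\n' "$wrong_key" "$right_key" "$right_path"
```

With the first setting the key still contains the tab and the start of the path, which matches the strange keys described in the question.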

filename expansion on assigning a non-array variable

This is about Zsh 5.5.1.
Say I have a glob pattern which expands to exactly one file, and I would like to assign this file to a variable. This works:
# N: No error if no files match. D: Match dot files. Y1: Expand to exactly one entry.
myfile=(*(NDY1))
and echo $myfile will show the file (or directory). But this one does not work:
myfile=*(NDY1)
In the latter case, $myfile holds the pattern itself, i.e. *(NDY1).
Of course I could use some cheap trick, such as creating a child process via
myfile=$(echo *(NDY1))
but is there a way to do the assignment without such tricks?
By default, zsh does not do filename expansion in scalar assignment, but the option GLOB_ASSIGN could help. (This option is provided for backwards compatibility only.)
local myfile=''
() {
    setopt localoptions globassign
    myfile=*(NDY1)
}
echo $myfile
;#>> something
Here are some descriptions in zsh docs:
The value of a scalar parameter may also be assigned by writing:
name=value
In scalar assignment, value is expanded as a single string, in which the elements of arrays are joined together; filename expansion is not performed unless the option GLOB_ASSIGN is set.
--- zshparam(1), Description, zsh parameters
GLOB_ASSIGN <C>
If this option is set, filename generation (globbing) is performed on the right hand side of scalar parameter assignments of the form name=pattern (e.g. foo=*). If the result has more than one word the parameter will become an array with those words as arguments. This option is provided for backwards compatibility only: globbing is always performed on the right hand side of array assignments of the form name=(value) (e.g. foo=(*)) and this form is recommended for clarity; with this option set, it is not possible to predict whether the result will be an array or a scalar.
--- zshoptions(1), GLOB_ASSIGN, Expansion and Globbing, Description Of Options, zsh options

Replacing a specific part

I have a list like this:
DEL075MD1BWP30P140LVT
AN2D4BWP30P140LVT
INVD0P7BWP40P140
IND2D6BWP30P140LVT
I want to replace everything in between D and BWP with a *.
How can I do that with Unix tools and in Tcl?
Do you have the whole list available at the same time, or are you getting one item at a time from somewhere?
Should all D-BWP groups be processed, or just one per item?
If just one per item, should it be the first or last (those are the easiest alternatives)?
Tcl REs don't have any lookbehind, which would have been nice here. But you can do without both lookbehinds and lookaheads if you capture the goalposts and paste them into the replacement as back references. The regular expression for the text between the goalposts should be [^DB]+, i.e. one or more of any character other than D or B (to make sure the match doesn't escape the goalposts and stick to other Ds or Bs in the text). So: {(D)[^DB]+(BWP)} (putting braces around the RE is usually a good idea).
If you have the whole list and want to process all groups, try this:
set result [regsub -all {(D)[^DB]+(BWP)} $lines {\1*\2}]
(If you can only work with one line at a time, it's basically the same, you just use a variable for a single line instead of a variable for the whole list. In the following examples, I use lmap to generate individual lines, which means I need to have the whole list anyway; this is just an example.)
Process just the first group in each line:
set result [lmap line $lines {
    regsub {(D)[^DB]+(BWP)} $line {\1*\2}
}]
Process just the last group in each line:
set result [lmap line $lines {
    regsub {(D)[^DB]+(BWP[^D]*)$} $line {\1*\2}
}]
The {(D)[^DB]+(BWP[^D]*)$} RE extends the right goalpost to ensure that there is no D (and hence possibly a new group) anywhere between the goalpost and the end of the string.
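Since the question also asks about Unix tools, the "first group per line" variant could be sketched with sed along the same lines. This is a hedged translation rather than part of the Tcl answer: it assumes the same goalpost pattern written in POSIX basic regular expression syntax, and it uses the four sample names from the question as input.

```shell
# The four sample names from the question, one per line.
lines='DEL075MD1BWP30P140LVT
AN2D4BWP30P140LVT
INVD0P7BWP40P140
IND2D6BWP30P140LVT'

# Capture both goalposts and put them back around a literal "*".
# Without the g flag, sed replaces only the first match on each line.
result=$(printf '%s\n' "$lines" | sed 's/\(D\)[^DB]\{1,\}\(BWP\)/\1*\2/')

printf '%s\n' "$result"
```

Appending g to the s command would process every D-BWP group on a line, which corresponds to the -all variant.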
Documentation:
lmap (for Tcl 8.5),
lmap,
regsub,
set,
Syntax of Tcl regular expressions
