Hi, I'm new to zsh and am trying to create a multi-line prompt. I came across this line of code:
local pad=${(pl.$pad_len.. .)}
My first question: what is the pl inside the parentheses? Is it a command, an operator, or a flag (or flags)?
And my 2nd question is what are the dots that follow $pad_len?
Those are Zsh parameter expansion flags.
l.$pad_len. makes the given (in this case, empty) string exactly $pad_len long, either by truncating it from the left or by padding it on the left with spaces.
l.$pad_len.. . does the same as the above, but specifies explicitly to use the space character for padding — which is unnecessary, since the default is to pad with spaces.
The .s here are arbitrary separators used to enclose each argument for the preceding flag. It doesn’t matter which (matching pair of) punctuation characters you use for this, as long as they enclose each argument in pairs. So, l:$pad_len:: : and l<$pad_len>< > do the exact same thing.
p makes l support print escape codes in the second argument — which is unnecessary, since we don’t use any here.
So, a shorter way to write this would be
local pad=${(l.$pad_len.)}
If you want to do this operation on a non-empty string, you can either pass the name of a variable
local foo=bar
local pad=${(l.$pad_len.)foo}
or pass a literal string with :-
local pad=${(l.$pad_len.):-bar}
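For readers who think in Python, the effect of the l flag can be approximated as below. This is only a rough analogue, not how zsh implements it: pad_left is a hypothetical helper, and it assumes the fill string is a single character.

```python
def pad_left(s: str, width: int, fill: str = " ") -> str:
    """Approximate zsh's ${(l.width..fill.)s} for a one-character fill."""
    if width <= 0:
        return ""
    # Keep the rightmost `width` characters (truncate from the left),
    # then pad on the left if the string is still too short.
    return s[-width:].rjust(width, fill)


print(repr(pad_left("", 4)))          # like ${(l.4.)}
print(repr(pad_left("bar", 5)))       # like ${(l.5.)foo} with foo=bar
print(repr(pad_left("abcdefgh", 5)))  # a longer string loses its left end
```

An empty input gives a string of width spaces, which is what the prompt code uses as its padding.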
My regex pattern looks something like
<xxxx location="file path/level1/level2" xxxx some="xxx">
I am only interested in the part in quotes assigned to location. Shouldn't it be as easy as below without the greedy switch?
/.*location="(.*)".*/
Does not seem to work.
You need to make your regular expression lazy/non-greedy, because by default "(.*)" will match everything from the first quote to the last one: "file path/level1/level2" xxxx some="xxx".
Instead you can make your dot-star non-greedy, which will make it match as few characters as possible:
/location="(.*?)"/
Adding a ? on a quantifier (?, * or +) makes it non-greedy.
Note: this is only available in regex engines which implement the Perl 5 extensions (Java, Ruby, Python, etc) but not in "traditional" regex engines (including Awk, sed, grep without -P, etc.).
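To see the difference on the line from the question, here is a small sketch using Python's re module (one of the engines that supports the lazy form). The variable names are only for illustration.

```python
import re

line = '<xxxx location="file path/level1/level2" xxxx some="xxx">'

# Greedy: .* runs to the last quote on the line.
greedy = re.search(r'location="(.*)"', line).group(1)

# Lazy: .*? stops at the first quote that lets the pattern succeed.
lazy = re.search(r'location="(.*?)"', line).group(1)

print(greedy)
print(lazy)
```

The greedy capture includes the text up to the closing quote of some="xxx", while the lazy capture is just the file path.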
location="(.*)" will match from the " after location= until the " after some="xxx unless you make it non-greedy.
So you either need .*? (i.e. make it non-greedy by adding ?) or better replace .* with [^"]*.
[^"] Matches any character except for a " <quotation-mark>
More generic: [^abc] - Matches any character except for an a, b or c
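A quick sketch of the negated character class in Python, assuming the same sample line as in the question. It needs no lazy quantifier, so the same idea carries over to engines that lack one.

```python
import re

line = '<xxxx location="file path/level1/level2" xxxx some="xxx">'

# [^"]* can never run past a quote, so the capture ends at the first one.
value = re.search(r'location="([^"]*)"', line).group(1)

# The generic form: split a string into runs that contain no a, b or c.
runs = re.findall(r'[^abc]+', 'xaybzc')

print(value)
print(runs)
```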
How about
.*location="([^"]*)".*
This avoids the unlimited search with .* and will match exactly to the first quote.
Use non-greedy matching, if your engine supports it. Add the ? inside the capture.
/location="(.*?)"/
A lazy quantifier (?) with no global flag is the answer: without the global flag, only the first, shortest match is returned.
If you had the global flag /g, it would have returned every shortest match in the input instead.
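Python has no /g flag, but the same contrast can be sketched with re.search (first match only) versus re.findall (every match). This is an analogy under that assumption, not the exact tool the answer used.

```python
import re

line = '<xxxx location="file path/level1/level2" xxxx some="xxx">'
pattern = r'"(.*?)"'

# Without "global": only the first shortest match.
first_only = re.search(pattern, line).group(1)

# With "global": every shortest match, left to right.
all_matches = re.findall(pattern, line)

print(first_only)
print(all_matches)
```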
Here's another way.
Here's the one you want. This is lazy: [\s\S]*?
The first item:
[\s\S]*?(?:location="([^"]*)")[\s\S]*
Replace with: $1
Explanation: https://regex101.com/r/ZcqcUm/2
For completeness, this gets the last one. This is greedy: [\s\S]*
The last item:
[\s\S]*(?:location="([^"]*)")[\s\S]*
Replace with: $1
Explanation: https://regex101.com/r/LXSPDp/3
There's only 1 difference between these two regular expressions and that is the ?
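The same pair of replacements can be tried in Python, where the replacement is written \1 rather than $1. The two-line input here is made up for the demonstration.

```python
import re

# Hypothetical input with two location attributes on separate lines.
text = '<a location="first">\n<b location="last">'

# Lazy prefix: as little as possible is skipped before location=.
first = re.sub(r'[\s\S]*?(?:location="([^"]*)")[\s\S]*', r'\1', text)

# Greedy prefix: as much as possible is skipped before location=.
last = re.sub(r'[\s\S]*(?:location="([^"]*)")[\s\S]*', r'\1', text)

print(first)
print(last)
```

Because each pattern consumes the whole input, the result of the substitution is only the captured value.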
The other answers here fail to spell out a full solution for regex versions which don't support non-greedy matching. The non-greedy quantifiers (.*?, .+? etc) are a Perl 5 extension which isn't supported in traditional regular expressions.
If your stopping condition is a single character, the solution is easy; instead of
a(.*?)b
you can match
a[^ab]*b
i.e. specify a character class which excludes the starting and ending delimiters.
In the more general case, you can painstakingly construct an expression like
start(|[^e]|e(|[^n]|n(|[^d])))end
to capture a match between start and the first occurrence of end. Notice how the subexpression with nested parentheses spells out a number of alternatives which between them allow e only if it isn't followed by nd and so forth, and also take care to cover the empty string as one alternative which doesn't match whatever is disallowed at that particular point.
Of course, the correct approach in most cases is to use a proper parser for the format you are trying to parse, but sometimes, maybe one isn't available, or maybe the specialized tool you are using is insisting on a regular expression and nothing else.
Because you are using a quantified subpattern and, as described in the Perl documentation,
By default, a quantified subpattern is "greedy", that is, it will
match as many times as possible (given a particular starting location)
while still allowing the rest of the pattern to match. If you want it
to match the minimum number of times possible, follow the quantifier
with a "?" . Note that the meanings don't change, just the
"greediness":
*? //Match 0 or more times, not greedily (minimum matches)
+? //Match 1 or more times, not greedily
Thus, to allow your quantified pattern to make minimum match, follow it by ? :
/location="(.*?)"/
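import regex  # third-party module; the standard-library re module accepts the same pattern

text = 'ask her to call Mary back when she comes back'
# (?i) ignores case, (?s) lets . match newlines; .*? stops at the first "back"
p = r'(?i)(?s)call(.*?)back'
for match in regex.finditer(p, text):
    # group(1) is ' Mary ' including the surrounding spaces, so strip them
    print(match.group(1).strip())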
Output:
Mary
Trying to figure out how ksh is processing the construct !(text). For example,
$ echo !(hello)
produces a list of files in the current directory (similar to the output of an ls command, except it's sorted into columns rather than rows). It doesn't matter what text is in the parens, the output is the same.
Can anyone enlighten me as to what the command is actually doing? Thanks!
It echoes all files except hello. You can also use wildcards like echo !(*.java)
Here's some more detailed information. For more, look in the "file name generation" section of the ksh man page (bash works the same way once shopt -s extglob is enabled). See here for more patterns: https://www.mkssoftware.com/docs/man1/sh.1.asp
A sub-pattern begins with a ?, *, +, @, or ! character followed by a pattern-list
enclosed in parentheses. Pattern-lists themselves can contain sub-patterns.
The following list describes valid sub-patterns.
?(pattern-list)
Matches exactly zero or exactly one occurrence of the specified pattern-list.
*(pattern-list)
Matches zero or more occurrences of the specified pattern-list.
+(pattern-list)
Matches one or more occurrences of the specified pattern-list.
@(pattern-list)
Matches exactly one occurrence of the specified pattern-list.
!(pattern-list)
Matches any string that does not match the specified pattern-list.
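So for your example, when the shell sees the unquoted exclamation point followed by a parenthesis, it goes into file name matching mode and expands the pattern to the files in the current directory that do not match "hello". If no file is named hello, that is every file, which is why the text inside the parentheses did not seem to matter.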
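As a rough illustration of what !(pattern) selects, here is a Python sketch built on fnmatch. It is an approximation rather than ksh's implementation: negate_glob is a hypothetical helper, it takes a plain list of names instead of reading a directory, and it skips dot files the way shell globbing does by default.

```python
from fnmatch import fnmatchcase


def negate_glob(names, pattern):
    """Return the names that do NOT match the glob pattern, sorted like a shell expansion."""
    return sorted(
        name for name in names
        if not name.startswith(".") and not fnmatchcase(name, pattern)
    )


# Hypothetical directory contents.
names = ["hello", "Main.java", "notes.txt", ".profile"]

print(negate_glob(names, "hello"))    # like echo !(hello)
print(negate_glob(names, "*.java"))   # like echo !(*.java)
print(negate_glob(names, "nothing"))  # nothing is excluded
```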
I have a file listing like the following:
001file.jpg
003file.jpg
001-800x600-sq.jpg
001-800x600.jpg
002-800x600-sq.jpg
002-800x600.jpg
003-800x600-sq.jpg
003-800x600.jpg
004-800x531-sq.jpg
004-800x531.jpg
005-800x531-sq.jpg
005-800x531.jpg
006-800x531-sq.jpg
006-800x531.jpg
007-800x531-sq.jpg
007-800x531.jpg
008-800x1067-sq.jpg
008-800x1067.jpg
009-800x1067-sq.jpg
009-800x1067.jpg
010-800x533-sq.jpg
010-800x533.jpg
011-800x1200-sq.jpg
011-800x1200.jpg
012-800x533-sq.jpg
012-800x533.jpg
013-800x600-sq.jpg
013-800x600.jpg
014-800x1067-sq.jpg
014-800x1067.jpg
015-800x533-sq.jpg
015-800x533.jpg
016-800x533-sq.jpg
016-800x533.jpg
In zsh, I want to list all files that begin with a number and do not contain a dash in the filename, so I tried:
print -l <->[^-]*.jpg
with no success. What is wrong with this pattern!?
This is, I think, similar to the case that the documentation for <-> warns about:
Be careful when using other wildcards adjacent to patterns of this form; for example, <0-9>* will actually match any number whatsoever at the start of the string, since the `<0-9>' will match the first
digit, and the `*' will match any others. This is a trap for the unwary, but is in fact an inevitable
consequence of the rule that the longest possible match always succeeds. Expressions such as
`<0-9>[^[:digit:]]*' can be used instead.
In print -l <->[^-]*.jpg, the <-> matches the first digit, then [^-] matches the 2nd digit, and * matches everything else.
Use instead
print -l <->[^[:digit:]-]*.jpg
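The same trap can be reproduced with ordinary regular expressions. The sketch below uses Python's re as a stand-in for the zsh patterns, assuming \d+ plays the role of <-> and .* the role of *; the file names are taken from the listing in the question.

```python
import re

names = ["001file.jpg", "003file.jpg", "001-800x600-sq.jpg", "001-800x600.jpg"]

# Analogue of <->[^-]*.jpg: the digits can give one back to [^-].
loose = re.compile(r'\d+[^-].*\.jpg')

# Analogue of <->[^[:digit:]-]*.jpg: the character after the number
# must be neither a digit nor a dash.
strict = re.compile(r'\d+[^\d-].*\.jpg')

loose_hits = [n for n in names if loose.fullmatch(n)]
strict_hits = [n for n in names if strict.fullmatch(n)]

print(loose_hits)
print(strict_hits)
```

The loose pattern accepts all four names, while the strict one keeps only the two without a dash.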
Is it possible to set the grammar to match case-insensitively?
So, for example, a rule:
checkName = 'CHECK' Word;
would match check name as well as CHECK name
Creator of PEGKit here.
The only way to do this currently is to use a Semantic Predicate in a round-about sort of way:
checkName = { MATCHES_IGNORE_CASE(LS(1), @"check") }? Word Word;
Some explanations:
Semantic Predicates are a feature lifted directly from ANTLR. The Semantic Predicate part is the { ... }?. These can be placed anywhere in your grammar rules. They should contain either a single expression or a series of statements ending in a return statement which evaluates to a boolean value. This one contains a single expression. If the expression evaluates to false, matching of the current rule (checkName in this case) will fail. A true value will allow matching to proceed.
MATCHES_IGNORE_CASE(str, regexPattern) is a convenience macro I've defined for your use in Predicates and Actions to do regex matches. It has a case-sensitive friend: MATCHES(str, regexPattern). The second argument is an NSString* regex pattern. Meaning should be obvious.
LS(num) is another convenience macro for your use in Predicates/Actions. It means fetch a Lookahead String, and the argument specifies how far to look ahead. So LS(1) means look ahead by 1. In other words, "fetch the string value of the first upcoming token the parser is about to try to match".
Notice that I'm still matching Word twice at the end there. The first Word is necessary for matching 'check' (even though it was already tested in the predicate, it was not matched and consumed). The second Word is for your name or whatever.
Hope that helps.