zsh parameter expansion with asterisk as variable does not work - zsh

I am trying to remove a file extension using parameter expansion. e.g. given a filename of 123.sh, return 123.
If I store the pattern ".*" in a variable named suffix, it does not work: ${filename%$suffix} does not strip the extension.
If I specify the pattern literally, it does: ${filename%.*}
What am I doing wrong?

In the expansion ${filename%$suffix}, the value of $suffix is substituted literally. To have it substituted as a pattern instead, you need to use glob substitution: ${filename%$~suffix}
However, none of this is necessary for what you're trying to do. To remove the extension from a filename, you can simply use the r modifier:
❯ filename="123.sh"
❯ print $filename:r
123

Related

Why is ZSH performing string editing with `:r` present in a string [duplicate]

I try to run code under zsh
a=123
b="$a:r"
echo $b
I want the result to be 123:r, but it turns out to be
123 # without ":r"
The same thing happens with the characters t and q.
However, if I run it under bash, it gives me the desired result 123:r.
If I add {} and run
a=123
b="${a}:r"
echo $b
which also brings the desired result.
Does anybody know what's going on here?
In zsh, "$a:r" is the same as "${a:r}" by default.
To quote from the documentation (Emphasis added):
${name}
The value, if any, of the parameter name is substituted. The braces are required if the expansion is to be followed by a letter, digit, or underscore that is not to be interpreted as part of name. In addition, more complicated forms of substitution usually require the braces to be present; exceptions, which only apply if the option KSH_ARRAYS is not set, are a single subscript or any colon modifiers appearing after the name, or any of the characters ‘^’, ‘=’, ‘~’, ‘#’ or ‘+’ appearing before the name, all of which work with or without braces.
The :r modifier means:
Remove a filename extension leaving the root name. Strings with no filename extension are not altered. A filename extension is a ‘.’ followed by any number of characters (including zero) that are neither ‘.’ nor ‘/’ and that continue to the end of the string. For example, the extension of ‘foo.orig.c’ is ‘.c’, and ‘dir.c/foo’ has no extension.
To disable this behavior:
$ setopt KSH_ARRAYS
(Note: Doing this on my setup actually causes zsh to segfault; the option changes behavior in multiple ways, one of which conflicts badly with something in my .zshrc. Your results may vary.)
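For comparison outside the shell, the extension rule quoted above can be sketched in Python. This is only an analogue: it assumes os.path.splitext is an acceptable stand-in for :r, and while the two agree on the examples from the documentation, they are not guaranteed to agree on every edge case. The helper name root_name is hypothetical.

```python
import os.path

def root_name(path: str) -> str:
    """Return path without its final extension (rough analogue of zsh's :r)."""
    root, _ext = os.path.splitext(path)
    return root

print(root_name("123.sh"))      # 123
print(root_name("foo.orig.c"))  # foo.orig
print(root_name("dir.c/foo"))   # dir.c/foo (the dot belongs to the directory)
```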

Extract entire string following specific characters & trouble with str_extract() [duplicate]

For example, this regex
(.*)<FooBar>
will match:
abcde<FooBar>
But how do I get it to match across multiple lines?
abcde
fghij<FooBar>
Try this:
((.|\n)*)<FooBar>
It basically says "any character or a newline" repeated zero or more times.
It depends on the language, but there should be a modifier that you can add to the regex pattern. In PHP it is:
/(.*)<FooBar>/s
The s at the end causes the dot to match all characters including newlines.
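As a rough Python counterpart of the two suggestions so far, the sketch below runs the alternation pattern and the dot-matches-newline flag against the sample text. It assumes Python's re module, where re.DOTALL plays the role of PHP's s modifier.

```python
import re

text = "abcde\nfghij<FooBar>"

# Alternation: "any character or a newline", repeated zero or more times.
alt = re.search(r"((.|\n)*)<FooBar>", text)

# DOTALL flag: the dot itself is allowed to match newlines.
dotall = re.search(r"(.*)<FooBar>", text, flags=re.DOTALL)

print(repr(alt.group(1)))     # 'abcde\nfghij'
print(repr(dotall.group(1)))  # 'abcde\nfghij'
```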
The question is, can the . pattern match any character? The answer varies from engine to engine. The main difference is whether the pattern is used by a POSIX or non-POSIX regex library.
A special note about lua-patterns: they are not considered regular expressions, but . matches any character there, the same as POSIX-based engines.
Another note on matlab and octave: the . matches any character by default (demo): str = "abcde\n fghij<Foobar>"; expression = '(.*)<Foobar>*'; [tokens,matches] = regexp(str,expression,'tokens','match'); (tokens contain a abcde\n fghij item).
Also, in all of boost's regex grammars the dot matches line breaks by default. Boost's ECMAScript grammar allows you to turn this off with regex_constants::no_mod_m (source).
As for oracle (it is POSIX based), use the n option (demo): select regexp_substr('abcde' || chr(10) ||' fghij<Foobar>', '(.*)<Foobar>', 1, 1, 'n', 1) as results from dual
POSIX-based engines:
A mere . already matches line breaks, so there isn't a need to use any modifiers, see bash (demo).
tcl (demo), postgresql (demo), and r (demo) also treat . the same way. For R, this applies to TRE, the base R default engine when perl=TRUE is not given; for base R with perl=TRUE or for stringr/stringi patterns, use the (?s) inline modifier.
However, most POSIX-based tools process input line by line. Hence, . does not match the line breaks just because they are not in scope. Here are some examples how to override this:
sed - There are multiple workarounds. The most precise, but not very safe, is sed 'H;1h;$!d;x; s/\(.*\)><Foobar>/\1/' (H;1h;$!d;x; slurps the file into memory). If whole lines must be included, sed '/start_pattern/,/end_pattern/d' file (removes everything from the start line through the end line, matched lines included) or sed '/start_pattern/,/end_pattern/{{//!d;};}' file (matched lines excluded) can be considered.
perl - perl -0pe 's/(.*)<FooBar>/$1/gs' <<< "$str" (-0 slurps the whole file into memory, -p prints the file after applying the script given by -e). Note that using -000pe will slurp the file and activate 'paragraph mode' where Perl uses consecutive newlines (\n\n) as the record separator.
gnu-grep - grep -Poz '(?si)abc\K.*?(?=<Foobar>)' file. Here, z enables file slurping, (?s) enables the DOTALL mode for the . pattern, (?i) enables case insensitive mode, \K omits the text matched so far, *? is a lazy quantifier, (?=<Foobar>) matches the location before <Foobar>.
pcregrep - pcregrep -Mi "(?si)abc\K.*?(?=<Foobar>)" file (M enables file slurping here). Note pcregrep is a good solution for macOS grep users.
See demos.
Non-POSIX-based engines:
php - Use the s (PCRE_DOTALL) modifier: preg_match('~(.*)<Foobar>~s', $s, $m) (demo)
c# - Use the RegexOptions.Singleline flag (demo): var result = Regex.Match(s, @"(.*)<Foobar>", RegexOptions.Singleline).Groups[1].Value; or, with the inline modifier: var result = Regex.Match(s, @"(?s)(.*)<Foobar>").Groups[1].Value;
powershell - Use the (?s) inline option: $s = "abcde`nfghij<FooBar>"; $s -match "(?s)(.*)<Foobar>"; $matches[1]
perl - Use the s modifier (or (?s) inline version at the start) (demo): /(.*)<FooBar>/s
python - Use the re.DOTALL (or re.S) flags or (?s) inline modifier (demo): m = re.search(r"(.*)<FooBar>", s, flags=re.S) (and then if m:, print(m.group(1)))
java - Use Pattern.DOTALL modifier (or inline (?s) flag) (demo): Pattern.compile("(.*)<FooBar>", Pattern.DOTALL)
kotlin - Use RegexOption.DOT_MATCHES_ALL : "(.*)<FooBar>".toRegex(RegexOption.DOT_MATCHES_ALL)
groovy - Use (?s) in-pattern modifier (demo): regex = /(?s)(.*)<FooBar>/
scala - Use (?s) modifier (demo): "(?s)(.*)<Foobar>".r.findAllIn("abcde\n fghij<Foobar>").matchData foreach { m => println(m.group(1)) }
javascript - Use [^] or workarounds [\d\D] / [\w\W] / [\s\S] (demo): s.match(/([\s\S]*)<FooBar>/)[1]
c++ (std::regex) Use [\s\S] or the JavaScript workarounds (demo): regex rex(R"(([\s\S]*)<FooBar>)");
vba vbscript - Use the same approach as in JavaScript, ([\s\S]*)<Foobar>. (NOTE: The MultiLine property of the RegExp object is sometimes erroneously thought to be the option that allows . to match across line breaks, while, in fact, it only changes the ^ and $ behavior to match the start/end of lines rather than strings, the same as in JavaScript regex behavior.)
ruby - Use the /m MULTILINE modifier (demo): s[/(.*)<Foobar>/m, 1]
r tre base-r - Base R PCRE regexps - use (?s): regmatches(x, regexec("(?s)(.*)<FooBar>",x, perl=TRUE))[[1]][2] (demo)
r icu stringr stringi - In stringr/stringi regex functions, which are powered by the ICU regex engine, also use (?s): stringr::str_match(x, "(?s)(.*)<FooBar>")[,2] (demo)
go - Use the inline modifier (?s) at the start (demo): re := regexp.MustCompile(`(?s)(.*)<FooBar>`)
swift - Use dotMatchesLineSeparators or (easier) pass the (?s) inline modifier to the pattern: let rx = "(?s)(.*)<Foobar>"
objective-c - The same as Swift. (?s) works the easiest, but here is how the option can be used: NSRegularExpression* regex = [NSRegularExpression regularExpressionWithPattern:pattern options:NSRegularExpressionDotMatchesLineSeparators error:&regexError];
re2, google-apps-script - Use the (?s) modifier (demo): "(?s)(.*)<Foobar>" (in Google Spreadsheets, =REGEXEXTRACT(A2,"(?s)(.*)<Foobar>"))
NOTES ON (?s):
In most non-POSIX engines, the (?s) inline modifier (or embedded flag option) can be used to enforce . to match line breaks.
If placed at the start of the pattern, (?s) changes the behavior of all . in the pattern. If (?s) is placed somewhere after the beginning, only those . that are located to the right of it are affected, unless the pattern is passed to Python's re. In Python re, the whole pattern is affected regardless of the (?s) location (and since Python 3.11, a global inline flag that is not at the start of the pattern is an error). The (?s) effect is stopped using (?-s). A modified group can be used to affect only a specified range of a regex pattern (e.g., Delim1(?s:.*?)\nDelim2.* will make the first .*? match across newlines, while the second .* will only match the rest of the line).
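The scoped form from the note above can be tried in Python, which accepts (?s:...) groups from version 3.6 onward. The sample text here is made up for illustration.

```python
import re

# Hypothetical sample text: two delimiter lines with extra lines around them.
text = "Delim1 first\nsecond\nDelim2 rest of line\nnext line"

# (?s:...) turns DOTALL on only inside the group, so the lazy .*? may cross
# newlines while the trailing .* still stops at the end of its line.
m = re.search(r"Delim1(?s:.*?)\nDelim2.*", text)

print(repr(m.group(0)))  # 'Delim1 first\nsecond\nDelim2 rest of line'
```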
POSIX note:
In non-POSIX regex engines, to match any character, [\s\S] / [\d\D] / [\w\W] constructs can be used.
In POSIX, [\s\S] is not matching any character (as in JavaScript or any non-POSIX engine), because regex escape sequences are not supported inside bracket expressions. [\s\S] is parsed as bracket expressions that match a single character, \ or s or S.
If you're using Eclipse search, you can enable the "DOTALL" option to make '.' match any character including line delimiters: just add "(?s)" at the beginning of your search string. Example:
(?s).*<FooBar>
In many regex dialects, /[\S\s]*<Foobar>/ will do just what you want. Source
([\s\S]*)<FooBar>
The dot matches all except newlines (\r\n). So use \s\S, which will match ALL characters.
We can also use
(.*?\n)*?
to match everything including newline without being greedy.
This will make the new line optional
(.*?|\n)*?
In Ruby you can use the 'm' option (multiline):
/YOUR_REGEXP/m
See the Regexp documentation on ruby-doc.org for more information.
"." normally doesn't match line breaks. Most regex engines allow you to add the s flag (also called DOTALL or SINGLELINE) to make "." also match newlines.
If that fails, you could do something like [\S\s].
For Eclipse, the following expression worked:
Foo
jadajada Bar"
Regular expression:
Foo[\S\s]{1,10}.*Bar*
Note that (.|\n)* can be less efficient than (for example) [\s\S]* (if your language's regexes support such escapes) and than finding how to specify the modifier that makes . also match newlines. Or you can go with POSIXy alternatives like [[:space:][:^space:]]*.
Use:
/(.*)<FooBar>/s
The s causes dot (.) to match carriage returns.
Use RegexOptions.Singleline. It changes the meaning of . to include newlines.
Regex.Replace(content, searchText, replaceText, RegexOptions.Singleline);
In notepad++ you can use this
<table (.|\r\n)*</table>
It will match the entire table starting from
rows and columns
You can make it non-greedy (lazy) using the following; that way it will match the first, second, and so forth tables one at a time and not all at once
<table (.|\r\n)*?</table>
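The difference between the two quantifiers can be checked with a small Python sketch. The HTML snippet is invented for the example, and the group is written as non-capturing, (?:...), only so that re.findall returns whole matches; that is a change from the patterns above.

```python
import re

# Hypothetical input with Windows line endings and two separate tables.
html = ("<table id=a>\r\n<tr></tr>\r\n</table>\r\n"
        "<p>x</p>\r\n<table id=b>\r\n</table>")

greedy = re.findall(r"<table (?:.|\r\n)*</table>", html)
lazy = re.findall(r"<table (?:.|\r\n)*?</table>", html)

print(len(greedy))  # 1: a single match running from the first table to the last
print(len(lazy))    # 2: one match per table
```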
In a Java-based regular expression, you can use [\s\S].
This works for me and is the simplest one:
(\X*)<FooBar>
Generally, . doesn't match newlines, so try ((.|\n)*)<foobar>.
In JavaScript you can use [^]* to search for zero to infinite characters, including line breaks.
$("#find_and_replace").click(function() {
var text = $("#textarea").val();
search_term = new RegExp("[^]*<Foobar>", "gi");;
replace_term = "Replacement term";
var new_text = text.replace(search_term, replace_term);
$("#textarea").val(new_text);
});
<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.3.1/jquery.min.js"></script>
<button id="find_and_replace">Find and replace</button>
<br>
<textarea ID="textarea">abcde
fghij<Foobar></textarea>
Solution:
Using the pattern modifiers sU will give the desired match in PHP.
Example:
preg_match('/(.*)/sU', $content, $match);
Sources:
Pattern Modifiers
In the context of use within languages, regular expressions act on strings, not lines. So you should be able to use the regex normally, assuming that the input string has multiple lines.
In this case, the given regex will match the entire string, since "<FooBar>" is present. Depending on the specifics of the regex implementation, the $1 value (obtained from the "(.*)") will either be "fghij" or "abcde\nfghij". As others have said, some implementations allow you to control whether the "." will match the newline, giving you the choice.
Line-based regular expression use is usually for command line things like egrep.
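A minimal Python sketch of the distinction drawn above, assuming the re module: the same pattern is applied to the whole string with and without the dot matching newlines, and then line by line in the spirit of a tool like egrep.

```python
import re

text = "abcde\nfghij<FooBar>"
pattern = r"(.*)<FooBar>"

# Whole string, default flags: "." stops at the newline.
whole_default = re.search(pattern, text).group(1)

# Whole string, DOTALL: "." crosses the newline.
whole_dotall = re.search(pattern, text, flags=re.DOTALL).group(1)

# Line-based use: every line is matched on its own.
per_line = []
for line in text.splitlines():
    m = re.search(pattern, line)
    if m:
        per_line.append(m.group(1))

print(repr(whole_default))  # 'fghij'
print(repr(whole_dotall))   # 'abcde\nfghij'
print(per_line)             # ['fghij']
```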
Try: .*\n*.*<FooBar> assuming you are also allowing blank newlines. As you are allowing any character including nothing before <FooBar>.
I had the same problem and solved it in probably not the best way but it works. I replaced all line breaks before I did my real match:
mystring = Regex.Replace(mystring, "\r\n", "")
I am manipulating HTML so line breaks don't really matter to me in this case.
I tried all of the suggestions above with no luck. I am using .NET 3.5 FYI.
I wanted to match a particular if block in Java:
...
...
if(isTrue){
doAction();
}
...
...
}
If I use the regExp
if \(isTrue(.|\n)*}
it included the closing brace for the method block, so I used
if \(!isTrue([^}.]|\n)*}
to exclude the closing brace from the wildcard match.
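The same idea can be sketched in Python. The source text and method name below are invented, and the negated class is simplified to [^}], which already matches newlines in Python. Note that this approach stops at the first closing brace, so it would not handle nested blocks.

```python
import re

# Hypothetical method body containing the if block of interest.
source = """void run() {
    if(isTrue){
        doAction();
    }
    cleanup();
}"""

# Greedy "anything": runs on to the last closing brace of the method.
greedy = re.search(r"if\(isTrue(.|\n)*}", source).group(0)

# Negated class: cannot step over a closing brace, so it stops at the first one.
bounded = re.search(r"if\(isTrue[^}]*}", source).group(0)

print(bounded)
```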
Often we have to modify a substring with a few keywords spread across lines preceding the substring. Consider an XML element:
<TASK>
<UID>21</UID>
<Name>Architectural design</Name>
<PercentComplete>81</PercentComplete>
</TASK>
Suppose we want to modify the 81, to some other value, say 40. First identify .UID.21..UID., then skip all characters including \n till .PercentCompleted.. The regular expression pattern and the replace specification are:
String hw = new String("<TASK>\n <UID>21</UID>\n <Name>Architectural design</Name>\n <PercentComplete>81</PercentComplete>\n</TASK>");
String pattern = new String ("(<UID>21</UID>)((.|\n)*?)(<PercentComplete>)(\\d+)(</PercentComplete>)");
String replaceSpec = new String ("$1$2$440$6");
// Note that the group (<PercentComplete>) is $4 and the group ((.|\n)*?) is $2.
String iw = hw.replaceFirst(pattern, replaceSpec);
System.out.println(iw);
<TASK>
<UID>21</UID>
<Name>Architectural design</Name>
<PercentComplete>40</PercentComplete>
</TASK>
The subgroup (.|\n) is probably the missing group $3. If we make it non-capturing by (?:.|\n) then the $3 is (<PercentComplete>). So the pattern and replaceSpec can also be:
pattern = new String("(<UID>21</UID>)((?:.|\n)*?)(<PercentComplete>)(\\d+)(</PercentComplete>)");
replaceSpec = new String("$1$2$340$5");
and the replacement works correctly as before.
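For readers working in Python, a roughly equivalent replacement might look like the sketch below. It assumes re.sub and uses the \g<n> reference syntax, which keeps the group number separate from the literal digits 40 that follow it.

```python
import re

xml = ("<TASK>\n <UID>21</UID>\n <Name>Architectural design</Name>\n"
       " <PercentComplete>81</PercentComplete>\n</TASK>")

# Same structure as the non-capturing Java pattern: five groups in total.
pattern = (r"(<UID>21</UID>)((?:.|\n)*?)(<PercentComplete>)"
           r"(\d+)(</PercentComplete>)")

# Group 4 (the old number) is dropped and the literal 40 takes its place.
result = re.sub(pattern, r"\g<1>\g<2>\g<3>40\g<5>", xml, count=1)
print(result)
```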
Typically searching for three consecutive lines in PowerShell, it would look like:
$file = Get-Content file.txt -raw
$pattern = 'lineone\r\nlinetwo\r\nlinethree\r\n' # "Windows" text
$pattern = 'lineone\nlinetwo\nlinethree\n' # "Unix" text
$pattern = 'lineone\r?\nlinetwo\r?\nlinethree\r?\n' # Both
$file -match $pattern
# output
True
Bizarrely, this would be Unix text at the prompt, but Windows text in a file:
$pattern = 'lineone
linetwo
linethree
'
Here's a way to print out the line endings:
'lineone
linetwo
linethree
' -replace "`r",'\r' -replace "`n",'\n'
# Output
lineone\nlinetwo\nlinethree\n
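The "Both" pattern is not specific to PowerShell. A short Python sketch, assuming the re module, shows the optional carriage return accepting either convention, plus a way to make line endings visible.

```python
import re

pattern = re.compile(r"lineone\r?\nlinetwo\r?\nlinethree\r?\n")

windows_text = "lineone\r\nlinetwo\r\nlinethree\r\n"
unix_text = "lineone\nlinetwo\nlinethree\n"

print(bool(pattern.search(windows_text)))  # True
print(bool(pattern.search(unix_text)))     # True

# Replace the control characters with their escaped spelling for display.
visible = windows_text.replace("\r", "\\r").replace("\n", "\\n")
print(visible)  # lineone\r\nlinetwo\r\nlinethree\r\n
```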
Option 1
One way would be to use the s flag (just like the accepted answer):
/(.*)<FooBar>/s
Demo 1
Option 2
A second way would be to use the m (multiline) flag and any of the following patterns:
/([\s\S]*)<FooBar>/m
or
/([\d\D]*)<FooBar>/m
or
/([\w\W]*)<FooBar>/m
Demo 2
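In Python, at least, the multiline flag is not what makes Option 2 work: re.MULTILINE only changes ^ and $, and the character class does the matching on its own. The sketch below is limited to Python's re module; other engines give the m flag a different meaning (Ruby, for example, as noted earlier).

```python
import re

text = "abcde\nfghij<FooBar>"

no_flag = re.search(r"([\s\S]*)<FooBar>", text)
with_m = re.search(r"([\s\S]*)<FooBar>", text, flags=re.MULTILINE)
dot_with_m = re.search(r"(.*)<FooBar>", text, flags=re.MULTILINE)

print(repr(no_flag.group(1)))     # 'abcde\nfghij'
print(repr(with_m.group(1)))      # 'abcde\nfghij'
print(repr(dot_with_m.group(1)))  # 'fghij': the dot is unaffected by MULTILINE
```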

filename expansion on assigning a non-array variable

This is about Zsh 5.5.1.
Say I have a glob pattern which expands to exactly one file, and I would like to assign this file to a variable. This works:
# N: No error if no files match. D: Match dot files. Y1: Expand to exactly one entry.
myfile=(*(NDY1))
and echo $myfile will show the file (or directory). But this one does not work:
myfile=*(NDY1)
In the latter case, echo $myfile holds the pattern, i.e. *(NDY1).
Of course I could use a cheap trick, such as creating a child process via
myfile=$(echo *(NDY1))
but is there a way to do the assignment without such tricks?
By default, zsh does not do filename expansion in scalar assignment, but the option GLOB_ASSIGN could help. (This option is provided for backwards compatibility only.)
local myfile=''
() {
setopt localoptions globassign
myfile=*(NDY1)
}
echo $myfile
;#>> something
Here are some descriptions in zsh docs:
The value of a scalar parameter may also be assigned by writing:
name=value
In scalar assignment, value is expanded as a single string, in which the elements of arrays are joined together; filename expansion is not performed unless the option GLOB_ASSIGN is set.
--- zshparam(1), Description, zsh parameters
GLOB_ASSIGN <C>
If this option is set, filename generation (globbing) is performed on the right hand side of scalar parameter assignments of the form 'name=pattern' (e.g. 'foo=*'). If the result has more than one word the parameter will become an array with those words as arguments. This option is provided for backwards compatibility only: globbing is always performed on the right hand side of array assignments of the form name=(value) (e.g. foo=(*)) and this form is recommended for clarity; with this option set, it is not possible to predict whether the result will be an array or a scalar.
--- zshoptions(1), GLOB_ASSIGN, Expansion and Globbing, Description Of Options, zsh options

zsh: command substitution, proper quoting and backslash (again)

(Note: This is a successor question to my posting zsh: Command substitution and proper quoting , but now with an additional complication).
I have a function _iwpath_helper, which outputs to stdout a path, which possibly contains spaces. For the sake of this discussion, let's assume that _iwpath_helper always returns a constant text, for instance
function _iwpath_helper
{
echo "home/rovf/my directory with spaces"
}
I also have a function quote_stripped, which expects one parameter. If this parameter is surrounded by quotes, it removes them and returns the remaining text. If the parameter is not surrounded by quotes, it returns it unchanged. Here is its definition:
function quote_stripped
{
echo ${1//[\"\']/}
}
Now I combine both functions in the following way:
target=$(quote_stripped "${(q)$(_iwpath_helper)}")
(Of course, 'quote_stripped' would be unnecessary in this toy example, because _iwpath_helper doesn't return a quote-delimited path here, but in the real application, it sometimes does).
The problem now is that the variable target contains a real backslash character, i.e. if I do a
echo +++$target+++
I see
+++home/rovf/my\ directory\ with\ spaces+++
and if I try to
cd $target
I get on my system the error message, that the directory
home/rovf/my/ directory/ with/ spaces
would not exist.
(In case you are wondering where the forward slashes come from: I'm running on Cygwin, and I guess that the cd command just interprets backslashes as forward slashes in this case, to better accommodate the Windows environment).
I guess the backslashes that physically appear in the variable target are caused by the (q) expansion flag which I apply to $(_iwpath_helper). My problem is that I cannot simply drop the (q), because without it, the function quote_stripped would get in parameter $1 only the first part of the string, up to the first space (/home/rovf/my).
How can I write this correctly?
I think you just want to avoid trying to strip quotes manually, and use the (Q) expansion flag. Compare:
% v="a b c d"
% echo "$v"
a b c d
% echo "${(q)v}"
a\ b\ c\ d
% echo "${(Q)${(q)v}}"
a b c d
chepner was right: The way I tried to unquote the string was silly (I was thinking too much in a "Bourne Shell way"), and I should have used the (Q) flag.
Here is my solution:
target="${(Q)$(_iwpath_helper)}"
No need for the quote_stripped function anymore....
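The quote-then-unquote round trip is not unique to zsh. As a loose analogue only, Python's shlex module offers the same pair of operations; note that shlex.quote wraps the string in single quotes instead of adding backslashes, so the intermediate form differs from what (q) produces.

```python
import shlex

v = "home/rovf/my directory with spaces"

quoted = shlex.quote(v)         # shell-safe spelling of the string
restored = shlex.split(quoted)  # parsed back the way a POSIX shell would

print(quoted)    # 'home/rovf/my directory with spaces'
print(restored)  # ['home/rovf/my directory with spaces']
```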

ActiveXObject("Shell.Application") - how to pass arguments with spaces?

I run exe from my asp.net with JavaScript using ActiveXObject. It runs successfully, except parameters:
function CallEXE() {
var oShell = new ActiveXObject("Shell.Application");
var prog = "C:\\Users\\admin\\Desktop\\myCustom.exe";
oShell.ShellExecute(prog,"customer name fullname","","open","1");
}
For example, I want to pass two parameters, [1] customer name and [2] fullname, but everything after a space character is treated as a separate parameter.
How can I fix this?
ShellExecute takes the 2nd parameter to be a string that represents all the arguments and processes these using normal shell processing rules: spaces and quotes, in particular.
oShell.ShellExecute(prog,"customer name fullname",...)
In this case the 3 parameters that are passed are customer, name, fullname
oShell.ShellExecute(prog,"customer 'a name with spaces' fullname",...)
As corrected/noted by Remy Lebeau - TeamB, double-quotes can be used to define argument boundaries:
oShell.ShellExecute(prog,'customer "a name with spaces" fullname',...)
In this case the 3 parameters that are passed are customer, a name with spaces, fullname
That is, think of how you would call myCustom.exe from the command-prompt. It's the same thing when using ShellExecute.
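To see what such an argument string should look like, one option is to let a library build it. The sketch below uses subprocess.list2cmdline from CPython's standard library, a helper that joins arguments following the Microsoft C runtime quoting rules. It is used here only to illustrate the quoting; whether a given target program parses its command line with those rules is an assumption.

```python
import subprocess

args = ["customer", "a name with spaces", "fullname"]

# Arguments containing spaces are wrapped in double quotes.
arg_string = subprocess.list2cmdline(args)

print(arg_string)  # customer "a name with spaces" fullname
```

The resulting string has the same shape as the second parameter in the double-quoted ShellExecute call above.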
Happy coding.
Try escaping your spaces with a backslash. The cmd.exe cd command does this, maybe you'll get lucky and it'll work here as well...
oShell.ShellExecute(prog,"customer a\ name\ with\ spaces fullname", ...)
