Regex: Match opening/closing chars with spaces - asp.net

I'm trying to complete a regular expression that will pull out matches based on their opening and closing characters. The closest I've gotten is
^(\[\[)[a-zA-Z.-_]+(\]\])
This matches a string such as "[[word1]]" and returns every match if there is more than one. The problem is that I also want it to pick up matches that contain a space, for example "[[word1 word2]]". That works if I add a space to the pattern above, but then I only get one match for the entire string. For example, if I have the string
"Hi [[Title]] [[Name]] [[surname]], How are you"
then the match is [[Title]] [[Name]] [[surname]] rather than three matches: [[Title]], [[Name]] and [[surname]]. I'm sure I'm just a character or two away in the regex, but I'm stuck. How can I make it return the three matches?
Thanks

You just need to make your regex non-greedy by adding a ? after the quantifier, like this:
^(\[\[)[a-zA-Z.-_ ]+?(\]\])
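Also, there is a bug in your regex. You've included - in the character class, thinking of it as a literal hyphen. But - between two characters in a character class is a metacharacter that defines a range, so .-_ matches every character from . (period) to _ (underscore). That range includes digits, uppercase letters, [ and ], which is why a greedy match can run straight across several tokens. So you need to escape it: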
^(\[\[)[a-zA-Z.\-_ ]+?(\]\])
or you can put it somewhere in the character class where it does not have characters on both sides of it, such as at the end:
^(\[\[)[a-zA-Z._ -]+?(\]\])
or
^(\[\[)[-a-zA-Z._ ]+?(\]\])
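The difference between the greedy and the non-greedy pattern can be checked with a short sketch. It uses Python's re module purely for illustration (the question is about .NET, where these constructs behave the same way), and it drops the leading ^ because ^ only matches at the start of the string, while the tokens here appear in the middle of it.

```python
import re

text = "Hi [[Title]] [[Name]] [[surname]], How are you"

# Greedy, with the accidental range .-_ (which covers '[' and ']') and a
# space in the class: one match swallows everything up to the last ]].
greedy = re.findall(r"\[\[[a-zA-Z.-_ ]+\]\]", text)

# Non-greedy quantifier, with the hyphen moved to the end of the class so
# that it is a literal hyphen rather than a range operator.
lazy = re.findall(r"\[\[[a-zA-Z._ -]+?\]\]", text)

print(greedy)  # one long match
print(lazy)    # three separate matches
```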

You need to turn off greedy matching. See these examples for different languages: asp.net, java, javascript.

You should use +? instead of +.
The one without the question mark tries to match as much as possible, while the one with the question mark matches as little as possible.
Another approach would be to use [^\]] as your characters instead of [a-zA-Z.-_]. That way, a match will never extend over your closing brackets.
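A minimal sketch of the negated-class approach, again in Python for illustration only. The sample string and the capture group around the token name are assumptions added here. Because [^\]] can never consume a closing bracket, the quantifier can stay greedy.

```python
import re

# Sample input (an assumption for this sketch) with a space and a hyphen
# inside the tokens.
text = "Hi [[Title]] [[first name]] [[sur-name]], How are you"

# [^\]]+ matches one or more characters that are not ']', so a match can
# never extend past the closing brackets of its own token.
tokens = re.findall(r"\[\[([^\]]+)\]\]", text)

print(tokens)
```

With a single capture group, findall returns only the captured token names, without the surrounding brackets.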

Related

Extract a certain element from URL using regular expressions

I need to extract the first element ("adidas-originals") after "designers" in the following URL using regular expressions.
xxx/en-ca/men/designers/adidas-originals/shorts
This needs to be done in Google BigQuery (standard SQL). I have tried several ways to get the desired value without any success. Below is the best solution I have found so far, which is obviously not right as it returns "/adidas-originals/shorts".
REGEXP_EXTRACT(hits.page.pagePath, r'designers([^\n]*)')
Thanks!
The [^\n]* matches 0 or more chars other than a newline, LF, so no wonder it matches too much.
You need a pattern to match up to the next /, so you may use
designers/([^/]+)
Or a more precise:
(?:^|/)designers/([^/]+)
Details
(?:^|/) - either start of a string or / (you may just use / if designers is always preceded with /)
designers/ - a designers/ substring
([^/]+) - Capturing group 1 (just what will be returned with the REGEXP_EXTRACT function): one or more chars other than /.
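The two patterns can be compared outside BigQuery with a small sketch. It uses Python's re module rather than BigQuery's regex engine, on the assumption that these particular constructs behave the same in both. The variable names are illustrative.

```python
import re

path = "xxx/en-ca/men/designers/adidas-originals/shorts"

# Pattern from the question: [^\n]* runs to the end of the line.
too_much = re.search(r"designers([^\n]*)", path).group(1)

# Suggested pattern: [^/]+ stops at the next slash.
match = re.search(r"(?:^|/)designers/([^/]+)", path)
designer = match.group(1) if match else None

print(too_much)
print(designer)
```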

Creating RegEx That Reads Entire String

My current regex is only picking up part of my string. It creates a match as soon as one is found, even though I need the longer version of that match to hit. For example, I am creating matches for both:
SSS111
and
SSS111-L
The first SSS111 matches fine with my current regex, but the SSS111-L is only getting matched to the SSS111, leaving the -L out.
How can I create a greedy regex to read the whole line before matching? I am currently using
[-A-Z0-9]{3,12}
to capture the numbers and letters, but have not had any luck outside of this.
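Regex quantifiers are greedy by default, so greediness is usually not the problem.
Here I think you only need to escape the '-'. Note that the quantifier range is written with a comma, {3,12}, not {3-12}:
[\-A-Z0-9]{3,12}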
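As a quick check, the sketch below runs the pattern from the question against both sample values. It is written in Python for illustration, so it only shows how the pattern behaves in isolation, not how the asker's surrounding code applies it. A hyphen placed first in a character class is already a literal, so the escaped and unescaped forms match the same text.

```python
import re

# Pattern from the question: the leading hyphen in the class is a literal.
pattern = re.compile(r"[-A-Z0-9]{3,12}")

for code in ["SSS111", "SSS111-L"]:
    match = pattern.search(code)
    print(code, "->", match.group(0))

# fullmatch forces the whole string to match, like anchoring with ^ and $.
whole = pattern.fullmatch("SSS111-L")

# {3-12} is not a valid quantifier; Python's re treats it as literal text,
# so this pattern finds nothing in the sample value.
broken = re.search(r"[-A-Z0-9]{3-12}", "SSS111-L")
print(broken)
```

If the pattern on its own already captures the -L suffix, as it does here, the truncation is more likely to come from how the pattern is combined with other patterns or applied to the input.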

Difference between (.|[\r\n]){1,1500} and ^.{1,1500}$

What is the difference between the two regular expressions below?
(.|[\r\n]){1,1500}
^.{1,1500}$
The first matches up-to-1500 chars, and the second (assuming you haven't set certain regex options) matches a first single line of up-to-1500 chars, with no newlines.
. does not match new lines.
The second one matches a whole line, but only if the line contains 1500 characters or fewer.
The first expression matches up to 1500 characters of the file (or other source), newlines included.
The second expression matches an entire line of at most 1500 characters.
. matches any character except \n newline.
If it's for use in a RegularExpressionValidator, you probably want to use this regex:
^[\s\S]{1,1500}$
This is because the regex may be run on either the server (.NET) or the client (JavaScript). In .NET regexes you can use the RegexOptions.Singleline flag (or its inline equivalent, (?s)) to make the dot match newlines, but JavaScript had no such mechanism until the s (dotAll) flag was added in ES2018.
[\s\S] matches any whitespace character or anything that's not a whitespace character--in other words, anything. It's the most popular idiom for matching anything including a newline in JavaScript; it's much, much more efficient than alternation-based approaches like (.|\n).
Note that you'll still need to use a RequiredFieldValidator if you don't want the user to leave the textbox empty.
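The behaviour described in these answers can be reproduced with a short sketch. It uses Python's re module as a stand-in for the .NET and JavaScript engines, with re.DOTALL playing the role of RegexOptions.Singleline. The sample strings are assumptions.

```python
import re

two_lines = "line one\nline two"
too_long = "a" * 1501

# The dot does not match a newline, so the anchored dot pattern rejects
# multi-line input.
dot_result = re.search(r"^.{1,1500}$", two_lines)

# [\s\S] matches any character, newlines included.
any_result = re.search(r"^[\s\S]{1,1500}$", two_lines)

# Python's counterpart of the Singleline option makes the dot match newlines.
dotall_result = re.search(r"^.{1,1500}$", two_lines, re.DOTALL)

# The length limit still applies to the anchored pattern.
long_result = re.search(r"^[\s\S]{1,1500}$", too_long)

print(dot_result, any_result is not None, dotall_result is not None, long_result)
```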

Regular Expression for ASP.NET ID Using Javascript

I am trying to extract the word "need" from this string.
ctl00_ctl00_ContentMainContainer_ContentColumn1__needDont_Panel1
I have tried [__]([.]?=Dont)
This is using javascript .match()
I have even tried to use http://gskinner.com/RegExr/ but just can't solve this one. Thanks for the help!
(?<=__)\w+(?=Dont)
Matches the word characters (letters, digits and underscores) between __ and Dont.
Edit
Sorry, I hadn't noticed the word JavaScript. Older JavaScript engines do not support lookbehind (it was only added in ES2018), so __(\w+)(?=Dont) can be used there instead; the word is then in capture group 1.
If the regex should match even when nothing comes between __ and Dont, use "\w*" instead of "\w+". Be careful with ".*" because the dot matches almost any character; do you allow spaces in the ID?
This will accomplish what you're looking for:
__(.*)(?=Dont)
You seem to be mixing up what a character class (square brackets []) does; instead you should be using a group in parentheses ().
In your regex [__] will only match a single underscore _ and [.] will match a single period.
Your error is writing [__] instead of __ (without the brackets). [__] matches only a single underscore, so it will match _ctl00_ContentMainContainer_ContentColumn1__need.
[.] is also wrong. You should use something like: [^_]+ (anything except underscore).
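Both the lookbehind pattern and the capture-group pattern from the answers above can be tried on the ID from the question. The sketch is in Python for illustration. In JavaScript the captured word would be read from index 1 of the array returned by .match().

```python
import re

control_id = "ctl00_ctl00_ContentMainContainer_ContentColumn1__needDont_Panel1"

# Capture-group version: no lookbehind needed, the word is in group 1.
captured = re.search(r"__(\w+)(?=Dont)", control_id).group(1)

# Lookbehind version: the whole match is the word itself.
looked_behind = re.search(r"(?<=__)\w+(?=Dont)", control_id).group(0)

print(captured, looked_behind)
```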

Don't want spaces in the text, but this regex is passing not sure why

I am using the following regex
/[a-zA-Z0-9]+/i.test(value)
If I enter a space in the word, it passes.
I don't see where spaces are allowed in the regex, so why is it passing?
You need to set the beginning and end boundaries so that the entire string must match the regular expression; otherwise it will look for any match anywhere in the string (which in this case is one or more of the characters specified).
Try this:
/^[a-zA-Z0-9]+$/i.test(value)
Because you haven't anchored it.
For these sorts of tests, it's typically safer to check that nothing in the string matches the negated character class:
/[^a-zA-Z0-9]/
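The three checks discussed here behave differently on the same input, as the sketch below shows. It is written in Python for illustration, and the function names are hypothetical. re.fullmatch stands in for the ^...$ anchors of the JavaScript version.

```python
import re

def passes_unanchored(value: str) -> bool:
    # Hypothetical helper: true if ANY run of allowed characters exists.
    return re.search(r"[a-zA-Z0-9]+", value) is not None

def passes_anchored(value: str) -> bool:
    # Hypothetical helper: true only if the WHOLE string is allowed characters.
    return re.fullmatch(r"[a-zA-Z0-9]+", value) is not None

def has_no_forbidden(value: str) -> bool:
    # Hypothetical helper: true if no character falls outside the allowed set.
    return re.search(r"[^a-zA-Z0-9]", value) is None

for value in ["foobar", "foo bar", ""]:
    print(repr(value), passes_unanchored(value), passes_anchored(value),
          has_no_forbidden(value))
```

Note that the negated-class check accepts the empty string, while the anchored pattern with + does not, so the two are not interchangeable when empty input has to be rejected.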
