JavaScript Regex Browser Inconsistency? - asp.net

I have a regex that I am using in an asp.net RegularExpressionValidator to check a TextField.
^(?=.*[a-z])(?=.*\d)(?=.*[A-Z]).{8,}$
The example string I have stumbled on is 'RedCoal1'
Firefox = Matched
IE8 = Matched
Chrome = Matched
IE7 = DOES NOT MATCH
WHY!!!!

The implementation of lookahead in WSH's RegExp as used by IE is just broken. The bug usually pops up in exactly this case, trying to use one regex to verify several things at once.
Plus some older browsers don't support lookahead at all (it wasn't in the original JavaScript spec, though it is now in ECMA-262-3). So all in all it's best to avoid lookahead in browser RegExp.
It would be best to separate out each check (each character class, and length) into manual validation steps.
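A minimal sketch of what separate checks could look like on the client side. The function name checkPassword and the messages are hypothetical, not from the original post. Note that the length test counts every character, whereas .{8,} would not count newlines.

```javascript
// Hypothetical helper: one simple test per rule, no lookahead needed.
function checkPassword(value) {
  var problems = [];
  if (value.length < 8) problems.push("too short");                  // replaces .{8,}
  if (!/[a-z]/.test(value)) problems.push("no lowercase letter");    // replaces (?=.*[a-z])
  if (!/[A-Z]/.test(value)) problems.push("no uppercase letter");    // replaces (?=.*[A-Z])
  if (!/\d/.test(value)) problems.push("no digit");                  // replaces (?=.*\d)
  return problems; // an empty array means the value passed every check
}
```

With this approach 'RedCoal1' yields an empty list, and each failed rule can be reported to the user individually.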

Related

Regex not working in asp.net(.aspx page)

I have a regex that works normally when I try it on online regex-checking websites.
It should not allow 1234.1234.1234.1234, but when I use it in ASP.NET it allows even those values.
Any suggestions?
var ipfilter = new RegExp("(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?$)");
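.NET regex differs from the JavaScript one immensely. However, in this case, it is a common problem: inside a string literal the backslash in \. is consumed by the string itself, so the regex receives a bare dot that matches any character. The dot must be preceded by a literal backslash in the resulting pattern (written \\. in the string), or placed inside a character class. I suggest the latter as it is less error-prone. You also need to add a ^ (start of string) anchor: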
var rx = "^(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)[.](25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)[.](25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)[.](25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$"; // $ moved outside the last group so it applies to every alternative
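To see the difference, here is a small sketch in JavaScript. The variable names octet, loose and strict are only for illustration, and the octet pattern is factored out to keep the lines short.

```javascript
var octet = "(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)";

// "\." in a string literal is the same string as ".", so the backslash
// never reaches RegExp and the dot matches any character.
var loose = new RegExp(octet + "\." + octet + "\." + octet + "\." + octet + "$");

// A character class needs no escaping; ^ and $ sit outside the groups.
var strict = new RegExp("^" + octet + "[.]" + octet + "[.]" + octet + "[.]" + octet + "$");

console.log("\." === ".");                        // the escape is lost in the string
console.log(loose.test("1234.1234.1234.1234"));   // unanchored, any-character dots
console.log(strict.test("1234.1234.1234.1234"));
console.log(strict.test("192.168.0.1"));
```

The loose pattern accepts 1234.1234.1234.1234 because it can match a run of digits near the end of the string, using other digits as the "dots". The strict one rejects it.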
Was the online regex-checking website you used testing the regex for .NET? .NET regex differs slightly from JavaScript regex.
http://refiddle.com/ - you can test against .NET on this by selecting .NET from the regex options drop down on the left.

When, if ever, should characters like { and } (curly braces) be percent-encoded in URLs?

According to RFC 3986 the following characters are reserved and need to be percent-encoded in order to be used in a URI other than as their reserved uses:
:/?#[]@!$&'()*+,;=
Furthermore it specifies some characters that are specifically unreserved: a-zA-Z0-9\-._~
It seems clear that generally one should encode reserved characters (to prevent misinterpretation) and not encode unreserved characters (for readability), but how should characters that do not fall into either category be handled? For example { and } do not appear in either list, but they are standard ASCII characters.
Looking to modern browsers for guidance, it seems they sometimes have different behaviors.
For example, consider pasting the URL https://www.google.com/search?q={ into the address bar of a web browser:
Chrome 34.0.1847.116 m does not change it.
Firefox 28.0 does not change it.
Internet Explorer 9.0 does not change it.
Safari 5.1.7 changes it to https://www.google.com/search?q=%7B
However, if one pastes https://www.google.com/#q={ (removing "search" and changing the ? to a #, making the character part of the fragment/hash rather than the query string) we find that:
Chrome 34.0.1847.116 m changes it to https://www.google.com/#q=%7B (via JavaScript)
Firefox 28.0 does not change it.
Internet Explorer 9.0 does not change it.
Safari 5.1.7 changes it to https://www.google.com/#q=%7B (before executing JavaScript)
Furthermore, when using JavaScript to perform the request asynchronously (i.e. using this MDN example modified to use a URL of ?q={), the URL is not percent-encoded automatically. (I'm guessing this is because the XMLHttpRequest API assumes that the URL be encoded/escaped beforehand.)
I would like to (for a reason related to a bizarre customer requirement) use { and } in the filename portion of URLs without (1) breaking things and ideally also without (2) creating ugly-looking percent-encoded entries in the network panel of modern browsers' web inspectors/debuggers.
(RFC 2396)
You should encode every character in the RFC's "unwise" set; the RFC gives the reason.
Additional information from the RFC:
Account primarily for the delimiters < > # % "
Any control characters, 00-1F and 7F
Marked as unwise in the RFC: { } | \ ^ [ ] `
If you intend to allow # in query string values, that is a special case, because # introduces the fragment identifier of a URI.
Some characters that do not have to be encoded, such as ~, are accepted either encoded or not.
There are two generally accepted encodings for a space: %20 and + (the latter only in form-encoded query strings).
Here's a fiddle with some of the test cases I'm using.
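For what it's worth, JavaScript's built-in encoders already treat { and } as characters to escape. A small sketch, where the filename is a made-up example and not from the question:

```javascript
var name = "report{2014}.pdf";            // hypothetical filename containing braces
var encoded = encodeURIComponent(name);   // braces become %7B and %7D

console.log(encoded);
console.log(decodeURIComponent(encoded) === name);   // the round trip is lossless

// Unreserved characters are left alone, so they stay readable.
console.log(encodeURIComponent("a-Z_0.9~"));

// encodeURI keeps reserved delimiters such as : / ? = but still escapes the brace.
console.log(encodeURI("https://www.google.com/search?q={"));
```

So if the URL is built in script and passed to XMLHttpRequest, encoding the filename segment with encodeURIComponent gives a valid URL; it does not avoid the percent-encoded entries in the network panel.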

What's the best way to constrain validation to not allow any spaces in ASP.NET?

I have different cases:
No spaces allowed at all
No spaces allowed at the beginning or the end of the string only
A small side question: is it a good idea to validate the input for spaces through the regex of a RegularExpressionValidator?
The \S escape sequence matches anything that isn't a whitespace character.
Thus the regexes that you need follow:
^\S+$
^\S.*\S$
In your previous question, you mentioned you wanted from 0 to 50 characters. If that's still the case, here's what you want:
/^\S{0,50}$/
/^(?!\s).{0,50}(?<!\s)$/
As of right now, I think these are the only regexes posted that allow fewer than one character with the first pattern, and fewer than two characters with the second pattern.
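The edge cases are easy to check in JavaScript. This is only a sketch: the variable names are mine, and the lookbehind pattern is left out because older JavaScript engines do not support lookbehind.

```javascript
var noSpaces = /^\S+$/;              // at least one character, none of them whitespace
var noEdgeSpaces = /^\S.*\S$/;       // needs at least two characters
var upTo50NoSpaces = /^\S{0,50}$/;   // zero to fifty non-whitespace characters

console.log(noSpaces.test("abc"), noSpaces.test("a b"), noSpaces.test(""));
console.log(noEdgeSpaces.test("a b"), noEdgeSpaces.test(" ab"), noEdgeSpaces.test("a"));
console.log(upTo50NoSpaces.test(""));
```

The first two patterns reject an empty string, and ^\S.*\S$ also rejects a single character, which is why the {0,50} variants matter if short input has to be accepted.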
Regexes are not a "bad" thing, they're just a specialized tool that isn't suited for every task. If you're trying to validate input in ASP.NET, I would definitely use a RegularExpressionValidator for this particular pattern, because otherwise you'll have to waste your time writing a CustomValidator for a pretty meager performance boost. See my answer to this other question for a little guidance on when and when not to use regex.
In this case, the reason I'd use a regex validator has less to do with the pattern itself and more to do with ASP.NET. A RegularExpressionValidator can just be dragged and dropped into your ASPX code, and all you'd have to write would be 10-21 characters of regex. With a CustomValidator, you'd have to write custom validation functions, both in the codebehind and the JavaScript. You might squeeze a little more performance out of it, but think about when validation comes into play: only once per postback. The performance difference is going to be less than a millisecond. It's simply not worth your time as a developer -- to you or your employer. Remember: Hardware is cheap, programmers are expensive, and premature optimization is the root of all evil.
You know the saying about regexes (now you have two problems). This is doable without regex, and should be more performant and easier to read.
string checkString = /* whatever */ "";
if (checkString.IndexOf(" ") > -1)
{
    // Failed condition 1: the string contains a space
}
if (checkString.Trim() != checkString)
{
    // Failed condition 2: leading or trailing whitespace
}
No spaces allowed at all:
^\S+$
No spaces allowed at the beginning or end:
^\S+.*\S+$
The System.String class contains everything you need:
No spaces allowed at all
This will handle the case of spaces only:
bool valid = !str.Contains(" ");
If you need to check for tabs as well:
char[] naughty = " \t".ToCharArray();
bool valid = (str.IndexOfAny(naughty) == -1);
There are other whitespace characters you could check for, see Character Escapes for more details.
No spaces allowed at the beginning or the end of the string only
A bit simpler, since Trim() will remove any kind of whitespace, including newlines:
bool valid = str.Length == str.Trim().Length;
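If these checks go into a CustomValidator, the client side needs a JavaScript counterpart. A rough sketch follows. The helper names are hypothetical, and the wrapper assumes the usual (sender, args) shape of a ClientValidationFunction, with args.Value and args.IsValid.

```javascript
// True when the string contains no whitespace at all (spaces, tabs, newlines).
function hasNoWhitespace(str) {
  return !/\s/.test(str);
}

// True when trimming changes nothing, i.e. no leading or trailing whitespace.
function hasNoEdgeWhitespace(str) {
  return str === str.trim();
}

// Assumed CustomValidator-style wrapper: reads args.Value, sets args.IsValid.
function validateNoEdgeSpaces(sender, args) {
  args.IsValid = hasNoEdgeWhitespace(args.Value);
}
```

The server-side check should still run as well, since client script can be disabled.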

Regular Expression for percents (with % sign) in ASP.Net RegEx Validator

I need a regex for the ASP.Net (4) Regex Validation control. It needs to be a RegEx validator to support other dynamic behaviors outside the scope of this post.
I was using the following, but it fails if the user enters the % sign following the number (which is a requirement of my spec):
^(100(?:\.0{1,2})?|0*?\.\d{1,2}|\d{1,2}(?:\.\d{1,2})?)$
I tried adding an atomic group of ^(?>%?) at the end, with no luck, after reading the excellent post
Regular expression greedy match not working as expected
Does anyone have any ideas?
Try this
^(100(?:\.0{1,2})?%?|0*?\.\d{1,2}%?|\d{1,2}(?:\.\d{1,2})?%?)$
try this one instead:
^0*(100(\.00?)?|[0-9]?[0-9](\.[0-9][0-9]?)?)%?$
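A quick way to sanity-check this last pattern is to run a few values through it in JavaScript. The function name isPercent and the sample values are mine.

```javascript
var percent = /^0*(100(\.00?)?|[0-9]?[0-9](\.[0-9][0-9]?)?)%?$/;

// Hypothetical helper wrapping the pattern from the answer above.
function isPercent(text) {
  return percent.test(text);
}

var samples = ["100%", "99.5%", "50", "0.25", "100.5", "101", ".5", "12.345%"];
for (var i = 0; i < samples.length; i++) {
  console.log(samples[i], isPercent(samples[i]));
}
```

Note that this pattern requires at least one digit before the decimal point, so ".5" is rejected even though the pattern in the question accepted it.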

Regex to limit string length for strings with new line characters

Looks like a simple task - get a regex that tests a string for particular length:
^.{1,500}$
But if a string has "\r\n" then the above match always fails!
What should the correct regex look like to accept new line characters as part of the string?
I have a <asp:TextBox TextMode="Multiline"> and use a RegularExpressionValidator to check the length of what user types in.
Thank you,
Andrey
You could use the RegexOptions.Singleline option when validating input. With this option the dot matches every character, including newlines, so the whole input is treated as a single line.
Otherwise you could give the following expression a try:
^(.|\s){1,500}$
This should work in multiline inputs.
Can you strip the line breaks before checking the length of the string? That'd be easy to do when validating server-side. (In .net you could use a custom validator for that)
From a UX perspective, though, I'd implement a client-side 'character counter' as well. There's plenty to be found. jQuery has a few options. Then you can implement the custom validator to only run server-side, and then use the character counter as your client-side validation. Much nicer for the user to see how many characters they have left WHILE they are typing.
The inability to set the RegexOptions is screwing you up here. Since this is in a RegularExpressionValidator, you could try setting the options in the regular expression itself.
I think this should work:
(?s)^.{1,500}$
The (?s) part turns on the Singleline option which will allow the dot to match every character including line feeds. For what it's worth, the article here also lists the other RegexOptions and the notation needed to set them as an inline statement.
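One caveat, as far as I know: the validator also evaluates the pattern in client-side JavaScript, and JavaScript regex engines have traditionally not accepted an inline (?s). A pattern that avoids the dot altogether, such as [\s\S], behaves the same on both sides. A small sketch, with variable names of my own:

```javascript
var dotOnly = /^.{1,500}$/;         // the dot stops at \r and \n
var anyChar = /^[\s\S]{1,500}$/;    // whitespace or non-whitespace, i.e. any character
var dotOrSpace = /^(.|\s){1,500}$/; // the alternative suggested earlier in this thread

var text = "line one\r\nline two";
var tooLong = new Array(502).join("x");   // 501 characters

console.log(dotOnly.test(text));
console.log(anyChar.test(text));
console.log(dotOrSpace.test(text));
console.log(anyChar.test(tooLong));
```

Keep in mind that "\r\n" counts as two characters toward the 500 limit.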
