Issue with regex in ASP.NET for German, French & Spanish languages - asp.net

I want to support German, French & Spanish characters on a particular field of my website. I need a regex for this. Presently I am using -
^[\w\s-\+\$\*\.\?\:\;\!\,"'\%\&\/\(\)\#\#«»£°¿¡_ÀÂÆÇÈÉÊËÎÏÔŒÙÛÜàâæçèéêëîïôœùûüÄÖäößÁÍÑÓÚáíñóú\u201E\u201C\u201D\u20AC]{1,255}$
This regex basically uses all the char set from the 3 languages I mentioned.
Is there a neat way to avoid this lengthy regex? I tried the regex /p{L}/p{Z}, but this didn't work.
My website is in ASP.net

/p{L}/p{Z} is wrong, it should be \p{L}\p{Z}.
All the letters, like "ÀÂÆÇÈ", shouldn't be needed; they are all included in \w in .NET!
You don't need most of the escaping in a character class.
You can't write something like an HTML entity (for example &quot;) in a character class; the only thing that happens is that every single character of it is added to the class.
This should be quite similar to what you used:
^[-\p{L}\p{N}\p{P}\p{Z}_+$*%&/##«»£°\u201E\u201C\u201D\u20AC]{1,255}$
I haven't checked those Unicode code points at the end of the class, so I don't know if they are needed or not.
For an explanation of all the \p{...} items see Unicode Regular Expressions on regular-expressions.info
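As a rough illustration of the \p{...} approach above, here is a minimal TypeScript sketch. It runs in a JavaScript engine rather than in .NET, so it assumes an engine with ES2018 Unicode property escapes and needs the "u" flag, which .NET does not. The sample strings and variable names are made up for the example, and the duplicated # from the pattern above is written only once.

```typescript
// Sketch only: the same idea as the pattern above, run in a JavaScript engine.
// The "u" flag is what enables \p{...} there; .NET needs no such flag.
const multilingualField: RegExp = new RegExp(
  "^[-\\p{L}\\p{N}\\p{P}\\p{Z}_+$*%&/#" +
    "\\u00AB\\u00BB\\u00A3\\u00B0" + // « » £ °
    "\\u201E\\u201C\\u201D\\u20AC" + // „ “ ” €
    "]{1,255}$",
  "u"
);

// Hypothetical sample inputs (non-ASCII letters written as escapes).
const multilingualSamples: string[] = [
  "Stra\u00dfe f\u00fcr M\u00e4dchen", // German
  "\u00bfQu\u00e9 tal, se\u00f1or?", // Spanish
  "\u0152uvre \u00e0 5 \u20ac", // French
  "<b>bold</b>", // < and > are not in the class
];

const multilingualResults: boolean[] = multilingualSamples.map((s) =>
  multilingualField.test(s)
);
console.log(multilingualResults); // [ true, true, true, false ]
```

One difference from the original pattern worth checking: \p{Z} covers space separators but not tab or newline, which \s did allow.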

Related

Why do URL parameters use %-encoding instead of a simple escape character

For example, in Unix, a backslash (\) is a common escape character. So to escape a full stop (.) in a regular expression, one does this:
\.
But with % encoding URL parameters, we have an escape character, %, and a control code, so an ampersand (&) doesn't become:
%&
Instead, it becomes:
%26
Any reason why? Seems to just make things more complicated, on the face of it, when we could just have one escape character and a mechanism to escape itself where necessary:
%%
Then it'd be:
simpler to remember; we just need to know which characters to escape, not which to escape and what to escape them to
encoding-agnostic, as we wouldn't be sending an ASCII or Unicode representation explicitly, we'd just be sending them in the encoding the rest of the URL is going in
easy to write an encoder: s/[!\*'();:#&=+$,/?#\[\] "%-\.<>\\^_`{|}~]/%&/g (untested!)
better because we could switch to using \ as an escape character, and life would be simpler and it'd be summer all year long
I might be getting carried away now. Someone shoot me down? :)
EDIT: replaced two uses of "delimiter" with "escape character".
Percent encoding happens not only to escape delimiters, but also so that you can transport bytes that are not allowed inside URIs (such as control characters or non-ASCII characters).
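To make the point about bytes concrete, here is a small TypeScript sketch using the standard encodeURIComponent function. The sample characters are chosen only for illustration. Note how a non-ASCII letter becomes two escapes, one per UTF-8 byte, which a single-character escape such as %& could not express.

```typescript
// Hypothetical sample inputs: a delimiter, the escape character itself,
// a space, a non-ASCII letter (é) and a control character.
const percentSamples: string[] = ["&", "%", " ", "\u00e9", "\u0001"];

// encodeURIComponent writes each disallowed byte as % plus two hex digits.
const percentEncoded: string[] = percentSamples.map((s) => encodeURIComponent(s));
console.log(percentEncoded); // [ '%26', '%25', '%20', '%C3%A9', '%01' ]

// Decoding restores the original characters.
const percentDecoded: string[] = percentEncoded.map((s) => decodeURIComponent(s));
console.log(percentDecoded.join("") === percentSamples.join("")); // true
```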
I guess it's because the URL specification, and specifically the HTTP part of it, only allows certain characters, so to escape the others one must replace them with characters that are allowed.
Also, some allowed characters, like & and ?, have special meanings, so replacing them with a control code seems the only way to solve it.
If you find it hard to recognize them, bookmark this page
http://www.w3schools.com/tags/ref_urlencode.asp

RegEx for Client-Side Validation of FileUpload

I'm trying to create a RegEx Validator that checks the file extension in the FileUpload input against a list of allowed extensions (which are user specified). The following is as far as I have got, but I'm struggling with the syntax of the backslash (\) that appears in the file path. Obviously the below is incorrect because the backslash just escapes the closing bracket (]), which causes an error. I would be really grateful for any help here. There seem to be a lot of examples out there, but none seem to work when I try them.
[a-zA-Z_-s0-9:\]+(.pdf|.PDF)$
To include a backslash in a character class, you need to escape it with a second backslash (\\):
[a-zA-Z_\s0-9:\\]+(\.pdf|\.PDF)$
Note that \b does not work for this: inside a character class it stands for a backspace character, and outside of character classes it represents a word boundary. I also assumed that -s was a typo and should have been \s (white space); as written, _-s is read as a range from _ to s.
EDIT: You also need to escape the dots. Otherwise they are metacharacters that match any character except line breaks.
Another EDIT: If you actually DO want to allow hyphens in filenames, you need to put the hyphen at the end of the character class. Like this:
[a-zA-Z_\s0-9:\\-]+(\.pdf|\.PDF)$
You probably want to use something like
[a-zA-Z_0-9\s:\\-]+\.[pP][dD][fF]$
which is same as
[\w\s:\\-]+\.[pP][dD][fF]$
because \w = [a-zA-Z0-9_]
Be sure to put the - character as the very first or very last item in the [...] list; otherwise it has a special meaning as a range of characters, such as a-z.
Also, the \ character has to be escaped by another backslash, even inside of [...].
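Since the question mentions a user-specified list of extensions, here is a minimal TypeScript sketch that builds such a pattern from a list. The helper name buildUploadPattern and the sample paths are hypothetical, the extensions are assumed to be plain alphanumeric strings, and the "i" flag is used in place of spelling out [pP][dD][fF]. The leading ^ is added so that the whole path has to match.

```typescript
// Hypothetical helper: builds a whole-string pattern from allowed extensions.
// Assumes every extension is plain alphanumeric (no regex metacharacters).
function buildUploadPattern(extensions: string[]): RegExp {
  const alternatives: string = extensions.join("|");
  // Class contents: word characters, whitespace, colon, backslash, hyphen.
  // The hyphen comes last so that it is not read as a range.
  return new RegExp("^[\\w\\s:\\\\-]+\\.(" + alternatives + ")$", "i");
}

const uploadPattern: RegExp = buildUploadPattern(["pdf", "docx"]);

// Hypothetical Windows-style paths; "\\" in source is one backslash.
const uploadResults: boolean[] = [
  uploadPattern.test("C:\\My Documents\\report.PDF"),
  uploadPattern.test("C:\\temp\\notes.docx"),
  uploadPattern.test("C:\\temp\\setup.exe"),
];
console.log(uploadResults); // [ true, true, false ]
```

A file name that contains extra dots, such as my.report.pdf, is rejected by this class, so the dot would have to be added to the class if such names should pass.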

Textboxvalidation - regularexpressionvalidator including accented letters

A while ago I asked a question regarding textbox validation with regex (link).
So according to the answer, I use a (clientside) regularexpressionvalidator with the following regex:
([\s*]*\w[\s*]*){3,}
This worked as expected unless a word with accents is entered (e.g. élè when searching for élève).
In that case the validation does not pass.
Can someone help me out on how to include accented letters in the above regex?
Some pages say that \w should include accented letters; however, when I test it with an online validator it fails.
Thanks.
Try this:
(\s*[a-zA-Z_0-9À-ÿ]\s*){3,}
OR
([\s*]*[a-zA-Z_0-9À-ÿ][\s*]*){3,}
This will include all characters from À to ÿ (U+00C0 to U+00FF), which covers the accented letters used in French, in uppercase and lowercase. Note that this range also contains the two symbols × and ÷, and that it does not contain Œ or œ.
Try to use \p{L} instead of \w. This would allow all characters in the Unicode "Letter" category. You might have to include numbers manually (\p{N}). (See the MSDN documentation.)
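One likely explanation for the behaviour in the question: the client-side check runs in the browser's JavaScript engine, where \w covers only [A-Za-z0-9_], while \w in .NET is Unicode-aware by default. The TypeScript sketch below compares the three patterns from this thread on the sample word. The ^ and $ anchors are added to force a whole-string match, the \p{...} variant assumes an engine that supports the "u" flag (ES2018 or later), and the variable names are made up for the example.

```typescript
// The sample word from the question, written with escapes: é l è
const accentedWord: string = "\u00e9l\u00e8";

// 1. Original pattern: in JavaScript, \w is ASCII-only.
const asciiWordPattern: RegExp = new RegExp("^(\\s*\\w\\s*){3,}$");

// 2. Explicit range À-ÿ (U+00C0 to U+00FF) added to the ASCII set.
const latinRangePattern: RegExp = new RegExp(
  "^(\\s*[a-zA-Z_0-9\\u00C0-\\u00FF]\\s*){3,}$"
);

// 3. Unicode categories; requires the "u" flag in JavaScript.
const unicodeLetterPattern: RegExp = new RegExp(
  "^(\\s*[\\p{L}\\p{N}_]\\s*){3,}$",
  "u"
);

const accentResults: boolean[] = [
  asciiWordPattern.test(accentedWord),
  latinRangePattern.test(accentedWord),
  unicodeLetterPattern.test(accentedWord),
];
console.log(accentResults); // [ false, true, true ]
```

Whether a given validator passes the "u" flag to the browser is not something this sketch can show, so the explicit range may be the safer choice for a client-side check.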

Regular Expression Validator for Letters and Numbers only

What is the Regular Expression Validator for only Letters and Numbers in asp.net?
I need to enter only 0-9,a-z and A-Z. I don't want to allow any special characters single or double quotes etc. I am using asp.net 3.5 framework.
I tried ^[a-zA-Z0-9]+$ and ^[a-zA-Z0-9]*$. They are not working.
Any help will be appreciated.
Try the following.
^[a-zA-Z0-9]+$
go to this example and also alphanumerics for more
then try this
^[a-zA-Z0-9]*$
If length restriction is necessary use
^[a-zA-Z0-9]{0,50}$
This will match alphanumeric strings of 0 to 50 chars.
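The difference between the three quantifiers above only shows up on empty or over-long input. Here is a short TypeScript sketch of that, with hypothetical sample strings; the character class itself behaves the same way in JavaScript and .NET.

```typescript
const oneOrMore: RegExp = new RegExp("^[a-zA-Z0-9]+$"); // at least one character
const zeroOrMore: RegExp = new RegExp("^[a-zA-Z0-9]*$"); // empty input allowed
const upToFifty: RegExp = new RegExp("^[a-zA-Z0-9]{0,50}$"); // 0 to 50 characters

// A string of 51 letters, built without relying on newer string helpers.
const fiftyOneLetters: string = new Array(52).join("a");

const alnumResults: boolean[] = [
  oneOrMore.test("abc123"), // plain alphanumeric input
  oneOrMore.test(""), // empty input fails with +
  zeroOrMore.test(""), // empty input passes with *
  oneOrMore.test("it's"), // a quote is rejected
  upToFifty.test(fiftyOneLetters), // one character too long
];
console.log(alnumResults); // [ true, false, true, false, false ]
```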
you can try this....
^[a-zA-Z0-9]+$
see more info at here
You can define a regular expression as follows,
Regex myRegularExpression = new Regex(@"^[a-zA-Z0-9]+$");
be sure to include System.Text.RegularExpressions
and then use the Regex to match it against your user control as follows,
eg : if your user-control is a textbox
bool isValid = myRegularExpression.IsMatch(myTextBox.Text);
Dear English speaking people. With all due respect. A-Z are not the only letters in the world. Please use \w instead of [A-Za-z0-9] if you support other languages in your apps

Regular Expression for username and password?

I am trying to use a regular expression for name field in the asp.net application.
Conditions: the name should be a minimum of 6 characters.
I tried the following
"^(?=.*\d).{6}$"
I am completely new to regex. Can anyone suggest what the regex for such a condition should be?
You could use this to match any alphanumeric character in length of 6 or more: ^[a-zA-Z0-9]{6,}$. You can tweak it to allow other characters or go the other route and just put in exclusions. The Regex Coach is a great environment for testing/playing with regular expressions (I wrote a blog post with some links to other tools too).
Look at Expression library and choose user name and/or password regex for you. You can also test your regex in online regex testers like RegexPlanet.
My regex suggestions are:
^[a-zA-Z][a-zA-Z0-9._\-]{5,}$
This regex accepts user names with minimum 6 characters, starting with a letter and containing only letters, numbers and ".","-","_" characters.
Next one:
^[a-zA-Z0-9._\-]{6,}$
Similar to above, but accepts ".", "-", "_" and 0-9 to be first characters too.
If you want to validate only string length (minimum 6 characters), this simple regex below will be enough:
^.{6,}$
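To see how these suggestions differ from the pattern in the question, here is a minimal TypeScript sketch with made-up user names. The pattern from the question requires exactly 6 characters with at least one digit among them, which is stricter than "minimum 6 characters".

```typescript
// Pattern from the question: exactly 6 characters, at least one digit.
const askerPattern: RegExp = new RegExp("^(?=.*\\d).{6}$");
// First suggestion: starts with a letter, 6 or more characters in total.
const letterFirstPattern: RegExp = new RegExp("^[a-zA-Z][a-zA-Z0-9._\\-]{5,}$");
// Length-only check: any 6 or more characters.
const lengthOnlyPattern: RegExp = new RegExp("^.{6,}$");

// Hypothetical inputs.
const nameResults: boolean[] = [
  askerPattern.test("abc123"), // 6 characters with a digit
  askerPattern.test("abcdefg1"), // 8 characters: too long for {6}
  letterFirstPattern.test("john_doe"), // letter first, allowed symbols
  letterFirstPattern.test("1johndoe"), // starts with a digit
  lengthOnlyPattern.test("1johndoe"), // only the length matters
  lengthOnlyPattern.test("abcde"), // 5 characters: too short
];
console.log(nameResults); // [ true, false, true, false, true, false ]
```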
What about
^.{6,}$
What's all the stuff at the start of yours, and did you want to limit yourself to digits?
NRegex is a nice site for testing out regexes.
To just match 6 characters, ".{6}" is enough
In its simplest form, you can use the following:
.{6,}
This will match on 6 or more characters and fail on anything less. This will accept ANY character except a line break - unicode, ascii, whatever you are running through. If you have more requirements (e.g. only the latin alphabet, must contain a number, etc), the regex would obviously have to change.
