Hi, I'm creating a registration page with an "Enter License Number:" field. I want a validation expression so that the form is not submitted if the user types the number in the wrong format; it must be corrected before submitting. I dragged a "Regular Expression Validator" onto my page, but it has no default expression for license numbers, so I need to write a custom one.
All I need to know is the validation expression for this sample license number:
G11-11-004064 -- A Philippines sample driver's license.
LetterNumberNumber - NumberNumber - NumberNumberNumberNumberNumberNumber
Could you convert it?
Here's a regular expression editor. It's aimed towards Ruby but will do for .NET as well:
http://rubular.com/
I don't know the detailed specification of the license numbers you're looking for, but I created a regex based on your example: ^[A-Z]\d{2}-\d{2}-\d{6}$.
You can modify it here:
http://rubular.com/r/7bHsX1tJ23
The example explained:
^[A-Z]\d{2}-\d{2}-\d{6}$
^ = start of line
[A-Z] = a single upper case letter
\d{2} = exactly 2 digits
- = a literal hyphen
\d{6} = exactly 6 digits
$ = end of line
If the license may also start with a lower case letter, use [A-Za-z] instead of [A-Z].
(Thanks to Paul Sullivan)
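To try the pattern outside the validator, here is a minimal Python sketch. It assumes the format is exactly the one in the sample (letter, 2 digits, 2 digits, 6 digits); the function name is_valid_license is made up for illustration and is not part of any framework.

```python
import re

# Pattern from the answer above, with [A-Za-z] so a lower-case first letter is accepted.
# re.ASCII keeps \d limited to the digits 0-9.
LICENSE_RE = re.compile(r"[A-Za-z]\d{2}-\d{2}-\d{6}", re.ASCII)


def is_valid_license(text):
    """Return True only if the whole string has the shape L99-99-999999."""
    # fullmatch plays the role of the ^ and $ anchors.
    return LICENSE_RE.fullmatch(text) is not None


print(is_valid_license("G11-11-004064"))
print(is_valid_license("G1-111-004064"))
```

The first call should be accepted and the second rejected, because the digit groups have the wrong lengths.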
/[A-Za-z][0-9][0-9]-[0-9][0-9]-[0-9][0-9][0-9][0-9][0-9][0-9]/
I'm sure this is as basic as it gets, but it will match.
See online regex tester
Related
I have a requirement to validate US ZIP+4 codes where the +4 is optional and can't be 0000. I'm doing this in .NET, so I'm using RegularExpressionValidator with a regex set. My first validator checks that the ZIP code is in xxxxx-xxxx or xxxxx format, that is, 5+4 or 5 digits. My second validator checks that the last 4 are not 0000, which means 12345-0000 is invalid. These are my regexes, and I want to be sure they are valid. They seem to test okay, but when I cross-check them with the regex101 app online I get different behavior than in .NET.
xxxxx-xxxx or xxxxx = ^[0-9]{5}(?:-[0-9]{4})?$
xxxxx-0000 = \d{5}(?!-0000).*
I don't quite understand how this last one works, but it seems to work. Could someone explain the ?! and the .* to me? Both seem to be necessary for this to function. My understanding is that .* means any characters and ?! means negative lookahead.
The regex pattern I would suggest here is a combination of the two you provided above:
^[0-9]{5}(?!-0000$)(?:-[0-9]{4})?$
Demo
Here is an explanation of the pattern:
^ from the start of the ZIP code
[0-9]{5} match a 5 digit ZIP code
(?!-0000$) then assert that the +4 extension is NOT -0000
(?:-[0-9]{4})? match an optional -xxxx +4 extension (which can't be 0000)
$ end of the ZIP code
Of note, the (?!-0000$) term is called a negative lookahead, because it looks ahead in the input and asserts that what follows is not -0000. A lookahead does not consume any input, so once the negative assertion succeeds, the pattern continues from the same position and tries to match an optional -xxxx extension.
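The combined pattern can be checked quickly in Python, whose re module supports the same negative lookahead syntax. This is only a sketch for experimenting, not the .NET validator itself, and is_valid_zip is a hypothetical helper name.

```python
import re

# Combined pattern from the answer above.
ZIP_RE = re.compile(r"^[0-9]{5}(?!-0000$)(?:-[0-9]{4})?$")


def is_valid_zip(text):
    """Accept xxxxx or xxxxx-xxxx, but reject a -0000 suffix."""
    # fullmatch also rejects a trailing newline, which a bare $ would tolerate in Python.
    return ZIP_RE.fullmatch(text) is not None


for candidate in ["12345", "12345-6789", "12345-0000", "1234"]:
    print(candidate, is_valid_zip(candidate))
```

Of the four candidates, only the first two should pass: 12345-0000 is stopped by the lookahead, and 1234 is too short.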
I'm trying to extract UK postcodes from address strings in R, using the regular expression provided by the UK government here.
Here is my function:
address_to_postcode <- function(addresses) {
  # 1. Convert addresses to upper case
  addresses <- toupper(addresses)
  # 2. Regular expression for UK postcodes:
  pcd_regex <- "([Gg][Ii][Rr] 0[Aa]{2})|((([A-Za-z][0-9]{1,2})|(([A-Za-z][A-Ha-hJ-Yj-y][0-9]{1,2})|(([A-Za-z][0-9][A-Za-z])|([A-Za-z][A-Ha-hJ-Yj-y][0-9]?[A-Za-z])))) {0,1}[0-9][A-Za-z]{2})"
  # 3. Check if a postcode is present in each address or not (return TRUE if present, else FALSE)
  present <- grepl(pcd_regex, addresses)
  # 4. Extract postcodes matching the regular expression for a valid UK postcode
  postcodes <- regmatches(addresses, regexpr(pcd_regex, addresses))
  # 5. Return NA where an address does not contain a (valid format) UK postcode
  postcodes_out <- list()
  postcodes_out[present] <- postcodes
  postcodes_out[!present] <- NA
  # 6. Return the results in a vector (should be same length as input vector)
  return(do.call(c, postcodes_out))
}
According to the guidance document, the logic this regular expression looks for is as follows:
"GIR 0AA" OR
One letter followed by either one or two numbers OR
One letter followed by a second letter that must be one of ABCDEFGHJKLMNOPQRSTUVWXY (i.e. not I) and then followed by either one or two numbers OR
One letter followed by one number and then another letter OR
A two part post code where the first part must be one letter followed by a second letter that must be one of ABCDEFGHJKLMNOPQRSTUVWXY (i.e. not I) and then followed by one number and optionally a further letter after that
AND
The second part (separated by a space from the first part) must be one number followed by two letters.
A combination of upper and lower case characters is allowed.
Note: the length is determined by the regular expression and is between 2 and 8 characters.
My problem is that this logic is not completely preserved when using the regular expression without the ^ and $ anchors (as I have to do in this scenario because the postcode could be anywhere within the address strings); what I'm struggling with is how to preserve the order and number of characters for each segment in a partial (as opposed to complete) string match.
Consider the following example:
> address_to_postcode("1A noplace road, random city, NR1 2PK, UK")
[1] "NR1 2PK"
According to the logic in the guideline, the second letter in the postcode cannot be 'z' (and there are some other exclusions too); however look what happens when I add a 'z':
> address_to_postcode("1A noplace road, random city, NZ1 2PK, UK")
[1] "Z1 2PK"
... whereas in this case I would expect the output to be NA.
Adding the anchors (for a different usage case) doesn't seem to help as the 'z' is still accepted even though it is in the wrong place:
> grepl("^([Gg][Ii][Rr] 0[Aa]{2})|((([A-Za-z][0-9]{1,2})|(([A-Za-z][A-Ha-hJ-Yj-y][0-9]{1,2})|(([A-Za-z][0-9][A-Za-z])|([A-Za-z][A-Ha-hJ-Yj-y][0-9]?[A-Za-z])))) {0,1}[0-9][A-Za-z]{2})$", "NZ1 2PK")
[1] TRUE
Two questions:
Have I misunderstood the logic of the regular expression, and
if not, how can I correct it (i.e. why aren't the specified letter and character ranges exclusive to their position within the regular expression)?
Edit
Since posting this answer, I dug deeper into the UK government's regex and found even more problems. I posted another answer here that describes all the issues and provides alternatives to their poorly formatted regex.
Note
Please note that I'm posting the raw regex here. You'll need to escape certain characters (like backslashes \) when porting to R.
Issues
You have many issues here, all of which trace back to whoever wrote the regex in the document you're retrieving it from.
1. The space character
My guess is that when you copied the regular expression from the link you provided it converted the space character into a newline character and you removed it (that's exactly what I did at first). You need to, instead, change it to a space character.
^([Gg][Ii][Rr] 0[Aa]{2})|((([A-Za-z][0-9]{1,2})|(([A-Za-z][A-Ha-hJ-Yj-y][0-9]{1,2})|(([AZa-z][0-9][A-Za-z])|([A-Za-z][A-Ha-hJ-Yj-y][0-9]?[A-Za-z])))) [0-9][A-Za-z]{2})$
(the space goes immediately before the final [0-9][A-Za-z]{2})
2. Boundaries
You need to remove the anchors ^ and $ as these indicate start and end of line. Instead, wrap your regex in (?:) and place a \b (word boundary) on either end as the following shows. In fact, the regex in the documentation is incorrect (see Side note for more information) as it will fail to anchor the pattern properly.
See regex in use here
\b(?:([Gg][Ii][Rr] 0[Aa]{2})|((([A-Za-z][0-9]{1,2})|(([A-Za-z][A-Ha-hJ-Yj-y][0-9]{1,2})|(([AZa-z][0-9][A-Za-z])|([A-Za-z][A-Ha-hJ-Yj-y][0-9]?[A-Za-z])))) [0-9][A-Za-z]{2}))\b
(added: \b(?: at the start and )\b at the end)
3. Character class oversight
There's a missing - in the character class as pointed out by @deadcrab in his answer here.
\b(?:([Gg][Ii][Rr] 0[Aa]{2})|((([A-Za-z][0-9]{1,2})|(([A-Za-z][A-Ha-hJ-Yj-y][0-9]{1,2})|(([A-Za-z][0-9][A-Za-z])|([A-Za-z][A-Ha-hJ-Yj-y][0-9]?[A-Za-z])))) [0-9][A-Za-z]{2}))\b
(changed: [AZa-z] is now [A-Za-z])
4. They made the wrong character class optional!
In the documentation it clearly states:
A two part post code where the first part must be:
One letter followed by a second letter that must be one of ABCDEFGHJKLMNOPQRSTUVWXY (i.e. not I) and then followed by one number and optionally a further letter after that
They made the wrong character class optional!
\b(?:([Gg][Ii][Rr] 0[Aa]{2})|((([A-Za-z][0-9]{1,2})|(([A-Za-z][A-Ha-hJ-Yj-y][0-9]{1,2})|(([A-Za-z][0-9][A-Za-z])|([A-Za-z][A-Ha-hJ-Yj-y][0-9]?[A-Za-z])))) [0-9][A-Za-z]{2}))\b
(the ? is on [0-9] in the last alternative, [0-9]?[A-Za-z]; it should be on the [A-Za-z] that follows it)
5. The whole thing is just awful...
There are so many things wrong with this regex that I just decided to rewrite it. It can very easily be simplified to perform a fraction of the steps it currently takes to match text.
\b(?:[A-Za-z][A-HJ-Ya-hj-y]?[0-9][0-9A-Za-z]? [0-9][A-Za-z]{2}|[Gg][Ii][Rr] 0[Aa]{2})\b
Answer
As mentioned in the comments below my answer, some postcodes are missing the space character. For missing spaces in the postcodes (e.g. NR12PK), simply add a ? after the spaces as shown in the regex below:
\b(?:[A-Za-z][A-HJ-Ya-hj-y]?[0-9][0-9A-Za-z]? ?[0-9][A-Za-z]{2}|[Gg][Ii][Rr] ?0[Aa]{2})\b
(added: a ? after each of the two spaces)
You may also shorten the regex above to the following by using the case-insensitive flag (ignore.case(pattern) or ignore_case = TRUE in R, depending on the method used):
\b(?:[A-Z][A-HJ-Y]?[0-9][0-9A-Z]? ?[0-9][A-Z]{2}|GIR ?0A{2})\b
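For anyone who wants to try the shortened, case-insensitive pattern outside R, here is a rough Python sketch of the same extraction. The function name address_to_postcode mirrors the one in the question, and returning None stands in for R's NA; both are choices made for this illustration only.

```python
import re

# Shortened pattern from above; re.IGNORECASE stands in for the case-insensitive flag.
POSTCODE_RE = re.compile(
    r"\b(?:[A-Z][A-HJ-Y]?[0-9][0-9A-Z]? ?[0-9][A-Z]{2}|GIR ?0A{2})\b",
    re.IGNORECASE,
)


def address_to_postcode(addresses):
    """Return the first postcode-shaped substring of each address, or None."""
    results = []
    for address in addresses:
        match = POSTCODE_RE.search(address)
        results.append(match.group(0).upper() if match else None)
    return results


print(address_to_postcode([
    "1A noplace road, random city, NR1 2PK, UK",
    "1A noplace road, random city, NZ1 2PK, UK",
]))
```

Because of the \b word boundaries, the second address should no longer produce the partial match "Z1 2PK" seen in the question.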
Note
Please note that regular expressions only validate the possible format(s) of a string and cannot actually identify whether or not a postcode legitimately exists. For this, you should use an API. There are also some edge-cases where this regex will not properly match valid postcodes. For a list of these postcodes, please see this Wikipedia article.
The regex below additionally matches the following (make it case-insensitive to match lowercase variants as well):
British Overseas Territories
British Forces Post Office
Although they've recently changed it to align with the British postcode system to BF, followed by a number (starting with BF1), they're considered optional alternative postcodes
Special cases outlined in that article (as well as SAN TA1 - a valid postcode for Santa!)
See this regex in use here.
\b(?:(?:[A-Z][A-HJ-Y]?[0-9][0-9A-Z]?|ASCN|STHL|TDCU|BBND|[BFS]IQ{2}|GX11|PCRN|TKCA) ?[0-9][A-Z]{2}|GIR ?0A{2}|SAN ?TA1|AI-?[0-9]{4}|BFPO[ -]?[0-9]{2,3}|MSR[ -]?1(?:1[12]|[23][135])0|VG[ -]?11[1-6]0|[A-Z]{2} ? [0-9]{2}|KY[1-3][ -]?[0-2][0-9]{3})\b
I would also recommend anyone implementing this answer to read this StackOverflow question titled UK Postcode Regex (Comprehensive).
Side note
The documentation you linked to (Bulk Data Transfer: Additional Validation for CAS Upload - Section 3. UK Postcode Regular Expression) actually has an improperly written regular expression.
As mentioned in the Issues section, they should have:
Wrapped the entire expression in (?:) and placed the anchors around the non-capturing group. Their regular expression, as it stands, will fail in some cases as seen here.
The regular expression is also missing - in one of the character classes
It also made the wrong character class optional.
Here is my regular expression:
import re
txt = "0288, Bishopsgate, London Borough of Tower Hamlets, London, Greater London, England, EC2M 4QP, United Kingdom"
matches = re.findall(r'[A-Z]{1,2}[0-9][A-Z0-9]? [0-9][ABD-HJLNP-UW-Z]{2}', txt)
In my requirement, a TextBox should allow letters, numbers, special characters and special symbols, with at least one letter.
I tried the following, but it does not work:
^\d*[a-zA-Z][a-zA-Z0-9#*,$._&% -!><^#]*$
You may want to have 2 regular expression validators: one for validating the allowed characters, and one for validating that at least one letter has been provided. You may be able to do it with a single expression, but with two validators you can show the user two separate messages explaining why the input is wrong.
Just match for special characters until you encounter a letter, then match for everything until the end of the string:
^[0-9#*,$._&% -!><^#]*[a-zA-Z][a-zA-Z0-9#*,$._&% -!><^#]*$
Use lookaheads :
/^(?=.*[a-zA-Z])[\w#*,$.&%!><^#-]*$/
Edit :
I assume the - is meant as the actual - character and not a range of space to !.
I removed the space character. You can of course add it if you want.
[ -!]
Effectively means :
[ -!] # Match a single character in the range between “ ” and “!”
And I have no idea what that range entails!
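As for what that range contains: a space is code point 32 and "!" is code point 33, so [ -!] covers just those two characters. The short Python sketch below lists them and also tries the lookahead pattern from this answer. The helper name has_letter_and_allowed_chars is made up, and re.ASCII is an assumption added here so that \w only means [a-zA-Z0-9_].

```python
import re

# Every character in the range from " " to "!" (inclusive).
range_chars = [chr(code) for code in range(ord(" "), ord("!") + 1)]

# Lookahead pattern from the answer above: at least one letter somewhere,
# and nothing but the allowed characters overall.
AT_LEAST_ONE_LETTER = re.compile(r"^(?=.*[a-zA-Z])[\w#*,$.&%!><^#-]*$", re.ASCII)


def has_letter_and_allowed_chars(text):
    """True if text contains a letter and only characters from the allowed set."""
    return AT_LEAST_ONE_LETTER.fullmatch(text) is not None


print(range_chars)
print(has_letter_and_allowed_chars("123#"))
print(has_letter_and_allowed_chars("1a!"))
```

"123#" should be rejected because it has no letter, while "1a!" should pass.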
I am trying to use a regular expression for the name field in an ASP.NET application.
Condition: the name should be a minimum of 6 characters.
I tried the following:
"^(?=.*\d).{6}$"
I'm completely new to regex. Can anyone suggest what the regex must be for such a condition?
You could use this to match any alphanumeric character in length of 6 or more: ^[a-zA-Z0-9]{6,}$. You can tweak it to allow other characters or go the other route and just put in exclusions. The Regex Coach is a great environment for testing/playing with regular expressions (I wrote a blog post with some links to other tools too).
Look at the Expression library and choose a user name and/or password regex that suits you. You can also test your regex in online regex testers like RegexPlanet.
My regex suggestions are:
^[a-zA-Z][a-zA-Z0-9._\-]{5,}$
This regex accepts user names with minimum 6 characters, starting with a letter and containing only letters, numbers and ".","-","_" characters.
Next one:
^[a-zA-Z0-9._\-]{6,}$
Similar to above, but accepts ".", "-", "_" and 0-9 to be first characters too.
If you want to validate only string length (minimum 6 characters), this simple regex below will be enough:
^.{6,}$
What about
^.{6,}$
What's all the stuff at the start of yours, and did you want to limit yourself to digits?
NRegex is a nice site for testing out regexes.
To match exactly 6 characters, ".{6}" is enough; for a minimum of 6, use ".{6,}".
In its simplest form, you can use the following:
.{6,}
This will match on 6 or more characters and fail on anything less. This will accept ANY character - unicode, ascii, whatever you are running through. If you have more requirements (i.e. only the latin alphabet, must contain a number, etc), the regex would obviously have to change.
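To compare the suggestions in this thread side by side, here is a small Python sketch. The three function names are invented for the example; the patterns are the ones posted above, applied to the whole string with fullmatch.

```python
import re


def at_least_six(text):
    """Any 6 or more characters (the .{6,} suggestion)."""
    return re.fullmatch(r".{6,}", text) is not None


def six_plus_alnum(text):
    """6 or more letters or digits only."""
    return re.fullmatch(r"[a-zA-Z0-9]{6,}", text) is not None


def username_ok(text):
    """Starts with a letter, then 5 or more of letters, digits, '.', '_' or '-'."""
    return re.fullmatch(r"[a-zA-Z][a-zA-Z0-9._\-]{5,}", text) is not None


for name in ["abcde", "abc def!", "1abcdef", "a.b-c_d"]:
    print(name, at_least_six(name), six_plus_alnum(name), username_ok(name))
```

The stricter patterns reject inputs that the plain length check accepts, which is the trade-off to decide on for a name field.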
I'm having a hard time trying to create a right regular expression for the RegularExpressionValidator control that allows password to be checked for the following:
- Is greater than seven characters.
- Contains at least one digit.
- Contains at least one special (non-alphanumeric) character.
I can't seem to find any results out there either. Any help would be appreciated! Thanks!
Maybe you will find this article helpful. You may try the following expression
^.*(?=.{8,})(?=.*[\d])(?=.*[\W]).*$
and the breakdown:
(?=.{8,}) - contains at least 8 characters
(?=.*[\d]) - contains at least one digit
(?=.*[\W]) - contains at least one special character
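The three lookaheads can be tried out in Python, which uses the same lookahead syntax. This is a sketch for experimenting, not the validator control itself, and password_ok is a hypothetical name. One thing it shows: \W does not match the underscore, so "_" does not count as a special character with this pattern.

```python
import re

# Expression from the answer above.
PASSWORD_RE = re.compile(r"^.*(?=.{8,})(?=.*[\d])(?=.*[\W]).*$")


def password_ok(password):
    """True if the password has 8+ characters, a digit and a non-word character."""
    return PASSWORD_RE.fullmatch(password) is not None


for candidate in ["abcdefg1!", "abc1!", "abcdefgh1", "abcdefg_1"]:
    print(candidate, password_ok(candidate))
```

Only the first candidate should pass; the others are too short, lack a special character, or rely on the underscore.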
http://msdn.microsoft.com/en-us/library/ms972966.aspx
Search for "Lookaround processing" which is necessary in these examples. You can also test for a range of values by using .{4,8} as in Microsoft's example:
^(?=.*\d).{4,8}$
Try this
((?=.*\d)(?=.*[a-z])(?=.*[\W]).{6,20})
Description of above Regular Expression:
( # start of group
(?=.*\d) # must contain at least one digit from 0-9
(?=.*[a-z]) # must contain at least one lowercase character
(?=.*[\W]) # must contain at least one special character
. # match anything, with the previous conditions checked
{6,20} # length at least 6 characters and at most 20
) # end of group
"\W" widens the range of characters that can be used in the password, which can make it safer.
Use this for a strong password with uppercase, lowercase, numbers, symbols and 8 to 15 characters.
// Code for validation with a regular expression in ASP.NET Core.
[RegularExpression(@"^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[^\da-zA-Z]).{8,15}$")]
Regular expression password validation:
@"^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[^\da-zA-Z]).{8,15}$"
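Each lookahead in that expression corresponds to one rule, so the rules can also be checked one at a time to tell the user which one failed. Below is a rough Python sketch of that idea; the names CHECKS, failed_rules and is_strong are made up for illustration and have nothing to do with the ASP.NET attribute.

```python
import re

# The combined expression from above.
STRONG_RE = re.compile(r"^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[^\da-zA-Z]).{8,15}$")

# One small pattern per lookahead, so each rule can be reported separately.
CHECKS = [
    ("lowercase letter", re.compile(r"[a-z]")),
    ("uppercase letter", re.compile(r"[A-Z]")),
    ("digit", re.compile(r"\d")),
    ("special character", re.compile(r"[^\da-zA-Z]")),
]


def is_strong(password):
    """True if the password satisfies the combined expression."""
    return STRONG_RE.fullmatch(password) is not None


def failed_rules(password):
    """Return the names of the rules the password does not satisfy."""
    failed = [name for name, pattern in CHECKS if not pattern.search(password)]
    if not 8 <= len(password) <= 15:
        failed.append("length 8 to 15")
    return failed


for candidate in ["Passw0rd!", "password1!", "Passw0rd", "Pw0!"]:
    print(candidate, is_strong(candidate), failed_rules(candidate))
```

Note that [^\da-zA-Z] treats anything that is not a digit or an ASCII letter as special, including a space or an underscore.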