Regular Expression to get A-Z, excluding other characters and numbers - plsql

I have a field Facility with some of the following records:
ABC-XY
ABC-ZZ
EFG-AA
NM
NM-100
NM-202
HYK-109
LI-022
How can I get the letters before the hyphen, and also get the letters when no hyphen is present (as in NM)?

Your regex should be:
^[A-Z]+
^ anchors the match at the start of the string, and [A-Z]+ matches one or more uppercase letters, stopping at the first character outside A-Z (such as the hyphen or the end of the string).
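To see what the pattern returns for the sample records, here is a minimal sketch in Python rather than PL/SQL (in Oracle the same pattern would normally be passed to REGEXP_SUBSTR). The list and the helper name leading_letters are made up for illustration.

```python
import re

# Sample values copied from the question.
facilities = ["ABC-XY", "ABC-ZZ", "EFG-AA", "NM", "NM-100", "NM-202", "HYK-109", "LI-022"]

# ^ anchors at the start; [A-Z]+ takes uppercase letters until something else appears.
LEADING_LETTERS = re.compile(r"^[A-Z]+")

def leading_letters(value):
    """Return the run of uppercase letters at the start of value, or None."""
    match = LEADING_LETTERS.match(value)
    return match.group(0) if match else None

prefixes = [leading_letters(f) for f in facilities]
print(prefixes)
```

Values without a hyphen, such as NM, come back unchanged because the match simply runs to the end of the string.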

Related

Check if character is number

I want to check if a character can be safely converted to a numeric by using a regex.
However, I don't see my error. Example:
stringr::str_detect("4.", pattern = "-{0,1}[0-9]+(.[0-9]+){0,1}")
This produces TRUE. My intention was to specify that whenever a . follows the first sequence of digits, there must be at least one more digit, hence (.[0-9]+){0,1}.
What's wrong here?
Note:
(.[0-9]+){0,1} is an optional pattern because {0,1} (=?) makes the .[0-9]+ pattern sequence match one or zero times. So, yes, one or more digits ([0-9]+) must follow any char other than line break chars (matched with an unescaped .), but this pattern is optional, and thus you cannot require anything with it.
. is unescaped, so it matches any char other than line break chars. Escape it to match a literal dot.
Your regex is not anchored, and can match partial substrings in a longer string. Use ^ and $ to make the pattern match the whole string.
So, consider using
stringr::str_detect("4.", pattern = "^-?[0-9]+(?:\\.[0-9]+)?$")
where
^ - start of string
-? - an optional - char
[0-9]+ - one or more digits
(?:\.[0-9]+)? - a non-capturing group matching an optional sequence of a . and then one or more digits
$ - end of string.
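The difference between the two patterns can be checked outside R as well. The following is a small sketch in Python, not the stringr call from the question; the function name looks_numeric is hypothetical.

```python
import re

# Original, unanchored pattern with an unescaped dot.
LOOSE = re.compile(r"-{0,1}[0-9]+(.[0-9]+){0,1}")

# Anchored pattern with an escaped dot, as suggested above.
STRICT = re.compile(r"^-?[0-9]+(?:\.[0-9]+)?$")

def looks_numeric(text):
    """True if text is an optional minus, digits, and an optional .digits part."""
    return STRICT.search(text) is not None

# The loose pattern finds "4" inside "4." and reports a match.
print(LOOSE.search("4.") is not None)

for sample in ["4.", "4", "4.5", "-12", "a4.5"]:
    print(sample, looks_numeric(sample))
```

The loose pattern succeeds on "4." because the optional group is skipped and nothing forces the match to cover the whole string.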

Regular expression for one Alphabet and many numbers

I need a regular expression that checks that my string consists of one letter followed by many digits, e.g. A546456, B456 or N455. I'm using ASP.NET.
I have tried
^[a-z|A-Z|]+[a-z|A-Z|0-9]+[0-9]*
and
[a-zA-Z0-9]+
A regular expression to validate if a string starts with a single letter A-Z in any case and has 1 or more digits and nothing else is: ^[A-Za-z]\d+$
Explanation:
^ ... beginning of string (or beginning of line in other context).
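[A-Za-z] ... a character class definition for a single character (no multiplier appended) matching a letter in range A-Z in any case. \w cannot be used because it also matches digits, the underscore, and letters from other languages, for example German umlauts.
\d ... a digit. For ASCII input this is the same as [0-9]; note that in .NET, \d also matches other Unicode decimal digits unless RegexOptions.ECMAScript is set, so use [0-9] if only ASCII digits are acceptable.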
+ ... previous expression (any digit) 1 or more times.
$ ... end of string (or end of line in other context).
Be careful when putting a regular expression that contains a backslash, as \d does here, into a string literal. In many programming and scripting languages the backslash is also the escape character inside strings, so it often has to be escaped with one more backslash, i.e. "^[A-Za-z]\\d+$"
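As a quick check of both the pattern and the escaping remark, here is a sketch in Python instead of ASP.NET. The re.ASCII flag is my own addition so that \d means exactly [0-9]; the helper name is_letter_then_digits is hypothetical.

```python
import re

# A raw string keeps the backslash as-is; a normal string needs it doubled.
RAW_PATTERN = r"^[A-Za-z]\d+$"
ESCAPED_PATTERN = "^[A-Za-z]\\d+$"
print(RAW_PATTERN == ESCAPED_PATTERN)

# re.ASCII restricts \d to the digits 0-9.
LETTER_THEN_DIGITS = re.compile(RAW_PATTERN, re.ASCII)

def is_letter_then_digits(text):
    """True if text is exactly one ASCII letter followed by one or more digits."""
    return LETTER_THEN_DIGITS.search(text) is not None

for sample in ["A546456", "B456", "N455", "AB12", "A", "1234"]:
    print(sample, is_letter_then_digits(sample))
```

Both spellings produce the same pattern text; only the way the source code writes the backslash differs.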

regex to allow space if followed by character

I have an asp.net regularexpressionvalidator that I need to match on a textbox. If there is any text, logically the rules are as follows:
The text must be at least three characters, after any trimming to remove spaces.
Characters allowed are a-zA-Z0-9-' /\&.
I'm having major pain trying to construct an expression that will allow a space as the third character only if there is a fourth non-space character.
Can anyone suggest an expression? My last attempt was:
^[a-zA-Z0-9-'/\\&\.](([a-zA-Z0-9-'/\\&\.][a-zA-Z0-9-' /\\&\.])|([a-zA-Z0-9-' /\\&\.][a-zA-Z0-9-'/\\&\.]))[a-zA-Z0-9-' /\\&\.]{0,}$
but that does not match on 'a a'.
Thanks.
OK, now this is all in one regex:
^\s*(?=[a-zA-Z0-9'/\\&.-])([a-zA-Z0-9'/\\&.\s-]{3,})(?<=\S)\s*$
Explanation:
^ # Start of string
\s* # Optional leading whitespace, don't capture that.
(?= # Assert that...
[a-zA-Z0-9'/\\&.-] # the next character is allowed and non-space
)
( # Match and capture...
[a-zA-Z0-9'/\\&.\s-]{3,} # three or more allowed characters, including space
)
(?<=\S) # Assert that the previous character is not a space
\s* # Optional trailing whitespace, don't capture that.
$ # End of string
This matches
abc
aZ- &//
a ab abc x
aaa
a a
and doesn't match
aa
abc!
a&
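The listed examples can be run against the pattern. This is a sketch in Python rather than the ASP.NET validator; the lookahead and lookbehind used here behave the same way in Python's re module, and the helper name is_valid is hypothetical.

```python
import re

# Same pattern as above; the string is double-quoted because the class contains '.
PATTERN = re.compile(r"^\s*(?=[a-zA-Z0-9'/\\&.-])([a-zA-Z0-9'/\\&.\s-]{3,})(?<=\S)\s*$")

def is_valid(text):
    """True if text has at least three allowed characters after trimming spaces."""
    return PATTERN.search(text) is not None

should_match = ["abc", "aZ- &//", "a ab abc x", "aaa", "a a"]
should_not_match = ["aa", "abc!", "a&"]

for sample in should_match + should_not_match:
    print(repr(sample), is_valid(sample))

# Group 1 holds the trimmed value.
print(repr(PATTERN.search("  a a  ").group(1)))
```

The lookbehind (?<=\S) is what stops the capture from ending in a space, so "aa " is rejected even though it has three characters before trimming.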
Simplifying your allowed characters to be a-z and space for clarity, doesn't this do it?
^ *[a-z][a-z ]+[a-z] *$
Ignore spaces. Now a letter. Then some letters or spaces. Then a letter. Ignore more spaces.
The full thing becomes:
^ *[a-zA-Z0-9-'/\\&\.][a-zA-Z0-9-'/\\&\. ]+[a-zA-Z0-9-'/\\&\.] *$
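A short sketch of the simplified a-z version, written in Python for illustration (the helper name is_valid_simple is hypothetical). It shows that the letter, middle part, letter structure is what enforces the three-character minimum.

```python
import re

# Simplified form: optional spaces, a letter, letters or spaces, a letter, optional spaces.
SIMPLE = re.compile(r"^ *[a-z][a-z ]+[a-z] *$")

def is_valid_simple(text):
    """True if the trimmed text has at least three characters and no space at either end."""
    return SIMPLE.search(text) is not None

for sample in ["a a", "abc", " ab c ", "aa", "a  "]:
    print(repr(sample), is_valid_simple(sample))
```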

Regular Expression to match only letters

I need to write a regular expression for the ASP.NET RegularExpressionValidator web control.
The regular expression should ALLOW all alphabetic characters but not numbers or special characters (for example: |!"£$%&/().
Any idea how to do it?
^[A-Za-z]+$
validates a string of length 1 or greater, consisting only of ASCII letters.
^[^\W\d_]+$
does the same for international letters, too.
Explanation:
[^ # match any character that is NOT a
\W # non-word character (anything other than a letter, digit or underscore)
\d # digit
_ # or underscore
] # end of character class
Effectively, you get \w minus (\d and _).
Or, you could use the fact that ASP.NET supports Unicode properties:
^\p{L}+$
validates a string of Unicode letters of length 1 or more.
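To compare the ASCII-only and the international variant, here is a sketch in Python rather than ASP.NET. Python's built-in re module does not support \p{L}, so only the first two patterns are shown; in Python 3, \w is Unicode-aware for str patterns by default.

```python
import re

ASCII_LETTERS = re.compile(r"^[A-Za-z]+$")
# \w minus digits and underscore: letters of any script.
ANY_LETTERS = re.compile(r"^[^\W\d_]+$")

samples = ["Hello", "M\u00fcller", "abc123", "a_b", "|!$%&/()"]

for sample in samples:
    ascii_ok = ASCII_LETTERS.search(sample) is not None
    any_ok = ANY_LETTERS.search(sample) is not None
    print(repr(sample), ascii_ok, any_ok)
```

The name with the umlaut (written as \u00fc in the source) passes only the second pattern; digits, underscores and punctuation fail both.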
Including spaces:
"^[a-zA-Z ]*$"
Excluding Spaces:
"^[a-zA-Z]*$"
Both patterns also accept an empty string. To require at least one character, change the * to a +.
You can use the regex:
^[a-zA-Z]+$
Explanation:
^ : Start anchor
[..] : Char class
+ : one or more repetitions
$ : End anchor

.net regex meaning of [^\\.]+

I have a question about a regex. Given this part of a regex:
(.[^\\.]+)
Does the part [^\.]+ mean "get everything until the first dot"? With this text:
Hello my name is Martijn. I live in Holland.
I get 2 results: both sentences. But when I leave out the + sign, I get matches of two characters each: he, ll, o<space>, my, etc. Why is that?
Your regex .[^\\.]+ means:
Match any character
Then match one or more characters that are neither a backslash nor a dot ".". Note that [^\\.] is a negated character class: it matches any single character except a backslash or a dot. Because of the "+" at the end, it keeps matching characters until it finds a dot or a backslash. The + is a greedy quantifier, so it takes as many characters as it can.
When you input (quotes not included): "Hello my name is Martijn. I live in Holland."
The matches are:
Hello my name is Martijn
. I live in Holland
Note that the dot is not included in the first match since it stops at n in Martijn and the second match starts with the dot.
When you remove the +: (.[^\\.])
It just means:
Match any character
Match any character except a dot or a backslash.
Because a dot outside a character class (ie, not between []) means (almost) any character.
So, .[^\\.] means match (almost) any character followed by something which is neither a dot nor a backslash (dots don't need to be escaped in a character class to mean just a dot, but backslashes do).
This, in your example, is H (any character) followed by e (neither a dot nor a backslash), and so on.
Whereas with a + (one or more characters that are neither a dot nor a backslash) you will match all characters up to, but not including, the next dot.
The regex means:
any one character followed by more than zero characters that are not a backslash or a period.
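The behaviour described in the answers can be reproduced with the sentence from the question. This is a sketch in Python instead of .NET; the character class and the greedy + work the same way for this pattern, and the capturing parentheses are left out because they do not change what is matched.

```python
import re

text = "Hello my name is Martijn. I live in Holland."

# With +: one character, then one or more characters that are
# neither a backslash nor a dot.
with_plus = re.findall(r".[^\\.]+", text)
print(with_plus)  # ['Hello my name is Martijn', '. I live in Holland']

# Without +: one character, then exactly one non-dot, non-backslash character.
without_plus = re.findall(r".[^\\.]", text)
print(without_plus[:4])  # ['He', 'll', 'o ', 'my']
```

The final dot of the text is not part of any match: it can be consumed by the leading . but nothing follows it for the character class to match.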
