Regular expression in c# for input like (integer-anything) - asp.net

string strRegexclass = @"^([0-9]+)\-([a-zA-Z])$";
I want to write a regular expression that accepts input like 1-class: any integer value before the dash (-), then a mandatory dash, then anything after the dash.

You can use a pattern like this:
string strRegexclass = @"^\d+-.*$";
Or you can use
string strRegexclass = @"^\d+-\w*$";
if you want to allow only word characters (letters, digits, and underscores) after the dash.

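If you only need to match a string that starts with one or more ASCII digits, then a hyphen, and then any zero or more characters, use:
^[0-9]+-.*$
Note that \d and [0-9] are not equivalent in the .NET regex flavor: by default \d matches any Unicode decimal digit, while [0-9] matches only ASCII digits (\d is restricted to ASCII digits only when RegexOptions.ECMAScript is set).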
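The difference between \d and [0-9] can be checked with a quick experiment. The sketch below uses Python's re module as a stand-in (an assumption, since the question is about .NET): for str patterns Python's \d also accepts non-ASCII decimal digits, so it shows the same kind of gap. The helper name is_valid and the sample inputs are illustrative only.

```python
import re

# One or more ASCII digits, a hyphen, then anything.
ASCII_PATTERN = re.compile(r"[0-9]+-.*")
# Same shape, but \d also accepts non-ASCII decimal digits for str patterns.
UNICODE_PATTERN = re.compile(r"\d+-.*")

def is_valid(value: str, pattern: re.Pattern = ASCII_PATTERN) -> bool:
    # fullmatch plays the role of the ^...$ anchors.
    return pattern.fullmatch(value) is not None

arabic_indic = "\u0661\u0662-class"  # Arabic-Indic digits 1 and 2, then "-class"

print(is_valid("1-class"))                      # ASCII digits before the hyphen
print(is_valid("class-1"))                      # no leading digits
print(is_valid(arabic_indic))                   # checked against [0-9]
print(is_valid(arabic_indic, UNICODE_PATTERN))  # checked against \d
```

If only ASCII digits should be accepted, the explicit [0-9] class is the safer choice in any flavor where \d is Unicode-aware.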

Related

R: How to use stringr to extract the substring as the output to mutate a column of strings that begins with a string pattern and end with a number?

I'm creating a small example to be put into mutate(). Not sure why this doesn't work.
> str_extract("rs1234-<b>C</b>","^rs*\\d$")
[1] NA
It'd be great if you could point out my misunderstanding of the language instead of merely providing a solution. I expect to get "rs1234".
The ^rs*\d$ regex matches
^ - start of string
rs* - r and zero or more occurrences of s char
\d - a digit
$ - end of string.
So, your pattern matches strings like rsssss1, r3, etc.
You need
str_extract("rs1234-<b>C</b>", "^rs\\d+")
where ^rs\d+ matches rs at the start of the string and then one or more digits.
But what if I just want the substring between "rs" and the last number? What should I do?
You would use rs.*\d:
str_extract("rs1234-<b>C</b>", "rs.*\\d")
where rs.*\d matches rs, then any zero or more characters other than line break characters, as many as possible, and then a digit.
NOTE: If the match must also span line breaks, prepend the (?s) inline DOTALL modifier to the last pattern.
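The three patterns discussed above can be compared side by side. This is a minimal sketch in Python's re module rather than stringr (an assumption made for illustration; the quantifier and anchor semantics used here are the same in both engines).

```python
import re

s = "rs1234-<b>C</b>"

# Original attempt: "r", zero or more "s", exactly one digit, then end of string.
print(re.search(r"^rs*\d$", s))         # None: the input does not have that shape
print(re.search(r"^rs*\d$", "rsss1"))   # the kind of string it does match

# "rs" at the start followed by one or more digits.
first = re.search(r"^rs\d+", s)
print(first.group())

# "rs", then as much as possible, ending at the last digit on the line.
last = re.search(r"rs.*\d", s)
print(last.group())
```

For this input both working patterns return rs1234. They differ once more digits appear later in the string: the greedy rs.*\d then extends up to the last digit.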

REGEX pattern match in R for Course number

I need to identify course numbers that have the form xx.3xxxxxx.
These are some examples of the course numbers.
26.3730004
27.0210000
26.3730009
26.7114001
23.9610071
26.0A34430
23.3670005
26.0B05430
I tried many patterns; one example is the pattern below. It did not match anything.
"[^0-9]{2}\Q.\E3[^0-9]+$"
I tried using grep and grepl. I actually need the code to return indexes.
This code shows my attempt to tag the rows that have matches.
Teacher$virtual[which(grepl("[^0-9]{2}\\Q.\\E3[^0-9]+$", Teacher$CourseNumber))] <- "1"
I need to remove any row from my data frame whose course number has that pattern (XX.3XXXXXX).
But my code did not find any match. Can you please help me?
You should use
grepl("^[0-9]{2}\\.3", Teacher$CourseNumber)
Details:
^ - start of a string
[0-9]{2} - two digits
\\. - a dot (a regex escape starts with a literal backslash, but inside a string literal, "...", a single backslash starts a string escape sequence, so the backslash must be doubled to produce the literal backslash the regex escape needs)
3 - a 3 char.
NOTE: If you want to use in-pattern quoting with \Q and \E (in between which all chars are treated literally) you need to use PCRE regex, add perl=TRUE and use
grepl("^[0-9]{2}\\Q.\\E3", Teacher$CourseNumber, perl=TRUE)
Now, the dot is treated as a literal dot, not a . metacharacter that matches any char but a line break char (in a PCRE regex, . does not match line break chars by default).
Alternatively, this simple expression would likely cover it:
^[0-9]{2}\.[3].+$
Here [3] is a character class that matches only a 3 right after the dot. It would probably work without the start and end anchors as well:
[0-9]{2}\.[3].+
although without ^ it can also match in the middle of a longer string. The pattern can be tightened or loosened if necessary.
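To see which of the sample course numbers the suggested pattern selects, and why the original attempt selects none, here is a minimal sketch in Python (an assumption; the thread itself uses R). It writes the literal dot as \. because Python's re module has no \Q...\E quoting, and the variable names are illustrative.

```python
import re

course_numbers = [
    "26.3730004", "27.0210000", "26.3730009", "26.7114001",
    "23.9610071", "26.0A34430", "23.3670005", "26.0B05430",
]

# Two digits, a literal dot, then a 3 (re.match anchors at the start).
wanted = re.compile(r"[0-9]{2}\.3")
# The shape of the original attempt: [^0-9] means "anything but a digit".
attempt = re.compile(r"[^0-9]{2}\.3[^0-9]+$")

matching_idx = [i for i, c in enumerate(course_numbers) if wanted.match(c)]
kept = [c for c in course_numbers if not wanted.match(c)]
attempt_hits = [c for c in course_numbers if attempt.search(c)]

print(matching_idx)   # positions of the xx.3xxxxxx entries (0-based here)
print(kept)           # the list with those entries removed
print(attempt_hits)   # empty: the negated classes never match the digits
```

Note that the indexes are 0-based in Python, whereas which() in R returns 1-based positions.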

Removing string before alphabets

I want to remove the characters that come before the first letter of a string.
For example consider
"--Test-T1" , "---Test-T2" , "----Test-T3" .
In the above strings I want to remove the hyphens before the letters start and keep the hyphens after that. I tried substring, remove LastIndexOf()+1, Regex.Replace, but none of them worked. Please guide me on how to achieve this.
If you just want to remove the - symbols from the start of a string, a simple
var res = s.TrimStart('-');
will do. If you need to make sure there are letters after the leading hyphens, use a regex:
var res = Regex.Replace(s, @"^-+(?=[a-zA-Z])", "");
Here,
^ - asserts the position at the string start
-+ - matches and consumes 1+ hyphens
(?=[a-zA-Z]) - the positive lookahead checks if there is an ASCII letter after the hyphens (but does not consume the letter, it is not part of the match value).
Alternatively, use a capturing group based version:
var res = Regex.Replace(s, @"^-+([a-zA-Z])", "$1");
Here, the letter is consumed, but since we've captured it, $1 restores it in the result.
Note that [a-zA-Z] can be replaced with \p{L} to match any Unicode letter.
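The difference between plain trimming and the two regex variants shows up when something other than a letter follows the hyphens. The sketch below uses Python (an assumption; the answer is in C#), where str.lstrip stands in for TrimStart and \1 for $1. The function names and the fourth sample string are hypothetical.

```python
import re

def strip_leading_hyphens(s: str) -> str:
    # Lookahead variant: remove leading hyphens only when an ASCII letter follows.
    return re.sub(r"^-+(?=[a-zA-Z])", "", s)

def strip_leading_hyphens_group(s: str) -> str:
    # Capturing-group variant: the letter is consumed, then restored via \1.
    return re.sub(r"^-+([a-zA-Z])", r"\1", s)

for sample in ["--Test-T1", "---Test-T2", "----Test-T3", "--123-T4"]:
    print(sample.lstrip("-"),
          strip_leading_hyphens(sample),
          strip_leading_hyphens_group(sample))
```

For the first three samples all three approaches agree. For "--123-T4" only the unconditional trim removes the hyphens, because a digit rather than a letter follows them.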

Regular expression to allow dash sign

I currently need to allow a "-" sign in this regular expression ^[a-zA-Z0-9]*$.
You could use a regex like this:
^[a-z\d]+[-a-z\d]+[a-z\d]+$
The idea is to use the case-insensitive flag so that a-z suffices instead of A-Za-z, and to use \d as the shorthand for 0-9.
Basically, the regex is composed of three parts:
^[a-z\d]+ ---> Start with alphanumeric characters
[-a-z\d]+ ---> can continue with alphanumeric characters or dashes
[a-z\d]+$ ---> End with alphanumeric characters
Simply add it as the first character after the opening bracket: ^[-a-zA-Z0-9]*$
Or, to match one or more of letters/numbers with a dash in between: ^[a-zA-Z0-9]+-[a-zA-Z0-9]+$
A hyphen can be included immediately after the opening bracket [ or immediately before the closing bracket ] of the character class. Do not place it unescaped in the middle of the character class: there it is treated as a range operator between the neighboring characters, and some regex engines may reject the pattern.
In your case both are valid solutions:
(^[-a-zA-Z0-9]*$) - hyphen at the start of the character class
(^[a-zA-Z0-9-]*$) - hyphen at the end of the character class
Demo:
http://regex101.com/r/yP3sH7/2
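The suggested patterns accept different sets of strings, which a small table makes visible. This is a sketch in Python's re module (an assumption; the thread does not name a language), with re.IGNORECASE standing in for the case-insensitive flag and a hypothetical accepts helper.

```python
import re

# A hyphen placed first or last in a character class is a literal hyphen.
hyphen_first = re.compile(r"[-a-zA-Z0-9]*")
hyphen_last = re.compile(r"[a-zA-Z0-9-]*")
# Exactly one hyphen between two alphanumeric runs.
one_dash = re.compile(r"[a-zA-Z0-9]+-[a-zA-Z0-9]+")
# Three-part variant: alphanumeric at both ends, hyphens allowed in between.
three_part = re.compile(r"[a-z\d]+[-a-z\d]+[a-z\d]+", re.IGNORECASE)

def accepts(pattern: re.Pattern, value: str) -> bool:
    # fullmatch plays the role of the ^...$ anchors.
    return pattern.fullmatch(value) is not None

for value in ["abc-123", "-abc", "a-b-c", "ab", ""]:
    print(repr(value),
          accepts(hyphen_first, value), accepts(hyphen_last, value),
          accepts(one_dash, value), accepts(three_part, value))
```

The two character-class variants behave identically and also accept a leading hyphen and the empty string. The three-part pattern rejects leading or trailing hyphens, but because each part needs at least one character it also rejects inputs shorter than three characters, such as "ab".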

Regular Expression to match only letters

I need to write a regular expression for the RegularExpressionValidator ASP.NET Web control.
The regular expression should ALLOW all alphabetic characters but not numbers or special characters (example: |!"£$%&/().
Any idea how to do it?
^[A-Za-z]+$
validates a string of length 1 or greater, consisting only of ASCII letters.
^[^\W\d_]+$
does the same for international letters, too.
Explanation:
[^ # match any character that is NOT a
\W # non-word character (anything other than a letter, digit, or underscore)
\d # digit
_ # or underscore
] # end of character class
Effectively, you get \w minus (\d and _).
Or, you could use the fact that ASP.NET supports Unicode properties:
^\p{L}+$
validates a string of Unicode letters of length 1 or more.
Including spaces:
"^[a-zA-Z ]*$"
Excluding Spaces:
"^[a-zA-Z]*$"
To require at least one character, change the * to a +.
You can use the regex:
^[a-zA-Z]+$
Explanation:
^ : Start anchor
[..] : Char class
+ : one or more repetitions
$ : End anchor
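The ASCII-only and Unicode-aware patterns can be compared on a few inputs. The sketch below uses Python's re module as a stand-in (an assumption; the question targets ASP.NET). Python's re has no \p{L}, so [^\W\d_] approximates "any letter" here, and the helper name is_match is hypothetical.

```python
import re

ascii_letters = re.compile(r"[A-Za-z]+")
# \w minus digits and underscore; for str patterns \w is Unicode-aware.
any_letters = re.compile(r"[^\W\d_]+")
letters_and_spaces = re.compile(r"[a-zA-Z ]*")

def is_match(pattern: re.Pattern, value: str) -> bool:
    # fullmatch plays the role of the ^...$ anchors.
    return pattern.fullmatch(value) is not None

for value in ["Hello", "H\u00e9llo", "abc1", "a_b", "Hello World", ""]:
    print(repr(value),
          is_match(ascii_letters, value),
          is_match(any_letters, value),
          is_match(letters_and_spaces, value))
```

Only the Unicode-aware class accepts the accented sample, none of the patterns accept digits or underscores, and the * variant also accepts the empty string.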
