not only numbers regular expression - asp.net

I need a regular expression that allows letters (English and Arabic) and numbers, but not numbers only. It should also allow punctuation, spaces and multiple lines (\n). When I searched, I found this one:
(?!^\d+$)^.+$
but it doesn't allow multiple lines.
I tried to write my own:
(([a-zA-Zء-ي\s:-])|([0-9]+[a-zA-Zء-ي\s:-])|([a-zA-Zء-ي\s:-]+[0-9]$))*
It has two problems:
1. It doesn't accept a number at the end of the string, as in employer9, although employer9 followed by a space works fine.
2. I have to list every punctuation mark that is allowed. Is there an easier way to do this?

You can probably use DOTALL or the s modifier for the regex. You could do this with:
(?s)(?!^\d+$)^.+$
...or you could set the equivalent option when constructing the regex (in .NET, RegexOptions.Singleline).
An alternative not using DOTALL would be:
(?!^\d+$)^[\s\S]+$
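As a rough check of the [\s\S] variant, here is a minimal sketch in Python rather than ASP.NET. It assumes the two engines agree on this pattern, and the helper name is_valid is made up for illustration:

```python
import re

# Pattern copied from the answer: reject digit-only input, accept anything else,
# including newlines, because [\s\S] matches every character.
NOT_ONLY_DIGITS = re.compile(r'(?!^\d+$)^[\s\S]+$')

def is_valid(text: str) -> bool:
    """Return True if text is non-empty and is not made up solely of digits."""
    return NOT_ONLY_DIGITS.match(text) is not None

print(is_valid("employer9"))      # letters followed by a digit
print(is_valid("12345"))          # digits only
print(is_valid("line one\n42"))   # multi-line input
print(is_valid("موظف 9"))         # Arabic letters, a space and a digit
```

Only the digit-only string should be rejected. No punctuation has to be listed, which also answers the second point of the question.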

Related

Regular Expressions, containing specific words

I need to specify that my user name has to start with one of two words, followed by a backslash. It can only accept the words cat or dog at the beginning, and then a backslash is expected, for example:
cat\something
dog\something
Both are accepted. What's the regular expression for this? I've tried some regular expressions but I haven't figured it out yet.
The solution is:
^cat\\.*|^dog\\.*
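A quick way to try this out, sketched in Python. The grouped form ^(?:cat|dog)\\.* is my own rewrite of the same pattern, and is_allowed is a made-up helper name:

```python
import re

# Equivalent to ^cat\\.*|^dog\\.* with the shared parts factored out.
# In a raw string, \\ is the regex for one literal backslash.
USERNAME = re.compile(r'^(?:cat|dog)\\.*')

def is_allowed(name: str) -> bool:
    """Return True if name starts with cat\\ or dog\\ (one literal backslash)."""
    return USERNAME.match(name) is not None

print(is_allowed(r'cat\something'))   # True
print(is_allowed(r'dog\something'))   # True
print(is_allowed('catalog'))          # False: no backslash after cat
print(is_allowed(r'bird\something'))  # False: wrong prefix
```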
https://regex101.com/ is a great tool for testing and evaluating regular expressions.

Extract mm/dd/yyyy and m/dd/yyyy dates from string in R [duplicate]

My regex pattern looks something like
<xxxx location="file path/level1/level2" xxxx some="xxx">
I am only interested in the part in quotes assigned to location. Shouldn't it be as easy as below without the greedy switch?
/.*location="(.*)".*/
Does not seem to work.
You need to make your regular expression lazy/non-greedy, because by default, "(.*)" will match all of file path/level1/level2" xxxx some="xxx, i.e. everything up to the last quote on the line.
Instead you can make your dot-star non-greedy, which will make it match as few characters as possible:
/location="(.*?)"/
Adding a ? on a quantifier (?, * or +) makes it non-greedy.
Note: this is only available in regex engines which implement the Perl 5 extensions (Java, Ruby, Python, etc) but not in "traditional" regex engines (including Awk, sed, grep without -P, etc.).
location="(.*)" will match from the " after location= until the " after some="xxx unless you make it non-greedy.
So you either need .*? (i.e. make it non-greedy by adding ?) or better replace .* with [^"]*.
[^"] Matches any character except for a " <quotation-mark>
More generic: [^abc] - Matches any character except for an a, b or c
How about
.*location="([^"]*)".*
This avoids the unlimited search with .* and will match exactly to the first quote.
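To see the three variants side by side, here is a small sketch in Python (the question doesn't say which engine is in use, so that choice is an assumption):

```python
import re

tag = '<xxxx location="file path/level1/level2" xxxx some="xxx">'

# Greedy: .* runs to the end of the line and backtracks to the LAST quote.
greedy = re.search(r'location="(.*)"', tag).group(1)

# Lazy: .*? grows one character at a time and stops at the FIRST quote.
lazy = re.search(r'location="(.*?)"', tag).group(1)

# Negated class: [^"]* can never cross a quote, so no backtracking is needed.
negated = re.search(r'location="([^"]*)"', tag).group(1)

print(greedy)   # file path/level1/level2" xxxx some="xxx
print(lazy)     # file path/level1/level2
print(negated)  # file path/level1/level2
```

The negated class also works in engines that have no lazy quantifiers.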
Use non-greedy matching, if your engine supports it. Add the ? inside the capture.
/location="(.*?)"/
Using a lazy quantifier (?) without the global flag is the answer.
With the global flag /g, it would instead return every shortest match in the input.
Here's another way, using [\s\S]*?, which is lazy. This is the one you want.
The first item:
[\s\S]*?(?:location="([^"]*)")[\s\S]*
Replace with: $1
Explanation: https://regex101.com/r/ZcqcUm/2
For completeness, this gets the last one. Here [\s\S]* is greedy.
The last item:
[\s\S]*(?:location="([^"]*)")[\s\S]*
Replace with: $1
Explanation: https://regex101.com/r/LXSPDp/3
There's only 1 difference between these two regular expressions and that is the ?
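Here is a rough sketch of that first/last difference in Python, with the value captured in group 1. The two-tag sample text is made up, and Python writes the replacement as \1 rather than $1:

```python
import re

# Hypothetical input with two location attributes on separate lines.
text = '<a location="first/path">\n<b location="second/path">'

# Lazy prefix: consumes as little as possible, so the FIRST location is captured.
first = re.sub(r'[\s\S]*?(?:location="([^"]*)")[\s\S]*', r'\1', text)

# Greedy prefix: consumes as much as possible, so the LAST location is captured.
last = re.sub(r'[\s\S]*(?:location="([^"]*)")[\s\S]*', r'\1', text)

print(first)  # first/path
print(last)   # second/path
```

In both cases the pattern swallows the whole input, so the replacement leaves only the captured value.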
The other answers here fail to spell out a full solution for regex versions which don't support non-greedy matching. The non-greedy quantifiers (.*?, .+? etc) are a Perl 5 extension which isn't supported in traditional regular expressions.
If your stopping condition is a single character, the solution is easy; instead of
a(.*?)b
you can match
a[^ab]*b
i.e. specify a character class which excludes the starting and ending delimiters.
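A minimal sketch of the single-character case, in Python only so the lazy form is there to compare against. The sample text is made up, and the two forms are not guaranteed to agree on every input (for example when another a appears before the first b):

```python
import re

text = 'xx a123b yy b'

# Lazy quantifier (Perl 5 extension): stops at the first b.
lazy = re.search(r'a(.*?)b', text).group(0)

# Traditional equivalent for this input: the class cannot cross a or b.
negated = re.search(r'a[^ab]*b', text).group(0)

# Plain greedy form for contrast: runs on to the last b.
greedy = re.search(r'a(.*)b', text).group(0)

print(lazy)     # a123b
print(negated)  # a123b
print(greedy)   # a123b yy b
```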
In the more general case, you can painstakingly construct an expression like
start(|[^e]|e(|[^n]|n(|[^d])))end
to capture a match between start and the first occurrence of end. Notice how the subexpression with nested parentheses spells out a number of alternatives which between them allow e only if it isn't followed by nd and so forth, and also take care to cover the empty string as one alternative which doesn't match whatever is disallowed at that particular point.
Of course, the correct approach in most cases is to use a proper parser for the format you are trying to parse, but sometimes, maybe one isn't available, or maybe the specialized tool you are using is insisting on a regular expression and nothing else.
Because you are using a quantified subpattern, and as described in the Perl documentation,
By default, a quantified subpattern is "greedy", that is, it will
match as many times as possible (given a particular starting location)
while still allowing the rest of the pattern to match. If you want it
to match the minimum number of times possible, follow the quantifier
with a "?" . Note that the meanings don't change, just the
"greediness":
*? //Match 0 or more times, not greedily (minimum matches)
+? //Match 1 or more times, not greedily
Thus, to make your quantified pattern match as little as possible, follow the quantifier with a ?, like this:
/location="(.*?)"/
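import regex  # third-party module; the standard library's re accepts this pattern too

text = 'ask her to call Mary back when she comes back'
# (?i) ignores case, (?s) lets . match newlines, .*? stops at the first "back"
p = r'(?i)(?s)call(.*?)back'
for match in regex.finditer(p, text):
    print(match.group(1))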
Output:
Mary

How to write a regex OR statement within strapply in R

I have been using strapplyc in R to select different portions of a string that match one particular set of criteria. These have worked successfully until I found a portion of the string where the required portion could be defined one of two ways.
Here is an example of the string which is liberally sprinkled with \t:
\t\t\tsome words here\t\t\tDefect: some more words here Action: more words
I can write the strapply statement to capture the text between Defect: and the start of Action:
strapplyc(record[i], "Defect:(.*?)Action")
This works and selects the chosen text between Defect: and Action. In some cases there is no action section to the string and I've used the following code to capture these cases.
strapplyc(record[i], "Defect:(.*?)$")
What I have been trying to do is capture the text that either ends with Action, or with the end of the string (using $).
This is the bit that keeps failing. It returns nothing for either option. Here is my failing code:
strapplyc(record[i], "Defect:(.*?)Action|$")
Any idea where I'm going wrong, or a better solution would be much appreciated.
If you are up for a more efficient solution, you could drop the .*? matching and unroll your pattern like:
Defect:((?:[^A]+|A(?!ction))*)
This matches Defect: followed by any number of characters that are either not an A, or are an A not followed by ction. This avoids the step-by-step expansion that lazy dot matching requires. It works in both cases, as it stops matching when it hits Action or the end of your string.
As suggested by Wiktor, you can also use
Defect:([^A]*(?:A(?!ction)[^A]*)*)
Which is a little bit faster when there are many As in the string.
You might want to consider using A(?!ction:) or A(?!ction\s*:) to avoid false early matches.
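The two unrolled patterns can be checked like this. It's a sketch in Python rather than R, on the assumption that the lookahead behaves the same in the engine you pass to strapplyc; the second sample string is made up:

```python
import re

with_action = '\t\t\tsome words here\t\t\tDefect: some more words here Action: more words'
without_action = 'Defect: An Actual fault with no follow-up'  # hypothetical record

# First unrolled form: alternate between runs of non-A and a lone A.
simple = re.compile(r'Defect:((?:[^A]+|A(?!ction))*)')

# Second form: a run of non-A, then repeatedly "A not followed by ction" plus another run.
unrolled = re.compile(r'Defect:([^A]*(?:A(?!ction)[^A]*)*)')

for record in (with_action, without_action):
    a = simple.search(record).group(1)
    b = unrolled.search(record).group(1)
    print(repr(a), a == b)
```

Both forms stop right before Action when it is present and run to the end of the string when it is not. A capital A that does not start Action is kept.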
The alternation operator | is the regex operator with the lowest precedence. That means the regex Defect:(.*?)Action|$ is actually a combination of Defect:(.*?)Action and $ - since an empty string is a valid match for $, your regex returns the empty string.
To solve that, you should combine the regexes Defect:(.*?)Action and Defect:(.*?)$ with an OR:
Defect:(.*?)Action|Defect:(.*?)$
Or you can enclose Action|$ in a group as Sebastian Proske said in the comments:
Defect:(.*?)(?:Action|$)
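To illustrate the precedence problem, here is a sketch in Python rather than R. The sample strings are shortened from the question, and the names broken and fixed are made up:

```python
import re

with_action = 'Defect: some more words here Action: more words'
without_action = 'Defect: some more words here'

# | has the lowest precedence: this reads as (Defect:(.*?)Action) OR ($).
broken = re.compile(r'Defect:(.*?)Action|$')

# Grouping limits the alternation to the terminator only.
fixed = re.compile(r'Defect:(.*?)(?:Action|$)')

m = broken.search(without_action)
print(repr(m.group(0)), m.group(1))  # empty match at the end; group 1 is None

print(repr(fixed.search(with_action).group(1)))
print(repr(fixed.search(without_action).group(1)))
```

With the grouped version, group 1 holds the defect text whether or not an Action section follows.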

regular expression to allow only specific numbers

I am looking to have a regular expression that allows only specific numbers to be entered, e.g. 2,4,5,6,10,18
I tried something like
"'2'|'4'|'5'|'6'|'10'|'18'"
and anything that I typed failed the regex and then the computer pointed its finger at me and laughed.
Where am I going wrong?
The single quotes are unnecessary. The regex you are looking for is: ^(2|4|5|6|10|18)$.
The symbols ^ and $ denote the start and the end of the line, to prevent 121 from matching (since it contains 2).
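A small sketch of the anchored alternation in Python (the question doesn't name an engine, so that is an assumption, and is_allowed is a made-up helper):

```python
import re

# Anchors force the WHOLE input to be one of the listed numbers.
ALLOWED = re.compile(r'^(2|4|5|6|10|18)$')

def is_allowed(value: str) -> bool:
    """Return True only for the exact strings 2, 4, 5, 6, 10 or 18."""
    return ALLOWED.match(value) is not None

print(is_allowed('10'))    # True
print(is_allowed('121'))   # False: contains a 2 but is not one of the numbers
print(is_allowed("'2'"))   # False: the quotes are literal characters
```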

Regular Expression for username and password?

I am trying to use a regular expression for a name field in an ASP.NET application.
Condition: the name should be at least 6 characters long.
I tried the following
"^(?=.*\d).{6}$"
I'm completely new to regex. Can anyone suggest what the regex should be for this condition?
You could use this to match any alphanumeric character in length of 6 or more: ^[a-zA-Z0-9]{6,}$. You can tweak it to allow other characters or go the other route and just put in exclusions. The Regex Coach is a great environment for testing/playing with regular expressions (I wrote a blog post with some links to other tools too).
Look at the Expression library and choose a user name and/or password regex that suits you. You can also test your regex in online regex testers like RegexPlanet.
My regex suggestions are:
^[a-zA-Z][a-zA-Z0-9._\-]{5,}$
This regex accepts user names with minimum 6 characters, starting with a letter and containing only letters, numbers and ".","-","_" characters.
Next one:
^[a-zA-Z0-9._\-]{6,}$
Similar to above, but accepts ".", "-", "_" and 0-9 to be first characters too.
If you want to validate only string length (minimum 6 characters), this simple regex below will be enough:
^.{6,}$
What about
^.{6,}$
What's all the stuff at the start of yours, and did you want to limit yourself to digits?
NRegex is a nice site for testing out regexes.
To just match 6 characters, ".{6}" is enough
In its simplest form, you can use the following:
.{6,}
This will match on 6 or more characters and fail on anything less. This will accept ANY character - unicode, ascii, whatever you are running through. If you have more requirements (i.e. only the latin alphabet, must contain a number, etc), the regex would obviously have to change.
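To compare the question's attempt with the suggestions, here is a rough sketch in Python rather than ASP.NET, assuming the engines agree on these patterns. The variable names and sample inputs are made up:

```python
import re

# The attempt from the question: EXACTLY 6 characters, at least one of them a digit.
exactly_six_with_digit = re.compile(r'^(?=.*\d).{6}$')

# Suggested: starts with a letter, 6 or more characters from a limited set.
letter_first = re.compile(r'^[a-zA-Z][a-zA-Z0-9._\-]{5,}$')

# Simplest: any 6 or more characters.
any_six_or_more = re.compile(r'^.{6,}$')

for name in ('abc123', 'abcdef', 'abc1234', 'john_doe', '9lives', 'short'):
    print(name,
          bool(exactly_six_with_digit.match(name)),
          bool(letter_first.match(name)),
          bool(any_six_or_more.match(name)))
```

The first column shows why the original attempt rejects valid names: it requires a digit and caps the length at 6.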
