I need to specify that my user name has to start with one of two words, followed by a backslash. It can only accept the words cat or dog at the beginning, and then a backslash is expected, for example:
cat\something
dog\something
Both are accepted. What's the regular expression for this? I've tried some regular expressions but I haven't figured it out yet.
The solution is:
^cat\\.*|^dog\\.*
https://regex101.com/ is a great tool for testing and evaluating regular expressions.
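To try the same idea outside regex101, here is a minimal sketch in Python's re module. The choice of Python, the helper name is_valid and the sample names are assumptions of this example; the alternation is grouped as (?:cat|dog) so the anchor and the backslash are written only once, which accepts the same names as the pattern above.

```python
import re

# Either prefix, then one literal backslash, then anything.
USERNAME = re.compile(r'^(?:cat|dog)\\.*')

def is_valid(name: str) -> bool:
    """Return True if the name starts with one of the two allowed prefixes."""
    return USERNAME.match(name) is not None

for name in [r'cat\something', r'dog\something', r'bird\something', 'catsomething']:
    print(name, is_valid(name))
```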
My regex pattern looks something like
<xxxx location="file path/level1/level2" xxxx some="xxx">
I am only interested in the part in quotes assigned to location. Shouldn't it be as easy as below without the greedy switch?
/.*location="(.*)".*/
Does not seem to work.
You need to make your regular expression lazy/non-greedy, because by default, (.*) will capture everything up to the last quote on the line: file path/level1/level2" xxxx some="xxx
Instead you can make your dot-star non-greedy, which will make it match as few characters as possible:
/location="(.*?)"/
Adding a ? on a quantifier (?, * or +) makes it non-greedy.
Note: this is only available in regex engines which implement the Perl 5 extensions (Java, Ruby, Python, etc) but not in "traditional" regex engines (including Awk, sed, grep without -P, etc.).
location="(.*)" will match from the " after location= until the " after some="xxx unless you make it non-greedy.
So you either need .*? (i.e. make it non-greedy by adding ?) or better replace .* with [^"]*.
[^"] Matches any character except for a " <quotation-mark>
More generic: [^abc] - Matches any character except for an a, b or c
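The difference between the three forms is easy to see side by side. The following is a small sketch using Python's re module (an assumption; the question does not name an engine) on the sample tag from the question.

```python
import re

tag = '<xxxx location="file path/level1/level2" xxxx some="xxx">'

# Greedy: .* runs to the end of the line, then backs up to the LAST quote.
greedy = re.search(r'location="(.*)"', tag).group(1)

# Lazy: .*? stops at the FIRST quote that lets the match succeed.
lazy = re.search(r'location="(.*?)"', tag).group(1)

# Negated class: [^"]* can never cross a quote, so no backtracking is needed.
negated = re.search(r'location="([^"]*)"', tag).group(1)

print(greedy)
print(lazy)
print(negated)
```

The negated class also works in engines that have no lazy quantifiers.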
How about
.*location="([^"]*)".*
This avoids the unlimited search with .* and will match exactly to the first quote.
Use non-greedy matching, if your engine supports it. Add the ? inside the capture.
/location="(.*?)"/
Using a lazy quantifier (?) with no global flag is the answer: it returns only the first, shortest match.
If you had the global flag /g, it would instead have returned every shortest match in the input.
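Python has no /g flag, but the same distinction shows up as re.search (first match only) versus re.findall (every match). A rough sketch, where the two-attribute sample text is made up for illustration:

```python
import re

# Hypothetical input with two location attributes.
text = '<a location="one"> <b location="two">'
pattern = r'location="(.*?)"'

# Without a "global" search: only the first, shortest match.
first_only = re.search(pattern, text).group(1)

# The equivalent of /g: every shortest match, left to right.
every_match = re.findall(pattern, text)

print(first_only)
print(every_match)
```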
Here's another way.
Here's the one you want. This is lazy: [\s\S]*?
The first item:
[\s\S]*?(?:location="([^"]*)")[\s\S]*
Replace with: $1
Explanation: https://regex101.com/r/ZcqcUm/2
For completeness, this gets the last one. This is greedy: [\s\S]*
The last item:
[\s\S]*(?:location="([^"]*)")[\s\S]*
Replace with: $1
Explanation: https://regex101.com/r/LXSPDp/3
There's only 1 difference between these two regular expressions and that is the ?
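Here is a hedged sketch of the same pair in Python, assuming both patterns carry the capture group ([^"]*) so that the replacement has something to refer to. Python writes the replacement as \1 rather than $1, and the two-line sample text is invented for the example.

```python
import re

# Hypothetical input containing two location attributes on separate lines.
text = 'x location="first" y\nz location="last" w'

# Lazy prefix: consume as little as possible before the attribute.
first_item = re.sub(r'[\s\S]*?(?:location="([^"]*)")[\s\S]*', r'\1', text, count=1)

# Greedy prefix: consume as much as possible, so the LAST attribute wins.
last_item = re.sub(r'[\s\S]*(?:location="([^"]*)")[\s\S]*', r'\1', text, count=1)

print(first_item)
print(last_item)
```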
The other answers here fail to spell out a full solution for regex versions which don't support non-greedy matching. The non-greedy quantifiers (.*?, .+? etc) are a Perl 5 extension which isn't supported in traditional regular expressions.
If your stopping condition is a single character, the solution is easy; instead of
a(.*?)b
you can match
a[^ab]*b
i.e. specify a character class which excludes the starting and ending delimiters.
In the more general case, you can painstakingly construct an expression like
start([^e]|e(e|ne)*([^en]|n[^de]))*(e(e|ne)*n?)?end
to capture a match between start and the first occurrence of end. Notice how the repeated subexpression spells out a number of alternatives which between them allow e only if it isn't followed by nd: after an e (and any further e or ne), the next characters must rule out end. The optional group just before end covers text which finishes with a partial e or en, and because every group may match nothing, the empty string between start and end is covered too.
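A hand-built expression of this kind is easy to get subtly wrong, so it can help to check it against the lazy form in an engine that supports both. The sketch below uses Python's re module; the lazy-free pattern in it was derived by hand from a three-state automaton ("nothing seen", "seen e", "seen en") and, like the sample strings and the helper between, is this example's own illustration rather than a canonical form.

```python
import re

# Reference behaviour: lazy match up to the first "end".
LAZY = re.compile(r'start(.*?)end', re.S)

# Lazy-free variant: the captured text may never contain "end".
#   [^e]                          a character that cannot begin "end"
#   e(e|ne)*([^en]|n[^de])        an e (or en) run that falls back to "nothing seen"
#   (e(e|ne)*n?)?                 an unfinished e / en right before the real "end"
NO_LAZY = re.compile(
    r'start((?:[^e]|e(?:e|ne)*(?:[^en]|n[^de]))*(?:e(?:e|ne)*n?)?)end'
)

def between(pattern, s):
    """Return the text captured between start and end, or None."""
    m = pattern.search(s)
    return m.group(1) if m else None

samples = ['start middle end', 'start eend end', 'startenend',
           'start ended end', 'startend', 'start en d end']
for s in samples:
    print(repr(s), repr(between(LAZY, s)), repr(between(NO_LAZY, s)))
```

Agreement on a handful of samples is not a proof, but a disagreement would immediately flag a mistake in the construction.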
Of course, the correct approach in most cases is to use a proper parser for the format you are trying to parse, but sometimes, maybe one isn't available, or maybe the specialized tool you are using is insisting on a regular expression and nothing else.
Because you are using a quantified subpattern, and as described in the Perl documentation,
By default, a quantified subpattern is "greedy", that is, it will
match as many times as possible (given a particular starting location)
while still allowing the rest of the pattern to match. If you want it
to match the minimum number of times possible, follow the quantifier
with a "?" . Note that the meanings don't change, just the
"greediness":
*? //Match 0 or more times, not greedily (minimum matches)
+? //Match 1 or more times, not greedily
Thus, to allow your quantified pattern to make minimum match, follow it by ? :
/location="(.*?)"/
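import re  # the standard-library module is enough for this pattern
text = 'ask her to call Mary back when she comes back'
# (?i) ignores case, (?s) lets . match newlines; .*? stops at the first "back"
p = r'(?i)(?s)call(.*?)back'
for match in re.finditer(p, text):
    print(match.group(1).strip())  # strip the spaces around the captured name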
Output:
Mary
I have a file "file7.txt"
The contents of file7.txt,
I want to know how back references work with the grep command.
When I type the commands, I get the following results,
So now I want to know how this works.
I'll explain the first grep that you tried.
grep '\([a-z]\)\1'
This matches any line in which a lowercase letter is immediately followed by the same letter.
f i l l i n g
Grep first matches a single character in the range 'a' to 'z'.
Starting at the beginning of the line, it tries every character in turn, one by one.
\1 holds the captured character; the pattern requires the next character to be that same character.
You need a way of remembering what you found, and seeing if the same pattern occurred again.
You can mark part of a pattern using "(" and ")".
You can recall the remembered pattern with "\" followed by a single digit.
You can have 9 different remembered patterns.
This lets one short pattern stand in for a long list of alternatives, which saves you time.
This is how all the back references work.
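The same back reference can be tried in Python as a quick sketch. Note the syntax difference: grep's basic regular expressions mark the group with \( and \), while Python's re uses plain parentheses. The word list is made up for illustration.

```python
import re

# ([a-z]) remembers one lowercase letter; \1 requires the same letter again.
DOUBLED = re.compile(r'([a-z])\1')

words = ['filling', 'file', 'book', 'abc']
for word in words:
    m = DOUBLED.search(word)
    # group(0) is the whole match (e.g. the doubled letter pair), or None if no pair exists
    print(word, m.group(0) if m else None)
```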
My current regex is only picking up part of my string. It creates a match as soon as one is found, even though I need the longer version of that match to hit. For example, I am creating matches for both:
SSS111
and
SSS111-L
The first SSS111 matches fine with my current regex, but the SSS111-L is only getting matched to the SSS111, leaving the -L out.
How can I create a greedy regex to read the whole line before matching? I am currently using
[-A-Z0-9]{3,12}
to capture the numbers and letters, but have not had any luck outside of this.
Regex quantifiers are greedy by default, and that is usually what causes problems.
Here I think you only have to escape the '-':
#"[\-A-Z0-9]{3,12}"
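As a quick check that a greedy quantifier does take the longer code, here is a sketch in Python's re module (an assumption; the thread does not say which engine is in use, and other engines may treat the hyphen differently). In Python a hyphen placed first in a character class is already literal, so the escaped and unescaped forms behave the same.

```python
import re

# Hyphen first in the class is literal; {3,12} takes a comma, not a dash.
CODE = re.compile(r'[-A-Z0-9]{3,12}')
CODE_ESCAPED = re.compile(r'[\-A-Z0-9]{3,12}')

line = 'SSS111 and SSS111-L'
print(CODE.findall(line))
print(CODE_ESCAPED.findall(line))
```

If the longer code is still cut short in your tool, the cause is likely elsewhere, for example an earlier alternative matching first.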
I am looking to have a regular expression that allows only specific numbers to be entered, e.g. 2,4,5,6,10,18
I tried something like
"'2'|'4'|'5'|'6'|'10'|'18'"
and anything that I typed failed the regex and then the computer pointed its finger at me and laughed.
Where am I going wrong?
The single quotes are unnecessary. The regex you are looking for is: ^(2|4|5|6|10|18)$.
The symbols ^ and $ denote the start and the end of the line, to prevent 121 from matching (since it contains 2).
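A short sketch of the anchored pattern in Python, to show the effect of ^ and $. The helper name is_allowed and the test inputs are illustrative assumptions.

```python
import re

# Anchored alternation: the whole input must be exactly one listed number.
ALLOWED = re.compile(r'^(2|4|5|6|10|18)$')

def is_allowed(value: str) -> bool:
    """Return True only for the exact strings 2, 4, 5, 6, 10 or 18."""
    return ALLOWED.match(value) is not None

for value in ['2', '10', '18', '121', '3', "'2'"]:
    print(value, is_allowed(value))
```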