I have a document, and I need to find all the words (no spaces) bordered by '. (e.g. 'apple', 'hello') What would be the regular expression?
I've tried ^''$ but it didn't work.
If there isn't a general solution, it doesn't have to match any word; it could match a word from a fixed list (e.g. apple, banana, lemon), but it must still have the (')s.
Thank you so much
Andrew
If you want to capture single-quoted strings, literally any character run except single-quotes but between the single-quotes, use
/'[^']+'/
If you need single words, i.e. alphabetic characters but no spaces, try
/'[a-zA-Z]+'/
I'm assuming a couple of things here:
You're using a language that delimits regexes with slashes. This includes Javascript and Perl to my knowledge, and probably a bunch of others. In some other languages, like C#, you should use double quotes to delimit, e.g. "'[a-zA-Z]+'"
You're using a flavor of regex that does not need to escape the plus sign.
You're trying to capture all such words within a long string. I.e., if the input string is "Here is a 'long' string with 'some' 'words' single-quoted" then you will capture three words: 'long','some', and 'words'.
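As a rough illustration of that last assumption, here is a minimal Python sketch (Python is my choice, not necessarily the asker's language, and the variable names are made up) that pulls every single-quoted word out of the sample sentence:

```python
import re

# Sample sentence from the answer above.
text = "Here is a 'long' string with 'some' 'words' single-quoted"

# '[a-zA-Z]+' : a quote, one or more ASCII letters, then a closing quote.
quoted_words = re.findall(r"'[a-zA-Z]+'", text)

# With a capture group, findall returns the words without the quotes.
bare_words = re.findall(r"'([a-zA-Z]+)'", text)

print(quoted_words)
print(bare_words)
```

Python has no slash delimiters, so the pattern is passed as an ordinary (raw) string.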
My input looks something like
<xxxx location="file path/level1/level2" xxxx some="xxx">
I am only interested in the part in quotes assigned to location. Shouldn't it be as easy as below without the greedy switch?
/.*location="(.*)".*/
Does not seem to work.
You need to make your regular expression lazy/non-greedy, because by default, "(.*)" will match everything from the " after location= through the last " on the line, so the group captures file path/level1/level2" xxxx some="xxx.
Instead you can make your dot-star non-greedy, which will make it match as few characters as possible:
/location="(.*?)"/
Adding a ? on a quantifier (?, * or +) makes it non-greedy.
Note: this is only available in regex engines which implement the Perl 5 extensions (Java, Ruby, Python, etc) but not in "traditional" regex engines (including Awk, sed, grep without -P, etc.).
location="(.*)" will match from the " after location= until the " after some="xxx unless you make it non-greedy.
So you either need .*? (i.e. make it non-greedy by adding ?) or better replace .* with [^"]*.
[^"] Matches any character except for a " <quotation-mark>
More generic: [^abc] - Matches any character except for an a, b or c
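To see the three variants side by side, here is a small Python sketch run against the sample line from the question (Python and the variable names are my own choices for illustration):

```python
import re

# Sample line from the question.
line = '<xxxx location="file path/level1/level2" xxxx some="xxx">'

# Greedy: .* runs to the end of the line and backtracks to the LAST quote.
greedy = re.search(r'location="(.*)"', line).group(1)

# Lazy: .*? grows one character at a time and stops at the FIRST quote.
lazy = re.search(r'location="(.*?)"', line).group(1)

# Negated class: [^"]* cannot cross a quote, so no lazy quantifier is needed.
negated = re.search(r'location="([^"]*)"', line).group(1)

print(greedy)
print(lazy)
print(negated)
```

The lazy and negated-class versions both capture only the path; the greedy one spills over into the next attribute.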
How about
.*location="([^"]*)".*
This avoids the unlimited search with .* and will match exactly to the first quote.
Use non-greedy matching, if your engine supports it. Add the ? inside the capture.
/location="(.*?)"/
Using the lazy quantifier ? without the global flag is the answer.
If you had the global flag /g, it would instead match every shortest-possible match in the input.
Here's another way.
Here's the one you want. This is lazy: [\s\S]*?
The first item:
[\s\S]*?(?:location="([^"]*)")[\s\S]*
Replace with: $1
Explanation: https://regex101.com/r/ZcqcUm/2
For completeness, this gets the last one. This is greedy: [\s\S]*
The last item:
[\s\S]*(?:location="([^"]*)")[\s\S]*
Replace with: $1
Explanation: https://regex101.com/r/LXSPDp/3
There's only 1 difference between these two regular expressions and that is the ?
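A minimal Python sketch of the same first-versus-last idea, assuming a made-up input with two location attributes and a capture group around [^"]* in both patterns (Python writes the replacement as \1 rather than $1):

```python
import re

# Hypothetical input containing two location attributes.
text = '<a location="first/path" x="1">\n<b location="second/path" y="2">'

# Lazy prefix: consumes as little as possible, so it stops at the first location.
first_pattern = r'[\s\S]*?(?:location="([^"]*)")[\s\S]*'

# Greedy prefix: consumes as much as possible, so it backs up to the last location.
last_pattern = r'[\s\S]*(?:location="([^"]*)")[\s\S]*'

# Each pattern matches the whole input, so the result is just the captured value.
first_item = re.sub(first_pattern, r'\1', text, count=1)
last_item = re.sub(last_pattern, r'\1', text, count=1)

print(first_item)
print(last_item)
```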
The other answers here fail to spell out a full solution for regex versions which don't support non-greedy matching. The non-greedy quantifiers (.*?, .+? etc) are a Perl 5 extension which isn't supported in traditional regular expressions.
If your stopping condition is a single character, the solution is easy; instead of
a(.*?)b
you can match
a[^ab]*b
i.e., specify a character class which excludes the starting and ending delimiters.
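As a quick sanity check, here is a small Python sketch on a made-up input where the two forms give the same captures. Python supports both, so this only illustrates the rewrite; it is not a test of a traditional engine, and the two patterns can differ on inputs where the start delimiter repeats before the end delimiter.

```python
import re

# Hypothetical input with two delimited runs.
text = "xx a123b yy a45b"

# Perl-style lazy quantifier.
lazy_matches = re.findall(r"a(.*?)b", text)

# Traditional rewrite: a character class that excludes both delimiters.
class_matches = re.findall(r"a([^ab]*)b", text)

print(lazy_matches)
print(class_matches)
```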
In the more general case, you can painstakingly construct an expression like
start(|[^e]|e(|[^n]|n(|[^d])))*end
to capture a match between start and the first occurrence of end. Notice how the subexpression with nested parentheses spells out a number of alternatives which between them allow e only if it isn't followed by nd and so forth, and also take care to cover the empty string as one alternative which doesn't match whatever is disallowed at that particular point.
Of course, the correct approach in most cases is to use a proper parser for the format you are trying to parse, but sometimes, maybe one isn't available, or maybe the specialized tool you are using is insisting on a regular expression and nothing else.
Because you are using a quantified subpattern and, as described in the Perl documentation,
By default, a quantified subpattern is "greedy", that is, it will
match as many times as possible (given a particular starting location)
while still allowing the rest of the pattern to match. If you want it
to match the minimum number of times possible, follow the quantifier
with a "?" . Note that the meanings don't change, just the
"greediness":
*? //Match 0 or more times, not greedily (minimum matches)
+? //Match 1 or more times, not greedily
Thus, to allow your quantified pattern to make minimum match, follow it by ? :
/location="(.*?)"/
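import regex  # third-party module; the standard-library re module behaves the same way here
text = 'ask her to call Mary back when she comes back'
p = r'(?i)(?s)call(.*?)back'  # the lazy .*? stops at the first "back"
for match in regex.finditer(p, text):
    print(match.group(1))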
Output:
Mary
I need a regular expression that allows letters (English and Arabic) and numbers, but not numbers only. It should also allow punctuation, spaces, and multiple lines (\n). When I searched, I found this one:
(?!^\d+$)^.+$
but it doesn't allow multiple lines.
I tried to write my own, which is
(([a-zA-Zء-ي\s:-])|([0-9]+[a-zA-Zء-ي\s:-])|([a-zA-Zء-ي\s:-]+[0-9]$))*
The problems with it are:
1. It doesn't accept a number at the end of the string, as in employer9, but employer9 followed by a space works fine.
2. I have to list every punctuation mark that is allowed. Is there an easier way to do this?
You can probably use DOTALL or the s modifier for the regex. You could do this with:
(?s)(?!^\d+$)^.+$
...or you could use the compiler flags when constructing the regex.
An alternative not using DOTALL would be:
(?!^\d+$)^[\s\S]+$
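Here is a minimal Python sketch comparing the two suggestions with the original pattern. The sample strings are made up for illustration, and I'm assuming Python's re module; other engines may treat ^ and $ slightly differently.

```python
import re

# The two suggested patterns, plus the original one for comparison.
dotall_pattern = re.compile(r"(?s)(?!^\d+$)^.+$")
class_pattern = re.compile(r"(?!^\d+$)^[\s\S]+$")
original_pattern = re.compile(r"(?!^\d+$)^.+$")

# Hypothetical inputs: digits only, letters plus digit, multi-line, Arabic.
samples = ["12345", "employer9", "line one\nline 2", "موظف 9"]

# Map each sample to (accepted by DOTALL version, accepted by [\s\S] version).
results = {
    s: (bool(dotall_pattern.match(s)), bool(class_pattern.match(s)))
    for s in samples
}

for s, accepted in results.items():
    print(repr(s), accepted)

# The original pattern rejects the multi-line input because . stops at \n.
print(original_pattern.match("line one\nline 2"))
```

Since . (with DOTALL) and [\s\S] both accept any character, there is no need to list each allowed punctuation mark.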
I am looking to have a regular expression that allows only specific numbers to be entered, e.g. 2,4,5,6,10,18
I tried something like
"'2'|'4'|'5'|'6'|'10'|'18'"
and anything that I typed failed the regex, and then the computer pointed its finger at me and laughed.
Where am I going wrong?
The single quotes are unnecessary. The regex you are looking for is: ^(2|4|5|6|10|18)$.
The symbols ^ and $ denote the start and the end of the line, to prevent 121 from matching (since it contains 2).
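A short Python sketch of the anchored pattern, with a hypothetical helper name (is_allowed), shows why the anchors matter:

```python
import re

# Anchored alternation: the whole input must be one of the listed numbers.
allowed = re.compile(r"^(2|4|5|6|10|18)$")

def is_allowed(value: str) -> bool:
    """Return True if value is exactly one of 2, 4, 5, 6, 10, 18."""
    return allowed.match(value) is not None

# Without anchors, the alternation is satisfied by the 2 inside 121.
unanchored_hit = re.search(r"2|4|5|6|10|18", "121") is not None

for candidate in ["2", "10", "18", "3", "121"]:
    print(candidate, is_allowed(candidate))
print("unanchored matches 121:", unanchored_hit)
```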
I need a regular expression for "at least 6 characters with no spaces".
Please help
To allow anything other than whitespace ^\S{6,}$
To allow only letters: ^[a-zA-Z]{6,}$
To allow word chars (alphanumeric and _): ^\w{6,}$
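The three options differ only in which characters they accept. A small Python sketch makes the difference visible; the dictionary keys and the check helper are hypothetical names of mine:

```python
import re

# The three patterns from the list above, keyed by made-up names.
patterns = {
    "non_whitespace": re.compile(r"^\S{6,}$"),
    "letters_only": re.compile(r"^[a-zA-Z]{6,}$"),
    "word_chars": re.compile(r"^\w{6,}$"),
}

def check(name: str, value: str) -> bool:
    """Return True if value satisfies the named pattern."""
    return patterns[name].match(value) is not None

# Hypothetical inputs chosen to separate the three patterns.
for value in ["abcdef", "abc123", "abc_12", "abc-12", "abc 12", "abcde"]:
    print(repr(value), {name: check(name, value) for name in patterns})
```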
You can use \w{6,}
Also FYI:
You can use a tool like Regex Coach to experiment with regular expressions.
For literally any six non-space characters (other than newlines, depending on options):
[^ ]{6,}
For six non-whitespace characters:
\S{6,}