I am using the following regex with a .net validator.
^100|150|200|250|300|350|400|450|500|550|600|650|700|750|800|850|900|950|1000$
The aim is to allow 1 of the values in the list.
However, whilst it works great with most, inputting '1000' produces an error.
Any ideas?
You need to limit the scope of your alternation:
^(100|150|200|250|300|350|400|450|500|550|600|650|700|750|800|850|900|950|1000)$
And of course you can optimize your regex:
^([1-9][05]0|1000)$
Related
I am attempting to use a fairly complex REGEX expression (see REGEX101 demos below), which I amended slightly from one created by an expert on this site. It parses specific patterns of log events:
1EXE_IN1EXE_CO2CONTENT_ACCESS3CONTENT_ACCESS
These log sequences will always begin with a random selection of EXE_IN or EXE_CO events, preceded sequence numbers. These selections can be any number, in any order. In this case, we just have two EXE events but this may be 200. Or 1. Note that there is a sequence number and we need to capture it.
The second part of the sequence will always be a series of digit-prefaced CONTENT.ACCESS events. Again from 1 to infinity in length.
The following demo shows a working example and probably conveys the concept better than I can : Demo 1
It nicely captures a full match, sequence number, and event in separate groups.
I need to add a timestamp to the pattern (after the sequence number, with a preceding underscore), and then parse this event log e.g.
1_11/08/2014 23:03EXE_IN1_11/08/2014 23:03EXE_CO2_12/08/2014 09:17CONTENT_ACCESS3_13/08/2014 09:17CONTENT_ACCESS
I need to capture the timestamps as well.
I attempted to adjust the regex expression, with mixed results. Please see this demo: demo2
Ideally I'd like to see something like this for each event:
Match n
Full match 266-308 `2_12/08/2014 09:17CONTENT_ACCESS`
Group 1. 266-267 `2`
Group 2. 268-284 `12/08/2014 09:17`
Group 3. 284-308 `CONTENT_ACCESS`
I hope you can help me. REGEX101 pcre testing is sufficient (for the record, I am using perl-compatible str_match_all_perl function in R).
Many thanks in advance.
(\d+)_(.*?)(EXE_CO|EXE_IN|CONTENT_ACCESS)
https://regex101.com/r/EHHcKm/1
Due to comments it was changed to (?:\G(?!^)(?(?=\d+_\d{2}\/\d{2}\/\d{4}\s\d{2}\:\d{2}(?:EXE_CO|EXE_IN))(?<!\d_\d{2}\/\d{2}\/\d{4}\s\d{2}\:\d{2}CONTENT_ACCESS))|(?=(?:\d+_\d{2}\/\d{2}\/\d{4}\s\d{2}\:\d{2}(?:EXE_CO|EXE_IN))+(?:\d+_\d{2}\/\d{2}\/\d{4}\s\d{2}\:\d{2}CONTENT_ACCESS)+))(\d+)_(\d{2}\/\d{2}\/\d{4}\s\d{2}\:\d{2})(EXE_CO|EXE_IN|CONTENT_ACCESS)
https://regex101.com/r/EHHcKm/3
Ans also another version, which is shorter
(?:\G(?!^)(?(?=\d+_.{16}(?:EXE_CO|EXE_IN))(?<!\d_.{16}CONTENT_ACCESS))|(?=(?:\d+_.{16}(?:EXE_CO|EXE_IN))+(?:\d+_.{16}CONTENT_ACCESS)+))(\d+)_(.{16})(EXE_CO|EXE_IN|CONTENT_ACCESS)
https://regex101.com/r/EHHcKm/4
And even more shorter (?:\G(?!^)(?(?=\d+_.{16}E)(?<!S))|(?=(?:\d+_.{16}(?:EXE_CO|EXE_IN))+\d+_.{16}C))(\d+)_(.{16})(EXE_CO|EXE_IN|CONTENT_ACCESS)
https://regex101.com/r/EHHcKm/5
And super short (?:\G|(?=\d+_.{16}E.*CON))(\d+)_(.*?)(EXE_CO|EXE_IN|CONTENT_ACCESS)
https://regex101.com/r/EHHcKm/8
I am trying to create a regular expression for a dollar amount that accepts values between 5.00 and 1000.00.
Here is what I have so far:
^([5-9](\d){0,4}([.](\d){1,2})?|1000([.](0){1,2})?)?$
I have already tried the range validator and it isn't working this field.
Any help is much appreciated.
This is what I came up with, which could likely be improved. It seems to work for my limited testing. You may want to tag your question with "regex" to get some expert advice!
^(?:[5-9](?:\.\d{0,2})?|\d{2,3}(?:\.\d{0,2})?|1000(?:\.0{0,2})?)$
Should be validating 1-280 input characters, but it hangs when more than 280 characters are input.
Clarification
I am using the above regex to validate the length of input string to be 280 characters maximum.
I am using asp:RegularExpressionValidator to do that.
There's nothing “wrong” with it per se, but it's horrendous because with most RE engines (you don't say which one you're using) when it doesn't match with the first thing it tries because it causes the engine to backtrack and try loads of different possibilities (none of which can ever cause a match). So it's not a hang, but rather just a machine that's trying to execute around 2280 operations to see if there's a match possible. Excuse me if I don't wait around for that!
Of course, it's theoretically possible for the RE compiler to merge the (.|\s) part of the RE into something it doesn't need to backtrack to deal with. Some RE engines do this (typically the more automata-theoretic ones) but many don't (the stack-based ones).
It is trying every possible combination of . and \s for each character trying to find a version of the pattern that matches the string.
. already matches any character, so (.|\s) is redundant. Further, if you just want to check what the length of the string is, then just do that - why are you pulling out regexes?
If you really want to use a regular expression, you could use .{1, 280}$ combined with the SingleLine option, so that the . metacharacter will match everything, including new lines (see here, Regular Expression API section).
I'd like to find patterns and sort them by number of occurrences on an HEX file I have.
I am not looking for some specific pattern, just to make some statistics of the occurrences happening there and sort them.
DB0DDAEEDAF7DAF5DB1FDB1DDB20DB1BDAFCDAFBDB1FDB18DB23DB06DB21DB15DB25DB1DDB2EDB36DB43DB59DB32DB28DB2ADB46DB6FDB32DB44DB40DB50DB87DBB0DBA1DBABDBA0DB9ADBA6DBACDBA0DB96DB95DBB7DBCFDBCBDBD6DB9CDBB5DB9DDB9FDBA3DB88DB89DB93DBA5DB9CDBC1DBC1DBC6DBC3DBC9DBB3DBB8DBB6DBC8DBA8DBB6DBA2DB98DBA9DBB9DBDBDBD5DBD9DBC3DB9BDBA2DB84DB83DB7DDB6BDB58DB4EDB42DB16DB0DDB01DB02DAFCDAE9DAE5DAD9DAE2DAB7DA9BDAA6DA9EDAAADAC9DACADAC4DA92DA90DA84DA89DA93DAA9DA8CDA7FDA62DA53DA6EDA
That's an excerpt of the HEX file, and as an example I'd like to get:
XX occurrences of BDBDBD
XX occurrences of B93D
Is there a way to mine the file to generate that output?
Sure. Use a sliding window to create the counts (The link is for Perl, but it seems general enough to understand the algorithm). Your patterns are named N-grams. You will have to limit the maximal pattern, though.
This is a pretty classic CS problem. The code in general is non-trivial to implement as it will require at least one full parse of the sequence, and depending on your efficiency and memory/processor constraints might require several. See here.
You will need to partition your input string in some way to ensure that you get a good subsequence across it.
If there is a specific problem we might be able to help more, but the general strategy is in the Wikipedia article above.
You can use Regular Expressions to make a pattern to search for.
The regex needed would be very simple. Just use the exact phrase you're searching for. Then there should be a regular expression function in the language you're using (you didn't specify) that can count the number of matches.
Use that to create a simple counter.
I want to use an ASP.NET RegularExpressionValidator to limit the number of words in a text box. (The RegularExpressionValidator is my favoured solution because it will do both client and server side checks).
So what would be the correct Regex to put in the RegularExpressionValidator that will count the words and enforce a word-limit? For, lets say, 150 words.
(NB: I see that this question is similar, but the answers given seem to also rely on code such as Split() so I don't think any of them could plug into a RegularExpressionValidator which is why I'm asking again)
Since ^ and $ is implicitly set with RegularExpressionValidators, use the following:
(\S*\s*){0,10}
The 0 here allows empty strings (more specifically 0 words) and 150 is the max number of words to accept. Adjust these as necessary to increase/decrease the number of words accepted.
The above regex is non-greedy, so you'll get a quicker match verses the one given in the question you reference. (\b.*\b){0,10} is greedy, so as you increased the number of words you'll see a decrease in performance.
Here is a quick reference for regular expressions:
http://msdn.microsoft.com/en-us/library/az24scfc.aspx
You can use this site to test the expressions:
http://regexpal.com/
Here is my regex example that works with both minimum and maximum word count (and fixes bug with leading spacing):
^\s*(\S+\s+|\S+$){10,150}$
Check this site:
http://lawrence.ecorp.net/inet/samples/regexp-validate.php#count
It's JavaScript RegEx, but it's very similar to asp.net
It's something like this:
(\b[a-z0-9]+\b.*){4,}