GREP, command does not work as I've been reading

I have been given an assignment, and the question is:
Suppose you were working a word puzzle and needed a 5-letter word beginning with a vowel (including ‘y’) in upper- or lower-case, a lower-case ‘t’ in the third position, and ending with a lower-case ‘s’.
The remaining letters could be any character that occurs in English words, including upper and lower-case alphabetics, numbers, hyphens, etc. (We will accept the file /usr/share/dict/words as the authority on what constitutes a valid English word and what characters can appear within one.)
What grep command would you use to list the words from /usr/share/dict/words that match this requirement?
I have tried a lot of commands and cannot seem to get it. Is there any kind of hint someone could give me towards the answer?
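One hint, sketched in Python rather than grep: the puzzle fixes what may appear at each of the five positions, so the pattern can be written position by position and anchored at both ends. The helper name puzzle_words and the sample word list are made up for illustration. The same pattern should also work as a grep basic regular expression, for example grep '^[AEIOUYaeiouy].t.s$' /usr/share/dict/words.

```python
import re

# Position 1: a vowel (or y) in either case; position 3: 't'; position 5: 's'.
# The two '.' positions accept any single character.
# ^ and $ anchor the pattern so that exactly five characters are allowed.
PUZZLE = re.compile(r"^[AEIOUYaeiouy].t.s$")


def puzzle_words(words):
    """Return the words that fit the puzzle pattern (hypothetical helper)."""
    return [w for w in words if PUZZLE.match(w)]


# Made-up sample; in practice the words would be read from /usr/share/dict/words.
sample = ["autos", "Estes", "yetis", "altos", "Altos", "attic", "otters", "bites"]
print(puzzle_words(sample))
```

Leaving out either anchor is a common reason such a pattern returns far more words than expected, because it then matches anywhere inside longer words.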

Related

How do I terminate my pattern at a line break?

I have a long character string that comes from a PDF and that I want to process.
It has recurring instances of "Table X. Name of the table", which in my string are always followed by \r\n.
However, when I try to extract all the tables into a list, using List_Tables <-str_extract_all(Plain_Text, "Table\\s+\\d+\\.\\s+(([A-z]|\\s))+\\r\\n"), the extraction often still includes the following line, e.g.
> List_Tables
[[1]]
[1] "Table 1. Real GDP\r\n Percentage changes\r\n"
[2] "Table 2. Nominal GDP\r\n Percentage changes\r\n"
What have I missed in my code?

\s matches all whitespace, including line breaks! When combined with the greedy quantifier +, this means that (([A-z]|\\s))+ matches, in your first example,
Real GDP\r\n […] Percentage changes\r\n
The easiest way to fix this is to use a non-greedy quantifier: i.e. +? instead of +.
Just for completeness’ sake I’ll mention that there are alternatives, but they get more complicated. For instance, you could use negative assertions to include an “if” test to match whitespace which isn’t a line break character; or you could use the character class [ \t] instead of \s, which is more restrictive but also more explicit and probably closer to what you want.
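The difference between the greedy pattern, the non-greedy fix and the explicit character class can be checked with a small Python sketch. The sample text is modelled on the output shown in the question. The class [A-Za-z] is used here instead of [A-z], because [A-z] also matches the punctuation characters that sit between Z and a in ASCII.

```python
import re

text = ("Table 1. Real GDP\r\n Percentage changes\r\n"
        "Table 2. Nominal GDP\r\n Percentage changes\r\n")

# Greedy: the group keeps consuming letters and whitespace, including \r\n,
# and only gives characters back until a final \r\n can still be matched.
greedy = re.findall(r"Table\s+\d+\.\s+(?:[A-Za-z]|\s)+\r\n", text)

# Non-greedy: +? stops at the first \r\n that lets the whole pattern match.
lazy = re.findall(r"Table\s+\d+\.\s+(?:[A-Za-z]|\s)+?\r\n", text)

# Explicit class: only letters, spaces and tabs may appear in the table name,
# so the match cannot run across a line break.
explicit = re.findall(r"Table\s+\d+\.\s+[A-Za-z \t]+\r\n", text)

print(greedy)
print(lazy)
print(explicit)
```

The greedy version reproduces the two-line matches from the question, while the other two stop after the table name.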

Negative lookbehind or tempered pattern for checking one two or three words before a string

I am trying to write code that identifies supporting terms (e.g. 'detect', 'evidence') unless a negation term occurs up to 3 words before them.
Some examples:
"FISH tests did not detect BCL2 translocation"
"FISH tests did not provide evidence of a BCL2 translocation"
I tried using a lookbehind, but since it requires an exact length I don't have the flexibility of looking back 1-3 words.
I tried using a tempered dot, but it allows any number of words.
The code I currently have looks only a single word before the 'support diagnosis' term.
grepl("(?<!\\bnot\\b\\s|cannot\\s|n't\\s|\\bno\\b\\s|negative\\s)(reveal|seen|show|detect|demonstrate|confirm|identif|evidence|suggest|positive|observe)(?:(?!\\bnot\\b)(?!cannot)(?!n't)(?!\\bno\\b)(?!negative for)(?!, ).)*?(bcl-?2|14[q]?[;:]18)", y, perl=TRUE, ignore.case=TRUE)

A lookbehind doesn't help in this situation. What you can do instead is systematically search for the negative terms and discard those parts of your string, up to three words after each one, using (*SKIP)(*FAIL):
(\\bnot\\b|\\bcannot\\b|n't\\b)(?:\\W++(?!(?1))\\w+){0,3}(*SKIP)(*F)|\\b(reveal|seen|show)\\b(?!\\snot\\b)
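Python's built-in re module does not support backtracking verbs such as (*SKIP)(*FAIL), so the sketch below swaps in a different technique to illustrate the same idea: split the sentence into words and, for each supporting term, look at the up to three words before it. It is not the pattern above, it ignores the BCL2 part of the question, and the function names and the third test sentence are made up for illustration. The term lists are taken from the question's code.

```python
import re

NEGATIONS = {"not", "cannot", "no", "negative"}
SUPPORT_STEMS = ("reveal", "seen", "show", "detect", "demonstrate", "confirm",
                 "identif", "evidence", "suggest", "positive", "observe")


def is_negation(token):
    """A token negates if it is a listed negation word or ends in n't."""
    return token in NEGATIONS or token.endswith("n't")


def has_unnegated_support(sentence, window=3):
    """True if a supporting term occurs with no negation in the `window` words before it."""
    tokens = re.findall(r"[\w']+", sentence.lower())
    for i, token in enumerate(tokens):
        if token.startswith(SUPPORT_STEMS):
            preceding = tokens[max(0, i - window):i]
            if not any(is_negation(t) for t in preceding):
                return True
    return False


print(has_unnegated_support("FISH tests did not detect BCL2 translocation"))
print(has_unnegated_support("FISH tests did not provide evidence of a BCL2 translocation"))
print(has_unnegated_support("FISH tests detect BCL2 translocation"))
```

Working on word tokens makes the "1 to 3 words back" window explicit, which is the part a fixed-length lookbehind cannot express.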

Combining 2 regular expression in web.config passwordStrengthRegularExpression=""

I am not able to combine the two regular expressions below. Password standard requirement:
Password cannot contain your username or parts of your full name exceeding two consecutive characters
Passwords must be at least 6 characters in length
Passwords must contain characters from three of the following categories
Uppercase characters (English A-Z)
Lowercase characters (English a-z)
Base 10 digits (0-9)
Non-alphabetic characters (e.g., !, #, #, $, %, etc.)
Expression:
passwordStrengthRegularExpression="((?=.*\d)(?=.*[a-z])(?=.*[A-Z])(?=.*[##$%]).{6,20})"
Passwords cannot contain the word “Test” or “test” or variants of the word
passwordStrengthRegularExpression="((?=.*\"^((?!Test|test|TEST).*)$"
Both are working fine individually.

Because your second regexp primarily uses a negative lookahead, you can remodel that slightly and stick it right at the beginning of the other expression. First, I'm going to change your second regex to:
"(?!.*(?:Test|test|TEST))"
In English: the string may not contain "test" preceded by any number of characters (including zero).
Then, I'm going to stick that right at the beginning of your other expression
passwordStrengthRegularExpression="^(?!.*(?:Test|test|TEST))(?=.*\d)(?=.*[a-z])(?=.*[A-Z])(?=.*[##$%]).{6,20}$"
Finally, I'm going to show you how to make only one part of a regex case-insensitive. This may or may not be supported depending on what program this is actually for.
passwordStrengthRegularExpression="^(?!.*(?i:test))(?=.*\d)(?=.*[a-z])(?=.*[A-Z])(?=.*[##$%]).{6,20}$"
See the (?i:...)? That means that the flags between the ? and the : are applied only to that part of the expression, that is, only that area is case-insensitive.
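Python's re module happens to support the same scoped-flag syntax, so the combined expression can be tried out there as a rough check. This is only a sketch: the helper name is_strong and the sample passwords are made up, the character class is written as [#$%] (which matches the same characters as the [##$%] in the text), and behaviour in the actual ASP.NET provider is not verified here.

```python
import re

# (?i:test) is matched case-insensitively; every other part of the
# pattern, including [a-z] and [A-Z], stays case-sensitive.
PASSWORD = re.compile(
    r"^(?!.*(?i:test))(?=.*\d)(?=.*[a-z])(?=.*[A-Z])(?=.*[#$%]).{6,20}$"
)


def is_strong(password):
    """Hypothetical helper: True if the password passes the combined pattern."""
    return PASSWORD.match(password) is not None


print(is_strong("Abcdef1#"))   # all four categories, no "test"
print(is_strong("MyTeSt1#"))   # contains "test" in mixed case
print(is_strong("abcdef1#"))   # no uppercase letter
```

The third sample shows that the uppercase requirement is still enforced, so the case-insensitivity really is limited to the (?i:...) group.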

Combining your requirements and https://stackoverflow.com/a/2860380/156388, I've come up with this:
(?=^[^\s]{6,}$)(?!.*(?i:test))((?=.*?\d)(?=.*?[A-Z])(?=.*?[a-z])|(?=.*?\d)(?=.*?[^\w\d\s])(?=.*?[a-z])|(?=.*?[^\w\d\s])(?=.*?[A-Z])(?=.*?[a-z])|(?=.*?\d)(?=.*?[A-Z])(?=.*?[^\w\d\s]))^.*
I don't think your first regex actually works if you want to meet the requirements in the bullets above it. It caps the length at 20 characters, although the requirements don't ask for that. It requires all four categories, whereas the requirements say three of the four. It doesn't check the username requirement at all. So I've gutted most of the initial regex.
It matches these (as expected):
Short5
TeSamplePrd6
TEBREaKST6
WinningUser6#
It fails on these (as expected):
SamplePassword
TestUser6#
Shrt5
TeSTTest
Remaining problems
For some reason it matches this:
TEBREKST6
although it only meets two of the four categories plus the minimum length. I'm not sure why.
Nothing here takes into account the "Password cannot contain your username or parts of your full name exceeding two consecutive characters" requirement, and I'm not sure you can even do this through the web.config password requirement, as you don't have access to the username within the regex.
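The sample passwords above can be replayed against the same expression in Python, which accepts this syntax unchanged. The helper name accepts is made up for illustration. Run without flags, the sketch rejects TEBREKST6; run with a global ignore-case flag, it accepts it, because [a-z] then also matches uppercase letters. That suggests, though it is only a guess, that the unexpected match came from a case-insensitive option in the tool used for testing.

```python
import re

PATTERN = (
    r"(?=^[^\s]{6,}$)(?!.*(?i:test))("
    r"(?=.*?\d)(?=.*?[A-Z])(?=.*?[a-z])|"
    r"(?=.*?\d)(?=.*?[^\w\d\s])(?=.*?[a-z])|"
    r"(?=.*?[^\w\d\s])(?=.*?[A-Z])(?=.*?[a-z])|"
    r"(?=.*?\d)(?=.*?[A-Z])(?=.*?[^\w\d\s])"
    r")^.*"
)


def accepts(password, flags=0):
    """Hypothetical helper: True if the expression matches the password."""
    return re.match(PATTERN, password, flags) is not None


for pw in ["Short5", "TeSamplePrd6", "TEBREaKST6", "WinningUser6#",
           "SamplePassword", "TestUser6#", "Shrt5", "TeSTTest", "TEBREKST6"]:
    print(pw, accepts(pw))

# Same password, but with case-insensitive matching switched on globally.
print(accepts("TEBREKST6", re.IGNORECASE))
```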

RegEx to check the string contains at least one letter or digit

I want a regular expression that checks that a string contains at least one letter [a-zA-Z] or a digit. All other special characters are allowed, but strings made up of only special characters, only spaces, or only spaces and special characters must not be accepted.
I have tried /\b(?=[A-Z]*[0-9])(?=[0-9]*[A-Z])[\s\S]\b/i and ^(a-zA-Z0-9).*[\s\S]*$ and ^(a-zA-Z0-9).*[\s].*[\S]*$ etc., but none of them work. I look forward to your response.
Thanks

^(?=.*[\w\d]).+
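This pattern fails if the string does not contain at least one letter or digit, whatever combination of special characters and spaces it has. Note that \w already includes digits, so the \d is redundant, and that \w also matches the underscore.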

I'm not sure I understood you correctly, but from what I've gathered you want at least one letter or digit (a-z, 0-9) in the string. This regex will do just that: /^(?=.*[a-z\d]).+/igm
(Set the flags however they need to be set in asp.net. The m-flag might be redundant for you, I only used it for the demo. The g-flag likely does not exist. If so, just remove it.)
Demo+explanation: http://regex101.com/r/jY9fJ5
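As a quick cross-check outside ASP.NET, the same lookahead idea can be tried in Python. This is a sketch under assumptions: the helper name is_acceptable and the sample strings are made up, and the class is spelled out as [A-Za-z0-9] instead of relying on the i flag. The \w variant from the earlier answer is included only to show how it treats an underscore.

```python
import re

# The lookahead demands a letter or digit somewhere in the string;
# .+ then accepts whatever characters surround it.
HAS_ALNUM = re.compile(r"^(?=.*[A-Za-z0-9]).+")

# Variant using \w, which additionally treats "_" as acceptable.
HAS_WORD_CHAR = re.compile(r"^(?=.*[\w\d]).+")


def is_acceptable(value):
    """Hypothetical helper: True if value contains at least one letter or digit."""
    return HAS_ALNUM.match(value) is not None


for value in ["abc", "#$ 7 %", "   ", "#$%", " # $ ", "_"]:
    print(repr(value), is_acceptable(value))

print(HAS_WORD_CHAR.match("_") is not None)
```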

If you want at least one letter or digit, followed by only spaces and symbols:
/^.*[a-zA-Z0-9][^a-zA-Z0-9]*$/
If you want only one letter or digit, followed by the same:
/^[^a-zA-Z0-9]*[a-zA-Z0-9][^a-zA-Z0-9]*$/
I can't imagine what else it is that you are looking for. Examples would help immensely.

(?=.*?[0-9])(?=.*?[A-Za-z]).+
Allows special characters and makes sure there is at least one number and one letter.
(?=.*?[0-9])(?=.*?[A-Za-z])(?=.*[^0-9A-Za-z]).+
Demands at least one letter, one digit and one special-character.
The first one does not demand special chars, only allows them.
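A short Python sketch of how these two expressions behave; the names BOTH and ALL_THREE and the sample strings are made up. One thing it makes visible is that both patterns require a letter and a digit together, which is stricter than the "letter or digit" wording of the question.

```python
import re

BOTH = re.compile(r"(?=.*?[0-9])(?=.*?[A-Za-z]).+")
ALL_THREE = re.compile(r"(?=.*?[0-9])(?=.*?[A-Za-z])(?=.*[^0-9A-Za-z]).+")


def check(pattern, value):
    """Hypothetical helper: True if the pattern matches from the start of value."""
    return pattern.match(value) is not None


for value in ["abc1", "abc1!", "abc", "123", "!!!"]:
    print(repr(value), check(BOTH, value), check(ALL_THREE, value))
```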

ASP.Net Regular Expression Validator no spaces

I'm trying to set up a validation expression for an ASP.Net Regular Expression Validator control. It is for validating the creation of a user name, so I want to limit the number of characters, and I also want to prevent them from using spaces. Here's what I've got so far:
^.*(?=.{5,20})(?=.*\w{5,255}).*$
The \w{5,255} part prevents spaces and special characters (except for underscores, apparently). I have no idea how "5,255" makes it work, but it does; I just copied it from somewhere else.
The main problem I'm having is that if the first or last character is a space (or special character), it passes validation, which is not acceptable. Can anyone help me? I'm sure it is something simple, but I know next to nothing about regular expressions.

You can use something simpler like this:
^[a-zA-Z0-9_]{5,255}$
This will allow usernames of 5 to 255 characters made up of letters, digits and underscores.

(let's expand overall understanding of how to at least use regex!)
The main reason why the posted regex wasn't working is because you were attempting to use lookahead. Lookahead is a 0-length pattern that just guarantees that the next part of the string will match a certain pattern (and is usually used to take advantage of it being 0-length, so it doesn't expand your capturing group).
Effectively, what your regex (going off of the original /^.(?=.{5,20})(?=.\w{5,255}).*$/) meant was:
^. "The beginning of our line should match any single character (provided it's not a newline, although this depends on the regex implementation as well as flags that may or may not have been passed in)"
(?= "and guarantee that after here"
.{5,20}) "are any 5-20 characters."
(?= "Also, after that same first character (since, remember, lookahead is 0-length), guarantee"
. "one arbitrary character"
\w{5,255}) "and 5-255 word characters."
.*$ "And of course, since all of that exhaustive matching was 0-length, we want the rest of the line to be an arbitrary number of characters."
What you technically could have done to use lookaround was ^(?=\w{5,255}).{5,255}$, but that's just overly convoluted. I'd suggest just using \w{5,255} or something along those lines.
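The leading-space problem from the question, and the effect of a plain anchored character class, can be reproduced in Python. This is a sketch only: the helper name is_valid_username is made up, and the upper limit of 20 is an assumption taken from the {5,20} in the question rather than the 255 used in the answers.

```python
import re

# The expression from the question: the lookaheads may succeed at any
# position the leading .* backtracks to, so edge characters slip through.
ORIGINAL = re.compile(r"^.*(?=.{5,20})(?=.*\w{5,255}).*$")

# Explicit class, 5 to 20 characters (the limit of 20 is an assumption).
STRICT = re.compile(r"[A-Za-z0-9_]{5,20}")


def is_valid_username(name):
    """Hypothetical helper: the whole string must consist of 5-20 allowed characters."""
    return STRICT.fullmatch(name) is not None


print(ORIGINAL.match(" user1") is not None)  # leading space is accepted
print(is_valid_username(" user1"))
print(is_valid_username("user1 "))
print(is_valid_username("user_1"))
```

Using fullmatch plays the role of the ^ and $ anchors: every character of the input has to belong to the class, so spaces at either end are rejected.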
