Regex to check character and integer inside a string - r

I have a data frame and I want to flag every element that contains letters, or letters together with digits.
After some searching I came up with this regex, but it is not giving the expected output:
([A-Za-z]+[0-9]|[0-9]+[A-Za-z])[A-Za-z0-9]*
Expected output:
alpha - True
Alpha1 - True
A35. 1ha - True
Alp1Ha - True
A pha6 - True
12345 - False
0 - False
-23442 - False

Try this:
^(?=.*[a-zA-Z]).+$

You may use
^[^A-Za-z]*[A-Za-z].*$
See the regex demo
Details
^ - start of a string
[^A-Za-z]* - 0 or more chars other than ASCII letters
[A-Za-z] - an ASCII letter
.* - any 0+ chars other than line break chars as many as possible
$ - end of string.
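For the data frame case in the question, a minimal R sketch might look like the following. The data frame `df` and its column name `value` are hypothetical, and the sample values are the ones listed in the question; `grepl()` returns one logical value per element.

```r
# Hypothetical data frame holding the sample values from the question.
df <- data.frame(
  value = c("alpha", "Alpha1", "A35. 1ha", "Alp1Ha", "A pha6",
            "12345", "0", "-23442"),
  stringsAsFactors = FALSE
)

# TRUE when the element contains at least one ASCII letter.
pattern <- "^[^A-Za-z]*[A-Za-z].*$"
df$flag <- grepl(pattern, df$value)

print(df)
```

Because `grepl()` looks for a match anywhere in the string, the shorter pattern `[A-Za-z]` should give the same flags here; the anchored form matters mainly in tools that require the pattern to match the whole string.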

Related

grep in R, literal and pattern match

I have seen in manuals how to use grep to match either a pattern or an exact string. However, I cannot figure out how to do both at the same time. I have a LaTeX file in which I want to find the following pattern:
\caption[SOME WORDS]
and replace it with:
\caption[\textit{SOME WORDS}]
I have tried with:
texfile <- sub('\\caption[','\\caption[\\textit ', texfile, fixed=TRUE)
but I do not know how to tell grep that there should be some text after the square bracket, and then a closed square bracket.
You can use
texfile <- "\\caption[SOME WORDS]" ## -> \caption[\textit{SOME WORDS}]
texfile <- gsub('(\\\\caption\\[)([^][]*)]', '\\1\\\\textit{\\2}]', texfile)
cat(texfile)
## -> \caption[\textit{SOME WORDS}]
See the R demo online.
Details:
(\\caption\[) - Group 1 (\1 in the replacement pattern): a \caption[ string
([^][]*) - Group 2 (\2 in the replacement pattern): any zero or more chars other than [ and ]
] - a ] char.
Another solution based on a PCRE regex:
gsub('\\Q\\caption[\\E\\K([^][]*)]','\\\\textit{\\1}]', texfile, perl=TRUE)
See this R demo online. Details:
\Q - start "quoting", i.e. treating the patterns to the right as literal text
\caption[ - a literal fixed string
\E - stop quoting the pattern
\K - omit text matched so far
([^][]*) - Group 1 (\1): any zero or more non-bracket chars
] - a ] char.
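To check that the two approaches above agree on more than one line, here is a small sketch. The vector `texlines` and its contents are made up for illustration; only the two `gsub()` calls come from the answer.

```r
# Hypothetical LaTeX lines; only the first one contains \caption[...].
texlines <- c("\\caption[First figure]{A longer caption}",
              "\\section{No caption here}")

# Default (TRE) engine, using two capturing groups.
out_tre <- gsub('(\\\\caption\\[)([^][]*)]', '\\1\\\\textit{\\2}]', texlines)

# PCRE engine, using \Q...\E quoting and \K.
out_pcre <- gsub('\\Q\\caption[\\E\\K([^][]*)]', '\\\\textit{\\1}]',
                 texlines, perl = TRUE)

cat(out_tre, sep = "\n")
identical(out_tre, out_pcre)
```

Neither pattern checks whether the text is already wrapped in \textit{}, so running the replacement twice on the same lines would nest the command.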

Regular expression to match the n-th occurrence of a field value

I have this string:
*Field1=0(1936-S),Field13=0(2),Field2=0(),Field4=0(19.01.17),Field3=0(),Field5700=0(),Field5400=0(KS),Field14=0(21)*
I need to store the value of the n-th field, so I tried this:
(?<=Field[0-9]=0){4}?.*?(?=,)
This gives me (1936-S) when I set N=4, but I expect 19.01.17.
Can you help me, please?
You may use
^\*(?:Field[0-9]+=0\([^)]*\),){3}Field[0-9]+=0\(\K[^)]*
(demo) or a capturing group based pattern:
^\*(?:Field[0-9]+=0\([^)]*\),){3}Field[0-9]+=0\(([^)]*)
See the regex demo
Details
^ - start of a string
\* - a * char
(?:Field[0-9]+=0\([^)]*\),){3} - 3 sequences of:
    Field - a Field substring
    [0-9]+ - 1 or more digits
    =0\( - a =0( substring
    [^)]* - any 0+ chars other than )
    \), - a ), substring
Field[0-9]+=0\( - a Field substring followed by 1+ digits, and then a =0( substring
\K - (first pattern only) omits the text matched so far from the overall match
([^)]*) - (second pattern only) Group 1: any 0+ chars other than ).
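Since the question asks for an arbitrary N, the repetition count can be built from N - 1. The sketch below does this in R with the \K variant; the helper name `nth_field_value` is hypothetical, and `perl = TRUE` is assumed because \K and (?:...) need the PCRE engine.

```r
x <- "*Field1=0(1936-S),Field13=0(2),Field2=0(),Field4=0(19.01.17),Field3=0(),Field5700=0(),Field5400=0(KS),Field14=0(21)*"

# Hypothetical helper: returns the text inside the parentheses of the n-th field,
# or NA when the string has fewer than n fields.
nth_field_value <- function(s, n) {
  # Skip n - 1 complete "FieldK=0(...)," sequences, then keep the next value.
  pattern <- sprintf(
    "^\\*(?:Field[0-9]+=0\\([^)]*\\),){%d}Field[0-9]+=0\\(\\K[^)]*",
    as.integer(n) - 1L
  )
  m <- regexpr(pattern, s, perl = TRUE)
  if (m[1] == -1L) return(NA_character_)
  regmatches(s, m)
}

nth_field_value(x, 4)
nth_field_value(x, 1)
```

With the sample string, N=4 should give 19.01.17 and N=1 should give 1936-S.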

Extract a string in R using regex

I have this vector:
jvm<-c("test - PROD_DB_APP_185b#SERVER01" ,"uat - PROD_DB_APP_SYS[1]#SERVER2")
I need to extract text until "[" or if there is no "[", then until the "#" character.
result should be
PROD_DB_APP_185b
PROD_DB_APP_SYS
I've tried something like this:
str_match(jvm, ".*\\-([^\\.]*)([.*)|(#.*)")
It is not working. Any ideas?
A sub solution with base R:
jvm <- c("test - PROD_DB_APP_185b#SERVER01", "uat - PROD_DB_APP_SYS[1]#SERVER2")
sub("^.*?\\s+-\\s+([^#[]+).*", "\\1", jvm)
See the online R demo
Details:
^ - start of string
.*? - any 0+ chars as few as possible
\\s+-\\s+ - a hyphen enclosed with 1 or more whitespaces
([^#[]+) - capturing group 1 matching any 1 or more chars other than # and [
.* - any 0+ chars, up to the end of string.
Or a stringr solution with str_extract:
library(stringr)
str_extract(jvm, "(?<=-\\s)[^#\\[]+")
See the regex demo
Details:
(?<=-\\s) - a positive lookbehind that matches an empty string that is preceded with a - and a whitespace immediately to the left of the current location
[^#\\[]+ - 1 or more chars other than # and [.
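If stringr is not available, the same lookbehind pattern can probably be used from base R as well. This is a sketch that assumes `perl = TRUE` (lookbehinds need the PCRE engine) and compares the result with the `sub()` solution above.

```r
jvm <- c("test - PROD_DB_APP_185b#SERVER01", "uat - PROD_DB_APP_SYS[1]#SERVER2")

# Locate the first match per element, then pull out the matched text.
m <- regexpr("(?<=-\\s)[^#\\[]+", jvm, perl = TRUE)
res <- regmatches(jvm, m)

# The sub() solution, for comparison.
res_sub <- sub("^.*?\\s+-\\s+([^#[]+).*", "\\1", jvm)

print(res)
identical(res, res_sub)
```

One difference to keep in mind: `regmatches()` drops elements without a match, so the result can be shorter than the input, whereas `sub()` returns non-matching elements unchanged.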

Regular Expression To exclude sub-string name(job corps) Includes at least 1 upper case letter, 1 lower case letter, 1 number and 1 symbol except "#"

Regular Expression To exclude sub-string name(job corps)
Includes at least 1 upper case letter, 1 lower case letter, 1 number and 1 symbol except "#"
I have written something like the following:
^((?!job corps).)(?=.*[0-9])(?=.*[a-z])(?=.*[A-Z])(?=.*[!#$%^&*]).*$
I tested the above regular expression, but it does not work for special characters.
Can anyone offer guidance on this?
If I understand your requirements correctly, you can use this pattern:
^(?![^a-z]*$|[^A-Z]*$|[^0-9]*$|[^!#$%^&*]*$|.*?job corps)[^#]*$
If you only want to allow characters from [a-zA-Z0-9^#$%&*], change the pattern to:
^(?![^a-z]*$|[^A-Z]*$|[^0-9]*$|[^!#$%^&*]*$|.*?job corps)[a-zA-Z0-9^#$%&*]*$
details:
^ # start of the string
(?! # not followed by any of these cases
[^a-z]*$ # non lowercase letters until the end
|
[^A-Z]*$ # non uppercase letters until the end
|
[^0-9]*$ # non-digits until the end
|
[^!#$%^&*]*$ # none of the listed symbols until the end
|
.*?job corps # any characters and "job corps"
)
[^#]* # characters that are not a #
$ # end of the string
Note: you can write the range #$%& as #-& to save a character.
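A quick way to try the first pattern is to run it over a few candidate strings. The sketch below uses R with `perl = TRUE`, since lookaheads need the PCRE engine; the test strings are invented for illustration.

```r
pattern <- "^(?![^a-z]*$|[^A-Z]*$|[^0-9]*$|[^!#$%^&*]*$|.*?job corps)[^#]*$"

# Made-up candidates, one per rule.
candidates <- c(
  "Passw0rd!",           # satisfies every rule
  "password1!",          # no uppercase letter
  "PASSW0RD!",           # no lowercase letter
  "Password!",           # no digit
  "Passw0rd",            # no symbol
  "Passw0rd#",           # contains the forbidden #
  "My job corps P4ss!"   # contains the excluded substring
)

ok <- grepl(pattern, candidates, perl = TRUE)
print(data.frame(candidates, ok))
```

Only the first candidate should be accepted; each of the others violates exactly one of the rules.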
stribizhev, your answer is correct
^(?!.*job corps)(?=.*[0-9])(?=.*[a-z])(?=.*[A-Z])(?=.*[!#$%^&*])(?!.*#).*$
You can verify the expression at the following URL:
http://www.freeformatter.com/regex-tester.html

Regular expression to match version numbers

I need a regular expression that matches the criteria below.
For example, the following should match:
1
1134
1.1
1.4.5.6
Those below should not match:
.1
1.
1..6
You can use
^\d+(\.\d+)*$
See demo
^ - beginning of string
\d+ - 1 or more digits
(\.\d+)* - a group matching 0 or more sequences of . + 1 or more digits
$ - end of string.
You can use a non-capturing group, too: ^\d+(?:\.\d+)*$, but it is not really necessary here.
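The pattern can be checked against the examples from the question with a short R sketch. `perl = TRUE` is an assumption made here so that \d is handled by the PCRE engine; the vector name `versions` is illustrative.

```r
# Examples from the question: the first four should match, the last three should not.
versions <- c("1", "1134", "1.1", "1.4.5.6", ".1", "1.", "1..6")

is_version <- grepl("^\\d+(\\.\\d+)*$", versions, perl = TRUE)
print(setNames(is_version, versions))
```

The non-capturing variant ^\d+(?:\.\d+)*$ should produce the same logical vector, since the group is only used for repetition.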

Resources