I need a regular expression that is matching below criteria
For example : below should be matched
1
1134
1.1
1.4.5.6
Those below should not match:
.1
1.
1..6
You can use
^\d+(\.\d+)*$
See demo
^ - beginning of string
\d+ - 1 or more digits
(\.\d+)* - a group matching 0 or more sequences of . + 1 or more digits
$ - end of string.
You can use a non-capturing group, too: ^\d+(?:\.\d+)*$, but it is not so necessary here.
Related
I have strings of varying lengths in this format:
"/S498QSB 0 'Score=0' 1 'Score=1' 2 'Score=2' 3 'Score=3' 7 'Not administered'"
the first item is a column name and the other items tell us how this column is encoded
I want the following output:
/S498QSB
0 'Score=0'
1 'Score=1'
2 'Score=2'
3 'Score=3'
7 'Not administered'"
str_split should do it, but it's not working for me:
str_split("/S498QSB 0 'Score=0' 1 'Score=1' 2 'Score=2' 3 'Score=3' 7 'Not administered'",
"([ ].*?[ ].*?)[ ]")
You can use
str_split(x, "\\s+(?=\\d+\\s+')")
See the regex demo.
Details:
\s+ - one or more whitespaces
(?=\d+\s+') - a positive lookahead that requires the following sequence of patterns immediately to the right of the current location:
\d+ - one or more digits
\s+ - one or more whitespaces
' - a single quotation mark.
I have this string :
*Field1=0(1936-S),Field13=0(2),Field2=0(),Field4=0(19.01.17),Field3=0(),Field5700=0(),Field5400=0(KS),Field14=0(21)*
And I need store n-th value of FieldN so I tried this:
(?<=Field[0-9]=0){4}?.*?(?=,)
This give me (1936-S) when I put N=4, I expect 19.01.17.
Can you help me please?
You may use
^\*(?:Field[0-9]+=0\([^)]*\),){3}Field[0-9]+=0\(\K[^)]*
(demo) or a capturing group based pattern:
^\*(?:Field[0-9]+=0\([^)]*\),){3}Field[0-9]+=0\(([^)]*)
See the regex demo
Details
^ - start of a string
\* - a * char
(?:Field[0-9]+=0\([^)]*\),){3} - 3 sequences of:
Field - a Field substring
[0-9]+ - 1 or more digits
=0\( - a =0( substring
[^)]* - any 0+ chars other than )
\), - a ), substring
Field[0-9]+=0\( - a Field subtring followed with 1+ digits, and then =0( substring
([^)]*) - Group 1: any 0+ chars other than ).
How do I match the year such that it is general for the following examples.
a <- '"You Are There" (1953) {The Death of Socrates (399 B.C.) (#1.14)}'
b <- 'Þegar það gerist (1998/I) (TV)'
I have tried the following, but did not have the biggest success.
gsub('.+\\(([0-9]+.+\\)).?$', '\\1', a)
What I thought it did was to go until it finds a (, then it would make a group of numbers, then any character until it meets a ). And if there are several matches, I want to extract the first group.
Any suggestions to where I go wrong? I have been doing this in R.
You could use
library(stringr)
strings <- c('"You Are There" (1953) {The Death of Socrates (399 B.C.) (#1.14)}', 'Þegar það gerist (1998/I) (TV)')
years <- str_match(strings, "\\((\\d+(?: B\\.C\\.)?)")[,2]
years
# [1] "1953" "1998"
The expression here is
\( # (
(\d+ # capture 1+ digits
(?: B\.C\.)? # B.C. eventually
)
Note that backslashes need to be escaped in R.
Your pattern contains .+ parts that match 1 or more chars as many as possible, and at best your pattern could grab last 4 digit chunks from the incoming strings.
You may use
^.*?\((\d{4})(?:/[^)]*)?\).*
Replace with \1 to only keep the 4 digit number. See the regex demo.
Details
^ - start of string
.*? - any 0+ chars as few as possible
\( - a (
(\d{4}) - Group 1: four digits
(?: - start of an optional non-capturing group
/ - a /
[^)]* - any 0+ chars other than )
)? - end of the group
\) - a ) (OPTIONAL, MAY BE OMITTED)
.* - the rest of the string.
See the R demo:
a <- c('"You Are There" (1953) {The Death of Socrates (399 B.C.) (#1.14)}', 'Þegar það gerist (1998/I) (TV)', 'Johannes Passion, BWV. 245 (1725 Version) (1996) (V)')
sub("^.*?\\((\\d{4})(?:/[^)]*)?\\).*", "\\1", a)
# => [1] "1953" "1998" "1996"
Another base R solution is to match the 4 digits after (:
regmatches(a, regexpr("\\(\\K\\d{4}(?=(?:/[^)]*)?\\))", a, perl=TRUE))
# => [1] "1953" "1998" "1996"
The \(\K\d{4} pattern matches ( and then drops it due to \K match reset operator and then a (?=(?:/[^)]*)?\\)) lookahead ensures there is an optional / + 0+ chars other than ) and then a ). Note that regexpr extracts the first match only.
I have this vector:
jvm<-c("test - PROD_DB_APP_185b#SERVER01" ,"uat - PROD_DB_APP_SYS[1]#SERVER2")
I need to extract text until "[" or if there is no "[", then until the "#" character.
result should be
PROD_DB_APP_185b
PROD_DB_APP_SYS
I've tried something like this:
str_match(jvm, ".*\\-([^\\.]*)([.*)|(#.*)")
not working, any ides?
A sub solution with base R:
jvm<-c("test - PROD_DB_APP_185b#SERVER01" ,"uat - PROD_DB_APP_SYS[1]#SERVER2")
sub("^.*?\\s+-\\s+([^#[]+).*", "\\1", jvm)
See the online R demo
Details:
^ - start of string
.*? - any 0+ chars as few as possible
\\s+-\\s+ - a hyphen enclosed with 1 or more whitespaces
([^#[]+) - capturing group 1 matching any 1 or more chars other than #
and [
.* - any 0+ chars, up to the end of string.
Or a stringr solution with str_extract:
str_extract(jvm, "(?<=-\\s)[^#\\[]+")
See the regex demo
Details:
(?<=-\\s) - a positive lookbehind that matches an empty string that is preceded with a - and a whitespace immediately to the left of the current location
[^#\\[]+ - 1 or more chars other than # and [.
Regular Expression To exclude sub-string name(job corps)
Includes at least 1 upper case letter, 1 lower case letter, 1 number and 1 symbol except "#"
I have written something like below :
^((?!job corps).)(?=.*[0-9])(?=.*[a-z])(?=.*[A-Z])(?=.*[!#$%^&*]).*$
I tested with the above regular expression, not working for special character.
can anyone guide on this..
If I understand well your requirements, you can use this pattern:
^(?![^a-z]*$|[^A-Z]*$|[^0-9]*$|[^!#$%^&*]*$|.*?job corps)[^#]*$
If you only want to allow characters from [a-zA-Z0-9^#$%&*] changes the pattern to:
^(?![^a-z]*$|[^A-Z]*$|[^0-9]*$|[^!#$%^&*]*$|.*?job corps)[a-zA-Z0-9^#$%&*]*$
details:
^ # start of the string
(?! # not followed by any of these cases
[^a-z]*$ # non lowercase letters until the end
|
[^A-Z]*$ # non uppercase letters until the end
|
[^0-9]*$
|
[^!#$%^&*]*$
|
.*?job corps # any characters and "job corps"
)
[^#]* # characters that are not a #
$ # end of the string
demo
Note: you can write the range #$%& like #-& to win a character.
stribizhev, your answer is correct
^(?!.job corps)(?=.[0-9])(?=.[a-z])(?=.[A-Z])(?=.[!#$%^&])(?!.#).$
can verify the expression in following url:
http://www.freeformatter.com/regex-tester.html