Remove a comma from a string with XQuery - xquery

I have an XQuery variable, $RequestinteractionIds, with a value like '47575','65656',
I would like to get rid of the last comma.
Please suggest a solution using XQuery (I am using Oracle's XQuery OSB).

A simpler regular expression for replace() that would do the job would be:
replace("'47575','65656',", "(.*),$", "$1")
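However, not everyone likes or understands regular expressions, so you may find it clearer to use tokenize and then string-join. Note that tokenize returns an empty string after the trailing comma, so that empty token has to be filtered out before joining:
string-join(tokenize("'47575','65656',", ",")[. ne ''], ",")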
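For readers who want to check the three approaches in this thread outside an XQuery processor, here is a rough Python sketch (Python is used only for illustration, and the function names are made up). Like XQuery's tokenize, Python's str.split keeps an empty string after a trailing separator, so the split-and-join variant has to drop empty tokens.

```python
import re

def strip_trailing_comma_regex(s: str) -> str:
    # Mirrors replace($s, "(.*),$", "$1"): keep everything before a final comma.
    return re.sub(r"(.*),$", r"\1", s)

def strip_trailing_comma_split(s: str) -> str:
    # "a,b," splits into ["a", "b", ""], so empty tokens are dropped before joining.
    return ",".join(part for part in s.split(",") if part)

def strip_trailing_comma_slice(s: str) -> str:
    # Mirrors the substring approach, but only cuts when a comma is actually there.
    return s[:-1] if s.endswith(",") else s
```

Note that the split variant also drops empty tokens in the middle of the string, which may or may not be what you want.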

Good old substring should work too:
let $RequestinteractionIds := "'47575','65656',"
return substring($RequestinteractionIds, 1, string-length($RequestinteractionIds) - 1)
HTH!

Related

Match everything up until first instance of a colon

Trying to code up a Regex in R to match everything before the first occurrence of a colon.
Let's say I have:
time = "12:05:41"
I'm trying to extract just the 12. My strategy was to do something like this:
grep(".+?(?=:)", time, value = TRUE)
But I'm getting the error that it's an invalid Regex. Thoughts?
Your regex looks fine to me. You are getting the error because lookaheads such as (?=:) are only supported with perl=TRUE. Even so, I don't think grep is the right tool here.
I would recommend using:
stringr::str_extract( time, "\\d+?(?=:)")
grep works differently from how it is being used here: it is good for filtering the elements of a vector that match a pattern, but it cannot pluck the matching part out of a string.
If you want to use Base R you can also go for sub:
sub("^(\\d+?)(?=:)(.*)$","\\1",time, perl=TRUE)
Alternatively, you can split the string using strsplit and take the first element, like below:
strsplit(time, ":")[[1]][1]
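The same two ideas (a lazy match followed by a lookahead, and a plain split) can be tried quickly in Python as a cross-check. This is only a sketch; the function names are hypothetical, and Python's re module is standing in for R's PCRE engine.

```python
import re
from typing import Optional

def before_first_colon_regex(text: str) -> Optional[str]:
    # \d+? is lazy; the lookahead (?=:) requires a colon right after the digits.
    match = re.search(r"\d+?(?=:)", text)
    return match.group(0) if match else None

def before_first_colon_split(text: str) -> str:
    # Split once on the first colon and keep the left-hand part.
    return text.split(":", 1)[0]
```

The regex version returns None when there is no colon, while the split version returns the whole string.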

Repeating a regex pattern for date parsing

I have the following string
"31032017"
and I want to use regular expressions in R to get
"31.03.2017"
What is the best function to do it?
And a general question: how can I repeat the matched part, as in sed in bash? There, we use \1 to repeat the first matched part.
You need to put the single parts in round brackets like this:
sub("([0-9]{2})([0-9]{2})([0-9]{4})", "\\1.\\2.\\3", "31032017")
You can then use \\1 to access the part matched by the first group, \\2 for the second and so on.
Note that if your string is a date, there are better ways to parse / reformat it than directly using regex.
date_vector = c("31032017","28052017","04052022")
as.character(format(as.Date(date_vector, format = "%d%m%Y"), format = "%d.%m.%Y"))
#[1] "31.03.2017" "28.05.2017" "04.05.2022"
If you want to work/do math with dates, omit as.character.
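For comparison, both routes (capture groups with backreferences, and real date parsing) look much the same in Python. This is an illustrative sketch with made-up function names, not part of the R answer.

```python
import re
from datetime import datetime

def dotted_by_regex(raw: str) -> str:
    # Three capture groups; \1, \2, \3 refer back to them in the replacement.
    return re.sub(r"(\d{2})(\d{2})(\d{4})", r"\1.\2.\3", raw)

def dotted_by_parsing(raw: str) -> str:
    # Parse as day, month, year, then format with dots.
    # Unlike the regex, this raises ValueError for impossible dates.
    return datetime.strptime(raw, "%d%m%Y").strftime("%d.%m.%Y")
```

The parsing route validates the input, which the pure regex route does not.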

time pattern in list.files function (R)

I'm trying to get a list of subdirectories from a path. These subdirectories have a time pattern month\day\hour, i.e. 03\21\11.
I naively used the following:
list.files("path",pattern="[0-9]\[0-9]\[0-9]", recursive = TRUE, include.dirs = TRUE)
But it doesn't work.
How do I write the digitdigit\digitdigit\digitdigit pattern here?
Thank you
This Regex works for 10\11\18.
(\d\d\\\d\d\\\d\d)
I think you may need lazy matching in the regex, unless there are always two digits, in which case the other responses look valid.
If you could provide a vector of file name strings, that would be very helpful.
Matching backslashes is confusing; I've found this thread helpful: R - gsub replacing backslashes
My guess is something like this: '[0-9]+?\\\\[0-9]+?\\\\[0-9]+'
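The confusion comes from two layers of escaping: the string literal and the regex engine each treat the backslash as special. A small Python sketch of the same layering follows (the names are hypothetical, and Python stands in for R here). In a raw string, \\ reaches the regex engine as two characters and matches one literal backslash. Without a raw string, each of those has to be doubled again, which is where the four backslashes in the R pattern come from.

```python
import re

# Raw string: the regex engine receives \d+\\\d+\\\d+
TIME_DIR = re.compile(r"\d+\\\d+\\\d+")

# The same pattern as an ordinary string literal needs every backslash doubled.
TIME_DIR_ESCAPED = "\\d+\\\\\\d+\\\\\\d+"

def find_time_dir(path: str):
    # Return the first digits\digits\digits run in the path, or None.
    match = TIME_DIR.search(path)
    return match.group(0) if match else None
```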

not only numbers regular expression

I need a regular expression that allows letters (English and Arabic) together with digits, but not digits only. It should also allow punctuation, spaces and multiple lines (\n). When I searched, I found this one:
(?!^\d+$)^.+$
but it doesn't allow multiple lines.
I tried to write my own, which is:
(([a-zA-Zء-ي\s:-])|([0-9]+[a-zA-Zء-ي\s:-])|([a-zA-Zء-ي\s:-]+[0-9]$))*
The problems with it are:
1. It doesn't accept a digit at the end of the string, as in employer9, but employer9 followed by a space works fine.
2. I have to list every punctuation mark that is allowed. Is there an easier way to do this?
You can probably use DOTALL or the s modifier for the regex. You could do this with:
(?s)(?!^\d+$)^.+$
...or you could use the compiler flags when constructing the regex.
An alternative not using DOTALL would be:
(?!^\d+$)^[\s\S]+$
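Both variants can be checked with a few sample inputs. The sketch below uses Python's re module as an assumption, since the question does not say which regex flavour is in use, and the helper name is invented for illustration.

```python
import re

# Variant 1: inline (?s) flag, so "." also matches newlines.
WITH_DOTALL = re.compile(r"(?s)(?!^\d+$)^.+$")

# Variant 2: no flag; [\s\S] matches any character, including newlines.
WITHOUT_DOTALL = re.compile(r"(?!^\d+$)^[\s\S]+$")

def is_accepted(pattern: "re.Pattern[str]", text: str) -> bool:
    # True when the text is non-empty and does not consist of digits only.
    return pattern.search(text) is not None
```

With either pattern, employer9 and multi-line text are accepted, while a string made up only of digits is rejected.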

XQuery : substring-before/after a specific char appearing many times?

I have this string ($str) : 67-89-90T, and I want to keep 67-89.
Of course substring-before($str,'-') returns 67, but how can I choose the 2nd dash?
An XPath 2.0 solution would be:
replace("67-89-90T", "(.*)-[^-]*$", "$1")
(This also works if there is no -, since the pattern then does not match and nothing is replaced.)
An XPath 2.0 solution using tokenization -- might be more readable and understandable than using RegEx:
string-join(tokenize(concat($vStr,'-'), '-')
[.][not(position() eq last())],
'-')
When run with an XQuery processor or with an XPath 2.0 processor, the expression:
string-join(tokenize(concat('-', '67-89-90T','-'), '-')
[.][not(position() eq last())],
'-')
produces the wanted, correct result:
67-89
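As a quick cross-check of the "everything before the last dash" logic, here is a small Python sketch. It is illustrative only, and the function names are made up. The greedy (.*) in the regex is what pushes the match to the last dash rather than the first.

```python
import re

def before_last_dash_regex(s: str) -> str:
    # Greedy (.*) consumes as much as possible, so "-[^-]*$" lands on the last dash.
    return re.sub(r"(.*)-[^-]*$", r"\1", s)

def before_last_dash_split(s: str) -> str:
    # rsplit with maxsplit=1 splits once, starting from the right.
    return s.rsplit("-", 1)[0]
```

Both return the input unchanged when it contains no dash.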
Ok I finally found functx:substring-before-last. Doesn't choose the specific dash but works for me in this case.
http://www.xqueryfunctions.com/xq/functx_substring-before-last.html
