Regular expression formula: extract all text matching a pattern

I have the following REGEXP_SUBSTR call in a formula field for a NetSuite workflow:
REGEXP_SUBSTR({title},'[S][0-9]+')
This works correctly if there is a string such as S1223548 in the {title} field.
However, if the title field has a string like the following,
FW: Invoice 1545478,S121548
I receive an "ERROR: Invalid Expression" error message.
Why is that? Shouldn't this search for the letter S followed by digits regardless of where that pattern occurs, i.e. at the start of the text string or in the middle?
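As a quick sanity check of the pattern itself, here is a minimal sketch in Python rather than NetSuite. It uses Python's re module as a stand-in regex engine, and first_match is a hypothetical helper name. It only shows that [S][0-9]+ is not anchored and can match mid-string; it does not reproduce or explain the "Invalid Expression" error.

```python
import re

# The pattern from the formula: a literal "S" followed by one or more digits.
PATTERN = re.compile(r"[S][0-9]+")

def first_match(title):
    """Return the first S-number found anywhere in title, or None."""
    found = PATTERN.search(title)
    return found.group(0) if found else None

print(first_match("S1223548"))
print(first_match("FW: Invoice 1545478,S121548"))
```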
EDIT: Full formula being used:

Related

Get only numbers inside parentheses: filter, custom filter, or what?

The string is "Some Words(1440)" and I want to store the numbers inside the parentheses as a variable in Twig so it can be output and used. I thought maybe I could do it with a split, but I wasn't able to escape the parentheses properly.
What I have:
Some Words (1440)
What I want to extract from the string is just the numbers in parentheses:
1440
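The underlying regex idea can be sketched in Python (not Twig): escape the literal parentheses with a backslash and capture the digits between them. number_in_parens is a hypothetical helper name, and whether a given Twig setup exposes a regex filter is not assumed here.

```python
import re

def number_in_parens(text):
    """Return the digits inside the first (...) pair, or None if there is none."""
    # \( and \) match literal parentheses; (\d+) captures the digits between them.
    found = re.search(r"\((\d+)\)", text)
    return found.group(1) if found else None

print(number_in_parens("Some Words (1440)"))
```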

How to extract a substring from main string starting from valid uuid using Lua

I have a main string as below
"/tmp/xjtscpdownload/7eb17cc6-b3c9-4ebd-945b-c0e0656a33f0/output/9999.317528060546245771146821638997525068657/"
From the main string I need to extract a substring starting from the uuid part
"/7eb17cc6-b3c9-4ebd-945b-c0e0656a33f0/output/9999.317528060546245771146821638997525068657/"
I tried
string.match("/tmp/xjtscpdownload/7eb17cc6-b3c9-4ebd-945b-c0e0656a33f0/output/9999.317528060546245771146821638997525068657/", "/[a-fA-F0-9]{8}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}-[a-fA-F0-9]{12}/(.)/(.)/$")
But no luck.
If you want to obtain
"/7eb17cc6-b3c9-4ebd-945b-c0e0656a33f0/output/9999.317528060546245771146821638997525068657/"
from
"/tmp/xjtscpdownload/7eb17cc6-b3c9-4ebd-945b-c0e0656a33f0/output/9999.317528060546245771146821638997525068657/"
or rather its three parts, 7eb17cc6-b3c9-4ebd-945b-c0e0656a33f0, output and 9999.317528060546245771146821638997525068657, since that is what your pattern attempt suggests, use the captures below. Otherwise leave out the parentheses in the following solution.
You can use a pattern like this:
local text = "/tmp/xjtscpdownload/7eb17cc6-b3c9-4ebd-945b-c0e0656a33f0/output/9999.317528060546245771146821638997525068657/"
print(text:match("/([%x%-]+)/([^/]+)/([^/]+)"))
"/([^/]+)/" captures at least one non-slash character between two slashes.
On your attempt:
You cannot give counts like {4} in a string pattern.
You have to escape - with % as it is a magic character.
(.) would only capture a single character.
Please read the Lua manual to find out what you did wrong and how to use string patterns properly.
You can also try this code:
s="/tmp/xjtscpdownload/7eb17cc6-b3c9-4ebd-945b-c0e0656a33f0/output/9999.317528060546245771146821638997525068657/"
print(s:match("/.-/.-(/.+)$"))
It skips the first two "fields" by using a non-greedy match.
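For comparison, regex engines that support counted repetition such as {8} can express the original attempt almost directly. The following is a minimal sketch in Python, not Lua; from_uuid and the UUID constant are hypothetical names chosen for illustration.

```python
import re

# Counted repetition like {8} is valid in Python's re module, unlike in Lua patterns.
UUID = r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"

def from_uuid(path):
    """Return the part of path starting at the slash before the first UUID, or None."""
    found = re.search("/" + UUID + "/.*", path)
    return found.group(0) if found else None

text = "/tmp/xjtscpdownload/7eb17cc6-b3c9-4ebd-945b-c0e0656a33f0/output/9999.317528060546245771146821638997525068657/"
print(from_uuid(text))
```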

Using R, how does one extract multiple URLs/pattern matches from a string in a dataset, and then place each URL in its own adjacent column?

I have a (large) dataset that initially consists of an identifier and associated text (in raw HTML). Oftentimes the text will include one or more embedded links. Here's a sample dataset:
id text
1 <p>I love dogs!</p>
2 <p>My <strong>favorite</strong> dog is this kind.</p>
3 <p>I've had both Labs and Huskies in my life.</p>
What I'd like as output (with the text column included in the same spot, but I removed it for visibility here) is:
id link1 link2
1
2 doge.com
3 labs.com huskies.com
I've tried using str_extract_all() paired with <a\s+(?:[^>]*?\s+)?href=(["'])(.*?)\1, but even when I double escape the backslashes I either get an "unexpected" error OR it keeps asking me for more and I have to Escape out. I feel like this method is the one I want and SHOULD work, but I can't seem to get the regex to play nicely. Here are my results so far:
> str_extract_all(text, "<a\s+(?:[^>]*?\s+)?href=(["'])(.*?)\1")
Error: '\s' is an unrecognized escape in character string starting ""<a\s"
> str_extract_all(text, perl(<a\s+(?:[^>]*?\s+)?href=(["'])(.*?)\1))
Error: unexpected '<' in "str_extract_all(text, perl(<"
> str_extract_all(text, "<a\\s+(?:[^>]*?\\s+)?href=(["'])(.*?)\\1")
+
> str_extract_all(text, perl(<a\\s+(?:[^>]*?\\s+)?href=(["'])(.*?)\\1))
Error: unexpected '<' in "str_extract_all(text, perl(<"
I've also tried parseURI from the XML package and for whatever reason it crashes my R session.
The other solutions I've found to date either only deal with single links, or return items in a list or vector altogether. I want to keep things separated by their identifier and in a dataset.
If needed, I could tolerate generating a separate dataset and merging them together, but there will be cases where there are no links, so I'd want to avoid any pitfalls of rows being deleted due to not having a value in any of the link columns.
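An unescaped double quote inside a double-quoted R string ends that string, and the single quote that follows opens a new one that never closes, which is why R keeps prompting with + in your third attempt: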
str_extract_all(text, "<a\\s+(?:[^>]*?\\s+)?href=(["'])(.*?)\\1")
R is still looking for the end of the string because the double quote inside the character class was not escaped. Within an R string literal a single \ starts an escape sequence (e.g. \n for a new line): \' produces a single quote and \" produces a double quote, so the pattern can be written like this:
str_extract_all(text, "<a\\s+(?:[^>]*?\\s+)?href=([\"'])(.*?)\\1")
"\ itself is a special character that needs escape, e.g. \\d. Do not
confuse these regular expressions with R escape sequences such as
\t."
or in your case \"
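The "one column per link, keyed by id" part of the question can be sketched as follows. This is Python rather than R, and the anchor markup in the sample rows is hypothetical because the links in the original sample did not survive. links_by_id is an illustrative helper name. Rows without links are kept and padded with None, so no identifier is dropped.

```python
import re

# Same pattern as in the question; group 2 holds the URL.
HREF = re.compile(r"""<a\s+(?:[^>]*?\s+)?href=(["'])(.*?)\1""")

def links_by_id(rows):
    """Turn (id, html) pairs into records with id, link1, link2, ... keys."""
    found = [(row_id, [m.group(2) for m in HREF.finditer(html)]) for row_id, html in rows]
    # Every record gets as many link columns as the row with the most links.
    width = max((len(links) for _, links in found), default=0)
    table = []
    for row_id, links in found:
        record = {"id": row_id}
        padded = links + [None] * (width - len(links))
        for i, link in enumerate(padded, start=1):
            record[f"link{i}"] = link
        table.append(record)
    return table

# Hypothetical sample markup, modelled on the desired output in the question.
rows = [
    (1, "<p>I love dogs!</p>"),
    (2, '<p>My favorite dog is <a href="doge.com">this kind</a>.</p>'),
    (3, "<p>I've had both <a href='labs.com'>Labs</a> and <a href='huskies.com'>Huskies</a>.</p>"),
]
result = links_by_id(rows)
for record in result:
    print(record)
```

For real-world HTML an actual parser is usually more robust than a regex; the regex is kept here only because it is what the question uses.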

TextPad Find Replace Commands Wild Cards

I am trying to figure out how I can put together a find and replace command with wildcards or figure out a way to find and replace the following example:
I would like to find terms that have a double quote in front of them and a single quote at the end:
Example:
find "joe' and replace with 'joe'
Basically, I'm trying to find all terms that have " in front and ' at the end.
Check the [x] Regular expression checkbox in TextPad's replace dialog and enter the following values:
Find what:
"([^'"]*)'
Replace with:
'\1'
Explanation:
In a regular expression, square brackets are used to indicate character classes. A character class beginning with a caret will match anything not in the class.
Thus [^'"] will match any character except ' and ". The following * indicates that any number of these characters (including none) can follow. The ( and ) mark a group, and the group we're looking for is preceded by " and followed by '. Finally, in the replace string we can refer to any group via \n, where n is the number of the group. In our case it is the first and only group, which is why we used \1.
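The same find/replace pair can be tried outside TextPad. Here is a minimal sketch using Python's re.sub, which uses the same \1 group reference in the replacement; fix_quotes is a hypothetical helper name.

```python
import re

# The "Find what" expression: a double quote, a captured run of characters
# that are neither ' nor ", then a single quote.
FIND = '"([^\'"]*)\''
# The "Replace with" expression: the captured group wrapped in single quotes.
REPLACE = r"'\1'"

def fix_quotes(text):
    """Rewrite every "term' as 'term' and leave other text untouched."""
    return re.sub(FIND, REPLACE, text)

print(fix_quotes('find "joe\' and replace'))
```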

How to escape a % sign in sqlite?

I do a full text search using a LIKE clause, and the text can contain a '%'.
What is a good way to search for a % sign in an sqlite database?
I did try
SELECT * FROM table WHERE text_string LIKE '%[%]%'
but that doesn't work in sqlite.
From the SQLite documentation:
If the optional ESCAPE clause is present, then the expression following the ESCAPE keyword must evaluate to a string consisting of a single character. This character may be used in the LIKE pattern to include literal percent or underscore characters. The escape character followed by a percent symbol (%), underscore (_), or a second instance of the escape character itself matches a literal percent symbol, underscore, or a single escape character, respectively.
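A minimal sketch of the ESCAPE clause in action, run through Python's built-in sqlite3 module. The table name notes and the sample rows are assumptions made for illustration, and the backslash is just one possible choice of escape character.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes (text_string TEXT)")
conn.executemany(
    "INSERT INTO notes VALUES (?)",
    [("98.9% done",), ("no percent here",), ("100%",)],
)

# ESCAPE '\' declares the backslash as the escape character, so '\%' is a
# literal percent sign while the outer '%' are still wildcards.
rows = conn.execute(
    r"SELECT text_string FROM notes WHERE text_string LIKE '%\%%' ESCAPE '\' ORDER BY rowid"
).fetchall()
matches = [row[0] for row in rows]
print(matches)
```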
We can achieve the same thing with the query below:
SELECT * FROM table WHERE instr(text_string, ?)>0
Here:
? => your search word
Example:
You can also give the text directly, like
SELECT * FROM table WHERE instr(text_string, '%')>0
SELECT * FROM table WHERE instr(text_string, '98.9%')>0 etc.
Hope this helps better.
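The instr() variant with a bound parameter can be sketched like this, again through Python's sqlite3 module. The notes table, its rows and the rows_containing helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes (text_string TEXT)")
conn.executemany(
    "INSERT INTO notes VALUES (?)",
    [("98.9% done",), ("no percent here",), ("100%",)],
)

def rows_containing(connection, needle):
    """Return every text_string that contains needle as a plain substring."""
    # instr() does no wildcard processing, so % and _ need no escaping.
    cursor = connection.execute(
        "SELECT text_string FROM notes WHERE instr(text_string, ?) > 0 ORDER BY rowid",
        (needle,),
    )
    return [row[0] for row in cursor]

print(rows_containing(conn, "%"))
print(rows_containing(conn, "98.9%"))
```

One difference worth keeping in mind: instr() compares characters exactly, whereas LIKE is case-insensitive for ASCII letters by default.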
