Query by first letters of each word and full-text search in one query - SQLite

Suppose I have such data:
Lena
Lera
Elena
Mark
Allen
Paul
When the user enters 'le', I need to return:
First, the words that start with those letters.
After that, all other words that contain 'le' anywhere (full-text search).
So it should return:
Lena
Lera
Allen
Elena
Something like this (this example does not return anything):
SELECT name FROM table WHERE name LIKE 'le%' AND like '%le%' ORDER BY name ASC
Thank you.

Since UNION turned out to mix up the order of the subselects, here is another approach: a custom ORDER BY. Order first by whether the name starts with your search term, then by name.
SELECT name
FROM table
WHERE UPPER(name) LIKE UPPER('%le%')
ORDER BY (CASE WHEN UPPER(name) LIKE UPPER('le%') THEN 1 ELSE 2 END),
name ASC
Comparing the strings with UPPER makes the match ignore case. However, UPPER apparently supports only ASCII characters; the linked article has more on this and also describes other ways to ignore case when comparing strings.
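To check the ordering end to end, here is a minimal sketch using Python's built-in sqlite3 module with the sample names from the question. The table name people and the helper search are placeholders of my own, not from the original post, and the search term is bound as a parameter instead of being pasted into the SQL.

```python
import sqlite3

def search(conn, term):
    """Return names containing `term`: prefix matches first, then alphabetical."""
    sql = """
        SELECT name
        FROM people
        WHERE UPPER(name) LIKE UPPER('%' || :term || '%')
        ORDER BY CASE WHEN UPPER(name) LIKE UPPER(:term || '%') THEN 1 ELSE 2 END,
                 name ASC
    """
    return [row[0] for row in conn.execute(sql, {"term": term})]

# Hypothetical table holding the sample data from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT)")
conn.executemany(
    "INSERT INTO people (name) VALUES (?)",
    [("Lena",), ("Lera",), ("Elena",), ("Mark",), ("Allen",), ("Paul",)],
)

print(search(conn, "le"))  # ['Lena', 'Lera', 'Allen', 'Elena']
```

Note that a search term containing % or _ would itself act as a wildcard in this sketch; escaping it is left out for brevity.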

Related

SQLite FTS search if there's a word that contains string

I want to select all rows that contains an string (for example abc)
example:
abc
abcd
0abc
0abcd
I want all the above to be returned.
My first approach was:
SELECT * FROM notes WHERE notes MATCH '^abc*'
but it returns 0 results.
Second was:
SELECT * FROM notes WHERE notes MATCH '*abc*'
but it returns an error (I believe that an asterisk can't be used as the first character).
How can I do this?
Thanks in advance
The issue is your use of ^: it anchors the match to the first token, so a string that does not start with abc is not a match. According to the SQLite documentation:
MATCH '^one' -- first token in any column must be "one"
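If the goal is a true substring match ("abc" anywhere in the text), one fallback is to skip MATCH and use an ordinary LIKE with wildcards on both sides. This is a swapped-in technique, not an FTS feature, and it scans every row instead of using the full-text index. A minimal sketch with Python's sqlite3 module; the table notes_plain and the column body are hypothetical names:

```python
import sqlite3

# Hypothetical plain (non-FTS) table with the sample rows plus one non-match.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes_plain (body TEXT)")
conn.executemany(
    "INSERT INTO notes_plain (body) VALUES (?)",
    [("abc",), ("abcd",), ("0abc",), ("0abcd",), ("xyz",)],
)

def contains(conn, needle):
    """Return every body that has `needle` anywhere inside it."""
    sql = "SELECT body FROM notes_plain WHERE body LIKE '%' || ? || '%' ORDER BY body"
    return [row[0] for row in conn.execute(sql, (needle,))]

print(contains(conn, "abc"))  # ['0abc', '0abcd', 'abc', 'abcd']
```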

How to extract specific string until blank space/next line from a text in Oracle?

I am trying to extract the following from a text field using regex in Oracle.
For example
"This is example,
and this really a example :h,j,j,j,j,
l //Updated question , as this letter is on the next line
now this is a disease:yes"
I am expecting a result as h,j,j,j,j,l, but if I use
REGEXP_SUBSTR(text_field,'example :[^:]+,') AS Result
I am getting example:h,j,j,j,j
But I am not getting the last letter 'l' as shown above, and I am guessing that's because it's on the next line. Also, if I could get only the string "disease:yes", that would be very helpful as well. Thank you very much!
The result you are getting is because your pattern includes the word 'example' and ends with a comma, which leaves out the trailing 'l'. Try this form instead. The example uses a common table expression (CTE): the WITH clause creates a table called tbl that just holds test data, much like a temp table. This is also a great way to set up data when asking a question. This form of REGEXP_SUBSTR() uses a capture group, which holds the characters after the string 'example :' up to the end of that line in the multi-line field. From this you should be able to get the other string you are after. Give it a go.
WITH tbl(text_field) AS (
SELECT 'This is example,
and this really a example :h,j,j,j,j,l
now this is a disease:yes' FROM dual
)
SELECT REGEXP_SUBSTR(text_field,'example :(.*)', 1, 1, NULL, 1) AS Result
FROM tbl;
RESULT
-----------
h,j,j,j,j,l
1 row selected.
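The same capture-group idea can be tried outside the database. The sketch below uses Python's re module as a stand-in, so it illustrates the pattern and not Oracle's REGEXP_SUBSTR() itself. The helper first_group is hypothetical, and the second pattern for the disease string is only a guess at what the asker wants.

```python
import re

text = (
    "This is example,\n"
    "and this really a example :h,j,j,j,j,l\n"
    "now this is a disease:yes"
)

def first_group(pattern, s):
    """Return capture group 1 of the first match, or None if nothing matches."""
    m = re.search(pattern, s)
    return m.group(1) if m else None

# Without a dot-matches-newline flag, .* stops at the end of the line.
print(first_group(r"example :(.*)", text))   # h,j,j,j,j,l
# Assumed pattern for the other string: 'disease:' plus non-space characters.
print(first_group(r"(disease:\S+)", text))   # disease:yes
```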
Edit based on new info: since that last letter could be on its own line, you'll need to allow for the newline. We switch to REGEXP_REPLACE() because we need to return multiple capture groups, and pass it the 'n' flag, which lets the dot (match any character) also match a newline. Here the WITH clause sets up 2 rows, one with an embedded newline in the data and one without. The capture groups are (going left to right): 1, the data after "example :" ending in a comma; 2, the optional newline; and 3, the next single character. The entire value is then replaced with capture groups 1 and 3 (leaving out the newline).
NOTE this is very specific to the case of only 1 character on the following line.
WITH tbl(ID, text_field) AS (
SELECT 1, 'This is example,
and this really a example :h,j,j,j,j,
l
now this is a disease:yes' FROM dual UNION ALL
SELECT 2, 'This is example,
and this really a example :h,j,j,j,j,l
now this is a disease:yes' FROM dual
)
SELECT ID,
REGEXP_REPLACE(text_field, '.*example :(.*,)('||CHR(10)||')?(.).*', '\1\3', 1, 1, 'n') AS Result
FROM tbl;
ID RESULT
---------- ------------
1 h,j,j,j,j,l
2 h,j,j,j,j,l
2 rows selected.
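For readers who want to step through the three capture groups, here is a rough Python analogue. It is a sketch under the assumption that re.DOTALL plays the role of Oracle's 'n' flag; it is not the Oracle call itself, and the helper extract is a hypothetical name.

```python
import re

# Group 1: data after "example :" ending in a comma; group 2: optional newline;
# group 3: the next single character. DOTALL lets '.' match newlines as well.
PATTERN = re.compile(r".*example :(.*,)(\n)?(.).*", re.DOTALL)

def extract(s):
    """Replace the whole value with groups 1 and 3, dropping the newline."""
    return PATTERN.sub(r"\1\3", s, count=1)

rows = {
    1: "This is example,\nand this really a example :h,j,j,j,j,\nl\nnow this is a disease:yes",
    2: "This is example,\nand this really a example :h,j,j,j,j,l\nnow this is a disease:yes",
}

for row_id, value in rows.items():
    print(row_id, extract(value))  # both rows print h,j,j,j,j,l
```

As in the SQL version, this only handles a single character on the following line.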

Postgres's query to select value in array by index

My data is a string like:
'湯姆 is a boy.'
or '梅isagirl.'
or '約翰,is,a,boy.'.
And I want to split the string and keep only the Chinese name.
In R, I can use the commands
tmp=strsplit(string,"[A-z% ]")
unlist(lapply(tmp,function(x)x[1]))
and then get the Chinese name I want.
But in PostgreSQL
select regexp_split_to_array(string,'[A-z% ]') from db.table
I get an array like {'湯姆','','',''},{'梅','','',''},...
And I don't know how to pick an item from the array.
I tried the command
select regexp_split_to_array(string,'[A-z% ]')[1] from db.table
and I get an error.
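I don't think that regexp_split_to_array is the appropriate function for what you are trying to do here. (The subscript itself fails only because a function call must be wrapped in parentheses before it can be indexed: (regexp_split_to_array(string,'[A-z% ]'))[1].) Instead, use regexp_replace to selectively remove all ASCII characters: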
SELECT string, regexp_replace(string, '[[:ascii:]~:;,"]+', '', 'g') AS name
FROM yourTable;
Note that you might have to adjust the set of characters to be removed, depending on what other non-Chinese characters you expect to have in the string column. This answer gives you a general suggestion for how you might proceed here.
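The idea of stripping ASCII instead of splitting can be checked quickly with the three sample strings. This is a Python sketch of the same idea, not the PostgreSQL call; the character range \x00-\x7F is my assumption for "all ASCII", and chinese_name is a hypothetical helper name.

```python
import re

def chinese_name(s):
    """Drop every run of ASCII characters (code points 0-127), keep the rest."""
    return re.sub(r"[\x00-\x7F]+", "", s)

samples = ["湯姆 is a boy.", "梅isagirl.", "約翰,is,a,boy."]
print([chinese_name(s) for s in samples])  # ['湯姆', '梅', '約翰']
```

Full-width punctuation or other non-ASCII symbols would survive this filter, which is the same caveat as in the note above.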

regexp_substr get last two words from end of the sentence in Oracle SQL

I have a string: ON P6B 0B8. The output I need is: P6B 0B8.
I can use regexp_substr('ON P6B 0B8','[^ ]+$',1) to get the last word from the end of the sentence. But how would I get the word after the space—the second word from the end?
How do I tell regexp_substr to not stop at the first space when looking from behind, and instead move on until it hits the second space?
I had a tough time understanding the metacharacters provided by Oracle regexp.
Here's a regex that will get the last 2 sets of characters from your string. Since it appears you are getting a Canadian postcode though you may want to be a little more careful.
The WITH clause sets up a table with data. Notice the first row is a valid postcode format, but the second row is bad (2 letters in a row). Always include unexpected data in your test cases: you don't want any surprises, and the data WILL always contain surprises.
The first regex (commented out below) matches 2 sets of 3 characters separated by a space at the end of the string. At first glance this may seem OK, but if the data is bad it will still get returned. To tighten it up, use the second regex, which specifically checks for the Canadian postcode format of uppercase_letter-digit-uppercase_letter-space-digit-uppercase_letter-digit and returns NULL if it is not found. Maybe you want to catch this with an NVL() call and return a message instead.
with tbl(str) as (
select 'Windsor ON P6B 0B8' from dual union all
select 'Windsor_bad_postcode ON A3C 9BB' from dual
)
select --regexp_substr(str, '.* (.{3} .{3})$', 1, 1, NULL, 1) postcode_w_bad
regexp_substr(str, '.* ([A-Z]\d[A-Z] \d[A-Z]\d)$', 1, 1, NULL, 1) postcode
from tbl;
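To see how the loose and strict patterns differ on the two test rows, here is a small Python sketch. It reuses the same two regular expressions with Python's re module, so it illustrates the patterns and not Oracle's regexp_substr; the names LOOSE, STRICT and postcode are hypothetical.

```python
import re

LOOSE = re.compile(r".* (.{3} .{3})$")                 # any 3 chars, space, any 3 chars
STRICT = re.compile(r".* ([A-Z]\d[A-Z] \d[A-Z]\d)$")   # letter-digit-letter digit-letter-digit

def postcode(pattern, s):
    """Return the captured postcode, or None when the pattern does not match."""
    m = pattern.search(s)
    return m.group(1) if m else None

good = "Windsor ON P6B 0B8"
bad = "Windsor_bad_postcode ON A3C 9BB"

print(postcode(LOOSE, good), postcode(LOOSE, bad))    # P6B 0B8 A3C 9BB
print(postcode(STRICT, good), postcode(STRICT, bad))  # P6B 0B8 None
```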

How to exclude certain characters from like condition in Oracle

I have multiple records in a table like the ones below. Each record holds multiple entries separated by #.
record1 - 123.45.56:ABCD:789:E # 1011.1213.1415:FGHI:1617:J #
record2 - 123.45.56:ABCD:1617:E # 1011.1213.1415:FGHI:12345:J #
I need to pass an argument to a different project/service, which builds an SQL query and sends the output to me.
Now if I send an argument like the one below, it gives me the wrong output:
123.45.56:*:1617
This recognizes both record1 and record2 as proper output because of the wildcard character. But as per my requirement only record2 is proper, as record1 has 123.45.56 in one entry and 1617 in a different entry.
Is there a way to construct an expression that tells the LIKE condition to ignore such invalid entries?
Please note that I can't change the query, as I am not constructing it. The only option I have is to tweak the expression that I send as the argument.
You need to make the pattern specific enough that it matches only record2 and not record1.
You can try:
SELECT *
FROM yourTable
WHERE col LIKE '%123.45.56:%' AND col LIKE '%1617:E #%'
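Another way to keep a match inside a single entry is to replace the open-ended wildcard with fixed-width _ placeholders, since _ matches exactly one character and so cannot run across a # separator into the next entry. The sketch below tries this in SQLite through Python's sqlite3 module purely for convenience. It assumes the middle field is always 4 characters long and that the service turns * into % and passes _ through unchanged, none of which is confirmed by the question. The table records and the helper matching_ids are hypothetical.

```python
import sqlite3

# Hypothetical table holding the two records from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id INTEGER, col TEXT)")
conn.executemany("INSERT INTO records VALUES (?, ?)", [
    (1, "123.45.56:ABCD:789:E # 1011.1213.1415:FGHI:1617:J #"),
    (2, "123.45.56:ABCD:1617:E # 1011.1213.1415:FGHI:12345:J #"),
])

def matching_ids(conn, pattern):
    """Return the ids of the records whose col matches the LIKE pattern."""
    rows = conn.execute(
        "SELECT id FROM records WHERE col LIKE ? ORDER BY id", (pattern,)
    )
    return [r[0] for r in rows]

# Open-ended wildcard: % can span the '#', so both records match.
print(matching_ids(conn, "%123.45.56:%:1617%"))     # [1, 2]
# Four single-character placeholders: only the same-entry match survives.
print(matching_ids(conn, "%123.45.56:____:1617%"))  # [2]
```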
