How to use regexp_substr() to return the numbers after a specific word in a string

I have a table column full of strings like this:
'top-level:volume(1):semifinished(21491628):serial(21441769)'.
I would like to return just the numbers after 'serial' (i.e. '21441769') using regexp_substr().
select ('top-level:volume(1):semifinished(21491628):serial(21441769)', ????)

We can use REGEXP_SUBSTR with a capture group:
SELECT col, REGEXP_SUBSTR(col, 'serial\\((\\d+)\\)', 1, 1, 'e', 1) AS serial
FROM yourTable;
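The capture-group idea can be sanity-checked outside the database. Below is a minimal Python sketch of the same pattern using the standard `re` module; the helper name `extract_serial` is made up for illustration and is not part of any SQL dialect.

```python
import re

# Same idea as the SQL pattern: match the literal text "serial(",
# capture the digits, then require the closing parenthesis.
SERIAL_PATTERN = re.compile(r"serial\((\d+)\)")

def extract_serial(value):
    """Return the digits inside serial(...), or None if there is no match."""
    match = SERIAL_PATTERN.search(value)
    return match.group(1) if match else None

sample = "top-level:volume(1):semifinished(21491628):serial(21441769)"
print(extract_serial(sample))  # 21441769
```

Anchoring on the literal word `serial` is what keeps the number in `semifinished(...)` from being picked up.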

If your regex engine supports lookbehind, try this: "(?<=serial\()[0-9]+"

Related

VALUES clause in SQLAlchemy without column name (to be compatible with sqlite)

I have seen the answers to "VALUES clause in SQLAlchemy", but none of them was satisfactory. Basically, SQLAlchemy forces you to give each column a name, building the query as
SELECT * FROM (VALUES (1, 2, 3)) AS sq (colname1, colname2);
instead of using the default names "column1, column2, ..." that you get when you don't specify (colname1, colname2). The problem is that specifying the column names this way is not compatible with SQLite. Do you know any way of doing that? I am thinking of using a raw text query. The problem with that is that my full query is
SELECT pairs.column1 AS element_id,
pairs.column2 as variant_id,
products_elements.name as element_name,
elements_variants.name as variant_name
FROM (
VALUES (1, 2),
(2, 2),
(3, 1)
) AS pairs
JOIN (products_elements, elements_variants) ON (
products_elements.id = pairs.column1
AND elements_variants.id = pairs.column2
);
and I don't know how to embed the values. Thanks
If you want a raw query, you can name the columns with a CTE:
WITH pairs(colname1, colname2) AS (VALUES (1, 2), (2, 2), (3, 1))
SELECT pairs.colname1 AS element_id,
pairs.colname2 AS variant_id,
products_elements.name AS element_name,
elements_variants.name AS variant_name
FROM pairs
JOIN products_elements ON products_elements.id = pairs.colname1
JOIN elements_variants ON elements_variants.id = pairs.colname2;
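As for embedding the values, one option is to generate a `(?, ?)` placeholder group per pair and bind the numbers as parameters. The sketch below uses Python's standard `sqlite3` module rather than SQLAlchemy, and the table contents ('frame', 'red', and so on) are invented sample data, so treat it as an illustration of the CTE approach rather than a drop-in solution.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data standing in for the real tables.
conn.executescript("""
    CREATE TABLE products_elements (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE elements_variants (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO products_elements VALUES (1, 'frame'), (2, 'wheel'), (3, 'seat');
    INSERT INTO elements_variants VALUES (1, 'red'), (2, 'blue');
""")

pairs = [(1, 2), (2, 2), (3, 1)]
# One "(?, ?)" group per pair, so the values are bound, not pasted into the SQL.
placeholders = ", ".join(["(?, ?)"] * len(pairs))
query = f"""
    WITH pairs(colname1, colname2) AS (VALUES {placeholders})
    SELECT pairs.colname1 AS element_id,
           pairs.colname2 AS variant_id,
           products_elements.name AS element_name,
           elements_variants.name AS variant_name
    FROM pairs
    JOIN products_elements ON products_elements.id = pairs.colname1
    JOIN elements_variants ON elements_variants.id = pairs.colname2
    ORDER BY element_id
"""
# Flatten [(1, 2), (2, 2), ...] into [1, 2, 2, 2, ...] to match the placeholders.
flat_params = [value for pair in pairs for value in pair]
rows = conn.execute(query, flat_params).fetchall()
for row in rows:
    print(row)
```

Only the placeholder text is interpolated into the SQL string; the actual values travel as bound parameters.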

SQLite - Extract substring between delimiters for REPLACE function

I have a column field: location. I need to extract the string between the first and second delimiter ('/').
I already have a column name where I ltrim to the first '/'. I've tried to create a similar query with a combination of rtrim, replace, substr as my source column to no avail. Here is what my data looks like. I want to extract AML, for example. Right now, there are only three options (value1, value2, value3) between the first and second delimiters, but there could be more later.
Attribute data
----------+--------------------------------------------------------------------------------------------------------------------
Field | First value
----------+--------------------------------------------------------------------------------------------------------------------
location | './AML/Counties/*****************kyaml_20190416_transparent_mosaic_group1.tif'
name | 'kyaml_20190416_transparent_mosaic_group1.tif'
----------+--------------------------------------------------------------------------------------------------------------------
What is the best way of creating my column source with the value from location?
Output should be like this:
Attribute data
----------+--------------------------------------------------------------------------------------------------------------------
Field | First value
----------+--------------------------------------------------------------------------------------------------------------------
location | './AML/Counties/****************kyaml_20190416_transparent_mosaic_group1.tif'
name | 'kyaml_20190416_transparent_mosaic_group1.tif'
source | 'AML'
----------+--------------------------------------------------------------------------------------------------------------------
With substr() and instr():
select *,
substr(
substr(location, instr(location, '/') + 1),
1,
instr(substr(location, instr(location, '/') + 1), '/') - 1
) as source
from data
I used forpas's query to modify mine. Here is my final query:
ogrinfo box_tiles.shp -dialect SQLITE -sql \
"UPDATE box_tiles SET source = \
substr(\
substr(location, instr(location, '/') + 1), 1, \
instr(substr(location, instr(location, '/') + 1), '/') - 1)"
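The nested substr()/instr() expression can be checked against an in-memory SQLite database. This is a small sketch using Python's `sqlite3` module; the table name `data` follows the answer above, and the sample path is a shortened, assumed version of the one in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (location TEXT, name TEXT)")
# Shortened sample row (assumption: the real paths have the same './X/...' shape).
conn.execute(
    "INSERT INTO data VALUES (?, ?)",
    ("./AML/Counties/kyaml_20190416_transparent_mosaic_group1.tif",
     "kyaml_20190416_transparent_mosaic_group1.tif"),
)

query = """
    SELECT substr(
             substr(location, instr(location, '/') + 1),  -- text after the first '/'
             1,
             instr(substr(location, instr(location, '/') + 1), '/') - 1  -- up to the next '/'
           ) AS source
    FROM data
"""
source = conn.execute(query).fetchone()[0]
print(source)  # AML
```

Because the expression only looks for the first two slashes, it keeps working if more values than the current three appear between them.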

order of search for Sqlite's "IN" operator guaranteed?

I'm performing an Sqlite3 query similar to
SELECT * FROM nodes WHERE name IN ('name1', 'name2', 'name3', ...) LIMIT 1
Am I guaranteed that it will search for name1 first, name2 second, etc? Such that by limiting my output to 1 I know that I found the first hit according to my ordering of items in the IN clause?
Update: with some testing it seems to always return the first hit in the index regardless of the IN order. It's using the order of the index on name. Is there some way to enforce the search order?
The order of the returned rows is not guaranteed to match the order of the items inside the parentheses after IN.
What you can do is use ORDER BY in your statement with the use of the function INSTR():
SELECT * FROM nodes
WHERE name IN ('name1', 'name2', 'name3')
ORDER BY INSTR(',name1,name2,name3,', ',' || name || ',')
LIMIT 1
This code reuses the list from the IN clause as a single string, with the items in the same order and separated by commas (assuming that no item contains a comma).
The results are then ordered by their position in that string, so LIMIT 1 returns the match that is closest to the start of the list.
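Here is a minimal sketch of the INSTR() ordering driven from Python's `sqlite3` module. The node names and the index name are invented for the demo, and the comma-joined string assumes that no name contains a comma.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nodes (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE INDEX idx_nodes_name ON nodes(name)")
conn.executemany("INSERT INTO nodes (name) VALUES (?)",
                 [("alpha",), ("beta",), ("gamma",)])

wanted = ["gamma", "alpha"]  # preferred order: gamma first, then alpha
placeholders = ", ".join(["?"] * len(wanted))
# ',gamma,alpha,' : the position of ',name,' in this string gives the rank.
ordered_list = "," + ",".join(wanted) + ","

query = f"""
    SELECT name FROM nodes
    WHERE name IN ({placeholders})
    ORDER BY INSTR(?, ',' || name || ',')
    LIMIT 1
"""
first_hit = conn.execute(query, [*wanted, ordered_list]).fetchone()[0]
print(first_hit)  # gamma
```

Without the ORDER BY, the row that comes back depends on how SQLite chooses to scan the table or index, which is why 'alpha' could otherwise win here.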
Another way to achieve the same results is by using a CTE which returns the list along with an Id which serves as the desired ordering of the results, which will be joined to the table:
WITH list(id, item) AS (
SELECT 1, 'name1' UNION ALL
SELECT 2, 'name2' UNION ALL
SELECT 3, 'name3'
)
SELECT n.*
FROM nodes n INNER JOIN list l
ON l.item = n.name
ORDER BY l.id
LIMIT 1
Or:
WITH list(id, item) AS (
SELECT * FROM (VALUES
(1, 'name1'), (2, 'name2'), (3, 'name3')
)
)
SELECT n.*
FROM nodes n INNER JOIN list l
ON l.item = n.name
ORDER BY l.id
LIMIT 1
This way you don't have to write the list twice.
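If the list comes from application code, the CTE can be generated from it so that the list really is written only once. The sketch below assumes Python's `sqlite3` module and made-up table contents; it builds the `VALUES` rows with `enumerate()` so that each item carries its rank.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nodes (id INTEGER PRIMARY KEY, name TEXT)")
# Sample rows: 'name1' is deliberately missing from the table.
conn.executemany("INSERT INTO nodes (name) VALUES (?)",
                 [("name3",), ("name2",), ("name9",)])

wanted = ["name1", "name2", "name3"]
values_sql = ", ".join(["(?, ?)"] * len(wanted))
# Flatten (1, 'name1'), (2, 'name2'), ... into one parameter list.
params = [x for pair in enumerate(wanted, start=1) for x in pair]

query = f"""
    WITH list(id, item) AS (VALUES {values_sql})
    SELECT n.name
    FROM nodes n INNER JOIN list l ON l.item = n.name
    ORDER BY l.id
    LIMIT 1
"""
first_hit = conn.execute(query, params).fetchone()[0]
print(first_hit)  # name2
```

Since 'name1' has no matching row, the join falls through to the next item in the preferred order.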

Repeat a command while true or x times (equivalent of while/for loop)

I would like to repeat this command for as long as there is still sometext in the field note (several rows from the table itemNotes could have one or more occurrences of sometext in the field note):
UPDATE itemNotes
SET
note = SUBSTR(note, 0, INSTR(LOWER(note), 'sometext')) || 'abc' || SUBSTR(note, INSTR(LOWER(note), 'sometext')+sometext_len)
WHERE
INSTR(LOWER(note), 'sometext') >= 0;
So a proto-code would be :
While (SELECT * FROM itemNotes WHERE note like "%sometext%") >1
UPDATE itemNotes
SET
note = SUBSTR(note, 0, INSTR(LOWER(note), 'sometext')) || 'abc' || SUBSTR(note, INSTR(LOWER(note), 'sometext')+sometext_len)
WHERE
INSTR(LOWER(note), 'sometext') >= 0;
END
But apparently SQLite3 doesn't support while or for loops. They can be emulated with something like this, but I have difficulty integrating what I want into this query:
WITH b(x,y) AS
(
SELECT 1,2
UNION ALL
SELECT x+ 1, y + 1
FROM b
WHERE x < 20
) SELECT * FROM b;
Any idea how to do this?
PS: I don't use replace because I want to replace all the case combinations of sometext (e.g. sometext, SOMEtext, SOmeText...) cf this question
Current input and desired output:
For a single row, a note field could look like this (and many rows in the table itemNotes could look like this one):
There is SOmetext and also somETExt and more SOMETEXT and even more sometext
The query should output:
There is abc and also abc and more abc and even more abc
I am doing it on the zotero.sqlite, which is created by this file (line 85). The table is created by this query
CREATE TABLE itemNotes (
itemID INTEGER PRIMARY KEY,
parentItemID INT,
note TEXT,
title TEXT,
FOREIGN KEY (itemID) REFERENCES items(itemID) ON DELETE CASCADE,
FOREIGN KEY (parentItemID) REFERENCES items(itemID) ON DELETE CASCADE
);
You already have your answer in your query:
UPDATE itemNotes
SET
note = SUBSTR(note, 0, INSTR(LOWER(note), 'sometext')) || 'abc' || SUBSTR(note, INSTR(LOWER(note), 'sometext')+sometext_len)
WHERE
note LIKE '%sometext%';
This updates every row that contains sometext in the note field; each run replaces only the first occurrence in a row.
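Since SQLite has no loop statement, the "repeat while there is still a match" part can live in the calling code. The following is a rough sketch, assuming Python's `sqlite3` module and a cut-down itemNotes table: it re-runs the UPDATE until no row changes. It uses `SUBSTR(note, 1, pos - 1)` for the prefix and `LENGTH()` in place of the `sometext_len` placeholder.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Cut-down version of the table, for demonstration only.
conn.execute("CREATE TABLE itemNotes (itemID INTEGER PRIMARY KEY, note TEXT)")
conn.execute("INSERT INTO itemNotes (note) VALUES (?)",
             ("There is SOmetext and also somETExt and more SOMETEXT",))

keyword, replacement = "sometext", "abc"
# Guard: the loop would never end if the replacement re-introduced the keyword.
assert keyword not in replacement.lower()

sql = """
    UPDATE itemNotes
    SET note = SUBSTR(note, 1, INSTR(LOWER(note), :kw) - 1)
               || :rep
               || SUBSTR(note, INSTR(LOWER(note), :kw) + LENGTH(:kw))
    WHERE INSTR(LOWER(note), :kw) > 0
"""
# Each pass replaces the first remaining occurrence in every matching row;
# stop as soon as a pass changes nothing.
while conn.execute(sql, {"kw": keyword, "rep": replacement}).rowcount > 0:
    pass

note = conn.execute("SELECT note FROM itemNotes").fetchone()[0]
print(note)  # There is abc and also abc and more abc
```

Note that the condition is `> 0`: INSTR() returns 0 when there is no match, so a `>= 0` test would match every row.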
UPDATE
If you want to replace multiple occurrences in different cases while keeping the rest of the text, the simplest solution, in my opinion, is to use a regex, and for that you need an extension:
UPDATE itemNotes
SET
note = regex_replace('\bsometext\b',note,'abc')
WHERE
note LIKE '%sometext%';
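SQLite itself does not ship a `regex_replace` function, so the statement above only works once something provides it. One way to try the idea without a compiled extension is to register a Python function under that name; the sketch below does this with `sqlite3.Connection.create_function`, keeps the same argument order (pattern, text, replacement), and is a stand-in rather than the extension the answer refers to.

```python
import re
import sqlite3

def regex_replace(pattern, text, replacement):
    """Case-insensitive regex replacement; NULL input stays NULL."""
    if text is None:
        return None
    return re.sub(pattern, replacement, text, flags=re.IGNORECASE)

conn = sqlite3.connect(":memory:")
# Expose the Python function to SQL as a 3-argument scalar function.
conn.create_function("regex_replace", 3, regex_replace)

# Cut-down version of the table, for demonstration only.
conn.execute("CREATE TABLE itemNotes (itemID INTEGER PRIMARY KEY, note TEXT)")
conn.execute(
    "INSERT INTO itemNotes (note) VALUES (?)",
    ("There is SOmetext and also somETExt and more SOMETEXT and even more sometext",),
)

conn.execute(
    "UPDATE itemNotes SET note = regex_replace(?, note, ?) WHERE note LIKE ?",
    (r"\bsometext\b", "abc", "%sometext%"),
)
note = conn.execute("SELECT note FROM itemNotes").fetchone()[0]
print(note)  # There is abc and also abc and more abc and even more abc
```

The function exists only on the connection that registered it, so other tools opening the same database file will not see it.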
As recommended by Stephan in his last comment, I used python to do this.
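Here is my code:
import re
import sqlite3

keyword = "sometext"
replacement = "abc"
path_to_sqlite = "zotero.sqlite"  # assumption: adjust to the location of your database

# Compile once: re.escape() treats the keyword literally, IGNORECASE matches any casing.
pattern = re.compile(re.escape(keyword), re.IGNORECASE)

db = sqlite3.connect(path_to_sqlite)
cursor = db.cursor()
# Bind the search term as a parameter; LIKE is case-insensitive for ASCII by default.
cursor.execute("SELECT itemID, note FROM itemNotes WHERE note LIKE ?", (f"%{keyword}%",))
for item_id, note in cursor.fetchall():
    new_note = pattern.sub(replacement, note)
    # UPDATE leaves parentItemID and title untouched; REPLACE INTO would
    # delete the row and re-insert it with those columns set to NULL.
    cursor.execute("UPDATE itemNotes SET note = ? WHERE itemID = ?", (new_note, item_id))
db.commit()
db.close()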

show the digits between underscore and dot

In Oracle, I want to show the digits between the underscore and the dot.
For example, for pSE1001335806_17950.dat the column should be 17950,
but for pSE1001311462_4558.dat the column should be 4558.
How can I do that?
You can use a regular expression for this. If you can generalise the requirement to "the second block of alphanumeric characters", then something this simple would work:
regexp_substr(<value>, '([[:alnum:]]+)', 1, 2)
With a CTE to generate your sample values:
with files as
(
select 'pSE1001335806_17950.dat' as filename from dual
union all select 'pSE1001311462_4558.dat' from dual
)
select regexp_substr(filename, '([[:alnum:]]+)', 1, 2)
from files;
REGEXP_SUBSTR(FILENAME,'([[:ALNUM:]]+)',1,2)
--------------------------------------------------------------------------------
17950
4558
One way:
select regexp_replace('pSE1001335806_17950.dat','.*_([0-9]+)\.dat','\1') from dual
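To see what the patterns are meant to capture, the same extraction can be tried outside Oracle. This is a small Python sketch with the standard `re` module; the helper name `digits_between` is invented for illustration.

```python
import re

# An underscore, then one or more digits (captured), then a literal dot.
DIGITS_PATTERN = re.compile(r"_(\d+)\.")

def digits_between(filename):
    """Return the digits between '_' and '.', or None if there are none."""
    match = DIGITS_PATTERN.search(filename)
    return match.group(1) if match else None

for name in ("pSE1001335806_17950.dat", "pSE1001311462_4558.dat"):
    print(name, "->", digits_between(name))
```

Escaping the dot matters: an unescaped `.` matches any character, not just the one before the file extension.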
