How to repeat (recursive) query, again and again? - sqlite

[EDIT]
How can I add an "IF" condition to the iteration? Sometimes there are no tags at all,
and in that case I do not want the row to be modified.
I wrote a query like this:
UPDATE myTable SET myCOL =
substr(myCOL, 1, instr(myCOL, '<Tag>') - 1)
|| '■' ||
substr(myCOL, instr(myCOL, '</Tag>') + 6, length(myCOL));
Because I'd like to achieve this:
myVar := "abc<Tag>BuLah..BuLah..</Tag>def"
myGoal := "abc■def"
So far so good.
Now, in the real world, myVar looks like this:
myVar := "abc<Tag>BuLah1..</Tag>def..ghi<Tag>BuLah2..</Tag>jkl"
How can I achieve my goal when a row contains more than one tag?
Thanks..
[EDIT] Text files attached for the moment.
Link Deleted...
simple extracted text, all ROWs - 811 rows
simple extracted text, all ROWs Tags processed
simple extracted text, Tags ROWs only - 18 rows, 42 tags.
Without this query, the run finishes instantly. With this query, it takes 71 seconds to finish, so I suspect I have introduced a bug somewhere. The other XML parts are handled by another query; this query deals only with the tags mentioned above.

With a recursive CTE:
with recursive cte as (
select myCOL,
substr(myCOL, 1, instr(myCOL, '<Tag>') - 1)
|| '■' ||
substr(myCOL, instr(myCOL, '</Tag>') + 6, length(myCOL)) newCOL
from myTable
where myCOL like '%<Tag>%'
union all
select c.myCOL,
substr(c.newCOL, 1, instr(c.newCOL, '<Tag>') - 1)
|| '■' ||
substr(c.newCOL, instr(c.newCOL, '</Tag>') + 6, length(c.newCOL)) newCOL
from cte c
where c.newCOL like '%<Tag>%'
)
update myTable
set myCOL = (
select newCOL from cte
where myTable.myCOL = cte.myCOL and cte.newCOL not like '%<Tag>%'
)
where myCOL like '%<Tag>%';
See the demo.
Results:
| myCOL |
| ---------------- |
| abc■def |
| abc■def..ghi■jkl |
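The statement above can be tried out from Python's built-in sqlite3 module. This is only a minimal sketch: the in-memory database, the helper name strip_tags and the third sample row are assumptions made for illustration, while the SQL is the recursive CTE from this answer. It also covers the "IF" question from the edit: a row without any tag never matches the WHERE clause of the UPDATE, so it is left untouched.

```python
import sqlite3

# The recursive CTE from the answer, unchanged apart from layout.
STRIP_TAGS_SQL = """
with recursive cte as (
  select myCOL,
         substr(myCOL, 1, instr(myCOL, '<Tag>') - 1)
         || '■' ||
         substr(myCOL, instr(myCOL, '</Tag>') + 6, length(myCOL)) newCOL
  from myTable
  where myCOL like '%<Tag>%'
  union all
  select c.myCOL,
         substr(c.newCOL, 1, instr(c.newCOL, '<Tag>') - 1)
         || '■' ||
         substr(c.newCOL, instr(c.newCOL, '</Tag>') + 6, length(c.newCOL)) newCOL
  from cte c
  where c.newCOL like '%<Tag>%'
)
update myTable
set myCOL = (
  select newCOL from cte
  where myTable.myCOL = cte.myCOL and cte.newCOL not like '%<Tag>%'
)
where myCOL like '%<Tag>%';
"""


def strip_tags(rows):
    """Load rows into a throwaway table, run the UPDATE, return the new values."""
    con = sqlite3.connect(":memory:")
    con.execute("create table myTable (myCOL text)")
    con.executemany("insert into myTable values (?)", [(r,) for r in rows])
    con.execute(STRIP_TAGS_SQL)
    result = [r[0] for r in con.execute("select myCOL from myTable order by rowid")]
    con.close()
    return result


cleaned = strip_tags([
    "abc<Tag>BuLah..BuLah..</Tag>def",
    "abc<Tag>BuLah1..</Tag>def..ghi<Tag>BuLah2..</Tag>jkl",
    "no tags here",  # hypothetical row without tags; it must stay as it is
])
print(cleaned)
```

Each recursion step removes one tag pair, so a row with n tags contributes n rows to the CTE and only the last one (the one without any remaining tag) is written back.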

Related

Repeat a command while true or x times (equivalent of while/for loop)

I would like to repeat this command for as long as sometext still occurs in the field note (several rows of the table itemNotes could contain sometext one or more times in the field note):
UPDATE itemNotes
SET
note = SUBSTR(note, 0, INSTR(LOWER(note), 'sometext')) || 'abc' || SUBSTR(note, INSTR(LOWER(note), 'sometext')+sometext_len)
WHERE
INSTR(LOWER(note), 'sometext') >= 0;
So the pseudocode would be:
While (SELECT * FROM itemNotes WHERE note like "%sometext%") >1
UPDATE itemNotes
SET
note = SUBSTR(note, 0, INSTR(LOWER(note), 'sometext')) || 'abc' || SUBSTR(note, INSTR(LOWER(note), 'sometext')+sometext_len)
WHERE
INSTR(LOWER(note), 'sometext') >= 0;
END
But apparently SQLite3 doesn't support WHILE or FOR loops. They can be emulated with something like this, but I have difficulty integrating what I want into this query:
WITH b(x,y) AS
(
SELECT 1,2
UNION ALL
SELECT x+ 1, y + 1
FROM b
WHERE x < 20
) SELECT * FROM b;
Any idea how to do this?
PS: I don't use replace because I want to replace all the case combinations of sometext (e.g. sometext, SOMEtext, SOmeText...); cf. this question.
Current input and desired output:
For a single row, a note field could look like this (and many rows in the table itemNotes could look like this one):
There is SOmetext and also somETExt and more SOMETEXT and even more sometext
The query should output:
There is abc and also abc and more abc and even more abc
I am doing it on the zotero.sqlite, which is created by this file (line 85). The table is created by this query
CREATE TABLE itemNotes (
itemID INTEGER PRIMARY KEY,
parentItemID INT,
note TEXT,
title TEXT,
FOREIGN KEY (itemID) REFERENCES items(itemID) ON DELETE CASCADE,
FOREIGN KEY (parentItemID) REFERENCES items(itemID) ON DELETE CASCADE
);
You already have the answer in your query:
UPDATE itemNotes
SET
note = SUBSTR(note, 0, INSTR(LOWER(note), 'sometext')) || 'abc' || SUBSTR(note, INSTR(LOWER(note), 'sometext')+sometext_len)
WHERE
note LIKE '%sometext%';
It will update all rows that contain sometext in the note field, replacing the first occurrence in each row.
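Since one run of that UPDATE replaces only the first occurrence in each row, the "while" loop from the question can live in the calling program rather than in SQL. The following is a rough sketch under stated assumptions: Python's sqlite3 module drives the loop, the helper name replace_all and the in-memory table are made up for illustration, and the statement is rerun until it reports that no row changed.

```python
import sqlite3

# One pass replaces the first case-insensitive occurrence in every matching row.
REPLACE_ONE_SQL = """
UPDATE itemNotes
SET note = SUBSTR(note, 1, INSTR(LOWER(note), LOWER(:needle)) - 1)
           || :replacement
           || SUBSTR(note, INSTR(LOWER(note), LOWER(:needle)) + LENGTH(:needle))
WHERE INSTR(LOWER(note), LOWER(:needle)) > 0
"""


def replace_all(con, needle, replacement):
    """Rerun the single-occurrence UPDATE until no row changes; return the pass count."""
    if needle.lower() in replacement.lower():
        # Otherwise every pass would create a new match and the loop would not end.
        raise ValueError("replacement must not contain the keyword")
    passes = 0
    while True:
        cur = con.execute(REPLACE_ONE_SQL, {"needle": needle, "replacement": replacement})
        if cur.rowcount == 0:  # nothing matched, so every occurrence is gone
            break
        passes += 1
    con.commit()
    return passes


con = sqlite3.connect(":memory:")  # stand-in for the real database file
con.execute("CREATE TABLE itemNotes (itemID INTEGER PRIMARY KEY, note TEXT)")
con.execute(
    "INSERT INTO itemNotes (note) VALUES (?)",
    ("There is SOmetext and also somETExt and more SOMETEXT and even more sometext",),
)
passes = replace_all(con, "sometext", "abc")
note = con.execute("SELECT note FROM itemNotes").fetchone()[0]
print(passes, note)
```

The number of passes equals the largest number of occurrences found in any single row, so the loop stays short even on a big table.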
UPDATE
If you want to replace multiple occurrences in different cases within the field and keep the rest of the text, the simplest solution, in my opinion, is to use a regex, and for that you need an extension:
UPDATE itemNotes
SET
note = regex_replace('\bsometext\b',note,'abc')
WHERE
note LIKE '%sometext%';
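When the database is driven from Python, a loadable extension is not strictly required: sqlite3's create_function can register an ordinary Python function so that SQL can call it. The sketch below assumes that setup; the function name ireplace and its argument order are hypothetical and are not the regex_replace of any particular extension.

```python
import re
import sqlite3


def ireplace(text, keyword, replacement):
    """Replace every occurrence of keyword in text, ignoring case."""
    if text is None:
        return None
    # re.escape makes the keyword literal; the lambda keeps the replacement literal too.
    return re.sub(re.escape(keyword), lambda _match: replacement, text, flags=re.IGNORECASE)


con = sqlite3.connect(":memory:")  # stand-in for the real database file
# Expose the Python function to SQL under the hypothetical name "ireplace".
con.create_function("ireplace", 3, ireplace)
con.execute("CREATE TABLE itemNotes (itemID INTEGER PRIMARY KEY, note TEXT)")
con.executemany("INSERT INTO itemNotes (note) VALUES (?)", [
    ("There is SOmetext and also somETExt and more SOMETEXT and even more sometext",),
    ("nothing to change here",),
])
# A single UPDATE now handles every occurrence in every matching row.
con.execute(
    "UPDATE itemNotes SET note = ireplace(note, ?, ?) WHERE note LIKE ?",
    ("sometext", "abc", "%sometext%"),
)
notes = [row[0] for row in con.execute("SELECT note FROM itemNotes ORDER BY itemID")]
print(notes)
```

The function exists only on the connection that registered it, so other tools opening the same file will not see it.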
As recommended by Stephan in his last comment, I used Python to do this.
Here is my code:
import re
import sqlite3
keyword = "sometext"
replacement = "abc"
path_to_sqlite = "zotero.sqlite"  # adjust to the location of your database
# Compile once; re.escape treats the keyword as literal text.
pattern = re.compile(re.escape(keyword), re.IGNORECASE)
db = sqlite3.connect(path_to_sqlite)
cursor = db.cursor()
# Bind the LIKE pattern as a parameter instead of formatting it into the SQL.
cursor.execute("SELECT itemID, note FROM itemNotes WHERE note LIKE ?", (f"%{keyword}%",))
for item_id, note in cursor.fetchall():
    # UPDATE keeps parentItemID and title; REPLACE INTO would reset them to NULL.
    cursor.execute("UPDATE itemNotes SET note = ? WHERE itemID = ?",
                   (pattern.sub(replacement, note), item_id))
db.commit()
db.close()

Optimizing Query - I'm doing something very wrong here. Alternatives?

Basically, this query SUMs some values and fetches some data from another table (EDTF007, the one in the WITH clause, with 1.8 million records). EDTF011 (the main table) has around 800k records.
So what's the problem?
The field that I fetch from EDTF007 follows some rules (seen in the CASE clause). I fetch the field CCONTA through different "INNER JOINS" (as seen in the CASE).
The database takes about 1 hour to run this query. I've gathered statistics, created indexes (and rebuilt them), and tried accessing the partitions directly. It still takes around 1 hour, which is not acceptable at this point.
I'm out of ideas. I believe the problem is the structure of the query, because the WAIT events that I get are all related to it.
Any ideas how I could turn this around? An update on the table would also be too slow, I think.
WITH IMP AS (SELECT CCONTA, CREFERENCIA FROM EDTF007_IMPARIDADE PARTITION (P_IMPARIDADE_201703))
SELECT SUM(GC.MAVALIAA),
V.ID, NVL(GC.CKNUMCTA, '00000000000'), NVL(GC.CKBALCAO, '0000'), GC.CKPRODUT, GC.CKSUBPRO, GC.ZDEPOSIT, GC.MSLDACT, SUM(GC.MAVALIAA), GC.TIPGARBL, GC.CGARANT, NVL(G.TIPO_GAR, 'SEM GAR'), G.TIPO_GAR2, GC.CREFERENCIA_IMP,
CASE
WHEN (SUBSTR(GC.CKPRODUT, 1, 3) ) IN ('096' , '097' , '020' , '021' , '024') AND GC.CKPRODUT != 'MOR' THEN
(SELECT DISTINCT CCONTA FROM IMP WHERE (GC.CKBALCAO || GC.CKNUMCTA) = CCONTA)
WHEN (SUBSTR(GC.CKPRODUT, 1, 3) ) IN ('035') AND GC.CKPRODUT != 'MOR' THEN
(SELECT DISTINCT CCONTA FROM IMP WHERE GC.ZDEPOSIT = CREFERENCIA)
WHEN NVL(GC.CKBALCAO ||GC.CKNUMCTA, '000000000000000') IN (SELECT IMP.CCONTA FROM RE_ED.EDTF007_IMPARIDADE IMP WHERE (IMP.PRODUTO ) = ( '000' ) AND (IMP.SEGMENTO ) IN ('IE' , 'IP' , 'IA' , 'GI')) AND GC.CKPRODUT != 'MOR' THEN
(SELECT DISTINCT CCONTA FROM IMP WHERE (GC.CKBALCAO || GC.CKNUMCTA) = CCONTA)
WHEN SUBSTR(GC.CKPRODUT, 1, 3) NOT IN ('096' , '097' , '020' , '021' , '024' , '035' ) AND GC.CKPRODUT != 'MOR' THEN
(SELECT DISTINCT CCONTA FROM IMP WHERE (GC.CKBALCAO || GC.CKNUMCTA || GC.CREFERENCIA_IMP) = CCONTA || CREFERENCIA)
WHEN GC.CKPRODUT = 'MOR' THEN
(SELECT DISTINCT CCONTA FROM IMP WHERE (GC.CKBALCAO || GC.CKNUMCTA || GC.CREFERENCIA_IMP) = CCONTA || CREFERENCIA)
ELSE '000000000000000'
END CCONTA_IMP,
GC.CREATED_BY, GC.CREATED_DATE
FROM RE_ED.EDTD011_GARANTIAS_CONTRATO GC,
(SELECT MAX(IDVERSAO) AS ID FROM RE_CD.CDTD009_VERSOES WHERE (TABELA) = ('RE_CD.CDTD015_GARANTIAS_CONTRATO')) V,
(SELECT DISTINCT TIPO_GAR, TIPO_GAR2, CODIGO FROM RE_CD.CDTD011_COD_GARANTIAS WHERE FLAG_ATIVO = 1 AND DTBEGIN_VER <= TO_TIMESTAMP(TRUNC(TO_DATE('2017-03-01', 'YYYY-MM-DD'), 'MM')) AND DTEND_VER >= TO_TIMESTAMP(LAST_DAY(TO_DATE('2017-03-01', 'YYYY-MM-DD')))) G
WHERE (GC.ANO) = ('2017')
AND (GC.MES) = ('03')
AND (GC.CGARANT) = (G.CODIGO)
GROUP BY V.ID, GC.CKNUMCTA, GC.CKBALCAO, GC.CKPRODUT, GC.CKSUBPRO, GC.ZDEPOSIT, GC.MSLDACT, GC.TIPGARBL, GC.CGARANT, G.TIPO_GAR, G.TIPO_GAR2, GC.CREFERENCIA_IMP, GC.CREATED_BY, GC.CREATED_DATE;
EDIT: After a conversation with a colleague, he suggested splitting the query universe: instead of 1 query, I should try 5 insert queries, one for each case. I'm going forward with this approach. Let's see how it goes. The main goal is an INSERT INTO ... SELECT, by the way.

Select Case, when no data return

Is it possible to use SELECT CASE, decode, nvl or another query function to check whether a select query returns nothing or returns a value?
For example, I have this:
Record | type | id_customer
-------+--------+-------------
1 | T | cus1
2 | A | cus2
3 | T | cus3
4 | | cus4
If I do this:
select decode(type,'T','Main','A','Aditional','none') from table where record=1;
I get Main.
If I do this:
select decode(type,'T','Main','A','Aditional','none') from table where record=4;
I get none.
But if I do this:
select decode(type,'T','Main','A','Aditional','none') from table where record=5;
I get nothing, which is logical. So I need to get the decode value when the row exists and a fixed text when the row does not exist.
I tried SELECT CASE, but it is not possible to get a value this way using COUNT. For example:
SELECT
CASE
WHEN count(1)>0 THEN decode(type,'T','Main','A','Aditional','none')
ELSE '-'
END
FROM TABLE WHERE record=5;
I would like this to return '-', or, if the record is 2, 'Aditional'.
Thanks a lot.
You can wrap the expression in the aggregate function min or max:
select max(decode(type,'T','Main','A','Aditional','none'))
from table
where record=5;
If query returns one row, you get value of that row. If query returns 0 rows, you get NULL.
Then you can replace NULL using nvl:
select nvl(max(decode(type,'T','Main','A','Aditional','none')), ' - ')
from table
where record=5;
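The reason this works is that an aggregate without GROUP BY always returns exactly one row, even when the WHERE clause matches nothing. A small sketch of that behaviour follows, with loud assumptions: it runs on SQLite through Python's sqlite3 module rather than on Oracle, so CASE and COALESCE stand in for decode and nvl, and the table name customers and the helper label_for are made up.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE customers (record INTEGER, type TEXT, id_customer TEXT)")
con.executemany("INSERT INTO customers VALUES (?, ?, ?)", [
    (1, "T", "cus1"),
    (2, "A", "cus2"),
    (3, "T", "cus3"),
    (4, None, "cus4"),  # the row with an empty type
])

# MAX(...) yields NULL when no row matches; COALESCE turns that NULL into '-'.
LABEL_SQL = """
SELECT COALESCE(
         MAX(CASE type WHEN 'T' THEN 'Main' WHEN 'A' THEN 'Aditional' ELSE 'none' END),
         '-')
FROM customers
WHERE record = ?
"""


def label_for(record):
    """Return the decoded label for one record, or '-' when the record is missing."""
    return con.execute(LABEL_SQL, (record,)).fetchone()[0]


for rec in (1, 2, 4, 5):
    print(rec, label_for(rec))
```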
EDIT
Also, if you need to choose one string from several:
select decode(max(decode(type,'T', 2, 'A', 1, 0)), 0, 'none', 1, 'Additional', 2, 'Main', null, ' - ')
from table
where record=5;
This is an option:
select decode(type,'T','Main','A','Aditional','none')
from table
where record = 5
union all
select '-'
from dual
where not exists (select 1 from table where record = 5);
It selects the rows with record = 5 and unions them with '-' if no row with record = 5 exists. Check out this Fiddle.

Retrieve a table to tallied numbers, best way

I have a query that runs as part of a function and produces a one-row table full of counts, averages, and comma-separated lists, like this:
select
(select
count(*)
from vw_disp_details
where round = 2013
and rating = 1) applicants,
(select
count(*)
from vw_disp_details
where round = 2013
and rating = 1
and applied != 'yes') s_applicants,
(select
LISTAGG(discipline, ',')
WITHIN GROUP (ORDER BY discipline)
from (select discipline,
count(*) discipline_number
from vw_disp_details
where round = 2013
and rating = 1
group by discipline)) disciplines,
(select
LISTAGG(discipline_count, ',')
WITHIN GROUP (ORDER BY discipline)
from (select discipline,
count(*) discipline_count
from vw_disp_details
where round = 2013
and rating = 1
group by discipline)) disciplines_count,
(select
round(avg(util.getawardstocols(application_id,'1','AWARD_NAME')), 2)
from vw_disp_details
where round = 2013
and rating = 1) average_award_score,
(select
round(avg(age))
from vw_disp_details
where round = 2013
and rating = 1) average_age
from dual;
Except that instead of 6 main sub-queries there are 23.
This returns something like this (if it were a CSV):
applicants | s_applicants | disciplines | disciplines_count | average_award_score | average_age
107 | 67 | "speed,accuracy,strength" | 3 | 97 | 23
Now I am programmatically swapping out the "rating = 1" part of the where clauses for other expressions. They all work rather quickly except for the "rating = 1" one which takes about 90 seconds to run and that is because the rating column in the vw_disp_details view is itself compiled by a sub-query:
(SELECT score
FROM read r,
eval_criteria_lookup ecl
WHERE r.criteria_id = ecl.criteria_id
AND r.application_id = a.lgo_application_id
AND criteria_description = 'Overall Score'
AND type = 'ABC'
) reader_rank
So when the function runs this extra query seems to slow everything down dramatically.
My question is: is there a better (more efficient) way to run a query like this, which is basically just a series of counts and averages, and how can I refactor it so that the rating = 1 query doesn't take 90 seconds to run?
You could choose to MATERIALIZE the vw_disp_details VIEW. That would pre-calculate the value of the rating column. There are various options for how up-to-date a materialized view is kept, you would probably want to use the ON COMMIT clause so that vw_disp_details is always correct.
Have a look at the official documentation and see if that would work for you.
http://docs.oracle.com/cd/B28359_01/server.111/b28286/statements_6002.htm
Do most of your queries in a single one. Instead of doing:
select
(select count(*) from my_tab) as count_all,
(select avg(age) from my_tab) as avg_age,
(select avg(mypkg.get_award(application_id)) from my_tab) as avg_app_id
from dual;
Just do:
select count(*), avg(age),avg(mypkg.get_award(application_id)) from my_tab;
And then, maybe you can do some union all for the other results. But this step all by itself should help.
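Counts that carry an extra condition, such as s_applicants in the question, can stay in the same single pass by putting the condition inside the aggregate. Here is a minimal sketch of that idea; it assumes SQLite via Python's sqlite3 module instead of Oracle, and the table disp_details with its sample rows is a hypothetical stand-in for the vw_disp_details view.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute(
    'CREATE TABLE disp_details ("round" INTEGER, rating INTEGER, applied TEXT, age INTEGER)'
)
con.executemany("INSERT INTO disp_details VALUES (?, ?, ?, ?)", [
    (2013, 1, "yes", 20),
    (2013, 1, "no", 22),
    (2013, 1, "no", 24),
    (2013, 2, "no", 40),  # filtered out by rating
    (2012, 1, "no", 50),  # filtered out by round
])

# One scan of the table; the CASE expression limits what the second COUNT sees.
SUMMARY_SQL = """
SELECT COUNT(*)                                     AS applicants,
       COUNT(CASE WHEN applied != 'yes' THEN 1 END) AS s_applicants,
       ROUND(AVG(age))                              AS average_age
FROM disp_details
WHERE "round" = 2013 AND rating = 1
"""

summary = con.execute(SUMMARY_SQL).fetchone()
print(summary)
```

COUNT ignores NULLs, and a CASE without ELSE yields NULL, which is why the second count only includes the rows that meet the extra condition.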
I was able to solve this issue by doing two things. First, I created a new view that exposes only the results I need, which gave me marginal gains in speed. Second, in that view I moved the WHERE clause of the sub-query that caused the lag into the WHERE clause of the view and added the result of the sub-query as a column of the view. This still returns the same results, because the table the sub-query accessed always has a record for each row of the view query.
SELECT
a.application_id,
util.getstatus (a.application_id) status,
(SELECT score
FROM applicant_read ar,
eval_criteria_lookup ecl
WHERE ar.criteria_id = ecl.criteria_id
AND ar.application_id = a.application_id
AND criteria_description = 'Overall Score' -- THESE TWO FIELDS
AND type = 'ABC' -- ARE CRITERIA_ID = 15
) score,
as.test_total test_total
FROM application a,
applicant_scores as
WHERE a.application_id = as.application_id(+);
Became
SELECT
a.application_id,
util.getstatus (a.application_id) status,
ar.score,
as.test_total test_total
FROM application a,
applicant_scores as,
applicant_read ar
WHERE a.application_id = as.application_id(+)
AND ar.application_id = a.application_id(+)
AND ar.criteria_id = 15;

Duplicate record by using with CTEs SQL Server 2008

I need to manage hierarchical data stored in my database, but I have a problem now. Please see my example.
I have a table called COMMON.TASK_REL.
My second table is called COMMON.TASK.
I need to sort by task_seq and return a result like the one below:
Task Name || Task_Seq
Item1 1
..Item1.2 1
...Item1.2.1 1
..Item1.1 2
Here is my query
--Common task SQL modify --
WITH ctLevel
AS
(
SELECT
C_TASK_ID AS Child
,P_Task_ID AS Parent
,common_task.TASK_SEQ AS taskOrder
,1 AS [Level]
,CAST(C_TASK_ID AS VARCHAR(MAX)) AS [Order]
,CAST (Replicate('.', 1) + common_task.TASK_NAME AS VARCHAR(25)) AS [Task_Name]
FROM
[COMMON.TASK_REL] as common_task_rel,
[COMMON.TASK] as common_task
WHERE common_task_rel.C_TASK_ID = common_task.TASK_ID
and common_task.[TASK_TYPE] = 'B' AND common_task.[MODULE_CODE] = 'LWRPT'
AND common_task.[STATUS] <> 'D'
UNION ALL
SELECT
C_TASK_ID AS Child
,P_Task_ID AS Parent
,common_task.TASK_SEQ AS taskOrder
,[Level] + 1 AS [Level]
,[Order] + '.' + CAST(C_TASK_ID AS VARCHAR(MAX)) AS [Order]
,CAST (Replicate('.', [Level] + 1) + common_task.TASK_NAME AS VARCHAR(25)) AS [Task_Name]
FROM [COMMON.TASK_REL] as common_task_rel
INNER JOIN ctLevel
ON ( P_Task_ID = Child ) , [COMMON.TASK] as common_task
WHERE common_task_rel.C_TASK_ID = common_task.TASK_ID
and common_task.[TASK_TYPE] = 'B' AND common_task.[MODULE_CODE] = 'LWRPT'
AND common_task.[STATUS] <> 'D'
)
-- Viewing Data
SELECT Child ,Parent ,taskOrder,Level,[Order],Task_Name
FROM ctLevel
GROUP BY Child ,Parent ,taskOrder,Level,[Order],Task_Name
order by [Order];
GO
But my result contains duplicated rows.
Can anyone help me correct my query? Thanks
I believe that your duplicates are coming from your root/anchor query. You should add the following to that query:
AND Task_Seq = 0
Basically, you only want the root to be set up as the beginning of the tree. 301|300 should not be picked up until the recursive section (the part after UNION ALL).
If that does not make sense, then I can repaste your query with the modification, but that seemed unnecessary for a small change.
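To see why an unrestricted anchor produces duplicates, here is a small sketch. Everything in it is an assumption made for illustration: it uses SQLite through Python's sqlite3 module instead of SQL Server, a hypothetical three-row task_rel table, and "parent IS NULL" as the root marker where the suggestion above uses Task_Seq = 0.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE task_rel (child INTEGER, parent INTEGER)")
# A single chain: 300 is the root, 301 is under 300, 302 is under 301.
con.executemany("INSERT INTO task_rel VALUES (?, ?)", [(300, None), (301, 300), (302, 301)])


def walk(anchor_filter):
    """Run the recursive CTE with the given WHERE clause on the anchor query."""
    sql = f"""
    WITH RECURSIVE ctLevel(child, parent, level) AS (
      SELECT child, parent, 1 FROM task_rel {anchor_filter}
      UNION ALL
      SELECT r.child, r.parent, c.level + 1
      FROM task_rel r JOIN ctLevel c ON r.parent = c.child
    )
    SELECT child, level FROM ctLevel ORDER BY child, level
    """
    return con.execute(sql).fetchall()


unfiltered = walk("")                   # every row starts its own tree
rooted = walk("WHERE parent IS NULL")   # only the real root starts a tree
print(unfiltered)
print(rooted)
```

With the unrestricted anchor, 301 and 302 are reached both as starting points and again through recursion, each time with a different level, so a GROUP BY that includes the level cannot collapse them.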

Resources