Split values in parts with sqlite

I'm struggling to convert
a | a1,a2,a3
b | b1,b3
c | c2,c1
to:
a | a1
a | a2
a | a3
b | b1
b | b2
c | c2
c | c1
Here is the data in SQL format:
CREATE TABLE data(
"one" TEXT,
"many" TEXT
);
INSERT INTO "data" VALUES('a','a1,a2,a3');
INSERT INTO "data" VALUES('b','b1,b3');
INSERT INTO "data" VALUES('c','c2,c1');
The solution is probably a recursive common table expression.
Here's an example which does something similar to a single row:
WITH RECURSIVE list( element, remainder ) AS (
SELECT NULL AS element, '1,2,3,4,5' AS remainder
UNION ALL
SELECT
CASE
WHEN INSTR( remainder, ',' )>0 THEN
SUBSTR( remainder, 0, INSTR( remainder, ',' ) )
ELSE
remainder
END AS element,
CASE
WHEN INSTR( remainder, ',' )>0 THEN
SUBSTR( remainder, INSTR( remainder, ',' )+1 )
ELSE
NULL
END AS remainder
FROM list
WHERE remainder IS NOT NULL
)
SELECT * FROM list;
(originally from this blog post: https://blog.expensify.com/2015/09/25/the-simplest-sqlite-common-table-expression-tutorial)
It produces:
element | remainder
-------------------
NULL | 1,2,3,4,5
1 | 2,3,4,5
2 | 3,4,5
3 | 4,5
4 | 5
5 | NULL
The problem is thus to apply this to each row in a table.

Yes, a recursive common table expression is the solution:
with x(one, firstone, rest) as (
select one,
substr(many, 1, instr(many, ',')-1) as firstone,
substr(many, instr(many, ',')+1) as rest
from data
where many like '%,%'
UNION ALL
select one,
substr(rest, 1, instr(rest, ',')-1) as firstone,
substr(rest, instr(rest, ',')+1) as rest
from x
where rest like '%,%'
LIMIT 200
)
select one, firstone from x
UNION ALL
select one, rest from x where rest not like '%,%'
ORDER by one;
Output:
a|a1
a|a2
a|a3
b|b1
b|b3
c|c2
c|c1

Check my answer in How to split comma-separated value in SQLite?.
This will give you the transformation in a single query rather than having to apply to each row.
-- using your data table, assuming that b3 is supposed to be b2
WITH split(one, many, str) AS (
SELECT one, '', many||',' FROM data
UNION ALL SELECT one,
substr(str, 0, instr(str, ',')),
substr(str, instr(str, ',')+1)
FROM split WHERE str !=''
) SELECT one, many FROM split WHERE many!='' ORDER BY one;
a|a1
a|a2
a|a3
b|b1
b|b2
c|c2
c|c1
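As a way to try the split query outside the sqlite3 shell, here is a minimal sketch that runs it through Python's built-in sqlite3 module. The helper name `split_rows` and the extra row `d | d1` are assumptions added for illustration; the extra row is there to show that a value without any comma still comes through. The output is sorted by both columns so that the order is deterministic.

```python
import sqlite3

# The split CTE from the answer above; ORDER BY also sorts by the part
# so that the result order is deterministic.
SPLIT_SQL = """
WITH RECURSIVE split(one, many, str) AS (
    SELECT one, '', many || ',' FROM data
    UNION ALL
    SELECT one,
           substr(str, 0, instr(str, ',')),
           substr(str, instr(str, ',') + 1)
    FROM split
    WHERE str != ''
)
SELECT one, many FROM split WHERE many != '' ORDER BY one, many;
"""


def split_rows(rows):
    """Load (one, many) pairs into an in-memory table and split them."""
    conn = sqlite3.connect(":memory:")
    conn.execute('CREATE TABLE data("one" TEXT, "many" TEXT)')
    conn.executemany("INSERT INTO data VALUES (?, ?)", rows)
    result = conn.execute(SPLIT_SQL).fetchall()
    conn.close()
    return result


# The question's data plus one hypothetical single-value row.
sample = [("a", "a1,a2,a3"), ("b", "b1,b3"), ("c", "c2,c1"), ("d", "d1")]
for one, part in split_rows(sample):
    print(f"{one}|{part}")
```

Because the final SELECT filters on `many != ''`, empty elements (as in `a1,,a3`) are dropped along with the seed rows.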

Related

SQL to find next greater records for each element

I've got a table defined like this:
CREATE TABLE event (t REAL, event TEXT, value);
For each record in the table which has event='type' and value='G' there will be two corresponding records with event='Z': one with value=1 and one with value=0. Here is an example:
t | event | value
1624838448.123 | type | G
1624838448.123 | Z | 1
1624839543.215 | Z | 0
Note that there could be other event='Z' records that don't have a corresponding event='type', value='G' record. I'm trying to write a query that finds all the event='Z' records that do have a corresponding event='type', value='G' record, to use as the bounds for an additional query (or join?).
Note: The t value for the "type" event and the Z event where value=1 will always be the same.
So for instance if the table looked like this:
t | event | value
1624838448.123 | type | G
1624838448.123 | Z | 1
1624839543.215 | Z | 0
1624839555.555 | type | H
1624839555.555 | Z | 1
1624839602.487 | Z | 0
1624839999.385 | type | G
1624839999.385 | Z | 1
1624840141.006 | Z | 0
Then I want the results of the query to return this:
t1 | t2
1624838448.123 | 1624839543.215
1624839999.385 | 1624840141.006

From your comment:
There are always three records (ignoring any other events in between)
in chronological order: the "type" event, the first "Z" record with
the same timestamp, and the second "Z" record with a later timestamp
So, there is no need to return t1 separately since it is equal to t in the row where event = 'type' and value = 'G'.
For t2 you can use conditional aggregation with MIN() window function:
SELECT t1, t2
FROM (
SELECT t AS t1, event, value,
MIN(CASE WHEN event = 'Z' AND value = '0' THEN t END) OVER (ORDER BY t ROWS BETWEEN 1 FOLLOWING AND UNBOUNDED FOLLOWING) t2
FROM Event
)
WHERE event = 'type' AND value = 'G'
See the demo.
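For SQLite builds that predate window functions, the same bounds can be found with a correlated subquery instead of MIN() OVER. The following is a sketch of that swapped-in technique, not the window-function query above. It assumes value is stored as the integer 0 or 1 for Z rows, and the helper name `find_bounds` is hypothetical.

```python
import sqlite3

# For every type/G row, look up the earliest later Z row with value 0.
BOUNDS_SQL = """
SELECT e.t AS t1,
       (SELECT MIN(z.t)
        FROM event AS z
        WHERE z.event = 'Z' AND z.value = 0 AND z.t > e.t) AS t2
FROM event AS e
WHERE e.event = 'type' AND e.value = 'G'
ORDER BY e.t;
"""


def find_bounds(rows):
    """Return (t1, t2) pairs for each type/G record in rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE event (t REAL, event TEXT, value)")
    conn.executemany("INSERT INTO event VALUES (?, ?, ?)", rows)
    result = conn.execute(BOUNDS_SQL).fetchall()
    conn.close()
    return result


# Sample data from the question (Z values assumed to be integers).
events = [
    (1624838448.123, "type", "G"),
    (1624838448.123, "Z", 1),
    (1624839543.215, "Z", 0),
    (1624839555.555, "type", "H"),
    (1624839555.555, "Z", 1),
    (1624839602.487, "Z", 0),
    (1624839999.385, "type", "G"),
    (1624839999.385, "Z", 1),
    (1624840141.006, "Z", 0),
]
for t1, t2 in find_bounds(events):
    print(t1, t2)
```

Like the window version, this relies on the closing Z record of a G block being the first value=0 record after the type record.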

I found a solution using the RANK() function. It gives me an intermediate table in which the "type" record and the first "Z" record share the same rank, since they have the same timestamp, and the second "Z" record has a rank two greater. I use WITH so I can self-join repeatedly without having to specify the same query over and over. I first join the "type" row to the first "Z" row by requiring that the event of the second record be greater than that of the first (so I only get the type:Z combination and not type:type, Z:type, or Z:Z). Then I self-join again to get the row whose rank is two greater, which picks up the second Z record. Overall, the query looks like this:
WITH Seq(t,event,A,I)
AS
(
SELECT t, event, value,
RANK() OVER (ORDER BY t) I
FROM Event e1
WHERE (e1.event='type' OR e1.event='Z')
)
SELECT s2.t,s3.t
FROM Seq s1
INNER JOIN Seq s2 ON s1.I = s2.I AND s1.event < s2.event
INNER JOIN Seq s3 ON s1.I = s3.I-2
WHERE s1.A='G'; -- the CTE column list renames value to A

calculate percentages with postgresql join queries

I am trying to calculate percentages by joining 3 tables data to get the percentages of positive_count, negative_count, neutral_count of each user's tweets. I have succeeded in getting positive, negative and neutral counts, but failing to get the same as percentages instead of counts. Here is the query to get counts:
SELECT
t1.u_id,count(*) as total_tweets_count ,
(
SELECT count(*) from t1,t2,t3
WHERE
t1.u_id='18839785' AND
t1.u_id=t2.u_id AND
t2.ts_id=t3.ts_id AND
t3.sentiment='Positive'
) as pos_count ,
(
SELECT count(*) from t1,t2,t3
WHERE
t1.u_id='18839785' AND
t1.u_id=t2.u_id AND
t2.ts_id=t3.ts_id AND
t3.sentiment='Negative'
) as neg_count ,
(
SELECT count(*) from t1,t2,t3
WHERE
t1.u_id='18839785' AND
t1.u_id=t2.u_id AND
t2.ts_id=t3.ts_id AND
t3.sentiment='Neutral'
) as neu_count
FROM t1,t2,t3
WHERE
t1.u_id='18839785' AND
t1.u_id=t2.u_id AND
t2.ts_id=t3.ts_id
GROUP BY t1.u_id;
**OUTPUT:**
u_id | total_tweets_count | pos_count | neg_count | neu_count
-----------------+--------------------+-----------+-----------+-------
18839785| 88 | 38 | 25 | 25
(1 row)
Now I want the same in percentages instead of counts. I have written the query in the following way but failed.
SELECT
total_tweets_count,pos_count,
round((pos_count * 100.0) / total_tweets_count, 2) AS pos_per,neg_count,
round((neg_count * 100.0) / total_tweets_count, 2) AS neg_per,
neu_count, round((neu_count * 100.0) / total_tweets_count, 2) AS neu_per
FROM (
SELECT
count(*) as total_tweets_count,
count(
a.u_id='18839785' AND
a.u_id=b.u_id AND
b.ts_id=c.ts_id AND
c.sentiment='Positive'
) AS pos_count,
count(
a.u_id='18839785' AND
a.u_id=b.u_id AND
b.ts_id=c.ts_id AND
c.sentiment='Negative'
) AS neg_count,
count(
a.u_id='18839785' AND
a.u_id=b.u_id AND
b.ts_id=c.ts_id AND
c.sentiment='Neutral') AS neu_count
FROM t1 a, t2 b, t3 c
WHERE
a.u_id='18839785' AND
a.u_id=b.u_id AND
b.ts_id=c.ts_id
GROUP BY a.u_id
) sub;
Can anyone help me out in achieving as percentages for each user data as below?
u_id | total_tweets_count | pos_count | neg_count | neu_count
------------------+--------------------+-----------+-----------+-----
18839785| 88 | 43.18 | 28.4 | 28.4
(1 row)

I am not entirely sure what you are looking for.
For starters, you can simplify your query by using conditional aggregation instead of three scalar subqueries (which, by the way, do not need to repeat the WHERE condition on a.u_id).
You state you want to "count for all users", so you need to remove the WHERE clause in the main query. The simplification also gets rid of the repeated WHERE condition.
select u_id,
total_tweets_count,
pos_count,
round((pos_count * 100.0) / total_tweets_count, 2) AS pos_per,
neg_count,
round((neg_count * 100.0) / total_tweets_count, 2) AS neg_per,
neu_count,
round((neu_count * 100.0) / total_tweets_count, 2) AS neu_per
from (
SELECT
t1.u_id,
count(*) as total_tweets_count,
count(case when t3.sentiment='Positive' then 1 end) as pos_count,
count(case when t3.sentiment='Negative' then 1 end) as neg_count,
count(case when t3.sentiment='Neutral' then 1 end) as neu_count
FROM t1
JOIN t2 ON t1.u_id=t2.u_id
JOIN t3 ON t2.ts_id=t3.ts_id
-- no WHERE condition on the u_id here
GROUP BY t1.u_id
) t
Note that I replaced the outdated, ancient and fragile implicit joins in the WHERE clause with "modern" explicit JOIN operators.
With a more up-to-date Postgres version, the expression count(case when t3.sentiment='Positive' then 1 end) as pos_count can also be rewritten to:
count(*) filter (where t3.sentiment='Positive') as pos_count
which is a bit more readable (and understandable I think).
In your original query you can avoid repeating the hardcoded WHERE condition on the u_id by using a correlated subquery, e.g.:
(
SELECT count(*)
FROM t1 inner_t1 --<< use different aliases than in the outer query
JOIN t2 inner_t2 ON inner_t2.u_id = inner_t1.u_id
JOIN t3 inner_t3 ON inner_t3.ts_id = inner_t2.ts_id
-- referencing the outer t1 removes the need to repeat the hardcoded ID
WHERE inner_t1.u_id = t1.u_id
AND inner_t3.sentiment = 'Positive'
) as pos_count
The repetition of the table t1 isn't necessary either, so the above could be re-written to:
(
SELECT count(*)
FROM t2 inner_t2
JOIN t3 inner_t3 ON inner_t3.ts_id = inner_t2.ts_id
WHERE inner_t2.u_id = t1.u_id --<< this references the outer t1 table
AND inner_t3.sentiment = 'Positive'
) as pos_count
But the version with conditional aggregation will still be a lot faster than using three scalar sub-queries (even if you remove the unnecessary repetition of the t1 table).
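To see conditional aggregation produce percentages end to end, here is a small sketch that uses SQLite (through Python's sqlite3 module) as a stand-in for Postgres; the count(case ...) form runs unchanged on both. The table layouts, the second user `u2`, and the helper name `sentiment_percentages` are assumptions made for the example.

```python
import sqlite3

# One pass over the joined rows; each CASE counts only matching tweets.
PERCENT_SQL = """
SELECT u_id,
       total_tweets_count,
       round(pos_count * 100.0 / total_tweets_count, 2) AS pos_per,
       round(neg_count * 100.0 / total_tweets_count, 2) AS neg_per,
       round(neu_count * 100.0 / total_tweets_count, 2) AS neu_per
FROM (
    SELECT t1.u_id AS u_id,
           count(*) AS total_tweets_count,
           count(CASE WHEN t3.sentiment = 'Positive' THEN 1 END) AS pos_count,
           count(CASE WHEN t3.sentiment = 'Negative' THEN 1 END) AS neg_count,
           count(CASE WHEN t3.sentiment = 'Neutral' THEN 1 END) AS neu_count
    FROM t1
    JOIN t2 ON t1.u_id = t2.u_id
    JOIN t3 ON t2.ts_id = t3.ts_id
    GROUP BY t1.u_id
) AS t
ORDER BY u_id;
"""


def sentiment_percentages(tweets):
    """tweets: list of (u_id, ts_id, sentiment) tuples."""
    conn = sqlite3.connect(":memory:")
    # Assumed minimal schema: users, user-to-tweet links, tweet sentiment.
    conn.execute("CREATE TABLE t1 (u_id TEXT PRIMARY KEY)")
    conn.execute("CREATE TABLE t2 (u_id TEXT, ts_id INTEGER)")
    conn.execute("CREATE TABLE t3 (ts_id INTEGER, sentiment TEXT)")
    for u_id in sorted({u for u, _, _ in tweets}):
        conn.execute("INSERT INTO t1 VALUES (?)", (u_id,))
    conn.executemany("INSERT INTO t2 VALUES (?, ?)", [(u, ts) for u, ts, _ in tweets])
    conn.executemany("INSERT INTO t3 VALUES (?, ?)", [(ts, s) for _, ts, s in tweets])
    result = conn.execute(PERCENT_SQL).fetchall()
    conn.close()
    return result


tweets = [
    ("18839785", 1, "Positive"),
    ("18839785", 2, "Positive"),
    ("18839785", 3, "Negative"),
    ("18839785", 4, "Neutral"),
    ("u2", 5, "Negative"),  # hypothetical second user
    ("u2", 6, "Negative"),
]
for row in sentiment_percentages(tweets):
    print(row)
```

Multiplying by 100.0 rather than 100 keeps the division in floating point, which matters in both SQLite and Postgres because integer division would truncate.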

Get the number of execution id wise in toad for oracle

Please help me extract the number of executions for each id.
For example, I have a temp1 table with the columns id and DT, where id has the form below:
SNGL~27321~SUBM~28867_17227~20170815.CSV.20170815113439
SNGL~27321~SUBM~28867_17227~20170815.CSV.20170815113439
SNGL~27321~SUBM~29329_17227~20170815.CSV.20170815113439
I need the result as below :
id number of exec
28867 2
29329 1
The query is below:
select count(A.DT)
from temp1 a
where A.id like '%28867%'
and A.DT >= to_date( '01-Aug-2017','dd-MON-yyyy')
and A.DT < to_date('01-Sep-2017','dd-MON-yyyy')
The problem I am facing is extracting the ids from the id column using the LIKE operator.
Please help me retrieve the result in Toad for Oracle.

You can use REGEXP_REPLACE function, or a combination of SUBSTR and INSTR functions to extract this number from the string.
The latter is faster than the pattern matching in REGEXP_REPLACE, so if there is a huge table of strings I would use the second option.
Assuming that the SUBM~ substring always comes immediately before the number, this should work:
With my_data as (
select 'SNGL~27321~SUBM~28867_17227~20170815.CSV.20170815113439' as str from dual union all
select 'SNGL~27321~SUBM~28867_17227~20170815.CSV.20170815113439' from dual union all
select 'SNGL~27321~SUBM~29329_17227~20170815.CSV.20170815113439' from dual
)
SELECT
regexp_replace( str, '.*SUBM~(\d+).*', '\1' ) as x,
substr( str,
instr( str, 'SUBM~' ) + length('SUBM~'),
instr( str, '_', instr( str, 'SUBM~' ) )
- instr( str, 'SUBM~' )
- length('SUBM~')
) as y
FROM My_data;
| X | Y |
|-------|-------|
| 28867 | 28867 |
| 28867 | 28867 |
| 29329 | 29329 |
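The extraction above still has to be grouped to get the number of executions per id. The sketch below shows that last step, using SQLite through Python's sqlite3 module as a stand-in for Oracle. SQLite's instr() takes only two arguments, so the string is cut in two steps instead of passing a start position; the date filter from the question is left out, and the helper name `count_executions` is hypothetical.

```python
import sqlite3

# Step 1: keep everything after 'SUBM~'. Step 2: keep the part before '_'.
# Step 3: count rows per extracted id.
COUNT_SQL = """
SELECT sub_id, count(*) AS executions
FROM (
    SELECT substr(rest, 1, instr(rest, '_') - 1) AS sub_id
    FROM (
        SELECT substr(id, instr(id, 'SUBM~') + length('SUBM~')) AS rest
        FROM temp1
    )
)
GROUP BY sub_id
ORDER BY sub_id;
"""


def count_executions(ids):
    """Return (extracted id, number of rows) for each distinct id."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE temp1 (id TEXT)")
    conn.executemany("INSERT INTO temp1 VALUES (?)", [(i,) for i in ids])
    result = conn.execute(COUNT_SQL).fetchall()
    conn.close()
    return result


ids = [
    "SNGL~27321~SUBM~28867_17227~20170815.CSV.20170815113439",
    "SNGL~27321~SUBM~28867_17227~20170815.CSV.20170815113439",
    "SNGL~27321~SUBM~29329_17227~20170815.CSV.20170815113439",
]
for sub_id, executions in count_executions(ids):
    print(sub_id, executions)
```

In Oracle the same shape should apply: wrap either extraction expression in a subquery and GROUP BY its result.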

Selecting the n'th range/island of rows where columns have a common value?

I need to select all rows (for a range) which have a common value within a column.
For example (starting from the last row)
I try to select all of the rows where _user_id == 1 until _user_id != 1 ?
In this case resulting in selecting rows [4, 5, 6]
+------------------------+
| _id _user_id amount |
+------------------------+
| 1 1 777 |
| 2 2 1 |
| 3 2 11 |
| 4 1 10 |
| 5 1 100 |
| 6 1 101 |
+------------------------+
/*Create the table*/
CREATE TABLE IF NOT EXISTS t1 (
_id INTEGER PRIMARY KEY AUTOINCREMENT,
_user_id INTEGER,
amount INTEGER);
/*Add the data*/
INSERT INTO t1 VALUES(1, 1, 777);
INSERT INTO t1 VALUES(2, 2, 1);
INSERT INTO t1 VALUES(3, 2, 11);
INSERT INTO t1 VALUES(4, 1, 10);
INSERT INTO t1 VALUES(5, 1, 100);
INSERT INTO t1 VALUES(6, 1, 101);
/*Check the data*/
SELECT * FROM t1;
1|1|777
2|2|1
3|2|11
4|1|10
5|1|100
6|1|101
In my attempt I use Common Table Expressions to group the results of _user_id. This gives the index of the last row containing a unique value (eg. SELECT _id FROM t1 GROUP BY _user_id LIMIT 2; will produce: [6, 3])
I then use those two values to select a range where LIMIT 1 OFFSET 1 is the lower end (3) and LIMIT 1 is the upper end (6)
WITH test AS (
SELECT _id FROM t1 GROUP BY _user_id LIMIT 2
) SELECT * FROM t1 WHERE _id BETWEEN 1+ (
SELECT * FROM test LIMIT 1 OFFSET 1
) and (
SELECT * FROM test LIMIT 1
);
Output:
4|1|10
5|1|100
6|1|101
This appears to work ok at selecting the last "island" but what I really need is a way to select the n'th island.
Is there a way to generate a query capable of producing outputs like these when provided a parameter n?:
island (n=1):
4|1|10
5|1|100
6|1|101
island (n=2):
2|2|1
3|2|11
island (n=3):
1|1|777
Thanks!

SQL tables are unordered, so the only way to search for islands is to search for consecutive _id values:
WITH RECURSIVE t1_with_islands(_id, _user_id, amount, island_number) AS (
SELECT _id,
_user_id,
amount,
1
FROM t1
WHERE _id = (SELECT max(_id)
FROM t1)
UNION ALL
SELECT t1._id,
t1._user_id,
t1.amount,
CASE WHEN t1._user_id = t1_with_islands._user_id
THEN island_number
ELSE island_number + 1
END
FROM t1
JOIN t1_with_islands ON t1._id = (SELECT max(_id)
FROM t1
WHERE _id < t1_with_islands._id)
)
SELECT *
FROM t1_with_islands
ORDER BY _id;
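The query above numbers every island; picking the n'th one only needs a filter on island_number. Here is a minimal sketch of that through Python's sqlite3 module, with n passed as a bound parameter. The helper names `make_table` and `select_island` are assumptions for illustration.

```python
import sqlite3

# Same recursive CTE as above, with the final SELECT filtered by island number.
ISLAND_SQL = """
WITH RECURSIVE t1_with_islands(_id, _user_id, amount, island_number) AS (
    SELECT _id, _user_id, amount, 1
    FROM t1
    WHERE _id = (SELECT max(_id) FROM t1)
    UNION ALL
    SELECT t1._id, t1._user_id, t1.amount,
           CASE WHEN t1._user_id = t1_with_islands._user_id
                THEN island_number
                ELSE island_number + 1
           END
    FROM t1
    JOIN t1_with_islands ON t1._id = (SELECT max(_id)
                                      FROM t1
                                      WHERE _id < t1_with_islands._id)
)
SELECT _id, _user_id, amount
FROM t1_with_islands
WHERE island_number = ?
ORDER BY _id;
"""


def make_table():
    """Create the question's t1 table in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE t1 (_id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "_user_id INTEGER, amount INTEGER)"
    )
    conn.executemany(
        "INSERT INTO t1 VALUES (?, ?, ?)",
        [(1, 1, 777), (2, 2, 1), (3, 2, 11), (4, 1, 10), (5, 1, 100), (6, 1, 101)],
    )
    return conn


def select_island(conn, n):
    """Return the rows of the n'th island, counted from the last row."""
    return conn.execute(ISLAND_SQL, (n,)).fetchall()


conn = make_table()
for n in (1, 2, 3):
    print(f"island (n={n}):", select_island(conn, n))
```

The CTE still walks the whole table before filtering, so this is a convenience rather than an optimization.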

Select multiple columns using decode

I have 6 columns, each holding one of the values 0, 1, 2 or 3. I want to display the result so that 0 represents SUCCESS, 1 or 2 represents FAILURE and 3 represents NOT APPLICABLE. So if the values in the DB are:
col A | col B | col C | col D | col E | col F
0 | 1 | 2 | 0 | 3 | 2
Output should be :
col A | col B | col C | col D | col E | col F
S | F | F | S | NA | F
Is it possible to do it through decode by selecting all the columns at once rather than selecting them individually?

If I understand your question correctly, it sounds like you just need a case expression (or decode, if you prefer, but that's less self-documenting than a case expression), along the lines of:
case when some_col = 0 then 'S'
when some_col in (1, 2) then 'F'
...
else some_col -- replace with whatever you want the output to be if none of the above conditions are met
end
or maybe:
case some_col
when 0 then 'S'
when 1 then 'F'
...
else some_col -- replace with whatever you want the output to be if none of the above conditions are met
end
So your query would look something like:
select case ...
end col_a,
...
case ...
end col_f
from your_table;
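If writing six near-identical case expressions by hand is the concern, the select list can be generated. The sketch below builds the query in Python and runs it against SQLite as a stand-in for Oracle. The column names col_a to col_f, the helper names, and the choice to leave unmapped values as NULL are all assumptions for illustration.

```python
import sqlite3

COLUMNS = ["col_a", "col_b", "col_c", "col_d", "col_e", "col_f"]


def status_case(column):
    """Build one CASE expression mapping 0 -> S, 1 or 2 -> F, 3 -> NA."""
    return (
        f"CASE {column} WHEN 0 THEN 'S' WHEN 1 THEN 'F' "
        f"WHEN 2 THEN 'F' WHEN 3 THEN 'NA' END AS {column}"
    )


def build_query(table, columns):
    """Assemble a SELECT that applies the same CASE to every column."""
    select_list = ",\n       ".join(status_case(c) for c in columns)
    return f"SELECT {select_list}\nFROM {table}"


def decode_rows(rows):
    """Load rows into an in-memory your_table and return decoded rows."""
    conn = sqlite3.connect(":memory:")
    column_defs = ", ".join(f"{c} INTEGER" for c in COLUMNS)
    conn.execute(f"CREATE TABLE your_table ({column_defs})")
    conn.executemany("INSERT INTO your_table VALUES (?, ?, ?, ?, ?, ?)", rows)
    result = conn.execute(build_query("your_table", COLUMNS)).fetchall()
    conn.close()
    return result


print(build_query("your_table", COLUMNS))
print(decode_rows([(0, 1, 2, 0, 3, 2)]))
```

The column names are interpolated into the SQL text, so they should come from a fixed list like the one here and never from user input.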

Is it possible to do it through decode by selecting all the columns at once rather than selecting them individually?
No
However, besides using pivot, the only solution I see would be using PL/SQL:
1. This is how I simulated your table
SELECT *
FROM (WITH tb1 (col_a, col_b, col_c, col_d, col_e, col_f) AS
(SELECT 0, 1, 2, 0, 3, 2 FROM DUAL)
SELECT *
FROM tb1)
2. I would append the columns together with a comma between them and save them into a table of strings
SELECT col_a || ',' || col_b || ',' || col_c || ',' || col_d || ',' || col_e || ',' || col_f
FROM (WITH tb1 (col_a, col_b, col_c, col_d, col_e, col_f) AS (SELECT 0, 1, 2, 0, 3, 2 FROM DUAL)
SELECT *
FROM tb1)
3. Then I would use REGEXP_REPLACE to replace your values one row at a time
SELECT REPLACE (REGEXP_REPLACE (REPLACE ('0,1,2,0,3,2', 0, 'S'), '[1-2]', 'F'), 3, 'NA') COL_STR
FROM DUAL
4. Using dynamic SQL I would update the table using rowid or whatever you intend to do. I made this SQL which will separate the string into columns
SELECT REGEXP_SUBSTR (COL_STR, '[^,]+', 1, 1) AS COL_A,
REGEXP_SUBSTR (COL_STR, '[^,]+', 1, 2) AS COL_B,
REGEXP_SUBSTR (COL_STR, '[^,]+', 1, 3) AS COL_C,
REGEXP_SUBSTR (COL_STR, '[^,]+', 1, 4) AS COL_D,
REGEXP_SUBSTR (COL_STR, '[^,]+', 1, 5) AS COL_E,
REGEXP_SUBSTR (COL_STR, '[^,]+', 1, 6) AS COL_F
FROM tst1
All of this is very tedious and it could take some time. Using DECODE or CASE would be easier to look at and interpret and thus easier to maintain.
