query with max and second factor [duplicate] - sqlite

I have:
TABLE MESSAGES
message_id | conversation_id | from_user | timestamp | message
I want:
1. SELECT * WHERE from_user <> id
2. GROUP BY conversation_id
3. SELECT in every group row with MAX(timestamp) **(if there are two same timestamps in a group use second factor as highest message_id)** !!!
4. then results SORT BY timestamp
to have result:
2|145|xxx|10000|message
6|1743|yyy|999|message
7|14|bbb|899|message
with eliminated
1|145|xxx|10000|message <- has same timestamp(10000) as message(2) belongs to the same conversation(145) but message id is lowest
5|1743|me|1200|message <- has message_from == me
example group with same timestamp
i want from this group row 3 but i get row 2 from query
SELECT max(message_timestamp), message_id, message_text, message_conversationId
FROM MESSAGES
WHERE message_from <> 'me'
GROUP BY message_conversationId
ORDER by message_Timestamp DESC
what is on my mind to do union from message_id & timestamp and then get max???

Your query is based on non-standard use of GROUP BY (I think SQLite allows that only for compatibility with MySQL) and I'm not at all sure that it will produce determinate results all the time.
Plus it uses MAX() on concatenated columns. Unless you somehow ensure that the two (concatenated) columns have fixed widths, the results will not be accurate for that reason as well.
I would write the query like this:
SELECT
m.message_timestamp,
m.message_id,
m.message_text,
m.message_conversationId
FROM
( SELECT message_conversationId -- for every conversation
FROM messages as m
WHERE message_from <> 'me'
GROUP BY message_conversationId
) AS mc
JOIN
messages AS m -- join to the messages
ON m.message_id =
( SELECT mi.message_id -- and find one message id
FROM messages AS mi
WHERE mi.message_conversationId -- for that conversation
= mc.message_conversationId
AND mi.message_from <> 'me'
ORDER BY mi.message_timestamp DESC, -- according to the
mi.message_id DESC -- specified order
LIMIT 1 -- (this is the one part)
) ;

Try below sql to achieve your purpose by group by twice.
select m.*
from
Messages m
-- 3. and then joining to get wanted output columns
inner join
(
--2. then selecting from this max timestamp - and removing duplicates
select conversation_id, max(timestamp), message_id
from
(
-- 1. first select max message_id in remainings after the removal of duplicates from mix of cv_id & timestamp
select conversation_id, timestamp, max(message_id) message_id
from Messages
where message <> 'me'
group by conversation_id, timestamp
) max_mid
group by conversation_id
) max_mid_ts on max_mid_ts.message_id = m.message_id
order by m.message_id;
http://goo.gl/MyZjyU

ok it was more simple than I thought:
basically to change select from:
max(message_timestamp)
to:
max(message_timestamp || message_id)
or max(message_timestamp + message_id)
so it will search for max on concatenation of timestamp and message_id
ps. after a digging - it's working only if message id is growing with timestamp ( order of insertion is preserved )
edit:
edit2 :
so why it works ?
SELECT max(message_timestamp+message_id), message_timestamp, message_id, message_conversationId, message_from,message_text
FROM MESSAGES
WHERE message_conversationId = 1521521
AND message_from <> 'me'
ORDER by message_Timestamp DESC

Related

Spool space error when inserting large result set to table

I have a SQL query in teradata that returns a results set of ~160m rows in (I guess) a reasonable time: dependent on how good a day the server is having it runs between 10-60 minutes.
I recently got access to space to save it as a table, however using my initial query and the "insert into " command I get error 2646-no more spool.
query structure is
insert into <test_DB.tablename>
with smaller_dataset as
(
select
*
from
(
select
items
,case items
from
<Database.table>
QUALIFY ROW_NUMBER() OVER (PARTITION BY A,B ORDER BY C desc , LAST_UPDATE_DTM DESC) = 1
where 1=1
and other things
) T --irrelevant alias for subquery
QUALIFY ROW_NUMBER() OVER (PARTITION BY A, B ORDER BY C desc) = 1)
, employee_table as
(
select
items
,max(J1.field1) J1_field1
,max(J2.field1) J2_field1
,max(J3.field1) J3_field1
,max(J4.field1) J4_field1
from smaller_dataset S
self joins J1,J2,J3,J4
group by
non-aggregate items
)
select
items
case items
from employee_table
;
How can I break up the return into smaller chunks to prevent this error?

PLSQL query union sum query

I need to output running balance in the following query, Please help me with the query-code:
select
WEAVING_YARN_TR.TR_ID,
WEAVING_YARN_TR.GP_NO,
WEAVING_YARN_TR.YR_ID,
WEAVING_YARN_GATEPASS.TO_FROM as PARTY,
WEAVING_YARN_GATEPASS.GP_DATE,
WEAVING_YARN_TR.BAGS_IN,
WEAVING_YARN_TR.BAGS_OUT,
SUM(NVL(WEAVING_YARN_TR.BAGS_OUT,0)-NVL(WEAVING_YARN_TR.BAGS_IN,0)) over (order by GP_DATE) as BALANCE
from WEAVING_YARN_TR,WEAVING_YARN_GATEPASS
where WEAVING_YARN_TR.GP_NO=WEAVING_YARN_GATEPASS."GP_NO." and WEAVING_YARN_TR.YR_ID=:P216_YR_ID and WEAVING_YARN_GATEPASS.TO_FROM=:P216_YR_DESC
UNION ALL
select
null as TR_ID,
WEAVING_WARP_SET.SET_ID,
WEAVING_WARP_SET.YARN_ITEM,
WEAVING_WARP_SET.PARTY,
WEAVING_WARP_SET.RECEIVED_DATE,
WEAVING_WARP_SET.TOTAL_BAGS_CONSUMED,
null as BAGS_OUT,
NVL(null,0)-NVL(WEAVING_WARP_SET.TOTAL_BAGS_CONSUMED,0) as BALANCE
from WEAVING_WARP_SET
where WEAVING_WARP_SET.YARN_ITEM=:P216_YR_ID and WEAVING_WARP_SET.PARTY= :P216_YR_DESC
order by GP_DATE
So here is the thing; if you're going to use UNION or UNION ALL, each query block has to be the equal column and type.
For example; We have 2 queries and we want to bind them.
The first query block contains 3 columns and their type NUMBER, VARCHAR, NUMBER (in order). The second block has to be 3 column and types must be NUMBER, VARCHAR, VARCHAR (in order)
So your query does not seem like that. You want an output like the screenshot that you sent. Here is the query that edited:
select
weaving_yarn_gatepass.gp_date
,weaving_yarn_tr.gp_no
,weaving_yarn_gatepass.to_from as party
,weaving_yarn_tr.yr_id
,weaving_yarn_tr.bags_in
,weaving_yarn_tr.bags_out
,sum(nvl(weaving_yarn_tr.bags_out,0) - nvl(weaving_yarn_tr.bags_in,0)) over(order by gp_date) as balance
from weaving_yarn_tr
,weaving_yarn_gatepass
where weaving_yarn_tr.gp_no = WEAVING_YARN_GATEPASS."GP_NO."
and weaving_yarn_tr.yr_id = :p216_yr_id
and weaving_yarn_gatepass.to_from = :p216_yr_desc
union all
select
weaving_warp_set.received_date --I don't know if it means gp_date
,null as gp_no
,weaving_warp_set.party
,null yr_id -- If there is a column named YR_ID in WEAVING_WARP_SET put here
,weaving_warp_set.total_bags_consumed
,null as bags_out
,nvl(null,0) - nvl(weaving_warp_set.total_bags_consumed,0) as balance
from weaving_warp_set
where weaving_warp_set.yarn_item = :p216_yr_id
and weaving_warp_set.party = :p216_yr_desc
order by gp_date;

Cannot replace a string with several random strings taken from another table in sqlite

I'm trying to replace a placeholder string inside a selection of 10 random records with a random string (a name) taken from another table, using only sqlite statements.
i've done a subquery in order to replace() of the placeholder with the results of a subquery. I thought that each subquery loaded a random name from the names table, but i've found that it's not the case and each placeholder is replaced with the same string.
select id, (replace (snippet, "%NAME%", (select
name from names
where gender = "male"
) )
) as snippet
from imagedata
where timestamp is not NULL
order by random()
limit 10
I was expecting for each row of the SELECT to have different random replacement every time the subquery is invoked.
hello i'm %NAME% and this is my house
This is the car of %NAME%, let me know what you think
instead each row has the same kind of replacement:
hello i'm david and this is my house
This is the car of david, let me know what you think
and so on...
I'm not sure it can be done inside sqlite or if i have to do it in php over two different database queries.
Thanks in advance!
Seems that random() in the subquery is only evaluated once.
Try this:
select
i.id,
replace(i.snippet, '%NAME%', n.name) snippet
from (
select
id,
snippet,
abs(random()) % (select count(*) from names where gender = 'male') + 1 num
from imagedata
where timestamp is not NULL
order by random() limit 10
) i inner join (
select
n.name,
(select count(*) from names where name < n.name and gender = 'male') + 1 num
from names n
where gender = 'male'
) n on n.num = i.num

SQLite Nested Query for maximum

I'm trying to use DB Browser for SQLite to construct a nested query to determine the SECOND highest priced item purchased by the top 10 spenders. The query I have to pick out the top 10 spenders is:
SELECT user_id, max(item_total), SUM (item_total + shipping_cost -
discounts_applied) AS total_spent
FROM orders AS o
WHERE payment_reject = "FALSE"
GROUP BY user_id
ORDER BY total_spent DESC
LIMIT 10
This gives the user_id, most expensive item they purchased (not counting shipping or discounts) as well as the total amount they spent on the site.
I was trying to use a nested query to generate a list of the second most expensive items they purchased, but keep getting errors. I've tried
SELECT user_id, MAX(item_total) AS second_highest
FROM orders
WHERE item_total < (SELECT user_id, SUM (item_total + shipping_cost -
discounts_applied) AS total_spent
FROM orders
WHERE payment_reject = "FALSE"
GROUP BY user_id
ORDER BY total_spent DESC
LIMIT 10)
group by user_id
I keep getting a row value misused error. Does anyone have pointers on this nested query or know of another way to find the second highest item purchased from within the group found in the first query?
Thanks!
(Note: The following assumes you're using Sqlite 3.25 or newer since it uses window functions).
This will return the second-largest item_total for each user_id without duplicates:
WITH ranked AS
(SELECT DISTINCT user_id, item_total
, dense_rank() OVER (PARTITION BY user_id ORDER BY item_total DESC) AS ranking
FROM orders)
SELECT user_id, item_total FROM ranked WHERE ranking = 2;
You can combine it with your original query with something like:
WITH ranked AS
(SELECT DISTINCT user_id, item_total
, dense_rank() OVER (PARTITION BY user_id ORDER BY item_total DESC) AS ranking
FROM orders),
totals AS
(SELECT user_id
, sum (item_total + shipping_cost - discounts_applied) AS total_spent
FROM orders
WHERE payment_reject = 0
GROUP BY user_id)
SELECT t.user_id, r.item_total, t.total_spent
FROM totals AS t
JOIN ranked AS r ON t.user_id = r.user_id
WHERE r.ranking = 2
ORDER BY t.total_spent DESC, t.user_id
LIMIT 10;
Okay, after fixing your table definition to better reflect the values being stored in it and the stated problem, and fixing the data and adding to it so you can actually get results, plus an optional but useful index like so:
CREATE TABLE orders (order_id INTEGER PRIMARY KEY
, user_id INTEGER
, item_total REAL
, shipping_cost NUMERIC
, discounts_applied NUMERIC
, payment_reject INTEGER);
INSERT INTO orders(user_id, item_total, shipping_cost, discounts_applied
, payment_reject) VALUES (9852,60.69,10,0,FALSE),
(2784,123.91,15,0,FALSE), (1619,119.75,15,0,FALSE), (9725,151.92,15,0,FALSE),
(8892,153.27,15,0,FALSE), (7105,156.86,25,0,FALSE), (4345,136.09,15,0,FALSE),
(7779,134.93,15,0,FALSE), (3874,157.27,15,0,FALSE), (5102,108.3,10,0,FALSE),
(3098,59.97,10,0,FALSE), (6584,124.92,15,0,FALSE), (5136,111.06,10,0,FALSE),
(1869,113.44,20,0,FALSE), (3830,129.63,15,0,FALSE), (9852,70.69,10,0,FALSE),
(2784,134.91,15,0,FALSE), (1619,129.75,15,0,FALSE), (9725,161.92,15,0,FALSE),
(8892,163.27,15,0,FALSE), (7105,166.86,25,0,FALSE), (4345,146.09,15,0,FALSE),
(7779,144.93,15,0,FALSE), (3874,167.27,15,0,FALSE), (5102,118.3,10,0,FALSE),
(3098,69.97,10,0,FALSE), (6584,134.92,15,0,FALSE), (5136,121.06,10,0,FALSE),
(1869,123.44,20,0,FALSE), (3830,139.63,15,0,FALSE);
CREATE INDEX orders_idx_1 ON orders(user_id, item_total DESC);
the above query will give:
user_id item_total total_spent
---------- ---------- -----------
7105 156.86 373.72
3874 157.27 354.54
8892 153.27 346.54
9725 151.92 343.84
4345 136.09 312.18
7779 134.93 309.86
3830 129.63 299.26
6584 124.92 289.84
2784 123.91 288.82
1619 119.75 279.5
(If you get a syntax error from the query now, it's because you're using an old version of sqlite that doesn't support window functions.)

sort semicolon separated values per row in a column

I want to sort semicolon separated values per row in a column. Eg.
Input:
abc;pqr;def;mno
xyz;pqr;abc
abc
xyz;jkl
Output:
abc;def;mno;pqr
abc;pqr;xyz
abc
jkl;xyz
Can anyone help?
Perhaps something like this. Breaking it down:
First we need to break up the strings into their component tokens, and then reassemble them, using LISTAGG(), while ordering them alphabetically.
There are many ways to break up a symbol-separated string. Here I demonstrate the use of a hierarchical query. It requires that the input strings be uniquely distinguished from each other. Since the exact same semicolon-separated string may appear more than once, and since there is no info from the OP about any other unique column in the table, I create a unique identifier (using ROW_NUMBER()) in the most deeply nested subquery. Then I run the hierarchical query to break up the inputs and then reassemble them in the outermost SELECT.
with
test_data as (
select 'abc;pqr;def;mno' as str from dual union all
select 'xyz;pqr;abc' from dual union all
select 'abc' from dual union all
select 'xyz;jkl' from dual
)
-- End of test data (not part of the solution!)
-- SQL query begins BELOW THIS LINE.
select str,
listagg(token, ';') within group (order by token) as sorted_str
from (
select rn, str,
regexp_substr(str, '([^;]*)(;|$)', 1, level, null, 1) as token
from (
select str, row_number() over (order by null) as rn
from test_data
)
connect by level <= length(str) - length(replace(str, ';')) + 1
and prior rn = rn
and prior sys_guid() is not null
)
group by rn, str
;
STR SORTED_STR
--------------- ---------------
abc;pqr;def;mno abc;def;mno;pqr
xyz;pqr;abc abc;pqr;xyz
abc abc
xyz;jkl jkl;xyz
4 rows selected.

Resources