How to associate the same code when doing SUM()? - window-functions

I am doing a SUM() OVER(PARTITION BY ORDER BY ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
Using SQL Server 2019
I have 2 tables: Sales and Inventory.
I am trying to associate inventory numbers (DWF_INVENTORY_NUMBER_DESC), which represent batches of item quantities in inventory (DWF_INVENTORY_QUANTITY), with items and their invoices. Each item (DWF_SALES_ITEM_CODE) invoiced with a certain sales quantity (DWF_SALES_SALES_QTY) represents a purchase (DWF_SALES_INVOICE_NUM) and should then be pulled from inventory (DWF_INVENTORY_QUANTITY reduced). I have multiple invoices for the same item, and multiple inventory batches for that item, each with its own batch number and quantity. I need to pull from inventory using a first-in, first-out (FIFO) strategy: only one inventory number (batch) at a time, starting with the earliest DWF_INVENTORY_DELIVERY_DATE. However, my script appears to pull from multiple inventory numbers (batches) for the same item, which I do not want.
My current & expected results are the following (note that my shrinking running sum DWF_INVENTORY_RUNNING_BALANCE_QTY is working well):
I feel like I am super close, but I just need that final piece of advice to get my results in order. The condition should be that the batch number with the oldest delivery date (which must also be older than the invoice date) is associated with the corresponding item and its invoice. Only once the initial inventory batch of 5000 units has reached zero should the next inventory batch number (for example, DWF_INVENTORY_NUMBER_DESC 13763002028961) be associated with newer invoice transactions.
See below scripts to recreate the entire scenario.
Build both sales and inventory tables with data:
CREATE TABLE [dbo].[DWF_INVENTORY](
[DWF_INVENTORY_ITEM_CD] [varchar](50) NULL,
[DWF_INVENTORY_DELIVERY_DATE] [varchar](50) NULL,
[DWF_INVENTORY_NUMBER_DESC] [varchar](50) NULL,
[DWF_INVENTORY_QUANTITY] [int] NULL,
[DWF_INVENTORY_ACCUMULATED_QTY] [int] NULL
) ON [PRIMARY]
GO
/****** Object: Table [dbo].[DWF_SALES] Script Date: 8/25/2021 12:13:27 PM ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[DWF_SALES](
[DWF_SALES_INVOICE_NUM] [int] NOT NULL,
[DWF_SALES_INVOICE_DATE] [datetime2](7) NOT NULL,
[DWF_SALES_ITEM_CODE] [nvarchar](50) NOT NULL,
[DWF_SALES_SALES_QTY] [int] NOT NULL
) ON [PRIMARY]
GO
INSERT [dbo].[DWF_INVENTORY] ([DWF_INVENTORY_ITEM_CD], [DWF_INVENTORY_DELIVERY_DATE], [DWF_INVENTORY_NUMBER_DESC], [DWF_INVENTORY_QUANTITY], [DWF_INVENTORY_ACCUMULATED_QTY]) VALUES (N'TU-1055', N'2019-10-31', N'13763001924657', 5000, 5000)
GO
INSERT [dbo].[DWF_INVENTORY] ([DWF_INVENTORY_ITEM_CD], [DWF_INVENTORY_DELIVERY_DATE], [DWF_INVENTORY_NUMBER_DESC], [DWF_INVENTORY_QUANTITY], [DWF_INVENTORY_ACCUMULATED_QTY]) VALUES (N'TU-1055', N'2020-11-20', N'13763002028961', 5000, 10000)
GO
INSERT [dbo].[DWF_INVENTORY] ([DWF_INVENTORY_ITEM_CD], [DWF_INVENTORY_DELIVERY_DATE], [DWF_INVENTORY_NUMBER_DESC], [DWF_INVENTORY_QUANTITY], [DWF_INVENTORY_ACCUMULATED_QTY]) VALUES (N'TU-1055', N'2021-01-08', N'13763002038565', 5000, 15000)
GO
INSERT [dbo].[DWF_SALES] ([DWF_SALES_INVOICE_NUM], [DWF_SALES_INVOICE_DATE], [DWF_SALES_ITEM_CODE], [DWF_SALES_SALES_QTY]) VALUES (395395, CAST(N'2021-05-13T00:00:00.0000000' AS DateTime2), N'TU-1055', 4)
GO
INSERT [dbo].[DWF_SALES] ([DWF_SALES_INVOICE_NUM], [DWF_SALES_INVOICE_DATE], [DWF_SALES_ITEM_CODE], [DWF_SALES_SALES_QTY]) VALUES (411239, CAST(N'2021-07-26T00:00:00.0000000' AS DateTime2), N'TU-1055', 100)
GO
INSERT [dbo].[DWF_SALES] ([DWF_SALES_INVOICE_NUM], [DWF_SALES_INVOICE_DATE], [DWF_SALES_ITEM_CODE], [DWF_SALES_SALES_QTY]) VALUES (378789, CAST(N'2021-02-23T00:00:00.0000000' AS DateTime2), N'TU-1055', 100)
GO
INSERT [dbo].[DWF_SALES] ([DWF_SALES_INVOICE_NUM], [DWF_SALES_INVOICE_DATE], [DWF_SALES_ITEM_CODE], [DWF_SALES_SALES_QTY]) VALUES (313564, CAST(N'2020-02-05T00:00:00.0000000' AS DateTime2), N'TU-1055', 30)
GO
INSERT [dbo].[DWF_SALES] ([DWF_SALES_INVOICE_NUM], [DWF_SALES_INVOICE_DATE], [DWF_SALES_ITEM_CODE], [DWF_SALES_SALES_QTY]) VALUES (327469, CAST(N'2020-05-04T00:00:00.0000000' AS DateTime2), N'TU-1055', 350)
GO
Here is my script with the current results:
SELECT
DWF_SALES_INVOICE_NUM,DWF_SALES_INVOICE_DATE
,DWF_INVENTORY_NUMBER_DESC
,DWF_INVENTORY_DELIVERY_DATE
,INVENTORY_NUM_STATUS
,DWF_INVENTORY_ITEM_CD
,CASE
WHEN DWF_INVENTORY_RUNNING_BALANCE_QTY < 0 THEN 0
ELSE DWF_INVENTORY_RUNNING_BALANCE_QTY
END AS DWF_INVENTORY_RUNNING_BALANCE_QTY
,DWF_INVENTORY_QUANTITY
,CASE
WHEN DWF_SALES_RUNNING_BALANCE_QTY < 0 THEN 0
ELSE DWF_SALES_RUNNING_BALANCE_QTY
END AS DWF_SALES_RUNNING_BALANCE_QTY
,DWF_SALES_SALES_QTY
FROM (
SELECT
DWF_SALES_INVOICE_NUM,DWF_SALES_INVOICE_DATE
,DWF_INVENTORY_NUMBER_DESC
,DWF_INVENTORY_DELIVERY_DATE
,INVENTORY_NUM_STATUS
,DWF_INVENTORY_ITEM_CD
,CASE
WHEN INVENTORY_NUM_STATUS='NOT ALLOCATED' THEN 0
ELSE (DWF_INVENTORY_QUANTITY-
SUM(DWF_SALES_SALES_QTY_forInvRunnBlnCalculate) OVER(PARTITION BY DWF_INVENTORY_ITEM_CD ORDER BY DWF_INVENTORY_DELIVERY_DATE,DWF_SALES_INVOICE_DATE
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW))
END AS DWF_INVENTORY_RUNNING_BALANCE_QTY
,DWF_INVENTORY_QUANTITY
,CASE
WHEN INVENTORY_NUM_STATUS='NOT ALLOCATED' THEN (DWF_INVENTORY_QUANTITY-DWF_SALES_SALES_QTY)
ELSE 0
END AS DWF_SALES_RUNNING_BALANCE_QTY
,DWF_SALES_SALES_QTY
FROM (
SELECT b.DWF_SALES_INVOICE_NUM,b.DWF_SALES_INVOICE_DATE
,a.DWF_INVENTORY_NUMBER_DESC
,a.DWF_INVENTORY_DELIVERY_DATE
,CASE
WHEN (a.DWF_INVENTORY_DELIVERY_DATE > b.DWF_SALES_INVOICE_DATE)
AND (a.DWF_INVENTORY_QUANTITY - SUM(b.DWF_SALES_SALES_QTY) OVER(PARTITION BY a.DWF_INVENTORY_ITEM_CD ORDER BY a.DWF_INVENTORY_DELIVERY_DATE,b.DWF_SALES_INVOICE_DATE
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)) < 1
THEN 'NOT ALLOCATED'
ELSE ''
END AS INVENTORY_NUM_STATUS
,CASE
WHEN (a.DWF_INVENTORY_DELIVERY_DATE > b.DWF_SALES_INVOICE_DATE)
AND (a.DWF_INVENTORY_QUANTITY - SUM(b.DWF_SALES_SALES_QTY) OVER(PARTITION BY a.DWF_INVENTORY_ITEM_CD ORDER BY a.DWF_INVENTORY_DELIVERY_DATE,b.DWF_SALES_INVOICE_DATE
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)) < 0
THEN 0
ELSE DWF_SALES_SALES_QTY
END AS DWF_SALES_SALES_QTY_forInvRunnBlnCalculate
,a.DWF_INVENTORY_ITEM_CD
,a.DWF_INVENTORY_QUANTITY - SUM(b.DWF_SALES_SALES_QTY) OVER(PARTITION BY a.DWF_INVENTORY_ITEM_CD ORDER BY a.DWF_INVENTORY_DELIVERY_DATE,b.DWF_SALES_INVOICE_DATE
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS DWF_INVENTORY_RUNNING_BALANCE_QTY
,a.DWF_INVENTORY_QUANTITY
,b.DWF_SALES_SALES_QTY - SUM(a.DWF_INVENTORY_QUANTITY) OVER(PARTITION BY a.DWF_INVENTORY_ITEM_CD ORDER BY a.DWF_INVENTORY_DELIVERY_DATE,b.DWF_SALES_INVOICE_DATE
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS DWF_SALES_RUNNING_BALANCE_QTY
,b.DWF_SALES_SALES_QTY
,ROW_NUMBER() OVER (PARTITION BY a.DWF_INVENTORY_ITEM_CD ORDER BY a.DWF_INVENTORY_DELIVERY_DATE,b.DWF_SALES_INVOICE_DATE) AS rowNum
FROM [dbo].[DWF_INVENTORY] AS a
INNER JOIN [dbo].[DWF_SALES] AS b ON a.DWF_INVENTORY_ITEM_CD=b.DWF_SALES_ITEM_CODE
WHERE b.DWF_SALES_SALES_QTY >= 0
) AS tblA) AS tblB;
I know that I need to create some kind of conditional logic for my DWF_INVENTORY_NUMBER_DESC column or perhaps create a subquery, a CTE, or a temporary table to store the MIN(DWF_INVENTORY_DELIVERY_DATE), but am not sure exactly how to proceed.
Summary of the conditions to associate the same inventory number in my resulting records, when ...
... the DWF_INVENTORY_NUMBER_DESC is associated with the right item (DWF_INVENTORY_ITEM_CD = DWF_SALES_ITEM_CODE)
... the DWF_INVENTORY_DELIVERY_DATE is the oldest/earliest
... the DWF_INVENTORY_DELIVERY_DATE < DWF_SALES_INVOICE_DATE
... the DWF_INVENTORY_RUNNING_BALANCE_QTY > 0
move on to next DWF_INVENTORY_NUMBER_DESC, when ...
... DWF_INVENTORY_RUNNING_BALANCE_QTY reaches zero
then follow again the above conditions.
Hopefully, this makes it clearer how the same DWF_INVENTORY_NUMBER_DESC should be re-used for the records in the resulting script.
Any help in completing my reduced running SUM script would be appreciated.
Hopefully, all of my explanations above were clear enough as it's not a super straightforward problem!

So, I have resolved my own issue by writing to a temp table and storing the Inventory number where MIN(Delivery Date), which gets updated accordingly.
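The temp-table script itself is not shown above. As a different technique (not the one the author describes), FIFO allocation can also be sketched with two running totals: each batch covers a range of cumulative inventory units, each invoice covers a range of cumulative sold units, and an invoice draws from whichever batch ranges it overlaps. The sketch below runs the idea in SQLite through Python so it is self-contained. The CTE names (inv, sal) and the derived columns (inv_end, sold_end, allocated_qty, batch_balance) are made up for illustration, and the rule that the delivery date must precede the invoice date is deliberately left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DWF_INVENTORY (
    DWF_INVENTORY_ITEM_CD TEXT,
    DWF_INVENTORY_DELIVERY_DATE TEXT,
    DWF_INVENTORY_NUMBER_DESC TEXT,
    DWF_INVENTORY_QUANTITY INTEGER
);
CREATE TABLE DWF_SALES (
    DWF_SALES_INVOICE_NUM INTEGER,
    DWF_SALES_INVOICE_DATE TEXT,
    DWF_SALES_ITEM_CODE TEXT,
    DWF_SALES_SALES_QTY INTEGER
);
INSERT INTO DWF_INVENTORY VALUES
    ('TU-1055', '2019-10-31', '13763001924657', 5000),
    ('TU-1055', '2020-11-20', '13763002028961', 5000),
    ('TU-1055', '2021-01-08', '13763002038565', 5000);
INSERT INTO DWF_SALES VALUES
    (395395, '2021-05-13', 'TU-1055', 4),
    (411239, '2021-07-26', 'TU-1055', 100),
    (378789, '2021-02-23', 'TU-1055', 100),
    (313564, '2020-02-05', 'TU-1055', 30),
    (327469, '2020-05-04', 'TU-1055', 350);
""")

fifo_sql = """
WITH inv AS (   -- inv_end: cumulative units delivered up to and including this batch
    SELECT *, SUM(DWF_INVENTORY_QUANTITY) OVER (
                  PARTITION BY DWF_INVENTORY_ITEM_CD
                  ORDER BY DWF_INVENTORY_DELIVERY_DATE, DWF_INVENTORY_NUMBER_DESC
                  ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS inv_end
    FROM DWF_INVENTORY
),
sal AS (        -- sold_end: cumulative units sold up to and including this invoice
    SELECT *, SUM(DWF_SALES_SALES_QTY) OVER (
                  PARTITION BY DWF_SALES_ITEM_CODE
                  ORDER BY DWF_SALES_INVOICE_DATE, DWF_SALES_INVOICE_NUM
                  ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS sold_end
    FROM DWF_SALES
)
SELECT s.DWF_SALES_INVOICE_NUM,
       i.DWF_INVENTORY_NUMBER_DESC,
       -- overlap of the two ranges = smaller end minus larger start
       CASE WHEN i.inv_end < s.sold_end THEN i.inv_end ELSE s.sold_end END
     - CASE WHEN i.inv_end - i.DWF_INVENTORY_QUANTITY > s.sold_end - s.DWF_SALES_SALES_QTY
            THEN i.inv_end - i.DWF_INVENTORY_QUANTITY
            ELSE s.sold_end - s.DWF_SALES_SALES_QTY END AS allocated_qty,
       -- units left in this batch after the invoice
       i.inv_end
     - CASE WHEN i.inv_end < s.sold_end THEN i.inv_end ELSE s.sold_end END AS batch_balance
FROM sal AS s
JOIN inv AS i
  ON i.DWF_INVENTORY_ITEM_CD = s.DWF_SALES_ITEM_CODE
 AND i.inv_end - i.DWF_INVENTORY_QUANTITY < s.sold_end   -- batch starts before the invoice range ends
 AND s.sold_end - s.DWF_SALES_SALES_QTY < i.inv_end      -- invoice starts before the batch range ends
ORDER BY s.DWF_SALES_INVOICE_DATE, i.DWF_INVENTORY_DELIVERY_DATE
"""

allocations = conn.execute(fifo_sql).fetchall()
for row in allocations:
    print(row)
```

With the sample data, all five invoices total 584 units, so every row is drawn from the oldest batch (13763001924657). An invoice that straddles the end of a batch would produce two rows, one per batch it draws from. The query uses only CASE expressions and window SUMs, so it should port to SQL Server 2019 with little change, but that has not been verified here.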

Related

Window function lag() in trigger uses default instead of previous value

I'm trying to create an SQLite trigger to update balance for a particular account code.
accounts table :
CREATE TABLE accounts (
year INTEGER NOT NULL,
month INTEGER NOT NULL CHECK(month BETWEEN 1 AND 12),
amount REAL NOT NULL CHECK(amount >= 0),
balance REAL,
code INTEGER NOT NULL
);
When a new row is inserted I want the balance value of the new row to reflect OLD balance + NEW amount. But this trigger does not recognize the lagging balance value and I cannot figure out why:
CREATE TRIGGER trg_accounts_balance
AFTER INSERT ON accounts
BEGIN
    UPDATE accounts
    SET balance = (
        SELECT
            lag(balance, 1, 0) OVER (
                PARTITION BY code
                ORDER BY month
            ) + NEW.amount
        FROM accounts
    )
    WHERE rowid = NEW.ROWID;
END;
If I insert one row per month, I expect my data to look like:
year  month  amount  balance  code
2022  1      100.0   100.0    100
2022  2      9.99    109.99   100
But I get:
year  month  amount  balance  code
2022  1      100.0   100.0    100
2022  2      9.99    9.99     100
What am I doing wrong?
The query:
SELECT
lag(balance, 1, 0) OVER (
PARTITION BY code
ORDER BY month
)
FROM accounts
returns as many rows as there are in the table, and SQLite takes only the first one (whichever it happens to be) as the value of the scalar subquery, to which NEW.amount is then added.
There is nothing that links this value to the specific row that was inserted.
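To make this concrete, here is a small sketch using Python's built-in sqlite3 module (window functions need SQLite 3.25 or newer). It loads the two rows from the question, with the second balance still NULL as it would be when the trigger fires, and runs the lag() query on its own. The variable name lag_rows is only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE accounts (
    year INTEGER NOT NULL,
    month INTEGER NOT NULL CHECK(month BETWEEN 1 AND 12),
    amount REAL NOT NULL CHECK(amount >= 0),
    balance REAL,
    code INTEGER NOT NULL
);
INSERT INTO accounts VALUES (2022, 1, 100.0, 100.0, 100);
INSERT INTO accounts VALUES (2022, 2, 9.99, NULL, 100);
""")

# One result row per table row: the query is not tied to the new row at all.
lag_rows = conn.execute("""
    SELECT month,
           lag(balance, 1, 0) OVER (PARTITION BY code ORDER BY month)
    FROM accounts
""").fetchall()
print(lag_rows)
```

The row for month 1 has no predecessor, so it carries the default 0, while the row for month 2 carries 100.0. When that default-valued row is the one SQLite picks for the scalar subquery, the new balance becomes 0 + 9.99, which matches the result reported in the question.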
Instead, use this:
CREATE TRIGGER trg_accounts_balance
AFTER INSERT ON accounts
BEGIN
    UPDATE accounts
    SET balance = COALESCE(
        (
            SELECT balance
            FROM accounts
            WHERE code = NEW.code
            ORDER BY year DESC, month DESC
            LIMIT 1, 1
        ), 0) + NEW.amount
    WHERE rowid = NEW.ROWID;
END;
The subquery returns the previous inserted row by ordering the rows of the specific code descending and skipping the top row (which is the new row).
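A minimal way to try the corrected trigger is with Python's built-in sqlite3 module. The sketch assumes, as the question does, that rows for a given code are inserted in chronological order; the variable name balances is only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE accounts (
    year INTEGER NOT NULL,
    month INTEGER NOT NULL CHECK(month BETWEEN 1 AND 12),
    amount REAL NOT NULL CHECK(amount >= 0),
    balance REAL,
    code INTEGER NOT NULL
);

CREATE TRIGGER trg_accounts_balance
AFTER INSERT ON accounts
BEGIN
    UPDATE accounts
    SET balance = COALESCE(
        (
            SELECT balance
            FROM accounts
            WHERE code = NEW.code
            ORDER BY year DESC, month DESC
            LIMIT 1, 1   -- skip the newest row (the one just inserted), take the next
        ), 0) + NEW.amount
    WHERE rowid = NEW.ROWID;
END;
""")

# Insert one row per month without a balance; the trigger fills it in.
conn.execute("INSERT INTO accounts (year, month, amount, code) VALUES (2022, 1, 100.0, 100)")
conn.execute("INSERT INTO accounts (year, month, amount, code) VALUES (2022, 2, 9.99, 100)")

balances = [row[0] for row in
            conn.execute("SELECT balance FROM accounts ORDER BY year, month")]
print(balances)  # first balance is 100.0, second is approximately 109.99
```

If rows can arrive out of order (for example, a January row inserted after February), the "skip the top row" logic no longer finds the true predecessor, and the subquery would need a condition on year and month instead.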

Spool space error when inserting large result set to table

I have a SQL query in Teradata that returns a result set of ~160m rows in what is (I guess) a reasonable time: depending on how good a day the server is having, it runs for between 10 and 60 minutes.
I recently got access to space to save the result as a table. However, when I use my initial query with an "insert into" command, I get error 2646 (no more spool).
The query structure is:
insert into <test_DB.tablename>
with smaller_dataset as
(
select
*
from
(
select
items
,case items
from
<Database.table>
QUALIFY ROW_NUMBER() OVER (PARTITION BY A,B ORDER BY C desc , LAST_UPDATE_DTM DESC) = 1
where 1=1
and other things
) T --irrelevant alias for subquery
QUALIFY ROW_NUMBER() OVER (PARTITION BY A, B ORDER BY C desc) = 1)
, employee_table as
(
select
items
,max(J1.field1) J1_field1
,max(J2.field1) J2_field1
,max(J3.field1) J3_field1
,max(J4.field1) J4_field1
from smaller_dataset S
self joins J1,J2,J3,J4
group by
non-aggregate items
)
select
items
case items
from employee_table
;
How can I break up the return into smaller chunks to prevent this error?

SQLite: Running balance with an ending balance

I have an ending balance of $5000. I need to create a running balance, but adjust the first row to show the ending balance then sum the rest, so it will look like a bank statement. Here is what I have for the running balance but how can I adjust row 1 to not show a sum of the first row, but the ending balance instead.
with BalBefore as (
select *
from transactions
where ACCT_NAME = 'Real Solutions'
ORDER BY DATE DESC
)
select
DATE,
amount,
'$' || printf("%.2f", sum(AMOUNT) over (order by ROW_ID)) as Balance
from BalBefore;
This gives me:
DATE AMOUNT BALANCE
9/6/2019 -31.00 $-31.00 <- I need this balance to be replaced with $5000 and have the rest
9/4/2019 15.00 $-16.00 sum as normal.
9/4/2019 15.00 $-1.00
9/3/2019 -16.00 $-17.00
I have read many other questions, but I couldn't find one that I could understand so I thought I would post a simpler question.
The following is not short and sweet, but by using the WITH statement and CTEs, I hope the logic is apparent. Multiple CTEs are defined which refer to each other to make the overall query more readable. Altogether, the goal was just to add a beginning balance record that could be summed along with the other rows:
/*
DROP TABLE IF EXISTS data;
CREATE temp TABLE data (
id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
date DATETIME NOT NULL,
amount NUMERIC NOT NULL
);
INSERT INTO data
(date, amount)
VALUES
('2019-09-03', -16.00),
('2019-09-04', 15.00),
('2019-09-04', 15.00),
('2019-09-06', -31.00)
;
*/
WITH
initial_filter AS (
SELECT id, date, amount
FROM data
--WHERE ACCT_NAME = 'Real Solutions'
),
prepared AS (
SELECT *
FROM initial_filter
UNION ALL
SELECT
9223372036854775807 as id, --largest signed integer
(SELECT MAX(date) FROM initial_filter) AS FinalDate,
-(5000.00) --ending balance (negated for summing algorithm)
),
running AS (
SELECT
id,
date,
amount,
SUM(-amount) OVER
(ORDER BY date DESC, id DESC
RANGE UNBOUNDED PRECEDING
EXCLUDE CURRENT ROW) AS balance
FROM prepared
ORDER BY date DESC, id DESC
)
SELECT *
FROM running
WHERE id != 9223372036854775807
ORDER BY date DESC, id DESC;
This produces the following
id date amount balance
4 2019-09-06 -31.00 5000
3 2019-09-04 15.00 5031
2 2019-09-04 15.00 5016
1 2019-09-03 -16.00 5001
UPDATE: The first query was not producing the correct balances. The beginning balance row and the windowing function (i.e. OVER clause) were updated to accurately sum over the correct amounts.
Note: The balance on each row is determined completely from the previous rows, not from the current row's amount, because this works backward from an ending balance, not forward from the previous row balance.
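The same backward calculation can also be sketched without the extra sentinel row, by subtracting from the ending balance the sum of all rows that come before the current one in descending order. This is an alternative formulation, not the query above: it uses a ROWS frame ending at 1 PRECEDING (SQLite 3.25 or newer) instead of EXCLUDE, and the names ending_balance and statement are made up for the example. It is run here through Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE data (
    id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
    date TEXT NOT NULL,
    amount NUMERIC NOT NULL
);
INSERT INTO data (date, amount) VALUES
    ('2019-09-03', -16.00),
    ('2019-09-04', 15.00),
    ('2019-09-04', 15.00),
    ('2019-09-06', -31.00);
""")

ending_balance = 5000.0

# For each row, the frame holds every later transaction (newest first),
# so the balance is the ending balance with those later amounts undone.
statement = conn.execute("""
    SELECT id, date, amount,
           ? - COALESCE(SUM(amount) OVER (
                   ORDER BY date DESC, id DESC
                   ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING), 0) AS balance
    FROM data
    ORDER BY date DESC, id DESC
""", (ending_balance,)).fetchall()

for row in statement:
    print(row)
```

On the sample data this yields the same balances as the table above: 5000, 5031, 5016, 5001. The COALESCE handles the newest row, whose frame is empty and would otherwise sum to NULL.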

SQLite Nested Query for maximum

I'm trying to use DB Browser for SQLite to construct a nested query to determine the SECOND highest priced item purchased by the top 10 spenders. The query I have to pick out the top 10 spenders is:
SELECT user_id, max(item_total), SUM (item_total + shipping_cost -
discounts_applied) AS total_spent
FROM orders AS o
WHERE payment_reject = "FALSE"
GROUP BY user_id
ORDER BY total_spent DESC
LIMIT 10
This gives the user_id, most expensive item they purchased (not counting shipping or discounts) as well as the total amount they spent on the site.
I was trying to use a nested query to generate a list of the second most expensive items they purchased, but keep getting errors. I've tried
SELECT user_id, MAX(item_total) AS second_highest
FROM orders
WHERE item_total < (SELECT user_id, SUM (item_total + shipping_cost -
discounts_applied) AS total_spent
FROM orders
WHERE payment_reject = "FALSE"
GROUP BY user_id
ORDER BY total_spent DESC
LIMIT 10)
group by user_id
I keep getting a row value misused error. Does anyone have pointers on this nested query or know of another way to find the second highest item purchased from within the group found in the first query?
Thanks!
(Note: The following assumes you're using SQLite 3.25 or newer, since it uses window functions.)
This will return the second-largest item_total for each user_id without duplicates:
WITH ranked AS
(SELECT DISTINCT user_id, item_total
, dense_rank() OVER (PARTITION BY user_id ORDER BY item_total DESC) AS ranking
FROM orders)
SELECT user_id, item_total FROM ranked WHERE ranking = 2;
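A tiny example shows what "without duplicates" buys here. The sketch below uses Python's sqlite3 module and a made-up two-column orders table in which each user has a tied price; the data and the variable name second_highest are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (user_id INTEGER, item_total REAL);
INSERT INTO orders VALUES
    (1, 50.0), (1, 50.0), (1, 40.0),   -- tie for the highest price
    (2, 30.0), (2, 20.0), (2, 20.0);   -- tie for the second-highest price
""")

second_highest = conn.execute("""
    WITH ranked AS
     (SELECT DISTINCT user_id, item_total
           , dense_rank() OVER (PARTITION BY user_id ORDER BY item_total DESC) AS ranking
      FROM orders)
    SELECT user_id, item_total FROM ranked WHERE ranking = 2
    ORDER BY user_id
""").fetchall()
print(second_highest)  # [(1, 40.0), (2, 20.0)]
```

Because dense_rank() leaves no gaps after ties, user 1's 40.0 still gets ranking 2 even though two rows share the top price, and DISTINCT collapses user 2's two 20.0 rows into one.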
You can combine it with your original query with something like:
WITH ranked AS
(SELECT DISTINCT user_id, item_total
, dense_rank() OVER (PARTITION BY user_id ORDER BY item_total DESC) AS ranking
FROM orders),
totals AS
(SELECT user_id
, sum (item_total + shipping_cost - discounts_applied) AS total_spent
FROM orders
WHERE payment_reject = 0
GROUP BY user_id)
SELECT t.user_id, r.item_total, t.total_spent
FROM totals AS t
JOIN ranked AS r ON t.user_id = r.user_id
WHERE r.ranking = 2
ORDER BY t.total_spent DESC, t.user_id
LIMIT 10;
Okay, after fixing your table definition to better reflect the values being stored in it and the stated problem, and fixing the data and adding to it so you can actually get results, plus an optional but useful index like so:
CREATE TABLE orders (order_id INTEGER PRIMARY KEY
, user_id INTEGER
, item_total REAL
, shipping_cost NUMERIC
, discounts_applied NUMERIC
, payment_reject INTEGER);
INSERT INTO orders(user_id, item_total, shipping_cost, discounts_applied
, payment_reject) VALUES (9852,60.69,10,0,FALSE),
(2784,123.91,15,0,FALSE), (1619,119.75,15,0,FALSE), (9725,151.92,15,0,FALSE),
(8892,153.27,15,0,FALSE), (7105,156.86,25,0,FALSE), (4345,136.09,15,0,FALSE),
(7779,134.93,15,0,FALSE), (3874,157.27,15,0,FALSE), (5102,108.3,10,0,FALSE),
(3098,59.97,10,0,FALSE), (6584,124.92,15,0,FALSE), (5136,111.06,10,0,FALSE),
(1869,113.44,20,0,FALSE), (3830,129.63,15,0,FALSE), (9852,70.69,10,0,FALSE),
(2784,134.91,15,0,FALSE), (1619,129.75,15,0,FALSE), (9725,161.92,15,0,FALSE),
(8892,163.27,15,0,FALSE), (7105,166.86,25,0,FALSE), (4345,146.09,15,0,FALSE),
(7779,144.93,15,0,FALSE), (3874,167.27,15,0,FALSE), (5102,118.3,10,0,FALSE),
(3098,69.97,10,0,FALSE), (6584,134.92,15,0,FALSE), (5136,121.06,10,0,FALSE),
(1869,123.44,20,0,FALSE), (3830,139.63,15,0,FALSE);
CREATE INDEX orders_idx_1 ON orders(user_id, item_total DESC);
the above query will give:
user_id item_total total_spent
---------- ---------- -----------
7105 156.86 373.72
3874 157.27 354.54
8892 153.27 346.54
9725 151.92 343.84
4345 136.09 312.18
7779 134.93 309.86
3830 129.63 299.26
6584 124.92 289.84
2784 123.91 288.82
1619 119.75 279.5
(If you get a syntax error from the query now, it's because you're using an old version of SQLite that doesn't support window functions.)

Time Difference between query result rows in SQLite: How To?

Consider the following reviews table contents:
CustomerName ReviewDT
Doe,John 2011-06-20 10:13:24
Doe,John 2011-06-20 10:54:45
Doe,John 2011-06-20 11:36:34
Doe,Janie 2011-06-20 05:15:12
The results are ordered by ReviewDT and grouped by CustomerName, such as:
SELECT
CustomerName,
ReviewDT
FROM
Reviews
WHERE
CustomerName NOT NULL
ORDER BY CustomerName ASC, ReviewDT ASC;
I'd like to add a column with the time difference between consecutive rows of this query for each customer. rowid gives the original row, but there is no pattern to which rowids are included.
For the first entry for a CustomerName, the value would be 0. I am asking here in case this can be calculated as part of the original query. If not, I was planning to do this with a series of queries: first creating a new table from the results of the query above, then altering it to add the new column, and finally using UPDATE/strftime to get the time differences via rowid-1 (somehow).
To compute the seconds elapsed from one ReviewDT row to the next:
SELECT q.CustomerName, q.ReviewDT,
strftime('%s',q.ReviewDT)
- strftime('%s',coalesce((select r.ReviewDT from Reviews as r
where r.CustomerName = q.CustomerName
and r.ReviewDT < q.ReviewDT
order by r.ReviewDT DESC limit 1),
q.ReviewDT))
FROM Reviews as q WHERE q.CustomerName NOT NULL
ORDER BY q.CustomerName ASC, q.ReviewDT ASC;
To get each ReviewDT alongside the preceding ReviewDT for the same CustomerName:
SELECT q.CustomerName, q.ReviewDT,
coalesce((select r.ReviewDT from Reviews as r
where r.CustomerName = q.CustomerName
and r.ReviewDT < q.ReviewDT
order by r.ReviewDT DESC limit 1),
q.ReviewDT)
FROM Reviews as q WHERE q.CustomerName NOT NULL
ORDER BY q.CustomerName ASC, q.ReviewDT ASC;
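On SQLite 3.25 or newer, the same result can be sketched with the lag() window function instead of a correlated subquery. This is an alternative to the queries above, not a replacement the original answer proposed; the column alias seconds_since_previous and the variable name diffs are made up for the example, which runs through Python's sqlite3 module on the sample rows from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Reviews (CustomerName TEXT, ReviewDT TEXT);
INSERT INTO Reviews VALUES
    ('Doe,John',  '2011-06-20 10:13:24'),
    ('Doe,John',  '2011-06-20 10:54:45'),
    ('Doe,John',  '2011-06-20 11:36:34'),
    ('Doe,Janie', '2011-06-20 05:15:12');
""")

# lag(ReviewDT, 1, ReviewDT) falls back to the row's own timestamp for the
# first review of each customer, so that row's difference is 0.
diffs = conn.execute("""
    SELECT CustomerName, ReviewDT,
           strftime('%s', ReviewDT)
         - strftime('%s', lag(ReviewDT, 1, ReviewDT) OVER (
               PARTITION BY CustomerName ORDER BY ReviewDT)) AS seconds_since_previous
    FROM Reviews
    WHERE CustomerName IS NOT NULL
    ORDER BY CustomerName ASC, ReviewDT ASC
""").fetchall()

for row in diffs:
    print(row)
```

For the sample rows, the differences are 0 for each customer's first review, then 2481 and 2509 seconds for the second and third "Doe,John" reviews.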
