BigQuery join on closest date match - datetime

I'm trying to join two tables in BigQuery based on an id and the closest date match.
Transaction Table:
transactionId  dateTime             productId
4a50665e       2022-05-13T14:12:55  abc
7d5889cd       2022-05-22T16:10:21  abc
Product Log Table (log of when each productId is updated to a new version):
dateTime             productId  version
2022-05-19T06:37:24  abc        v2
2022-05-12T04:38:23  xyz        v1
2022-05-10T09:57:54  abc        v1
I want to add a version column to the transaction table by looking it up in the product log table, matching on productId and on the closest earlier dateTime, so that each transaction gets the product version that was active at the time of the transaction.
Desired Result:
transactionId  dateTime             productId  version
4a50665e       2022-05-13T14:12:55  abc        v1
7d5889cd       2022-05-22T16:10:21  abc        v2
Something like this
SELECT
t.*,
p.version
FROM
transaction_table t
LEFT JOIN
product_log_table p
ON
t.productId = p.productId AND
t.datetime < p.dateTime
But that doesn't work. I've tried searching a lot and tried a number of solutions but can't get anything to work. Should be simple? How do I do this?
Thanks for any help.

Consider the approach below:
select any_value(t).*,
string_agg(version, '' order by p.dateTime desc limit 1) as version
from Transaction_Table t
join Product_Log_Table p
on t.productId = p.productId
and t.dateTime >= p.dateTime
group by format('%t', t)
Applied to the sample data in your question, the output matches your desired result (v1 for 4a50665e, v2 for 7d5889cd).
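The same "latest log entry at or before the transaction" lookup can be checked locally. The following is a minimal sketch that uses Python's sqlite3 module rather than BigQuery, so the SQL dialect differs from the answer: it swaps the string_agg trick for a correlated subquery with ORDER BY ... LIMIT 1. The function name is made up for this example, and it assumes the timestamps are stored as ISO 8601 text so that string comparison matches chronological order.

```python
import sqlite3


def versions_at_transaction_time():
    """Return (transactionId, dateTime, productId, version) for the sample data."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE transaction_table (transactionId TEXT, dateTime TEXT, productId TEXT);
        CREATE TABLE product_log_table (dateTime TEXT, productId TEXT, version TEXT);
        INSERT INTO transaction_table VALUES
            ('4a50665e', '2022-05-13T14:12:55', 'abc'),
            ('7d5889cd', '2022-05-22T16:10:21', 'abc');
        INSERT INTO product_log_table VALUES
            ('2022-05-19T06:37:24', 'abc', 'v2'),
            ('2022-05-12T04:38:23', 'xyz', 'v1'),
            ('2022-05-10T09:57:54', 'abc', 'v1');
    """)
    # For each transaction, pick the newest log row for the same product
    # whose dateTime is not later than the transaction's dateTime.
    rows = con.execute("""
        SELECT t.transactionId, t.dateTime, t.productId,
               (SELECT p.version
                  FROM product_log_table AS p
                 WHERE p.productId = t.productId
                   AND p.dateTime <= t.dateTime
                 ORDER BY p.dateTime DESC
                 LIMIT 1) AS version
          FROM transaction_table AS t
         ORDER BY t.dateTime
    """).fetchall()
    con.close()
    return rows


for row in versions_at_transaction_time():
    print(row)
```

One difference from the inner join in the answer: a transaction with no earlier log entry is kept here, with version set to NULL, instead of being dropped.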

Related

Getting a min(date) AND max(date) AND their respective titles

I have three tables that I would like to select from
Table 1 has a bunch of static information about a user like their idnumber, name, registration date
Table 2 has the idnumber of the user, course number, and the date they registered for the course
Table 3 has the course number, and the title of the course
I am trying to use one query that will select the columns mentioned in table 1, with the most recent course they registered (name and date registered) as well as their first course registered (name and date registered)
Here is what I came up with
SELECT u.idst, u.userid, u.firstname, u.lastname, u.email, u.register_date,
MIN(l.date_inscr) as mindate, MAX(l.date_inscr) as maxdate, lc.coursename
FROM table1 u,table3 lc
LEFT JOIN table2 l
ON l.idCourse = lc.idCourse
WHERE u.idst = 12787
AND u.idst = l.idUser
This gives me everything I need, and the dates are correct, but I have no idea how to display BOTH course names: the most recent and the first.
Any help would be great.
Thanks!!!
You can get your desired results by generating the min/max date_inscr for each user in a derived table and then joining that twice to table2 and table3, once to get each course name:
SELECT u.idst, u.userid, u.firstname, u.lastname, u.email, u.register_date,
l.mindate, lc1.coursename as first_course,
l.maxdate, lc2.coursename as latest_course
FROM table1 u
LEFT JOIN (SELECT idUser, MIN(date_inscr) AS mindate, MAX(date_inscr) AS maxdate
FROM table2
WHERE idUser = 12787
GROUP BY idUser
) l ON l.idUser = u.idst
LEFT JOIN table2 l1 ON l1.idUser = l.idUser AND l1.date_inscr = l.mindate
LEFT JOIN table3 lc1 ON lc1.idCourse = l1.idCourse
LEFT JOIN table2 l2 ON l2.idUser = l.idUser AND l2.date_inscr = l.maxdate
LEFT JOIN table3 lc2 ON lc2.idCourse = l2.idCourse
WHERE u.idst = 12787
As @BillKarwin pointed out, this is more easily done using two separate queries.
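The two-query alternative could look roughly like the sketch below. It runs against an in-memory SQLite database through Python's sqlite3 module, which is an assumption made for the sake of a runnable example (the question does not name its database). The rows, the ISO date format, and the helper names make_course_db and course_at_edge are all invented for illustration.

```python
import sqlite3


def make_course_db():
    """Build a tiny in-memory database with hypothetical registrations."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE table2 (idUser INTEGER, idCourse INTEGER, date_inscr TEXT);
        CREATE TABLE table3 (idCourse INTEGER, coursename TEXT);
        -- Hypothetical rows, invented for this sketch.
        INSERT INTO table3 VALUES (1, 'Intro'), (2, 'Intermediate'), (3, 'Advanced');
        INSERT INTO table2 VALUES
            (12787, 2, '2015-03-01'),
            (12787, 1, '2014-09-15'),
            (12787, 3, '2016-01-20');
    """)
    return con


def course_at_edge(con, user_id, newest):
    """Return (date_inscr, coursename) for the user's first or latest course."""
    direction = "DESC" if newest else "ASC"  # fixed keywords, not user input
    return con.execute(f"""
        SELECT l.date_inscr, lc.coursename
          FROM table2 AS l
          JOIN table3 AS lc ON lc.idCourse = l.idCourse
         WHERE l.idUser = ?
         ORDER BY l.date_inscr {direction}
         LIMIT 1
    """, (user_id,)).fetchone()


con = make_course_db()
first_course = course_at_edge(con, 12787, newest=False)
latest_course = course_at_edge(con, 12787, newest=True)
print(first_course, latest_course)
```

Each query sorts the user's registrations and keeps one row, so the course name always belongs to the date it is returned with.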

SQLite Nested Query for maximum

I'm trying to use DB Browser for SQLite to construct a nested query to determine the SECOND highest priced item purchased by the top 10 spenders. The query I have to pick out the top 10 spenders is:
SELECT user_id, max(item_total), SUM (item_total + shipping_cost -
discounts_applied) AS total_spent
FROM orders AS o
WHERE payment_reject = "FALSE"
GROUP BY user_id
ORDER BY total_spent DESC
LIMIT 10
This gives the user_id, most expensive item they purchased (not counting shipping or discounts) as well as the total amount they spent on the site.
I was trying to use a nested query to generate a list of the second most expensive items they purchased, but keep getting errors. I've tried
SELECT user_id, MAX(item_total) AS second_highest
FROM orders
WHERE item_total < (SELECT user_id, SUM (item_total + shipping_cost -
discounts_applied) AS total_spent
FROM orders
WHERE payment_reject = "FALSE"
GROUP BY user_id
ORDER BY total_spent DESC
LIMIT 10)
group by user_id
I keep getting a row value misused error. Does anyone have pointers on this nested query or know of another way to find the second highest item purchased from within the group found in the first query?
Thanks!
(Note: The following assumes you're using SQLite 3.25 or newer, since it uses window functions.)
This will return the second-largest item_total for each user_id without duplicates:
WITH ranked AS
(SELECT DISTINCT user_id, item_total
, dense_rank() OVER (PARTITION BY user_id ORDER BY item_total DESC) AS ranking
FROM orders)
SELECT user_id, item_total FROM ranked WHERE ranking = 2;
You can combine it with your original query with something like:
WITH ranked AS
(SELECT DISTINCT user_id, item_total
, dense_rank() OVER (PARTITION BY user_id ORDER BY item_total DESC) AS ranking
FROM orders),
totals AS
(SELECT user_id
, sum (item_total + shipping_cost - discounts_applied) AS total_spent
FROM orders
WHERE payment_reject = 0
GROUP BY user_id)
SELECT t.user_id, r.item_total, t.total_spent
FROM totals AS t
JOIN ranked AS r ON t.user_id = r.user_id
WHERE r.ranking = 2
ORDER BY t.total_spent DESC, t.user_id
LIMIT 10;
Okay, after fixing your table definition to better reflect the values being stored in it and the stated problem, and fixing the data and adding to it so you can actually get results, plus an optional but useful index like so:
CREATE TABLE orders (order_id INTEGER PRIMARY KEY
, user_id INTEGER
, item_total REAL
, shipping_cost NUMERIC
, discounts_applied NUMERIC
, payment_reject INTEGER);
INSERT INTO orders(user_id, item_total, shipping_cost, discounts_applied
, payment_reject) VALUES (9852,60.69,10,0,FALSE),
(2784,123.91,15,0,FALSE), (1619,119.75,15,0,FALSE), (9725,151.92,15,0,FALSE),
(8892,153.27,15,0,FALSE), (7105,156.86,25,0,FALSE), (4345,136.09,15,0,FALSE),
(7779,134.93,15,0,FALSE), (3874,157.27,15,0,FALSE), (5102,108.3,10,0,FALSE),
(3098,59.97,10,0,FALSE), (6584,124.92,15,0,FALSE), (5136,111.06,10,0,FALSE),
(1869,113.44,20,0,FALSE), (3830,129.63,15,0,FALSE), (9852,70.69,10,0,FALSE),
(2784,134.91,15,0,FALSE), (1619,129.75,15,0,FALSE), (9725,161.92,15,0,FALSE),
(8892,163.27,15,0,FALSE), (7105,166.86,25,0,FALSE), (4345,146.09,15,0,FALSE),
(7779,144.93,15,0,FALSE), (3874,167.27,15,0,FALSE), (5102,118.3,10,0,FALSE),
(3098,69.97,10,0,FALSE), (6584,134.92,15,0,FALSE), (5136,121.06,10,0,FALSE),
(1869,123.44,20,0,FALSE), (3830,139.63,15,0,FALSE);
CREATE INDEX orders_idx_1 ON orders(user_id, item_total DESC);
the above query will give:
user_id item_total total_spent
---------- ---------- -----------
7105 156.86 373.72
3874 157.27 354.54
8892 153.27 346.54
9725 151.92 343.84
4345 136.09 312.18
7779 134.93 309.86
3830 129.63 299.26
6584 124.92 289.84
2784 123.91 288.82
1619 119.75 279.5
(If you get a syntax error from the query now, it's because you're using an old version of SQLite that doesn't support window functions.)
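If window functions are not available, a scalar correlated subquery is one possible fallback. The sketch below is not the answer's method: it finds, per user, the largest item_total that is strictly below that user's maximum, then joins the result to the spending totals. The "row value misused" error in the question comes from comparing item_total with a subquery that returns two columns, so the subquery here returns a single value. The demo rows are a subset of the answer's INSERT; the function names are made up.

```python
import sqlite3

SECOND_HIGHEST_SQL = """
SELECT t.user_id, s.second_highest, t.total_spent
  FROM (SELECT user_id,
               SUM(item_total + shipping_cost - discounts_applied) AS total_spent
          FROM orders
         WHERE payment_reject = 0
         GROUP BY user_id) AS t
  JOIN (SELECT o.user_id, MAX(o.item_total) AS second_highest
          FROM orders AS o
         -- The subquery returns exactly one column and one row per outer row.
         WHERE o.item_total < (SELECT MAX(o2.item_total)
                                 FROM orders AS o2
                                WHERE o2.user_id = o.user_id)
         GROUP BY o.user_id) AS s
    ON s.user_id = t.user_id
 ORDER BY t.total_spent DESC
 LIMIT ?
"""


def make_orders_db():
    """In-memory orders table holding a few rows taken from the answer's data."""
    con = sqlite3.connect(":memory:")
    con.execute("""CREATE TABLE orders (order_id INTEGER PRIMARY KEY,
                   user_id INTEGER, item_total REAL, shipping_cost NUMERIC,
                   discounts_applied NUMERIC, payment_reject INTEGER)""")
    con.executemany(
        "INSERT INTO orders(user_id, item_total, shipping_cost,"
        " discounts_applied, payment_reject) VALUES (?, ?, ?, ?, ?)",
        [(7105, 156.86, 25, 0, 0), (7105, 166.86, 25, 0, 0),
         (3874, 157.27, 15, 0, 0), (3874, 167.27, 15, 0, 0),
         (9852, 60.69, 10, 0, 0), (9852, 70.69, 10, 0, 0)])
    return con


def top_spenders_second_highest(con, limit=10):
    """Return (user_id, second_highest, total_spent) for the top spenders."""
    return con.execute(SECOND_HIGHEST_SQL, (limit,)).fetchall()


for row in top_spenders_second_highest(make_orders_db(), limit=2):
    print(row)
```

Because of the inner join, a user who has only one distinct item price has no second-highest item and is left out of the result.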

Iteration for a non-sequential column

Can someone help me?
For each "Costumer", I need to iterate over rows that have non-sequential IDs and update the "version" column.
Do I need a cursor, or something else?
Example:
ID COSTUMER VERSION
12 ANNA 1
24 ANNA 4
25 ANNA 5
60 ANNA 11
I want to correct the version numbers so that they are sequential.
You could use code something like this:
begin
for r in ( select id, row_number() over (partition by costumer order by version) as rn
from costumer
)
loop
update costumer
set version = r.rn
where id = r.id;
end loop;
end;
/
The partition by is there because I have assumed you want to have the sequence start from 1 for 'ANNA', then start from 1 again for customer 'JANE' etc. If not you can remove that part.
Here's the way to do it via a single MERGE statement:
MERGE INTO costumer tgt
USING (SELECT ID,
costumer,
VERSION,
ROWID row_id,
row_number() OVER (PARTITION BY costumer ORDER BY VERSION) new_version
FROM costumer) src
ON (tgt.rowid = src.row_id)
WHEN MATCHED THEN
UPDATE SET tgt.version = src.new_version;
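For readers who want to try the renumbering outside Oracle, here is a rough equivalent of the cursor loop, written with Python's sqlite3 module. It is a sketch under assumptions: the table and column names follow the question's example, the JANE rows are invented to show the per-customer restart, and the helper names are hypothetical. All rows are read first and the updates are applied afterwards, so the new numbers never influence the ordering.

```python
import sqlite3
from itertools import groupby


def make_costumer_db():
    """In-memory table with the question's ANNA rows plus invented JANE rows."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE costumer (id INTEGER PRIMARY KEY, costumer TEXT, version INTEGER)")
    con.executemany("INSERT INTO costumer VALUES (?, ?, ?)", [
        (12, "ANNA", 1), (24, "ANNA", 4), (25, "ANNA", 5), (60, "ANNA", 11),
        (7, "JANE", 3), (9, "JANE", 8),  # hypothetical second customer
    ])
    return con


def renumber_versions(con):
    """Rewrite version as 1, 2, 3, ... per customer, keeping the old order."""
    rows = con.execute(
        "SELECT id, costumer FROM costumer ORDER BY costumer, version"
    ).fetchall()
    updates = []
    # groupby works here because the rows are already sorted by customer.
    for _customer, group in groupby(rows, key=lambda row: row[1]):
        for new_version, (row_id, _name) in enumerate(group, start=1):
            updates.append((new_version, row_id))
    con.executemany("UPDATE costumer SET version = ? WHERE id = ?", updates)
    con.commit()


con = make_costumer_db()
renumber_versions(con)
print(con.execute("SELECT id, costumer, version FROM costumer ORDER BY id").fetchall())
```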

Time Difference between query result rows in SQLite: How To?

Consider the following reviews table contents:
CustomerName ReviewDT
Doe,John 2011-06-20 10:13:24
Doe,John 2011-06-20 10:54:45
Doe,John 2011-06-20 11:36:34
Doe,Janie 2011-06-20 05:15:12
The results are ordered by ReviewDT and grouped by CustomerName, such as:
SELECT
CustomerName,
ReviewDT
FROM
Reviews
WHERE
CustomerName NOT NULL
ORDER BY CustomerName ASC, ReviewDT ASC;
I'd like to create a column of the time difference between each row of this query for each Customer... rowid gives the original row, and there is no pattern to the inclusion from the rowid etc...
For the first entry for a CustomerName, the value would be 0. I am asking here in case this is something that can be calculated as part of the original query somehow. If not, I was planning to do this with a series of queries: first creating a new table from the results of the query above, then altering it to add the new column, and finally using UPDATE/strftime to get the time differences by using rowid-1 (somehow)...
To compute the seconds elapsed from one ReviewDT row to the next:
SELECT q.CustomerName, q.ReviewDT,
strftime('%s',q.ReviewDT)
- strftime('%s',coalesce((select r.ReviewDT from Reviews as r
where r.CustomerName = q.CustomerName
and r.ReviewDT < q.ReviewDT
order by r.ReviewDT DESC limit 1),
q.ReviewDT))
FROM Reviews as q WHERE q.CustomerName NOT NULL
ORDER BY q.CustomerName ASC, q.ReviewDT ASC;
To get each ReviewDT alongside the preceding ReviewDT for the same CustomerName:
SELECT q.CustomerName, q.ReviewDT,
coalesce((select r.ReviewDT from Reviews as r
where r.CustomerName = q.CustomerName
and r.ReviewDT < q.ReviewDT
order by r.ReviewDT DESC limit 1),
q.ReviewDT)
FROM Reviews as q WHERE q.CustomerName NOT NULL
ORDER BY q.CustomerName ASC, q.ReviewDT ASC;
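To check the first query against the sample rows from the question, it can be run through Python's sqlite3 module, as in this small sketch. The function name and the seconds_elapsed alias are additions for this example; the SQL is otherwise the elapsed-seconds query above, with NOT NULL spelled as IS NOT NULL.

```python
import sqlite3


def seconds_since_previous_review():
    """Return (CustomerName, ReviewDT, seconds since that customer's previous review)."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE Reviews (CustomerName TEXT, ReviewDT TEXT);
        INSERT INTO Reviews VALUES
            ('Doe,John',  '2011-06-20 10:13:24'),
            ('Doe,John',  '2011-06-20 10:54:45'),
            ('Doe,John',  '2011-06-20 11:36:34'),
            ('Doe,Janie', '2011-06-20 05:15:12');
    """)
    # The subquery finds the latest earlier review by the same customer;
    # coalesce falls back to the row's own time, giving 0 for a first review.
    rows = con.execute("""
        SELECT q.CustomerName, q.ReviewDT,
               strftime('%s', q.ReviewDT)
             - strftime('%s', coalesce((SELECT r.ReviewDT
                                          FROM Reviews AS r
                                         WHERE r.CustomerName = q.CustomerName
                                           AND r.ReviewDT < q.ReviewDT
                                         ORDER BY r.ReviewDT DESC
                                         LIMIT 1),
                                       q.ReviewDT)) AS seconds_elapsed
          FROM Reviews AS q
         WHERE q.CustomerName IS NOT NULL
         ORDER BY q.CustomerName ASC, q.ReviewDT ASC
    """).fetchall()
    con.close()
    return rows


for row in seconds_since_previous_review():
    print(row)
```

With the sample data, the two gaps for Doe,John come out as 2481 and 2509 seconds (41 min 21 s and 41 min 49 s).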

Getting All the record of particular month - Building SQL Query

I need some help to build SQL Query. I have table having data like:
ID Date Name
1 1/1/2009 a
2 1/2/2009 b
3 1/3/2009 c
I need to get result something like...
1 1/1/2009 a
2 1/2/2009 b
3 1/3/2009 c
4 1/4/2009 Null
5 1/5/2009 Null
6 1/6/2009 Null
7 1/7/2009 Null
8 1/8/2009 Null
............................
............................
............................
30 1/30/2009 Null
31 1/31/2009 Null
I want a query something like:
Select * from tbl where month(Date)=1 AND year(Date)=2010
The above is not a complete query.
I need to get all the records of a particular month, even if some dates are missing.
I guess there must be an equi join in the query; I am trying to build this query using an equi join.
Thanks
BIG EDIT
I now understand the OP's question.
Use a common table expression and a left join to get this effect.
DECLARE @FirstDay DATETIME;
-- Set start time
SELECT @FirstDay = '2009-01-01';
WITH Days AS
(
SELECT @FirstDay as CalendarDay
UNION ALL
SELECT DATEADD(d, 1, CalendarDay) as CalendarDay
FROM Days
WHERE DATEADD(d, 1, CalendarDay) < DATEADD(m, 1, @FirstDay)
)
SELECT DATEPART(d,d.CalendarDay), d.CalendarDay, t.Name FROM Days d
LEFT JOIN tbl t
ON
d.CalendarDay = t.Date
ORDER BY
d.CalendarDay;
I've left the original answer below.
You need DATEPART, sir.
SELECT * FROM tbl WHERE DATEPART(m,Date) = 1
If you want to choose month and year, then you can use DATEPART twice or go for a range.
SELECT * FROM tbl WHERE DATEPART(m,Date) = 1 AND DATEPART(yyyy,Date) = 2009
Range :-
SELECT * FROM tbl WHERE Date >= '2009-01-01' AND Date < '2009-02-01'
See this link for more info on DATEPART.
http://msdn.microsoft.com/en-us/library/ms174420.aspx
You can use a greater-than-or-equal comparison for the start of the month and a less-than comparison for the end.
Like so:
select * from tbl where date >= '2009-01-01' and date < '2009-02-01'
However, it is unclear if you want month 1 from all years?
You can check more examples and functions on "Date and Time Functions" from MSDN
Create a temporary table containing all the days of the month in question.
Do a left outer join between that table and your data table, with tempTable.month = @month.
You now have a table with every day of the desired month: the records that match a date, plus empty (NULL) rows for the dates that have no data.
I hope that's what you want.
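The calendar-plus-left-join idea can be tried without SQL Server. The sketch below uses Python's sqlite3 module and SQLite's WITH RECURSIVE in place of the T-SQL recursive CTE, so the date functions differ from the answers above. It assumes dates are stored as ISO 8601 text (the question shows them as 1/1/2009 and so on), and the names make_tbl_db and month_with_gaps are made up for this example.

```python
import sqlite3

MONTH_SQL = """
WITH RECURSIVE days(calendar_day) AS (
    SELECT date(:first_day)
    UNION ALL
    SELECT date(calendar_day, '+1 day')
      FROM days
     WHERE date(calendar_day, '+1 day') < date(:first_day, '+1 month')
)
SELECT d.calendar_day, t.name
  FROM days AS d
  LEFT JOIN tbl AS t ON t.date = d.calendar_day
 ORDER BY d.calendar_day
"""


def make_tbl_db():
    """In-memory copy of the question's table, with dates stored as ISO text."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE tbl (id INTEGER PRIMARY KEY, date TEXT, name TEXT)")
    con.executemany("INSERT INTO tbl VALUES (?, ?, ?)", [
        (1, "2009-01-01", "a"), (2, "2009-01-02", "b"), (3, "2009-01-03", "c"),
    ])
    return con


def month_with_gaps(con, first_day):
    """Return one (day, name) row per day of the month; name is None for gaps."""
    return con.execute(MONTH_SQL, {"first_day": first_day}).fetchall()


rows = month_with_gaps(make_tbl_db(), "2009-01-01")
print(len(rows), rows[:4])
```

The generated days drive the join, so every day of the month appears once even when tbl has no row for it.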
