How to get nearest DateTime from 2 tables - sqlite

In SQLite, I want to build a query to get the nearest datetime for 'tag' entries against a 'tick' list:
CREATE TABLE Tick (
id integer primary key,
dt varchar(20)
);
INSERT INTO Tick (id, dt) VALUES
( 1, '2018-10-30 13:00:00'),
( 2, '2018-10-30 14:00:00'),
( 3, '2018-10-30 15:00:00'),
( 4, '2018-10-30 16:00:00'),
( 5, '2018-10-30 17:00:00'),
( 6, '2018-10-30 18:00:00'),
( 7, '2018-10-30 19:00:00'),
( 8, '2018-10-31 05:00:00'),
( 9, '2018-10-31 06:00:00'),
(10, '2018-10-31 07:00:00');
CREATE TABLE Tag (
id integer primary key,
dt varchar(20)
);
INSERT INTO Tag (id, dt) VALUES
(100, '2018-10-30 16:08:00'),
(101, '2018-10-30 17:30:00'),
(102, '2018-10-30 19:12:00'),
(103, '2018-10-31 04:00:00'),
(104, '2018-10-31 13:00:00');
The following query gives me the correct match (based on diff), but I'm unable to get the Tick columns:
SELECT Tag.dt,
(SELECT ABS(strftime('%s',Tick.dt) - strftime('%s',Tag.dt)) as diff
FROM Tick
ORDER BY diff ASC
LIMIT 1
) as diff from Tag
I tried the following but I receive an error on Tag.dt in ORDER BY:
SELECT
Tag.id, Tag.dt,
Tick.id, Tick.dt,
abs(strftime('%s',Tick.dt) - strftime('%s',Tag.dt)) as Diff
FROM Tag
JOIN Tick ON Tick.dt = (SELECT Tick.dt
FROM Tick
ORDER BY abs(strftime('%s',Tick.dt) - strftime('%s',Tag.dt)) ASC
limit 1)
The result I would like to have is something like:
TagID,DateTimeTag ,TickID,DateTimeTick
100,2018-10-30 16:08:00, 4,2018-10-30 16:00:00
101,2018-10-30 17:30:00, 6,2018-10-30 18:00:00
102,2018-10-30 19:12:00, 7,2018-10-30 19:00:00
103,2018-10-31 04:00:00, 8,2018-10-31 05:00:00
104,2018-10-31 13:00:00, 10,2018-10-31 07:00:00
Edited later...
Based on forpas's answer, I was able to derive something without the ROW_NUMBER() window function, which I can't use in FME. I also set a maximum time difference (10000 seconds) for a match:
SELECT t.TagId, t.Tagdt, t.TickId, t.Tickdt, MIN(t.Diff)
FROM
(
SELECT
Tag.id as TagId, Tag.dt as Tagdt,
Tick.id as TickId, Tick.dt as Tickdt,
abs(strftime('%s',Tick.dt) - strftime('%s',Tag.dt)) as Diff
FROM Tag, Tick
WHERE Diff < 10000
) AS t
GROUP BY t.TagId
Thanks again!

Use the ROW_NUMBER() window function:
SELECT t.tagID, t.tagDT, t.tickID, t.tickDT
FROM (
SELECT t.*,
ROW_NUMBER() OVER (PARTITION BY t.tagID, t.tagDT ORDER BY t.Diff) AS rn
FROM (
SELECT Tag.id tagID, Tag.dt tagDT, Tick.id tickID, Tick.dt tickDT,
ABS(strftime('%s',Tick.dt) - strftime('%s',Tag.dt)) as Diff
FROM Tag CROSS JOIN Tick
) AS t
) AS t
WHERE t.rn = 1
See the demo.
Results:
| tagID | tagDT | tickID | tickDT |
| ----- | ------------------- | ------ | ------------------- |
| 100 | 2018-10-30 16:08:00 | 4 | 2018-10-30 16:00:00 |
| 101 | 2018-10-30 17:30:00 | 5 | 2018-10-30 17:00:00 |
| 102 | 2018-10-30 19:12:00 | 7 | 2018-10-30 19:00:00 |
| 103 | 2018-10-31 04:00:00 | 8 | 2018-10-31 05:00:00 |
| 104 | 2018-10-31 13:00:00 | 10 | 2018-10-31 07:00:00 |
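As a way to try this locally, here is a minimal sketch that runs the same ROW_NUMBER() idea through Python's built-in sqlite3 module against the sample data from the question. The Python harness and the extra `Tick.id` tie-breaker are assumptions added for illustration: tag 101 (17:30) is exactly 30 minutes from both tick 5 and tick 6, so without a tie-breaker either could be returned. Window functions need SQLite 3.25 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Tick (id INTEGER PRIMARY KEY, dt TEXT);
CREATE TABLE Tag  (id INTEGER PRIMARY KEY, dt TEXT);
INSERT INTO Tick (id, dt) VALUES
  (1, '2018-10-30 13:00:00'), (2, '2018-10-30 14:00:00'),
  (3, '2018-10-30 15:00:00'), (4, '2018-10-30 16:00:00'),
  (5, '2018-10-30 17:00:00'), (6, '2018-10-30 18:00:00'),
  (7, '2018-10-30 19:00:00'), (8, '2018-10-31 05:00:00'),
  (9, '2018-10-31 06:00:00'), (10, '2018-10-31 07:00:00');
INSERT INTO Tag (id, dt) VALUES
  (100, '2018-10-30 16:08:00'), (101, '2018-10-30 17:30:00'),
  (102, '2018-10-30 19:12:00'), (103, '2018-10-31 04:00:00'),
  (104, '2018-10-31 13:00:00');
""")

# Rank every tick per tag by absolute distance in seconds; ties go to the
# lower Tick.id (an assumption, the original answer leaves ties unspecified).
query = """
SELECT tagID, tagDT, tickID, tickDT
FROM (
  SELECT Tag.id AS tagID, Tag.dt AS tagDT,
         Tick.id AS tickID, Tick.dt AS tickDT,
         ROW_NUMBER() OVER (
           PARTITION BY Tag.id
           ORDER BY ABS(strftime('%s', Tick.dt) - strftime('%s', Tag.dt)),
                    Tick.id
         ) AS rn
  FROM Tag CROSS JOIN Tick
)
WHERE rn = 1
ORDER BY tagID
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Swapping the tie-breaker to `Tick.id DESC` would pick tick 6 for tag 101, which is what the question's desired output lists.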

Create a temp_table subquery that computes the timestamp differences over the cross product of the Tick and Tag tables, and select the minimum difference for each Tag id.
The two temp_table subqueries are identical.
Note that this query may not be efficient, as it takes the full cross product of the two tables:
SELECT temp_table.tid, temp_table.tdt, temp_table.tiid, temp_table.tidt, temp_table.diff
FROM
(SELECT Tag.id AS tid, Tag.dt AS tdt, Tick.id AS tiid, Tick.dt AS tidt, abs(strftime('%s',Tick.dt) - strftime('%s',Tag.dt)) as diff
FROM tag, tick) temp_table
WHERE temp_table.diff =
(SELECT MIN(temp_table2.diff) FROM
(SELECT Tag.id AS tid, Tag.dt AS tdt, Tick.id AS tiid, Tick.dt AS tidt, abs(strftime('%s',Tick.dt) - strftime('%s',Tag.dt)) as diff
FROM tag, tick) temp_table2
WHERE temp_table2.tid = temp_table.tid
)
group by temp_table.tid
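If the cross product becomes a problem, one possible alternative (not part of the answer above) is to look up only the closest tick at or before each tag and the closest tick after it, then keep whichever is nearer. The sketch below assumes an index on `Tick.dt` (the name `tick_dt` is made up), relies on the timestamps being stored as sortable `YYYY-MM-DD HH:MM:SS` text, and does the final comparison in Python for readability.

```python
import sqlite3
from datetime import datetime

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Tick (id INTEGER PRIMARY KEY, dt TEXT);
CREATE INDEX tick_dt ON Tick(dt);  -- hypothetical index name
CREATE TABLE Tag (id INTEGER PRIMARY KEY, dt TEXT);
INSERT INTO Tick (id, dt) VALUES
  (1, '2018-10-30 13:00:00'), (2, '2018-10-30 14:00:00'),
  (3, '2018-10-30 15:00:00'), (4, '2018-10-30 16:00:00'),
  (5, '2018-10-30 17:00:00'), (6, '2018-10-30 18:00:00'),
  (7, '2018-10-30 19:00:00'), (8, '2018-10-31 05:00:00'),
  (9, '2018-10-31 06:00:00'), (10, '2018-10-31 07:00:00');
INSERT INTO Tag (id, dt) VALUES
  (100, '2018-10-30 16:08:00'), (101, '2018-10-30 17:30:00'),
  (102, '2018-10-30 19:12:00'), (103, '2018-10-31 04:00:00'),
  (104, '2018-10-31 13:00:00');
""")

# For each tag, fetch at most two candidates: the last tick at or before it
# and the first tick after it. Either side may be NULL at the ends.
neighbours = conn.execute("""
SELECT Tag.id, Tag.dt,
       (SELECT Tick.id FROM Tick WHERE Tick.dt <= Tag.dt
        ORDER BY Tick.dt DESC LIMIT 1) AS before_id,
       (SELECT Tick.id FROM Tick WHERE Tick.dt > Tag.dt
        ORDER BY Tick.dt ASC LIMIT 1) AS after_id
FROM Tag
ORDER BY Tag.id
""").fetchall()

ticks = dict(conn.execute("SELECT id, dt FROM Tick"))

def seconds_apart(a, b):
    """Absolute distance in seconds between two 'YYYY-MM-DD HH:MM:SS' strings."""
    return abs((datetime.fromisoformat(a) - datetime.fromisoformat(b)).total_seconds())

nearest = {}
for tag_id, tag_dt, before_id, after_id in neighbours:
    candidates = [t for t in (before_id, after_id) if t is not None]
    # min() keeps the first candidate on a tie, i.e. the earlier tick.
    # This assumes the Tick table is not empty.
    nearest[tag_id] = min(candidates, key=lambda t: seconds_apart(ticks[t], tag_dt))

print(nearest)
```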

Related

SQLite: Calculate how a counter has increased in current day and week

I have a SQLite database with a counter and a timestamp in Unix time, as shown below:
+---------+------------+
| counter | timestamp  |
+---------+------------+
| 1       | 1582933500 |
+---------+------------+
| 2       | 1582933800 |
+---------+------------+
| ...     | ...        |
+---------+------------+
I would like to calculate how much 'counter' has increased in the current day and the current week.
Is this possible in a SQLite query?
Thanks!
Provided you have SQLite version >= 3.25.0, the SQLite window functions will help you achieve this.
Use the LAG function to retrieve the value from the previous record. If there is none (which will be the case for the first row), a default value is provided, here the value from the current row.
For the purpose of demonstration this code:
SELECT counter, timestamp,
LAG (timestamp, 1, timestamp) OVER (ORDER BY counter) AS previous_timestamp,
(timestamp - LAG (timestamp, 1, timestamp) OVER (ORDER BY counter)) AS diff
FROM your_table
ORDER BY counter ASC
will give this result:
1 1582933500 1582933500 0
2 1582933800 1582933500 300
In a CTE get the min and max timestamp for each day and join it twice to the table:
with cte as (
select date(timestamp, 'unixepoch', 'localtime') day,
min(timestamp) mindate, max(timestamp) maxdate
from tablename
group by day
)
select c.day, t2.counter - t1.counter difference
from cte c
inner join tablename t1 on t1.timestamp = c.mindate
inner join tablename t2 on t2.timestamp = c.maxdate;
With similar code get the results for each week:
with cte as (
select strftime('%W', date(timestamp, 'unixepoch', 'localtime')) week,
min(timestamp) mindate, max(timestamp) maxdate
from tablename
group by week
)
select c.week, t2.counter - t1.counter difference
from cte c
inner join tablename t1 on t1.timestamp = c.mindate
inner join tablename t2 on t2.timestamp = c.maxdate;
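To see what the per-day query returns, here is a small sketch using Python's sqlite3 module. The last two rows are hypothetical data added for illustration, and the `'localtime'` modifier is dropped so that days are computed in UTC and the result does not depend on the machine's time zone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tablename (counter INTEGER, timestamp INTEGER);
INSERT INTO tablename VALUES
  (1, 1582933500),  -- 2020-02-28 23:45:00 UTC (from the question)
  (2, 1582933800),  -- 2020-02-28 23:50:00 UTC (from the question)
  (5, 1582934700),  -- 2020-02-29 00:05:00 UTC (hypothetical row)
  (9, 1582938000);  -- 2020-02-29 01:00:00 UTC (hypothetical row)
""")

# Same shape as the per-day query above, but grouping by UTC day.
query = """
WITH cte AS (
  SELECT date(timestamp, 'unixepoch') AS day,
         min(timestamp) AS mindate,
         max(timestamp) AS maxdate
  FROM tablename
  GROUP BY day
)
SELECT c.day, t2.counter - t1.counter AS difference
FROM cte c
JOIN tablename t1 ON t1.timestamp = c.mindate
JOIN tablename t2 ON t2.timestamp = c.maxdate
ORDER BY c.day
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Note that this measures the change between the first and last reading within each day, so the jump from 2 to 5 across midnight is not attributed to either day. The joins also assume that timestamps are unique.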

SQLite - Merge 2 tables according to modified date, insert a new row if necessary

I have a table having an ID column, this column is a primary key and unique as well. In addition, the table has a modified date column.
I have the same table in 2 databases and I am looking to merge both into one database. The merging scenario in a table is as follows:
Insert the record if the ID is not present;
If the ID exists, only update if the modified date is greater than that of the existing row.
For example, having:
Table 1:
id | name | createdAt | modifiedAt
---|------|------------|-----------
1 | john | 2019-01-01 | 2019-05-01
2 | jane | 2019-01-01 | 2019-04-03
Table 2:
id | name | createdAt | modifiedAt
---|------|------------|-----------
1 | john | 2019-01-01 | 2019-04-30
2 | JANE | 2019-01-01 | 2019-04-04
3 | doe | 2019-01-01 | 2019-05-01
The resulting table would be:
id | name | createdAt | modifiedAt
---|------|------------|-----------
1 | john | 2019-01-01 | 2019-05-01
2 | JANE | 2019-01-01 | 2019-04-04
3 | doe | 2019-01-01 | 2019-05-01
I've read about INSERT OR REPLACE, but I couldn't figure out how the date condition can be applied. I know as well that I can loop through each pair of similar row and check the date manually but this would be very time and performance consuming. Therefore, is there an efficient way to accomplish this in SQLite?
I'm using sqlite3 on Node.js.
The UPSERT notation added in SQLite 3.24 makes this easy:
INSERT INTO table1(id, name, createdAt, modifiedAt)
SELECT id, name, createdAt, modifiedAt FROM table2 WHERE true
ON CONFLICT(id) DO UPDATE
SET (name, createdAt, modifiedAt) = (excluded.name, excluded.createdAt, excluded.modifiedAt)
WHERE excluded.modifiedAt > modifiedAt;
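The following sketch runs that UPSERT against the two example tables from the question, using Python's sqlite3 module as an assumed test harness (the question itself uses Node.js). The only change to the statement is that the target column in the final WHERE is qualified as `table1.modifiedAt` for readability.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (id INTEGER PRIMARY KEY, name TEXT,
                     createdAt TEXT, modifiedAt TEXT);
CREATE TABLE table2 (id INTEGER PRIMARY KEY, name TEXT,
                     createdAt TEXT, modifiedAt TEXT);
INSERT INTO table1 VALUES
  (1, 'john', '2019-01-01', '2019-05-01'),
  (2, 'jane', '2019-01-01', '2019-04-03');
INSERT INTO table2 VALUES
  (1, 'john', '2019-01-01', '2019-04-30'),
  (2, 'JANE', '2019-01-01', '2019-04-04'),
  (3, 'doe',  '2019-01-01', '2019-05-01');
""")

# Insert new ids; on an id conflict, overwrite only when the incoming row
# (visible as "excluded") has a later modifiedAt than the stored row.
conn.execute("""
INSERT INTO table1 (id, name, createdAt, modifiedAt)
SELECT id, name, createdAt, modifiedAt FROM table2 WHERE true
ON CONFLICT(id) DO UPDATE
SET (name, createdAt, modifiedAt) =
    (excluded.name, excluded.createdAt, excluded.modifiedAt)
WHERE excluded.modifiedAt > table1.modifiedAt
""")

merged = conn.execute("SELECT * FROM table1 ORDER BY id").fetchall()
for row in merged:
    print(row)
```

The string comparison on `modifiedAt` works here because the dates are stored in ISO `YYYY-MM-DD` form, which sorts chronologically.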
First create the table Table3:
CREATE TABLE Table3 (
id INTEGER,
name TEXT,
createdat TEXT,
modifiedat TEXT,
PRIMARY KEY(id)
);
and then insert the rows like this:
insert into table3 (id, name, createdat, modifiedat)
select id, name, createdat, modifiedat from (
select * from table1 t1
where not exists (
select 1 from table2 t2
where t2.id = t1.id and t2.modifiedat >= t1.modifiedat
)
union all
select * from table2 t2
where not exists (
select 1 from table1 t1
where t1.id = t2.id and t1.modifiedat > t2.modifiedat
)
)
This uses UNION ALL over the 2 tables and keeps only the needed rows with NOT EXISTS, which is a very efficient way to check the condition you want.
I have >= instead of > in the WHERE clause for Table1 in case the 2 tables have a row with the same id and the same modifiedat values.
In this case the row from Table2 will be inserted.
If you want to merge the 2 tables into Table1, you can use REPLACE:
replace into table1 (id, name, createdat, modifiedat)
select id, name, createdat, modifiedat
from table2 t2
where
not exists (
select 1 from table1 t1
where (t1.id = t2.id and t1.modifiedat > t2.modifiedat)
)

Selecting the n'th range/island of rows where columns have a common value?

I need to select all rows (for a range) which have a common value within a column.
For example (starting from the last row)
I am trying to select all of the rows where _user_id == 1 until _user_id != 1.
In this case resulting in selecting rows [4, 5, 6]
+------------------------+
| _id _user_id amount |
+------------------------+
| 1 1 777 |
| 2 2 1 |
| 3 2 11 |
| 4 1 10 |
| 5 1 100 |
| 6 1 101 |
+------------------------+
/*Create the table*/
CREATE TABLE IF NOT EXISTS t1 (
_id INTEGER PRIMARY KEY AUTOINCREMENT,
_user_id INTEGER,
amount INTEGER);
/*Add the data*/
INSERT INTO t1 VALUES(1, 1, 777);
INSERT INTO t1 VALUES(2, 2, 1);
INSERT INTO t1 VALUES(3, 2, 11);
INSERT INTO t1 VALUES(4, 1, 10);
INSERT INTO t1 VALUES(5, 1, 100);
INSERT INTO t1 VALUES(6, 1, 101);
/*Check the data*/
SELECT * FROM t1;
1|1|777
2|2|1
3|2|11
4|1|10
5|1|100
6|1|101
In my attempt I use a Common Table Expression to group the results by _user_id. This gives the index of the last row containing a unique value (e.g. SELECT _id FROM t1 GROUP BY _user_id LIMIT 2; will produce: [6, 3])
I then use those two values to select a range where LIMIT 1 OFFSET 1 is the lower end (3) and LIMIT 1 is the upper end (6)
WITH test AS (
SELECT _id FROM t1 GROUP BY _user_id LIMIT 2
) SELECT * FROM t1 WHERE _id BETWEEN 1+ (
SELECT * FROM test LIMIT 1 OFFSET 1
) and (
SELECT * FROM test LIMIT 1
);
Output:
4|1|10
5|1|100
6|1|101
This appears to work ok at selecting the last "island" but what I really need is a way to select the n'th island.
Is there a way to generate a query capable of producing outputs like these when provided a parameter n?:
island (n=1):
4|1|10
5|1|100
6|1|101
island (n=2):
2|2|1
3|2|11
island (n=3):
1|1|777
Thanks!
SQL tables are unordered, so the only way to search for islands is to search for consecutive _id values:
WITH RECURSIVE t1_with_islands(_id, _user_id, amount, island_number) AS (
SELECT _id,
_user_id,
amount,
1
FROM t1
WHERE _id = (SELECT max(_id)
FROM t1)
UNION ALL
SELECT t1._id,
t1._user_id,
t1.amount,
CASE WHEN t1._user_id = t1_with_islands._user_id
THEN island_number
ELSE island_number + 1
END
FROM t1
JOIN t1_with_islands ON t1._id = (SELECT max(_id)
FROM t1
WHERE _id < t1_with_islands._id)
)
SELECT *
FROM t1_with_islands
ORDER BY _id;
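The query above numbers every row with its island; to get only the n'th island, the final SELECT can filter on `island_number`. Here is a minimal sketch of that step using Python's sqlite3 module, where the `island(n)` helper and the `?` parameter binding are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE IF NOT EXISTS t1 (
  _id INTEGER PRIMARY KEY AUTOINCREMENT,
  _user_id INTEGER,
  amount INTEGER);
INSERT INTO t1 VALUES (1, 1, 777), (2, 2, 1), (3, 2, 11),
                      (4, 1, 10), (5, 1, 100), (6, 1, 101);
""")

# Same recursive CTE as above: walk from the highest _id downwards and bump
# island_number whenever _user_id changes. Only the final SELECT differs.
ISLAND_QUERY = """
WITH RECURSIVE t1_with_islands(_id, _user_id, amount, island_number) AS (
  SELECT _id, _user_id, amount, 1
  FROM t1
  WHERE _id = (SELECT max(_id) FROM t1)
  UNION ALL
  SELECT t1._id, t1._user_id, t1.amount,
         CASE WHEN t1._user_id = t1_with_islands._user_id
              THEN island_number
              ELSE island_number + 1
         END
  FROM t1
  JOIN t1_with_islands ON t1._id = (SELECT max(_id)
                                    FROM t1
                                    WHERE _id < t1_with_islands._id)
)
SELECT _id, _user_id, amount
FROM t1_with_islands
WHERE island_number = ?
ORDER BY _id
"""

def island(n):
    """Return the rows of the n'th island, counting from the last row (n=1)."""
    return conn.execute(ISLAND_QUERY, (n,)).fetchall()

for n in (1, 2, 3):
    print(n, island(n))
```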

Is there a way to reuse subqueries in the same query?

See Update at end of question for solution thanks to marked answer!
I'd like to treat a subquery as if it were an actual table that can be reused in the same query. Here's the setup SQL:
create table mydb.mytable
(
id integer not null,
fieldvalue varchar(100),
ts timestamp(6) not null
)
unique primary index (id, ts)
insert into mydb.mytable(0,'hello',current_timestamp - interval '1' minute);
insert into mydb.mytable(0,'hello',current_timestamp - interval '2' minute);
insert into mydb.mytable(0,'hello there',current_timestamp - interval '3' minute);
insert into mydb.mytable(0,'hello there, sir',current_timestamp - interval '4' minute);
insert into mydb.mytable(0,'hello there, sir',current_timestamp - interval '5' minute);
insert into mydb.mytable(0,'hello there, sir. how are you?',current_timestamp - interval '6' minute);
insert into mydb.mytable(1,'what up',current_timestamp - interval '1' minute);
insert into mydb.mytable(1,'what up',current_timestamp - interval '2' minute);
insert into mydb.mytable(1,'what up, mr man?',current_timestamp - interval '3' minute);
insert into mydb.mytable(1,'what up, duder?',current_timestamp - interval '4' minute);
insert into mydb.mytable(1,'what up, duder?',current_timestamp - interval '5' minute);
insert into mydb.mytable(1,'what up, duder?',current_timestamp - interval '6' minute);
What I want to do is return only rows where FieldValue differs from the previous row. This SQL does just that:
locking row for access
select id, fieldvalue, ts from
(
--locking row for access
select
id, fieldvalue,
min(fieldvalue) over
(
partition by id
order by ts, fieldvalue rows
between 1 preceding and 1 preceding
) fieldvalue2,
ts
from mydb.mytable
) x
where
hashrow(fieldvalue) <> hashrow(fieldvalue2)
order by id, ts desc
It returns:
+----+---------------------------------+----------------------------+
| id | fieldvalue | ts |
+----+---------------------------------+----------------------------+
| 0 | hello | 2015-05-06 10:13:34.160000 |
| 0 | hello there | 2015-05-06 10:12:34.350000 |
| 0 | hello there, sir | 2015-05-06 10:10:34.750000 |
| 0 | hello there, sir. how are you? | 2015-05-06 10:09:34.970000 |
| 1 | what up | 2015-05-06 10:13:35.470000 |
| 1 | what up, mr man? | 2015-05-06 10:12:35.690000 |
| 1 | what up, duder? | 2015-05-06 10:09:36.240000 |
+----+---------------------------------+----------------------------+
The next step is to return only the last row per ID. If I were to use this SQL to write the previous SELECT to a table...
create table mydb.reusetest as (above sql) with data;
...I could then do this to get the last row per ID:
locking row for access
select t1.* from mydb.reusetest t1,
(
select id, max(ts) ts from mydb.reusetest
group by id
) t2
where
t2.id = t1.id and
t2.ts = t1.ts
order by t1.id
It would return this:
+----+------------+----------------------------+
| id | fieldvalue | ts |
+----+------------+----------------------------+
| 0 | hello | 2015-05-06 10:13:34.160000 |
| 1 | what up | 2015-05-06 10:13:35.470000 |
+----+------------+----------------------------+
If I could reuse the subquery in my initial SELECT, I could achieve the same results. I could copy/paste the entire query SQL into another subquery to create a derived table, but this would just mean I'd need to change the SQL in two places if I ever needed to modify it.
Update
Thanks to Kristján, I was able to implement the WITH clause into my SQL like this for perfect results:
locking row for access
with items (id, fieldvalue, ts) as
(
select id, fieldvalue, ts from
(
select
id, fieldvalue,
min(fieldvalue) over
(
partition by id
order by ts, fieldvalue
rows between 1 preceding and 1 preceding
) fieldvalue2,
ts
from mydb.mytable
) x
where
hashrow(fieldvalue) <> hashrow(fieldvalue2)
)
select t1.* from items t1,
(
select id, max(ts) ts from items
group by id
) t2
where
t2.id = t1.id and
t2.ts = t1.ts
order by t1.id
Does WITH help? That lets you define a result set you can use multiple times in the SELECT.
From their example:
WITH orderable_items (product_id, quantity) AS
( SELECT stocked.product_id, stocked.quantity
FROM stocked, product
WHERE stocked.product_id = product.product_id
AND product.on_hand > 5
)
SELECT product_id, quantity
FROM orderable_items
WHERE quantity < 10;
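The same idea can be tried outside Teradata. The sketch below is an assumption-laden port to SQLite (via Python's sqlite3 module): it replaces `hashrow` and the `min(...) over (rows between 1 preceding and 1 preceding)` construct with SQLite's `LAG`, uses plain integers for `ts`, and uses a small made-up data set. What it is meant to show is the CTE `items` being defined once and referenced twice in the main query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE mytable (id INTEGER, fieldvalue TEXT, ts INTEGER);
-- Hypothetical data; a larger ts means a more recent row.
INSERT INTO mytable VALUES
  (0, 'hello there', 4), (0, 'hello', 5), (0, 'hello', 6),
  (1, 'what up, mr man?', 4), (1, 'what up', 5), (1, 'what up', 6);
""")

# "items" keeps only rows whose fieldvalue differs from the previous row
# for the same id; it is then used twice: as t1 and inside t2.
query = """
WITH items AS (
  SELECT id, fieldvalue, ts
  FROM (
    SELECT id, fieldvalue, ts,
           LAG(fieldvalue) OVER (PARTITION BY id ORDER BY ts) AS prev
    FROM mytable
  )
  WHERE prev IS NULL OR fieldvalue <> prev
)
SELECT t1.id, t1.fieldvalue, t1.ts
FROM items t1
JOIN (SELECT id, max(ts) AS ts FROM items GROUP BY id) t2
  ON t2.id = t1.id AND t2.ts = t1.ts
ORDER BY t1.id
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Changing the change-detection logic now means editing only the body of `items`, which is the maintenance benefit the question was after.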

Generating attendance list for hours without a matching row

I have a project that calculates work hours from the attendance logs that I import from an attendance machine. I use a SQLite database and VB .NET.
First I'll show the table that I use:
CREATE TABLE [CheckLogs] (
[IDCheckLog] INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
[IDEmployee] TEXT NOT NULL,
[Dates] TEXT NOT NULL,
[In] TEXT,
[Out] TEXT,
[OverTime] NUMERIC DEFAULT 0);
CREATE TABLE integers (i INTEGER NOT NULL PRIMARY KEY);
INSERT INTO integers (i) VALUES
(0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
Table CheckLogs holds the data that I import from the attendance machine. The OverTime column is calculated in my program. Table integers is used to create the date list, I got it from here.
I want to generate a view that shows employee attendance between 2 dates and displays the CheckLogs data if the employee is present and null if absent. The difficulty is that when an employee is absent, the CheckLogs table contains no row for that employee on that day.
This is the view that I want (this is the report for employee 10001 between 2014-10-01 and 2014-10-05):
Dates | IDEmployee | In | Out
---------------------------------------
2014-10-01 | 10001 | 07:00 | 16:00
2014-10-02 | 10001 | 07:01 | 15:58
2014-10-03 | 10001 | null | null
2014-10-04 | 10001 | 07:08 | 15:48
2014-10-05 | 10001 | null | null
And this is the query that I have now:
SELECT X.[Dates], C.[IDEmployee], C.[In], C.[Out]
FROM
(select date('2014-10-01', '+' || (H.i*100 + T.i*10 + U.i) || ' day') as Dates
from integers as H
cross
join integers as T
cross
join integers as U
where date('2005-01-25', '+' || (H.i*100 + T.i*10 + U.i) || ' day') <= '2014-10-05') AS X
, CheckLogs AS C USING (Dates)
WHERE C.[IDEmployee]='10001'
From this query I have this result:
Dates | IDEmployee | In | Out
---------------------------------------
2014-10-01 | 10001 | 07:00 | 16:00
2014-10-02 | 10001 | 07:01 | 15:58
2014-10-04 | 10001 | 07:08 | 15:48
To get NULL values for rows without a match, you need an outer join.
And you have to take care not to filter out those rows with a WHERE clause that would not match NULL values; to get dates that do not match a condition, you have to put that condition into the join's ON clause:
SELECT ...
FROM ( ... ) AS X
LEFT JOIN CheckLogs AS C ON C.Dates = X.Dates AND
C.IDEmployee = '10001'
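Filled in with the date-list subquery from the question, the outer join can be sketched as follows using Python's sqlite3 module. Two things here are assumptions: the date list uses '2014-10-01' as the start date in both places (the question's WHERE clause mixes in '2005-01-25'), and the row for employee 10002 is made-up data included to show that the employee filter in the ON clause does not leak other employees' rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE [CheckLogs] (
  [IDCheckLog] INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
  [IDEmployee] TEXT NOT NULL,
  [Dates] TEXT NOT NULL,
  [In] TEXT,
  [Out] TEXT,
  [OverTime] NUMERIC DEFAULT 0);
CREATE TABLE integers (i INTEGER NOT NULL PRIMARY KEY);
INSERT INTO integers (i) VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
INSERT INTO CheckLogs (IDEmployee, Dates, [In], [Out]) VALUES
  ('10001', '2014-10-01', '07:00', '16:00'),
  ('10001', '2014-10-02', '07:01', '15:58'),
  ('10002', '2014-10-03', '08:00', '17:00'),  -- hypothetical other employee
  ('10001', '2014-10-04', '07:08', '15:48');
""")

# X lists every date in the range; the LEFT JOIN keeps dates with no match,
# and the employee filter lives in ON so those dates are not filtered out.
query = """
SELECT X.Dates, C.IDEmployee, C.[In], C.[Out]
FROM (SELECT date('2014-10-01', '+' || (H.i*100 + T.i*10 + U.i) || ' day') AS Dates
      FROM integers AS H
      CROSS JOIN integers AS T
      CROSS JOIN integers AS U
      WHERE date('2014-10-01', '+' || (H.i*100 + T.i*10 + U.i) || ' day') <= '2014-10-05'
     ) AS X
LEFT JOIN CheckLogs AS C ON C.Dates = X.Dates AND
                            C.IDEmployee = '10001'
ORDER BY X.Dates
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

On absent days `C.IDEmployee` comes back as NULL as well; to show 10001 on every row as in the desired view, the SELECT list could return the literal `'10001'` instead of `C.IDEmployee`.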

Resources