I am designing a measurement instrument and the software is written in C++/Qt5.12 on a custom-built embedded Linux system (Buildroot). The data are time series and fall into 2 categories:
actual physical data, 1..3 fields, sampling period 5 min
housekeeping parameters (temperatures, flow rates, etc.), 5..10 fields, sampling period 1..10 sec
I have been using CSV files so far, and they do the job. Although the data are not relational and the data acquisition rate is low, I am looking into SQLite because:
reduced risk of producing corrupted files in case of a crash, thanks to transactions
more flexibility to alter the data format in the long run, e.g. add a column, with less impact on processing software
SQLite is supported by Buildroot
Questions :
Does SQLite look like a smart choice over CSV in this case ?
The instrument will be running 24/7 for years, so I guess I'll have to split the database into chunks (e.g. monthly) to keep the file reasonably small and for archiving. I wonder how easy that would be. Can it be automated with a cron job ?
Thanks.
Does SQLite look like a smart choice over CSV in this case ?
I'd suggest yes. Mainly because you would probably want to do something with the data other than spend the rest of your life looking through it.
Perhaps you want some sort of aggregated stats (a summary: averages, maximum values, minimum values, perhaps to compare periods). SQLite can make that pretty easy and pretty efficient.
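For instance, a per-day summary of one housekeeping parameter could be as simple as this (a sketch only, using the table and column names from the example further down, and assuming the timestamps are Unix epoch values):
SELECT date(timestamp,'unixepoch') AS day,
       min(prm1) AS minP1,
       max(prm1) AS maxP1,
       avg(prm1) AS avgP1
FROM housekeeping
GROUP BY day;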
The instrument will be running 24/7 for years, so I guess I'll have to split the database into chunks (e.g. monthly) to keep the file reasonably small and for archiving. I wonder how easy that would be. Can it be automated with a cron job ?
No need for cron; utilise the power of SQLite itself: a TRIGGER could be handy.
Here's an example that shows a little of what you could do.
As you have 2 distinct sets of readings, physical and housekeeping, the example has a table for each.
The physical table has 1 column for the timestamp of the reading and 4 columns for the readings.
The housekeeping table has 1 column for the timestamp and 10 reading columns.
The example automatically generates data, just to load something and show results. A helper table, just_for_load, controls how much data is inserted: it has 1 row with 1 value (although it could have more rows), and this value is read to determine how many rows are added.
With the value set to 1000, 1000 physical readings will be added at 5-minute intervals (about 3.5 days' worth of data).
With the same value of 1000, 300,000 rows will be added to the housekeeping table, i.e. 300 rows for every 5-minute period.
The example demonstrates automated (TRIGGER-based) tidying up. It doesn't back up the data, but it will clear old data from both tables (just an example showing that you can do things automatically). The TRIGGER is named auto_tidyup.
So that you can see the TRIGGER is being activated, it additionally records the start and end of its processing, i.e. what it does when it fires and its WHEN clause condition is met (the WHEN clause is there to reduce how often it tries to do something). This data is stored in another table, namely tidyup_log.
The TRIGGER has been set up so that the WHEN clause matches during testing (after testing this would be changed to a suitable schedule, e.g. the first day of the month).
So, in summary: 4 tables (1 for testing purposes only) and 1 trigger.
Once the data is loaded, 3 queries extract some useful data (well, sort of).
The Example SQL (note that perhaps the most complicated SQL is for loading the testing data) :-
DROP TABLE IF EXISTS physical;
DROP TABLE IF EXISTS housekeeping;
DROP TRIGGER IF EXISTS auto_tidyup;
DROP TABLE IF EXISTS tidyup_log;
DROP TABLE IF EXISTS just_for_load;
CREATE TABLE IF NOT EXISTS physical(timestamp INTEGER PRIMARY KEY, fld1 REAL, fld2 REAL, fld3 REAL, fld4 REAL);
CREATE TABLE IF NOT EXISTS housekeeping(timestamp INTEGER PRIMARY KEY, prm1 REAL, prm2 REAL, prm3 REAL, prm4 REAL, prm5 REAL, prm6 REAL, prm7 REAL, prm8 REAL, prm9 REAL, prm10 REAL);
CREATE TABLE IF NOT EXISTS tidyup_log (timestamp INTEGER, action_performed TEXT);
CREATE TRIGGER IF NOT EXISTS auto_tidyup AFTER INSERT ON physical
WHEN CAST(strftime('%d','now') AS INTEGER) = 23 /* <<<<<<<<<< TESTING SO GET HITS >>>>>>>>>>*/
/* WHEN CAST(strftime('%d','now') AS INTEGER) = 1 */ /* IF TODAY IS THE FIRST DAY OF THE MONTH */
BEGIN
INSERT INTO tidyup_log VALUES (strftime('%s','now'),'TIDY Started');
DELETE FROM physical WHERE timestamp < new.timestamp - (60 * 60 * 24 * 365 /*approx a year */);
DELETE FROM housekeeping WHERE timestamp < new.timestamp - (60 * 60 * 24 * 365);
INSERT INTO tidyup_log VALUES (strftime('%s','now'),'TIDY ENDED');
END
;
/* ONLY FOR LOADING Test Data controls number of rows added */
CREATE TABLE IF NOT EXISTS just_for_load (base_count INTEGER);
INSERT INTO just_for_load VALUES(1000); /* Number of physical rows to add (5-minute intervals), e.g. 1000 is close to 3.5 days */
WITH RECURSIVE counter(i) AS
(SELECT 1 UNION ALL SELECT i+1 FROM counter WHERE i < (SELECT sum(base_count) FROM just_for_load))
INSERT INTO physical SELECT strftime('%s','now','+'||(i * 5)||' minutes'), random(),random(),random(),random() FROM counter
;
WITH RECURSIVE counter(i) AS
(SELECT 1 UNION ALL SELECT i+1 FROM counter WHERE i < (SELECT (sum(base_count) * 300) FROM just_for_load))
INSERT INTO housekeeping SELECT strftime('%s','now','+'||(i)||' second'), random(),random(),random(),random(), random(),random(),random(),random(), random(),random() FROM counter
;
/* <<<<<<<<<< DATA LOADED SO EXTRACT IT >>>>>>>>> */
/* First query: basically shows the 5 minute intervals (and lots of random values) */
SELECT datetime(timestamp,'unixepoch'), fld1,fld2,fld3,fld4 FROM physical;
/* This query gets the sum and average of the 10 readings over a 5 minute window */
SELECT
'From '||datetime(min(timestamp),'unixepoch')||' To '||datetime(max(timestamp),'unixepoch') AS Range,
sum(prm1) AS sumP1, avg(prm1) AS avgP1,
sum(prm2) AS sumP2, avg(prm2) AS avgP2,
sum(prm3) AS sumP3, avg(prm3) AS avgP3,
sum(prm4) AS sumP4, avg(prm4) AS avgP4,
sum(prm5) AS sumP5, avg(prm5) AS avgP5,
sum(prm6) AS sumP6, avg(prm6) AS avgP6,
sum(prm7) AS sumP7, avg(prm7) AS avgP7,
sum(prm8) AS sumP8, avg(prm8) AS avgP8,
sum(prm9) AS sumP9, avg(prm9) AS avgP9,
sum(prm10) AS sumP10, avg(prm10) AS avgP10
FROM housekeeping GROUP BY timestamp / 300
;
/* This query shows that the TRIGGER is being activated (even though it does no deletions) */
SELECT * FROM tidyup_log;
/* Tidy up the Testing environment */
DROP TABLE IF EXISTS physical;
DROP TABLE IF EXISTS housekeeping;
DROP TRIGGER IF EXISTS auto_tidyup;
DROP TABLE IF EXISTS tidyup_log;
DROP TABLE IF EXISTS just_for_load;
The comments should explain quite a bit.
You may wish to look at:
SQLite CREATE TABLE
SQLite CREATE TRIGGER
SQLite Date and Time Functions
SQLite Aggregate Functions
SQLite SQL Language Expressions
Results
Extract from the physical table (showing the 5-minute intervals of the data, aka data you probably don't want to look at).
Extract of more useful data: averages and sums of each of the 10 readings every 5 minutes.
1001 rows because the data doesn't start and end on a 5-minute boundary.
The tidyup log (to show the TRIGGER is being activated).
Start and end entries for each physical row inserted (noting that the WHEN criterion has been set to fire for all of them), hence 2000 rows.
Lastly, just to show the 300,000 rows, part of the message log :-
WITH RECURSIVE counter(i) AS
(SELECT 1 UNION ALL SELECT i+1 FROM counter WHERE i < (SELECT (sum(base_count) * 300) FROM just_for_load))
INSERT INTO housekeeping SELECT strftime('%s','now','+'||(i)||' second'), random(),random(),random(),random(), random(),random(),random(),random(), random(),random()FROM counter
> Affected rows: 300000
> Time: 1.207s
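The TRIGGER above only deletes old data; if you also want it moved into separate monthly archive files, running a short SQL script from cron (e.g. sqlite3 /path/to/data.db < archive.sql) would be one way. A minimal sketch only; the archive filename and the "everything before the start of the current month" cutoff are my assumptions, not part of the example above :-
ATTACH DATABASE '/data/archive/2024-01.db' AS archive; /* hypothetical archive file, name it from the cron wrapper */
/* copy the column layout only (note: PRIMARY KEY is not carried over by CREATE TABLE ... AS SELECT) */
CREATE TABLE IF NOT EXISTS archive.physical AS SELECT * FROM physical WHERE 0;
CREATE TABLE IF NOT EXISTS archive.housekeeping AS SELECT * FROM housekeeping WHERE 0;
BEGIN;
INSERT INTO archive.physical
    SELECT * FROM physical WHERE timestamp < CAST(strftime('%s','now','start of month') AS INTEGER);
DELETE FROM physical WHERE timestamp < CAST(strftime('%s','now','start of month') AS INTEGER);
INSERT INTO archive.housekeeping
    SELECT * FROM housekeeping WHERE timestamp < CAST(strftime('%s','now','start of month') AS INTEGER);
DELETE FROM housekeeping WHERE timestamp < CAST(strftime('%s','now','start of month') AS INTEGER);
COMMIT;
DETACH DATABASE archive;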
I'm using SQLite3 and trying to query for recent rows. So I'm having SQLite3 insert a unix timestamp into each row with strftime('%s','now'). My Table looks like this:
CREATE TABLE test(id INTEGER PRIMARY KEY, time);
INSERT INTO test (time) VALUES (strftime('%s','now')); --Repeated
SELECT * FROM test;
1|1516816522
2|1516816634
3|1516816646 --etc lots of rows
Now I want to query for only recent entries, for example, I'm trying to get all rows with a time within the last hour. I'm trying the following SQL query:
SELECT * FROM test WHERE time > strftime('%s','now')-60*60;
However, that always returns all rows regardless of the value in the time column. I really don't know what's going on.
Also, if I put WHERE time > strftime('%s','now') it'll return nothing (which is expected) but if I put WHERE time > strftime('%s','now')-1 then it'll return everything. I don't know why.
Here's one more example:
sqlite> SELECT *, strftime('%s','now')-1 AS window FROM test WHERE time > window;
1|1516816522|1516817482
2|1516816634|1516817482
3|1516816646|1516817482
It seems that SQLite3 thinks the values in the middle column are greater than the values in the right column!?
This isn't at all what I expect. Can someone please tell me what's going on? Thanks!
The purpose of strftime() is to format values, so it returns a string.
When you try to do computations with its return value (such as subtracting 60*60), the database must convert it into a number. The values in your time column, however, are stored as strings (the column has no declared type), and in SQLite's cross-type ordering any number compares less than any string, which is why every row satisfies the comparison.
You must ensure that both values in a comparison have the same data type.
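You can see the mismatch with typeof() (a quick diagnostic against the table as declared above):
SELECT typeof(time),                         /* 'text'    - stored by strftime() as a string    */
       typeof(strftime('%s','now')),         /* 'text'    - strftime() always returns text      */
       typeof(strftime('%s','now')-60*60)    /* 'integer' - the subtraction forces a conversion */
FROM test LIMIT 1;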
The best way to do this is to store numbers in the table:
INSERT INTO test (time)
VALUES (CAST(strftime('%s','now') AS INTEGER));
(Or just declare the column type as something with numeric affinity.)
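For example, with the column declared as INTEGER the conversion happens automatically on insert, and the original query then behaves as expected (a sketch of that second option):
CREATE TABLE test(id INTEGER PRIMARY KEY, time INTEGER);
INSERT INTO test (time) VALUES (strftime('%s','now'));          /* INTEGER affinity converts the string on insert */
SELECT * FROM test WHERE time > strftime('%s','now') - 60*60;   /* number compared with number */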
I'm synchronizing historical data between two systems and I've found a small clock difference between their logs.
I've loaded the data into SQLite and need to shift one of the sets by a small amount (~40 milliseconds). However, I'm unable to do so, as it seems to always round the time to the nearest second.
For example, attempting something like the following
UPDATE my_table SET my_datetime = DATETIME(my_datetime, '+0.04 seconds')
rounds to the nearest second, and I can't find any fractional/millisecond modifier option.
Is there a way to do this that I'm overlooking?
Thanks.
SQLite doesn't have a dedicated datetime type. See http://sqlite.org/datatype3.html
Using datetime(...) you are storing your dates as strings. This is equivalent to strftime('%Y-%m-%d %H:%M:%S', ...).
One option is to use a strftime with fractions of seconds:
UPDATE my_table SET my_datetime = STRFTIME('%Y-%m-%d %H:%M:%f',my_datetime, '+0.04 seconds')
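A quick check of the fractional-second behaviour (the date value here is just illustrative):
SELECT STRFTIME('%Y-%m-%d %H:%M:%f', '2024-01-01 12:00:00', '+0.04 seconds');
/* 2024-01-01 12:00:00.040 */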
I'm trying to find the average figure for the last 10 rows in a database table:
select avg(Reading)
from Readings
order by Rowid desc
limit 10;
This pulls the average of all entries in the table, not the last 10. I've tried all sorts of variations but can't get it to work.
Thanks for the super quick replies. I tried again and managed to get the syntax right this time, putting a subquery in the FROM clause. The LIMIT has to be applied inside the subquery, because otherwise avg() aggregates the whole table into a single row before ORDER BY and LIMIT ever take effect.
Here is the correct answer:
select avg(Reading)
from (select Reading
      from Readings
      order by Rowid desc
      limit 10);
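As a side note, rowid order normally follows insertion order, but if the rows carry an explicit timestamp it may be safer to order on that instead (the Timestamp column here is hypothetical):
select avg(Reading)
from (select Reading
      from Readings
      order by Timestamp desc
      limit 10);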
I have a table that includes a 'LastUpdated' column that is generated when the row is inserted using Sqlite's datetime('now') function.
How do I write a Select statement that finds all rows with 'LastUpdated' more than 100 days old?
I think it's a variant of:
SELECT * FROM Table WHERE (DATETIME('Now')-100 Days) > LastUpdated
But I'm unsure of:
a) How to specify the 100 Days?
b) Whether I can actually compare datetimes like this or if I first have to convert DATETIME('Now') to a string?
c) DATETIME('Now') results in UTC time, correct? I think so from my reading of the documentation, but it was a little confusing...
Figured it out--I didn't see all the handy modifiers at the bottom of the SQLite Datetime Documentation.
A bunch of helpful examples there demonstrating addition/subtraction of any datetime unit (years, months, hours, seconds, etc)
SELECT * FROM Table WHERE DATETIME('Now','-100 Days') > LastUpdated;
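Regarding (b) and (c): DATETIME('now') returns a 'YYYY-MM-DD HH:MM:SS' string in UTC, and LastUpdated was stored with datetime('now') in the same format, so comparing them as plain strings orders them chronologically. A quick sanity check:
SELECT DATETIME('now')             AS utc_now,
       DATETIME('now','-100 Days') AS cutoff,
       DATETIME('now','-100 Days') < DATETIME('now') AS older_sorts_lower; /* returns 1 */
So the complementary filter, rows updated within the last 100 days, just flips the comparison to LastUpdated >= DATETIME('now','-100 Days'). Add the 'localtime' modifier if you need local rather than UTC time.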