R - how to react to database inserts/updates/deletes?

I'm reading data from an SQLite database table into a data.frame with R's DBI. Frequently (as often as every 5 seconds), new records get added to the table externally, or existing ones are updated or deleted, at which point I need to propagate these changes to my data.frame.
So the question is: how can I hook into and respond to these database events from R? I don't want to have to keep querying the database every 5 seconds just to make sure nothing has changed. Is there some callback mechanism at my disposal?

If you have access to the C code that is writing your SQL data, then you can implement a callback:
http://www.sqlite.org/c3ref/update_hook.html
and then in your callback function you could update the timestamp of a file whenever a table your R code cares about is modified. Your R code then checks the timestamp of that file, and only if it's changed does it need to query the SQLite database.
Now, I don't know if you could add a callback to the SQLite connection held by R and expect it to fire when another SQLite connection/process changes the database table. I doubt it; I suspect the callbacks are only triggered by updates made through the connection they are registered with, since anything else would require asynchronous cross-process notification, and SQLite has no event handler for that.
Another idea is to use triggers to maintain a table of modification times. Define triggers on all tables you care about so that each one updates a row in a "last modified" table. Then use the database file's modification time to detect that anything changed at all, and your R code only has to query the "last modified" table to see which specific table has changed since the last check.
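A minimal sketch of the trigger side (my sketch, with a hypothetical table name mytable; each table you care about would get the same three triggers):

CREATE TABLE last_modified (
    table_name  TEXT PRIMARY KEY,
    modified_at TEXT NOT NULL
);

-- One trigger per event type; 'mytable' is a placeholder.
CREATE TRIGGER mytable_ins AFTER INSERT ON mytable BEGIN
    INSERT OR REPLACE INTO last_modified VALUES ('mytable', datetime('now'));
END;
CREATE TRIGGER mytable_upd AFTER UPDATE ON mytable BEGIN
    INSERT OR REPLACE INTO last_modified VALUES ('mytable', datetime('now'));
END;
CREATE TRIGGER mytable_del AFTER DELETE ON mytable BEGIN
    INSERT OR REPLACE INTO last_modified VALUES ('mytable', datetime('now'));
END;

The R side then polls a single tiny table, SELECT table_name, modified_at FROM last_modified; which stays cheap even when the data tables are large.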

Related

Is there a way to efficiently test whether a newly inserted/updated row matches a SQLite query?

I have a SQLite table in my application that periodically has INSERT/UPDATE statements executed against it. I would like to display a view in my application that reflects some query run against that table, and keep it continually updated as the table contents change. Since the table could be large, I would like to avoid re-running the full query each time the table is updated just to refresh the view.
One idea I had was to use SQLite's Data Change Notification Callbacks to be notified whenever an INSERT/UPDATE occurs against the table in question. In my callback, I have the rowid of the newly-updated row, and I would like to see whether it matches the query. Assuming I have the query available as a prepared sqlite3_stmt, what would be the most efficient way to test whether the row would be matched by the query?
Aside: I know that I can't do anything in the callback itself that would affect the state of the database connection, and that's fine. I can defer the actual work of checking the query until later to ensure safety; I'm just trying to determine what the best mechanism for checking the query against the new row contents would be.
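One plausible approach (my sketch, not from the original thread): prepare a second statement that is the original query with an extra rowid filter, and run it once per notification. Assuming a hypothetical view query over an items table:

-- Hypothetical original view query:
--   SELECT * FROM items WHERE status = 'active';
-- Membership test, prepared once and re-bound with each callback's rowid:
SELECT 1 FROM items WHERE status = 'active' AND rowid = ?1 LIMIT 1;

If the statement returns a row, the inserted/updated row matches the query and the view needs refreshing. Note this only works cleanly when the query's result set is defined row-by-row by a WHERE filter; queries with aggregates or LIMIT need more care.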

Sqlite: table constraints and triggers

I know the order of triggers in SQLite is undefined (you cannot be sure which trigger will be executed first), but what about the relationship between table constraints and triggers?
I mean, suppose I have, for example, a UNIQUE (or CHECK) constraint on a column, and BEFORE and AFTER UPDATE triggers on that table. If the UNIQUE column is modified, when does SQLite check the UNIQUE constraint: before the BEFORE triggers, after the AFTER triggers, between them, or in an undefined order?
I have found nothing in SQLite docs about it.
SQLite recommends not modifying data in BEFORE UPDATE/DELETE triggers, since doing so leads to undefined behaviour (see "Cautions on the use of BEFORE triggers" in the documentation).
There is a hint in an SQLite source code comment (src/update.c) about what happens under the hood:
/* Fire any BEFORE UPDATE triggers. This happens before constraints are
** verified. One could argue that this is wrong.
*/
Looking at the source code, whenever SQLite updates a table it performs these actions:
Loads the table data used by the update.
Computes the new row values (these are needed to populate old.field and new.field).
Executes the BEFORE UPDATE trigger(s).
If the BEFORE UPDATE trigger(s) did not delete the row data:
Loads the table data not used by the trigger.
Checks constraints (primary keys, foreign keys, uniqueness, ON ... CASCADE actions, etc.).
Executes the AFTER UPDATE trigger(s).
If any BEFORE UPDATE trigger deleted the row data:
There is no need to check constraints.
No AFTER UPDATE triggers are run.
Where the documentation says nothing about it, the order is undefined.
As long as the triggers do not have side effects outside the database, this does not matter, because any changes made by a trigger would be rolled back if a constraint fails.
Please note that SQLite takes backwards compatibility very seriously, so it is unlikely that the actual order will ever change.
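A small experiment (my sketch, not from the original answer) makes the rollback behaviour visible:

CREATE TABLE t (x INTEGER UNIQUE);
CREATE TABLE log (msg TEXT);
CREATE TRIGGER t_bu BEFORE UPDATE ON t BEGIN
    INSERT INTO log VALUES ('before: ' || old.x || ' -> ' || new.x);
END;
CREATE TRIGGER t_au AFTER UPDATE ON t BEGIN
    INSERT INTO log VALUES ('after: ' || old.x || ' -> ' || new.x);
END;
INSERT INTO t VALUES (1), (2);
UPDATE t SET x = 2 WHERE x = 1;  -- fails the UNIQUE constraint
SELECT * FROM log;               -- empty: the BEFORE trigger ran, but its
                                 -- log row was rolled back with the failed
                                 -- statement, and the AFTER trigger never fired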

Is MLOAD executed in a single transaction?

I have an MLOAD job that inserts data from an Oracle database into a Teradata database. One of the things it does is drop the destination table and recreate it. Our production website populates a dropdown list based on what's in the destination table.
If the MLOAD script is not on a single transaction then it's possible that the dropdown list could fail to populate properly if the binding occurs during the MLOAD job. If it is transactional, however, it would be a seamless process because the changes would not show until the transaction is committed.
I checked the dbc.DBQLogTbl and dbc.DBQLQryLogsql views after running the MLOAD job and it appears there are several transactions occurring within the job, so it would seem that the entire job is not done in a single transaction. However, I wanted to verify that this is indeed the case before I make assumptions.
A transaction in Teradata cannot include multiple DDL statements; each DDL must be committed separately.
An MLOAD is treated logically as a single transaction. Even if you see multiple transactions in DBQL, those are just steps to prepare and clean up.
When your application tries to select from the target table everything will be ok (unless it's doing a dirty read using LOCKING ROW FOR ACCESS).
By the way, there might be another error, "table doesn't exist", when the application tries to select during the load. Why do you drop and recreate the table instead of doing a simple DELETE?
Another solution would be to load a copy of the table and use view switching:
mload tab2;
replace view v as select * from tab2;
delete from tab1;
The next load will do:
mload tab1;
replace view v as select * from tab1;
delete from tab2;
And so on. Of course your load job needs to implement the switching logic.

System level trigger on DML Command in plsql

Suppose there are n tables in the database. Any insert, update, or delete that happens against any table in the database has to be captured in a table called "Audit_Trail", which has the columns below.
Server_Name
AT_date
AT_time
Table_name
Column_name
Action
Old_value
New_Value
The server, the table, the column, and the date and time of the change need to be captured. Also, the "Action" column tracks whether an action is an insert, update, or delete, and we have to capture the old value and new value as well.
So what is the best way to do this? Can we create a database-level trigger that fires on any insert, update, or delete?
The best way would be to use Oracle's own auditing functionality.
AUDIT ALL ON DEFAULT BY ACCESS;
http://docs.oracle.com/cd/E11882_01/network.112/e36292/auditing.htm#DBSEG392
In response to comment ...
There is nothing unusual in wanting to audit every change made to tables in the database, hence the system already provides functionality for doing exactly that. It is better than using triggers because it cannot be bypassed as easily. However, if you want to use this pre-supplied, robust, simple-to-use functionality, you may have to compromise a little on your specific requirements; the payoff is a superior solution that uses code and configuration in common with thousands of other Oracle systems.
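For comparison, the hand-rolled trigger the question describes would look roughly like this for a single table (my sketch with hypothetical emp/sal names; you would need one such trigger per table, and one audit row per changed column):

-- Hypothetical hand-rolled audit trigger for one table/column (emp.sal).
CREATE OR REPLACE TRIGGER emp_audit_trg
AFTER INSERT OR UPDATE OR DELETE ON emp
FOR EACH ROW
DECLARE
    v_action VARCHAR2(10);
BEGIN
    IF INSERTING THEN v_action := 'INSERT';
    ELSIF UPDATING THEN v_action := 'UPDATE';
    ELSE v_action := 'DELETE';
    END IF;
    INSERT INTO audit_trail (server_name, at_date, at_time, table_name,
                             column_name, action, old_value, new_value)
    VALUES (SYS_CONTEXT('USERENV', 'SERVER_HOST'),
            TRUNC(SYSDATE), TO_CHAR(SYSDATE, 'HH24:MI:SS'),
            'EMP', 'SAL', v_action, :OLD.sal, :NEW.sal);
END;
/

Multiplied across every table and column, that is a lot of code to maintain, which is exactly the point of preferring the built-in auditing.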

What methods are available to monitor SQL database records?

I would like to monitor 10 tables with 1000 records per table. I need to know when a record changes, and which record it was.
I have looked into SQL Dependencies, however it appears that SQL Dependencies would only be able to tell me that the table changed, and not which record changed. I would then have to compare all the records in the table to find the modified record. I suspect this would be a problem for me as the records constantly change.
I have also looked into SQL triggers, however I am not sure if triggers would work for monitoring which record changed.
Another thought I had, is to create a "Monitoring" table which would have records added to it via the application code whenever a record is modified.
Do you know of any other methods?
EDIT:
I am using SQL Server 2008
I have looked into Change Data Capture, which is available in SQL 2008 and suggested by Martin Smith. Change Data Capture appears to be a robust, easy-to-implement and very attractive solution. I am going to roll CDC out on my database.
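For reference, enabling CDC is two system procedure calls (the schema and table names here are hypothetical):

-- Enable CDC for the database, then for each table to track:
EXEC sys.sp_cdc_enable_db;
EXEC sys.sp_cdc_enable_table
    @source_schema = N'dbo',
    @source_name   = N'Orders',
    @role_name     = NULL;

Note that in SQL Server 2008, CDC requires Enterprise, Developer, or Evaluation edition, and SQL Server Agent must be running for the capture job to harvest changes.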
You can add triggers and have them add rows to an audit table. They can audit the primary key of the rows that changed, and even additional information about the changes. For instance, in the case of an UPDATE, they can record the columns that changed.
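A minimal sketch of such a trigger (my sketch, assuming a hypothetical Orders table keyed by OrderID):

CREATE TABLE OrdersAudit (
    OrderID   int       NOT NULL,
    Action    char(1)   NOT NULL,  -- 'I', 'U' or 'D'
    ChangedAt datetime2 NOT NULL DEFAULT SYSUTCDATETIME()
);
GO
CREATE TRIGGER trg_Orders_Audit ON Orders
AFTER INSERT, UPDATE, DELETE
AS
BEGIN
    SET NOCOUNT ON;
    -- Rows present in both inserted and deleted are updates.
    INSERT INTO OrdersAudit (OrderID, Action)
    SELECT i.OrderID,
           CASE WHEN d.OrderID IS NULL THEN 'I' ELSE 'U' END
    FROM inserted i
    LEFT JOIN deleted d ON d.OrderID = i.OrderID;

    -- Rows only in deleted are deletes.
    INSERT INTO OrdersAudit (OrderID, Action)
    SELECT d.OrderID, 'D'
    FROM deleted d
    WHERE NOT EXISTS (SELECT 1 FROM inserted i WHERE i.OrderID = d.OrderID);
END;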
Before you write/implement your own, take a look at AutoAudit:
AutoAudit is a SQL Server (2005, 2008) Code-Gen utility that creates
Audit Trail Triggers with:
Created, CreatedBy, Modified, ModifiedBy, and RowVersion (incrementing INT) columns added to the table
Insert event logged to Audit table
Updates old and new values logged to Audit table
Delete logs all final values to the Audit table
view to reconstruct deleted rows
UDF to reconstruct Row History
Schema Audit Trigger to track schema changes
Re-code-gens triggers when Alter Table changes the table
What version and edition of SQL Server? Is Change Data Capture available? – Martin Smith
I am using SQL 2008 which supports Change Data Capture. Change Data Capture is a very robust method for tracking data changes as I would like to. Thanks for the answer.
Here's an idea. You can have a flag column on each table that is filled with the current datetime every time a record is created or updated. When you have processed a changed record, set its flag to NULL again. Unchanged records thus have NULL in their flag field, and you can query for non-NULL values to see which records have been changed or created, and when (and then set their flags back to NULL).
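A sketch of that idea (my sketch, hypothetical Orders table, SQL Server syntax):

ALTER TABLE Orders ADD ChangedAt datetime2 NULL;

-- Every write stamps the flag (done by the application, or by a trigger):
UPDATE Orders
SET    Total = 99.00, ChangedAt = SYSUTCDATETIME()
WHERE  OrderID = 42;

-- The monitor reads the changed rows, then clears the flags it has seen:
SELECT OrderID, ChangedAt FROM Orders WHERE ChangedAt IS NOT NULL;
UPDATE Orders SET ChangedAt = NULL WHERE ChangedAt IS NOT NULL;

Be aware of the race between the SELECT and the clearing UPDATE: a row changed in between would be silently unmarked, so in practice you should clear only rows whose ChangedAt is at or before the latest timestamp you read.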
