I found this question on the old Dynamics AX forum and found the answer.
I was not able to post it there because Microsoft is currently pushing everyone to start using the new 365 version.
"I've noticed that Recversion on some records in the database aren't updating as I'd expect. What I'm trying to do is to rework how my company is moving data into our Data Warehouse for reporting purposes. Up until this point, the tables we use for reports are refreshed each night with data from our production environment. Data Warehouse tables are truncated, then reloaded fresh.
Naturally, as our production database grows, this is becoming less and less ideal. We're using SQL Server 2005 for our database, so we don't have access to the new SQL "Merge" statement, but we were able to achieve a similar effect. We match records between the databases by RecId, and are using Recversion to test whether a Data Warehouse record needs to be updated from production in the case of changes to the record (inserts are handled based on not finding a matching record in the data warehouse by RecId).
So the problem is, if Recversion isn't being updated all the time, this method is useless since we aren't accurately capturing all updates. The table we've noticed a problem with is InventSum. The only thing I can see different from most other tables is the concurrency setting; InventSum is set to use Pessimistic concurrency. Would this affect the behavior of when the Recversion value changes? What else might cause this value not to update regularly?
The test:
I tested this out by recording the Recversion value of an InventSum record on a particular item in a given warehouse. I then created a sales line for 100,000 of the item, which in turn updated the InventSum.ReservPhysical value. Despite the change in the reserved amount on InventSum, the Recversion remained unchanged from its original value. Picking the quantity also did not update the Recversion. Posting a packing slip DID cause the Recversion to change.
So why the difference?"
See the original question here
I've added a "modifiedDateTime" field to the InventSum table and altered the method InventUpdateOnhand.sqlUpdateInventSumStrSQLServer(). The line:
str sqls_base = 'UPDATE %1 SET %2 FROM (SELECT %3 FROM %4 WHERE %5 GROUP BY %6) AS %7 WHERE %8';
changed to
str sqls_base = 'UPDATE %1 SET %2, modifiedDateTime = GETUTCDATE() FROM (SELECT %3 FROM %4 WHERE %5 GROUP BY %6) AS %7 WHERE %8';
I then added an index on modifiedDateTime and DataAreaId, in that order.
This has helped the data warehouse people a lot, since they can now pick up changed InventSum rows by timestamp instead of relying on Recversion.
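A minimal sketch of how the data warehouse side could use the new column, assuming rows are matched by RecId and pulled by a modifiedDateTime watermark. It runs against two in-memory SQLite databases as stand-ins for the AX database and the warehouse; the function name sync_changed_rows, the reduced column list, and the sample values are hypothetical, and the real load would use SQL Server statements instead.

```python
import sqlite3

SCHEMA = """CREATE TABLE InventSum (
    RecId INTEGER PRIMARY KEY,
    ReservPhysical REAL,
    modifiedDateTime TEXT
)"""

def sync_changed_rows(source, warehouse, last_sync):
    """Copy rows modified after last_sync into the warehouse, matched by RecId.

    Returns the new watermark (the latest modifiedDateTime that was copied).
    """
    changed = source.execute(
        "SELECT RecId, ReservPhysical, modifiedDateTime FROM InventSum "
        "WHERE modifiedDateTime > ? ORDER BY modifiedDateTime",
        (last_sync,),
    ).fetchall()
    # INSERT OR REPLACE covers both new rows and updated rows with the same RecId.
    warehouse.executemany(
        "INSERT OR REPLACE INTO InventSum (RecId, ReservPhysical, modifiedDateTime) "
        "VALUES (?, ?, ?)",
        changed,
    )
    warehouse.commit()
    return changed[-1][2] if changed else last_sync

# Hypothetical sample data: row 2 changed after the last sync.
source = sqlite3.connect(":memory:")
warehouse = sqlite3.connect(":memory:")
source.execute(SCHEMA)
warehouse.execute(SCHEMA)
source.executemany("INSERT INTO InventSum VALUES (?, ?, ?)",
                   [(1, 0.0, "2024-01-01T00:00:00"), (2, 5.0, "2024-01-02T00:00:00")])
warehouse.executemany("INSERT INTO InventSum VALUES (?, ?, ?)",
                      [(1, 0.0, "2024-01-01T00:00:00"), (2, 1.0, "2024-01-01T00:00:00")])
source.commit()
warehouse.commit()

watermark = sync_changed_rows(source, warehouse, "2024-01-01T00:00:00")
```

Storing the returned watermark between runs means each load only reads rows behind the modifiedDateTime index range, rather than comparing every record.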
We have a MariaDB database running WordPress 4.8 and found a lot of transient records in the wp_options table. The table was cleaned up with a plugin and reduced from ~800K records down to ~20K records. We are still getting slow query log entries for the table:
# User@Host: wmnfdb[wmnfdb] @ localhost []
# Thread_id: 950 Schema: wmnf_www QC_hit: No
# Query_time: 34.284704 Lock_time: 0.000068 Rows_sent: 1010 Rows_examined: 13711
SET timestamp=1510330639;
SELECT option_name, option_value FROM wp_options WHERE autoload = 'yes';
Found another post to create an index and did:
ALTER TABLE wp_options ADD INDEX (`autoload`);
That took too long and took the website offline. I found a lot of 'Waiting for table metadata lock' entries in the processlist. After canceling the ALTER TABLE, everything came back up, but the load is still high and the query still shows up in the slow query log. I also tried creating the index with the web server offline and a clean processlist. Should it take so long if I try to create it again tonight?
If you are deleting most of a table, it is better to create a new table, copy the desired rows over, then rename. The unfortunate aspect is that any added/modified rows during the steps would not get reflected in the copied table. (A plus: You could have had the new index already in place.)
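A minimal sketch of the copy-and-rename approach, run against an in-memory SQLite database as a stand-in so that it is executable here. The reduced wp_options schema, the sample rows, the "_transient_" prefix filter, and the wp_options_new / wp_options_old names are assumptions for illustration; on MariaDB the statements would differ (for example CREATE TABLE ... LIKE and a single RENAME TABLE that swaps both names).

```python
import sqlite3

COLUMNS = "(option_id INTEGER PRIMARY KEY, option_name TEXT, option_value TEXT, autoload TEXT)"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE wp_options " + COLUMNS)
conn.executemany(
    "INSERT INTO wp_options (option_name, option_value, autoload) VALUES (?, ?, ?)",
    [("_transient_a", "x", "no"),
     ("siteurl", "http://example.test", "yes"),
     ("_transient_b", "y", "no")],
)
conn.commit()

# 1. Build the replacement table, with the new index already in place.
conn.execute("CREATE TABLE wp_options_new " + COLUMNS)
conn.execute("CREATE INDEX wp_options_new_autoload ON wp_options_new (autoload)")

# 2. Copy only the rows worth keeping (here: everything that is not a transient).
conn.execute(
    "INSERT INTO wp_options_new "
    "SELECT * FROM wp_options WHERE substr(option_name, 1, 11) <> '_transient_'"
)

# 3. Swap the tables; the old one is kept until the result has been checked.
conn.execute("ALTER TABLE wp_options RENAME TO wp_options_old")
conn.execute("ALTER TABLE wp_options_new RENAME TO wp_options")
conn.commit()

kept = [row[0] for row in conn.execute("SELECT option_name FROM wp_options")]
```

Rows written to the original table between step 2 and step 3 are lost, which is the drawback mentioned above.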
In this, I give multiple ways to do big deletes.
What is probably hanging your system:
A big DELETE stashes away all the old values in case of a rollback -- which killing the DELETE invoked! It might have been faster to let it finish.
ALTER TABLE .. ADD INDEX -- If you are using MySQL 5.5 or older, that must copy the entire table over. Even if you are using a newer version (that can do ALGORITHM=INPLACE) there is still a metadata lock. How often is wp_options touched? (Sounds like too many times.)
Bottom line: If you recover from your attempts, but the delete is still to be done, pick the best approach in my link. After that, adding an index to only 20K rows should take some time, but not a deadly long time. And consider upgrading to 5.6 or newer.
If you need further discussion, please provide SHOW CREATE TABLE wp_options.
But wait! If autoload is a simple yes/no 'flag', the index might not be used. That is, it may be a waste to add the index! (For low cardinality, it is faster to do a table scan than to bounce back and forth between the index BTree and the data BTree.) Please provide a link to that post; I want to spit at them.
Where can I see at what time a view was dropped from the database?
For other object types such as indexes and tables, I use dba_recyclebin to check the droptime, but for a view I do not think that Oracle captures details in this table.
Assuming that the drop was recent enough that the data is still present in your UNDO tablespace (likely a few hours but up to a few days depending on your workload and configuration) you should be able to use a flashback query on dba_views. You'd need to have the FLASHBACK ANY TABLE privilege.
SELECT text
FROM dba_views AS OF TIMESTAMP( systimestamp - interval '1' hour )
WHERE owner = <<owner of view>>
AND view_name = <<name of view>>
Beyond that, you could look at your archived logs using dbms_logmnr to look for drop view statements over a longer time frame. That is likely to be more effort, however.
So this is essentially a follow-up question on Finding duplicate records.
We perform data imports from text files every day and we ended up importing 10163 records spread across 182 files twice. On running the query mentioned above to find duplicates, the total count of records we got is 10174, which is 11 more than the files contain. I assumed that some valid records that happen to be exactly identical were also being counted by the query. So I thought it would be best to use a timestamp field and simply find all the records that were loaded today (and hence ended up adding duplicate rows). I used ORA_ROWSCN with the following query:
select count(*) from my_table
where TRUNC(SCN_TO_TIMESTAMP(ORA_ROWSCN)) = '01-MAR-2012'
;
However, the count is still too high, i.e. 10168. I am pretty sure that the total number of lines in the files is 10163, which I checked by running wc -l *.txt in the folder that contains all the files.
Is it possible to find out which rows are actually inserted twice?
By default, ORA_ROWSCN is stored at the block level, not at the row level. It is only stored at the row level if the table was originally built with ROWDEPENDENCIES enabled. Assuming that you can fit many rows of your table in a single block and that you're not using the APPEND hint to insert the new data above the existing high water mark of the table, you are likely inserting new data into blocks that already have some existing data in them. By default, that is going to change the ORA_ROWSCN of every row in the block causing your query to count more rows than were actually inserted.
Since ORA_ROWSCN is only guaranteed to be an upper-bound on the last time there was DML on a row, it would be much more common to determine how many rows were inserted today by adding a CREATE_DATE column to the table that defaults to SYSDATE or to rely on SQL%ROWCOUNT after your INSERT ran (assuming, of course, that you are using a single INSERT statement to insert all the rows).
Generally, using the ORA_ROWSCN and the SCN_TO_TIMESTAMP function is going to be a problematic way to identify when a row was inserted even if the table is built with ROWDEPENDENCIES. ORA_ROWSCN returns an Oracle SCN which is a System Change Number. This is a unique identifier for a particular change (i.e. a transaction). As such, there is no direct link between a SCN and a time-- my database might be generating SCN's a million times more quickly than yours and my SCN 1 may be years different from your SCN 1. The Oracle background process SMON maintains a table that maps SCN values to approximate timestamps but it only maintains that data for a limited period of time-- otherwise, your database would end up with a multi-billion row table that was just storing SCN to timestamp mappings. If the row was inserted more than, say, a week ago (and the exact limit depends on the database and database version), SCN_TO_TIMESTAMP won't be able to convert the SCN to a timestamp and will return an error.
I would like to monitor 10 tables with 1000 records per table. I need to know when a record, and which record changed.
I have looked into SQL Dependencies, however it appears that SQL Dependencies would only be able to tell me that the table changed, and not which record changed. I would then have to compare all the records in the table to find the modified record. I suspect this would be a problem for me as the records constantly change.
I have also looked into SQL triggers, however I am not sure if triggers would work for monitoring which record changed.
Another thought I had is to create a "Monitoring" table which would have records added to it via the application code whenever a record is modified.
Do you know of any other methods?
EDIT:
I am using SQL Server 2008
I have looked into Change Data Capture which is available in SQL 2008 and suggested by Martin Smith. Change Data Capture appears to be a robust, easy to implement and very attractive solution. I am going to roll CDC on my database.
You can add triggers and have them add rows to an audit table. They can audit the primary key of the rows that changed, and even additional information about the changes. For instance, in the case of an UPDATE, they can record the columns that changed.
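A minimal sketch of the trigger-plus-audit-table idea, written with SQLite trigger syntax through Python's sqlite3 module so that it can be run as-is. The item and item_audit tables and their columns are hypothetical; on SQL Server the triggers would be written in T-SQL against the inserted and deleted pseudo-tables instead of NEW and OLD.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE item (id INTEGER PRIMARY KEY, qty INTEGER);

-- One audit row per change: which key changed, how, and the old/new values.
CREATE TABLE item_audit (
    audit_id   INTEGER PRIMARY KEY AUTOINCREMENT,
    item_id    INTEGER,
    action     TEXT,
    old_qty    INTEGER,
    new_qty    INTEGER,
    changed_at TEXT DEFAULT CURRENT_TIMESTAMP
);

CREATE TRIGGER item_after_insert AFTER INSERT ON item BEGIN
    INSERT INTO item_audit (item_id, action, new_qty)
    VALUES (NEW.id, 'INSERT', NEW.qty);
END;

CREATE TRIGGER item_after_update AFTER UPDATE ON item BEGIN
    INSERT INTO item_audit (item_id, action, old_qty, new_qty)
    VALUES (NEW.id, 'UPDATE', OLD.qty, NEW.qty);
END;

CREATE TRIGGER item_after_delete AFTER DELETE ON item BEGIN
    INSERT INTO item_audit (item_id, action, old_qty)
    VALUES (OLD.id, 'DELETE', OLD.qty);
END;
""")

conn.execute("INSERT INTO item VALUES (1, 10)")
conn.execute("UPDATE item SET qty = 7 WHERE id = 1")
conn.execute("DELETE FROM item WHERE id = 1")
conn.commit()

audit = conn.execute(
    "SELECT item_id, action, old_qty, new_qty FROM item_audit ORDER BY audit_id"
).fetchall()
```

Reading item_audit in audit_id order then gives the changed keys and the kind of change, without comparing every row of the monitored table.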
Before you write or implement your own, take a look at AutoAudit:
AutoAudit is a SQL Server (2005, 2008) Code-Gen utility that creates
Audit Trail Triggers with:
Created, CreatedBy, Modified, ModifiedBy, and RowVersion (incrementing INT) columns to table
Insert event logged to Audit table
Updates old and new values logged to Audit table
Delete logs all final values to the Audit table
view to reconstruct deleted rows
UDF to reconstruct Row History
Schema Audit Trigger to track schema changes
Re-code-gens triggers when Alter Table changes the table
What version and edition of SQL Server? Is Change Data Capture available? – Martin Smith
I am using SQL 2008 which supports Change Data Capture. Change Data Capture is a very robust method for tracking data changes as I would like to. Thanks for the answer.
Here's an idea. You can have a flag column on each table that is filled with the current datetime every time a record is created or updated. Then, once you have processed a changed record, set its flag back to null. Unchanged records therefore have null in their flag field, and you can query for non-null values to see which records have been changed or created, and when (and then set their flags to null again).
I have an SQLite3 database with a table and a primary key consisting of two integers, and I'm trying to insert lots of data into it (i.e. around 1GB or so).
The issue I'm having is that creating primary key also implicitly creates an index, which in my case bogs down inserts to a crawl after a few commits (and that would be because the database file is on NFS.. sigh).
So, I'd like to somehow temporarily disable that index. My best plan so far involved dropping the primary key's automatic index, however it seems that SQLite doesn't like it and throws an error if I attempt to do it.
My second best plan would involve the application making transparent copies of the database on the network drive, making modifications and then merging it back. Note that as opposed to most SQLite/NFS questions, I don't need access concurrency.
What would be a correct way to do something like that?
UPDATE:
I forgot to specify the flags I'm already using:
PRAGMA synchronous = OFF
PRAGMA journal_mode = OFF
PRAGMA locking_mode = EXCLUSIVE
PRAGMA temp_store = MEMORY
UPDATE 2:
I'm in fact inserting items in batches, however every next batch is slower to commit than previous one (I'm assuming this has to do with the size of index). I tried doing batches of between 10k and 50k tuples, each one being two integers and a float.
You can't remove the embedded index, since it is the only way to address a row.
Merge your 2 integer keys into a single long key = (key1<<32) + key2, and make this an INTEGER PRIMARY KEY in your schema (in that case you will have only 1 index)
Set page size for new DB at least 4096
Remove ANY additional index except primary
Fill in data in the SORTED order so that primary key is growing.
Reuse commands (prepared statements); don't create them from a string each time
Set page cache size to as much memory as you have left (remember that cache size is in number of pages, but not number of bytes)
Commit every 50000 items.
If you have additional indexes - create them only AFTER ALL data is in table
If you are able to merge the keys (I think you're using 32-bit integers while SQLite uses 64-bit, so it's possible) and fill the data in sorted order, I bet you will fill your first GB with the same performance as the second, and both will be fast enough.
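The steps above can be sketched with Python's built-in sqlite3 module. This is only an illustration under assumptions: the samples table, the merge_key and bulk_insert helpers, and the tiny sample rows are hypothetical, key1 is assumed to stay below 2**31 so that the merged key fits SQLite's signed 64-bit integer, and the rows are assumed to fit in memory for sorting.

```python
import sqlite3

def merge_key(key1, key2):
    """Pack two non-negative 32-bit integers into a single 64-bit key."""
    return (key1 << 32) + key2

def bulk_insert(conn, rows, batch_size=50_000):
    """Insert (key1, key2, value) tuples in ascending key order, one commit per batch."""
    packed = sorted((merge_key(k1, k2), value) for k1, k2, value in rows)
    for start in range(0, len(packed), batch_size):
        # executemany reuses one prepared statement for the whole batch.
        conn.executemany(
            "INSERT INTO samples (id, value) VALUES (?, ?)",
            packed[start:start + batch_size],
        )
        conn.commit()

conn = sqlite3.connect(":memory:")
# page_size only takes effect if it is set before the first table is created.
conn.execute("PRAGMA page_size = 4096")
# INTEGER PRIMARY KEY aliases the rowid, so no separate index is built.
conn.execute("CREATE TABLE samples (id INTEGER PRIMARY KEY, value REAL)")

rows = [(2, 1, 0.5), (1, 7, 1.5), (1, 3, 2.5)]
bulk_insert(conn, rows, batch_size=2)

ids = [row[0] for row in conn.execute("SELECT id FROM samples ORDER BY id")]
```

The original pair can be recovered from a merged key with key1 = id >> 32 and key2 = id & 0xFFFFFFFF.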
Are you doing the INSERT of each new row as an individual transaction?
If you use BEGIN TRANSACTION and INSERT rows in batches then I think the index will only get rebuilt at the end of each Transaction.
See faster-bulk-inserts-in-sqlite3.