My target is:
Take a CSV file
Compare it to an Access database
Update the necessary rows
Insert the necessary rows
The logic of my Talend job seems OK and works, but the update process is very slow. Do you have any ideas how I can make it faster?
Talend Map
Thank you.
I put an index (Yes, no duplicates) on the update key in my Access database.
I put a key on my mapping in Talend.
Do you think it is the size of the database?
60,000 rows
20 columns
size: 20 MB
460 s to update 700 rows
My next try will be on a SQL database.
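Update slowness against Access over ODBC often comes down to committing one row at a time; batching the statements and committing once is usually the first thing to try. Below is a rough sketch of that idea with pyodbc (not Talend); the connection string, table name, and column names are made-up placeholders:
# Batched UPDATE over ODBC to Access: one prepared statement, one commit.
import pyodbc

conn = pyodbc.connect(
    r"Driver={Microsoft Access Driver (*.mdb, *.accdb)};"
    r"DBQ=C:\data\mydb.accdb;"
)
cur = conn.cursor()

# (new_value, update_key) pairs coming from the CSV file
rows = [("new value 1", "key1"), ("new value 2", "key2")]

cur.executemany(
    "UPDATE target_table SET val_col = ? WHERE key_col = ?",
    rows,
)
conn.commit()  # a single commit instead of one per updated row
conn.close()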
I often use RODBC to work with MS Access files in R. For removing existing tables sqlDrop works fine, e.g.:
db <- odbcConnectAccess(choose.files(caption="select database"))
sqlDrop(db, "existing_dummy_table")
What I need to do now is to delete an existing query that is stored in the Access database. sqlDrop only seems to work with tables, not with queries.
sqlDrop(db, "existing_dummy_query")
brings up:
Error in odbcTableExists(channel, sqtable, abort = errors) :
‘existing_dummy_query’: table not found on channel
Is there any solution how to delete/remove existing queries?
Thank you!
After a lot of testing I found a solution myself:
sqlQuery(db, "DROP TABLE existing_dummy_query")
Maybe it's helpful for others. DROP VIEW did not work; I don't know why.
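For what it's worth, the same check can be done from any other ODBC client. A hypothetical pyodbc sketch that lists what the Access driver exposes and confirms the stored query is gone after the DROP (the driver name and file path are placeholders):
# List the objects the Access ODBC driver exposes (tables and stored queries)
import pyodbc

conn = pyodbc.connect(
    r"Driver={Microsoft Access Driver (*.mdb, *.accdb)};"
    r"DBQ=C:\data\mydb.accdb;"
)
cur = conn.cursor()
names = [row.table_name for row in cur.tables()]
print("existing_dummy_query" in names)  # expect False after the DROP
conn.close()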
I am writing a Python script to automate AWS DynamoDB table backup and restore testing. Once the table is restored from a backup, I cannot check (test) the table size or item count of the restored table immediately. As per AWS: "Storage size and item count are not updated in real-time. They are updated periodically, roughly every six hours."
I also tried using "scan" on the restored table to list sample items, but that does not seem to be working either.
Does anybody know what the workaround could be here? Suggestions would be appreciated.
Thanks!!
I was able to achieve it by using a table scan.
import boto3

# DynamoDB resource (not a low-level client) in the table's region
dynamodb = boto3.resource('dynamodb', region_name='us-east-1')
table = dynamodb.Table('ddb_test_table')
response = table.scan(Limit=XX)  # Limit=XX caps how many items one scan page reads
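Since the item count from DescribeTable is only refreshed periodically, another workaround is to count the items yourself with a paginated scan using Select='COUNT'. A minimal sketch building on the code above; note that it reads the whole table, so it consumes read capacity:
# Count items immediately after the restore by paginating a COUNT scan
import boto3

dynamodb = boto3.resource('dynamodb', region_name='us-east-1')
table = dynamodb.Table('ddb_test_table')

total = 0
kwargs = {'Select': 'COUNT'}
while True:
    response = table.scan(**kwargs)
    total += response['Count']
    if 'LastEvaluatedKey' not in response:
        break
    kwargs['ExclusiveStartKey'] = response['LastEvaluatedKey']

print("Items counted:", total)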
We have a MariaDB database running WordPress 4.8 and found a lot of transient-named records in the wp_options table. The table was cleaned up with a plugin and reduced from ~800K records down to ~20K records. We are still getting slow query entries regarding the table:
# User@Host: wmnfdb[wmnfdb] @ localhost []
# Thread_id: 950 Schema: wmnf_www QC_hit: No
# Query_time: 34.284704 Lock_time: 0.000068 Rows_sent: 1010 Rows_examined: 13711
SET timestamp=1510330639;
SELECT option_name, option_value FROM wp_options WHERE autoload = 'yes';
Found another post suggesting to create an index and ran:
ALTER TABLE wp_options ADD INDEX (`autoload`);
That was taking too long and took the website offline. I found a lot of 'Waiting for table metadata lock' entries in the process list. After cancelling the ALTER TABLE, everything was running again, though still with high load and, of course, entries in the slow query log. I also tried creating the index with the web server offline and a clean process list. Should it take this long if I try to create it again tonight?
If you are deleting most of a table, it is better to create a new table, copy the desired rows over, then rename. The unfortunate aspect is that any rows added or modified during those steps would not be reflected in the copied table. (A plus: you could have had the new index in place from the start.)
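A minimal sketch of that copy-and-rename flow, assuming a pymysql connection; the credentials and the keep-condition (dropping the transient rows) are placeholders, and the new index is created before the copy:
# Build a clean wp_options, copy the rows to keep, then swap the tables
import pymysql

conn = pymysql.connect(host='localhost', user='wp', password='secret',
                       database='wmnf_www', autocommit=True)
with conn.cursor() as cur:
    cur.execute("CREATE TABLE wp_options_new LIKE wp_options")
    cur.execute("ALTER TABLE wp_options_new ADD INDEX (autoload)")
    cur.execute("INSERT INTO wp_options_new "
                "SELECT * FROM wp_options "
                "WHERE option_name NOT LIKE '%transient%'")
    # Atomic swap; keep wp_options_old until you are sure nothing is missing
    cur.execute("RENAME TABLE wp_options TO wp_options_old, "
                "wp_options_new TO wp_options")
conn.close()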
In this, I give multiple ways to do big deletes.
What is probably hanging your system:
A big DELETE stashes away all the old values in case of a rollback -- which killing the DELETE invoked! It might have been faster to let it finish.
ALTER TABLE .. ADD INDEX -- If you are using MySQL 5.5 or older, that must copy the entire table over. Even if you are using a newer version (that can do ALGORITHM=INPLACE) there is still a metadata lock. How often is wp_options touched? (Sounds like too many times.)
Bottom line: If you recover from your attempts, but the delete is still to be done, pick the best approach in my link. After that, adding an index to only 20K rows should take some time, but not a deadly long time. And consider upgrading to 5.6 or newer.
If you need further discussion, please provide SHOW CREATE TABLE wp_options.
But wait! If autoload is a simple yes/no 'flag', the index might not be used. That is, it may be a waste to add the index! (For low cardinality, it is faster to do a table scan than to bounce back and forth between the index BTree and the data BTree.) Please provide a link to that post; I want to spit at them.
How can you remove old records from the BAMPrimaryImport TDDS_FailedTrackingData table?
... not the TDDS_FailedTrackingData in the BizTalkDTADb database
Our production system has 2+ million records in the BAMPrimaryImport.dbo.TDDS_FailedTrackingData, and the various BizTalk SQL Agent jobs are running fine, but these records are still there.
UPDATE: We sorted the issue that was generating the fails (fingers crossed), so there are no new records.
This might be helpful for you as well:
http://www.codit.eu/blog/2014/07/03/maintaining-biztalk-bam-databases/
I'm not claiming this is an actual answer to your question, but it is about maintaining the BAM databases using NSVacuum.
Looks like it's a case of manually deleting the records (TRUNCATE TABLE or DELETE FROM) ...
I've used Red Gate's SQL Search and looked for TDDS_FailedTrackingData throughout the database ...
all objects and all databases
Found 8 references in the entire system ... see below
Records are removed from the [BizTalkDTADb].[dbo].[TDDS_FailedTrackingData] in two stored procedures ...
[dtasp_CleanHMData] does a TRUNCATE TABLE
[dtasp_PurgeTrackingDatabase_Internal] does a DELETE FROM for 100 records at a time
However the [BAMPrimaryImport] database only has one stored procedure that has any mention of the [BAMPrimaryImport].[dbo].[TDDS_FailedTrackingData] table ...
[BAMPrimaryImport].[dbo].[TDDS_InsertFailedTrackingData]
and it just inserts records, adding the current date and time from GETUTCDATE()
Found lots of posts about clearing down the [BizTalkDTADb] tables, but very few about clearing down [BAMPrimaryImport]
This on TechNet from a BizTalk MVP
And this on MSDN from another BizTalk expert.
You can manually run a simple DELETE T-SQL script:
DELETE FROM [BAMPrimaryImport].[dbo].[TDDS_FailedTrackingData]
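If you go the DELETE FROM route on a couple of million rows, it is probably worth copying the purge procedure's approach and deleting in batches rather than in one huge transaction. A rough sketch with pyodbc; the connection string and batch size are placeholders:
# Delete the failed-tracking rows in small batches to keep transactions short
import pyodbc

conn = pyodbc.connect(
    "Driver={ODBC Driver 17 for SQL Server};Server=SQLHOST;"
    "Database=BAMPrimaryImport;Trusted_Connection=yes;",
    autocommit=True,
)
cur = conn.cursor()
while True:
    cur.execute("DELETE TOP (1000) FROM [dbo].[TDDS_FailedTrackingData]")
    if cur.rowcount == 0:
        break
conn.close()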
I have a SET table in Teradata. When I load duplicate records through Informatica, the session fails because it tries to push duplicate records into the SET table.
I want Informatica to reject duplicate records whenever they are loaded, using a TPT or relational connection.
Can anyone help me with the properties I need to set?
Do you really need to keep track of which records are rejected due to duplication in the TPT logs? It seems like you are open to suggestions about TPT or relational connections, so I assume you don't really care about TPT-level logs.
If this assumption is correct then you can simply put an Aggregator Transformation in the mapping and mark every field as Group By. As expected, this will add a group by clause in the generated query and eliminate duplicates in the source data.
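In SQL terms, marking every port as Group By amounts to reading the source through a grouped query. A rough sketch of that de-duplicated read with the teradatasql driver; the host, credentials, and table/column names are placeholders:
# Read the source de-duplicated by grouping on every column
import teradatasql

with teradatasql.connect(host="tdhost", user="etl_user", password="secret") as conn:
    with conn.cursor() as cur:
        cur.execute(
            "SELECT col_a, col_b, col_c "
            "FROM staging_src "
            "GROUP BY col_a, col_b, col_c"
        )
        deduped_rows = cur.fetchall()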
Please try the following things:
1. If you use FastLoad or the TPT fast load operator, the utility will implicitly remove the duplicates, but it can only be used for loading into empty tables.
2. If you are trying to load data into a non-empty table, place a Sorter and de-duplicate your data in Informatica.
3. Also try changing the "stop on error" flag to 0 and the "error limit" in the target to -1.
Please share your results with us.