The database's auto-deleting task problem - asp.net

I'm trying to figure out how to develop an auto-deleting task on my database. E.g. there are some records in the database:
(Image: sample records) http://img109.imageshack.us/img109/2962/datax.png
So, if the actual DateTime on the server is 2010-01-09 12:12:12 the record no.1 must be deleted.
OK, but what if the database holds e.g. 1,000,000 records? Does the server have to search the database every second to check which rows must be deleted? That's not efficient at all.
I'm totally new to Microsoft SQL Server, so I'd be grateful for any kind of help.

There isn't a time-based trigger in SQL Server, so you are going to have to implement this as a job or through some other scheduled mechanism.
Most likely you will want an index on the StartDate (end date?) column so that your deletion query doesn't have to perform a full table scan to find the data it needs to delete.
Usually you don't actually perform deletes every second. Instead, the app should be smart enough to query the table in a way that eliminates expired records from its result set. Then you can perform lazy deletes at some other interval to clean up, such as once an hour or once a day.
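A minimal sketch of that split between the read path and the cleanup path, in Python. It uses the built-in sqlite3 module purely as a stand-in for SQL Server so that it runs anywhere; the table name tasks, the column end_date, and both function names are assumptions for illustration, not taken from the question.

```python
import sqlite3

# sqlite3 is only a stand-in here; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tasks (id INTEGER PRIMARY KEY, name TEXT, end_date TEXT)")
# Index the expiry column so neither query below needs a full table scan.
conn.execute("CREATE INDEX ix_tasks_end_date ON tasks (end_date)")
conn.executemany(
    "INSERT INTO tasks (name, end_date) VALUES (?, ?)",
    [("expired", "2010-01-09 12:00:00"), ("active", "2010-01-10 08:00:00")],
)

def active_tasks(conn, now):
    """Read path: skip expired rows even if they have not been deleted yet."""
    rows = conn.execute(
        "SELECT name FROM tasks WHERE end_date > ? ORDER BY id", (now,)
    )
    return [name for (name,) in rows]

def purge_expired(conn, now):
    """Cleanup path: meant to run from a scheduled job; returns rows removed."""
    with conn:  # one short transaction for the whole cleanup
        return conn.execute("DELETE FROM tasks WHERE end_date <= ?", (now,)).rowcount

now = "2010-01-09 12:12:12"
visible_before = active_tasks(conn, now)  # expired row is already hidden
removed = purge_expired(conn, now)        # physical delete happens later
remaining = conn.execute("SELECT COUNT(*) FROM tasks").fetchone()[0]
```

On SQL Server the same two statements would live in the application query and in a scheduled job respectively; how often the job runs only affects how much dead data sits in the table, not what users see.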

Related

Inserting Result of Time-Consuming Stored Procedures to Tables

I have a website that runs a stored procedure when you open the home page. That stored procedure processes data from 4 relational tables and returns a result. As the number of DB records has grown, the stored procedure can take more than 10 seconds to complete, which is too long for a home page.
So I think that regularly inserting the result of the stored procedure into a new table and using that table for the home page could solve the problem, but I am not sure if it is good practice for SQL Server.
Is there any better solution for my case?
Edit: Those 4 tables are updated every 15 minutes with about 30 inserts.
If you are willing to have a "designated victim" update the cache as needed (which may also cause other users to wait) you can do something like this in a stored procedure (SP):
Start a transaction to block access to the cache.
Check the date/time of the cache entries. (This requires either adding a CacheUpdated column to the cache table or storing the value elsewhere.)
If the cached data is sufficiently recent then return the data and end the transaction.
Delete the cached data and run a new query to refill it with an appropriate CacheUpdated date/time.
Return the cached data and end the transaction.
If the update time becomes too long for users to wait, or the cache rebuild blocks too many users, you can run a stored procedure at a scheduled interval by creating a job in SQL Server Agent. The SP would:
Save the current date/time, e.g. as @Now.
Run the query to update the cache, marking each row with CacheUpdated = @Now.
Delete any cache rows where CacheUpdated != @Now.
The corresponding SP for users would simply return the oldest set of data, i.e. Min( CacheUpdated ) rows. If there is only one set, that's what they get. If an update is in progress then they'll get the older complete set, not the work in progress.
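The scheduled variant can be sketched roughly as follows. This is Python with sqlite3 standing in for SQL Server and SQL Server Agent, so it only illustrates the logic; the source and cache tables, the SUM query, and the function names are all assumptions made up for the example.

```python
import sqlite3

# sqlite3 stands in for SQL Server; every table and column name is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE source (item TEXT, amount INTEGER)")
conn.execute("CREATE TABLE cache (item TEXT, total INTEGER, cache_updated TEXT)")
conn.executemany("INSERT INTO source VALUES (?, ?)", [("a", 1), ("a", 2), ("b", 5)])

def refresh_cache(conn, now):
    """Scheduled job: add a new result set stamped with `now`, then drop older sets."""
    with conn:  # step 1: fill the cache with the fresh set
        conn.execute(
            "INSERT INTO cache SELECT item, SUM(amount), ? FROM source GROUP BY item",
            (now,),
        )
    with conn:  # step 2: remove every row that does not belong to the fresh set
        conn.execute("DELETE FROM cache WHERE cache_updated != ?", (now,))

def read_cache(conn):
    """User-facing read: return the oldest complete set, i.e. Min(cache_updated) rows."""
    return conn.execute(
        "SELECT item, total FROM cache "
        "WHERE cache_updated = (SELECT MIN(cache_updated) FROM cache) "
        "ORDER BY item"
    ).fetchall()

refresh_cache(conn, "2024-01-01 00:00:00")
first = read_cache(conn)

with conn:
    conn.execute("INSERT INTO source VALUES ('b', 10)")
refresh_cache(conn, "2024-01-01 00:15:00")
second = read_cache(conn)
```

Between the two steps of refresh_cache both sets exist, and read_cache keeps serving the older one, which is the behaviour described above.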
From what you have explained, I see no problem in doing that. To suggest a better solution, though, we would need more detail, since we don't know what type of data you are collecting or how it grows over time.

Clear DocumentDB Collection before inserting documents

I need to know how to clear a DocumentDB collection before inserting new documents. I am using a Data Factory pipeline activity to fetch data from an on-prem SQL Server and insert it into a DocumentDB collection. The frequency is set to every 2 hrs. So when the next cycle runs, I want to first clear the existing data in the DocumentDB collection. How do I do that?
The easiest way is to programmatically delete the collection and recreate it with the same name. Our test scripts do this automatically. There is the potential for this to fail due to a subtle race condition, but we've found that adding a half second delay between the delete and recreate avoids this.
Alternatively, it would be possible to fetch every document id and then delete them one at a time. This would be most efficiently done from a stored procedure (sproc) so you didn't have to send it all over the wire, but it would still be consuming RUs and take time.

Updating records one by one

I have 5000 records. I calculate the salary of one user and update his data in the database, one user at a time, so it is taking quite long to update all 5000 records. I want to calculate all users' salaries first and then update the records in the db.
Is there any other way we can update the db in a single click?
It really depends on how you are managing your data access layer and what data you need for the calculation. Do you have all the data you need in just one table, or do you need to fetch additional data from other tables for each record?
One way is to retrieve each record, do the calculation in a transaction, and then store the result in the database. This way you can also use an AJAX UI to inform the user about the progress of the calculation. You should use SqlDataReader to fetch the data, as it is well optimized, has less overhead than DataSet and DataTable, and avoids several type casts. You can optimize further by taking advantage of the TPL, or by making the number N of records fetched/updated each time configurable. This approach works if you have the IDs of the records. You also need a field on your records to track the calculation, so that after a disconnection, crash, or iisreset you can resume instead of rerunning everything.
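The "N records each time, with a tracking field" idea can be sketched like this. It is Python with sqlite3 rather than ADO.NET, only to show the shape of the loop; the employees table, its calculated flag, and the hours * rate rule are hypothetical placeholders for the real schema and salary logic.

```python
import sqlite3

# Stand-in schema; `calculated` is the hypothetical tracking field used for resuming.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees ("
    "id INTEGER PRIMARY KEY, hours REAL, rate REAL, "
    "salary REAL, calculated INTEGER NOT NULL DEFAULT 0)"
)
conn.executemany(
    "INSERT INTO employees (id, hours, rate) VALUES (?, ?, ?)",
    [(1, 160, 10.0), (2, 120, 20.0), (3, 100, 15.0)],
)

def calculate_salary(hours, rate):
    """Placeholder for the real salary rule."""
    return hours * rate

def process_batch(conn, batch_size):
    """Calculate and store up to `batch_size` pending records in one transaction."""
    rows = conn.execute(
        "SELECT id, hours, rate FROM employees "
        "WHERE calculated = 0 ORDER BY id LIMIT ?",
        (batch_size,),
    ).fetchall()
    updates = [(calculate_salary(hours, rate), emp_id) for emp_id, hours, rate in rows]
    with conn:  # one commit per batch instead of one per record
        conn.executemany(
            "UPDATE employees SET salary = ?, calculated = 1 WHERE id = ?", updates
        )
    return len(rows)

# Keep going until nothing is pending; after a crash the same loop simply resumes.
batch_sizes = []
while True:
    done = process_batch(conn, batch_size=2)
    if done == 0:
        break
    batch_sizes.append(done)

salaries = conn.execute("SELECT id, salary FROM employees ORDER BY id").fetchall()
```

The count returned by each call is also a natural value to report back to a progress indicator in the UI.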

Attaching two memory databases

I am collecting data every second and storing it in a ":memory:" database. The inserts into this database happen inside a transaction.
Every time a request is sent to the server, the server reads data from the first in-memory database, does some calculation, stores the result in the second database, and sends it back to the client. For this, I am creating another ":memory:" database to store the aggregated information from the first db. I cannot use the same db because I need to do a large calculation to get the aggregated result. This cannot be done inside the transaction (because if one collection takes 5 sec, I will lose 4 seconds of data). I cannot create the table in the same database because I would not be able to write the aggregate data while the original data is being collected and inserted (that happens inside a transaction, every second).
-- Sometimes I want to retrieve data from both databases. How can I link these two in-memory databases? Using the ATTACH DATABASE statement, I can attach the second db to the first one. But when the next request comes in, how do I check whether the second db exists?
-- Suppose I attach the second in-memory db to the first one. Will the second database be locked when we write data to the first db?
-- Is there any other way to store this aggregated data?
As far as I understand your idea, I don't think you need two databases at all. I suppose you are misinterpreting the idea of transactions in SQL.
If you begin a transaction, other processes are still allowed to read data. If you are only reading data, you probably don't need a database lock.
A possible workflow could look like the following.
Insert some data into the database (use a transaction just for the insertion process)
Perform heavy calculations on the database (but do not use a transaction, otherwise it will prevent other processes from inserting any data into your database). Even if this step includes really heavy computation, you can still insert and read data from another process, as SELECT statements will not lock your database.
Write results to the database (again, by using a transaction)
Write results to the database (again, by using a transaction)
Just make sure that heavy calculations are not performed within a transaction.
If you want a more detailed description of this solution, look at the documentation about the file locking behaviour of sqlite3: http://www.sqlite.org/lockingv3.html
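The workflow above could look roughly like this with Python's sqlite3 module. It is a single-connection sketch under assumptions: the samples and aggregates tables and the mean calculation are invented stand-ins for the real data and the real heavy computation.

```python
import sqlite3

# One in-memory database holds both the raw and the aggregated data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE samples (ts INTEGER, value REAL)")
conn.execute("CREATE TABLE aggregates (sample_count INTEGER, mean REAL)")

def insert_samples(conn, rows):
    """Step 1: a short transaction that covers only the insert."""
    with conn:
        conn.executemany("INSERT INTO samples (ts, value) VALUES (?, ?)", rows)

def aggregate(conn):
    """Steps 2 and 3: compute outside any write transaction, then store the result."""
    # Read the data first; no write transaction is open while we compute.
    values = [v for (v,) in conn.execute("SELECT value FROM samples").fetchall()]
    if not values:
        return None
    # Stand-in for the heavy calculation.
    result = (len(values), sum(values) / len(values))
    # Only the final write is wrapped in a (short) transaction.
    with conn:
        conn.execute(
            "INSERT INTO aggregates (sample_count, mean) VALUES (?, ?)", result
        )
    return result

insert_samples(conn, [(1, 1.0), (2, 2.0), (3, 3.0)])
summary = aggregate(conn)
insert_samples(conn, [(4, 6.0)])  # collection continues after the aggregate is stored
stored = conn.execute("SELECT sample_count, mean FROM aggregates").fetchall()
```

Because each transaction is held only for the duration of a write, the one-second collection is never blocked for the length of the calculation.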

Teradata Change data capture

My team is thinking about developing a real-time application (a bunch of charts, gauges etc) reading from the database. At the backend we have a high-volume Teradata database. We expect some other applications to be constantly feeding data into this database.
Now we are wondering about how to feed in the changes from the database to the application. Polling from the application would not be a viable option in our case.
Are there any tools that are available within Teradata that would help us achieve this?
Any directions on this would be greatly appreciated
We faced a similar requirement. In our case, though, the client asked us to provide daily changes to a purchase orders table. That meant we had to run a batch of scripts every day to capture the changes occurring in the table.
So we started to collect data every day and store it in a sparse history format in another table. The process is simple. We store each purchase order detail record in the history table against the first day's date. The next day, we compare that day's feed record against the history record and identify any change. If any of the purchase order record's columns changed, we collect that record and keep it in a final reporting table, which is shown to the client.
If you run the batch scripts once a day and a record changes more than once in a day, this method cannot give you every change. For that you may need to run the batch scripts more than once a day, depending on your requirement.
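The comparison step can be sketched in plain Python, leaving Teradata out entirely. The dictionaries below stand in for the history table and the daily feed, and the function name, the PO ids, and the qty/status columns are all made up for illustration.

```python
def detect_changes(history, feed):
    """Compare a feed against the last known state of each purchase order.

    `history` maps a purchase order id to its last seen record and is updated
    in place; the return value is the list of records that changed.
    """
    changed = []
    for po_id, record in feed.items():
        previous = history.get(po_id)
        if previous is not None and previous != record:
            changed.append({"po_id": po_id, "old": previous, "new": record})
        # Remember the latest state for the next comparison run.
        history[po_id] = record
    return changed

history = {}
day1 = {
    "PO-1": {"qty": 10, "status": "open"},
    "PO-2": {"qty": 4, "status": "open"},
}
day2 = {
    "PO-1": {"qty": 12, "status": "open"},  # quantity changed overnight
    "PO-2": {"qty": 4, "status": "open"},   # unchanged
}

first_run = detect_changes(history, day1)   # initial load, nothing to compare yet
second_run = detect_changes(history, day2)  # only PO-1 is reported
```

Running detect_changes more often narrows the window in which several changes to the same record collapse into one, which is the limitation described above.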
Please let us know if you find any other solution. Hope this helps.
There is a change data capture tool from wisdomforce.
http://www.wisdomforce.com/resources/docs/databasesync/DatabaseSyncBestPracticesforTeradata.pdf
It would probably work in this case.
Are triggers with stored procedures an option?
CREATE TRIGGER dbname.triggername
AFTER INSERT ON db_name.tbl_name
REFERENCING stored_procedure
Theoretically speaking, you can write external stored procedures which may call UDFs written in Java or C/C++ etc which can push the row data to your application in near real time.
