A few days ago I had a hard time.
I had developed an application for the online admission process for college students, and it was quite successful.
Let me come to the problem I faced.
Two tables were involved: Student_AdmissionDetails (about 30-35 fields, most of them nvarchar(70)) and StudentCategory.
A few days after the admission process started, Student_AdmissionDetails had about 300,000 records and StudentCategory had 4 records.
I had developed a dashboard that was supposed to show the number of students who applied in each category, and to achieve this I had the following query:
Select count(*)
from Student_AdmissionDetails
inner join StudentCategory
on Student_AdmissionDetails.Id=StudentCategory.Id
where CategoryTypeName=#ParameterValue
The above query gets fired three times on a single page, and there were 250-300 users accessing that page simultaneously. Along with that, 1,300-2,000 students were filling in the admission form at the same time.
The problem I got was that when I ran the above query in SQL Server, it only ran successfully about one time out of five; the rest of the time it threw an error saying that a deadlock had occurred while accessing an object in memory (forgive me for not remembering the exact error).
What I'm looking for from this post is:
This time I was a bit lucky that my code didn't make anyone unhappy, but can anyone tell me what can be done to overcome such a scenario? What is the best way to handle large databases?
I tried to figure it out with SQL Profiler, but since there were five more applications similar to mine running, I was not able to find out how many users were trying to access the same resource.
I guess the following points will be helpful for answering my question:
The application server and the DB server are separate machines.
The DB server was running on Windows XP (I guess!) and had 128 GB of RAM.
When I executed the query directly on the SQL Server, it took an average of 12-15 seconds.
Apologies for writing such a long post, but I really need help to learn this :)
Try updating your SELECT statement to add WITH (NOLOCK). This will make your results less precise, but that seems to be enough for your dashboard.
Also, it's better to filter on something like an integer CategoryTypeId than on CategoryTypeName in the WHERE clause.
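For example, something along these lines (the CategoryTypeId column and parameter name are assumptions; use whatever integer key your StudentCategory table actually has):
SELECT COUNT(*)
FROM Student_AdmissionDetails WITH (NOLOCK)
INNER JOIN StudentCategory WITH (NOLOCK)
    ON Student_AdmissionDetails.Id = StudentCategory.Id
WHERE StudentCategory.CategoryTypeId = @CategoryTypeId --integer filter instead of the nvarchar name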
Related
I have a drop-down menu that fills 4 DataGridViews based on the branch selected, or, when the start button is pressed, loops through 80 branches.
There are 4 SQL Server procs, one per DataGridView, each against its own SQL table, read access only.
I need to run multiple copies of the site against a single URL.
Database retrieval time = number of copies run (the single ASP.NET website on one URL, called multiple times) * database runtime.
So if data retrieval takes 30 seconds, running 3 copies takes 90 seconds and seems to fragment the data or time out.
I'm using NOLOCK hints, so there isn't deadlocking.
But I need to optimize this.
Should I create one web service, and would that solve the problem by hitting the database only once instead of once per URL hit?
Thank you.
David
Thank you. The timer was taking over and behaving differently on the server than on my local machine, and the UI, timer, and database were out of sync. Adding a Thread.Sleep helped, as did a longer interval on the timer. Also, putting all the database calls together, instead of one connection per database call, helped. Now it all runs at the same time.
The main takeaway, I think, is that the timer and the Thread.Sleep made the biggest difference.
I also had a UI button, to which I added some code so that once it's pushed, pushing it again does nothing.
Thank you to everyone who posted an answer.
Well, this comes down not so much to the number of records pulled, but to whether you are executing multiple SQL statements over and over.
I mean, filling 4 grids with 4 queries? That's going to be pretty much instant, assuming the result set for each grid is, say, in the 100-row range. Such a button click that fills the grids should take very little time.
And even if you are using a row DataBound event, once again it will run fast, but ONLY if you are not executing a whole bunch of additional SQL queries. So the number of "hits", i.e. SQL statements sent to the database, is what for the most part determines the speed of this setup.
So say you have one grid that pulls 100 rows, but the next grid needs data based on 100 "new" SQL queries, one per row. In that case, what you can often do is fill one recordset with all of the child data and filter against that recordset, and thus execute only 1 query instead of 100.
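As a rough sketch of that idea (table, column, and variable names are invented for illustration, and it assumes the usual System.Data / System.Data.SqlClient namespaces plus existing connectionString, branchId, and parentTable objects):
// Load ALL of the child rows once, then filter them in memory for each parent row
// instead of issuing one SQL query per row.
DataTable childData = new DataTable();
using (SqlConnection conn = new SqlConnection(connectionString))
using (SqlDataAdapter da = new SqlDataAdapter(
    "SELECT ParentId, Amount FROM ChildTable WHERE BranchId = @BranchId", conn))
{
    da.SelectCommand.Parameters.AddWithValue("@BranchId", branchId);
    da.Fill(childData); // one round trip to the database
}

DataView childView = new DataView(childData);
foreach (DataRow parentRow in parentTable.Rows)
{
    childView.RowFilter = "ParentId = " + (int)parentRow["Id"];
    // bind or aggregate childView here - no further SQL is executed
}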
So, this will really come down to how many separate SQL queries you execute in total.
To fill 4 grids with 4 queries? I don't see that as a problem, so we are no doubt missing some big "whopper" of a detail you haven't shared with us.
Expand your question to say how many SQL statements are generated in total - that's the bottleneck here. Reduce that, and your performance should be just fine.
And if the 4 simple stored procedures have cursors that loop and generate many more SQL commands, get rid of them.
Pulling 4 basic SQL queries is nothing - something else is at work that you are not sharing. Why would each single stored procedure take so long? That's the detail we are missing here.
We have a web application built using ASP.NET 4.0 (C#), and we are using SQL Server 2005 as the backend.
The application itself is a workflow engine where each record is attested by 4 role bearers over 18 days each month.
We have roughly 200k records, which arrive on the 1st of each month.
During those 18 days, some people are viewing and attesting records while the system admin might be changing the ownership of those same records.
My question, or worry, is that we often get deadlocks in the database.
A user may have 10,000 records in their kitty and try to attest all of them in one go, while the system admin may also be changing ownership in bulk for a few thousand records, and at that point we get a deadlock. Even when two or more users with loads of records try to attest, we get deadlocks.
We are making extensive use of stored procedures with transactions. Is there a way to code for such situations, or to simply avoid deadlocks?
Apologies for asking in such a haphazard manner, but any hints or tips are welcome, and if you need more info to understand the issue, let me know.
thanks
A few suggestions:
1) Use the same order for reading/writing data from/into tables.
Example #1 (read-write deadlock): Avoid creating a stored procedure usp_ReadA_WriteB that reads from A and then writes into B and another stored procedure usp_ReadB_WriteA that reads from B and then writes into A. Read this blog post please.
Example #2 (write-write deadlock): Avoid creating a stored procedure usp_WriteA_WriteB that writes data into table A and then into table B and another stored procedure usp_WriteB_writeA that writes data into the same tables: table B and then into table A.
2) Minimize the duration of transactions. Minimize the number of affected rows to reduce the number of locks, and watch out for the 5,000-lock threshold at which lock escalation kicks in (see the batching sketch after this list).
3) Optimize your queries. For example, look for [Clustered] {Index|Table} Scan, {Key|RID} Lookup and Sort operators in execution plans. Use indexes, but also try to minimize the number of indexes and the size of every index (start by minimizing the size of the index key). Read this blog post, please.
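As a rough illustration of point 2 (the procedure, table, and column names are made up, and the batch size is only an example), doing a bulk ownership change in small batches keeps each transaction short and under the escalation threshold:
--Hypothetical sketch: reassign ownership in small batches instead of one huge transaction
CREATE PROCEDURE usp_ReassignOwnerInBatches
    @OldOwnerId int,
    @NewOwnerId int
AS
BEGIN
    DECLARE @BatchSize int, @Rows int;
    SET @BatchSize = 2000;  -- keeps each transaction well under the ~5000-lock escalation point
    SET @Rows = @BatchSize;

    WHILE @Rows = @BatchSize
    BEGIN
        BEGIN TRANSACTION;

        UPDATE TOP (@BatchSize) dbo.WorkflowRecords  -- illustrative table/column names
        SET OwnerId = @NewOwnerId
        WHERE OwnerId = @OldOwnerId;

        SET @Rows = @@ROWCOUNT;  -- capture before COMMIT, which resets @@ROWCOUNT

        COMMIT TRANSACTION;
    END
END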
Hi guys,
I've developed a web application using ASP.NET and SQL Server 2005 for an attendance management system. As you would expect, attendance is recorded daily. Inserting records one by one is a bad idea, I know. My questions are:
Is SqlBulkCopy the only option for me when using SQL Server, given that I want to insert 100 records on a click event (i.e. inserting attendance for a class that contains 100 students)?
I want to insert the attendance of classes one by one.
Unless you have a particularly huge number of attendance records you're adding each day, the best way to do it is with insert statements (I don't know exactly why you've got it into your head that this is a bad idea; our databases frequently handle tens of millions of rows being added throughout the day).
If your attendance records are more than that, you're on a winner, getting that many people to attend whatever functions or courses you're running :-)
Bulk copies and imports are generally meant for transferring sizable quantities of data, and I mean sizable as in the entire contents of a database to a disaster recovery site (and other things like that). I've never seen them used in the wild as a way to get small amounts of data into a database.
Update 1:
I'm guessing based on the comments that you're actually entering the attendance records one by one into your web app and 1,500 is taking too long.
If that's the case, it's not the database slowing you down, nor the web app. It's how fast you can type.
The solution to that problem (if indeed it is the problem) is to provide a bulk import functionality into your web application (or database directly if you wish but you're better off in my opinion having the application do all the work).
This is of course assuming that the data you're entering can be accessed electronically. If all you're getting is pieces of paper with attendance details, you're probably out of luck (OCR solutions notwithstanding), although if you could get multiple people doing it concurrently, you may have some chance of getting it done in a timely manner. Hiring 1,500 people to do one each should knock it over in about five minutes :-)
You can add functionality to your web application to accept the file containing attendance details and process each entry, inserting a row into your database for each. This will be much faster than manually entering the information.
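If you go that route, a minimal sketch might look something like this (the file format, table name, and column names are assumptions for illustration only, and the usual System, System.IO, System.Data, and System.Data.SqlClient namespaces are assumed):
// Hypothetical import: each line of the file is "StudentId,Date,Present"
// and rows go into an assumed Attendance(StudentId, AttendanceDate, Present) table.
internal static void ImportAttendanceFile(string path, string connectionString)
{
    using (SqlConnection conn = new SqlConnection(connectionString))
    {
        conn.Open();
        using (SqlTransaction tran = conn.BeginTransaction())
        using (SqlCommand cmd = conn.CreateCommand())
        {
            cmd.Transaction = tran;
            cmd.CommandText =
                "INSERT INTO Attendance (StudentId, AttendanceDate, Present) " +
                "VALUES (@studentId, @date, @present)";
            cmd.Parameters.Add("@studentId", SqlDbType.Int);
            cmd.Parameters.Add("@date", SqlDbType.DateTime);
            cmd.Parameters.Add("@present", SqlDbType.Bit);

            foreach (string line in File.ReadAllLines(path))
            {
                string[] parts = line.Split(',');
                cmd.Parameters["@studentId"].Value = int.Parse(parts[0]);
                cmd.Parameters["@date"].Value = DateTime.Parse(parts[1]);
                cmd.Parameters["@present"].Value = parts[2] == "1";
                cmd.ExecuteNonQuery();
            }

            tran.Commit(); // one transaction for the whole file keeps it fast
        }
    }
}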
Update 2:
Based on your latest information that it's taking too long to process the data after starting it from the web application: I'm not sure how much data you have, but 100 records should basically take no time at all.
Where the bottleneck is I can't say, but you should be investigating that.
I know in the past we've had long-running operations from a web UI where we didn't want to hold up the user. There are numerous solutions for that, two of which we implemented:
take the operation off-line (i.e., run it in the background on the server), giving the user an ID to check on the status from another page.
the same thing, but notify the user by email once it's finished.
This allowed them to continue their work asynchronously.
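As a very rough sketch of the first option (all names invented; assumes .NET 3.5+ and the System.Collections.Generic / System.Threading namespaces; note that in-process background work can be lost if the app pool recycles):
// Queue the long-running work on a background thread and hand the user a job ID
// they can poll from a status page.
internal static class JobRunner
{
    private static readonly Dictionary<Guid, string> Status = new Dictionary<Guid, string>();
    private static readonly object Sync = new object();

    public static Guid Start(Action work)
    {
        Guid jobId = Guid.NewGuid();
        lock (Sync) { Status[jobId] = "Running"; }

        ThreadPool.QueueUserWorkItem(delegate
        {
            string result;
            try { work(); result = "Done"; }
            catch (Exception ex) { result = "Failed: " + ex.Message; }
            lock (Sync) { Status[jobId] = result; }
        });

        return jobId;
    }

    // The status page calls this with the ID handed back by Start().
    public static string GetStatus(Guid jobId)
    {
        lock (Sync)
        {
            string status;
            return Status.TryGetValue(jobId, out status) ? status : "Unknown";
        }
    }
}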
Ah, with your update I believe the problem is that you need to add a bunch of records after some click, but it takes too long.
I suggest one thing that won't help you immediately:
Reconsider your design slightly, as this doesn't seem particularly great (from a DB point of view). But that's just a general guess; I could be wrong.
The more helpful suggestion is:
Do this offline (via a Windows service or similar).
If it's taking too long, you want to do it asynchronously and then later inform the user that the operation has completed. They probably don't even need to be around; you just don't let them use whatever functions need that data until it's completed. Hope that idea makes sense.
The fastest general way is to use a parameterized command with ExecuteNonQuery, wrapped in a single transaction:
internal static void FastInsertMany(DbConnection cnn)
{
    using (DbTransaction dbTrans = cnn.BeginTransaction())
    {
        using (DbCommand cmd = cnn.CreateCommand())
        {
            // Associate the command with the transaction (required by some
            // providers, e.g. System.Data.SqlClient).
            cmd.Transaction = dbTrans;

            // One parameterized INSERT, reused for every row.
            // Note: the parameter placeholder syntax is provider-specific
            // ("?" for OLE DB/ODBC/SQLite, "@name" for SQL Server).
            cmd.CommandText = "INSERT INTO TestCase(MyValue) VALUES(?)";
            DbParameter Field1 = cmd.CreateParameter();
            cmd.Parameters.Add(Field1);

            for (int n = 0; n < 100000; n++)
            {
                Field1.Value = n + 100000;
                cmd.ExecuteNonQuery();
            }
        }

        // Committing once at the end is what makes this fast:
        // one transaction for all inserts, not one per insert.
        dbTrans.Commit();
    }
}
Even on a slow computer this should take far less than a second for 1500 inserts.
[reference]
My web site has a city, state, and zip code autocomplete feature.
If a user types the first 3 characters of a city into the textbox, the top 20 cities starting with those characters are shown.
As of now, the autocomplete method in our application queries a SQL Server 2005 database which has around 900,000 records covering city, state, and zip.
But the response time to show the city list is very slow.
Hence, for performance optimization, is it a good idea to store the location data in a Lucene index, or maybe in Active Directory, and then pull the data from there?
Which one would be faster, Lucene or Active Directory? And what are the pros and cons of each? Any suggestions, please?
Thanks a bunch!
Taking a nuclear option (like changing backing data stores) probably shouldn't be the first option. Rather, you need to look at why the query is performing so slowly. I'd start with looking at the query performance in SQL Profiler and the execution plan in Sql Management Studio and see if I am missing anything stupid like an index. After you cover that angle, then check the web layer and ensure that you are not sending inordinate amounts of data or otherwise tossing a spanner in the works. Once you have established that you aren't killing yourself in the db or on the wire, then it is time to think about re-engineering.
On a side note, my money would be on Sql Server handling the data end of this task better than either of those options. Lucene is better suited for full-text searches and AD is a poor database at best.
I would cache the data into a separate table. Depending on how fresh you need that data to be, you can rebuild it as often as necessary.
--Create the table
SELECT DISTINCT city, state, zip INTO myCacheTable FROM theRealTable
--Rebuild the table anytime
TRUNCATE TABLE myCacheTable
INSERT INTO myCacheTable (city, state, zip) SELECT DISTINCT city, state, zip FROM theRealTable
Your AJAX calls can access myCacheTable instead, which will have far fewer rows than 900k.
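If the autocomplete filters on the first few characters of the city, an index on that column should help the cache table as well (a sketch; @prefix stands in for whatever the user has typed, so adapt it to your actual query):
--Index to support prefix searches
CREATE INDEX IX_myCacheTable_city ON myCacheTable (city)
--Example autocomplete query against the cache table
SELECT TOP 20 city, state, zip
FROM myCacheTable
WHERE city LIKE @prefix + '%'
ORDER BY city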
Adding to what Wyatt said, you first need to figure out which area is slow. Is the SQL query slow, or is the network connection between the browser and the server slow? Or is it something else?
And I completely agree with Wyatt that SQL Server is much more suitable for this task than Lucene or Active Directory.
Context
My current project is a large-ish public site (2 million pageviews per day) running a mixture of classic ASP and ASP.NET with a SQL Server 2005 back-end. We're heavy on reads, with occasional writes and virtually no updates/deletes. Our pages typically concern a single 'master' object with a stack of dependent (detail) objects.
I like the idea of returning all the data required for a page in a single proc (and absolutely no unnecessary data). True, this requires a dedicated proc for such pages, but some pages receive double-digit percentages of our overall site traffic, so it's worth the time/maintenance hit. We typically only consume multiple recordsets from our .NET code, using System.Data.SqlClient.SqlDataReader and its NextResult method. Oh, yeah, I'm not doing any updates/inserts in these procs either (except to table variables).
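A simplified sketch of that consumption pattern (the proc, parameter, and column names here are illustrative, not our real schema, and the usual System.Data / System.Data.SqlClient namespaces are assumed):
// Consume a proc that returns a master result set followed by a detail result set.
using (SqlConnection conn = new SqlConnection(connectionString))
using (SqlCommand cmd = new SqlCommand("dbo.GetPageData", conn))
{
    cmd.CommandType = CommandType.StoredProcedure;
    cmd.Parameters.AddWithValue("@MasterId", masterId);

    conn.Open();
    using (SqlDataReader reader = cmd.ExecuteReader())
    {
        // First result set: the master object.
        while (reader.Read())
        {
            string title = reader.GetString(reader.GetOrdinal("Title"));
        }

        // Second result set: the dependent detail rows.
        reader.NextResult();
        while (reader.Read())
        {
            int detailId = reader.GetInt32(reader.GetOrdinal("DetailId"));
        }
    }
}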
The question
SQL Server (2005) procs which return multiple recordsets are working well for us so far (in prod), but I am a little worried that multi-recordset procs are my new favourite hammer that I'm hitting every problem (nail) with. Are there any multi-recordset SQL Server proc gotchas I should know about? Anything that's going to make me wish I hadn't used them? Specifically, anything about them affecting connection pooling, memory utilization, etc.
Here are a few gotchas for multiple-recordset stored procs:
They make it more difficult to reuse code. If you're doing several queries, odds are you'd be able to reuse one of those queries on another page.
They make it more difficult to unit test. Every time you make a change to one of the queries, you have to test all of the results. If something changed, you have to dig through to see which query failed the unit test.
They make it more difficult to tune performance later. If another DBA comes in behind you to help performance improve, they have to do more slicing and dicing to figure out where the problems are coming from. Then, combine this with the code reuse problem - if they optimize one query, that query might be used in several different stored procs, and then they have to go fix all of them - which makes for more unit testing again.
They make error handling much more difficult. Four of the queries in the stored proc might succeed, and the fifth fails. You have to plan for that.
They can increase locking problems and incur load in TempDB. If your stored procs are designed in a way that need repeatable reads, then the more queries you stuff into a stored proc, the longer it's going to take to run, and the longer it's going to take to return those results back to your app server. That increased time means higher contention for locks, and the more SQL Server has to store in TempDB for row versioning. You mentioned that you're heavy on reads, so this particular issue shouldn't be too bad for you, but you want to be aware of it before you reuse this hammer on a write-intensive app.
I think multi-recordset stored procedures are great in some cases, and it sounds like yours may be one of them.
The bigger (more traffic) your site gets, the more that 'extra' bit of performance is going to matter. If you can combine 2, 3, or 4 calls to the database (and possibly new connections) into one, you could be cutting your database hits by 4, 6, or 8 million per day, which is substantial.
I use them sparingly, but when I have, I have never had a problem.
I would recommend having one outer stored procedure make several inner calls to stored procedures that each return one result set.
create proc foo
as
execute foobar --returns one result
execute barfoo --returns one result
execute bar --returns one result
That way, when requirements change and you only need the 3rd and 5th result sets, you have an easy way to invoke them without adding new stored procedures and regenerating your data access layer. My current app returns all reference tables (e.g. the US states table) whether I want them or not. The worst is when you need to get a reference table and the only access is via a stored procedure that also runs an expensive query as one of its six result sets.