I have a query from a web site that takes 15-30 seconds, while the same query runs in 0.5 seconds from SQL Server Management Studio. I cannot see any locking issues using SQL Profiler, nor can I reproduce the delay manually from SSMS. A week ago, I detached and reattached the database, which seemed to miraculously fix the problem. Today, when the problem reared its ugly head again, I tried merely rebuilding the indexes. This also fixed the problem. However, I don't think it's necessarily an index problem, since the indexes wouldn't be automatically rebuilt on a simple detach/attach, to my knowledge.
Any idea what could be causing the delay? My first thought was that perhaps some parameter sniffing on the stored procedure being called (said stored proc runs a CTE, if that matters) was causing a bad query plan, which would explain the intermittent nature of the problem. Since both detaching / reattaching and an index rebuild should theoretically invalidate the cached query plan, this makes sense, but I'm unsure how to verify this. Additionally, why wouldn't the same query (copied directly from SQL Profiler with the exact same parameters) exhibit the same delay when run manually through SSMS?
Any thoughts?
I know I am weighing in on this topic very late, but I wanted to post a solution that I found when having a similar issue. In brief, adding the SET ARITHABORT ON command at the outset of my procedures brought website query performance in line with the performance seen from SQL Server tools. This option is typically set on the connection when you run a query from Query Analyzer or SSMS (you can change that option, but it is the default).
In my case, I had about 15 different stored procs doing mathematical aggregates (SUMs, COUNTs, AVGs, STDEVs) across a fairly sizeable set of data (10s to 100s of thousands of rows) - adding the SET ARITHABORT ON option moved them all from running in 3-5 seconds each to 20-30ms.
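For anyone wanting to try this, the change is just one line at the top of the procedure body. The procedure and table names below are made up for illustration; only the SET statement is the point:

CREATE PROCEDURE dbo.GetDailyAggregates
    @StartDate datetime,
    @EndDate   datetime
AS
BEGIN
    -- Match the SET option SSMS uses by default, so the web app and SSMS
    -- end up compiling against the same setting.
    SET ARITHABORT ON;
    SET NOCOUNT ON;

    SELECT COUNT(*)      AS RowCnt,
           SUM(Amount)   AS TotalAmount,
           AVG(Amount)   AS AvgAmount,
           STDEV(Amount) AS StdevAmount
    FROM   dbo.Transactions
    WHERE  TransactionDate >= @StartDate
      AND  TransactionDate <  @EndDate;
END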
So, hopefully that helps someone else out there.
If a bad plan is cached then the same bad plan should be used from SSMS too, if you run the very same query with identical arguments.
There cannot be a better solution than finding the root cause. Trying to peek and poke at various settings in the hope that one of them fixes the problem will never give you the confidence that it is actually fixed. Besides, next time the system may have a different problem, and you'll believe this same problem has re-surfaced and apply a bad solution.
The best thing to try is to capture the bad execution plan. The Showplan XML event class in Profiler is your friend; with it you can get the plan of the ADO.NET call. This is a very heavy event, so you should attach Profiler and capture it only when the problem manifests itself, in a short session.
Query IO statistics can also be of help. The RPC:Completed and SQL:BatchCompleted events both include Reads and Writes, so you can compare the amount of logical IO performed by the ADO.NET invocation vs. the SSMS one. A large difference (for exactly the same query and parameters) indicates different plans. sys.dm_exec_query_stats is another avenue of investigation: you can find your query plan(s) in there and inspect the execution stats.
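For example, something along these lines pulls the cached plan(s) and their stats for a given procedure (the LIKE filter is just a placeholder for your own procedure name):

SELECT TOP (20)
       qs.execution_count,
       qs.total_logical_reads,
       qs.total_worker_time,
       st.text       AS query_text,
       qp.query_plan
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle)    AS st
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) AS qp
WHERE st.text LIKE '%YourProcName%'
ORDER BY qs.total_logical_reads DESC;

If the same statement shows up more than once, you are probably looking at plans compiled under different SET options (for example ARITHABORT differing between the web app and SSMS).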
All these should help establish with certitude if the problem is a bad plan or something else, to start with.
I have been having the same problem.
The only way I can fix this is by setting ARITHABORT ON,
but unfortunately when it occurs again I have to set ARITHABORT OFF.
I have no clue what ARITHABORT has to do with this, but it works. I have been having this problem for over 2 years now with still no solution. The databases I am working with are over 300 GB, so maybe it is a size issue...
The closest I got to resolving this problem was from an earlier post:
Google Groups post
Let me know if you have managed to completely solve this problem as it is very frustrating!
Is it possible that your ADO.NET query is running after the system has been busy doing other things, so that the data it needs is no longer in RAM? And when you test on SSMS, it is?
You can check for that by running the following two commands from SSMS before you run the query:
CHECKPOINT
DBCC DROPCLEANBUFFERS
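-- CHECKPOINT flushes dirty pages to disk so they can be discarded;
-- DBCC DROPCLEANBUFFERS then empties the buffer cache, so the next run
-- starts from a cold cache (avoid doing this on a production server).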
If that causes the SSMS query to run slowly, then there are some tricks you can play on the ADO.NET side to help it run faster.
Simon Sabin has a great session on "when a query plan goes wrong" ( http://sqlbits.com/Sessions/Event5/When_a_query_plan_goes_wrong ) that discusses how to address this issue within procs by using various "optimize for" hints and such, to help a proc generate a consistent plan rather than relying on default parameter sniffing.
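For reference, an OPTION (OPTIMIZE FOR ...) hint looks roughly like this inside a procedure; the table, column and value are invented for illustration:

SELECT o.OrderId, o.Total
FROM dbo.Orders AS o
WHERE o.CustomerId = @CustomerId
-- Compile the plan for a representative value instead of whatever
-- parameter value happened to be sniffed on the first execution.
OPTION (OPTIMIZE FOR (@CustomerId = 42));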
However, I've got an issue with an ad-hoc query (not in a proc) where the SSMS plan and the ASP plan are exactly the same - clustered index / table scan - and yet the ASP query takes 3+ minutes instead of 1 second. (In this case the table scan happens to be a decent answer for fetching the results.)
Anyone care to explain that one?
Many times I've seen queries stuck in the "Opening Tables" state on MariaDB. However, I couldn't find any documentation on it.
When does this state occur?
If this state persists for a long time (say 30 seconds), is it bad? If it is, what could be causing it?
According to the MariaDB documentation, opening tables should be a fast process and should not get stuck unless it is blocked by a lock. For example, an ALTER query can hold a lock that prevents the table from being opened. You should also consider checking whether your table_open_cache value is large enough.
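To check that, something along these lines works (standard server variables and status counters, nothing specific to your setup is assumed):

SHOW GLOBAL VARIABLES LIKE 'table_open_cache';
SHOW GLOBAL STATUS LIKE 'Opened_tables';
-- If Opened_tables keeps growing quickly relative to your query volume,
-- the cache is likely too small and tables are constantly being re-opened.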
Refer to the MariaDB manual on thread states.
I'm using a simple SQLite DB as a persistent message-queueing mechanism between processes. To reduce the file size after it exceeds a certain limit, I wanted to use the "vacuum" command. Generally all this works nicely, except that every now and then I get a "database is locked" error when vacuuming.
After reading through various resources on the web I understand that there is nothing I can do on sqlite-level.
However, besides the side questions ("Why is that the case? What would be the problem with retrying to obtain the required lock using the regular busyHandler mechanism?"), I came up with the idea of just implementing the very same busyHandler mechanism myself at the application level.
Now the essential question: Anything wrong with this?
Many thx!!
There isn't really much of a difference between using SQLite's built-in retrying mechanism, and implementing it yourself.
However, writing code that you do not need to write is useless effort and increases the chance of introducing bugs.
After working on this for another while, I changed my application logic: I removed the vacuuming altogether and instead use "pragma max_page_count" to limit the DB size. The only thing to watch out for is that this setting is per-connection, i.e. the pragma must be set again after each DB connect.
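For reference, the pragmas involved look like this (the numbers are only examples; max_page_count is measured in pages, so the byte limit is pages times page_size):

PRAGMA busy_timeout = 5000;      -- retry for up to 5 seconds when the DB is locked
PRAGMA page_size;                -- check the page size (commonly 4096)
PRAGMA max_page_count = 25600;   -- e.g. 25600 * 4096 bytes = 100 MiB cap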
Regarding the original problem: I found that @CL was mostly right - vacuuming indeed involves the standard busy handler. On Linux this also works nicely; on Windows, though, it does not work reliably (that might be by accident and rather caused by system speed, ...). Using a little printf() in my custom busyHandler I can see that it is invoked in most cases, but sometimes it is not and the VACUUM just returns "DB locked".
Possibly this is caused by concurrent processes trying to vacuum at the same time (?). In any case, the reworked design is much cleaner/easier.
Here's the scenario...
We have an internal website that is running the latest version of the ODAC (Oracle Client). It opens database connections, runs a stored procedure or packaged method, then disconnects. Connection pooling is turned on, and we are currently under version 11g in both our development and test environments, but under 10gR2 in our production environment. This happens on Production.
A few days ago, a process began firing off an ORA-02020 error (too many database links in use). The process is called from a webpage on our internal website. The user simply sets a date, hits a button, and a job is started on another system that is separate from the website. The call itself, however, uses a database link to run a function.
We've scoured the SQL to find that it only uses that one database link. And since these links are on a per-session basis and the user isn't exceeding the default limit of 4, how is it possible that we are getting an ORA-02020 error?
We have run a number of tests to try to push past the default limit of 4. ODAC, from what I recall, issues a commit after each connection, and I can't produce any errors by running a query with 4 DB links and then a piece of SQL with 1 DB link directly after. The only way I can bring up this error is by running a query with 4 DB links followed by a function or piece of dynamic SQL with a database link within it. That doesn't match our situation, since this issue is sporadic; it isn't ALWAYS happening.
Questions
Is it possible that connection pooling is allowing User B to use User A's connection after the initial process was run, thus adding to the open links number if User B runs a SQL statement with more database links?
Is this a scenario where we should up our limit past 4? What are the disadvantages of increasing the number?
Do I need to explicitly close open database links before disconnecting from the database? Oracle documentation seems to suggest it should automatically happen, but "on occasion"... doesn't.
Firstly, the simple solution: I'd double check that in the production database the limit on open links is actually 4.
select *
from v$system_parameter
where name = 'open_links'
Assuming you're not going to get off that lightly:
Is it possible that connection pooling is allowing User B to use User
A's connection after the initial process was run, thus adding to the
open links number if User B runs a SQL statement with more database
links?
You say that you explicitly close the session, which, according to the documentation, should mean that all links associated with that session are closed. Other than that I confess complete ignorance on this point.
Is this a scenario where we should up our limit past 4? What are the
disadvantages of increasing the number?
There aren't any disadvantages that I can think of. Tom Kyte suggests, albeit a long time ago, that each open database link uses 500k of PGA memory. If you don't have any to spare then this will obviously cause a problem, but it should be more than fine for most situations.
There are, however, unintended consequences: imagine that you up this number to 100. Somebody codes something that continually opens links and draws a lot of data through all of them, select * from my_massive_table or similar. Instead of 4 links doing this you have 100, attempting to transfer hundreds of gigabytes simultaneously. Your network dies under the strain...
There's probably more but you get the picture.
Do I need to explicitly close open database links before disconnecting
from the database? Oracle documentation seems to suggest it should
automatically happen, but "on occasion"... doesn't.
As you've noted, the best answer is "probably not", which isn't much help. You don't mention exactly how you're terminating the session, but if you're killing it rather than closing it gracefully then you definitely should.
Using a database link spawns a child process on the remote server. Because your server is no longer in absolute charge of this process there's a myriad of things that could cause it to become orphaned or otherwise not close on termination of the parent process. By no means does this happen the whole time but it can and does.
I would do two things.
In your process, if an exception is encountered, e-mail the results of the following query to yourself.
select *
from v$dblink
At a minimum you will know what database links are open in the session, which gives you some way of tracing them.
Follow the documentation's advice; specifically the following:
"You may have occasion to close the link manually. For example, close
links when:
The network connection established by a link is used infrequently in an application.
The user session must be terminated."
The first seems to exactly fit your situation. Unless your process is time-sensitive, which doesn't seem to be the case, then what have you got to lose? The syntax is:
alter session close database link <linkname>
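One caveat worth noting (from general experience rather than the documentation quoted above): a link cannot be closed while a transaction that used it is still open, so a commit or rollback has to come first. The link name below is a placeholder:

commit;
alter session close database link MY_REMOTE_DB;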
We ended up increasing the link amount, but we never did find the root cause.
My client is asking for a "suggestion" based lookup to be added to a particular form field.
In other words, as you start typing into a field there should be a "Google style" popup which suggests possible results to select from. There will be in the order of "tens of thousands" of possible suggestions - this is the best estimate I currently have on the quantity.
Using AJAX to send the request and retrieve the result, my question is whether it is better to store ALL the suggestions in the .NET cache and do the matching there, or to run a stored-procedure-based query against SQL Server for each request?
This would be using .NET 2.0 and SQL Server 2005
There is one trick I use every time when faced with such a task: do not search on every keystroke. Put the launching of the search on a sliding timeout.
The intent here is to launch the search only when the user pauses in his/her typing. Usually I set the timeout at 0.1 to 0.2 seconds. Psychologically it still feels instantaneous, but it considerably reduces the load on whatever you use to search.
Your bottleneck will be transporting the data from the server to the browser. You can easily produce a large result from the database in almost no time at all, but it takes forever to return it to the browser.
You will have to find a way to limit the result that the server returns, so that you only fetch what the user has to see right now.
When you have the data traffic down to reasonable level, you can start looking at optimising the server part. With some caching and logic that should be easy, considering that when the user types the result is often a subset of the previous result, e.g. the match for "hat" is a subset of the match for "ha".
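Concretely, the server-side query only needs to return a handful of rows. A minimal sketch, assuming a hypothetical dbo.Suggestions table and the typed text passed in as @prefix:

SELECT TOP (10) s.SuggestionText
FROM dbo.Suggestions AS s
WHERE s.SuggestionText LIKE @prefix + '%'   -- prefix match can seek on an index over SuggestionText
ORDER BY s.SuggestionText;

With an index on SuggestionText this stays cheap even for tens of thousands of rows, and the payload sent back to the browser stays tiny.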
When I've seen "suggest as you type" searches done in SQL Server environments, the best performance has come from some sort of caching mechanism, typically a distributed one like memcached. Even if your code is well optimized and your database well tuned, and the call, processing and return take <= 10 ms, that is still 10 ms per keystroke as they type.
It depends on the number of items. If you can cache the items in a .NET process without running out of memory this will definitely be faster.
But if that can't be done you are stuck with accessing the database on each request. A stored procedure would be a nice way to do that.
But there are some design patterns which can guide you. I've used the following two while developing something similar to what you are doing.
Submission throttling
Browser side cache
I run a site with decent traffic (~100,000 page views per day) and sporadically the site has been brought to its knees due to SQL Server timeout errors.
When I run SQL Profiler, I see a command getting called hundreds of times a second like this:
...
exec dbo.TempGetStateItemExclusive3 @id=N'ilooyuja4bnzodienj3idpni4ed2081b',...
...
We use SQL Server to store ASP.NET session state. The above is the stored procedure called to grab the session state for a given session. It seems to be looping, asking for the same 2 or 3 sessions over and over.
I found a promising-looking hotfix that seems to address this exact situation, but it doesn't seem to have solved the problem for us. (I'm assuming this hotfix is included in the most recent .NET service pack, because it doesn't look like you can install it directly anymore.) I added that registry key manually, but we still see the looping stored procedure calls like the above (requesting the same session much more often than every 500 ms).
I haven't been able to recreate this on a development machine. When two requests are made for the same session ID, it seems to block correctly and doesn't hammer SQL until the first page releases the session.
Any ideas? Thank you in advance!!!
This may be one of those cases where I needed an answer to a different question. The question should have been "Why am I using SQL Server to store session state information?" SQL Server is much slower, and much more disconnected from the web server, both of which may have contributed to this problem. I looked up the size of our ASPStateTempSessions table and realized it was only about 1 MB. We moved back to <sessionState mode="InProc" ... /> and the problem is fixed (and the site runs faster).
The next step, when traffic dictates, would be to add more servers and use "StateServer" mode so we can spread out the memory usage.
I think I originally made this move to deal with a memory bottleneck which is no longer an issue. (This is not a good solution for dealing with a memory bottleneck, FYI!)
IMPORTANT EDIT: OK, so it turns out that the whole "TempGetStateItemExclusive" thing was not the problem, it was just a symptom of another problem. We had some queries that were causing blocking issues, so SQL requests kept timing out. The actual fix was to identify and resolve the blocking issues. (I still believe that "InProc" is the way to go, though.) This link helped a lot in identifying our issues:
http://www.simple-talk.com/sql/sql-tools/how-to-identify-blocking-problems-with-sql-profiler/
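For anyone chasing the same thing, a quick way to see who is blocking whom at any given moment is a sketch like the following, using the standard DMVs (this is separate from the Profiler approach in the linked article):

SELECT r.session_id,
       r.blocking_session_id,
       r.wait_type,
       r.wait_time   AS wait_time_ms,
       st.text       AS running_sql
FROM sys.dm_exec_requests AS r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS st
WHERE r.blocking_session_id <> 0;   -- only show requests that are currently blocked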
It's been some time, but isn't there a cleanup job that runs to remove stale sessions? Is it enabled?
This old KB mentions it. Like I said, it's been a while.
Just out of curiosity. Have you opened up that proc to see what it does?
If it's just making a select statement, you might look to see if it is using NOLOCK or not. If not, add NOLOCK to it and see what happens.
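For reference, the hint goes on each table reference, roughly like this (the table and column names are what the ASP.NET session state schema typically looks like, so treat them as an assumption; dirty reads are the trade-off):

SELECT Expires, SessionItemShort, LockCookie
FROM dbo.ASPStateTempSessions WITH (NOLOCK)
WHERE SessionId = @id;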