IIS bottleneck? - asp.net

I have 3 machines - one which is IIS, one with a database and one from which I test the efficiency of my application - which means:
Using The Grinder I run 1000 instances of my application (hosted on the IIS and operating with the database on the machine with SQL Server). And using perfmon I observe that there really are 1000 requests.
BUT the problem is that connecting to this application (IIS) from another computer is very slow. I suppose there is some bottleneck on the IIS side but I cannot find it - CPU usage is less than 10%.
I think I changed every option in the IIS Manager and machine.config and web.config files - nothing seems to have any effect.

First thing is you need to confirm if you have a slowness issue while browsing the site
Check the IIS logs and look for the time-taken field. If the time taken is more than 10 seconds then it is considered as a Slowness.
The slowness might be because of several reasons. It might be because of the Network or might be because something in your code might be causing it.
I suggest you to capture a Network trace using Netmon or WireShark in case if its a Network related.
If its not Network you can collect a Process dump using Debug diag 2 update 2 tool.
You can check the below link to collect the dumps and check them and try to find out if there is any slowness:
https://msdn.microsoft.com/en-us/library/ff420662.aspx

Related

Extremely high TTFB when connecting to Oracle DB 11 via ODP.net

After mutiple days of searching Google, I still have the following issue:
I have an internal website, which is hosted on Win2012 server, and uses ODP.NET to connect to an oracle db.
The issue is that no matter what i do, no matter the IIS configuration or server load, the TTFB for loading ceratain pages is huge - more than 20s.
When the same site is hosted on a different host (test enviroment), the TTFB is <1s.
System setup - the productive server is win2012 server, the test sever is identical. Identical - same OS, same patch level ,same services, same setup.
Both machines are virtual, with the productive having 24GB RAM and 2xXeon 2.2Ghz
So far i have conducted the following test - changed the oracle host in tnsnames to the host IP - no effect. Stopped all services running on the server - no effect. System load does not at any time exceed 50% CPU time , 45% memory, 4Mbps / out of a 2x100Mbps. Disk I/O operations are acceptable - no more than 5MB/s at a maximum. ODP.NET trace shows gaps of
Slight improvement was noticed when the number of worker processes was increated from 1 do 3 , but he web site requires a repeat of the log on after the server changes the WP which currently handles the user.
In short- from server A the site loads a page for 25-30s, from server B (identical) the page loads in 1-2s.
My request is : would you please recommend where should i am my efforts, or is there a windows based tool which would allow me to break down the TTFB to a more detailed view than the one offered by chrome/IE.
Thank you kindly for your responses.
Start by enabling logging and see if there is anything that stands out:
<configuration>
<oracle.dataaccess.client>
<add key="TraceFileName" value="c:\temp\odpnet1.trc"/>
<add key="TraceLevel" value="63"/>
<add key="TraceOption" value="0"/>
</oracle.dataaccess.client>
The "certain pages" part throws me off a bit. Your test server is connected to the same database, yes? The test server consistently has better performance? In other words, you are sure this isn't related to query performance? Does the test server go thru the same networks to get to the server (tracert oracleserver)?

Troubleshooting an IIS .NET website outage

Last night one of the websites (.NET 4.0 forms) hosted on my Win 2008 R2 (IIS 7.5) Server started to time out throwing the following error for all connected users.
TYPE System.Web.HttpException
MESSAGE Request timed out.
DETAIL System.Web.HttpException (0x80004005): Request timed out.
The outage was confined to just one website within IIS, the others continued to work fine.
Unfortunately I was unable to identify why the website was timing out. Here are the steps I took:
First thing I did was look at the task manager which revealed normal CPU and memory usage. Network activity was also moderate.
I then opened IIS to look at the live connections under 'Worker Processes'. There were about 60 live connections, so it didn't look like anything DDoS related.
Checked database connectivity (hosted on a separate server), all fine!
I then reset the website on IIS. That didn't work
I tried to then do a complete iisreset...still no luck :(
In the end (and under some duress) the only thing I could think to do to resolve this was to restart the server.
Restarting the server worked but I am nervous not knowing why this happened in the first place. Can anyone recommend any checks that I failed to carryout? Is there an official checklist for working through these sorts of IIS problems? I have reviewed the IIS logs but don't see anything unusual on the run up to the outage.
Any pointers or links to useful resources to help me understand and mitigate against this in future will be much appreciated.
EDIT
The only time I logged into the server that day was to add an additional web handler component (for remote deploy) to IIS Web Deploy. I'm doubtful this caused the outage as the server worked for for 6 hours after.
Because iisreset didn't helped and you had to restart whole machine, I would suspect it was a global resources shortage and mostly used website (or most resource consuming) was impacted. It could be because of not available RAM, network connections congestion due to some malfunctioning calls (for example a lot of CLOSE_WAIT sockets exhausting connections pool, we've seen that in production because of malfunction of external service). It could be also one specific client problem, which was disconnected after machine restart so eventually the problem disappeared.
I would start from:
Historical analysis
review Event Viewer to see any errors/warnings from that period of time,
although you have already looked into IIS logs, I would do it once again with help of Log Parser Lizard to make some statistics like number of request per client, network bandwith per client, average response time per client and so on.
Monitoring
continuously monitor Performance Counters:
\Processor(_Total_)\% Processor Time,
\.NET CLR Exceptions(_Global_)\# of Exceps Thrown / sec,
\Memory\Available MBytes,
\Web Service(Default Web Site)\Current Connections (per each your site name),
\ASP.NET v4.0.30319\Request Wait Time,
\ASP.NET v4.0.30319\Requests Current,
\ASP.NET v4.0.30319\Request Queued,
\Process(XXX)\Working Set,
\Process(XXX)\% Processor Time (XXX per each w3wp process),
\Network Interface(XXX)\Bytes total / sec
run Performance Analysis of Logs (PAL) Tool during time of failure to make a very detailed analysis of performance counters data,
run netstat -ano to analyze network traffic (or TCPView tool even better)
If all this will not lead you to any conclusion, create a Debug Diagnostic rule to create a memory dump of the process for long running requests and analyze it with WinDbg and PSSCor extension for .NET debugging.

asp.net runs fast then slow on restart

Why is it that every time the server goes down, and asp.net restarts, the response time is SUPER FAST when it comes back up for a few minutes. I assume because everyone is off the server and I am one of the few (or only) people back on the server so quick?
I have discussed this with our developers and they say the response time is due to everyone on the server normally (200+ desktops) and when you are the only person on there, it flys. Really? Then does that mean we need newer, faster web servers?
I am not a programmer, but I think there may be two answers, one is what the devs say above is true, and two is the system is accumulating temp files of some sort and they get cleared out when the server crashes and then restarts.
How do we prove who might be right? Where does one start to look for asp.net bottlenecks?
windows server 2003
asp.net 3.0
iis6
12GB ram
sql server 2005 (db admin says there is no load issue on sql..)
Some very basic steps that you can follow and check if your server work on limits are:
First you download the Process Explorer from sysinternals and you run it to see two things.
Is your server work on their memory limit ?
If yes then what program eats the memory, usually SQL Server 2005 use a lot of memory for database cache, and this is done after many time of work.
Did the server use all of his computing power, if yes, check what program is the one that need all that computing power.
Now next step, download the TCPView from sysinternals, run it and see how many connections are done, how fast, etc... There you can see anomalies, or if the computer is also on their limit.
Final step is to defrag your disks.
Also have in mine that the asp.net session is lock the entire view on all users. So if you have one application on web, with too many users, and each user, or some users, make long time processing on their calls, then this can cause delay to all the users.

ASP.NET Requests Queued causes website to crumble. SQL backend, IIS6

I have inherited a somewhat complex system (and problem) that I need help with.
I have a webserver w/ the following specs:
Hardware:
Server 2003 32bit
IIS 6
8 cores (16 w/ hyperthreading)
12gb RAM
ASP.NET site
3 app pools, so 3 instances of w3wp.exe running.
This system serves a large number of people and bandwidth is fairly constant during business hours reaching ~ 68,000kbit/s
There are moments when the system "comes down" - site gets very slow which generates a lot of phone calls. Things usually slow down for 60 seconds, but has varied greatly in length. Sometimes only a few seconds and sometimes 3 minutes or more.
I have my app pools set to recycle somewhere about 600mb of consumed memory. That's not exact but they recycle on their own with much success. At times I recycle the "main" pool manually to clear the problem I'm describing.
This is what I know is going on when things are running slow.
Network bandwidth takes a considerable dip.
Requests Queued in the ASP.NET performance counters goes up.
In tandem w/ the Requests Queued rising page latency increases. (I employ a simple ASP page that makes a SQL call and just says "The system is live" - this page is monitored for latency)
Overall CPU usage rises.
Overall memory consumption of w3wp.exe rises.
In my mind here is what I imagine is happening.
Someone asks the system to generate a report or glob of data. This spins up a process that consumes a large number of threads (ie, all available threads) This causes all other requests to the system to wait in the ASP.NET que pool which essentially kills the site. The lack of activity causes the network traffic to dip.
I have read many articles about thread queues, thread pools, etc. This is a good example: http://williablog.net/williablog/post/2008/12/02/Increase-ASPNET-Scalability-Instantly.aspx and does what I believe is a clue to help me solve my problem... but I'm not sure. My "Machine.config" file for the version of asp.net that I am using does not specify any of the thread values listed in the article so we are default for everything which I believe is incorrect given our situation.
If you were me; What would you do next? Where do you think the problem is?
edit: Here is a screenshot. It should be obvious when the problem is happening.
http://i.imgur.com/5BJlq.png
edit:
I want to change these values for our setup. A few questions first:
1) After making the changes, what needs to be restarted for them to take effect?
2) How do these settings look for a system with 8 physical cores?
maxconnection = 96
maxIoThreads = 100
maxWorkerThreads = 100
minFreeThreads = 704
minLocalRequestFreeThreads = 608
Not fun.
Many root causes share common symptoms which makes it difficult to diagnose without getting dirty with the application. :) Pardon if some of these steps were implied.
Some next steps might be:
Review the IIS logs of each site looking for things like:
HTTP response codes (5xx,4xx,3xx)
Request response times
Review Windows Event Logs
How often are application pools cycling?
Application errors, etc.
Verify processModel settings as suggested by #vinayc to make sure predecessor didn't get 'tricky'
Install DebugDiag, its a surprisingly good tool for some basic analysis of memory and crash related problems.
This can also help you capture memory snaps to diagnose later.
Tess Ferrandez blog can help make heads/tails of memory snap analysis.
Understand how many web applications are running in each AppPool.
Investigate using a 'web garden' to possibly help minimize number of users impacted by 'slow down'
Is a virus scanner enabled? Is it running? If so, verify exclusions.
Are application teams available to help troubleshoot? Identify if they have any custom application instrumentation that might help diagnose problem.
Is the behavior 'new'? Or has it always been there? If 'new', can you track down which deployment might have caused the new behavior?
Could the the description given of the 'slow down' behavior be attributed to an apppool recycle and resulting jitting of the application again? ala - the first request syndrome.
Reviewing the logs helps understand how the sites/applications are being used, which can be especially important if you don't own the codebase. Logparser is an excellent tool for doing some IIS log analysis (as well as other formats).
Good luck!
Z
The settings that your are talking are part of processModel element under system.web element from machine.config. For IIS6, following are applicable:
autoConfig
maxIoThreads
maxWorkerThreads
minIoThreads
minWorkerThreads
requestQueueLimit
responseDeadlockInterval
Typically, you will only find autoConfig="true" and not other elements. Auto-config sets the values as per your machine configuration - the tuning is done as per recommended values (see Threading Explained section from this article) which are same as sighted by the link that you have provided.
The article although dated, i excellent resource if you want to tune up these settings manually.
On the other hand, at the load that you are serving, I would recommend two things (if you haven't tried already)
Use output caching aggressively - even if the data is dynamic, caching for say 30-60 seconds can give a definite boost at your load
If you suspect certain requests are hogging too many threads then attempt to move those resources under different app-pool (you can use different web-site with different sub-domain or you can use different virtual directory/application and choose different app-pool)

Logging across multiple web servers

I would like to know how people dealing with logging across multiple web servers. E.g. Assume there are 2 webservers and some events during the users session are serviced from one, some from the other. How would you go about logging events from the session coherently in one place (without e.g.creating single points of failure)? Assuming we are using: ASP.Net MVC, log4net.
Or am I looking at this the wrong way - should I log seperately and then merge later?
Thanks,
S
UPDATE
Please also assume that the load balancers will not guarantee that a session is stuck to one server.
You definitely want your web servers to log locally rather than over a network. You don't want potential network outages to prevent logging operations and you don't want the overhead of a network operation for logging. You should have log rotation set up and all your web servers clock's synced. When log rotation rolls your log files over to a new file, have the completed log files from each web server shipped to a common destination where they can be merged. I'm not a .net guy but you should be able to find software out there to merge IIS logs (or whatever web server you're using). Then you analyze the merged logs. This strategy is optimal except in the case that you need real-time log analysis. Do you? Probably not. It's fairly resilient to failures (assuming you have redundant disks) because if a server goes down, you can just reboot it and reprocess any log ship, log merge or log analysis operations that were interrupted.
An interesting solution alternative:
Have 2 log files appenders
First one in the local machine
In case of network failure you'll keep this log.
Second log to a unix syslog service remotely (of course
a very consistent network connection)
I used a similar approach long time ago, and it work really well, there are
a lot of nice tools for analyzing unix logs.
Normally your load balancing would lock the user to one server after the session is started. Then you wouldn't have to deal with logs for a specific user being spread across multiple servers.
One thing you could try is to have the log file in a location that is accessible by all web servers and have log4net configured to write to it. This may be problematic, however, with multiple processes trying to write to the same file. I have read about NLog which may work better in this scenario.
Also, the log4net FAQ has a question and possible solution to this exact problem

Resources