Working with one of our partners, we have now developed two separate sets of web services for their use. The first was a simple "post to an HTTPS URL" style web service, which we implemented as an ASP.NET page that inspected the arguments in the URL and acted accordingly. This "web service" (if you can call it that) has been very stable.
At some point, the partner asked us to begin using SOAP-based web services. At their request, we built them a new set of web services, largely based on the previous objects but reimplemented as an actual "Web Service". This web service has not been very stable: around once a week, Nagios alerts us that our web service is not responding - and a quick iisreset does the trick.
Analyzing the log output and working in a debugger has not led us to anything concrete. The volume on this new web service is actually much lower than on the HTTP web service. This could be a code problem, a platform problem, or of course something in between.
We've tried, with little improvement:
Duplicating the behavior in the lab
Debugging in the Visual Studio debugger
Tinkering with IIS options to give it its own application pool
My question: what are the next steps for troubleshooting?
Environment:
Windows Server 2003 R2 Standard Edition, Service Pack 2, 32-bit; Visual Studio 2005; MS SQL Server 2005; .NET Framework 2.0.50727
You may get some answers by profiling your web services and understanding how they are using their resources. perfmon and procmon are both very useful tools in this regard.
EDIT: Since you say errors happen after about a week, the only thing I can think of is resource usage. Ensure your DB connections are being cleaned up, and that any files you open (e.g. around that system call to the exe) are being closed.
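If it is a leak, the usual fix is to wrap everything IDisposable in using blocks so handles are released even when an exception is thrown. A rough sketch - the table name and exe path here are placeholders, not from your code:

    using System.Data.SqlClient;
    using System.Diagnostics;

    public static class CleanupExamples
    {
        // The connection is closed and returned to the pool even if the query throws.
        public static int CountOrders(string connectionString)
        {
            using (SqlConnection conn = new SqlConnection(connectionString))
            using (SqlCommand cmd = new SqlCommand("SELECT COUNT(*) FROM Orders", conn))
            {
                conn.Open();
                return (int)cmd.ExecuteScalar();
            }
        }

        // Same pattern for the exe call: Process holds an OS handle and is IDisposable.
        public static void RunHelper()
        {
            using (Process proc = Process.Start(@"C:\tools\helper.exe")) // hypothetical path
            {
                proc.WaitForExit();
            }
        }
    }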
Also, if your web services can tolerate it, IIS has a setting that triggers a periodic recycle of an app pool, to handle cases where performance degrades over time. It's dirty, but it may work well for your case.
Since there isn't much to go on - here's another odd issue we came up against regarding our web services.
When the web service stops responding, how is memory utilization? We have experienced issues with memory and memory fragmentation relating to busy web services on a system (there were also other things running, causing additional fragmentation). When we refactored the web services to load from smaller DLLs that depend on other libraries (instead of one large library), we were able to resolve the memory fragmentation.
To identify what was occurring, we would take a dump from the offending IIS worker process where the app pool resided and then review it using WinDbg.
http://www.microsoft.com/whdc/devtools/debugging/default.mspx
Additionally we used DebugDiag to take the postmortem dumps.
http://www.iis.net/downloads/default.aspx?tabid=34&g=6&i=1286
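For reference, the rough sequence we would run against such a dump in WinDbg - these are SOS commands for the .NET 2.0-era mscorwks, and <address> is a placeholder for an object address taken from the !dumpheap output:

    .loadby sos mscorwks   (load the SOS extension matching the runtime in the dump)
    !eeheap -gc            (GC segments and free space - useful for spotting fragmentation)
    !dumpheap -stat        (object counts and total sizes grouped by type)
    !gcroot <address>      (find what is keeping a suspect object alive)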
Hope this provides another direction to look at.
We have 4 load-balanced servers:
4 cores @ 2.6GHz (E5-2650 v2)
14GB RAM
Windows 2012 R2
High Performance power setting
IIS 8.5
ASP 5.3
EF 6.1
They each have a single application pool with one worker process and a single website. Each server has its own copy of the site (DLLs & views), running on a local disk. We are using IIS virtual directories to point to shares on a clustered file server for log files, common images, etc. (content only). The application pools are set to not shut down when idle (idle time-out of 0), and we have disabled the regular 1740-minute recycle interval too.
We have New Relic's .NET agent installed on all servers, and looking through our slow transaction log, I can see that many requests are taking 15 seconds or so to complete. Looking into the trace, I can see a common call to System.Web.Compilation.AssemblyBuilder.Compile() and System.Web.Compilation.BuildManager.CompileWebFile().
As far as I know or understand, ASP.NET compiles these views upon first request, caches the result (in the Temporary ASP.NET Files folder under C:\Windows\Microsoft.NET), and loads it from there for subsequent requests.
I'm confused about how this is happening so often - when I visit these URLs, the TTFB is about 400ms, and given the constant load I can't see the websites "losing" their cache and needing to compile the views again. These pages are frequently hit - it's an e-commerce store - and it happens often, on our most popular pages: catalogue (category/brand/gender etc.) listings and product details.
I've configured each application pool to log an event when recycling, and no events have been logged when I check the WAS service in the Event Viewer. We also have New Relic Server installed, and looking over the past 6 hours' worth of data, I can't see any dip in RAM usage on any of the servers - which would indicate an application pool recycle. This has really baffled me!
I'm thinking of moving towards precompiling our views as part of our release process - it makes sense, really. But it feels like that would be working around, or masking, an issue which as far as I can see should not be happening. We build our site in Release mode and have <compilation debug="false" /> in all web.config files.
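For reference, the kind of release step I have in mind, using the stock aspnet_compiler tool (the paths here are placeholders):

    aspnet_compiler -v /MySite -p C:\Build\MySite C:\Build\MySite.Precompiled

-v is the application's virtual path and -p its physical source; the output folder would then be deployed instead of the raw views.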
Can anyone think of any causes for this?
It is because of how JIT (Just-In-Time) compilation works.
When you build your application, it is compiled into Microsoft Intermediate Language (MSIL), also called Intermediate Language (IL).
As your application is accessed, the Common Language Runtime (CLR) converts only the executed IL parts of your code into native instructions. This Just-In-Time compilation step is part of the CLR.
In simplified terms: when you run a .NET application and your program calls a method, the JIT compiler reads the IL from metadata, compiles it into native instructions, and runs them. The next time your program calls the same method, the CLR executes the cached native CPU instructions directly. This adds some overhead to the first call of each method. You can go with the other option of pre-compiling your application using NGen, which is usually not recommended because you lose some optimizations that only the JIT can perform, thanks to its awareness of the underlying hardware platform. These two articles have more details:
http://geekswithblogs.net/ilich/archive/2013/07/09/.net-compilation-part-1.-just-in-time-compiler.aspx and https://msdn.microsoft.com/en-us/library/ms366723.aspx
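If you want to see the first-call overhead for yourself, here is a minimal console sketch (my own illustration, not from those articles) that times the first call of a method against the second:

    using System;
    using System.Diagnostics;
    using System.Runtime.CompilerServices;

    class JitDemo
    {
        static void Main()
        {
            Stopwatch sw = Stopwatch.StartNew();
            Work();                     // first call: the IL for Work() is JIT-compiled here
            sw.Stop();
            Console.WriteLine("First call:  {0} ticks", sw.ElapsedTicks);

            sw.Restart();
            Work();                     // second call: cached native code runs directly
            sw.Stop();
            Console.WriteLine("Second call: {0} ticks", sw.ElapsedTicks);
        }

        // NoInlining keeps the JIT from folding Work() into Main, which would
        // hide the first-call compilation cost we are trying to measure.
        [MethodImpl(MethodImplOptions.NoInlining)]
        static void Work()
        {
            double acc = 0;
            for (int i = 1; i < 1000; i++) acc += Math.Sqrt(i);
            if (acc < 0) Console.WriteLine(acc); // prevent dead-code elimination
        }
    }

The first line should report noticeably more ticks than the second; NGen simply shifts that cost to install time.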
There are also other things you can try that might help speed up your application: you can use the IIS application warm-up module (see the question "How to warm up an ASP.NET MVC application on IIS 7.5?"), implement distributed caching, etc., to alleviate some of your application's bottlenecks.
Backstory
Last month our development team created a new ASP.NET 3.5 application to place on our production website. Once we had the work completed, we asked the group that manages our server to copy the app out to our production site and configure the virtual directory as a new application.
On 12/27/2010, two public 'Guinea Pigs' were selected to use the app, and it worked great.
On 12/30/2010, we were notified by internal staff that when one staff member (the Business Process Owner) tried to access the application, they received the 'Server Application Unavailable' message.
When I called the group that does our server support, I was told that it probably failed because I didn't close the connections in my code. However, the same group then went in and created a separate app pool for this Extension Request application. It has had no issues since.
I did a little googling, since I do not like being blamed for things, and found that the 'Server Application Unavailable' message will also appear when you have multiple applications using different frameworks and do not put them in different application pools.
Technical Details - Tree of our website structure
Main Website <-- ASP Classic
+-Virtual Directory(ExtensionRequest) <-- ASP.NET 3.5
From our server support group:
'Reviewed server logs and website setup in IIS. Had to reset the application pool as it was not working properly. This corrected the website and it is now back online. We went ahead and created an application pool for the extension web so it is isolated from the main site pool. In the past we have seen other applications do this when there is a connection being left open and the pool fills up. Would recommend reviewing site code to make sure no connections are being left open.'
The Real Question:
What really caused the failure? Isn't the connection-left-open problem an ASP Classic issue? Wouldn't the ExtensionRequest application have to be used (more than twice) in the first place for connections to be left open? Isn't it more likely that the failure was caused by them not bothering to set up the new application in its own app pool in the first place?
Sorry for the long-windedness.
You'd really need to obtain and review the server's Application and System event logs and HTTPERR logs for the period the server was reporting these errors.
Without these it'd be hard to speculate about the root cause of the problem.
Update:
OP incorrectly tagged his question, so this next section no longer applies. However, I'll leave it in place because I think the information is useful for those encountering these issues and perhaps thinking about migrating to IIS 7.x.
You are correct that running two different .NET Framework versions in the same application pool can cause these errors, but that's something you'd tend to see on Windows 2003/IIS 6, not Windows 2008/IIS 7.
IIS 7 uses a slightly different approach to specifying which .NET Framework version is loaded: it's determined by the application pool's managedRuntimeVersion property. When requests are processed by IIS/ASP.NET, the site's handler mappings use a preCondition attribute to determine when to load the requisite handler (which is kind of like a script mapping in previous versions of IIS).
This mechanism prevents the incorrect runtime version from being loaded into the application pool's worker process.
So if an application pool is configured to run .NET Framework v4.0, only that version will load, even if your application is built against v2.0.
There's a great article on how this works here:
Achtung! IIS7 Preconditions
The section on Handlers about halfway through explains how the dangers of accidentally loading the wrong .NET version into a pool are mitigated by the preCondition feature.
A Server Application Unavailable error usually means something catastrophic has happened (like loading the wrong ASP.NET version's ISAPI filter into an already running worker process).
Not closing SQL connections is unlikely to cause this type of serious error; you'd more than likely be seeing yellow-screen-of-death runtime errors if that were the case. Running out of SQL connections usually doesn't bend ASP.NET so far out of shape that the whole service tops itself.
My prime suspect would be a permissions problem where the application pool identity was unable to correctly access the application folders. But it's just a hunch.
Again, what you need to do is get the Application and System event logs and the HTTPERR logs (they reside in %systemroot%\System32\LogFiles\HTTPERR). Those will contain clues and facts about what went wrong.
Update 2:
On Windows 2003/IIS 6, if you have two applications running different ASP.NET versions residing in the same pool, you will get this error. In my experience (I work for a web hoster) it is the primary cause of this infamous error page.
There's also a tell-tale event logged to the Application Event log:
Event Type: Error
Event Source: ASP.NET 2.0.50727.0
Event Category: None
Event ID: 1062
Date: 12/01/2011
Time: 12:31:43
User: N/A
Computer: KK-DEBUG
Description:
It is not possible to run two different versions of ASP.NET in the same
IIS process. Please use the IIS Administration Tool to reconfigure your
server to run the application in a separate process.
Whilst your root application may not be written in ASP.NET, it's likely that something has triggered the loading of a different version of the framework into your site's application pool. For example:
There's a rogue web.config in the root - this will trigger ASP.NET to load
There's a wildcard mapping to ASP.NET 1.1 in the site script maps (less likely, but possible)
I'm inclined to think that your new application almost certainly ended up in a pool where other sites or applications were running a different framework version. The only way to really find out is to obtain the Application event logs and look for the event shown above.
It's hard to tell; there could be many causes (too many resources used, a call outside of .NET causing something to crash, etc.). I would look in the event log and see if you can find something there.
If you're running different versions of .NET, you definitely want separate pools. If you have the option, I would recommend separate pools for each application (even ones on the same .NET version).
As far as "closing the connection" (I assume you mean the connection to the database). If you're creating "low level" connections (i.e. SqlConnection, SqlCommand) then make sure you're wrapping them in a "using" statement, otherwise your connection pool can fill up. In my experience though, you should receive regular .NET errors in this case. If you're using an ORM this shouldn't be an issue.
Edit:
If you can't find anything useful in the Event Log, you could try this: http://learn.iis.net/page.aspx/266/troubleshooting-failed-requests-using-tracing-in-iis-7/
Usually I would look at writing a Windows Service to manage tasks that aren't suited to being hosted in a web application. These types of tasks are usually long running processes or scheduled tasks. Although this is normally the primary approach for these types of tasks, people have looked at ways of running these kinds of background processes in a web application by kicking off a number of threads in the Application_Start event exposed by Global.asax. The problem with this approach has always been that if your IIS worker process dies, then your background thread is killed too (effectively your 'Windows Service' is stopped until the next request is received).
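For context, the pattern I'm referring to looks roughly like this (a sketch with hypothetical names, not production code) - and it is exactly this timer that dies with the worker process:

    // Global.asax.cs - the 'background worker in a web app' pattern described above
    using System;
    using System.Threading;
    using System.Web;

    public class Global : HttpApplication
    {
        private static Timer _timer;

        protected void Application_Start(object sender, EventArgs e)
        {
            // Kick off DoScheduledWork every 5 minutes for the life of the worker process.
            _timer = new Timer(DoScheduledWork, null, TimeSpan.Zero, TimeSpan.FromMinutes(5));
        }

        private static void DoScheduledWork(object state)
        {
            // Long-running/scheduled work goes here. If IIS recycles or idles out
            // the worker process, this stops silently until the next request
            // starts the application again.
        }
    }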
ASP.NET 4.0 offers a solution to this problem: you can now set the StartMode to 'AlwaysRunning', as described in this blog post by Scott Gu. Somewhere in the comments on that post, someone asks about the viability of hosting Windows Service type tasks in IIS, since the new feature ensures the worker process is always running; Scott mentioned that it would definitely support that scenario. Further to this, the recent introduction of AppFabric means that Microsoft themselves are providing simple hooks for hosting and monitoring WCF and WF services in a web application.
What does this mean for those of us who used to write Windows Services to support our web apps? Should we adopt this model? What are the pitfalls? As far as I can tell, there are a number of benefits to hosting 'Windows Service' processes in a web application, the most useful being the ease of deployment. Furthermore, we can actually start developing simple user interfaces for our services that provide information about what is happening at runtime.
If I had to go this route, I don't think I would host my 'Windows Service' type functionality in the customer-facing web application. I would probably develop a new web application project (much as I would in the Windows Service context) to host my long-running/scheduled task processes. I guess there are a few reasons for this:
Security. There may be a different security model for the UI displaying information about the running background processes. I would not want to expose this UI to anyone but the ops team. Also, the web application may run as a different user with an elevated set of permissions.
Maintenance. It would be great to be able to deploy changes to the application hosting the background processes without impacting users of the front-end website.
Performance. Having the application separated from the main site processing user requests means that background threads will not diminish IIS's capability to handle the incoming request queue. Furthermore, the application processing the background tasks could be deployed to a separate server if required.
I would be really interested to hear your thoughts on this approach and whether I should be sticking with Windows Services. I am very tempted to try this new approach.
What does this mean for those of us that used to write Windows Services to support our web apps?
I think this is a key scenario where you could move away from a Windows service to the continuously running web site.
Should we adopt this model?
Standard development answer: Depends ;)
What are the pitfalls?
One issue I can see is the IIS dependency. If you need a service to run on a user's machine, I would not feel comfortable asking them to install IIS just to run my service. Here I think the traditional model works better.
Monitoring and tracking are major issues, but as you also point out, this is solved by AppFabric. It is even better than what you get from a Windows Service. However, you have added another dependency, one which requires .NET 4.0 and a relatively new version of Windows. I could be wrong here, but my understanding is that AppFabric is not supported in production on client OSes, which could bring additional headaches.
You will lose pause functionality in the continuous web site model too.
Finally, IIS killing inactive app pools isn't the only way an app pool can recycle; editing a web.config file causes it, for instance, which may not be ideal.
the most useful being the ease of deployment.
I also think development is much easier - in the past I have had a console app and a Windows service, so I could dev/test on my machine using the console app and then change it to a Windows service when it went out. Now dev/test is MUCH easier.
A must-read for this is Death to Windows Services...Long Live AppFabric!
What are the pitfalls?
One I found: there is no shutdown event. You have AppStart when the web site starts (not Global.asax, because that is HTTP only), but you have no way to handle shutdown, which could mean disposing becomes an issue.
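The closest thing I've found to a workaround - and it's only partial, since it fires on app domain unload rather than on every failure mode - is registering your worker with the hosting environment via IRegisteredObject. A rough sketch:

    using System.Web.Hosting;

    // Sketch: ASP.NET calls Stop() before unloading the app domain, giving the
    // worker a chance to dispose its resources.
    public class BackgroundWorker : IRegisteredObject
    {
        public BackgroundWorker()
        {
            HostingEnvironment.RegisterObject(this);
        }

        public void Stop(bool immediate)
        {
            // Flush state and dispose resources here. Stop is called once with
            // immediate == false, then again with immediate == true if the
            // object is still registered.
            HostingEnvironment.UnregisterObject(this);
        }
    }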
I would suggest sticking with a Windows service. The issue is with your number 2: you won't be able to update the service part of the web site without restarting the whole web site.
After deploying a new build (mostly changed DLLs) of an ASP.NET web app, the CPU on the server now jumps to 100% every few seconds, and the culprit is lsass.exe. Do you think the deployment of the ASP.NET web app and this problem are related, or is it a coincidence that they happened at the same time?
More info:
This is the first time I've done the build on a Server 2008 x64 machine. Previously the builds were done on a Server 2003 x86 machine. The target is "Any CPU", so it should work on either. The server deployed to is Server 2003 x86.
I've searched the web for more info on this and have confirmed that the process is lsass.exe (first character a lower-case L, not an upper-case I), so I've ruled out the virus version. I found some docs relating to a Server 2000 bug, but they don't apply here.
I eventually isolated the problem to an ASP forum running "under" that ASP.NET web app. Using the admin page on the forum, I took the forum down and then brought it back up again, and the problem disappeared. I find this very frustrating: the problem has now gone, but I don't know what caused it, and as such it could easily return.
I also installed this Microsoft Hotfix and rebooted this server but that didn't work.
Have you checked the System and Application event logs for anything unusual?
Have you updated to use the Active Directory role provider? I've seen issues where enumerating groups to do role checking pegs the CPU and really slows down the app. I actually implemented a customized provider that allowed me to specify a particular OU and the set of groups I actually care about, to get around this issue.
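Roughly the shape of that provider, as a simplified sketch - the group names are placeholders, and the real version also scoped the directory query to a particular OU, which this omits:

    using System;
    using System.Collections.Generic;
    using System.Web.Security;

    // Sketch: wrap the built-in Windows provider and only report the groups
    // the application actually checks against.
    public class FilteredRoleProvider : WindowsTokenRoleProvider
    {
        private static readonly HashSet<string> Interesting = new HashSet<string>(
            StringComparer.OrdinalIgnoreCase)
        {
            @"DOMAIN\App_Users",    // placeholder group names
            @"DOMAIN\App_Admins"
        };

        public override string[] GetRolesForUser(string username)
        {
            List<string> roles = new List<string>();
            foreach (string role in base.GetRolesForUser(username))
            {
                if (Interesting.Contains(role))
                    roles.Add(role);
            }
            return roles.ToArray();
        }
    }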
The xperf tools distributed in the Windows Performance Toolkit will tell you exactly what is using CPU time or disk bandwidth. These tools are free and work on any retail build of WS2008 or Vista. Here is a series of posts I wrote on the xperf tools.
Occasionally, I find that while debugging an ASP.NET application (written in Visual Studio 2008, running on Vista 64-bit), the local ASP.NET development server (i.e. 'Cassini') stops responding.
A message often comes up telling me that "Data Execution Prevention (DEP)" has killed WebDev.WebServer.exe.
The event logs simply tell me that "WebDev.WebServer.exe has stopped working"
I've heard that this 'problem' presents itself more often on Vista 64-bit because DEP is on by default. Hence, turning DEP off may 'solve' the problem.
But I'm wondering:
Is there a known bug/situation with Cassini that causes DEP to kill the process?
Alternatively, what is the practical danger of disabling Data Execution Prevention?
The only way to know for sure would be to dig through the Cassini source and see if there are any areas where it generates code on the heap and then executes it without clearing the NX flag.
However, instead of doing that, why not use IIS?
EDIT:
The danger of disabling DEP is that you open up security holes. DEP works by not allowing arbitrarily generated code on the heap to be executed. This helps prevent malware from inserting code into the data segments of legitimate programs.
You are on Vista; IIS got better (IIS 7), Cassini stayed crappy.
So just start this app on IIS, with a host header and a hosts file entry.
You can grant certain programs exclusion from DEP if you need to, though as Jonathan mentions, this does open up any vulnerabilities that application may have.
Using IIS in Visual Studio isn't the pain in the ass that it used to be in the 1.1/VS02/03 days. There are lots of good reasons to prefer IIS over the Cassini server (articles by Dominick Baier):
Cassini considered harmful
Another Reason why I would not recommend Cassini
Dominick is 'the man' when it comes to IIS and security stuff.
When using IIS for a web app, I always create the app in IIS first, point it at my preferred folder, then get VS to create the project. This means you don't end up cluttering c:\inetpub\wwwroot with your web apps.
Of course, now we have IIS Express which, if you're targeting IIS 7.x, is the obvious choice for developing ASP.NET applications in Visual Studio.
Thanks for the answers. I guess I developed such an aversion to IIS in the .NET 1.x era that I've refused to consider re-using it - until now.
Aside: when choosing between two equally acceptable answers from ChanChan and Jonathan, I arbitrarily marked Jonathan's as 'accepted' because (a) he got in first and (b) his rep is currently lower.