ASP.Net Core Performance on Azure vs Local - asp.net

I have a website with a function that generates a report in Excel that is pretty much just a data dump, approx 16000 rows, using EPPlus. This report keeps timing out for the users on Azure. The timeout (524) is a Cloudflare limit that applies when a request takes longer than 100 seconds.
I have optimised the hell out of it using HashSets and Dictionaries, and it now runs in under 2 seconds on my laptop in Debug. I've also tried publishing with the target runtime as win-x64, in case it's a memory allocation issue.
Initially I thought the bottleneck would be memory. After setting up Application Insights, I can see that the CPU is at 100% while the memory is fairly low, about 300MB. I've bumped the Service Plan up to the P3V2 (14GB RAM & 840 ACU) to test if it's just a resource allocation issue. Even at that level, it takes about 50-60 seconds to produce. I can't run the app at that level, so I need to get it down much lower.
I'm not sure how else to optimise this, or identify the bottleneck. Any ideas?

Related

Server random downtime windows server 2003 sp2 .net4

Server Specs
Microsoft Windows Server 2003 Enterprise Edition SP2
IIS 6
.net4
Intel(R) Xeon(R) CPU X5680 @ 3.33GHz, 2.00GB of RAM
Physical Address Extension
I am having trouble finding the cause of our server's random downtime. Our clients inform us that their website goes down for hours at a time. Sometimes users are able to log in; however, the site is extremely slow, unstable and unusable. Sometimes users are not able to log in at all. When users are able to log in, not all images are displayed (they get the "image not found" image).
We upgraded their website from .net1 to .net4 because we thought the cause of their downtime and random user log out was due to them running their website on .net1. The website was running fine with no issues for a few months.
The first time the server started to go down after that was due to the drive on which the website resided running out of disk space. There was 40GB partitioned to this drive and 20GB was added. This didn't resolve the issue for very long.
The second time the server would randomly go down, I noticed in the Event Viewer that the web worker associated with the app pool used by the website would periodically need to be recycled. That is, in the Security tab of the Event Viewer I would periodically see an event with ID 1074 reading 'A worker process with process id of '1540' serving application pool 'Net4' has requested a recycle because the worker process reached its allowed processing time limit.'. I then went into this app pool's properties and saw that the app pool would be recycled every 29 hours, which is the default. I modified this to have the app pool recycle every day at 3:00am. Since then we have not seen this event in the Event Viewer. We were able to catch the website during one of its downtimes before this was changed and recycled the app pool manually. This resolved the issue in this one instance.
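As a side note on the 29-hour interval mentioned above: a fixed interval that is not a multiple of 24 hours moves to a different time of day on every cycle, whereas a fixed 3:00am schedule does not. The following is a minimal Python sketch of that arithmetic only. The start time is a hypothetical value chosen for illustration and is not taken from the server described here.

```python
from datetime import datetime, timedelta

def recycle_times(start, interval_hours, count):
    """Return the next `count` recycle times for a fixed-interval schedule."""
    return [start + timedelta(hours=interval_hours * i) for i in range(1, count + 1)]

# Hypothetical assumption: the app pool was last started at midnight on a Monday.
start = datetime(2024, 1, 1, 0, 0)
times = recycle_times(start, 29, 5)

for t in times:
    # Each recycle lands 5 hours later in the day than the previous one.
    print(t.strftime("%a %H:%M"))
```

With these assumed values the recycles fall at 05:00, 10:00, 15:00, 20:00 and then 01:00, so some of them land in the middle of the working day.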
This did not permanently fix the issue however, as we are still receiving emails from our client informing us that the website is down for hours at a time.
I then set up a Performance Monitor counter log. We have managed to monitor the server's performance during many of these downtimes. It does not appear to be a problem with memory as there is plenty of space on the drive. It does not appear to be a memory leak or related to excessive paging as there are no running processes which take up an excessive amount of % Processor Time and the Pages/Second Memory counter does not peak at an excessive amount during most of the downtime (I'll explain why excessive paging occurs later). The total IO Data Bytes/sec and IO Other Data Bytes/sec Process counters do not appear to be unusually high or low during downtime. The total Thread Count and Handle Count Process counters do not exhibit any abnormal spikes or drops during this time. The total thread count, at a given time, seems to be between 600 and 900, give or take. The total handle count, at a given time, seems to be between 15,000 and 23,000, give or take. The % Time in Jit .NET CLR jit counter for instance w3wp is 0 for about half of the time and will randomly peak at almost 100 the other half, most of the time peaking for just a moment and only rarely peaking for about 10 minutes, unrelated to downtime.
There are random times throughout the day where the process dsmcsvc takes up most, if not all, of the % Processor Time. This is a process run by the Symantec Antivirus software. When this process takes up the % Processor Time there is a corresponding event in the Event Viewer signifying that a new virus definition file has been loaded; that is, an Application event with ID 7 'New virus definition file loaded. Version: #version number#'. When this event occurs, the Pages/Sec counter spikes. Sometimes it spikes to only 200-300 but will at times peak over 10,000. This event seems to be completely unrelated to website downtime. I have researched the Symantec Antivirus software and found that there is a known memory leak in old versions of this software. I have found that this software is known to cause high memory usage when the link to a process called NavLogon.exe is broken or does not exist. This process does not appear to exist on the server so I currently have no way of restoring the link to it. I also found that this software uses Crypt32.dll and that old versions of Crypt32.dll have a known memory leak. The Crypt32.dll which exists on the server was last updated in 2007.
The Performance Monitor log monitors the total Sessions Active ASP.Net Applications counter. During downtime, the total number of sessions does not exhibit any abnormal behavior; there is a normal number of active sessions during this time. Active sessions at a given time can be between 0 and 200. I was informed that the time when the most users are active is during 1st shift; however, between about 10pm and 2am every day, this number peaks.
The site runs JavaScript client side, and Visual Basic .NET server side. All users have about 10-15 session variables almost all of the time.
When the site goes down there are no events which seem to correspond to its downtime in the Event Viewer.
I have also set up a W3C Extended Log File Format log for this site. During downtime there seems to be an excessive number of GET requests for Telerik.RadUploadProgressHandler.ashx.
I have seriously run out of ideas at this point and have extensively searched the web for solutions and come up empty. Any feedback as to why this may be occurring would be great.
"It does not appear to be a problem with memory as there is plenty of space on the drive."
Really? Memory and hard drive space are two completely different things. 2GB of RAM was okay a decade ago, when that server was new, but today it's laughably small.
But don't bother upgrading or adding RAM. This server is old enough that the problem is probably just that the hardware is reaching the end of its useful life. Additionally, the operating system is also nearing its end of life. Server 2003 is scheduled for end of life on July 14, 2015. After that date, there will be no new patches of any kind produced for Server 2003... not even critical security patches. That will make Server 2003 completely unsuitable as a web server.
This seems like a good time to execute a transition to a completely new server.

asp.net high number of Request Queued and Context switching

We have a fairly popular site that has around 4 million users a month. It is hosted on a dedicated box with 16 GB of RAM and 2 processors with 24 cores.
At any given time the CPU is always under 40% and the memory is under 12 GB, but at the highest traffic we see very poor performance. The site is very, very slow. We have 2 app pools, one for our main site and one for our forum. Only the site is slow. We don't have any restrictions on CPU or memory per app pool.
I have looked at the performance counters and I saw something very interesting. At our peak time, for some reason, requests are being queued. Overall context switching numbers are very high, around 30 - 110 000 k.
As I understand it, high context switching is caused by locks. Can anyone give me example code that would cause a high number of context switches?
I am not too concerned with the context switching, and I don't think the numbers are huge. You have a lot of threads running in IIS (since it's a 24-core machine), and higher context switching numbers are expected. However, I am definitely concerned with the request queuing.
I would do several things and see how it affects your performance counters:
Your server CPU is evidently under-utilized, since you run below 40% all the time. You can try to set a higher value of "Threads per processor limit" in IIS until you get to 50-60% utilization. An optimal value of threads per core by the books is 20, but it depends on the scenario, and you can experiment with higher or lower values. I would recommend trying a value >= 30. Low CPU utilization can also be a sign of blocking IO operations.
Adjust the "Queue Length" settings in IIS properties. If you have configured the "Threads per processor limit" to be 20, then you should configure the Queue Length to be 20 x 24 cores = 480. Again, if the requests are getting Queued, that can be a sign that all your threads are blocked serving other requests or blocked waiting for an IO response.
Don't serve your static files from IIS. Move them to a CDN, Amazon S3 or whatever else. This will significantly improve your server performance, because 1,000s of server requests will go somewhere else! If you MUST serve the files from IIS, then configure IIS file compression. In addition, use expire headers for your static content, so it gets cached on the client, which will save a lot of bandwidth.
Use async IO wherever possible (reading/writing from disk, db, network etc.) in your ASP.NET controllers, handlers etc. to make sure you are using your threads optimally. Blocking the available threads using blocking IO (which is done in 95% of the ASP.NET apps I have seen in my life) could easily cause the thread pool to be fully utilized under heavy load, and queuing would occur.
Do a general optimization to reduce the number of requests that hit your server, and the processing time of single requests. This can include minification and bundling of your CSS/JS files, refactoring your JavaScript to do fewer round trips to the server, refactoring your controller/handler methods to be faster etc. I have added links below to Google and Yahoo recommendations.
Disable ASP.NET debugging in IIS.
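The point about blocking IO exhausting the thread pool can be illustrated with a small simulation. This is a minimal sketch in Python rather than ASP.NET code, and it is only an analogy: the pool size, request count and IO delay are hypothetical values, and `time.sleep` / `asyncio.sleep` stand in for a real disk, database or network call.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor

POOL_SIZE = 4     # hypothetical number of worker threads
REQUESTS = 20     # hypothetical number of simultaneous requests
IO_DELAY = 0.05   # simulated IO wait per request, in seconds

def blocking_handler(_):
    # The worker thread is held for the whole wait, so other requests queue.
    time.sleep(IO_DELAY)

async def async_handler():
    # The wait is yielded back to the event loop, so no thread is held.
    await asyncio.sleep(IO_DELAY)

def run_blocking():
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=POOL_SIZE) as pool:
        list(pool.map(blocking_handler, range(REQUESTS)))
    return time.perf_counter() - start

async def _run_all():
    await asyncio.gather(*(async_handler() for _ in range(REQUESTS)))

def run_async():
    start = time.perf_counter()
    asyncio.run(_run_all())
    return time.perf_counter() - start

blocking_seconds = run_blocking()
async_seconds = run_async()
print(f"blocking: {blocking_seconds:.2f}s, async: {async_seconds:.2f}s")
```

With 20 requests and only 4 threads, the blocking version has to work through 5 batches of waits, while the async version overlaps all 20 waits. The same effect is what shows up as queued requests on a server whose CPU is mostly idle.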
Google and Yahoo recommendations:
https://developers.google.com/speed/docs/insights/rules
https://developer.yahoo.com/performance/rules.html
If you follow all of this advice, I am sure you will get some improvements!

Out of Memory Exception - ASP.NET - IIS 7

The problem is with Memory management because I keep receiving “Out of Memory exception”.
Here are the scenarios where we face the problem:
Please note:
1. The site/application is developed in ASP.Net and uploaded on a server with the following specs:
- Windows Server 2008 (R2) Standard
- Intel Xeon L5520 @ 2.27GHz 2.27GHz
- RAM = 8GB
- System Type = 64bit
The application is an event management web application where the requirements include saving a huge amount of data in Sessions etc. (mentioning this in case it is relevant).
The applications/site works fine until we:
Edit a file directly on the server
Update a file from repository
Copy/Paste a file (we don’t usually edit code using this technique)
Please note, all of the above hold true ONLY when the traffic to the site is high; that is,
the "Out of Memory" error is not produced when the traffic/visits are low.
Details of:
System Properties > Advanced > Performance Settings > Advanced tab
Total paging file size for all drives: 16362 MB
In web.config
Is there any way we can debug this problem to the core and find a solution? Can you please provide links/help where we can further investigate this problem?
Best regards,
Farrukh
Out of Memory Exceptions are common with applications that see periodic transaction surges while keeping larger volumes of data in memory. This problem does, however, depend on your application and architecture. Below are a few pointers:
Hardware - you have Xeon 5500 (Intel Nehalem chips). These are very good at handling memory. You should be good here.
OS - Windows Server 2008 R2 - As an OS this system will handle more than enough memory for you (you are good here, see link for capabilities: Memory Limits for Windows)
Physical Memory - Did you say you have 8 GB on the server? Note your app is allowing 16 GB. There is one issue. If your app requests more memory than is physically available you will see your error. But this is not your only concern ...
CLR / GC limitations - Your application has a "paging file size" of 16+ GB. This is probably your issue.
GC is the heart of your problem. In terms of why, it is the same reason Java and the JVM have issues whenever an application exceeds 2-4 GB. That requires a look at the actual process of GC.
You have "old generation" and "young generation" Garbage Collection processes. As your app runs the CLR tries to keep your memory space organized. These processes force all threads to pause (phase changes) when GC mark and swap processes occur. The problem here is, depending on how your code is written and the amount of memory you keep around for long periods, you can run into memory issues.
Any time you press a runtime environment to exceed the 4 GB threshold you will see exponential increases in collection times. When you hit the "stop the world" pause (the old gen GC where everything gets cleaned up) the CLR has to go through the entire heap and de-allocate memory. Based on your app, 16 GB may give you issues even with more physical memory (Windows Server 2008 R2 - Enterprise or DataCenter can support 2 TB). Even if you feed it more physical memory you may see LONG collection times when your full GC hits.
Ideally I would do the following:
Get more physical memory (you never want to come within 600MB of your total physical memory allocated to your application to avoid out of memory errors, but your buffer does depend on your load and the application's ability to handle it ... you may want a larger safety net to be safe).
Once you have the physical memory you need, run GC logs while stressing the app. This will give you an idea where you see exponential degradation in performance and what level your app can support when considering heap size (memory). You may want to find a way to get your 16GB page down to a smaller size. I do know with .Net 4.0 Microsoft has made some solid improvements to the GC process, including allowing a background thread to maintain GC. This should give you the ability to support larger heaps (in theory) ... but nothing beats real tests on the app. Check out this link for more info:
Garbage Collection Performance (Asp.net 4.0) - Also, as I am limited on links. Navigate to the Fundamentals page for some great explanations on new GC features of ASP.Net 4.0
(http://msdn.microsoft.com/en-us/library/ee787088.aspx#concurrent_garbage_collection)
Hope this helps!
PS - Anyone out there on lesser hardware will need to be aware of the ASP.NET use of the GC thread. If you are running something in development like a Core Duo you have to consider that 50% of your compute power will go to GC optimization. This means that Hardware (number of cores) is important to consider. If you have more than you need this process should theoretically help performance. If you are constrained on cores either get better hardware or use an older version of ASP.Net or consider turning the feature off (if possible). Second, if latency is a concern, using "hyper-threading" does have an impact on performance as well. You always get better performance on "physical" cores ... but that will not be a concern for 99.9% of the applications out there.
2 GB by default. If the application is large address space aware (linked with /LARGEADDRESSAWARE), it gets 4 GB (see http://msdn.microsoft.com/en-us/library/aa366778.aspx)
They're still limited to 2 GB since many applications depend on the top bit of pointers being zero.
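The 2 GB and 4 GB figures in the two comments above follow from pointer width. A minimal Python sketch of the arithmetic, assuming a 32-bit process; the function name and parameters are hypothetical and only serve the illustration:

```python
GIB = 2 ** 30

def user_address_space_gib(pointer_bits, top_bit_reserved):
    """Addressable user space in GiB for a given pointer width.

    When the top bit must stay zero, only pointer_bits - 1 bits are usable.
    """
    usable_bits = pointer_bits - 1 if top_bit_reserved else pointer_bits
    return (2 ** usable_bits) / GIB

# Default case: top bit kept at zero.
print(user_address_space_gib(32, True))   # 2.0
# Large-address-aware case: all 32 bits usable.
print(user_address_space_gib(32, False))  # 4.0
```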

w3wp.exe has high cpu usage on every request

I'm running a Windows 2008 server (a VPS with 1GB of RAM), with SQL Server Express and IIS 7 installed. On it I'm hosting a NopCommerce 1.7 website, with a database of around 26 000 products.
Right now I'm the only user of the website (it's in development) and I'm getting rather bad performance from it. To be more specific, every time I make a request, the worker process goes to 90-100% CPU usage for a few seconds. Is it just me, or is this a lot for a 1 user NopCommerce website? Any ideas why this happens and what I can do to rectify it or further investigate?
PS: the worker process uses between 100MB-400MB of memory (private working set), and SQL Server with this database, around 160MB. Do you have any suggestions other than the obvious one to get more RAM? I intend to get one more GB but I fear this will not solve the CPU usage problem.
You've already stated you're going to get more RAM, but don't be surprised by how much a lack of RAM can impact the CPU. If your RAM is not able to hold large objects efficiently because of lack of space (and I'd say using 40% of available RAM qualifies), then the CPU has to work harder to page things in and out of virtual memory. 90% is a little overkill for this, but with the server specs you give it's not impossible.
The most likely problem is that there is a hole in your code somewhere. My guess is that you have either an infinite loop or a direct memory leak (resources open during requests that aren't closed perhaps?). Your best bet would be to get the IIS Debug Diagnostics tool, install it and set up reports to find out what is going on directly on the server.

aspnet_wp keeps recycling because of high memory consumption. How can I fix it?

I have a small WCF service which is executed on an XP box with 256 MB of RAM running in a VM.
When I make a request (with a request size of approximately 5 MB) to that service I always get the following message in the event log:
aspnet_wp.exe was recycled because memory consumption exceeded the 153 MB (60 percent of available RAM).
and the call fails with error 500.
I've tried to increase memory limit to 95% but it still takes up all the available memory and fails in the same manner.
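For reference, the thresholds implied by the two percentages above can be worked out directly. This is a minimal Python sketch of the arithmetic only; the function name is hypothetical, and the 256 MB figure is the RAM size stated in the question.

```python
def recycle_threshold_mb(total_ram_mb, memory_limit_percent):
    """Memory consumption (in MB) at which the worker process is recycled."""
    return total_ram_mb * memory_limit_percent / 100

print(recycle_threshold_mb(256, 60))  # 153.6
print(recycle_threshold_mb(256, 95))  # 243.2
```

The first value matches the 153 MB reported in the event log message. At 95% the limit rises to roughly 243 MB, which leaves only about 13 MB of the VM's RAM for everything else.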
It looks like something is wrong with my app (I do not reuse byte[] buffers and maybe something else) but I cannot find root cause of such memory overuse.
Profiling showed that all CLR objects that I have in memory together do not take up that much space.
Doing a dump analysis with windbg showed same situation - nothing that big in object heap.
How can I find out what is contributing to such memory overuse?
Is there any way to make a dump right before process is recycled (during peak mem usage)?
Tess Ferrandez's blog "If broken it is, fix it you should" has lots of hints, tips and recommendations for sorting out exactly this sort of problem.
Of particular use to you would be Lab 3: Memory, where she walks you through working out what has caused all the memory on your machine to disappear.
Could be a lot of things, hard to diagnose this one. Have you watched perfmon to see if the memory usage peaks on the aspnet process or on the server itself? 256MB is pretty low, but it should still be able to handle it. Do you have a swap file on this machine? At what point do you take the memory dump? Have you stepped through the code, and does it work on other machines? Perhaps it is getting stuck in a loop and leaking memory until it crashes?
