Dealing with huge traffic - online ticket website - asp.net

Backgournd:
Asp.net 4.0 based ecommerce website. Hosted on cloud setup with a dedicated SQL server 2008 standard (16 core) with 32gb RAM.
User interaction:
Users visit the website.
browse through different categories (no static content yet)
put product in cart
A timer ticks for up-to 15 minutes.
Checkout
Login/Create account
Payment is processed using Authorize.Net gateway (the user stays on our website)
Emails are shot upon signup/forgot password/order completion using a third party SMTP provider.
Note: Availability of tickets is checked when the product page is loaded and when they are putting in cart. Once they have a product in cart, surely a timer is ticking for 15 minutes. Querying database every 25 seconds to update the timer.
Incident:
Okay guys, we had a huge sale last week probably put about 10000 tickets for sale for the fans across USA. We saw the traffic when beyond control and for 2-4 hours we saw there were about 1000 concurrent users on our website.
The Problem:
The problem was we had about 6 2gb cloud servers which quickly filled and then crashed due to enormous traffic. Then we had to spin up 4gb, 8gb and 16gb servers (2 each) to deal with the traffic. during the crash period which was about 15-20 minutes the website became unresponsive and also we saw database (dedicated one) was touching 100% CPU usage.
gb is the RAM capacity of the servers.
The framework:
The .net code is written very efficiently, it only executes two SQL statements to fetch and build all necessary data that needs to be rendered on the browser. All business logic that deals with the database is written in stored procedures. No cursors, no dynamic sql in the stored procedure.
Required:
I am unable to understand why the website is crashing... I have lots of code analysis tools implemented that keep telling us which code part is taking too long or which query is taking so much time. When we have bigger servers (8gb or more) then the website is working smoothly.
Should I eliminate the need of hammering database every page load? Like what about having static pages? (though it will need us to export the products info into html which is fine).
What about if I store the pages in Lucene.Net index? and then render from it? Will the I/O cost a lot in this scenario?
I really want some expert opinion about how to tackle this? We initially had plan to deal with 25k concurrent users, but we see we would require lots of big cloud servers to handle that.
Thanks

Should I eliminate the need of hammering database every page load?
Like what about having static pages? (though it will need us to export
the products info into html which is fine).
You don't need to convert products to html, or any third part code to do this. Asp.NET have built-in support for output cache.
Webforms:
<%# OutputCache Duration="60" VaryByParam="none" %>
MVC:
[OutputCache(Duration=60, VaryByParam="none")]
public ActionResult Index()
{
return View();
}
Where Duration is the duration for which the page will be cached in seconds, and VaryByParam is the url parameter that acts as key for that page . It will be cache the page for each different parameter provided, so you'd normally leave none for the index and ProductId for specific product page)
But you'll have to investigate further, as this may not be the only reason of your site's slowdown.

What do your queries look like? You say business logic is in stored procs but are you using dynamic sql, cursors or full-text indexing in those procs? All are possible causes of high CPU.
Lucene.NET could only help if you're using sql full-text indexes in which case it has been shown to be more efficient. But only in the case where searching is your bottleneck.
Caching can help popular pages and reduce the load on your database as mentioned by #Andre but watch your cache hit/miss ratio and which pages you're caching. For example, you'll get a lot of bang for your buck on the Categories and Products pages, but you'll end up using more memory for less benefit (if any) on your user-specific shopping cart pages.
If you're displaying real-time ticket availability on those popular pages that could be really hurting you a lot if you're hitting the database to get that number. Try increasing the latency on those updates and do your validation later in the process if that's the case.

Related

ASP Website runs slow when number of users Increases

I need some information from you.I have used session.TimeOut=540 in application.Is that effects on my Application performance after some time.When number of users increases its getting very slow. response time nearly more that 2 minutes for a button click also.This is hosted in server in Application pool .I don't know about Application pool much.If Session Timeout is the problem i will remove it.Please suggest me the way to for more users.
Job Numbers,CustomerID,Tasks will come from one database.when the user click start Button then the data saved in another Database.I need this need to be faster for more Users
I think that you have some page(s) that make some work that takes time, or for some reason or a bug is keep open for more time than the usual.
This page is keep lock the session and hold the rest page from response because the session holds all the pages.
Now, together with the increase of the timeout this page is lock everything and here is you response time near to 2 minutes.
The solution is to locate the page that have the long running problem and fix it or make it faster by optimize the process, or if this page must keep the long time running, then disable the session for that one.
relative:
Web app blocked while processing another web app on sharing same session
What perfmon counters are useful for identifying ASP.NET bottlenecks?
Replacing ASP.Net's session entirely
Trying to make Web Method Asynchronous
Does ASP.NET Web Forms prevent a double click submission?
About server
Now from the other hand, if your server suffer from hardware, or bad setup then here is one other answer with points that you need to check to make it faster.
Find out where the time is spent
add the StopWatch in the method which you said "more that 2 minutes for a button click". you can find which statment spent the most time.
If it is a query on DB that cost time. Check your sql statement.
are you using "SELECT Count(*)" instead of "SELECT Count(Id)"? the * is always slower. also, don't try "SELECT * FROM...."
Use cache.
there are many ways to do cache. both in ASPX pages and your biz layer.
the OutputCache is the most easy way.
and also, cache the page (for example a blog post) on the first time when a user visit it.
Did you use memory paging?
be careful when doing paging on gridview or other list. If you just call DataSource=xxx and DataBind(), even with PagedDataSource, this is likely a memory paging. It cost a lot of performance. Please use stored procedures to do paging.
Check your server environment
where did you deploy the website? many ISP will limit brandwide and IIS connection count and also CPU time to your account.
if you have RD access to your server. you can watch CPU and memory usage to see if they are high when many user comes to your site. If the site is slow and neither CPU nor memory useage is high, it may be a network brandwide problem.
Here are some simple steps to narrow down the issue -
1) Get HTTPWatch (theres a free Basic version) available and check whats really taking time from an end user perspective. Look at number of requests, number of resources downloaded, and the payload. If there is nothing to worry move on to next
2) If its not client, then its usually the processing time on the server. Jump on to DB first - since this is quite easier to eliminate quickly. Look at how many DB calls are made (run profiler in staging or dev) and see if there are any long running queries, missing indexes or statistics, and note the IO. If all is well, move on
3) Check your app code. You could get on with VS.NET in build profiler or professional tools such as Ants. If code is fine then its your network or external calls that you make, check your network bandwidth. If you still cannot narrow down, check your environment/hardware
The best way to get to it is to apply load - You could use simple tools such as ab.exe (that comes as part of Apache Web server) to have concurrent hits on your server and run the App, DB profilers in the background to get to the issue.
Hope this helps!

When to use load balancing?

I am just getting in to the more intricate parts of web development. This may not be in the best place. However, when is it best to get load balancing for a web project? I understand that it depends on good design/bad design as to how many users you can get to visit a site without it REALLY effecting the performance. However, I am planning to code a new project that could potentially have a lot of users and I wondered if I should be thinking off the bat about load balancing. Opinions welcome; thanks in advance!
I should not also that the project most likely will be asp.net (webforms or mvc not yet decided) with backend of mongodb or pgsql(again still deciding).
Load balancing can also be a form of high availability. What if your web server goes down? It can take a long time to replace it.
Generally, when you need to think about throughput you are already rich because you have an enormous amount of users.
Stackoverflow is serving 10m unique users a month with a few servers (6 or so). Think about how many requests per day you had if you were constantly generating 10 HTTP responses per second for 8 hot hours: 10*3600*8=288000 page impressions per day. You won't have that many users soon.
And if you do, you optimize your app to 20 requests per second and CPU core which means you get 80 requests per second on a commodity server. That is a lot.
Adding a load balancer later is usually easy. LBs can tag each user with a cookie so they get pinned to one particular target. You app will not notice the difference. Usually.
Is this for an e-commerce site? If so, then the real question to ask is "for every hour that the site is down, how much money are you losing?" If that number is substantial, then I would make load balancing a priority.
One of the more-important architecture decisions that I have seen affect this, is the use of session variables. You need to be able to provide a seamless experience if your user ends-up on different servers during their visit. Session variables won't transfer from server to server, so I would avoid using them.
I support a solution like this at work. We run four (used to be eight) .NET e-commerce websites on three Windows 2k8 servers (backed by two primary/secondary SQL Server 2008 databases), taking somewhere around 1300 (combined) orders per day. Each site is load-balanced, and kept "in the farm" by a keep-alive. The nice thing about this, is that we can take one server down for maintenance without the users really noticing anything. When we bring it back, we re-enable our replication service and our changes get pushed out to the other two servers fairly quickly.
So yes, I would recommend giving a solution like that some thought.
The parameters here that may affect the one the other and slow down the performance are.
Bandwidth
Processing
Synchronize
Have to do with how many user you have, together with the media you won to serve.
So if you have to serve a lot of video/files to deliver, you need many servers to deliver it. Let say that you do not have, what is the next think that need to check, the users and the processing.
From my experience what is slow down the processing is the locking of the session. So one big step to speed up the processing is to make a total custom session handling and your page will no lock the one the other and you can handle with out issue too many users.
Now for next step let say that you have a database that keep all the data, to gain from a load balance and many computers the trick is to make local cache of what you going to show.
So the idea is to actually avoid too much locking that make the users wait the one the other, and the second idea is to have a local cache on each different computer that is made dynamic from the main database data.
ref:
Web app blocked while processing another web app on sharing same session
Replacing ASP.Net's session entirely
call aspx page to return an image randomly slow
Always online
One more parameter is that you can make a solution that can handle the case of one server for all, and all for one :) style, where you can actually use more servers for backup reason. So if one server go off for any reason (eg for update and restart), the the rest can still work and serve.
As you said, it depends if/when load balancing should be introduced. It depends on performance and how many users you want to serve. LB also improves reliability of your app - it will not stop when one system goes crashing down. If you can see your project growing to be really big and serve lots of users I would sugest to design your application to be able to be upgraded to LB, so do not do anything non-standard. Try to steer away of home-made solutions and always follow good practice. If later on you really need LB it should not be required to change your app.
UPDATE
You may need to think ahead but not at a cost of complicating your application too much. Do not go paranoid and prepare everything to work lightning fast 'just in case'. For example, do not worry about sessions - session management can be easily moved to SQL Server at any time and this is the way to go with LB. Caching will also help if you hit some bottlenecks in the future but you do not need to implement it straight away - good design (stable interfaces), separation and decoupling will allow for the cache to be added later on. So again - stick to good practices, do not close doors but also do not open all of them straight away.
You may find this article interesting.

Dealing with larger traffic on ASP.net web site

I have a asp.net web site for our company and handles about 1000 - 2000 users every day. Now the site will have about 4-5000 users every day. We are putting it to two servers and put them in the hardware load balanced environment.
I am wondering if there is anything else I should do from the ASP.net web site perspective to handle the larger users.
Thanks.
Some things I'd take into consideration..
Session state management - are you going to do it out-of-process? If so, make sure everything being stored in Session is serializable.
Do you have a large number (or any? some may argue) update panels being used or many standard server-side postbacks? If so, try to convert what you can to simple AJAX requests and marshal raw/JSON data back and forth. This will minimize on the number of full page life cycles and data traffic on the server.
On the front-end/UI side, try to leverage CSS sprites, so that you go to the server for the images once and never again.
For database connectivity, make sure you leverage connection pooling.
You may also want to consider js and css minification.
Additionally, these pages has some good tips:
http://msdn.microsoft.com/en-us/magazine/cc163854.aspx (a bit outdated, but still somewhat relevant)
http://developer.yahoo.com/performance/rules.html
First of all you should profile your application against bottleneck - if there is any place in your code which makes your application slow then adding new servers won't help. There are many profilers - I recommend JetBrains Dot Trace (there is a free trial for couple of days).
Second thing is OutputCache - the shortest explanation is "store html that is sent to the users, not recreate it every time. There is a huge number of articles about OutputCache so I don't think you need any link here.
If the traffic is even bigger you can think about using some solution for caching your responses around the world (read e.g. about Akamai) but I don't suppose you will need it with couple thousands of visitors daily.

Running a query in Page Load a bad idea?

I'm running an ASP.NET app in which I have added an insert/update query to the [global] Page_Load. So, each time the user hits any page on the site, it updates the database with their activity (session ID, time, page they hit). I haven't implemented it yet, but this was the only suggestion given to me as to how to keep track of how many people are currently on my site.
Is this going to kill my database and/or IIS in the long run? We figure that the site averages between 30,000 and 50,000 users at one time. I can't have my site constantly locking up over a database hit with every single page hit for every single user. I'm concerned that's what will happen, however this is the first time I have attempted a solution like this so I may just be overly paranoid.
Do it Async.
Create a dll that handles the update, and in the page load do a fire and forget with parameters.
Insert-Based designs have less locking than Update-Based designs.
So if a user logged-in and then logged-out, in an Insert-Based design you would have multiple rows with a SessionID in each, one for each activity whereas in an Update-Based design, you would have a SessionId, LoginTime and a LogoutTime column and you would update the LogoutTime based on the SessionId.
I have seen many more locking and contention problems caused by Update activity more than Insert activity.
Activities such as counting and linking logins to logouts etc take more complex queries and a little more resources.
It goes without saying that your queries, especially the ones that run on every page, should be as fast as possible so that the site doesn't appear slow to users.
To keep track of how many users are currently on your site you could use performance counters. What you describe though sounds more like a full fledged logging of every page hit.
Lets say you realy have 50k users connected at any one time.
As long as you don't have contention between the updates (trying to lock the same record) a database can track a very high number of inserts and updates. You need to do some capacity planning to assure the load can be carried. 50k users visiting a page every minute will give you 50k inserts and 50k updates per minute, roughly 850 inserts and 850 updates per second, which have to commit (flush the log). Does your DB I/O subsytem support such a write pressure load, in addition to responding to all the requests (reads)?
Also 50k users doing 1 page hit per minute adds up to 72 mil hits per day, 72 mil. logging inserts, at such a rate you need to carefully plan the size capacity of the database and consider what kind of analysis you'll do on the collected data since querying ad-hoc 2 billion rows (one month data) will get you nowhere fast (actually... quite slow).
Doing it async can give you some relief over very short spikes, but not on the long run. If your DB system cannot handle the load then doing async calls will just create a backlog queue in the application process (in the ASP app pool) and this will grow until out of memory, at which moment the all vigilant IIS will 'recycle' the app pool, thus loosing all pending async updates.
I think updating the database in the begin session and end session will do the job. that will reduce the count of statements dramatically.
I think it makes no difference if you track hits or begin/end session. with hits you'll also need additional logic to subtract inactive users
EDIT: session end is not fired always. I would suggest to call an update statement/stored procedure in another session begin event (in addition to the other insert statement) that will fix invalid sessions.
I don't think that calling this "fix routine" is necessary in every page load event because I think you cant exactly count "current no. of visitors".
I would keep this in Application state instead - if possible. On ApplicationStart create some data structure saved to App state that you can update from anywhere in your application - session start, page load, wherever. Keep it out of the database. You are just using it to track "currently online" info anyway it sounds like.
If you have multiple instances of your app, or if there is a requirement to maintain historical info beyond the IIS logs, this won't work obviously. Go with chris' fire-and-forget solution in that case.
What's wrong with IIS Logs?
2009-05-01 12:30:31 207.219.27.35 GET /assocadmin/ibb-reg.asp - usernameremoved 544.566.570.575 Mozilla/4.0+(compatible;+MSIE+7.0;+Windows+NT+6.0;+SLCC1;+.NET+CLR+2.0.50727;+Media+Center+PC+5.0;+.NET+CLR+3.5.30729;+.NET+CLR+3.0.30618) 200 0 0 40058
EDIT: I'd like to close this answer, but I want the comments to stay. Consider this answer withdrawn.
How about adding a small object to the session?
Something like LoggedInUserFlag:IDisposable
In the constructor, increment your counter however you decide to implement it.
Then in the Dispose method, decrement the counter.
This way, regardless of how the session is ended, the counter will always be (eventually) decremented.
see:
http://weblogs.asp.net/cnagel/archive/2005/01/23/359037.aspx
for info on using IDisposable.
I am not an ASP guy at all, but what about rather than logging all that other info, and insert their IP address?
If they have an IP address already in there, have a last_seen timestamp, and on each refresh just delete any row that isn't 10 minutes ago?
This is how I would take a shot at it. It is much more space efficient, but I am not sure about the checking and deleting so much on such a high profile site.
As a direct answer to your question, yes, running a database query in-line with every request is a bad idea:
Synchronous requests will tie up a thread, which will reduce your scalability (fewer simultaneous activities)
DB inserts (or updates) are writes to the DB, which will put a load on your log volume
DB accesses shouldn't be required in a single server / single AppPool scenario
I answered your question about how to count users in the other thread:
Best way to keep track of current online users
If you are operating in a multi-server / load-balanced environment, then DB accesses may in fact be required. In that case:
Queue them to a background thread so the foreground request thread doesn't have to wait
Use Resource Governor in SQL 2008 to reduce contention with other DB accesses
Collect several updates / inserts together into a single batch, in a single transaction, to minimize log disk I/O pressure
Return the current count with each DB access, to minimize round-trips
In case it's of any interest, I cover sync/async threading issues and the techniques above in detail in my book, along with code examples: Ultra-Fast ASP.NET.

Running a Asp.net website with MS SQL server - When should i worry about scalability?

I run a medium sized website on an ASP.net platform and using MS SQL server to store the data.
My current site stats are:
~ 6000 Page Views a day
~ 10 tables in the SQL server with around 1000 rows per table
~ 4 queries per page served
The hosting machine has 1GB RAM
I expect by the end of 2009 to hit around:
~ 20,000 page views
~ 10 tables and around 4000 rows per table
~ 5 queries per page served
My question is should I plan for scalability right now itself? Will the machine hold up till the end of the year with the expects stats.
I know my description is very top level and does not provide insight into the kind of queries etc. But just wanted to know what your gut instinct tells you?
Thanks!
You should always plan for scalability. When to put resources into doing the actual scaling is usually the tough guess.
Will the machine hold up until the end
of the year
Way too little information to answer this. If a page request takes 30 CPU seconds to process due to massive interaction with a legacy enterprise application through the four queries per page - then there's no way. If it's taking miniscule fractions of a second to serve some static content stored in the cache and your queries are only executed every half hour to refresh the content - then you're good until 2020 at the rate of traffic growth you describe.
My guess is that you're somewhere closer to the latter scenario. 20,000 page hits a day is not really a ton of traffic, but you'll need to benchmark your page and server performance at some point so that you can make the calculations you need.
Things to look at for scaling your site when it is time:
Output Caching
Optimizing Viewstate
Using Ajax where appropriate
Session optimization
Request, script, css and html minification
Two years ago I saw a relatively new (for two years ago) laptop running IIS and serving up 1100 to 1200 simple dynamic page requests per second. It had been set up by a consulting firm whose business was optimizing ASP.Net websites, but it goes to show you how much you can do.
Essentially, by the end of 2009, you expect to do 100,000 SQL queries per day. This is about 1.157 queries per second.
I am making the assumption that your configuration is "normal" (i.e. you're not doing something funky and these are pretty straightforward SELECT, UPDATE, INSERT, etc), and that your server is running RAID disks.
At 4,000 rows per table this is nothing to SQL server. You should be just fine. If you wanted to be proactive about it, put another stick of RAM in the server and bring it up to at least 2GB, that way IIS and SQL have plenty of memory (SQL will certainly take advantage of it).
The hosting machine? Does this mean that you have IIS and SQL installed on the same box or IIS on your host machine with a dedicated SQL Server provided by your hosting company? Either way I would suggest starting to take a look at how you might implement a caching layer to minimize the hits (where possible) to the database. Once this is PLANNED (not necessarily implemented) I would then start to look at how you might build a caching layer around your output (things built in ASP.NET). If you see a clear an easy path to building caching layers...then this is a quick and easy way to start to minimize request to the database and work on your web server. I suggest that this cache layer be flexible...read not use anything provided by .NET! Currently I still suggest using MemCached Win32. You can install it on your one hosted local box easily and configure your cache layer to use local resources (add memory...1gb is not enough). Then if you find that you really need to squeeze every little bit of performance out of your system...splurge for a second box. Split your cache between your current box...and the new box (allowing you to keep more in cache). This will give you some room (and time) to grow. Offloading to more cache should help address any future spikes...and with the second box you can now also focus on making your site work in farmed environment. If you are using local session..push that into your cache layer so that a request from one box or another won't matter (standard session is local to the box that it is managed on).
This is a huge subject...so without real details this is all speculation of course! You might be just right for adding better and more hardware to the existing installation.
Have you tried setting up a quick performance test using sample data? 20,000 page views is less than one/sec (assuming even distribution over 8 hours), which is pretty minimal given your small tables. Assuming you're not sending a ton of data with each page view (i.e. a data table with all 1000 rows from one of your tables), you are likely OK.
You may need to increase RAM, but other than running a performance test I wouldn't worry too much about performance right now.
I don't think the load you are describing would be too much of a problem for most machines. Of course it doesn't just depend on the few metrics you outlined but also on query complexity, page size, and a heap of other things.
If you worry about scalability do some load testing and see how your site handles, say 10000 page views per hour (about 3 views per second). It's mostly always good to plan ahead as long as you plan for probable scenarios.
Guts say: Given 10 tables with 4,000 rows each and assuming about 2KB of data per row is only 80MB for the entire database. Easily cached within memory available. Assuming everything else about the application is equally simple, you should be able to easily serve hundreds of pages per second.
Engineers say: If you want to know, stress test your application.

Resources