I would like to ask you what is the best setup for a following application:
ASP.NET 3.5 Web site - used as a presentation layer, a lot of AJAX and JS. Will not hit the server a lot.
ASP.NET WCF - sevice providing all data to the application. It's responsible for validation, data modeling / preparing and communication with the DB Server.
Database - SQL Server 2005 Std, some logic is coded on the server side as stored procedures. Some of the logic can be a bit time consuming. In my opinion it's the most resource consuming part of the app.
The website can have up to 1000 users per minute. We can have up to 4 servers in the following configuration: Intel Bi Xeon Quad 8x 2.00+ GHz, 16 GB RAM, SSD or RAID drives.
What is the best way to place parts of the application on the physical servers? Will they handle this kind of load?
The less scalable place in any application is database server, you can add more web and application servers but you can't replicate DB with the same ease so you will benefit in a long run if DB will not contain any logic especially any long running logic. In a lot of the applications limiting factor is not cpu but memory think about user sessions if you store 1mb of data per user you applications will be able to support 64,000 silmantanius user sessions with you machines it may be sufficient or not. Both problems can be mitigated by using application level caching but this can cause it own set of problems because now you faced with stale data. To scale session based sites you will need to use smart load balancer solution that supports sticky sessions, for your loads most likely you will need hardware load balancer.
In the application you describe, I suspect that thread management is going to be a big issue. Throwing hardware at the problem may not be the best approach.
In terms of partitioning, it depends on whether you can leverage things like caching and cache notifications. If every call to the app has to hit the DB and run a lengthy stored procedure, then you may want to have more DB machines and fewer front-end web servers.
This is a big subject. In an attempt to provide a reasonably comprehensive answer to exactly this kind of question, I ended up writing a book about it: Ultra-Fast ASP.NET: Build Ultra-Fast and Ultra-Scalable web sites using ASP.NET and SQL Server.
Related
Our client requirement is to develop a WCF which can withstand with 1-2k concurrent website users and response should be around 25 milliseconds.
This service reads couple of columns from database and will be consumed by different vendors.
Can you suggest any architecture or any extra efforts that I need to take while developing. And how do we calculate server hardware configuration to cope up with.
Thanks in advance.
Hardly possible. You need network connection to service, service activation, business logic processing, database connection (another network connection), database query. Because of 2000 concurrent users you need several application servers = network connection is affected by load balancer. I can't imagine network and HW infrastructure which should be able to complete such operation within 25ms for 2000 concurrent users. Such requirement is not realistic.
I guess if you simply try to run the database query from your computer to remote DB you will see that even such simple task will not be completed in 25ms.
A few principles:
Test early, test often.
Successful systems get more traffic
Reliability is usually important
Caching is often a key to performance
To elaborate. Build a simple system right now. Even if the business logic is very simplified, if it's a web service and database access you can performance test it. Test with one user. What do you see? Where does the time go? As you develop the system adding in real code keep doing that test. Reasons: a). right now you know if 25ms is even achievable. b). You spot any code changes that hurt performance immediately. Now test with lots of user, what degradation patterns do you hit? This starts to give you and indication of your paltforms capabilities.
I suspect that the outcome will be that a single machine won't cut it for you. And even if it will, if you're successful you get more traffic. So plan to use more than one server.
And anyway for reliability reasons you need more than one server. And all sorts of interesting implementation details fall out when you can't assume a single server - eg. you don't have Singletons any more ;-)
Most times we get good performance using a cache. Will many users ask for the same data? Can you cache it? Are there updates to consider? in which case do you need a distributed cache system with clustered invalidation? That multi-server case emerging again.
Why do you need WCF?
Could you shift as much of that service as possible into static serving and cache lookups?
If I understand your question 1000s of users will be hitting your website and executing queries on your DB. You should definitely be looking into connection pools on your WCF connections, but your best bet will be to avoid doing DB lookups altogether and have your website returning data from cache hits.
I'd also look into why you couldn't just connect directly to the database for your lookups, do you actually need a WCF service in the way first?
Look into Memcached.
I am working on asp.net (newbie) and I am trying to understand what it means to do "load balancing" for the web site. The website will be used by multiple users and resources (database, web service,..).
If anyone could help me understanding the concept of the load balance for asp.net web site, I would really appreciate it.
Thanks.
One load-balancing-related issue you may want to be aware of at development time: where you store your session state. This MSDN article gives a good overview of your options.
If you implement your asp.net system using "out-of-process" or "sql-server-mode" session state management, that will give you some additional flexibliity later, if you decide to introduce a load-balancer to your deployed system:
Your load balancer needn't handle session affinity. As one poster mentioned above, all modern load-balancers handle it anyway, so this is a minor consideration in any case.
Web-gardens (a sort of IIS/server-implemented load-balancer) REQUIRES use of "out-of-process" or "sql-server-mode" session state management. So if your system is already configured that way, you'll be one step closer to being able to use web-gardens.
What is it?
Load balancing simply refers to distributing a workload between two or more computers. As a concept, it's not unique to asp.net. Although having separate machines for your database and web server could be called "load balancing" it more commonly refers to using multiple machines to serve a single role, such as having multiple web servers.
Should you worry about it? Probably not. Do you already have a performance problem? Are your database and web server on their own machines? If you do find that your server resources are strained, it would probably be easier to scale up (a more powerful single machine) than out (load balancing). These days, a dedicated box can handle a LOT of traffic if your code is decent.
Load Balancing, in the programming sense, does not apply to ASP.NET; it applies to a technique to try to distribute server load across two or more machines, rather than it all being used on one machine. Unless you will have many thousands (millions?) of users, you probably do not need to worry about it.
Check the Wikipedia article for more information.
Load balancing is not specific for any on technology stack be it asp.net, jsp etc. To load balance is to spread the incoming requests to a web site over more than one server. This is typically done with a software or hardware load balancer. The load balancer sits in front of two or more web servers and delegates the incoming traffic. Although this technique is not limited to web servers. Load Balancing
Enjoy!
I've never used it, but an option is IIS Application Request Routing.
IIS Application Request Routing (ARR)
2.0 enables Web server administrators, hosting providers, and Content
Delivery Networks (CDNs) to increase
Web application scalability and
reliability through rule-based
routing, client and host name
affinity, load balancing of HTTP
server requests, and distributed disk
caching
In a typical web server/database scenario, the db is almost always guaranteed to load up the machine first. This is because dealing with storing data requires more resources. Before you even start looking at load balancing your web server, you need to think about how to load balance the database.
Spreading one database across multiple servers is a lot harder than load balancing a web server. One of the techniques that can be used is sharding (or horizontal partitioning). This is where some records are stored on one server, and other records - on another server. For example records with ID 1-900000 are on server 1 and records 900001- are on server 2.
In comparison to DB load balancing, spreading the load across multiple ASP.NET servers is not overly complicated. Most of the session issues can be easily mitigated by using out of process session and/or never talking to Application.Cache directly. Data load balancing on the other hand is hard and requires a lot of planning and trial and error. In most cases, talking to a load balanced DB requires using an ORM which supports it (e.g. NHibernate) or your own Data Access Layer. The reason being is that you need to take out establishing a connection from the code that uses the database, so that the decision which DB to talk to is handled in one place.
the exact solution is to save session into the SQL Server with Stored Procedure. To read session call 'SessionCheck' stored Procedure.
I'd add that it really isn't something to worry about. By the time you need a load balancer, you can probably afford one of the neato newfangled ones with sticky sessions so you don't even have to deal with the session boogeyman.
I am not talking about application profilers or debuggers but more specific to managing the applications in production environment. So essentially monitor, identify bottlenecks, deploy fixes.
For monitoring the application is up and running we use Nagios.
We also use good old performance monitor for monitoring database connections, memory consumption and CPU usage.
We use IPMonitor to verify uptime, and it has a lot of options for pinging the site for keyword validation, HTTP response validation, and response time. You can also use SNMP to figure out responsiveness of the processor and RAM, and remaining size on hard disks, among many other options. It supports multiple servers and types of servers, not just website or database.
Additionally, we test basic uptime and response speed with AlertSite.
A 3rd party, Keynote, tests our sites to verify that they are navigable like a human would browse. They have scripts to mimic clicks and interactions.
We use Spotlight for SQL server management, and also good old perfmon for the granular problem fixing.
We recently purchased WildMetrix to monitor and troubleshoot performance issues for our ASP.NET applications. It's nice because you can easily aggregate IIS, ASP.NET, and SQL Server information into a single graph or dashboard that allows you to pinpoint possible trouble spots. We currently use it for as our primary performance reporting and track tool, along with ELMAH for Exception Tracking.
We have 22 HTTP servers each running their own individual ASP.NET Caches. They read from a read only DB that is only updated off peak hours.
We use a file dependency to invalidate the cache, prompting the servers to "new up" their caches...If this is accidentally done during peak hours, it risks bringing down our DB cluster due to the sudden deluge of open connections.
Has anyone used memcached with ASP.NET in this distributed form? It seems to me that it would offer a huge advantage of having to only build up one cache (and hit the DB 21 times less), while memcached would handle distributing it on each box.
If you have, do you place it on the same box as the HTTP boxes, or do you run a separate cache tier? How well does it scale, can we expect it to need powerful servers? Our working dataset is not huge (We fit it into 4 gigs of memory on each HTTP box just fine).
How do you handle invalidation?
Looking for experiences and war stories.
EDIT: Win2k3, IIS6, 64-bit servers...4 gigs per box (I believe, we may have upped it to 16 gigs when we changed to 64-bit servers).
"memcached would handle distributing it on each box"
memcached does not distribute or replicate a cache to each box in a memcached farm. The memcached client basically hashes the key and chooses a cache server based on that hash. When one of the memcached servers fail you will lose whatever cached items existed on that server, however, the client will recognize the failure and begin writing values to a different server. This being the case, your code needs to account for missing items in the cache and reset them if necessary.
This article discusses the memcached architecture in more detail: How memcached works.
Best practice (according to the memcached site) is to run memcached on the same box as your web server app or else you're making http calls (which isn't all that bad, but it's not optimal). If you're running a 64-bit app server (which you probably should if you're going to be running memcached), then you can load up each of the servers with loads of memory and it will be available to memcached. There's not much in the way of CPU resources used by memcached, so if your current app server isn't very taxed, it will remain that way.
Haven't used them together, but I've used them both on separate projects.
Last I saw the documentation explicitly said that sharing with the web server was ok.
Memcache really only needs RAM and if you take your asp.net cache out of the equation how much RAM is you web server actually using? Probably not much. It won't compete much with your web server for CPU and it doesn't need disk at all. You might consider segmenting off the network traffic (if you don't already) from the incoming web requests.
It worked well and was fast I didn't have any problems with it.
Oh, invalidation was explicit on the project I used it on. Not sure what other modes there are for that.
If you want to get replication accross your memcached servers then it maybe worth a look at repcached. It's a patch for memcached that handles the replication part.
Worth checking out Velocity, which is a distributed cache provided by Microsoft. I cannot give you a point-by-point comparison to memcached, but Velocity is integrated with ASP.NET and will continue to get more development and integration.
I'm working on a .net portal which would be having lots of concurrent users.
so scalability,performance need to be addressed in the design and architecture.
We plan to use load balancing in the application.
Keeping this in mind,what would be the best way of communicating between IIS web server(hosting aspx,aspx.cs files) and application server (hosting .net assemblies like business logic and data access layer)?
Should it be .net remoting or soap web service?or is there any other approach?
Thanks.
Is there another approach? Yes - don't distribute your objects.
The most scalable approach is to NOT to distribute your objects away from each other. Ask yourself, why do you want to deploy one flavor of code to an "app server" while another flavor of code goes to a "web server"? The communication that goes on between those two layers, if they are distributed, will be much much much much (etc etc) more expensive than a local call.
With today's 64-bit servers, with all of that memory, and the hot CPUs, and with ASP.NET's superior memory management, why not put your business logic and DAL on the same physical machine as the ASPX files? Why not?
If you need to scale, add more servers. Simple.
There are good reasons, of course, to distribute. The most common good reasons have to do with domains of ownership - along several axes: security management, or even budget and control. In other words, to take the latter case, if team is responsible for running the business logic and a separate team is responsible for building and running the web layer -then it may make sense to distribute those two things to allow independence of management. Most of the good reasons for distributing computer code, have their origins in the structures of the human organizations using or developing the code.
There is no good technical reason why a web page should not run on the same CPU, sharing the same CLR VM and memory heap, as the database access layer.
Regardless what you do with distribution, it would be unwise to architect your system with less-than-formal interfaces defining the connections between the layers. If you keep formal interfaces, then it should be no problem for you to measure the perf and efficiency of a distributed approach versus a co-located approach.
Do you really need an app server? Just how big are you talking exactly? For example Stackoverflow.com has ~50k uniques a day and doesn't have an app server so I assume you are talking much bigger than that? Most performance bottle necks come down to database issues so I would concentrate on that.
I suggest you take a look at the Patterns and Practices groups guidelines for performance, more specifically Chapter 6 - Improving ASP.NET Performance of the guideline. I agree with Cheeso that you should seriously consider NOT physically splitting your application layer and UI layer if you can. The P&P guideline has the following notes:
Avoid Unnecessary Process Hops
Although process hops are not as expensive as machine hops, you should avoid process hops where possible. Process hops cause added overhead because they require interprocess communication (IPC) and marshaling. For example, if your solution uses Enterprise Services, use library applications where possible, unless you need to put your Enterprise Services application on a remote middle tier.
Understand the Performance Implications of a Remote Middle Tier
If possible, avoid the overhead of interprocess and intercomputer communication. Unless your business requirements dictate the use of a remote middle tier, keep your presentation, business, and data access logic on the Web server. Deploy your business and data access assemblies to the Bin directory of your application. However, you might require a remote middle tier for any of the following reasons:
You want to share your business logic between your Internet-facing Web applications and other internal enterprise applications.
Your scale-out and fault tolerance requirements dictate the use of a middle tier cluster or of load-balanced servers.
Your corporate security policy mandates that you cannot put business logic on your Web servers.
If you absolutely have to split the application logic up anyways, you could use WCF as a transport mechanism. I'm not sure how it stacks up against remoting when it comes to performance. However, I seem to remember that this is the guideline Microsoft is pushing.
Clemens Vasters (Technical Lead for the Microsoft .NET Service Bus) talks about WCF vs. Remoting in this answer on MSDN forums.
Learn to write asynchronously.
Explore the CCR runtime for example.
Each thread that is blocked waiting for IO responses is one less available to your system.
Turn off 'idealised logging' leave the ability to switch it back on via admin console. But logging is often a hidden bottle neck.
CACHE CACHE CACHE!
If it was expensive to get the data the first time, don't pay for it the second!
Avoid ASP.net's Session State - This can seriously bloat and lead to a large slow down in page responsiveness.
Modify the http headers to specify short browser caching (5sec - 20sec) (Depends on the nature of the content)
Utilise GZIP while you are at it!
AND USE LOTS OF RAM
Here are my tips
1)Move all your static files - images , css, js to a load balancer like nginx. This will greatly reduce the load on IIS server and it will have enough free resources to serve the main request.
2)Think about caching and avoiding database access altogether.
3)Try to implement REST principles are far as possible.
4)Keep session state to a bare minimum - if possible avoid it altogether.
There are some good performance and scalability points in these articles from Omar Al Zabir.
10 ASP.NET Performance and Scalability Secrets
and
99.99% available ASP.NET and SQL Server SaaS Production Architecture
(also check out his book Building a Web 2.0 Portal with ASP.NET 3.5)