Since this question is from a user's (developer's) perspective I figured it might fit better here than on Server Fault.
I'd like ASP.NET hosting that meets the following criteria:
The application seemingly runs on a single server (so no need to worry about e.g. session state or even static variables)
There is an option to scale storage, memory, DB size and CPU-power up and down on demand, in an "unlimited" way
I have researched this, but there does not seem to be a platform that completely abstracts the underlying architecture away and thus combines the ease of use of simple shared hosting with "unlimited" scalability.
"Single server" and "scalability" are mutually exclusive, I'm afraid. But a good load balancer will apply affinity to requests, so you don't needlessly double-cache data on multiple servers.
However, well-designed web applications are easy to port to a multiple-server scenario.
I think your best option is something like Windows Azure Websites (separate from Azure Web Workers), which run on a VM you don't have access to. The VM itself provides as much power as is necessary to run your website, so you don't need to worry about allocating extra CPU power or RAM.
Things like SQL Server are handled separately, but are very cheap to run, and you can drag a slider to give yourself more storage space.
This can still be accomplished with a cloud host like www.gearhost.com. Apps live in the cloud and by default get one worker node, so session stickiness is maintained. You can then scale that application to larger workers to get the capacity you need, all while maintaining HA and LB. Beyond that, you can add multiple web workers. Each visitor is tied to a particular node to maintain session state, even though you might have 10 workers, for example. It's an easy and cheap way to scale a site from 100 visitors to many millions in just a few clicks.
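To make the idea of tying each visitor to a node concrete, here is a minimal sketch in Python (used only for illustration; the thread itself is about ASP.NET). It is not how GearHost or any particular load balancer works: the helper name `pick_worker` and the choice of hashing a session ID are assumptions made for this example.

```python
import hashlib


def pick_worker(session_id: str, worker_count: int) -> int:
    """Map a session ID to a worker index in a stable way (hypothetical helper)."""
    if worker_count < 1:
        raise ValueError("worker_count must be at least 1")
    # Use a stable hash: Python's built-in hash() of a str is randomized
    # per process, so it would not give the same answer on every node.
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % worker_count
```

With a fixed number of workers, the same session always lands on the same worker, so in-process session state keeps working. Note that with this simple modulo scheme, changing the worker count sends many sessions to a different worker; real load balancers often rely on an affinity cookie instead.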
Related
I am developing an ASP.NET application that will be hosted as an Azure web app. Part of the app will continuously record multiple web-based cameras by retrieving a snapshot every N seconds. I would like to design the app so that the processes that record the cameras can be run on multiple instances. I would like it to load balance between all instances, but not duplicate effort for any one camera.
For example, if I have 100 cameras, and am running on 2 instances, I want each instance to get 50 cameras to process. If I have 5 instances, each instance should get 20 cameras to process. As I add cameras or scale instances up/down I would like for the system to load balance the work evenly.
If it's feasible, I would rather not spin up dedicated VMs just for processing cameras, due to increased cost.
I'm somewhat familiar with Akka.NET, Hangfire, and WebJobs, but am unclear if these will help in this scenario. I have used Hangfire and WebJobs to do background processing, but not with this sort of load-balancing requirement. Will these or some other framework or tool help me load balance these background tasks evenly across Azure Web App Instances? How should I go about setting up these or another framework to do this?
I honestly don't think you want to try to "balance" the servers. I think you just want to make sure the work is well distributed. If I were you, I would use a queue system like SQS to queue up all of the cameras that need a snapshot and let each instance worker dequeue one at a time and process it.
A good approach could be to have a master server responsible for queueing up the snapshots, and then have all of your worker servers simply work out of this shared queue. Even if one server happens to process more than the others, that is fine, since the others were working out of the same queue. It just means that this server was able to process its jobs more quickly than the others.
To be honest, there are a lot of ways to approach this. You could do something as simple as keeping a shared list of your cameras, with a timestamp for the last snapshot, and work off of that. Each server would request a camera by looking at the list, finding one that was stale, updating the timestamp, and then performing the snapshot for that camera. The downside to something like this is that you are going to struggle with non-atomic operations and the possibility of multiple workers making the request at the same time and both working on the same camera. These are the types of things that a queue system will help you with, because as soon as a queue item is in flight, it is no longer available to other workers. Also, because each server is responsible for removing its items once they are finished, if a server were to crash mid-snapshot, the work would simply go back into the queue.
No matter which solution you choose, it is going to boil down to having a central system/list for serving up stale cameras.
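The "central list of stale cameras" with in-flight items can be sketched as follows. This is a single-process, in-memory stand-in written in Python for illustration only. The class name `CameraQueue`, its methods, and the `interval` and `lease` parameters are all hypothetical. Across several web app instances the `threading.Lock` would have to be replaced by whatever atomic operation the real shared store or queue service provides.

```python
import threading
import time
from typing import Dict, Iterable, Optional


class CameraQueue:
    """In-memory stand-in for a shared queue with leases (illustration only)."""

    def __init__(self, camera_ids: Iterable[str], interval: float, lease: float):
        self._lock = threading.Lock()
        self._interval = interval  # seconds between snapshots of one camera
        self._lease = lease        # seconds a claimed camera stays hidden
        # Time at which each camera next becomes available to a worker.
        self._available_at: Dict[str, float] = {c: 0.0 for c in camera_ids}

    def claim(self, now: Optional[float] = None) -> Optional[str]:
        """Atomically hand out one stale camera, or None if nothing is due."""
        now = time.time() if now is None else now
        with self._lock:
            for camera, due in self._available_at.items():
                if due <= now:
                    # Hide the camera until the lease expires. If the worker
                    # crashes, the camera becomes claimable again on its own.
                    self._available_at[camera] = now + self._lease
                    return camera
        return None

    def complete(self, camera: str, now: Optional[float] = None) -> None:
        """Mark the snapshot as done and schedule the next one."""
        now = time.time() if now is None else now
        with self._lock:
            self._available_at[camera] = now + self._interval
```

Each worker loops over `claim()`, takes the snapshot, and calls `complete()`. No worker is assigned a fixed share of cameras; faster workers simply claim more often, which is the kind of distribution described above.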
The Azure WebJobs SDK uses the Storage Account you set up to balance the work between the various instances that are running your jobs. You can gain finer control by using a queue to divide up the work that needs doing, and then scale your App Service Plan based on the queue length.
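As a rough illustration of scaling on queue length, the arithmetic might look like the Python sketch below. The function name `desired_instances`, the per-instance capacity, and the minimum and maximum bounds are assumptions for this example; in practice this would normally be expressed as an autoscale rule on the hosting platform rather than hand-written code.

```python
import math


def desired_instances(queue_length: int, per_instance: int,
                      min_instances: int = 1, max_instances: int = 10) -> int:
    """Return how many instances are needed for the current backlog.

    per_instance is the (assumed) number of queued items one instance
    can work through per scaling interval.
    """
    if per_instance < 1:
        raise ValueError("per_instance must be at least 1")
    needed = math.ceil(queue_length / per_instance)
    # Clamp to the allowed range so the plan never scales to zero or runs away.
    return max(min_instances, min(max_instances, needed))
```

For example, with 100 queued cameras and an assumed capacity of 50 per instance, this yields 2 instances; with a capacity of 20 it yields 5.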
We have a web-application product that we sell to companies and host on our servers.
The product contains a couple of web applications, Windows services and a SQL Server database.
Right now we have only one client that uses our product. We have two servers: one for the web apps and services, and the other for the database.
In order to add the product for another client, we have to 'duplicate' all the apps and the database and run them separately.
As we start expanding, and because some companies will require more server power than others, I need to plan the server infrastructure.
Having two servers for each client sounds ridiculous. Hosting costs will be huge. What will happen when I have 10 clients? And some clients will probably need more power than others, leaving some servers running at 30% of their capacity while others run at 70%.
One thing I really care about is separating the DB from each product so in case of server compromise, only one db will be at risk.
So... I thought about Virtual Machines...
Does that sound right?
Do I need two super servers to hold virtual machine instances? (one for web and other for db?)
What about Load balancing / etc..?
Will it require more maintenance time only because I use virtual machines?
Are there any hardware recommendations?
Any help will be appreciated
Many thanks
Virtual machines are definitely the safest way to separate clients, and they will give you the flexibility to allocate a specific percentage of resources to specific clients.
However, using separate processes on the same physical machine will perform better (but not always significantly) and will allow more dynamic use of resources (i.e., if one spikes, it will use the resources it needs). This setup will not allow you to control the resource allocation nearly as easily though. You'll also have to build your own monitoring tools to see and analyze what processes (clients) are using what resources (piggyback on perfmon).
Using separate processes also is dangerous if your application wasn't designed for this. Anywhere the application caches data on the file system or accesses anything besides memory and the database needs to be thoroughly scrubbed to make sure data from clients is not co-mingled or shared.
Separate virtual machines are more work to manage, since each one is pretty much like its own computer. So you have to manage all the VMs plus the physical machine.
You may also want to consider hosting in a more dynamic environment like Amazon AWS or Microsoft's Azure which will allow you to more easily scale up/down as necessary than a VM at a traditional host.
Our client's requirement is to develop a WCF service that can withstand 1-2k concurrent website users, with a response time of around 25 milliseconds.
This service reads a couple of columns from a database and will be consumed by different vendors.
Can you suggest an architecture, or any extra steps I need to take while developing it? And how do we calculate the server hardware configuration needed to cope with this load?
Thanks in advance.
Hardly possible. You need a network connection to the service, service activation, business logic processing, a database connection (another network connection), and a database query. With 2000 concurrent users you need several application servers, so the network path also goes through a load balancer. I can't imagine a network and hardware infrastructure able to complete such an operation within 25 ms for 2000 concurrent users. Such a requirement is not realistic.
I guess if you simply try to run the database query from your computer against a remote DB, you will see that even such a simple task will not complete in 25 ms.
A few principles:
Test early, test often.
Successful systems get more traffic
Reliability is usually important
Caching is often a key to performance
To elaborate: build a simple system right now. Even if the business logic is very simplified, if it's a web service with database access you can performance test it. Test with one user. What do you see? Where does the time go? As you develop the system and add real code, keep running that test. Reasons: a) you know right now whether 25 ms is even achievable; b) you immediately spot any code changes that hurt performance. Now test with lots of users: what degradation patterns do you hit? This starts to give you an indication of your platform's capabilities.
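A minimal harness for this kind of "test with one user, then with many" measurement could look like the Python sketch below (Python is used only for illustration). The names `timed_call` and `run_load_test` are hypothetical, and `operation` stands for whatever call you want to measure, for example one request to the service. It is a rough sketch, not a substitute for a proper load-testing tool.

```python
import math
import statistics
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def timed_call(operation: Callable[[], object]) -> float:
    """Return the wall-clock duration of one call, in milliseconds."""
    start = time.perf_counter()
    operation()
    return (time.perf_counter() - start) * 1000.0


def run_load_test(operation: Callable[[], object],
                  concurrent_users: int, calls_per_user: int) -> Dict[str, float]:
    """Call the operation from many threads and summarize the latencies."""
    def one_user(_: int) -> List[float]:
        return [timed_call(operation) for _ in range(calls_per_user)]

    # One thread per simulated user; each user issues its calls sequentially.
    with ThreadPoolExecutor(max_workers=concurrent_users) as pool:
        per_user = list(pool.map(one_user, range(concurrent_users)))

    latencies = sorted(ms for user in per_user for ms in user)
    p95_index = max(0, math.ceil(0.95 * len(latencies)) - 1)
    return {
        "calls": len(latencies),
        "median_ms": statistics.median(latencies),
        "p95_ms": latencies[p95_index],
        "max_ms": latencies[-1],
    }
```

Running it first with `concurrent_users=1` and then with increasing user counts shows where latency starts to degrade, and re-running it after each code change makes regressions visible early.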
I suspect that the outcome will be that a single machine won't cut it for you. And even if it will, if you're successful you get more traffic. So plan to use more than one server.
And anyway for reliability reasons you need more than one server. And all sorts of interesting implementation details fall out when you can't assume a single server - eg. you don't have Singletons any more ;-)
Most times we get good performance using a cache. Will many users ask for the same data? Can you cache it? Are there updates to consider? If so, do you need a distributed cache system with clustered invalidation? There's that multi-server case emerging again.
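As a sketch of the caching idea, here is a small time-based cache in Python (for illustration only). The class name `TtlCache` and the `loader` callback are hypothetical; `loader` stands for the real database lookup. It is per process and not thread-safe, so it shows only the single-server case and not the distributed invalidation raised above.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TtlCache:
    """Cache values for a fixed time, then reload them (single process only)."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get_or_load(self, key: str, loader: Callable[[str], Any]) -> Any:
        """Return the cached value, calling loader only on a miss or expiry."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = loader(key)
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key: str) -> None:
        """Drop a key after an update so the next read reloads it."""
        self._entries.pop(key, None)
```

If many users ask for the same data, most requests are served from memory and only one lookup per key per TTL window reaches the database.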
Why do you need WCF?
Could you shift as much of that service as possible into static serving and cache lookups?
If I understand your question 1000s of users will be hitting your website and executing queries on your DB. You should definitely be looking into connection pools on your WCF connections, but your best bet will be to avoid doing DB lookups altogether and have your website returning data from cache hits.
I'd also look into why you couldn't just connect directly to the database for your lookups, do you actually need a WCF service in the way first?
Look into Memcached.
My Company is running several international websites for selling insurance products.
Our current setup is a Webfarm with multiple Loadbalanced Webservers hosting our ASP.NET applications. The backend is a single - yet powerful - SQL Server. (all in one data center)
Our network admins want to move to virtual servers running on VMWare.
Scenarios could be
1. Webfarm: Multiple standard webservers, Loadbalanced (current setup), Session state on SQL Server
2. Virtual Webfarm: Multiple virtual servers, loadbalanced on one physical VMWare Host, Session state on SQL Server
2a. Same as above but with multiple physical hosts
3. Single Virtual Webserver: One big powerful virtual webserver, no loadbalancing required, session state can be kept in process
There is a big hype around virtualization and I can see the benefits, but have no experience with this. I cannot tell what issues we will face and to what we should pay special attention.
Does anyone have experience with such a virtual setup?
What are general recommendations?
I tend towards 2a. I am afraid of having all webservers on one single physical machine.
Many thanks in advance for sharing your thoughts.
There are three reasons to use more than one webserver for an application:
1. Scaling - More grunt is required than one machine can provide
2. Reliability - Website should keep running in case of failure (a. hardware b. software)
3. Prioritization - One of the webservers takes on heavy work (perhaps scheduled tasks) leaving the other to respond to client requests quickly.
Marrying that up to your scenarios:
Scenario 1 provides 1, 2, 3
Scenario 2 provides 2b (perhaps 2a if it is fully hardware redundant (doubt it))
Scenario 2a provides 1, 2
Scenario 3 provides none of the above
Advantages of Virtual Hosting:
Lower total cost of ownership (TCO): one big cluster serving multiple purposes is cost effective
New servers can be created quickly if needed
Redundant hardware is easier to justify if the cost is shared among many applications
Disadvantages:
Other virtual machines may suck away your CPU/Disk IO capacity
IMHO there is little point in load balancing multiple virtual machines on the same physical host.
Robert's pretty much covered it all, I'm mostly just adding a note to say that at least one of our clients is currently running with option 2a.
So we have multiple loadbalanced web servers running on a couple of VM hosts, talking to a non-virtualised SQL cluster - this works quite well for them.
One other advantage of virtualisation is that it allows you to more fully utilise your hardware - however, you need to be aware that if you're running your virtual host at around 90% capacity with multiple VMs, you've not got a lot of spare capacity for any traffic spikes - if you're not expecting any, then great, but if you are, you'll need to have something in place to cope.
I agree with all of the above answers, and I actually work at a webhost. :-) If you're using multiple load-balanced webservers now then I can only assume the reason for it is either
1. Hardware Redundancy: If a single app server fails then those sessions are lost, but the app keeps running on the other servers and users can immediately re-connect.
or
2. Application Load Distribution (it's late so I can't think of a better name): Your traffic dictates that you have multiple app servers, since all of your users together would crash a single app server.
If #1 is the reason, then going to VMWare defeats the purpose, since you only have one server supporting everything, and in case of a hard drive crash, etc., you are down while it is repaired. If #2 is the reason, then a VMWare-based solution MAY work; however, keep in mind that the hardware you'd use would almost necessarily be of a higher caliber than what you're currently using. So you may get more bang for your buck, but you STILL lose the redundancy that multiple physical machines gave you.
Now, you could always combine the two by having multiple physical machines all running VMWare, but that adds a level of complexity to things that you may not necessarily want either.
It doesn't sound like there would be any tangible benefit from running multiple virtual servers on the same physical host; you're just adding overhead. Unless I'm missing something in the way you've described the setup, there wouldn't be any benefit at all from moving to VMware, unless you're looking to take advantage of features such as VMotion.
VMware is most useful for consolidating underutilized hardware. If your hardware is running at near-capacity during peak periods then you don't want to run multiple VMs on the one machine.
There are benefits to virtualization, but your network admins need to prove that there is a benefit for your company before you even consider switching. I would say if you have multiple apps running on dedicated servers with low traffic (i.e. each app has its own physical server), then sure, virtualize. If you have one app spread over many servers, then don't.
You should be able to use virtual machine hosts with multiple VMs per host and load balance across all of them.
Microsoft is doing this with MSDN and TechNet: http://virtualization.info/en/news/2008/05/microsoft-migrates-msdn-and-technet-on.html
I have a large ASP.NET website on a hosted platform. It shares the machine with a lot of other applications. We do not have access to the machine itself (only an FTP account).
Our client is complaining that it is starting to perform rather badly, particularly around peak hours. I've run some remote measurements (using a JMeter-like tool) that tell me that, yes, it does indeed perform rather badly during peak hours. They don't tell me why, though. The client is resisting a move to a dedicated server without some hard facts.
As I see it, what I need are hard data about the machine itself. Setting up a local performance test environment would be extremely time-consuming, and I have no way to estimate the server performance.
My question: is there a good way to collect (a lot of) performance measurements when I have limited access to the machine, and certainly no access to the performance monitor? Any code would have to run in the ASP.NET application itself, without screwing it up too much.
We had a similar problem with our asp.net application hosted on a shared server, which also started to perform badly during peak hours.
Although I don't know of an elegant solution to your question, this is what we did:
Talk to your host providers to see what additional information they can give you - it's in their best interest to keep their clients happy. Our host providers were able to give us some time with one of their network engineers who provided us with some decent CPU and memory utilization stats.
Take your own performance measurements by dumping information to a log file (using log4net) and/or the database: for example, user sessions, search times, page hits, and timing measurements around key functionality. From this information we were able to ascertain what our system's normal behavior was for a set number of automated tests.
Set up a local server (not necessarily with the same specs as the hosted/production server) with your application loaded and give it full load/performance/capacity testing (we used Red Gate's ANTS Profiler). The stats that you gather from that will give you and your client a good indication of how the system should behave under certain loads in a known environment. Yes, this can be time consuming, but it will give you a great performance measuring tool so that you can catch and fix bottlenecks locally rather than on production.
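The "timing measurements around key functionality" step can be sketched as below. The answer above used log4net inside the ASP.NET application; this Python version only illustrates the pattern. The logger name `perf` and the helper `timed` are assumptions for the example.

```python
import logging
import time
from contextlib import contextmanager

# Hypothetical logger name; route it to a file or database handler as needed.
logger = logging.getLogger("perf")


@contextmanager
def timed(operation_name: str):
    """Log how long the wrapped block took, even if it raises."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        logger.info("%s took %.1f ms", operation_name, elapsed_ms)
```

Wrapping a key operation as `with timed("search"): ...` adds one log line per call. Collected over a day, these lines show whether the slow periods line up with the application's own load or with something else on the shared machine.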
Good luck.