I am currently in the design phase of a system made up of several servers, and I need to work out the best way to store session information so that each server can access it. Session information is basically a GUID for the session ID and a list of user group names.
The main client applications are web and mobile apps.
I have the following configuration:
A master server where all users log in and the session object is filled with the user group information that corresponds to the user. There could be up to 10,000 users at peak log in.
Slave servers which contain archiving content; most users would then talk directly to the slave servers via the UI. The slave servers need the session information which is initially determined on the master server.
One option was to move the session data for each log in to the slave servers and cache it on the slave, then each slave could work independently and it will not need to reference the master server.
The other option is to have a central database which contains the session information, but since our databases are on each server (we don't have a separate machine to act as a database server), each slave will have a remote connection string back to the master server database. No doubt this will slow things down if I have to query the database remotely from the slave server.
I then have the situation where I need to clean up sessions, but in general I don't expect more than 25 MB of data at peak log in.
We could have up to 10 slave servers.
What is the best solution?
For IIS 7.0 here is a technet article that outlines two approaches (a session server or using SQL server):
http://technet.microsoft.com/en-us/library/cc754032(v=ws.10).aspx
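Whichever store ends up holding the data, the record described in the question is small: a session GUID mapped to a list of group names, plus some way of expiring it. As a rough sketch of that shape only (Python is used purely for brevity; `SessionCache`, `ttl_seconds` and the group names are hypothetical and are not part of IIS or ASP.NET), an in-memory store with idle expiry and a cleanup pass might look like this:

```python
import time
import uuid


class SessionCache:
    """In-memory session store with idle expiry (illustrative only)."""

    def __init__(self, ttl_seconds=1200, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so expiry can be tested
        self._sessions = {}      # session id -> (groups, last access time)

    def put(self, session_id, groups):
        self._sessions[session_id] = (tuple(groups), self._clock())

    def get(self, session_id):
        entry = self._sessions.get(session_id)
        if entry is None:
            return None
        groups, last_seen = entry
        if self._clock() - last_seen > self._ttl:
            del self._sessions[session_id]   # expired: treat as logged out
            return None
        # Sliding expiry: every successful read extends the session.
        self._sessions[session_id] = (groups, self._clock())
        return groups

    def purge_expired(self):
        """Remove idle sessions; returns how many were dropped."""
        now = self._clock()
        stale = [sid for sid, (_, seen) in self._sessions.items()
                 if now - seen > self._ttl]
        for sid in stale:
            del self._sessions[sid]
        return len(stale)


session_id = str(uuid.uuid4())
cache = SessionCache(ttl_seconds=1200)
cache.put(session_id, ["Archivists", "Readers"])
print(cache.get(session_id))
```

Calling `purge_expired` from a timer covers the cleanup requirement; a real deployment would still need to decide how a slave obtains a session it has not seen yet.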
I question the need to have 10 web servers running with 10 separate databases running on them. Granted, I don't know anything about the application you're writing and there may very well be a good reason for it.
Here's how I'm seeing it (with my admitted limited knowledge of your application).
10,000 possible concurrent users hitting 1 authentication server that will then redirect them to one (or more?) of 10 servers could potentially cause bottlenecks. What if a majority go to one of the servers? What if a large number all try to log in at the same time?
Here's a stab at a different architecture:
[LoadBalancer]
-------------------------------------------------------------------------
[WebServer] [WebServer] [WebServer] -------------> [SessionServer]
[LoadBalancer]
-------------------------------------------------------------------------
[AppServer] [AppServer] [AppServer] [AppServer] -------^
-------------------------------------------------------------------------
[DBServer]
[DBServer](backup)
I write that not knowing what class of machines these are; they may not be suitable to be a DB server.
Well, it's early here and I'm only on my second cup of coffee. That may or may not be helpful, I hope it is.
Related
I need to build a Windows service in VB.NET under Visual Studio 2003. This Windows service should read a flat file (a huge file of about a million records) from a local folder and upload it to the corresponding database table. This should be done inside a database transaction so that it can be rolled back. While transferring data to the table, the service should also be listening for additional client requests. If a client requests cancellation in the meantime, the service should roll back the transaction and give feedback to the client. The service also writes continuously to two log files about the status and the error records.
My client is an ASPX page (a website).
Can somebody help explain how to organize and achieve this functionality in a Windows service (processing and listening for additional client requests simultaneously, e.g. a cancellation request)?
Also, could you suggest the ideal way of achieving this (for example, whether it is best to implement it as a web service, a Windows service, a remote object or some other way)?
Thank you all for your help in advance!
You can architect your service to spawn "worker threads" that do the heavy lifting, while it simply listens for additional requests. Because future calls are likely to have to deal with the current worker, this may work better than, say, architecting it as a web service using IIS.
The way I would set it up is: service main thread is listening on a port or pipe for a communication. When it gets a call to process data, it spawns a worker thread, giving it some "status token" (could be as simple as a reference to a boolean variable) which it will check at regular intervals to make sure it should still be running. Thread kicks off, service goes back to listening (network classes maintain a buffer of received data so calls will only fail if they "time out").
If the service receives a call to abort, it will set the token to a "cancel" value. The worker thread will read this value on its next poll and get the message, rollback the transaction and die.
This can be set up to have multiple workers processing multiple files at once, belonging to callers keyed by their IP or some unique "session" identifier you pass back and forth.
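The worker-plus-status-token idea above can be sketched as follows. This is only an illustration in Python rather than VB.NET: `load_records` and `run_job` are made-up names, `threading.Event` plays the role of the "status token", and an in-memory SQLite database stands in for the real one.

```python
import sqlite3
import threading


def load_records(records, cancel, result, check_every=1000):
    """Insert records in one transaction; roll back if cancel is set."""
    conn = sqlite3.connect(":memory:")   # stand-in for the real database
    conn.execute("CREATE TABLE items (value TEXT)")
    conn.commit()
    try:
        for i, value in enumerate(records):
            # Poll the token at regular intervals, not on every row.
            if i % check_every == 0 and cancel.is_set():
                conn.rollback()
                result["status"] = "cancelled"
                break
            conn.execute("INSERT INTO items (value) VALUES (?)", (value,))
        else:
            conn.commit()                # loop finished without a cancel
            result["status"] = "committed"
        result["rows"] = conn.execute("SELECT COUNT(*) FROM items").fetchone()[0]
    finally:
        conn.close()


def run_job(records, cancel):
    """Start the worker and return immediately so the caller can keep listening."""
    result = {}
    worker = threading.Thread(target=load_records, args=(records, cancel, result))
    worker.start()
    return worker, result


cancel = threading.Event()
worker, result = run_job((f"row {n}" for n in range(5000)), cancel)
# The listening thread would call cancel.set() when a cancel request arrives.
worker.join()
print(result)
```

The polling interval is a trade-off: checking on every row makes cancellation prompt but adds overhead, while checking every few thousand rows delays the rollback slightly.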
You can design your service the way FTP works. FTP uses two ports, one for commands and another for data transfer.
You can consider two classes, one for command parsing and another for data transfer, each one on separate threads.
Use a communication channel (like a privileged queue) between threads. You can use System.Collections.Concurrent if you move to .NET 4.0, along with more threading features such as CancellationToken.
WCF has advantages over a web service, but comparing it to a Windows service needs more details about your project. In general WCF is easier to implement compared to a Windows service.
Our client's requirement is to develop a WCF service which can withstand 1-2k concurrent website users, with a response time of around 25 milliseconds.
This service reads a couple of columns from a database and will be consumed by different vendors.
Can you suggest an architecture, or any extra effort I need to make while developing it? And how do we calculate the server hardware configuration needed to cope with this load?
Thanks in advance.
Hardly possible. You need a network connection to the service, service activation, business logic processing, a database connection (another network connection) and a database query. With 2000 concurrent users you need several application servers, so the network path also goes through a load balancer. I can't imagine a network and hardware infrastructure able to complete such an operation within 25 ms for 2000 concurrent users. Such a requirement is not realistic.
I guess if you simply try to run the database query from your computer against a remote DB, you will see that even such a simple task will not complete in 25 ms.
A few principles:
Test early, test often.
Successful systems get more traffic
Reliability is usually important
Caching is often a key to performance
To elaborate: build a simple system right now. Even if the business logic is very simplified, if it's a web service with database access you can performance test it. Test with one user. What do you see? Where does the time go? As you develop the system and add real code, keep doing that test. Reasons: a) right now you learn whether 25 ms is even achievable; b) you immediately spot any code changes that hurt performance. Now test with lots of users: what degradation patterns do you hit? This starts to give you an indication of your platform's capabilities.
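A repeatable timing test of this kind does not need much. The sketch below, in Python for brevity, times a callable many times and reports percentiles against the 25 ms budget; `fake_service_call` is a placeholder with a made-up sleep, and you would swap in a real request to the service.

```python
import statistics
import time


def measure_latency_ms(call, runs=200):
    """Time call() repeatedly and return latency samples in milliseconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        call()
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples


def summarize(samples):
    """Return the median, 95th percentile and maximum of the samples."""
    cuts = statistics.quantiles(samples, n=100)   # 99 cut points
    return {"p50": statistics.median(samples),
            "p95": cuts[94],
            "max": max(samples)}


def fake_service_call():
    time.sleep(0.002)   # placeholder for the real request


report = summarize(measure_latency_ms(fake_service_call, runs=50))
print(report, "meets 25 ms budget:", report["p95"] <= 25.0)
```

Looking at the 95th percentile rather than the average matters here, because a response-time requirement is usually broken by the slow tail long before the mean moves.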
I suspect that the outcome will be that a single machine won't cut it for you. And even if it will, if you're successful you get more traffic. So plan to use more than one server.
And anyway for reliability reasons you need more than one server. And all sorts of interesting implementation details fall out when you can't assume a single server - eg. you don't have Singletons any more ;-)
Most times we get good performance using a cache. Will many users ask for the same data? Can you cache it? Are there updates to consider? In which case, do you need a distributed cache system with clustered invalidation? That multi-server case is emerging again.
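For a single server, the caching idea can be as small as the sketch below (Python, illustrative only). `ReadThroughCache`, the `loader` callback and the vendor keys are hypothetical names; the loader stands in for the database query, and `invalidate` is what an update path would call. A multi-server setup would need the distributed invalidation mentioned above, which this does not attempt.

```python
import time


class ReadThroughCache:
    """Serve repeated lookups from memory; fall back to loader on a miss."""

    def __init__(self, loader, ttl_seconds=30.0, clock=time.monotonic):
        self._loader = loader        # e.g. a function that queries the database
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}           # key -> (value, expiry time)
        self.hits = 0
        self.misses = 0

    def get(self, key):
        entry = self._entries.get(key)
        if entry is not None and entry[1] > self._clock():
            self.hits += 1
            return entry[0]
        self.misses += 1
        value = self._loader(key)    # the slow path
        self._entries[key] = (value, self._clock() + self._ttl)
        return value

    def invalidate(self, key):
        """Call this when the underlying row is updated."""
        self._entries.pop(key, None)


fake_db = {"vendor-1": ("Acme", "GB"), "vendor-2": ("Globex", "US")}
cache = ReadThroughCache(fake_db.get, ttl_seconds=30.0)
for _ in range(3):
    cache.get("vendor-1")
print("hits:", cache.hits, "misses:", cache.misses)
```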
Why do you need WCF?
Could you shift as much of that service as possible into static serving and cache lookups?
If I understand your question 1000s of users will be hitting your website and executing queries on your DB. You should definitely be looking into connection pools on your WCF connections, but your best bet will be to avoid doing DB lookups altogether and have your website returning data from cache hits.
I'd also look into why you couldn't just connect directly to the database for your lookups. Do you actually need a WCF service in the way in the first place?
Look into Memcached.
I am working on asp.net (newbie) and I am trying to understand what it means to do "load balancing" for the web site. The website will be used by multiple users and resources (database, web service,..).
If anyone could help me understanding the concept of the load balance for asp.net web site, I would really appreciate it.
Thanks.
One load-balancing-related issue you may want to be aware of at development time: where you store your session state. This MSDN article gives a good overview of your options.
If you implement your ASP.NET system using "out-of-process" or "sql-server-mode" session state management, that will give you some additional flexibility later, if you decide to introduce a load balancer to your deployed system:
Your load balancer needn't handle session affinity. As one poster mentioned above, all modern load-balancers handle it anyway, so this is a minor consideration in any case.
Web gardens (a sort of IIS/server-implemented load balancer) REQUIRE the use of "out-of-process" or "sql-server-mode" session state management. So if your system is already configured that way, you'll be one step closer to being able to use web gardens.
What is it?
Load balancing simply refers to distributing a workload between two or more computers. As a concept, it's not unique to asp.net. Although having separate machines for your database and web server could be called "load balancing" it more commonly refers to using multiple machines to serve a single role, such as having multiple web servers.
Should you worry about it? Probably not. Do you already have a performance problem? Are your database and web server on their own machines? If you do find that your server resources are strained, it would probably be easier to scale up (a more powerful single machine) than out (load balancing). These days, a dedicated box can handle a LOT of traffic if your code is decent.
Load balancing, in the programming sense, is not specific to ASP.NET; it is a technique for distributing server load across two or more machines, rather than having one machine carry it all. Unless you will have many thousands (millions?) of users, you probably do not need to worry about it.
Check the Wikipedia article for more information.
Load balancing is not specific to any one technology stack, be it ASP.NET, JSP, etc. To load balance is to spread the incoming requests to a web site over more than one server. This is typically done with a software or hardware load balancer, which sits in front of two or more web servers and delegates the incoming traffic. The technique is not limited to web servers, though.
Enjoy!
I've never used it, but an option is IIS Application Request Routing.
IIS Application Request Routing (ARR) 2.0 enables Web server administrators, hosting providers, and Content Delivery Networks (CDNs) to increase Web application scalability and reliability through rule-based routing, client and host name affinity, load balancing of HTTP server requests, and distributed disk caching.
In a typical web server/database scenario, the db is almost always guaranteed to load up the machine first. This is because dealing with storing data requires more resources. Before you even start looking at load balancing your web server, you need to think about how to load balance the database.
Spreading one database across multiple servers is a lot harder than load balancing a web server. One of the techniques that can be used is sharding (or horizontal partitioning). This is where some records are stored on one server and other records on another server. For example, records with ID 1-900000 are on server 1 and records with ID 900001 and above are on server 2.
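The ID-range example can be written down as a small routing function. This is a sketch in Python with made-up names (`SHARD_UPPER_BOUNDS`, `SHARD_NAMES`, `shard_for`); the point is only that the range boundaries live in one table and everything else asks that table which server to use.

```python
import bisect

# Inclusive upper bound of the ID range held by each shard, in ascending order.
# The last shard takes every ID above the final bound.
SHARD_UPPER_BOUNDS = [900_000]
SHARD_NAMES = ["server1", "server2"]


def shard_for(record_id):
    """Return the name of the server that holds record_id."""
    if record_id < 1:
        raise ValueError("record IDs start at 1")
    index = bisect.bisect_left(SHARD_UPPER_BOUNDS, record_id)
    return SHARD_NAMES[index]


for record_id in (1, 900_000, 900_001):
    print(record_id, "->", shard_for(record_id))
```

Adding a third server later means adding one bound and one name, although moving the existing rows to match is the hard part and is not shown here.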
In comparison to DB load balancing, spreading the load across multiple ASP.NET servers is not overly complicated. Most of the session issues can be easily mitigated by using out-of-process session state and/or never talking to Application.Cache directly. Data load balancing, on the other hand, is hard and requires a lot of planning and trial and error. In most cases, talking to a load-balanced DB requires using an ORM which supports it (e.g. NHibernate) or your own data access layer. The reason is that you need to take connection handling out of the code that uses the database, so that the decision about which DB to talk to is made in one place.
The exact solution is to save the session in SQL Server with a stored procedure. To read the session, call the 'SessionCheck' stored procedure.
I'd add that it really isn't something to worry about. By the time you need a load balancer, you can probably afford one of the neato newfangled ones with sticky sessions so you don't even have to deal with the session boogeyman.
I would like to know how people deal with logging across multiple web servers. E.g. assume there are 2 web servers and some events during the user's session are serviced from one, some from the other. How would you go about logging events from the session coherently in one place (without e.g. creating single points of failure)? Assume we are using ASP.NET MVC and log4net.
Or am I looking at this the wrong way - should I log separately and then merge later?
Thanks,
S
UPDATE
Please also assume that the load balancers will not guarantee that a session is stuck to one server.
You definitely want your web servers to log locally rather than over a network. You don't want potential network outages to prevent logging operations and you don't want the overhead of a network operation for logging. You should have log rotation set up and all your web servers' clocks synced. When log rotation rolls your log files over to a new file, have the completed log files from each web server shipped to a common destination where they can be merged. I'm not a .NET guy but you should be able to find software out there to merge IIS logs (or those of whatever web server you're using). Then you analyze the merged logs. This strategy is optimal except in the case that you need real-time log analysis. Do you? Probably not. It's fairly resilient to failures (assuming you have redundant disks) because if a server goes down, you can just reboot it and reprocess any log ship, log merge or log analysis operations that were interrupted.
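The merge step itself is simple once every file is already in time order and the clocks agree. As an illustration only (Python, with made-up log lines; it assumes each line starts with an ISO 8601 UTC timestamp in one fixed format, so the timestamps sort as plain text):

```python
import heapq


def merge_logs(*logs):
    """Merge already-sorted iterables of log lines into one ordered stream."""
    # The key is the leading timestamp, i.e. everything before the first space.
    return heapq.merge(*logs, key=lambda line: line.split(" ", 1)[0])


web1 = [
    "2024-01-01T10:00:00.100Z web1 session=abc login",
    "2024-01-01T10:00:02.000Z web1 session=abc view",
]
web2 = [
    "2024-01-01T10:00:01.000Z web2 session=abc search",
]

for line in merge_logs(web1, web2):
    print(line)
```

Because `heapq.merge` reads its inputs lazily, the same function works on open file objects without loading whole log files into memory.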
An interesting alternative:
Have 2 log file appenders.
The first one logs on the local machine. In case of network failure you'll keep this log.
The second one logs to a remote Unix syslog service (which of course needs a very reliable network connection).
I used a similar approach a long time ago and it worked really well; there are a lot of nice tools for analyzing Unix logs.
Normally your load balancing would lock the user to one server after the session is started. Then you wouldn't have to deal with logs for a specific user being spread across multiple servers.
One thing you could try is to have the log file in a location that is accessible by all web servers and have log4net configured to write to it. This may be problematic, however, with multiple processes trying to write to the same file. I have read about NLog which may work better in this scenario.
Also, the log4net FAQ has a question and a possible solution for this exact problem.
StateServer or SQLServer?
What is the best solution for storing ASP.NET session variables?
What are the pros and cons of each?
Is one better than the other in any particular situation?
Here are some thoughts about pros and cons.
I've also added Microsoft Velocity Distributed Caching solution.
Pros for InProc
Fastest option available (it's all in memory/RAM).
Easy to set up (nothing new required in the .config file; I think this is the default behavior).
Most people I believe use this.
Cons for InProc
If the web site (application pool) dies, then all session info is lost.
Doesn't work in a WebFarm scenario -> session information is per app pool only.
Cannot contain non-session information.
Pros for StateServer
In memory/ram, so it's fast (but has some net latency .. read below), so it might not be as fast as Inproc.
Default configuration for a web farm scenario. Multiple iis sites use a stateserver to control the state session info.
Cons for StateServer
Requires the ASP.NET StateServer service to be set to run.
StateServer requires some config tweaking to accept 'remote iis machine' requests.
There's some tiny tiny net latency if the iis request needs to grab/set the session info on another networked machine.
Cannot contain non-session information.
Pros for SqlServer (as a state server)
State is always retained, even after the iis site restarts.
Cons for SqlServer (as a state server)
Slowest solution -> net latency AND hard-drive latency (as the sql server stores the state on the harddisk / reads from the harddisk).
Hardest to setup/configure.
Cannot contain non-session information
Pros for Velocity (or other distributed caching systems)
Can handle more than just session information -> objects, application settings, cache, etc. (This is a very GOOD thing IMO!!)
Can be memory only or persist to a database.
If one 'node' fails, the system still works. (assuming there's 2+ caching nodes)
Cons for Velocity (or other distributed caching systems)
Generally costs $$$
Hardest to set up (have to install stuff, tweak configs, add extra special code).
Has network latency (which is generally nothing) but could have hard disk latency IF the service is persisting the data (eg. to a Sql Server).
I think the assumption would be that you are using a web farm of some sort.
One use of state service is in a Web Garden (multiple worker-processes on the same machine). In this case, you can use load-balancing to keep a user's connection going to a particular server, and have the n worker processes all sharing the same state service.
EDIT: In the web garden + state service or sql server scenario, you also have the benefit of being able to recycle the worker processes on that machine w/o the connected clients losing their session.
I'm not as familiar with using SQL Server as a session state store, but I would think you would gain robustness by using an SQL Server in a cluster. In this case, you could still have multiple worker processes and multiple servers, but you would not have to use a sticky session (server affinity).
And one more note: you can use state service on a second machine and have all servers in the farm hit that machine, but you would then have a single point of failure.
And finally, there are 3rd party (and some home-grown) distributed state-service-like applications. Some of these have performance benefits over the other options, plus the Session_End event will actually fire. (With both State Service and SQL Server session backing, the Session_End event in Global.asax will not fire, though there may be a way of hooking into SQL Server.)
In an n-tier environment, with SQL Server hosting session state you'll create additional network traffic to your back-end, as well as losing some SQL Server resources that will need to now take care of that additional traffic (session-related requests). SQL Server state management is also slower than state server.
However, if your servers go down in some unforeseen incident, SQL Server will most likely maintain the session information, as opposed to a state server.
In my personal experience I had a few problems storing data in session variables. I kept losing the session, and I believe the cause was the antivirus: as it scanned every file on the server, IIS would recompile the site, killing the sessions. (I must say I had no control over that server; I was told to host the app there.)
So I decided to store the session in the SQL Server and everybody is happy now... it is incredibly fast
Take a look at this article for a quick start up
Using a single machine to store state in a web garden means a single point of failure. We use SQL state, but it does add a bit of overhead.
InProc is very fast.
But it has limitations: it can be used on a single system only.
When the system reboots, the information is lost.
It only covers worker processes on the same machine.
StateServer stores the session information on another machine.
A web farm can use the session; for example, multiple worker processes can access the session information from that server.
When that server reboots, the information is lost.
SQLServer mode stores the information in a table. By default it is stored in tempdb.
tempdb is recreated whenever the SQL Server service starts.
So this does not persist the data either. In this scenario we can store the session in our own database using a script; this is called the custom option.