Zero downtime deployments and session state problems

At work we have a fairly good process for zero-downtime deployments of a busy ASP.NET MVC web site. There are two server groups so we can move traffic over, and we write database stored procedures to be backwards compatible and roll them out in advance. Generally, it works well.
However, one issue has just tripped me up. The last deployment included a lot of code namespace changes.
When the build was deployed it broke existing user sessions because the objects in their session state no longer existed in the new codebase, e.g. their session contained App_Code.UserDetails and the site now has a class X.Y.Z.UserDetails.
I understand the problem and why it happened, but what I can't work out is if there's anything that can be done apart from a full site shutdown to ensure there are no active users on the site. Or maybe restarting the state service so everyone gets booted off and has to log in again. Any ideas folks?

After thinking this over I've arrived at two possible solutions:
a) Always deploy web site updates with downtime. Keep the site down long enough for all sessions to expire. This is a safe option technically but unacceptable commercially.
b) Simulate the problem and look at what can be done to address the issue in the exception handler. Although not an ideal end-user experience, it may be better to catch SerializationExceptions on session state and flush the user session. They will have to authenticate again, but it's not as bad as a broken session.
A decision I made upfront is that it's going to be difficult to pick up on live session dependencies during development, there are just too many other things to consider.
I will be investigating (b) as it sounds quite feasible.
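
For what it's worth, here is a rough sketch of the idea behind (b). It is plain Python rather than ASP.NET code, so treat it only as an illustration of the control flow: the type registry, `load_or_flush`, and the JSON session format are all made up for the example. In a real site the equivalent logic would presumably sit wherever session deserialization failures surface.

```python
import json


class UserDetails:
    """The session object as it exists in the *new* build."""
    def __init__(self, name):
        self.name = name


# Hypothetical registry of types the current build can rebuild from session data.
# The old name "App_Code.UserDetails" is deliberately absent.
KNOWN_TYPES = {"X.Y.Z.UserDetails": UserDetails}


class SessionDeserializationError(Exception):
    """Raised when stored session data names a type the current build lacks."""


def load_session(raw):
    """Rebuild session objects from their serialized form."""
    session = {}
    for key, entry in json.loads(raw).items():
        cls = KNOWN_TYPES.get(entry["type"])
        if cls is None:
            raise SessionDeserializationError(entry["type"])
        session[key] = cls(**entry["data"])
    return session


def load_or_flush(store, session_id):
    """Return (session, was_flushed). On a type mismatch, drop the stale
    session so the user is sent back to log in instead of hitting an error."""
    try:
        return load_session(store[session_id]), False
    except SessionDeserializationError:
        store.pop(session_id, None)
        return {}, True


# One session written by the old build, one by the new build.
store = {
    "old": json.dumps({"user": {"type": "App_Code.UserDetails", "data": {"name": "ann"}}}),
    "new": json.dumps({"user": {"type": "X.Y.Z.UserDetails", "data": {"name": "bob"}}}),
}
```

Calling `load_or_flush(store, "old")` discards the stale session and reports that it was flushed, while `load_or_flush(store, "new")` loads normally.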


ASP.NET: Session state has created a session id, but cannot save it because the response was already flushed by the application

An old ASP.NET Web Forms application, developed in Visual Studio 2010,
suddenly no longer runs, and this message appears in the log file:
Session state has created a session id,
but cannot save it because the response was already flushed by the application.
No new deployment has been made, and no code modifications have taken place.
So far I haven't found any solution to this.
What should I check?
Note that the source code is no longer available, so it would be very difficult to change the code and proceed with a new deployment.
Thanks in advance.
This would suggest that someone might be hitting the site and jumping directly to some URL (and thus code) that, say, does a response redirect to another page or some such.
Remember, when code-behind runs and, say, redirects to another page, in most cases the running code for the current page is terminated, and that is normal behavior.
However, the idea that you are going to debug a web site when you don't have the code to debug? Gee, I don't see how that's going to work at all. As noted, if this just started, then it sounds like incoming requests are going to pages that don't expect to be hit "first", pages that expect to be called ONLY from other pages in the site, after some session() and other important values have been set up.
It is also not clear if the site is using SQL-based sessions or just in-memory sessions. In-memory can be (and is) faster, but it is also not particularly reliable. Now, if you deployed to a new web server or new hosting, then session errors can often start to appear, and this is due to the huge difference between cloud-based hosting and older hosting solutions that run on a single server.
Cloud computing is real utility computing, and thus when you host a web site on such systems, in-memory session() cannot be used anymore, since multiple servers can and will be used to "dish out" web pages. Since more than one server might be used, in-memory session() obviously can't work: a few web pages might be served by one server, and then a few more by another. And using memory for a session is limited to ONE server, since multiple servers don't and can't transfer their memory to other servers.
So, this suggests that you want to be sure that SQL Server based sessions are being used here. For any kind of server farm, or any system that load balances between more than one server, you of course HAVE to use SQL Server based sessions (or some other out-of-process store), since in-memory can't work in that kind of environment.
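
To make the point concrete, here is a small simulation (Python, purely illustrative, nothing ASP.NET-specific). The `Server` class, the round-robin balancer, and the session id "abc" are all invented for the example; a plain dict stands in for whatever out-of-process store you would really use.

```python
import itertools


class Server:
    """A web server that keeps sessions in its own memory or in a shared store."""
    def __init__(self, shared_store=None):
        # With no shared store, each server gets a private dict (in-memory sessions).
        self.sessions = shared_store if shared_store is not None else {}

    def login(self, session_id, user):
        self.sessions[session_id] = {"user": user}

    def current_user(self, session_id):
        session = self.sessions.get(session_id)
        return session["user"] if session else None


def run(servers):
    """Log in on the first request, then read the session on the next three,
    with a round-robin balancer choosing the server each time."""
    balancer = itertools.cycle(servers)
    next(balancer).login("abc", "ann")
    return [next(balancer).current_user("abc") for _ in range(3)]


# Two servers, each with private in-memory sessions.
in_memory = run([Server(), Server()])

# Two servers reading and writing one shared store.
shared = {}
out_of_process = run([Server(shared), Server(shared)])
```

With private memory, only the requests that happen to land on the server that handled the login see the session; with the shared store every request sees it.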
The error could also be due to excessive server load. Often the session database is "locked" for a short period of time and can thus become a bottleneck. So for years you might not see an issue, but then as load and use of the web site increase, it becomes noticeable where in the past it was not. I suppose the database used for storing sessions could be checked, since as you note, you don't have the ability to test and debug the code. I would REALLY work towards fixing this lack of source code for the web site, since without it you have really no means to manage, maintain, and fix issues for that web site.
But abrupt terminations of web pages? As noted, this could be an error triggered in code, so the page never finished what it was supposed to do. And as noted, a page that expects some session() values but does not have them, as explained above, could be the problem. It is not clear if your errors also show which URL this was occurring for.
While nothing seems to have changed - something obviously did.
Ultimately, you need to get that source code, or deal with the people or vendor that supplies the code for that site. If you don't have a vendor, and you don't have source code, you are pretty much attempting to work on a car when you can't even open the hood to check what's going on under it.
So, one suggestion here? Someone is hitting a page that expected some value(s) in session to exist. Often the simple solution is to shove ANY simple or dummy value into session so the session REALLY does get created, and then when the page attempts to save the session(), there is one to save!
In other words, this error often occurs when an attempt is made to save the session, but no session exists. For such pages, as noted, a tiny code change such as session("zoozoo") = "my useless text" will fix this error. So, it sounds like session is being lost.
As noted, an error on a web page can also trigger an app-pool restart. If the app pool restarts, then session is lost (in-memory session). Now, with session being lost, any page that decides to terminate early AND ALSO uses session() will trigger this error.
So, this sounds like the app pool is being restarted and session is being lost (you can google the many reasons why an app pool restarts). However, critical to this issue is whether you are using SQL-based sessions or in-memory (server) sessions. So, this sounds like some code is triggering an error, and with an error triggered, the app pool restarts. And with the app pool restarted, the in-memory session is blown away. And now, without ANY session at all, attempts to save the session trigger the exact error message you see. (This is why shoving a dummy value into the session can fix some pages: you can't save a "nothing" session, and if you try, you get that exact error message.)
But, as noted, you can't make these simple changes to code anyway, right?
But first on this issue: are you using memory-based sessions or not? That feature can be set up and configured in IIS, without changes to the code base. So one quick fix might be to turn on SQL Server based sessions. It will cost some web site performance (maybe 10%), but the increased reliability is more than worth the performance hit.
Another area to look at? Are AJAX calls being made to a page, again without any previous session having been created? So once again we are down to a change in end-user behavior, possibly users hitting a page first before having logged in or done other things, and again one would see this error crop up.

Session or Distributed Cache - which is viable solution

We have an MVC 4 application which communicates with backend D365 entities.
The application makes a lot of CRM calls to get data, so it was really slow and the user experience was very poor.
To improve its performance, a cache layer has been added: whenever the application gets data from CRM, it puts it into a Session variable.
Surely, that helped with performance, as within the user's session it avoids the trip to the server and everything is served from session data. However, the application now has a lot of data syncing issues (data saved by one user is not reflected to others until they log out and log in again).
My first question: was this really a good way of handling the performance issue the application was having? In my opinion, rather than fixing the performance issue, a workaround was added which has become the cause of another issue.
Second question: is there a better architecture/design that can be put in place which will improve performance as well as resolve the data syncing issues the application is having? I am thinking of adding a distributed cache layer (likely Azure Redis) to replace the in-place Session layer, and optionally (if that makes sense) implementing a write-through strategy in Redis so that the front-end application only talks to the cache and lets the cache keep the data store up to date.
Any guidance or pointer is very much appreciated!
I think you're on the right path. As you've already experienced, adding a cache to your application introduces a new challenge: handling stale data. In your case the data is cached at the user level, which means each user has their own cache. This works well if each user works on their own piece of the data, for instance a banking app where each user sees only their own bank statements (and never those of others). However, this is not the case in your application. Multiple users operate on the same data and now you're running into synchronization issues. A quick fix could be to replace the Session cache with the Application cache, which is shared by all users.
Externalizing your cache (e.g. Redis or Memcached) is another solution and offers many advantages (e.g. distribution; scaling; synchronization; etc.), but also increases the complexity of your application. Now your application is dependent on another piece of infrastructure with its own behavior.
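
As a rough illustration of why the per-user cache goes stale while a shared write-through cache does not, here is a small Python sketch. `Backend` stands in for the slow CRM and `WriteThroughCache` for either kind of cache; both are invented for the example and say nothing about how Redis or D365 actually behave.

```python
class Backend:
    """Stand-in for the slow CRM; counts reads so cache hits are visible."""
    def __init__(self):
        self.data = {"account-1": "old name"}
        self.reads = 0

    def read(self, key):
        self.reads += 1
        return self.data[key]

    def write(self, key, value):
        self.data[key] = value


class WriteThroughCache:
    """Cache whose writes update the cache entry and the backend together."""
    def __init__(self, backend):
        self.backend = backend
        self.entries = {}

    def get(self, key):
        if key not in self.entries:
            self.entries[key] = self.backend.read(key)
        return self.entries[key]

    def set(self, key, value):
        self.backend.write(key, value)
        self.entries[key] = value


# Per-user caches (like Session): user B never sees user A's update.
crm = Backend()
cache_a, cache_b = WriteThroughCache(crm), WriteThroughCache(crm)
cache_b.get("account-1")              # B caches the old value
cache_a.set("account-1", "new name")  # A saves a change
stale = cache_b.get("account-1")      # B is still served its cached copy

# One shared cache: every user reads the same, updated entry.
crm2 = Backend()
shared = WriteThroughCache(crm2)
shared.get("account-1")
shared.set("account-1", "new name")
fresh = shared.get("account-1")
```

In the first half the backend already holds the new value, but user B keeps seeing the old one until their cache is dropped, which matches the "log out and log in again" symptom. In the second half the backend is read only once and every reader gets the updated entry.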

Why do we need multiple worker processes?

Yesterday, my customer played with the IIS settings and changed the number of worker processes to 2, which made my web application behave very strangely: the session state was sometimes lost and sometimes recovered, and it took me a day to find out what had happened. So, the question is: in what situation are multiple worker processes useful?
It can be useful for scaling a web app vertically, especially poorly designed ones that do too much work in the web processes, or ones where processes crash frequently, so that you always have a hot one. It isn't an option that should be exercised lightly, as you have found out, but it is good to have when you need it.
The reason your user sessions started to fail was because you are using the default in-process session state module. This is fairly easy to fix as well -- just run the session state out of process using either the session state service or a database. Note that some behavior of the session state changes when you do this as well, so you will need to test carefully that you don't break something else.

When to use load balancing?

I am just getting into the more intricate parts of web development. This may not be the best place to ask. However, when is it best to get load balancing for a web project? I understand that how many users can visit a site without it REALLY affecting performance depends on good or bad design. However, I am planning to code a new project that could potentially have a lot of users, and I wondered if I should be thinking about load balancing right off the bat. Opinions welcome; thanks in advance!
I should also note that the project will most likely be ASP.NET (Web Forms or MVC, not yet decided) with a backend of MongoDB or PostgreSQL (again, still deciding).
Load balancing can also be a form of high availability. What if your web server goes down? It can take a long time to replace it.
Generally, by the time you need to think about throughput you are already rich, because you have an enormous number of users.
Stack Overflow is serving 10m unique users a month with a few servers (6 or so). Think about how many requests per day you would have if you were constantly generating 10 HTTP responses per second for 8 hot hours: 10*3600*8=288000 page impressions per day. You won't have that many users soon.
And if you do, you optimize your app to 20 requests per second per CPU core, which means you get 80 requests per second on a four-core commodity server. That is a lot.
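
The back-of-the-envelope numbers above can be checked in a few lines of Python. The four cores are an assumption on my part, implied by 80 / 20; everything else comes straight from the figures quoted.

```python
SECONDS_PER_HOUR = 3600

# Baseline: 10 responses per second over 8 busy hours.
requests_per_second = 10
hot_hours = 8
impressions_per_day = requests_per_second * SECONDS_PER_HOUR * hot_hours

# After optimizing: 20 requests per second per core.
per_core = 20
cores = 4  # assumption: implied by 80 requests per second / 20 per core
per_server = per_core * cores

# What one such server could serve over the same 8 busy hours.
optimized_per_day = per_server * SECONDS_PER_HOUR * hot_hours

print(impressions_per_day, per_server, optimized_per_day)
```

So a single optimized server would have roughly eight times the headroom of the 288000-impressions-per-day baseline.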
Adding a load balancer later is usually easy. LBs can tag each user with a cookie so they get pinned to one particular target. Your app will not notice the difference. Usually.
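
Here is a minimal sketch of that cookie pinning, in Python for illustration only. The `StickyBalancer` class, the cookie name `lb_target`, and the server names are all hypothetical; real load balancers have their own cookie formats and handle health checks, which this ignores.

```python
import itertools


class StickyBalancer:
    """Round-robin for new visitors; returning visitors follow their cookie."""
    def __init__(self, servers):
        self.servers = list(servers)
        self._next = itertools.cycle(self.servers)

    def route(self, cookies):
        """Return (server, cookies_to_send_back) for one request."""
        pinned = cookies.get("lb_target")
        if pinned in self.servers:
            # Known visitor: keep sending them to the same target.
            return pinned, cookies
        # New visitor (or the pinned server no longer exists): pick and tag.
        server = next(self._next)
        return server, {**cookies, "lb_target": server}


lb = StickyBalancer(["web-1", "web-2"])
first_server, visitor_cookies = lb.route({})        # first visit gets tagged
repeat_server, _ = lb.route(visitor_cookies)        # later visits stay pinned
```

Because the same visitor keeps landing on the same server, in-memory session data on that server stays reachable, at least until that server goes away.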
Is this for an e-commerce site? If so, then the real question to ask is "for every hour that the site is down, how much money are you losing?" If that number is substantial, then I would make load balancing a priority.
One of the more important architecture decisions that I have seen affect this is the use of session variables. You need to be able to provide a seamless experience if your user ends up on different servers during their visit. Session variables won't transfer from server to server, so I would avoid using them.
I support a solution like this at work. We run four (used to be eight) .NET e-commerce websites on three Windows 2k8 servers (backed by two primary/secondary SQL Server 2008 databases), taking somewhere around 1300 (combined) orders per day. Each site is load-balanced, and kept "in the farm" by a keep-alive. The nice thing about this, is that we can take one server down for maintenance without the users really noticing anything. When we bring it back, we re-enable our replication service and our changes get pushed out to the other two servers fairly quickly.
So yes, I would recommend giving a solution like that some thought.
The parameters that interact here and can slow down performance are the number of users you have and the media you want to serve.
So if you have a lot of video or files to deliver, you need many servers to deliver them. Let's say that you do not; the next things to check are the users and the processing.
From my experience, what slows down processing is the locking of the session. So one big step to speed up processing is to write fully custom session handling, so that your pages do not lock one another and you can handle many users without issue.
Now for the next step, let's say that you have a database that holds all the data. To gain from load balancing and multiple machines, the trick is to keep a local cache of what you are going to show.
So the first idea is to avoid too much locking, which makes users wait on one another, and the second idea is to have a local cache on each machine that is built dynamically from the main database's data.
One more consideration is that you can build a solution in a "one for all, and all for one" :) style, where the extra servers also act as backups. So if one server goes offline for any reason (e.g. for an update and restart), the rest can still work and serve.
As you said, it depends if and when load balancing should be introduced. It depends on performance and how many users you want to serve. LB also improves the reliability of your app: it will not stop when one system goes crashing down. If you can see your project growing to be really big and serving lots of users, I would suggest designing your application so that it can be upgraded to LB, so do not do anything non-standard. Try to steer clear of home-made solutions and always follow good practice. If later on you really need LB, it should not be necessary to change your app.
You may need to think ahead but not at a cost of complicating your application too much. Do not go paranoid and prepare everything to work lightning fast 'just in case'. For example, do not worry about sessions - session management can be easily moved to SQL Server at any time and this is the way to go with LB. Caching will also help if you hit some bottlenecks in the future but you do not need to implement it straight away - good design (stable interfaces), separation and decoupling will allow for the cache to be added later on. So again - stick to good practices, do not close doors but also do not open all of them straight away.
You may find this article interesting.

ASP.NET deployment: How to avoid losing session state when updating code?

How do you work around the fact that sessions are dropped every time you deploy certain code files to an ASP.NET website? Sometimes we need to deploy a crucial fix in the middle of the day but don't want to boot off all our users for it.
By default Sessions are stored InProc. You should choose an out-of-process option.
Maybe just a StateServer is enough in your scenario.
One way would be to have a load-balanced server set-up. You could direct all traffic to server A, patch Server B and then repeat the other way around.
Alternatively, as #Curtisk states, better to get to the stage where you don't need to do "hot patches" through rigorous testing and then proceed to planned outages advertised in advance.
Hope this helps.
The reason why this happens is that deploying the new code causes the application pool to get recycled. You then lose everything you have in memory.
The way to get around this is then not to save anything in memory.
How difficult this is depends on your architecture.
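One solution could be to save your session information out of process in SQL Server. Note that the in-memory state server (StateServer mode) runs as a separate service, so it survives an app pool recycle, but everything in it is lost if that service or its machine restarts; SQL Server does not have that weakness.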
