I'm going to run a WordPress site in an HA (High Availability) environment on AWS.
I already use HA MySQL: Amazon Aurora.
Right now I have a few questions:
Should I prefer session replication, sticky sessions at my load balancer, or both?
Should user content be uploaded to a CDN rather than to a single WordPress node in the cluster?
How can AWS help with a WordPress HA setup? For example, should I use AWS Elastic Beanstalk for this purpose?
What else should I pay attention to in order to make WordPress highly available?
Your questions are perhaps a bit broad for StackOverflow, but I am in your situation so I can sympathize.
Sticky sessions are not the preferred option, because needing them suggests that your application is not stateless.
In other words, if you require sticky sessions, your application relies on server memory for session management, so once a session is initialized, that user must stay on THAT server for the entire duration of the session. This is OK, but less desirable than an application whose requests don't care which server instance they run on. If your traffic slows down and Elastic Beanstalk terminates the instance you were on, the load balancer routes your next request to another instance, your session is RESET, and your user has to log in again.
On the other hand, if your app is written to be completely stateless (by storing the state in a db instance, for example), then you don't care which server each request hits, because no state is stored on the server instance. This allows Beanstalk to freely spin instances up and down without affecting your users in any way.
The benefit of sticky sessions is that, if your app is already written with a dependence on server memory, or MUST have it for some reason, they allow your app to run without code changes.
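To make the difference concrete, here is a minimal sketch (in Python for brevity, not PHP). The `Node` class and the plain dict standing in for a shared database are hypothetical; nothing here is WordPress or Elastic Beanstalk API.

```python
# Hypothetical sketch: sessions in server memory vs. sessions in a shared store.

class Node:
    """One web server instance behind the load balancer."""

    def __init__(self, name, shared_store=None):
        self.name = name
        self.local_sessions = {}          # lost when this instance is terminated
        self.shared_store = shared_store  # stands in for a database or cache

    def _sessions(self):
        # Use the shared store when one is configured, otherwise server memory.
        if self.shared_store is not None:
            return self.shared_store
        return self.local_sessions

    def login(self, session_id, user):
        self._sessions()[session_id] = user

    def current_user(self, session_id):
        # Returns None when this node knows nothing about the session.
        return self._sessions().get(session_id)


# Stateful: the session lives only in node A's memory.
a, b = Node("a"), Node("b")
a.login("s1", "alice")
stateful_result = b.current_user("s1")    # next request lands on node B

# Stateless: both nodes read and write the same external store.
store = {}
c, d = Node("c", store), Node("d", store)
c.login("s1", "alice")
stateless_result = d.current_user("s1")

print(stateful_result, stateless_result)  # None alice
```

In the first case the user would be asked to log in again; in the second, any node can serve the request.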
Yes, it seems to me like your user-content should not be uploaded to any single node (for mainly the same reasons I mentioned above). If your user-content is stored on the node and that node gets spun down due to low traffic, you will have lost that data.
This is where something like S3 comes in handy. Your application interacts directly with S3 as its storage solution and each instance saves content to your S3 bucket(s). Then, regardless of which node is running, it can just talk to the same S3 bucket and get the data it needs.
Aside from that, all I can recommend is that you experiment, look into load testing, and adjust as needed.
I need some advice before I start developing. I have 15 WordPress websites on separate installs, and a remote server that collects data from those websites 24/7.
I've reached the point where I want the server to modify the websites based on the data it has calculated.
The options are:
Should I allow the server to access the WP DB remotely and modify things directly, keeping WordPress out of the loop?
Or should I use the WP REST API and expose some secured routes that provide and accept data and make those changes?
My instinct is to use the WP API. But after all, it's PHP (nginx + Apache), which has some limits (timeouts, for example), and I find it hard to run heavy, long-running processes on WordPress itself.
I can divide the work into separate steps, for example:
fetch the data (simple GET)
process it on the remote server
loop and send the modifications in small batches to another route
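The batching step above could look roughly like this on the remote server. This is only a sketch: `push_changes`, the batch size of 50, and the `send_batch` callable (which in practice would POST one batch to a REST route) are all assumptions, not anything WordPress provides.

```python
from typing import Callable, Iterable, Iterator, List


def batched(items: Iterable[dict], size: int) -> Iterator[List[dict]]:
    """Yield consecutive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    batch: List[dict] = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # final, possibly shorter batch
        yield batch


def push_changes(changes: Iterable[dict],
                 send_batch: Callable[[List[dict]], None],
                 size: int = 50) -> int:
    """Send changes in small batches; return the number of requests made."""
    requests_made = 0
    for batch in batched(changes, size):
        send_batch(batch)  # one short request per batch keeps each call under the timeout
        requests_made += 1
    return requests_made


# Usage with a fake sender that just records what it was given.
sent: List[List[dict]] = []
changes = [{"post_id": i, "status": "draft"} for i in range(120)]
count = push_changes(changes, sent.append, size=50)
print(count, [len(b) for b in sent])  # 3 [50, 50, 20]
```

Because each request only carries a small batch, no single PHP request has to run for long.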
My concern is that this loop requires the remote server and the WP API to match perfectly, and any change or fix on the WP side means a plugin update on every website, which is not much fun.
I'd appreciate any ideas and suggestions for moving this forward.
"use WP REST API and supply some secured routes which provide data and accept data and make those changes", indeed.
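I don't know why timeouts or other limits would cause a problem, but using the API is the best way for this kind of case. You can avoid timeout problems with some adjustments on the web server side.
Or you can raise the memory and timeout limits only for requests coming from your remote server.
For example:
// Raise the limits only for requests coming from the main server's IP.
// Behind a proxy or load balancer, REMOTE_ADDR may be the proxy's address.
if ($_SERVER['REMOTE_ADDR'] === 'YOUR_MAIN_SERVER_IP') {
    ini_set('max_execution_time', '1000'); // seconds
    ini_set('memory_limit', '1024M');
}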
Since we moved to Azure, we have had numerous lost-session issues, and only in production.
We use InProc, cookie-based, sticky sessions with a large timeout, and we have neither high traffic nor high memory/process usage.
We use HAProxy as the load balancer.
I have done basic research and none of the following seems to be the cause:
session timeout
application pool settings/recycling
memory size and usage thresholds
swallowed exceptions
file system changes that would cause a restart
I'm more suspicious of how the load balancer, SSL, and the application work together, and whether the HTTP headers are fine, but I don't know of any tools to really monitor that.
I've been assigned to find a solution, but I have no access privileges on the machines.
Logs (Log4Net) are all stored in a database, but they don't give a clear picture of what is going on in the system, and I cannot follow a user session with them.
I'm allowed to look for the problem by adding the required logging to the code, developing some kind of monitoring module, or using profiling/debugging tools.
There is a production deployment only once a month, so I'm trying to make the best possible use of the opportunity.
Question:
Is there any useful monitoring/profiling tool that can give me a clear view of what is happening in the system by aggregating the information I may need? For example, following a user/session between requests from login until the session drops, plus information about headers and other system/application parameters.
If there is no such tool out there, please share your ideas on how to write one.
This is a common issue in load-balanced environments. As mentioned in this answer to a similar question:
InProc mode stores session state in memory on the web server. This means that session data is maintained inside your web server on a given VM and is not shared outside of that VM. So when you have multiple servers for load balancing, session state isn't shared between them. To solve this, you must store your session state outside the web server.
Use Redis, a SQL database, or something else.
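If you want to confirm that diagnosis before changing the session provider, one idea is to log a few values on every request and analyse them offline. The sketch below assumes you log the session id, whether the session was new, and the server name (in ASP.NET these would come from `Session.SessionID`, `Session.IsNewSession` and `Environment.MachineName`); the record field names and the `find_session_resets` helper are made up for illustration, and the analysis is written in Python only because it is separate from the application.

```python
def find_session_resets(records):
    """Return (session_id, previous_node, node) for each unexpected reset.

    `records` must be in chronological order. A reset is a request flagged
    as a new session for a session id that has already been seen.
    """
    last_node = {}
    resets = []
    for rec in records:
        sid = rec["session_id"]
        if rec["is_new"] and sid in last_node:
            resets.append((sid, last_node[sid], rec["node"]))
        last_node[sid] = rec["node"]
    return resets


# Hypothetical log rows, as they might be read back from the logging database.
log = [
    {"session_id": "abc", "node": "web1", "is_new": True},   # login
    {"session_id": "abc", "node": "web1", "is_new": False},
    {"session_id": "abc", "node": "web2", "is_new": True},   # state lost
    {"session_id": "xyz", "node": "web2", "is_new": True},   # first visit
]
resets = find_session_resets(log)
print(resets)  # [('abc', 'web1', 'web2')]
```

If most resets show a different node than the previous request, stickiness at the balancer is the likely culprit; if the node is the same, an application restart is more likely.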
If I have the following line in multiple websites that run on the same server, will there be any problems with the apps writing to the same cache record?
HttpContext.Current.Cache["SearchResults"] = myDataTable;
I know the docs say there is one cache per app domain but I don't quite understand what that means.
https://msdn.microsoft.com/en-us/library/system.web.httpcontext.cache
There is one instance of the Cache class per application domain. As a
result, the Cache object that is returned by the Cache property is the
Cache object for all requests in the application domain.
Application domain is a mechanism (similar to a process in an operating system) used within the Common Language Infrastructure (CLI) to isolate executed software applications from one another so that they do not affect each other.
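So when you write to the cache, the value is stored in memory for that app domain only. Each website runs in its own app domain, so the same key used in different websites refers to separate entries and they will not overwrite each other. The flip side is that in a web garden (multiple worker processes) or a web farm, a request might be served by a different worker process or node, and that worker process or node has no access to the cached value because it runs in an entirely different app domain.
The cache is specific to a single app domain, so if you need to share cached data across app domains, you need a centralised cache store.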
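One thing to watch if you do move to a centralised store: the isolation that app domains gave you for free disappears, so two websites writing "SearchResults" really would hit the same record. A common way around that is to prefix keys per application. A minimal sketch, with a plain dict standing in for the central store and a hypothetical `NamespacedCache` wrapper (Python used only for illustration):

```python
class NamespacedCache:
    """View onto a shared cache that prefixes every key with an app name."""

    def __init__(self, backend, app_name):
        self._backend = backend        # shared by all apps; a dict stands in here
        self._prefix = app_name + ":"

    def set(self, key, value):
        self._backend[self._prefix + key] = value

    def get(self, key, default=None):
        return self._backend.get(self._prefix + key, default)


shared = {}  # stands in for the centralised cache store
site_a = NamespacedCache(shared, "site-a")
site_b = NamespacedCache(shared, "site-b")

site_a.set("SearchResults", ["a1", "a2"])
site_b.set("SearchResults", ["b1"])

print(site_a.get("SearchResults"), site_b.get("SearchResults"))  # ['a1', 'a2'] ['b1']
print(sorted(shared))  # ['site-a:SearchResults', 'site-b:SearchResults']
```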
Apologies if there is an answer already out here but I've looked at over 2 dozen threads and can't find the specific answer.
So, for our ASP.NET (2.0) application, our infrastructure team set up a load balancer machine that has two IIS 7.5 servers.
We have a network file server where the single copy of the application files resides. I know very little about the inner workings of load balancing and even IIS in general.
My question is regarding sessions. I guess I'm wondering if the 'balancing' part is based on sessions or on individual page requests.
For example, when a user first logs in to the site, he's authenticated (forms), but then while he navigates around from page to page--does IIS 7.5 automatically "lock him in" to the particular server that first logged him in and authenticated him, or could his page requests alternate from one server to the next?
If the requests do indeed alternate, what problems might I face? I've read a bit about duplicating the MachineKey, but we have done nothing in web.config regarding MachineKey--it does not exist there at all.
I will add that we are not experiencing any issues (that we know of anyway) regarding authentication, session objects, etc. - the site is working very well, the question is more academic, and I just want to make sure I'm not missing something that may bite me down the road.
Thanks,
Jim
while he navigates around from page to page--does IIS 7.5 automatically "lock him in" to the particular server that first logged him in and authenticated him
That depends on the configuration of the load balancer and is beyond the scope of a single IIS instance. Since you haven't said which balancer you actually use, I can only give general information: regardless of the balancer type (hardware, software), it can be configured for so-called "sticky sessions". In that mode, you are guaranteed that once a browser establishes a connection to your cluster, it will always hit the same server. There are two common techniques. In the first, the balancer just creates a virtual mapping from source IP addresses to cluster node numbers (which means that multiple requests from the same IP hit the same server). In the second, the balancer attaches an additional HTTP cookie/header that allows it to recognize the same client and direct it to the same node.
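Roughly, the two techniques work like the sketch below. It is an illustration in Python, not the behaviour of any particular balancer: the node names, the `LB_NODE` cookie name and both functions are made up, and real balancers have their own mapping and cookie formats.

```python
import hashlib

NODES = ["web1", "web2"]  # hypothetical cluster nodes


def node_for_ip(client_ip, nodes=NODES):
    """Source-IP affinity: hash the address onto a node index."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    return nodes[int.from_bytes(digest[:4], "big") % len(nodes)]


def node_for_request(cookies, client_ip, nodes=NODES, cookie_name="LB_NODE"):
    """Cookie affinity: honour the balancer's cookie, else assign a node and set it.

    Returns (node, cookies_to_set).
    """
    pinned = cookies.get(cookie_name)
    if pinned in nodes:
        return pinned, {}
    node = node_for_ip(client_ip, nodes)  # any initial choice would do
    return node, {cookie_name: node}


# First request has no cookie; the second carries it but comes from a new IP.
first_node, to_set = node_for_request({}, "203.0.113.7")
second_node, _ = node_for_request(to_set, "198.51.100.9")
print(first_node == second_node)  # True
```

The usage shows why the cookie variant is often preferred: the client stays on the same node even when its IP address changes, which IP mapping alone cannot guarantee.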
Note that the term "session" has nothing to do with the server side "session" where you have a per-user container. Session here means "client side session", a single browser on a single operating system and a series of request-replies from it to your server.
If the requests do indeed alternate, what problems might I face
Multiple issues. First, any encryption that relies on the machine key will not work. This means that even forms authentication cookies would be rejected by cluster nodes other than the one that issued the cookie. A solution is to have the same machine key on all nodes.
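To see why the key matters, here is a simplified sketch of the validation half only (signing with an HMAC, no encryption), written in Python. It is not how ASP.NET formats its cookies; the `sign`/`verify` helpers and the key values are invented for illustration.

```python
import hashlib
import hmac


def sign(value: str, key: bytes) -> str:
    """Append an HMAC of the value so that tampering can be detected."""
    mac = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return value + "|" + mac


def verify(cookie: str, key: bytes):
    """Return the value if the signature matches this node's key, else None."""
    value, _, mac = cookie.rpartition("|")
    expected = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return value if hmac.compare_digest(mac, expected) else None


node1_key = b"key-generated-on-node-1"   # each node made up its own key
node2_key = b"key-generated-on-node-2"
shared_key = b"same-key-on-every-node"   # what a shared machine key gives you

cookie = sign("user=jim", node1_key)
rejected = verify(cookie, node2_key)      # issued by node 1, checked by node 2
accepted = verify(sign("user=jim", shared_key), shared_key)
print(rejected, accepted)  # None user=jim
```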
Another common issue would be the InProc session provider: any data stored in the memory of one application server will not "magically" appear on other cluster nodes, which makes the session data unavailable there. A solution is to configure session state to be stored out of process, for example in a SQL Server database.
I will add that we are not experiencing any issues (that we know of anyway) regarding authentication, session objects
Sounds like either a lucky coincidence, or the infrastructure team has already configured sticky sessions. The latter sounds plausible; the configuration is usually obvious and easy.
I am moving a site from a single server to a server farm consisting of three web servers behind a load balancer. It seems easy enough to handle session management - just make sessions "sticky" at the Load Balancer (we evaluated SQL-based session management but have decided to continue using InProc session management for efficiency).
However, we also use a sizable configuration object that we keep in the Application space (e.g. Application[ObjName]). Since the config object is loaded from memory, we have no problem until someone makes a change to the configuration. At that point, the application on the hosting server will have the change and the database will have the change. However, the other two servers won't have the change. We've debated having a "once a minute" polling rule (e.g. on new sessions), keeping information in the session instead (not very efficient), etc. All have serious drawbacks. I am wondering what other people do. Is it possible to keep the Application space on SQL Server but the Session space inproc? Any help or insight about how to handle this would be appreciated!
Application[] is always going to be local-memory based, so no matter what, you're going to have some code changes to make. So put it somewhere else, like a distributed cache: AppFabric, NCache, memcached.net, etc. When someone changes the configuration, update the cache; when you need to read the settings, read them from the cache. Propagation/sync is taken care of by the cache itself.
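If reading a sizable config object from the cache on every request worries you, one possible variation (my assumption, not a feature of any of the products above) is to keep a local copy per server and only compare a small version number against the shared store. A sketch in Python, with `SharedConfigStore` standing in for the distributed cache:

```python
class SharedConfigStore:
    """Stands in for a distributed cache holding the config and a version."""

    def __init__(self, config):
        self.version = 1
        self.config = dict(config)

    def update(self, **changes):
        self.config.update(changes)
        self.version += 1          # signals every node that its copy is stale


class NodeConfig:
    """Per-server view that reloads only when the shared version changes."""

    def __init__(self, store):
        self._store = store
        self._version = 0          # forces a load on first access
        self._local = {}

    def get(self, key):
        if self._version != self._store.version:   # cheap check on each read
            self._local = dict(self._store.config)
            self._version = self._store.version
        return self._local[key]


store = SharedConfigStore({"theme": "light"})
web1, web2 = NodeConfig(store), NodeConfig(store)

before = (web1.get("theme"), web2.get("theme"))
store.update(theme="dark")         # change made through any one server
after = (web1.get("theme"), web2.get("theme"))
print(before, after)  # ('light', 'light') ('dark', 'dark')
```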
We decided to use NCache, as we have 4 web servers in our web farm. This third-party caching tool works well with a load balancer and is easy to configure (only the Express version is free; for the Professional and Enterprise versions, only the developer machine is free of charge). It is also really fast and stable. You must set up NCache on each server and set the load balancer to work with all of them. Hope it helps.