nginx: use a second node as a shadow

Is it possible to configure nginx to route all traffic to the primary node, but duplicate the requests (and ignore the response) to a second node? This is for testing a performance update to a web server in production with minimal risk.
I found Shadow Proxy but was concerned about its impact on the performance and stability of a production environment.

There doesn't seem to be a stable, high-performance, and easy way to do this in a production environment.
Here are some alternative methods:
Some blog posts suggest this can be done with nginx-lua.
In our environment, we split the traffic and route certain requests to our sandbox server, keeping the ratio under control. If problems occur, only a few users are affected, and that group can consist of internal users, such as colleagues in your department or across the company.
Replay the GET requests recorded in the access log. For POST requests, we usually rely on automated test cases.
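For what it's worth, newer nginx releases (1.13.4 and later) ship the ngx_http_mirror module, which does exactly what the question asks: it duplicates each incoming request to a mirror location and discards the mirrored response. A minimal sketch, assuming the primary node is 10.0.0.1 and the shadow node is 10.0.0.2 (both addresses are placeholders):

    http {
        upstream primary {
            server 10.0.0.1:8080;   # the node whose responses clients see
        }

        server {
            listen 80;

            location / {
                mirror /shadow;            # duplicate every request
                mirror_request_body on;    # copy request bodies too
                proxy_pass http://primary;
            }

            # Internal location that receives the mirrored copy;
            # nginx ignores whatever the shadow node returns.
            location = /shadow {
                internal;
                proxy_pass http://10.0.0.2:8080$request_uri;
            }
        }
    }

One caveat: if the shadow node responds slowly, mirroring can delay the main request under some configurations, so it is worth load-testing before relying on this in production.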


High Availability WordPress setup

I'm going to run a WordPress site in an HA (High Availability) environment on AWS.
I already use HA MySQL via Amazon Aurora.
Right now I have a few questions:
Should I prefer session replication or sticky sessions at my load balancer, or both?
Should user content be uploaded to a CDN rather than to a single WP node in the cluster?
How can AWS help with a WordPress HA setup? For example, should I use AWS Elastic Beanstalk for this purpose?
What else should I pay attention to in order to create an HA WordPress setup?
Your questions are perhaps a bit broad for StackOverflow, but I am in the same situation, so I can sympathize.
Sticky sessions are not the preferred option because the need to use them would suggest that your application is not stateless.
In other words, if you require sticky sessions, your application relies on server memory for session management, so once a session is initialized, that user must stay on THAT server for the entire duration of the session. This is OK, but less desirable (compared to requests that don't care at all which server instance they run on) because if your traffic slowed down and Elastic Beanstalk decided to kill off the instance you were on, then on the next request the load balancer would route you to another instance, your session would be RESET, and your user would have to log in again.
On the other hand, if your app was written to be completely stateless (by storing the state in a db instance for example), then you would not care which server each request hit because state would not be stored on the server instance. This would allow Beanstalk to freely spin up and down instances without affecting your users in any way.
The benefit of sticky sessions is that, if your app is already written with a dependence on server memory, or MUST have it for some reason, they allow your app to run without code changes.
Yes, it seems to me like your user-content should not be uploaded to any single node (for mainly the same reasons I mentioned above). If your user-content is stored on the node and that node gets spun down due to low traffic, you will have lost that data.
This is where something like S3 comes in handy. Your application interacts directly with S3 as its storage solution and each instance saves content to your S3 bucket(s). Then, regardless of which node is running, it can just talk to the same S3 bucket and get the data it needs.
Aside from that, all I can recommend is that you experiment, look into load testing, and adjust as needed.

IIS 7.5 Load Balancing--do Sessions stick to the originating server?

Apologies if there is an answer already out here but I've looked at over 2 dozen threads and can't find the specific answer.
So, for our ASP.NET (2.0) application, our infrastructure team set up a load balancer machine that has two IIS 7.5 servers.
We have a network file server where the single copy of the application files reside. I know very little about the inner workings of load-balancing and even IIS in general.
My question is regarding sessions. I guess I'm wondering if the 'balancing' part is based on sessions or on individual page requests.
For example, when a user first logs in to the site, he's authenticated (forms), but then while he navigates around from page to page--does IIS 7.5 automatically "lock him in" to the particular server that first logged him in and authenticated him, or could his page requests alternate from one server to the next?
If the requests do indeed alternate, what problems might I face? I've read a bit about duplicating the MachineKey, but we have done nothing in web.config regarding MachineKey--it does not exist there at all.
I will add that we are not experiencing any issues (that we know of anyway) regarding authentication, session objects, etc. - the site is working very well, the question is more academic, and I just want to make sure I'm not missing something that may bite me down the road.
Thanks,
Jim
while he navigates around from page to page--does IIS 7.5 automatically "lock him in" to the particular server that first logged him in and authenticated him
That depends on the configuration of the load balancer and is beyond the scope of a single IIS server. Since you haven't provided any information on the actual balancer you use, I can only offer general information: regardless of the balancer type (hardware or software), it can be configured for so-called "sticky sessions". In this mode, you are guaranteed that once a browser establishes a connection to your cluster, it will always hit the same server. There are two example techniques: in the first, the balancer creates a virtual mapping from source IP addresses to cluster node numbers (which means that multiple requests from the same IP hit the same server); in the second, the balancer attaches an additional HTTP cookie/header that allows it to recognize the same client and direct it to the same node.
Note that the term "session" here has nothing to do with the server-side "session" where you have a per-user container. Session here means "client-side session": a single browser on a single operating system and a series of request-replies from it to your server.
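As a concrete illustration of the first technique: in nginx (used here purely as an example of a software balancer; F5 and other appliances expose equivalent settings), source-IP affinity is a one-line directive:

    upstream app_cluster {
        ip_hash;                 # same client IP always maps to the same node
        server 10.0.0.11:80;
        server 10.0.0.12:80;
    }

    server {
        listen 80;
        location / {
            proxy_pass http://app_cluster;
        }
    }

The node addresses are placeholders; cookie-based affinity works analogously but survives client IP changes.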
If the requests do indeed alternate, what problems might I face
Multiple issues. First, encryption that relies on the machine key will not work. This means that even forms authentication cookies would be rejected by cluster nodes other than the one that issued them. A solution is to have the same machine key on all nodes.
Another common issue is the in-proc session provider: any data stored in the memory of one application server will not "magically" appear on other cluster nodes, making the session data unavailable. A solution is to configure the session to be stored in a separate process, for example in a SQL Server database.
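A sketch of both fixes in web.config (the key values and connection string are placeholders, not working values):

    <system.web>
      <!-- Identical on every node so that cookies and ViewState
           validate cluster-wide; generate your own keys. -->
      <machineKey validation="SHA1" decryption="AES"
                  validationKey="[REPLACE-WITH-GENERATED-HEX-KEY]"
                  decryptionKey="[REPLACE-WITH-GENERATED-HEX-KEY]" />

      <!-- Store session state out of process so any node can read it -->
      <sessionState mode="SQLServer"
                    sqlConnectionString="Data Source=dbserver;Integrated Security=SSPI;"
                    timeout="20" />
    </system.web>

Note that SQLServer mode also requires preparing the state database, e.g. with the aspnet_regsql.exe tool.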
I will add that we are not experiencing any issues (that we know of anyway) regarding authentication, session objects
Sounds like a fortunate coincidence, or the infrastructure team has already configured sticky sessions. The latter is quite possible, as the configuration is usually obvious and easy.

Using Cloudfront to expose ElasticSearch REST API in read only (GET/HEAD)

I want to let my clients speak directly to the ElasticSearch REST API, obviously preventing them from performing any data or configuration changes.
I had a look at the ElasticSearch REST interface and noticed a pattern: HTTP GET requests are pretty safe (harmless queries and cluster status).
So I thought I could use CloudFront as a CDN/proxy that only allows GET/HEAD methods (you can impose such a restriction in the main configuration).
So far so good, and everything is set up. But things don't work, because I would need to open my EC2 security group to the world in order to be reachable from CloudFront! I really don't want that!
When I use EC2 with RDS, I can simply allow access to my EC2 security group in RDS security groups. Why can't I do this with CloudFront? Or can I?
Ideas?
Edit: it's not documented, but ES accepts facet queries, which involve a (JSON) body, not only with POST but also with GET. This simply breaks the HTTP recommendation (RFC 2616) by not ignoring the body of a GET request (source).
This is relevant because, as pointed out, exposing the ES REST interface directly can lead to easy DoS attacks using complex queries. I'm still convinced, though, that having one less proxy is worth it.
Edit: another option for me would be to skip CloudFront and add a security layer as an ElasticSearch plugin, as shown here.
I ended up coding my own plugin. Surprisingly, there was nothing quite like it around.
No proxies, no Jetty, no Tomcat.
Just the original ES REST module and my RestFilter, using a minimum of reflection to obtain the remote address of the requests.
Enjoy:
https://github.com/sscarduzio/elasticsearch-readonlyrest-plugin
Note that even a GET request can be harmful in Elasticsearch. A query which simply takes up too much resources to compute will bring down your cluster. Facets are a good way to do this.
I'd recommend writing a simple REST API you place in front of ES so you get much more control over what hits your search cluster. If that's not an option you could consider running Nginx on your ES boxes to act as a local reverse proxy, which will give you the same control (and a whole lot more) as CloudFront does. Then you'd only have to open up Nginx to the world, instead of ES.
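If you take the Nginx route, a minimal sketch of a read-only reverse proxy (assuming ES listens on localhost:9200; note this does not address the GET-with-body caveat discussed above):

    server {
        listen 80;

        location / {
            # Allow only read-only methods; allowing GET
            # implicitly allows HEAD as well.
            limit_except GET {
                deny all;
            }
            proxy_pass http://127.0.0.1:9200;
        }
    }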
A way to do this in AWS would be:
Set up an Application Load Balancer in front of your ES cluster. Create a TLS cert for the ALB and serve https. Open the ES security group to the ALB.
Set up CloudFront and use the ALB as origin. Pass a custom header with a secret value (for WAF, see next point).
Set up WAF on your ALB to only allow requests that contain the custom header with the secret value. Now all requests have to go through CloudFront.
Set up a Lambda@Edge function on your CloudFront distribution to either remove the body from GET requests or deny such requests.
It's quite some work, but there are advantages over the plugin, e.g.:
CloudFront comes with free network DDoS protection.
CloudFront gives your users lower latency to ES because of the fast CloudFront network and its global PoPs.
It opens up many options to use CloudFront, WAF, and Lambda@Edge to further protect your ES cluster.
I’m working on sample code in CDK to set all of this up. Will report back when that’s ready.

Avoiding invalid viewstate when deploying on a load balanced website without downtime

Here is the scenario:
We have 3 web servers A, B, C.
We want to release a new version of the application without taking the application down
(e.g. not using the "Down for maintenance page").
Server A goes live with latest code.
Server B gets taken off-line. Users on Server B get routed to A and C.
Page1.aspx was updated with a new control. Anyone who came from Server B to Server A while
on this page will get a ViewState error when they perform an action on it. This is what we want to prevent.
How do some of you resolve this issue?
Here are some thoughts we had (whether it's possible or not using our load balancer, I don't know... I am not familiar with load balancer configuration [it's an F5]):
The more naive approach:
Take down servers A and B and update. C retains the old code. All traffic will be directed to C, and that's ok since it's the old code. When A and B go live with the update, if possible tell the load balancer to only keep people with active sessions on C and all new sessions get initiated on A and B. The problem with this approach is that in theory sessions can stick around for a long time if the user keeps using the application.
The less naive approach:
Similar to the naive approach, except (if possible) we tell the load balancer about "safe" pages, which are pages that were not changed. When the user eventually ends up on a "safe" page, he or she gets routed to server A or B. In theory the user may never land on one of these pages, but this approach is a little less risky (but requires more work).
I assume that your load balancer is directing individual users back to the same server in the web farm during normal operations, which is why you do not normally experience this issue, but only when you start redirecting users between servers.
If that assumption is correct, then it is likely that the issue is an inconsistent machine key across the server farm.
ViewState is hashed against the machine key of the server to prevent tampering by the user on the client side. The machine key is generated automatically by IIS, and will change every time the server restarts or is reset, as well as being unique to each server.
In order to ensure that you don't hit viewstate validation issues when users move between servers there are two possible courses of action.
Disable the anti-tampering protection on the individual page or globally in the pages element of the web.config file using the enableViewStateMac attribute with a false value. I mention this purely for the sake of completeness - you should never do this on a production website.
Manually generate a machine key and share that same value across each application on each of your servers (you could use the same key for all your applications, but it is sensible to use one key per application to maximise security). To do this you need to generate keys; do not use any you may see in demos on the internet, as this defeats the purpose of a unique machine key. Key generation can be done programmatically or in IIS Manager (see http://www.codeproject.com/Articles/221889/How-to-Generate-Machine-Key-in-IIS7). Use the same machine key when deploying the website to all of your servers, as in the sketch below.
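To illustrate the programmatic route, here is a minimal hypothetical console helper that produces random hex keys (64 bytes is a common validationKey length; 32 bytes suits an AES decryptionKey). Paste the output into the <machineKey> element of web.config, identically on every server:

    using System;
    using System.Security.Cryptography;

    class MachineKeyGen
    {
        // Cryptographically random hex string of the given byte length
        static string RandomHex(int byteLength)
        {
            var bytes = new byte[byteLength];
            using (var rng = RandomNumberGenerator.Create())
            {
                rng.GetBytes(bytes);
            }
            return BitConverter.ToString(bytes).Replace("-", "");
        }

        static void Main()
        {
            Console.WriteLine("validationKey=\"{0}\"", RandomHex(64));
            Console.WriteLine("decryptionKey=\"{0}\"", RandomHex(32));
        }
    }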
I can't answer on the best practice for upgrading applications that require 100% uptime.

Bandwidth Monitoring in asp.net

Hi, we are developing a multi-tenant application in ASP.NET with a separate database for each tenant. One of the requirements is to monitor the bandwidth usage of each tenant.
I have tried searching but haven't found much help on the topic. We want to monitor exactly how much bandwidth is being used by each tenant, where each tenant can have its own top-level domain, a subdomain, or a combination of both.
So what are the available options? The ones I can think of are:
IIS log monitoring: a separate application that parses the logs and calculates the bandwidth for each tenant.
Log each request and response per tenant from within the application, then calculate the total bandwidth usage from that.
Use third-party components, if available.
So what do you think would be the best approach? Is there any other way to do this?
OK, here is an idea (which I have not tested; I leave that to you).
In Global.asax,
use one of these handlers (find the one that reports a valid final size):
Application_PostRequestHandlerExecute
Application_ReleaseRequestState
and get the size you have sent with
Response.Filter.Length
Needless to say, you can get the file name of the call using
HttpContext.Current.Request.Path
These handlers are called on every single request, so you can capture the size there and do the rest.
Note that you first need to test this idea to see if it works, and perhaps improve it. Keep in mind that if you compress pages on the server, the reported length will not be correct, and you may need to do the compression in Global.asax to obtain the actual length.
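Building on this idea, here is a minimal, untested sketch of a more robust variant: instead of reading Response.Filter.Length (which the default stream may not support), install a counting wrapper as the filter and record the total when the response completes. BandwidthLog and its Record method are illustrative placeholders; as noted above, bytes are counted before any IIS-level compression.

    // Global.asax.cs - count response bytes per tenant via Response.Filter
    using System;
    using System.IO;
    using System.Web;

    public class CountingFilter : Stream
    {
        private readonly Stream _inner;
        private readonly string _tenantHost;
        private readonly string _path;
        private long _bytesWritten;

        public CountingFilter(Stream inner, string tenantHost, string path)
        {
            _inner = inner;
            _tenantHost = tenantHost;
            _path = path;
        }

        public override void Write(byte[] buffer, int offset, int count)
        {
            _bytesWritten += count;              // tally what actually goes out
            _inner.Write(buffer, offset, count);
        }

        public override void Close()
        {
            // The response is complete here; hand the total to the
            // per-tenant accounting store (placeholder).
            BandwidthLog.Record(_tenantHost, _path, _bytesWritten);
            _inner.Close();
            base.Close();
        }

        // Minimal Stream plumbing
        public override bool CanRead { get { return false; } }
        public override bool CanSeek { get { return false; } }
        public override bool CanWrite { get { return true; } }
        public override long Length { get { throw new NotSupportedException(); } }
        public override long Position
        {
            get { throw new NotSupportedException(); }
            set { throw new NotSupportedException(); }
        }
        public override void Flush() { _inner.Flush(); }
        public override int Read(byte[] b, int o, int c) { throw new NotSupportedException(); }
        public override long Seek(long o, SeekOrigin s) { throw new NotSupportedException(); }
        public override void SetLength(long v) { throw new NotSupportedException(); }
    }

    public static class BandwidthLog
    {
        public static void Record(string tenantHost, string path, long bytes)
        {
            // Placeholder: persist to a database or queue keyed by tenant.
        }
    }

    public class Global : HttpApplication
    {
        protected void Application_BeginRequest(object sender, EventArgs e)
        {
            // Request.Url.Host identifies the tenant (domain or subdomain)
            Response.Filter = new CountingFilter(
                Response.Filter, Request.Url.Host, Request.Path);
        }
    }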
Hope this helps.
Well, since the IIS logs already contain the request size and response size, it doesn't seem like too much trouble to develop a small tool to parse them and calculate the total per day/week/month/whatever.
Trying to segment traffic based on host is difficult in my experience. Instead, if you give each tenant their own IP(s) for the applications, you should be able to find programs that will monitor bandwidth based on IP.
ADDITION: Is the structure of your IIS setup one website to rule them all, where the system forks to the proper database on login? If so, this may create problems with respect to versioning, in that all tenants' sites will have to have exactly the same schema and would all need to be updated simultaneously whenever an application update requires a schema change.
Another structure, which sounds like what you may have, is that each tenant has their own website, like so:
tenant1_site/appvirtualdir
tenant2_site/appvirtualdir
...
Where the appvirtualdir points to the same physical path for all tenants' sites. When all clients are on the same application version, they are all literally running the same code. If you have this scenario and some sort of authentication, then you will need one IP per tenant anyway because of SSL: SSL binds only to IP and port, unlike non-SSL, which binds to IP, port, and host. In that case, monitoring traffic based on IP is still simpler and more accurate, as it can be done at the router or via a network monitor.
