vcrpy route different responses to different cassettes depending on host

vcrpy route different responses to different cassettes depending on host - python-requests

I'm wanting to record various requests on a particular unit test, with the requests for each specific host going to their own specific cassette file. Is there any way to do this in vcrpy?

Related

NGINX routing based on server 200 response failures

My goal is to configure nginx's stream object(s) in the config to route requests to a backup upstream in the event that one fails on certain health checks (2/3)
The health checks while sort of specific I believe shouldn't be an issue:
-TCP 1212 availability
-TCP 1912 availability
-HTTP GET on 7078 /?
-Response should be 200 and if I can get the body somehow to check that it's as expected, even better!
If these checks fail on one upstream "cluster" so to speak, I would like to route requests to another identical cluster, much like a back up.
The issue I'm solving lies in the fact that the servers are quite literally half a world apart and so load balancing through one server would cause the same latency as if you waited for it to fail. So while a load balancer would have "routing" behavior in the end, the response time would be unacceptable.
Is there a way to do this in NGINX configs or am I spreading it too thin?

The NGINX upstream module will do passive health checks for you, meaning it will react to connection failures, and optionally switch to backup servers as necessary. To some extent, that might be enough for you.
What you're describing here though are active health checks that let you check different ports from the traffic port, assert HTTP status, header values and even body content. Unfortunately, having dangled that in front of you, these are only available as part of the NGINX Commercial Subscription, which I'm guessing isn't what you're looking for.
If you do need that kind of pro-active health checks, you can still do it from outside of NGINX. One approach might be:
put your upstreams in separate confs, and include one of them where you need it
use ncat and/or curl in a every-minute cron job to do the tests that matter to you
if ever those tests fail, switch out the upstream confs, and tell NGINX to do a zero-downtime reload
You can switch confs by fast mv to rename the right one to match the include, you shouldn't have to rewrite anything.

Can Nginx "load balance" requests to multiple backends?

I realize this is kind of a strange situation, but bear with me. Suppose I have a request being sent from some client-side Javascript which I want to forward to multiple upstream servers. There's no need for any sort of response, just an immediate 200 OK.
Is there any way to achieve this with Nginx's load balancer? There are multiple disciplines for load balancing based on round-robin, least-connections, weights, etc, but all of them enforce a one-to-one mapping between requests and upstream servers. I'm guessing this is due to the fact that it's not generally possible to "combine" multiple responses to send back to the client (and it's very rare that you'd even want to), but that's not a concern in this case.
Any creative solutions for this in Nginx? Whether it's through the load-balancing module or the HttpProxyModule.

The reason for a mandatory 'Host' clause in HTTP 1.1 GET

Last week I started quite a fuss in my Computer Networks class over the need for a mandatory Host clause in the header of HTTP 1.1 GET messages.
The reason I'm provided with, be it written on the Web or shouted at me by my classmates, is always the same: the need to support virtual hosting. However, and I'll try to be as clear as possible, this does not appear to make sense.
I understand that in order to allow two domains to be hosted in a single machine (and by consequence, share the same IP address), there has to exist a way of differentiating both domain names.
What I don't understand is why it isn't possible to achieve this without a Host clause (HTTP 1.0 style) by using an absolute URL (e.g. GET http://www.example.org/index.html) instead of a relative one (e.g. GET /index.html).
When the HTTP message got to the server, it (the server) would redirect the message to the appropriate host, not by looking at the Host clause but, instead, by looking at the hostname in the URL present in the message's request line.
I would be very grateful if any of you hardcore hackers could help me understand what exactly am I missing here.

This was discussed in this thread:
modest suggestions for HTTP/2.0 with their rationale.
Add a header to the client request that indicates the hostname and
port of the URL which the client is accessing.
Rationale: One of the most requested features from commercial server
maintainers is the ability to run a single server on a single port
and have it respond with different top level pages depending on the
hostname in the URL.
Making an absolute request URI required (because there's no way for the client to know on beforehand whether the server homes one or more sites) was suggested:
Re the first proposal, to incorporate the hostname somewhere. This
would be cleanest put into the URL itself :-
GET http://hostname/fred http/2.0
This is the syntax for proxy redirects.
To which this argument was made:
Since there will be a mix of clients, some supporting host name reporting
and some not, it just doesn't matter how this info gets to the server.
Since it doesn't matter, the easier to implement solution is a new HTTP
request header field. It allows all clients and servers to operate as they
do now with NO code changes. Clients and servers that actually need host
name information can have tiny mods made to send the extra header field
containing the URL and process it.
[...]
All I'm suggesting is that there is a better way to
implement the delivery of host name info to the server that doesn't involve
hacking the request syntax and can be backwards compatible with ALL clients
and servers.
Feel free to read on to discover the final decision yourself. But be warned, it's easy to get lost in there.

The reason for adding support for specifying a host in an HTTP request was the limited supply of IP addresses (which was not an issue yet when HTTP 1.0 came out).
If your question is "why specify the host in a Host header as opposed to on the Request-Line", the answer is the need for interopability between HTTP/1.0 and 1.1.
If the question is "why is the Host header mandatory", this has to do with the desire to speed up the transition away from assigned IP addresses.
Here's some background on the Internet address conservation with respect to HTTP/1.1.

The reason for the 'Host' header is to make explicit which host this request refers to. Without 'Host', the server must know ahead of time that it is supposed to route 'http://joesdogs.com/' to Joe's Dogs while it is supposed to route 'http://joscats.com/' to Jo's Cats even though they are on the same webserver. (What if a server has 2 names, like 'joscats.com' and 'joescats.com' that should refer to the same website?)
Having an explicit 'Host' header make these kinds of decisions much easier to program.

How to duplicate (not distribute/load-balance !) incoming http traffic to multiple servers?

I would like to create a setup where each incoming http request that matches a given rule (say url/headers-based regex) will be duplicated and dispatched to N upstream HTTP servers with the response used being from one of them (say the first).
Commmon rewriting tasks (url, headers) could be specified for each of the N upstream requests amd ideally this would work with all HTTP verbs but just GET and POST would be ok too.
What should I be looking at ? (bonus for windows based solution, two bonuses (bonusi?) for IIS-based one).
I know it is rather simple to write a rudimentary version of the above in node/python/etc but I'm looking for something mature that can be deployed in production.

Network Load Balancing with IIS may fit your request.
http://www.iis.net/learn/web-hosting/configuring-servers-in-the-windows-web-platform/network-load-balancing

How to tell if a Request is coming from a Proxy?

Is it possible to detect if an incoming request is being made through a proxy server? If a web application "bans" users via IP address, they could bypass this by using a proxy server. That is just one reason to block these requests. How can this be achieved?

IMHO there's no 100% reliable way to achieve this but the presence of any of the following headers is a strong indication that the request was routed from a proxy server:
via:
forwarded:
x-forwarded-for:
client-ip:
You could also look for the proxy or pxy in the client domain name.

If a proxy server is setup properly to avoid the detection of proxy servers, you won't be able to tell.
Most proxy servers supply headers as others mention, but those are not present on proxies meant to completely hide the user.
You will need to employ several detection methods, such as cookies, proxy header detection, and perhaps IP heuristics to detect such situations. Check out http://www.osix.net/modules/article/?id=765 for some information on this situation. Also consider using a proxy blacklist - they are published by many organizations.
However, nothing is 100% certain. You can employ the above tactics to avoid most simple situations, but at the end of the day it's merely a series of packets forming a TCP/IP transaction, and the TCP/IP protocol was not developed with today's ideas on security, authentication, etc.
Keep in mind that many corporations deploy company wide proxies for various reasons, and if you simply block proxies as a general rule you necessarily limit your audience, and that may not always be desirable. However, these proxies usually announce themselves with the appropriate headers - you may end up blocking legitimate users, rather than users who are good at hiding themselves.
-Adam

Did a bit of digging on this after my domain got hosted up on Google's AppSpot.com with nice hardcore porn ads injected into it (thanks Google).
Taking a leaf from this htaccess idea I'm doing the following, which seems to be working. I added a specific rule for AppSpot which injects a HTTP_X_APPENGINE_COUNTRY ServerVariable.
Dim varys As New List(Of String)
varys.Add("VIA")
varys.Add("FORWARDED")
varys.Add("USERAGENT_VIA")
varys.Add("X_FORWARDED_FOR")
varys.Add("PROXY_CONNECTION")
varys.Add("XPROXY_CONNECTION")
varys.Add("HTTP_PC_REMOTE_ADDR")
varys.Add("HTTP_CLIENT_IP")
varys.Add("HTTP_X_APPENGINE_COUNTRY")
For Each vary As String In varys
If Not String.IsNullOrEmpty(HttpContext.Current.Request.Headers(vary)) Then HttpContext.Current.Response.Redirect("http://www.your-real-domain.com")
Next

You can look for these headers in the Request Object and accordingly decide whether request is via a proxy/not
1) Via
2) X-Forwarded-For
note that this is not a 100% sure shot trick, depends upon whether these proxy servers choose to add above headers.

Develop Reference

r css asp.net wordpress firebase qt symfony nginx http apache-flex