We want to reduce the HTTPS connect latency to the Facebook Graph servers. We have configured our servers to use HTTP keep-alive. However, it looks like the connection gets closed after every call (from the traffic server logs...).
Is there a way to deterministically see whether keep-alive connections are honored by graph.facebook.com, or for that matter by any server in general?
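One deterministic check is to open a single TLS connection yourself, send two requests over it, and see whether the second still gets a response (and whether the responses carry Connection: close). A rough Java sketch of that probe follows; it assumes the responses carry a Content-Length header (it does not handle chunked encoding), and the target host is just an example:

```java
import javax.net.ssl.SSLSocket;
import javax.net.ssl.SSLSocketFactory;
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

public class KeepAliveProbe {
    public static void main(String[] args) throws Exception {
        String host = "graph.facebook.com"; // or any server you want to test
        SSLSocketFactory factory = (SSLSocketFactory) SSLSocketFactory.getDefault();
        try (SSLSocket socket = (SSLSocket) factory.createSocket(host, 443)) {
            socket.startHandshake();
            OutputStream out = socket.getOutputStream();
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(socket.getInputStream(), StandardCharsets.ISO_8859_1));
            String request = "GET / HTTP/1.1\r\nHost: " + host
                    + "\r\nConnection: keep-alive\r\n\r\n";
            for (int i = 1; i <= 2; i++) {
                out.write(request.getBytes(StandardCharsets.ISO_8859_1));
                out.flush();
                String status = in.readLine();
                // A null status line on the second round trip means the server
                // closed the connection after the first response.
                System.out.println("request " + i + " -> "
                        + (status == null ? "connection closed" : status));
                if (status == null) return;
                int contentLength = 0;
                String line;
                while ((line = in.readLine()) != null && !line.isEmpty()) {
                    String lower = line.toLowerCase();
                    if (lower.startsWith("connection:")) System.out.println("  " + line);
                    if (lower.startsWith("content-length:"))
                        contentLength = Integer.parseInt(line.substring(line.indexOf(':') + 1).trim());
                }
                in.skip(contentLength); // discard the body so the next response starts cleanly
            }
        }
    }
}
```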
Based on Facebook's attitude towards efficient queries, I would not expect them to honor any kind of persistent connection, especially since your requests would end up as a performance hit on their servers.
You should look at fetching as much data as you can in one go by combining your queries into a single batch request.
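For illustration, here is a minimal Java sketch of such a call against the Graph API's batch endpoint (the access token and the sub-requests are placeholders):

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

public class GraphBatch {
    public static void main(String[] args) throws Exception {
        // Two sub-requests bundled into one round trip; both are placeholders.
        String batch = "[{\"method\":\"GET\",\"relative_url\":\"me\"},"
                     + "{\"method\":\"GET\",\"relative_url\":\"me/friends\"}]";
        String body = "access_token=YOUR_TOKEN"
                    + "&batch=" + URLEncoder.encode(batch, StandardCharsets.UTF_8);
        HttpRequest request = HttpRequest.newBuilder(URI.create("https://graph.facebook.com/"))
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        // One HTTPS round trip; the body is a JSON array with one entry per sub-request.
        System.out.println(response.body());
    }
}
```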
There are about a thousand clients (mobile applications).
There is a server (we are going to use Spring Boot) that saves information about changes on the client side.
Changes from each client should arrive at the server every 5 minutes.
Please tell me the best way to implement this:
use simple HTTP requests to the server
use WebSocket
use long polling
I can't figure out which is more critical: keeping a large number of connections open, or responding to a large number of requests.
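For a sense of scale: 1,000 clients reporting every 5 minutes is only about 3-4 requests per second on average, which plain HTTP handles easily with no connections held open between reports. A minimal Spring Boot sketch of that option (the endpoint and payload shape are hypothetical):

```java
// Hypothetical ingest endpoint: each client POSTs its change set every 5 minutes.
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class ChangeController {

    @PostMapping("/changes/{clientId}")
    public void receive(@PathVariable String clientId, @RequestBody String changes) {
        // Persist the change set here; this stub just acknowledges it.
        System.out.printf("client %s sent %d bytes%n", clientId, changes.length());
    }
}
```

Keeping 1,000 mostly idle connections open (WebSocket or long polling) mainly pays off when the server needs to push data to clients between their reports.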
I had the following issue on a system that I supported ~7 years ago. We never got to the bottom of it, and focus shifted onto other issues. I was recently reminded of it, and wondered if anyone would know what was going on. But alas I'll be a little short on details. Sorry.
The Setup
I had a farm of web servers sitting behind a load balancer. The servers were hosting a system that would receive HTTP requests (XML and/or SOAP) from clients, then for each one kick off a bunch of further HTTP requests to third-party suppliers, wait for the suppliers' responses, then process and combine the results and respond to the client's request.
Think insurance comparison, but as a business-to-business XML service.
The whole process would take several seconds from receiving the initial client request to sending back a response to that original HTTP request, and the server would be processing tens or hundreds of requests in parallel (i.e. at any given point, a given web server would have many client requests that had come in and been logged, but not yet been responded to).
We had detailed logging which recorded the receipt of each request, including origin IP and which server was processing it, and recorded when a response was sent.
All client requests were sent to a single IP address (well, URL), which was the address of the load balancer, which would then forward requests to the web servers, which weren't individually accessible to the internet (they didn't have public IP addresses).
Our load balancer would allow us to take individual web servers out of rotation for maintenance.
When we did that, we could watch the DB logs and see new requests stop coming in and the existing requests gradually get completed, until there were no outstanding requests and the server was idle.
The Problem
We found that sometimes, when we took a server out of rotation ... it wouldn't entirely stop receiving requests. You could see the large bulk of requests suddenly stop coming in, but the server would still receive a trickle of fresh requests (I don't know ... maybe 0.1% of normal load, maybe less?). I think the longest we left it going was maybe ... 10 minutes?
Notably we realised that all of those requests were coming from a single client/IP address (I don't remember which).
I forget whether other (still-in-rotation) webservers were still receiving requests from this client, but I think they were?
If we rebooted the web server, no further requests would come in after it restarted.
Web stack was Windows, IIS, ASP.NET; pretty old school even at the time. All servers individually owned and configured.
What was happening?
We vaguely waved our hands and asserted that the client's integration with us was "holding an HTTP tunnel open and sending multiple requests through it", rather than sending each request separately, and thus was maintaining that tunnel even after the LB stopped sending new requests to that server. But that was BS-waffle, and since we never needed to actually understand what was going on, we ignored it and moved on with our lives :)
But I'd still like to know what we were seeing, if anyone can diagnose it from that description.
We vaguely waved our hands and asserted that the client's integration with us was "holding an HTTP tunnel open and sending multiple requests through it", rather than sending each request separately, and thus was maintaining that tunnel even after the LB stopped sending new requests to that server.
That sounds like a good explanation.
Normally, an LB will refuse new connections to a removed server, but will allow open connections to live on until they naturally close. This is known as "connection draining" or "graceful shutdown".
If one of your clients had HTTP keepalive on, and was holding a TCP connection open and sending HTTP requests through it for a long time, it would give the symptoms you describe.
Most LBs will have a configuration knob for how long to wait for connections to close before force-closing them during this "connection draining" time. You can set a timeout here to avoid this scenario if it is a problem for you.
HTTP connection handling is, to a large extent, at the client's discretion. Perhaps most of your clients were of one type (say, web browsers) and weren't holding a single connection open for 10 minutes, but perhaps one client was different (say, a programmatic HTTP API client)?
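As an illustration of that last kind of client, a Java sketch: java.net.http.HttpClient pools and reuses persistent connections by default, so a long-running loop like this typically rides a single TCP connection - exactly the kind of connection a draining load balancer leaves pinned to the old backend (the URL is hypothetical):

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class StickyClient {
    public static void main(String[] args) throws Exception {
        // The connection pool lives in the client instance; by default it keeps
        // connections alive and reuses them for subsequent requests.
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest
                .newBuilder(URI.create("https://example.com/api")) // hypothetical endpoint
                .build();
        for (int i = 0; i < 100; i++) {
            HttpResponse<String> response =
                    client.send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println(i + " -> " + response.statusCode());
            Thread.sleep(5_000); // idle timeouts permitting, same TCP connection each time
        }
    }
}
```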
Further reading about "connection draining" on AWS Load Balancers here (the exact details will vary by LB vendor): https://docs.aws.amazon.com/elasticloadbalancing/latest/classic/config-conn-drain.html
Further reading about HTTP keep alive here: https://en.wikipedia.org/wiki/HTTP_persistent_connection
I know that in HTTP/1.1, keep-alive is the default behavior, unless the client explicitly asks the server to close the connection by including a Connection: close header in its request, or the server decides to include a Connection: close header in its response. I am wondering if this isn't something of an obstacle to scalability when growing servers horizontally.
My scenario: we are developing all new services following microservices patterns, either in Java or Python. It is desirable to design and implement them in such a way that we can scale horizontally. For instance, I can use Docker in order to easily scale up, or use Spring Boot Cloud Config. Whatever the physical host implementation, the basic idea is to favour scalability.
My understanding: I must keep server and client as agnostic of each other as possible. When I set up HTTP keep-alive, I understand there is an advantage in reusing the same HTTP connection (and saving some CPU), but I guess I am forcing the client (e.g. another service) to keep using the same connection, which may cancel out the advantage of several Docker instances of the same service, since I will promote the client to keep consuming the same initial connection.
Assuming my understanding is correct, I assume this is not a good idea, since we develop services whose responses should be reusable by different consumers with different approaches: some consumers consume asynchronously or follow reactive design paradigms, which makes me question keeping the same connection alive. In practical terms: the connection used should be freed as soon as possible, in order to really balance the demand over all the providers.
***edited after first comment
Let's assume I have multiple different consumer services (CS1, CS2 ... CSn) connecting to a single load balancer instance (LB), which will forward the requests to multiple Docker instances of the same provider service (D1, D2 ... Dn). Since keep-alive is the default behaviour in HTTP/1.1+, we have keep-alive on every connection (both between CSx and LB, and between LB and Dx).

As far as I know, the only advantage of keep-alive is saving the CPU cost of opening/closing connections. If I send Connection: close after each request, there is no advantage at all in using keep-alive. If I don't, it means I promote the LB to stay connected to a specific Dx, using exactly the same connection for a while, right? (I chose the word "promote" because I guess "force" might not be the appropriate one, since keep-alive has a timeout, after which the LB might route to another Dx anyway.) So at some moment I have CS1 -> LB -> D1 kept alive and persisted for a while, right?

Coming back to my original question: isn't that against the idea of the asynchronous/parallel/reactive paradigm? For instance, I have a scenario where a single consumer service calls another service a few times before returning a single answer to a page. Today we do this sequentially, but we may decide to call in parallel, where either the first answer already is the answer to the page, or we compose an answer to the page from all of them without caring about the order. The caller service will wait for all the answers before returning to a controller, and the order doesn't matter. Isn't it strange that I have keep-alive = true?
I am forcing the client (e.g. another service) to keep using the same connection
You are not forcing anything. The client can easily avoid persistent connections by sending HTTP/1.0 and/or Connection: close. There are many practical HTTP applications that work just like that.
keep using the same connection, which may cancel out the advantage of several Docker instances of the same service, since I will promote the client to keep consuming the same initial connection
Assuming your load balancer works well, it will usually distribute connections evenly across your instances. Of course, this may not work when you only have a few connections altogether, but a few connections can hardly pose a scalability or performance problem.
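To make the first point concrete, a Java sketch of a client opting out of keep-alive per request (the JDK's HttpURLConnection explicitly permits a Connection: close request header; the URL is hypothetical):

```java
import java.net.HttpURLConnection;
import java.net.URI;

public class NoKeepAliveClient {
    public static void main(String[] args) throws Exception {
        // Hypothetical load-balancer address.
        HttpURLConnection conn = (HttpURLConnection)
                URI.create("http://lb.example.internal/service").toURL().openConnection();
        conn.setRequestProperty("Connection", "close"); // opt out of keep-alive
        System.out.println("status: " + conn.getResponseCode());
        conn.disconnect();
        // Alternatively, -Dhttp.keepAlive=false disables connection reuse JVM-wide.
    }
}
```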
I have an ASP.NET Web API application running behind a load balancer. Some clients keep a busy HTTP connection alive for too long, creating unnecessary affinity and causing high load on some server instances. In order to fix that, I wish to gracefully close any connection that is making too many requests in a short period of time (thus forcing the client to reconnect and pick a different server instance), while at the same time keeping low-traffic connections alive indefinitely. Hence I cannot use a static configuration.
Is there some API that I can call to flag a request as "answer this, then close the connection"? Or can I simply add the Connection: close HTTP header, which ASP.NET will see and close the connection for me?
It looks like a good solution for your situation would be the built-in IIS functionality called Dynamic IP Restrictions: "To provide this protection, the module temporarily blocks IP addresses of HTTP clients that make an unusually high number of concurrent requests or that make a large number of requests over a small period of time."
It is supported by Azure Web Apps:
https://azure.microsoft.com/en-us/blog/confirming-dynamic-ip-address-restrictions-in-windows-azure-web-sites/
I am not 100% sure this would work in your situation, but in the past I have had to block people coming from specific IP addresses geographically, and people coming from common proxies. I created an authorization attribute class following:
http://www.asp.net/web-api/overview/security/authentication-filters
It would dump the person out based on their IP address by returning an HttpStatusCode.BadRequest. On every request you would have to check a list of bad IPs in the database and go from there. Maybe you can handle the rest client-side, because they are going to get a ton of errors.
Write an action filter that returns a 302 Found response for the 'blocked' IP address. I would hope the client would close the current connection and try again at the new location (which could just be the same URL as the original request).
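The thread's stack is ASP.NET, but the mechanism is easy to show in Java servlet terms (this is only an illustration of the idea, not the ASP.NET API; the threshold, the window handling, and whether the container honors an application-set Connection: close header - Tomcat, for example, does - are all assumptions):

```java
import jakarta.servlet.Filter;
import jakarta.servlet.FilterChain;
import jakarta.servlet.ServletException;
import jakarta.servlet.ServletRequest;
import jakarta.servlet.ServletResponse;
import jakarta.servlet.http.HttpServletResponse;
import java.io.IOException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class BusyConnectionFilter implements Filter {
    private static final int MAX_REQUESTS_PER_WINDOW = 100; // hypothetical threshold
    private final Map<String, AtomicInteger> counts = new ConcurrentHashMap<>();

    @Override
    public void doFilter(ServletRequest req, ServletResponse res, FilterChain chain)
            throws IOException, ServletException {
        String ip = req.getRemoteAddr();
        int seen = counts.computeIfAbsent(ip, k -> new AtomicInteger()).incrementAndGet();
        if (seen > MAX_REQUESTS_PER_WINDOW) {
            // Answer the request but ask the container to close the connection
            // afterwards; the client's reconnect goes back through the LB.
            ((HttpServletResponse) res).setHeader("Connection", "close");
        }
        chain.doFilter(req, res);
        // A real implementation would clear `counts` on a timer so the
        // threshold applies per time window rather than per server lifetime.
    }
}
```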
I have several clients that constantly post data to a REST service. REST service is put behind a network load balancer. Each client sends 100 - 500 MB a day and I need to support 500+ clients.
I can either POST very large payloads, which will reduce the overhead of TCP/IP session setup and HTTP headers but will firmly tie one client to a particular server and limit my scalability options; or I can send small HTTP payloads, which I can load-balance well, but at the cost of more overhead for TCP/IP session setup and HTTP headers.
What is the recommended packet size for HTTP POST? Or how can I calculate one for my environment?
There is no recommended size.
While HTTP POST size is not constrained by the RFCs, HTTP is a commodity protocol implementing request/response-style messaging, so most of the infrastructure is configured around the idea that TCP connections are not particularly long-lasting and do not carry significant amounts of data; i.e. there will be factors outside your control which may impact the service. And although HTTP supports range requests for responses, there is no corollary for requests.
You can get around a lot of these (although not all) by using HTTPS. However, you still need to think about how you detect/manage outages - are you happy to wait for a TCP timeout?
With 500+ clients presumably using the system quite heavily, the congestion avoidance limits shouldn't be a problem - whether TCP window scaling is likely to be an issue depends on how the system is used. HTTP handshakes should not be an issue unless you restrict the request size to something silly.
If the service is highly dependent on clients pushing lots of data onto your server, then I'd encourage you to look at parsing the data on the client (given the volume, presumably it's coming from files - implying a signed Java applet or JavaScript with the UniversalBrowserRead privilege) and then sending it over a bi-directional communication channel (e.g. a websocket).
Leaving that aside for now, the only way you can find out what the route between your clients and your server will support is to measure it - and monitor it. I would expect that a 2 MB upload would work pretty much anywhere, while a 10 MB upload would work most of the time within the US or Europe - and that you could probably increase this to 50 MB as long as there are no mobile clients.
But if you want to maintain the effectiveness of the service you'll need to monitor bandwidth, packet loss and lost connections.
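Since there is no universal right answer, one practical pattern is to make the batch size a configurable knob, start somewhere moderate, and tune it against the bandwidth and packet-loss monitoring described above. A Java sketch (the 5 MB default and all names here are mine):

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.Arrays;

public class BatchingUploader {
    private static final int BATCH_BYTES = 5 * 1024 * 1024; // tune per environment

    public static void upload(byte[] data, String endpoint) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        for (int offset = 0; offset < data.length; offset += BATCH_BYTES) {
            byte[] batch = Arrays.copyOfRange(
                    data, offset, Math.min(offset + BATCH_BYTES, data.length));
            HttpRequest request = HttpRequest.newBuilder(URI.create(endpoint))
                    .header("Content-Type", "application/octet-stream")
                    .POST(HttpRequest.BodyPublishers.ofByteArray(batch))
                    .build();
            HttpResponse<Void> response =
                    client.send(request, HttpResponse.BodyHandlers.discarding());
            if (response.statusCode() >= 300) {
                throw new IllegalStateException("upload failed: HTTP " + response.statusCode());
            }
            // Moderate batches keep each request cheap to retry and let the
            // load balancer spread successive batches across backends.
        }
    }
}
```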