I'm seeing some very interesting behaviour from seemingly random clients on a website: random POST requests to the server result in a bad request reaching the backend. I've tracked down why the request is bad, but I still don't know WHY this occurs.
The client connects to the webserver over HTTP.
The client sends the headers of an ordinary POST request (but not the body).
Five seconds pass. The cache server passes the incomplete request to its backend, because the client took too long to complete it.
The cache server replies to the client with an error message, indicating that the request was bad.
The client sends the POST body, a few seconds after the reply has been received.
I have no problem accepting that the cache server can be reconfigured to wait longer. My problem is: what can be the reason for a client to wait several seconds between sending the headers and sending the POST body? I don't know of any case where this behavior makes sense.
This is a fairly ordinary Magento eCommerce website, with a setup of Haproxy -> Varnish -> Nginx -> php5-fpm. Varnish is the component that ships the request to Nginx once five seconds of idling have passed.
I have verified with tcpdump/wireshark that the server does not receive the POST body from the client within that window (captured in front of Haproxy as well).
I have verified that this occurs across user agents and across request types (from ordinary login forms to AJAX callbacks).
Does anyone have any clever ideas?
NOTE: I wasn't sure if this was a question for Stack Overflow or Serverfault, but I consider this an HTTP question that requires developer knowledge.
The server is buggy: you shouldn't send partial requests from the front end to the backend. It's possible that the client is waiting for an HTTP 100 (Continue) response from the server before transmitting the POST body. It's also possible that the client is generating the POST data and that takes some time for some reason.
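For illustration, here's a minimal Python sketch of the Expect: 100-continue dance, using raw sockets; the host, path and form body are placeholders. A client behaving like this sits silently after the headers, waiting for the interim response, and only falls back to sending the body after its own timeout if the 100 never arrives, which can easily exceed Varnish's five-second idle window:

```python
import socket

# Placeholder host/path; the body is a dummy form submission.
body = b"username=foo&password=bar"
request_head = (
    "POST /customer/account/loginPost/ HTTP/1.1\r\n"
    "Host: example.com\r\n"
    "Content-Type: application/x-www-form-urlencoded\r\n"
    f"Content-Length: {len(body)}\r\n"
    "Expect: 100-continue\r\n"
    "\r\n"
).encode()

sock = socket.create_connection(("example.com", 80), timeout=10)
sock.sendall(request_head)               # headers only; body deliberately withheld

# Block here, waiting for "HTTP/1.1 100 Continue" before sending the body.
interim = sock.recv(4096)
if interim.startswith(b"HTTP/1.1 100"):
    sock.sendall(body)                   # green light: now send the body
    final = sock.recv(4096)
else:
    final = interim                      # server skipped the 100 and answered directly
print(final.decode(errors="replace").splitlines()[0])
sock.close()
```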
Related
I'm trying to diagnose a web service that sits behind some load balancers and proxies. Under load, one of the servers along the way starts to return HTTP 504 errors, which indicates a gateway timeout. With that background out of the way, here is my question:
When a proxy makes a request to the destination server, and the destination server receives the request but doesn't respond in time (thus exceeding the timeout), resulting in a 504, what happens when the destination server does eventually respond? Does it know somehow that the requestor is no longer interested in a response? Does it happily send a response with no idea that the gateway already sent an HTTP error response back to the client? Any insight would be much appreciated.
It's implementation-dependent, but any proxy that conforms to RFC 2616 section 8.1.2.1 should include Connection: close on the 504 and close the connection back to the client, so it can no longer be associated with anything coming back from the defunct server connection, which should also be closed. Under load there is the potential for race conditions in this scenario, so you could be looking at a bug in your proxy.
If the client then wants to make further requests it'll create a new connection to the proxy which will result in a new connection to the backend.
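To make the backend's side of this concrete, here's a rough Python sketch (addresses, timings and the canned response are all made up): the backend answers too late, and its write simply lands on a connection the proxy has already abandoned:

```python
import socket
import time

# Toy backend that responds long after any sane proxy timeout.
srv = socket.socket()
srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
srv.bind(("127.0.0.1", 8080))
srv.listen(1)

conn, _ = srv.accept()        # the proxy's connection to us
conn.recv(65536)              # read the proxied request
time.sleep(120)               # "work" that exceeds the proxy's timeout

try:
    # By now the proxy has sent its 504 and closed its side. The first
    # write may even appear to succeed (it lands in the kernel buffer);
    # a later write typically fails with BrokenPipeError once the RST
    # comes back. Either way, nothing reaches the original client.
    conn.sendall(b"HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok")
except (BrokenPipeError, ConnectionResetError):
    print("proxy already gone; the late response is discarded")
```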
We've recently noticed a problem where some user agents would repeat the same POST request without the user actually physically triggering it twice.
After further study, we noticed this only happens when the request goes through our load balancer and the server takes a long time to process the request. A packet capture session eventually revealed that the load balancer drops the connection after a 5-minute timeout by sending a TCP reset to the client; however, the client then automatically resubmitted the request without user intervention.
We observed this behavior in the Apache HTTP client for Java, Firefox and IE 8. (I cannot install other browsers to test.) This makes me think the behavior is part of the HTTP standard, but it is not very easy to google.
Also, it seems this only happens if the first request is submitted via a kept-alive TCP connection.
This is part of the HTTP/1.1 protocol, which specifies how clients should handle connections that are closed prematurely by servers:
http://www.w3.org/Protocols/rfc2616/rfc2616-sec8.html#sec8.2.4.
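As an illustration of what that section prescribes, here's a hedged Python sketch using the standard http.client module (the wrapper function is mine): a request sent on a connection that dies before any status line arrives is transparently resent on a fresh connection, which is exactly how a non-idempotent POST can reach the server twice:

```python
import http.client

def send_with_spec_retry(host, method, path, body=None):
    """Retry once, per RFC 2616 sec. 8.2.4, if the connection dies
    before any status line arrives."""
    conn = http.client.HTTPConnection(host, timeout=30)
    try:
        conn.request(method, path, body=body)
        return conn.getresponse()
    except (ConnectionResetError, http.client.BadStatusLine):
        # The server (or a load balancer) closed the connection before
        # responding. The client cannot tell whether the request was
        # ever processed, yet user agents retry transparently anyway.
        conn.close()
        conn = http.client.HTTPConnection(host, timeout=30)
        conn.request(method, path, body=body)
        return conn.getresponse()
```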
I experienced a similar situation whereby the same form was posted a couple of times, only milliseconds apart.
A packet capture via Wireshark confirmed the retransmission by the browser, and logs from the server showed the arrival of both requests.
Further investigation also revealed that load balancers such as F5 have reported incidents of this retransmission behavior, so it is worth checking with your load balancer vendor as well.
Let's say I am going to deploy a server application that's likely to be placed behind a NAT/firewall, and I don't want to ask users to tweak their NAT port mapping. In other words, inbound connections to the server are impossible, but my app is a server application by nature, i.e. it sends back objects per URI.
Now, I'm thinking about initiating connections from the server periodically to see what requests are there to be responded to. I'm going to use HTTP via port 80 as something that would likely be working through NAT/firewall from virtually anywhere.
The question is, are there any standard considerations and common practices of implementing a client that can act as a server at the application level, specifically using HTTP? Any special HTTP headers? Design patterns?
E.g. I am thinking about the following scheme:
The client (which is my logical server) sends a dummy HTTP request to the server
The server responds with non-standard headers X-Request-URI:, X-Host:, X-If-Modified-Since:, etc.; in other words, request headers wrapped into X-xxx, as they are not standard in this situation; it also asks to keep the connection alive
The client responds with a POST request that sends the requested object, again using wrapped headers (e.g. X-Status:, etc.)
Unless there is a more "standard" way of doing something like this, do you think my approach is plausible?
Edit: an interesting discussion took place on reddit here
I've done something similar. This is very common. The client initiates the connection to the server and keeps the connection ALIVE. If the session is shut down, the client re-initiates it. While the session is up, the server can push anything to the client, since the connection was client-initiated.
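To make that concrete, here is a rough long-polling sketch in Python; the relay URL, the X-* header names and the serve_object() helper are all hypothetical, mirroring the scheme from the question rather than any standard:

```python
import time
import urllib.request

RELAY = "http://relay.example.com"   # placeholder rendezvous server

def serve_object(uri):
    """Hypothetical app logic: return the requested object as bytes."""
    return b"object for " + uri.encode()

while True:
    try:
        # Long-poll: the relay holds this GET open until a request arrives.
        with urllib.request.urlopen(RELAY + "/pending", timeout=300) as resp:
            requested_uri = resp.headers.get("X-Request-URI", "/")
        # Push the answer back with the wrapped, non-standard headers.
        req = urllib.request.Request(
            RELAY + "/response",
            data=serve_object(requested_uri),
            headers={"X-Request-URI": requested_uri, "X-Status": "200"},
            method="POST",
        )
        urllib.request.urlopen(req, timeout=30)
    except OSError:
        time.sleep(5)                # relay unreachable; re-initiate later
```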
How an HTTP request works (if I have a mistake, please correct me):
The user types http://www.website.com into the browser.
The server sends back an HTML page with links to image + CSS + JS files.
The browser reads the HTML and, for each included image/CSS/JS file, sends an HTTP request to get the file.
When the browser sends a request to get a file, does it wait for the response, or does it send the request for the next file right away?
Thanks
Most browsers will have an internal queue of requests which are handled as follows:
Request the first item. If a fresh copy is in the cache, this will mean a request to the cache. If a stale copy with validation information (last-mod and/or e-tag) is in the cache, this will be a conditional request (the server or proxy may return a 304 indicating the stale copy is actually still fresh). Otherwise, an unconditional request is made.
As rendering of the entity returned requires other entities, these will be put into a queue of needed requests.
Requests for a URI that is already in that same queue (e.g. if a page uses the same image more than once) will have the same entity immediately reused (hence if a URI returns a random image, but you use it more than once in the same page, you will get the same image both times).
Requests will be processed immediately, so in the case of a webserver, images, css, etc. will begin downloading before the HTML has finished rendering or indeed, finished downloading.
Requests to the same domain with the same protocol (HTTP or HTTPS) will be pipelined, using a connection that has already been used, rather than opening a new one.
Requests are throttled in two ways: a maximum number of simultaneous requests to the same domain, and a total maximum number of simultaneous requests (see the sketch below).
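A rough Python sketch of that throttling; the limits of 10 total and 6 per host are arbitrary illustrations, though 6 per host is a common browser default:

```python
import threading
import urllib.request
from urllib.parse import urlparse

GLOBAL_LIMIT = threading.Semaphore(10)      # total simultaneous requests
per_host_limits = {}                        # host -> Semaphore(6)
per_host_lock = threading.Lock()

def fetch(url):
    host = urlparse(url).hostname
    with per_host_lock:
        sem = per_host_limits.setdefault(host, threading.Semaphore(6))
    with GLOBAL_LIMIT, sem:                 # both caps must allow the request
        with urllib.request.urlopen(url, timeout=30) as resp:
            return resp.read()

# Queue up a page's resources, browser-style.
urls = ["http://example.com/", "http://example.com/style.css"]
threads = [threading.Thread(target=fetch, args=(u,)) for u in urls]
for t in threads:
    t.start()
for t in threads:
    t.join()
```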
The browser usually initiates more than one socket to the target server, and thus gets content over more than one socket at the same time. This can be combined with HTTP pipelining (what you are asking about), where the browser sends multiple requests on the same socket without waiting for each of their responses.
From the Wikipedia page:
HTTP pipelining is a technique in which multiple HTTP requests are written out to a single socket without waiting for the corresponding responses. Pipelining is only supported in HTTP/1.1, not in 1.0.
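Here's a bare-bones Python sketch of pipelining over a raw socket (example.com is a placeholder; note that most modern browsers ship with pipelining disabled): both requests go out back to back, and the responses come back afterwards, in order:

```python
import socket

# Two requests written in one go; the second asks the server to close
# the connection so the read loop below terminates cleanly.
reqs = (
    b"GET / HTTP/1.1\r\nHost: example.com\r\n\r\n"
    b"GET /favicon.ico HTTP/1.1\r\nHost: example.com\r\nConnection: close\r\n\r\n"
)
sock = socket.create_connection(("example.com", 80), timeout=10)
sock.sendall(reqs)                   # no waiting between the two requests

data = b""
while True:
    chunk = sock.recv(4096)
    if not chunk:                    # server closed: both responses are in
        break
    data += chunk
sock.close()
print(data.decode(errors="replace")[:500])   # responses arrive in request order
```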
Suppose I click on a link to website A on a page and, just before the current page gets replaced, I click on a different link to a different website, say B.
What happens to the request that was sent to website A? Does the webserver of site A reply back and the browser just rejects the HTTP reply?
There is no specific HTTP provision for canceling a request. I would expect this to happen at the socket level.
I would expect the associated TCP socket to be closed immediately upon canceling the request. Since HTTP uses only one socket per request, the server will see the close after the request. If the close is processed before the response data is generated, the data won't be sent down to the client at all. Otherwise the data is sent to the client and ignored, since the socket is closed. There may be wasted work, but a special HTTP message to "cancel" would have the same effect.
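For illustration, a tiny Python sketch of that socket-level cancel (host and path are placeholders): the client just closes its socket without reading, and the close (a FIN, then an RST for any late data) is the only signal the server ever gets:

```python
import socket

sock = socket.create_connection(("example.com", 80), timeout=10)
sock.sendall(b"GET /slow-page HTTP/1.1\r\nHost: example.com\r\n\r\n")

# The user clicks a different link: abandon the request without reading.
# There is no HTTP "cancel" message; closing the socket is the whole story.
sock.close()
```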