Can max-stale be set in a response header so that it is used by requests on the client side? The documentation here - Cache-Control: Syntax, specifically the "Cache response directives" section - makes it appear that max-stale is not a response directive. Is it only used by the client to decide how much longer it will use a stale resource, with the server/application having no say in it? If so, what can be set on the response to simulate the functionality of max-stale?
No, max-stale cannot be used in the response. It's meant to be used by the client to override the cache defaults: "Gimme this resource even if it's technically a bit past its expiry date". It would be used if there is a caching server between the client and the origin server.
To be honest, in my experience, request cache-control headers are rarely used, except to force refresh all the way back to origin server (max-age=0) for example when doing a “hard reload” with dev tools open. I’ve never seen a real world instance of max-stale as far as I can recall.
There is no equivalent response header. If a server is happy for a resource to be used for longer, then it should just increase the max-age value.
There is the stale-while-revalidate response option, which allows a stale resource to be used for a limited period to enable a quick reload of the page, while the browser checks and downloads a new version in the background for next time. However, support for it is limited at this time, as shown at the bottom of the page you linked.
Simulating the behaviour of max-stale on the response does not really make much sense. The server owns the resources and understands how those resources change over time. The server has to decide how critical a resource is and whether it is OK for it to be served stale, and also to pick reasonable time limits for re-validation so that clients get fresh data most of the time. It is a balancing act on the server side: too strict a setting and you overload your server with requests and the client with network traffic; too loose and your clients see an old representation.
A client can use max-stale to avoid any re-validation and take whatever is in the cache; you don't want to generate network requests unless it is really necessary. For example, must-revalidate overrides max-stale, so if the response carries that directive, you will hit the origin server even with max-stale set. The same goes for no-cache and no-store. So in that sense the server does have a say: it can identify resources that CANNOT be used stale, even with max-stale, and mark them with the appropriate directive.
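For illustration, here's a minimal sketch of how a client would attach the max-stale request directive, using Python's urllib and a hypothetical URL:

```python
import urllib.request

# Ask any intermediate cache to serve this resource even if it is up to
# one hour past its freshness lifetime (URL is hypothetical).
req = urllib.request.Request(
    "http://example.com/resource",
    headers={"Cache-Control": "max-stale=3600"},
)
# urllib normalizes header names to "Cache-control" internally.
print(req.get_header("Cache-control"))
```

Whether the directive has any effect depends entirely on whether a cache sits between you and the origin; against the origin server directly it is simply ignored.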
Related
I have a client side GUI app for human usage that consumes some SOAP web services and uses cURL as the underlying HTTP communication lib. Depending on the input, processing a request can take some large amount of time, even one hour. Neither the client nor server time out for that reason on their own and that's tested and works. Most of the requests get processed in some minutes anyway, so this is an edge case.
One of my users is forced to use a proxy between my client app and my server and for various reasons has no control over it. That proxy has a timeout configured and closes the connection to my client after 4 minutes of no data transfer. So the user can (and did) upload data for e.g. 30 minutes; afterwards the server starts to process the data, after 4 minutes the proxy closes the connection, and the server silently continues to process the request, but the user is left with some error message AND won't get the processing result. My app already uses TCP Keep-Alive, so that shouldn't be the problem; instead the timeout seems to be defined for higher-level data. It behaves the same as squid's read_timeout option, which I used to reproduce the behaviour in our internal setup.
What I would like to do now is start a background thread in my web service which simply outputs some garbage data to my client for as long as the request is being processed; the client ignores it, and it tells the proxy that the connection is still active. I can recognize my client using the user agent and can configure server side whether to output that data or not, so other clients consuming the web service wouldn't get a problem.
What I'm asking is whether there's any HTTP-compliant method to output such garbage data before the actual HTTP response. E.g. would it be enough to simply output \r\n without any additional content over and over again to be HTTP-compliant with all requesting libs? Or maybe even binary 0? Or some full-fledged HTTP headers stating something like "real answer about to come, please be patient"? From my investigation this sounds a lot like chunked HTTP encoding, but I'm not sure yet if it is applicable.
I would like to have the following, where all those "Wait" stuff is simply ignored in the end and the real HTTP response at the end contains Content-Length and such.
Wait...\r\n
Wait...\r\n
Wait...\r\n
[...]
HTTP/1.1 200 OK\r\n
Server: Apache/2.4.23 (Win64) mod_jk/1.2.41\r\n
[...]
<?xml version="1.0" encoding="UTF-8"?><soap:Envelope[...]
Is that possible in some standard HTTP way and if so, what's the approach I need to take? Thanks!
HTTP Status 102
Isn't HTTP Status 102 exactly what I need? As I understand the spec, I can simply print that response line over and over again until the final response is available?
HTTP Status 102 was a dead end, but two things might work, depending on the proxy used. First, an NPH script can be used to regularly print headers directly to the client. The important thing is that NPH scripts normally bypass the web server's header buffers and can therefore be transferred over the wire as needed. They "only" need to be correct HTTP headers, and depending on the web server, proxy and such, it might be a good idea to create incrementing, unique headers, simply by adding a counter to the header name.
The second thing is chunked transfer-encoding, in which case small chunks of dummy data can be printed to the client in the response body. The good thing is that such small amount of data can be transferred over the wire as needed using server side flush and such, the bad thing is that the client receives this data and by default behaves as if it was part of the expected response body. That might break the application of course, but most HTTP libs provide callbacks for processing received data and if you print some unique one, the client should be able to filter the garbage out.
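The chunk framing itself is simple. Here's a minimal Python sketch of it; the "Wait..." keepalive marker is an arbitrary choice that the client-side callback would filter out:

```python
def encode_chunk(data: bytes) -> bytes:
    """Frame one piece of data as an HTTP/1.1 chunk:
    hex length, CRLF, payload, CRLF."""
    return b"%x\r\n%s\r\n" % (len(data), data)

# Periodic keepalive chunks the client filters out, then the real
# payload, then the zero-length chunk that terminates the body.
KEEPALIVE = encode_chunk(b"Wait...\n")
body = KEEPALIVE * 3 + encode_chunk(b"<real response>") + b"0\r\n\r\n"
```

The server would flush each keepalive chunk to the socket on a timer, which is exactly the "small chunks of dummy data" behaviour described above.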
In my case the web service is spawning some background thread and depending on the entry point of the service requested it either prints headers using NPH or chunks of data. In both cases the data can be the same, so a NPH-header can be used for chunked transfer-encoding as well.
My NPH solution doesn't work with Squid, but the chunked one does. The problem with Squid is that its read_timeout setting is not low level for the connection to receive data at all, but instead some logical HTTP thing. This means that Squid does receive my headers, but it expects a complete HTTP header within the period of time defined using read_timeout. With my NPH approach this isn't the case, simply because by design I only want to send some garbage headers to ignore until the real headers arrive.
Additionally, one has to be careful about NPH in Apache httpd, but in my use case it works. I can see the individual headers in Squid's log and without any garbage after the response body or such. Avoid the Action directive.
Apache2 sends two HTTP headers with a mapped "nph-" CGI
I understand that one can set Accept-Ranges: none on the server to advise the client not to attempt a range request.
I am wondering if there is a way to tell a browser not to attempt a range request without having to make any changes on the server.
For instance, is there a setting in Chrome or Firefox that I can toggle to deter my browser from making range requests?
You answered the question in the first sentence.
The relevant RFC is 7233, Hypertext Transfer Protocol (HTTP/1.1): Range Requests:
2.3. Accept-Ranges
A client MAY generate range requests without having received this header field for the resource involved.
A server that does not support any kind of range request for the target resource MAY send
Accept-Ranges: none
to advise the client not to attempt a range request.
If you mean you want to know how to disable range requests in a browser altogether, consult the specific browser's documentation. A quick web search yielded no options for me to do this for common browsers.
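For reference, a range request is just an extra request header that the server is free to ignore. A minimal Python sketch (hypothetical URL); a caller must check for a 206 Partial Content status, because a server that doesn't honour ranges answers 200 with the full body:

```python
import urllib.request

def build_range_request(url: str, start: int, end: int) -> urllib.request.Request:
    """Build a request for bytes start..end (inclusive) of a resource.
    A server that honours it replies 206 Partial Content; one that
    does not (e.g. one sending Accept-Ranges: none) replies 200 with
    the full body, so callers must check the status code."""
    return urllib.request.Request(
        url, headers={"Range": f"bytes={start}-{end}"}
    )

req = build_range_request("http://example.com/video.mp4", 0, 499)
print(req.get_header("Range"))
```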
I was reading an article on the launch of HTTP/2. It said that HTTP/2 is based on the SPDY (speedy) protocol and can provide faster browsing than HTTP/1.1 by using "header field compression" and "multiplexing". How exactly do these work?
Am I supposed to believe that in HTTP/1.1 requests are processed in a 'one after the other' manner?
Multiplexing
With HTTP/1.1 a lot of time is spent just waiting. A browser sends a request and waits for the response to come back before sending another GET, etc. An inefficient use of the bandwidth. At times it would use pipelining, but that too suffers from the fact that requests sometimes need to wait on requests issued before them: the head-of-line blocking problem.
With multiplexing, there's virtually no waiting but the browsers can just ask for hundreds of things at once and they will be delivered in whatever order they can be delivered and without individual streams or objects having to wait for each other. (With prioritization and flow control to help control them properly.)
This will be most notable on high-latency connections. For a visible and clear demo of what it can do, see golang's gophertiles demo at https://http2.golang.org/gophertiles?latency=1000 (requires an HTTP/2-enabled browser)
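A toy sketch of the multiplexing idea in Python, greatly simplified: responses are split into frames tagged with a stream id and interleaved round-robin on one connection, so no stream waits for another to finish (real HTTP/2 adds priorities and flow control):

```python
from collections import deque

def interleave(streams: dict[int, list[bytes]]) -> list[tuple[int, bytes]]:
    """Round-robin frames from several streams over one connection.
    Each (stream_id, frame) pair models one DATA frame on the wire.
    Note: consumes the input lists."""
    queue = deque(streams.items())
    frames = []
    while queue:
        stream_id, chunks = queue.popleft()
        frames.append((stream_id, chunks.pop(0)))
        if chunks:  # stream still has frames left, requeue it
            queue.append((stream_id, chunks))
    return frames
```

With HTTP/1.1 (no pipelining), stream 3 could not start until stream 1 had completely finished; here its first frame goes out immediately after stream 1's first frame.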
Header compression
Additionally, HTTP/2 offers header compression that makes a client able to squeeze in more requests earlier in a TCP connection lifetime. In the early slow-start period of a new TCP connection it can be valuable to cram in more requests so that the responses come back earlier. HTTP headers are extremely repetitive in their nature.
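A quick way to see why this matters: even generic deflate shrinks a run of near-identical request headers dramatically (HPACK, being stateful, does better still). A small Python demonstration with made-up headers:

```python
import zlib

# Requests on one connection repeat their headers almost verbatim:
# same Host, same User-Agent, same Cookie, etc. (example values).
headers = (
    b"GET /style.css HTTP/1.1\r\n"
    b"Host: example.com\r\n"
    b"User-Agent: Mozilla/5.0 (X11; Linux x86_64)\r\n"
    b"Accept: text/css,*/*;q=0.1\r\n"
    b"Cookie: session=abc123; theme=dark\r\n\r\n"
) * 10  # ten near-identical requests

compressed = zlib.compress(headers)
print(len(headers), len(compressed))  # compressed is a small fraction
```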
Server push
An HTTP/2 server can send data to the client before the client asks for it, as if the client had asked for it. If the server thinks the client is likely to want/need that data anyway, half an RTT can be saved.
Starting with HTTP/2, both headers and HTTP response content can be compressed. With HTTP/1.1, headers are never compressed, unlike the content (specified with the Content-Encoding header).
Multiplexing is related to server push. When the server sends an HTML page, it can use the same connection to push additional resources like CSS and JavaScript files. If the HTML page needs those additional scripts, no further requests are sent to the server, as they were already pushed.
HTTP/1.1 supports keep-alive connections; connections are not closed until "Connection: close" is sent.
So if the browser, in this case Firefox, has network.http.pipelining enabled and network.http.pipelining.maxrequests increased, isn't the effect the same in the end?
I know these settings are disabled because for some websites this could increase load, but I think a simple HTTP header flag could tell the browser that it is OK to use multiplexing, and this problem could be solved more easily.
Wouldn't it be easier to change default settings in browsers than to invent a new protocol that increases complexity, especially in HTTP servers?
SPDY has a number of advantages that go beyond what HTTP pipelining can offer, which are described in the SPDY whitepaper:
With pipelining, the server still has to return the responses one at a time in the order they were requested. This can be a problem if the client requests a resource that's dynamically generated before one that is static: the server cannot send any of the "easy" static responses until the dynamically generated one has been generated and sent. With SPDY, responses can be returned out of order or in parallel as they are generated, lowering the total time to receive all resources.
As you noted in your question, not all servers are able to deal with pipelining: it's not just load, some servers actually behave incorrectly when the client requests pipelining. Using a header to indicate that it's okay to do pipelining is too late to get the maximum benefit: you are already receiving the first response at that point, so while you can use it on future connections it's already too late for this one.
SPDY compresses headers using an algorithm which is specific to that task (stateful and with knowledge of what is normally in HTTP headers); while yes, SSL already includes compression, just compressing them with deflate is not as efficient. Most HTTP requests have no bodies and only a short GET line, so the headers make up virtually the entire request: any compression you can get is an improvement. Many responses are also small compared to their headers.
SPDY allows servers to send back additional responses without the client asking for them. For example, a server might start sending back the CSS for a page along with the original HTML, before the client has had a chance to receive and parse the HTML to determine the stylesheet URL. This can speed up page loads even further by eliminating the need for the client to actually parse the HTML before requesting other resources needed to render the page. It also supports a less bandwidth-heavy version of this feature where it can "hint" about which resources might be needed, and allow the client to decide: this allows, for example, clients that don't care about images to not bother to request them, but clients that want to display images can still request the images using the given URLs without needing to wait for the HTML.
Other things too: see William Chan's answer for even more.
HTTP pipelining is susceptible to head of line blocking (http://en.wikipedia.org/wiki/Head-of-line_blocking) at the HTTP transaction level whereas SPDY only has head of line blocking at the transport level, due to its use of multiplexing.
HTTP pipelining has deployability issues. See https://datatracker.ietf.org/doc/html/draft-nottingham-http-pipeline-01 which describes a number of different workarounds and heuristics to mitigate this. SPDY as deployed in the wild does not have this problem since it is generally deployed over SSL (port 443) using NPN (http://technotes.googlecode.com/git/nextprotoneg.html) to negotiate SPDY support. SSL is key, since it prevents intermediaries from interfering.
SPDY has header compression. See http://dev.chromium.org/spdy/spdy-whitepaper which discusses some benchmark results of the benefits of header compression. Now, it's useful to note that bandwidth is less and less of an issue (see http://www.belshe.com/2010/05/24/more-bandwidth-doesnt-matter-much/), but it's also useful to remember that bandwidth is still key for mobile. Check out https://developers.google.com/speed/articles/spdy-for-mobile which shows how beneficial SPDY is for mobile.
SPDY supports features like server push. See http://dev.chromium.org/spdy/spdy-best-practices for ways to use server push to improve cacheability of content and still reduce roundtrips.
HTTP pipelining has ill-defined failure semantics. When the server closes the connection, how do you know which requests have been successfully processed? This is a major reason why POST and other non-idempotent requests are not allowed over pipelined connections. SPDY provides semantics to cancel individual streams on the same connection, and also has a GOAWAY frame which indicates the last stream to be successfully processed.
HTTP pipelining has difficulty, often due to intermediaries, in allowing deep pipelines. This (in addition to many other reasons like HoL blocking) means that you still need to utilize multiple TCP connections to achieve maximal parallelization. Using multiple TCP connections means that congestion control information cannot be shared and compression contexts cannot be shared (as SPDY does with headers), which is worse for the internet (more costly for intermediaries and servers).
I could go on and on about HTTP pipelining vs SPDY. But I'd recommend just reading up on SPDY. Check out http://dev.chromium.org/spdy and our tech talk on SPDY at http://www.youtube.com/watch?v=TNBkxA313kk&list=PLE0E03DF19D90B5F4&index=2&feature=plpp_video.
See Difference between HTTP pipeling and HTTP multiplexing with SPDY
Can the HTTP Content-Length header be trusted? If not, how accurate is it?
I want to know the size of the image before I download it.
Can the HTTP Content-length header be malformed? Yes.
Should you trust it to be a fair representation of the size of the message body? Yes.
It should be, and usually is, accurate. However, it is entirely possible for a web server to report an incorrect content length, although this obviously doesn't happen often (I recall old versions of Apache returning nonsensical content lengths on files > 2GB).
It is also not mandatory to provide a Content-Length header.
It had better be - otherwise why have it at all?
If it can't be reliably determined in advance, it shouldn't be sent by the server at all. (When dealing with dynamically generated text, for example, something like chunked transfer encoding may be used, which doesn't require the final length to be known when the HTTP header is written at the beginning of the stream.)
Content-Length can be sent by the server code or by the Apache layer itself. When the code does not send it, Apache will.
There are known client crashes when the client connects and the socket is closed because the Content-Length sent was smaller than the actual body.
Since images are usually not generated by code at run-time, you can rely on it.
Browsers can be unforgiving if the content-length is incorrect.
I was having a problem here, where the server was sometimes returning a content-length that was too low. The browsers just wouldn't handle it.
So yes, you can assume that the server is setting the content-length correctly, based on the knowledge that browser clients work on the same assumption.
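If you do want a belt-and-braces check on the client side, comparing the received body against the declared length is trivial. A hypothetical Python helper (real HTTP libraries already do this for you):

```python
def body_matches_length(headers: dict[str, str], body: bytes) -> bool:
    """Check the received body against the declared Content-Length.
    Hypothetical helper for illustration only."""
    declared = headers.get("Content-Length")
    if declared is None:
        return True  # no claim made, e.g. chunked transfer encoding
    return len(body) == int(declared)
```

A mismatch tells you the transfer was truncated or the server lied; either way, the body should be treated as suspect.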
Something I discovered recently that I'd like to share.
My case was using NodeJS to POST an HTTP request to another server, where I was able to set the Content-Length value myself in the header (to a value lower than the actual size).
When the other server received my HTTP request, it only processed the request params/body up to the size that Content-Length told it.
E.g. if my request's actual length is 100 but Content-Length says 80, the receiving server will only process bytes 1-80; the last 20 are not processed, causing errors like "Invalid parameter".
Nowadays the server side will automatically insert the Content-Length for you. Unless you have a specific need and know what you are doing, you don't want to change it, as it may cause you trouble.