I'm a bit rusty on nuances of the HTTP protocol and I'm wondering if it can support publish/subscribe directly?
HTTP is a request/response protocol: the client sends a request and the server sends back a response.
In HTTP 1.0 a new connection was made for each request.
Now HTTP 1.1 improved on HTTP 1.0 by allowing the client to keep the connection open and make multiple requests.
I realise you can upgrade an HTTP connection to a websocket for fast 2 way communications. What I'm curious about is whether this is strictly necessary?
For example if I request a resource "http://somewhere.com/fetch/me/slowly"
Is the server free to reply directly twice?
Such as first with a 202 accepted
and then shortly later with the content when it is ready,
but without the client sending an additional request first?
i.e.
Client: GET http://somewhere.com/fetch/me/slowly
Server: 202 "please wait..."
Server: 200 "here's your document"
Would it be correct to implement a publish/subscribe service this way?
For example:
Client: http://somewhere.com/subscribe
Server: item 1
...
Server: item 2
I get the impression that this 'might' work because clients will typically have an event loop watching the connection but is technically wrong (because a client following the protocol need not be implemented that way).
However, if you use chunked transfer encoding this would work.
HTTP/2 seems to allow this as well but I'm not clear whether something changed to make it possible.
I haven't seen much discussion of this in relation to pub/sub so what if anything is wrong with using plain HTTP/1.1 with or without chunked encoding?
If this works why do you need things like RSS or ATOM?
An HTTP request can have multiple 'responses', but every response before the final one must have a status code in the 1xx range, such as 102 Processing.
However, these responses are only headers, never bodies.
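As a rough illustration, here is a minimal Python sketch of what a client sees on the wire in that case. The function name `read_responses` and the sample bytes are made up for this example, and the parser assumes every 1xx response is a bare header block with no body.

```python
def read_responses(raw: bytes):
    """Return (interim_status_codes, final_status_code, final_body).

    Assumes each 1xx response is only a header block, so the next
    status line starts right after its terminating blank line.
    """
    interim = []
    rest = raw
    while True:
        head, _, rest = rest.partition(b"\r\n\r\n")
        status_line = head.split(b"\r\n", 1)[0].decode("ascii")
        code = int(status_line.split(" ")[1])
        if 100 <= code < 200:
            interim.append(code)  # provisional response, keep reading
            continue
        return interim, code, rest


# Hypothetical wire data: two interim responses, then the final one.
SAMPLE = (
    b"HTTP/1.1 102 Processing\r\n\r\n"
    b"HTTP/1.1 102 Processing\r\n\r\n"
    b"HTTP/1.1 200 OK\r\nContent-Length: 5\r\n\r\n"
    b"hello"
)
```

Only the last header block carries a body; everything before it is status information the client may ignore.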
HTTP/1.1 (like 1.0 before it) is a request/response protocol. Sending an unsolicited response is not allowed. HTTP/2 is a framed protocol. It adds multiplexing, which handles multiple requests in parallel on one connection, and server push, which lets the server offer extra responses alongside one the client asked for, but neither changes its request/response nature.
It is possible to keep a HTTP connection open and keep sending more data though. Many (audio, video) streaming services will use this.
However, this just looks like a continuous body that keeps on streaming, rather than many multiple HTTP responses.
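To make the "one continuous body" point concrete, here is a small Python sketch of HTTP/1.1 chunk framing. The helper names are mine, and the decoder is deliberately simplified: it ignores chunk extensions and trailers.

```python
def encode_chunk(data: bytes) -> bytes:
    """Frame one piece of body data as an HTTP/1.1 chunk."""
    return f"{len(data):X}".encode("ascii") + b"\r\n" + data + b"\r\n"


LAST_CHUNK = b"0\r\n\r\n"  # a zero-length chunk ends the body


def decode_chunks(body: bytes):
    """Split a chunked body back into the pieces it was sent as."""
    items = []
    pos = 0
    while True:
        line_end = body.index(b"\r\n", pos)
        size = int(body[pos:line_end], 16)  # chunk size is hexadecimal
        if size == 0:
            return items
        start = line_end + 2
        items.append(body[start:start + size])
        pos = start + size + 2  # skip the CRLF that follows the chunk data
```

Each published item becomes one chunk of the same response, so a subscriber is still reading a single (very long) reply to its single request.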
If this works why do you need things like RSS or ATOM
Because keeping a TCP connection open is not free.
Pipelining is a technique in HTTP/1.1 where multiple requests are sent at once without waiting for a response, on a keepalive connection. The responses are then returned in order by the server, without waiting for a round-trip-time between a response being sent and the next request being received.
HTTP/2 adds a feature called multiplexing, which similarly allows the client to send off multiple requests at once. In this case, however, the server can send the responses concurrently, interleaved on the same connection, rather than strictly one after another.
Without control of the server, can I achieve something similar to pipelining (i.e. receiving responses in order one-at-a-time without latency between responses) when using HTTP/2?
This would be useful when downloading many large files, without much available memory to buffer several partially-completed responses.
Without control of the server, can I achieve something similar to pipelining (i.e. receiving responses in order one-at-a-time without latency between responses) when using HTTP/2?
No you cannot, unless the server cooperates (for example the server can be configured to handle requests sequentially or something similar).
As a side note, while request pipelining was allowed in HTTP/1.1, it has always been considered a bad idea and as such made irrelevant by all major implementations (i.e. browsers don't do it, servers don't really support it, etc.).
The main problem is error handling and buggy proxy servers.
HTTP/2 allows a client to set priorities on requests so that requests are processed in priority order.
However, this feature is optional and servers may not implement it, so again you need to carefully choose/configure the server in order to get the behavior you want.
If you have some control over the server side, a better solution for both HTTP/1.1 and HTTP/2 would be to ask the server for all the files in a single request, and have the server reply with a multipart response.
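On the client side, such a multipart body could be taken apart roughly like this. It is a sketch using Python's standard `email` parser; the boundary `frontier`, the file names and the function name `split_multipart` are invented for the example, and a real client downloading large files would want a streaming parser instead of holding the whole body in memory.

```python
from email import message_from_bytes


def split_multipart(content_type: str, body: bytes):
    """Return [(filename, payload_bytes), ...] from a multipart HTTP body."""
    # The email parser expects the Content-Type header in front of the body.
    raw = b"Content-Type: " + content_type.encode("ascii") + b"\r\n\r\n" + body
    message = message_from_bytes(raw)
    if not message.is_multipart():
        raise ValueError("not a multipart body")
    return [
        (part.get_filename(), part.get_payload(decode=True))
        for part in message.get_payload()
    ]


# Hypothetical response body holding two files, in the order requested.
BODY = (
    b"--frontier\r\n"
    b'Content-Disposition: attachment; filename="a.txt"\r\n'
    b"\r\n"
    b"first file\r\n"
    b"--frontier\r\n"
    b'Content-Disposition: attachment; filename="b.txt"\r\n'
    b"\r\n"
    b"second file\r\n"
    b"--frontier--\r\n"
)
```

The parts arrive strictly one after another inside one response, which is the in-order, one-at-a-time behaviour the question asks for.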
I have a client side GUI app for human usage that consumes some SOAP web services and uses cURL as the underlying HTTP communication lib. Depending on the input, processing a request can take some large amount of time, even one hour. Neither the client nor server time out for that reason on their own and that's tested and works. Most of the requests get processed in some minutes anyway, so this is an edge case.
One of my users is forced to use a proxy between my client app and my server and for various reasons has no control over it. That proxy has a timeout configured and closes the connection to my client after 4 minutes of no data transfer. So the user can (and did) upload data for e.g. 30 minutes; afterwards the server starts to process the data, and after 4 minutes the proxy closes the connection. The server silently continues to process the request, but the user is left with an error message AND won't get the processing result. My app already uses TCP Keep Alive, so that shouldn't be the problem; instead the timeout seems to be defined for higher-level data. It works the same as the read_timeout option for Squid, which I used to reproduce the behaviour in our internal setup.
What I would like to do now is start a background thread in my web service which simply outputs some garbage data to my client for as long as the request is being processed. The client ignores that data, and it tells the proxy that the connection is still active. I can recognize my client by its user agent and can configure server side whether to output that data or not, so other clients consuming the web service wouldn't be affected.
What I'm asking for is, if there's any HTTP compliant method to output such garbage data before the actual HTTP response? So e.g. would it be enough to simply output \r\n without any additional content over and over again to be HTTP compliant with all requesting libs? Or maybe even binary 0? Or some full fledged HTTP headers stating something like "real answer about to come, please be patient"? From my investigation this pretty much sounds like chunked HTTP encoding, but I'm not sure yet if this is applicable.
I would like to have the following, where all those "Wait" stuff is simply ignored in the end and the real HTTP response at the end contains Content-Length and such.
Wait...\r\n
Wait...\r\n
Wait...\r\n
[...]
HTTP/1.1 200 OK\r\n
Server: Apache/2.4.23 (Win64) mod_jk/1.2.41\r\n
[...]
<?xml version="1.0" encoding="UTF-8"?><soap:Envelope[...]
Is that possible in some standard HTTP way and if so, what's the approach I need to take? Thanks!
HTTP Status 102
Isn't HTTP Status 102 exactly what I need? As I understand the spec, I can simply print that response line over and over again until the final response is available?
HTTP Status 102 was a dead end. Two things might work, depending on the proxy used. The first is an NPH script, which can be used to regularly print headers directly to the client. The important thing is that NPH scripts normally bypass the web server's header buffers, so the headers can be transferred over the wire as needed. They "only" need to be correct HTTP headers, and depending on the web server, proxy and such it might be a good idea to make each header unique, simply by adding an incrementing counter to the header name.
The second thing is chunked transfer-encoding, in which case small chunks of dummy data can be printed to the client in the response body. The good thing is that such small amount of data can be transferred over the wire as needed using server side flush and such, the bad thing is that the client receives this data and by default behaves as if it was part of the expected response body. That might break the application of course, but most HTTP libs provide callbacks for processing received data and if you print some unique one, the client should be able to filter the garbage out.
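The client-side filtering could look roughly like the Python sketch below. The marker value, the class and the function names are hypothetical; `on_data` is only shaped like the write callbacks most HTTP libraries offer (it reports how many bytes it consumed), it is not tied to any particular library.

```python
# Hypothetical marker; pick something that cannot start the real response body.
KEEPALIVE = b"<!--keepalive-->"


def strip_keepalives(received: bytes, marker: bytes = KEEPALIVE) -> bytes:
    """Remove every leading copy of the marker from the collected body."""
    while received.startswith(marker):
        received = received[len(marker):]
    return received


class BodyCollector:
    """Buffers body data handed over piece by piece by an HTTP library."""

    def __init__(self):
        self._parts = []

    def on_data(self, data: bytes) -> int:
        # Just store the data; a marker may be split across two calls,
        # so filtering happens once the body is complete.
        self._parts.append(data)
        return len(data)

    def body(self) -> bytes:
        return strip_keepalives(b"".join(self._parts))
```

Filtering only after the whole body has arrived keeps the sketch simple; it assumes the dummy chunks are sent strictly before the real payload.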
In my case the web service is spawning some background thread and depending on the entry point of the service requested it either prints headers using NPH or chunks of data. In both cases the data can be the same, so a NPH-header can be used for chunked transfer-encoding as well.
My NPH solution doesn't work with Squid, but the chunked one does. The problem with Squid is that its read_timeout setting is not a low-level timeout on receiving any data at all over the connection, but a logical HTTP-level one. This means that Squid does receive my headers, but it expects a complete HTTP header block within the period of time defined by read_timeout. With my NPH approach this isn't the case, simply because by design I only send garbage headers, to be ignored, until the real headers arrive.
Additionally, one has to be careful about NPH in Apache httpd, but in my use case it works. I can see the individual headers in Squid's log and without any garbage after the response body or such. Avoid the Action directive.
Apache2 sends two HTTP headers with a mapped "nph-" CGI
I've heard that you (in some cases) can prevent timeouts by sending the HTTP header back to the client before the whole HTTP body is prepared.
I know that this is impossible using gzip ... but is this possible using HTTPS?
I read in some posts that the secure part of HTTPS is done in the transport-layer (TLS/SSL) - therefore it should be possible, right?
Sorry for mixing gzip in here - it's a completely different level, I know ... and it may be more confusing than helpful as an example ;)
In HTTP 1.1 it's possible to send the response header before preparation of the response body is complete. To do this one normally uses chunked encoding.
Some servers also stream the data as is by not specifying the content length and indicating the end of stream by closing connection, but this is quite a brutal way to do things (chunked encoding was designed exactly for sending the data before it's completely available).
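Here is a small self-contained Python sketch of the chunked approach, using only the standard library. The handler, the port choice (any free local port) and the 0.2 second sleeps standing in for slow body preparation are all assumptions made for the demo.

```python
import http.client
import threading
import time
from http.server import BaseHTTPRequestHandler, HTTPServer


class SlowHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # chunked encoding needs HTTP/1.1

    def do_GET(self):
        # Status line and headers go out right away, before any body exists.
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Transfer-Encoding", "chunked")
        self.end_headers()
        for piece in (b"part one, ", b"part two"):
            time.sleep(0.2)  # stands in for slow body preparation
            self.wfile.write(b"%X\r\n%s\r\n" % (len(piece), piece))
            self.wfile.flush()
        self.wfile.write(b"0\r\n\r\n")  # last chunk ends the body

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch_with_timing():
    """Return (status, body, seconds_until_headers, seconds_until_body)."""
    server = HTTPServer(("127.0.0.1", 0), SlowHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = http.client.HTTPConnection(
            "127.0.0.1", server.server_port, timeout=5
        )
        started = time.monotonic()
        conn.request("GET", "/")
        response = conn.getresponse()  # returns once the headers arrive
        header_delay = time.monotonic() - started
        body = response.read()  # blocks until the last chunk
        total_delay = time.monotonic() - started
        conn.close()
        return response.status, body, header_delay, total_delay
    finally:
        server.shutdown()
        server.server_close()
```

The client gets the status and headers well before the body is finished, and the same code would behave identically behind TLS because the chunk framing lives entirely in the HTTP layer.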
As HTTP(S) is HTTP running over SSL/TLS channel, TLS doesn't affect the above behavior in any way.
Yes, you can do this. HTTPS is just HTTP over a TLS/SSL transport; the HTTP protocol is exactly the same.
If I make multiple HTTP Get Requests to the same server and get HTTP 200 OK responses to each one how do I tell which request maps to which response using Wireshark?
Currently it looks like an HTTP request is made, and the next HTTP 200 OK response is quickly received, so everything is in the proper sequence. I have seen things to the contrary however. For example using the Google Maps API v2 I've made several requests for location information and then the information is received in an arbitrary order (closely resembling the order in which I requested it, but not necessarily perfect.)
So my intuition is I cannot assume that my responses will be received in a specific order, even though they may be in order most of the time. So I'm wondering how I can determine this order from the response.
Update: Clarification as to what I need. I just need to know that the server has received the request. It seems like I need to do this by looking at sequence numbers and perhaps even ACKS. The reasoning behind this approach is I'm basically observing a web app and checking it is sending the information and the information is being received.
Update: This has nothing to do with Wireshark specifically. I believe it is confusing people, so I am removing it from the title. It has to do with the HTTP protocol on top of the TCP/IP protocol and how we map responses to requests.
Thanks.
After you have stopped capturing packets, follow these steps:
Position the cursor on a GET request
Open the Analyze menu
Click "Follow TCP Stream"
You get a new window with requests and responses in sequence.
While I was googling for a completely different question, I saw this one and I think I can provide a more complete answer:
HTTP/1.x dictates that responses on a connection must arrive in the order they were requested. Therefore, if you are looking at a single TCP connection at a given time you should be seeing:
Request ; Response ; Request ; Response ...
Also in HTTP/1.1, there is support for "Pipeline" where the client doesn't have to wait for responses to arrive in order to issue the next request. What could be observed in such cases is :
Request ; Response ; Request ; Request ; Response ; Response ; Request ; Response
In the HTTP response itself, there is no reference to the specific request that triggered it.
Filipo's suggestion is classic when debugging / observing a single TCP connection, but, when observing multiple TCP connections, you can't click the follow TCP Stream because you'd have to do it for each connection.
If you have many TCP connections, and many requests/responses you will have to look at TCP Source port in the request packet, and the TCP dest port in the response packet to know which response is related to each tcp connection, and then apply the HTTP request/response order rules.
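Those rules can be sketched in a few lines of Python. The tuple layout and the sample capture are made up for illustration, and the client port alone is used to identify a connection, which only holds when all traffic is between one client address and one server address.

```python
from collections import defaultdict, deque


def pair_messages(packets):
    """Pair each response with the request that triggered it.

    `packets` is an iterable of (client_port, kind, label) tuples in capture
    order. `kind` is "request" or "response"; `client_port` is the TCP source
    port of a request or the TCP destination port of a response.
    """
    pending = defaultdict(deque)  # client port -> requests awaiting a response
    pairs = []
    for client_port, kind, label in packets:
        if kind == "request":
            pending[client_port].append(label)
        else:
            # HTTP/1.x answers in request order on each connection (FIFO).
            pairs.append((pending[client_port].popleft(), label))
    return pairs


# Hypothetical capture: two connections, one of them pipelining.
CAPTURE = [
    (50001, "request", "GET /a"),
    (50002, "request", "GET /b"),
    (50002, "response", "200 #1"),
    (50001, "request", "GET /c"),
    (50001, "response", "200 #2"),
    (50001, "response", "200 #3"),
]
```

Here the first response seen in the capture belongs to the second request, because it travelled on a different connection.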
Also, Wireshark CAN decompress the response body, and it will do it automatically if all the response body has arrived, but it will do so NOT in the Follow TCP Stream.
I always use Wireshark to debug HTTP.
It seems this ability is not provided by the HTTP protocol at the application layer, so I must go down to the transport layer to determine this. In my case that is the TCP/IP layer, using sequence numbers.
HTTP only presumes a reliable transport; any protocol that provides such guarantees can be used; the mapping of the HTTP/1.1 request and response structures onto the transport data units of the protocol in question is outside the scope of this specification.
Read more:
http://www.faqs.org/rfcs/rfc2616.html
Don't use Wireshark to debug HTTP, use an HTTP debugger such as Fiddler2
I need a way to detect a missing response to a long running HTTP POST request. This problem arises when the network infrastructure (firewalls, proxies, unplugged cables, etc.) drops the response packets. The server may detect this failure, but the client cannot send additional bytes after the POST to probe the state of the TCP connection. The failure may be limited to a single TCP connection. For example I may be able to subsequently open a new TCP connection to the server.
I'm looking for a solution that still uses HTTP POST and does not change the duration of the server side processing.
Some solutions that I can think of are:
Provide a side channel interface to retrieve request & response history. If the history lists the response as having been sent (presumably resulting in a TCP error) but I have not received it within a reasonable time, I can generate a local error.
Use an X header to request that the server deliver "spurious" 100 Continue provisional responses on a regular interval. If I fail to see an expected 100 Continue or a non-provisional response I can generate a local error.
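For the second idea, the client-side check I have in mind is roughly the following Python sketch. The function name and parameters are only illustrative, and it assumes the server really does emit a provisional response every `interval` seconds once the POST has been sent.

```python
def first_missed_heartbeat(arrival_times, sent_at, interval, grace, now):
    """Return the time at which the connection counts as dead, or None.

    arrival_times: sorted times (seconds) at which a provisional response
                   arrived.
    sent_at:       time the POST finished sending.
    interval:      heartbeat period requested from the server.
    grace:         extra slack allowed for network jitter.
    now:           current time.
    """
    deadline = sent_at + interval + grace
    for arrived in arrival_times:
        if arrived > deadline:
            return deadline  # the gap before this heartbeat was too long
        deadline = arrived + interval + grace
    return deadline if now > deadline else None
```

For example, with a 10 second interval and 2 seconds of grace, heartbeats at 10 and 20 followed by silence until second 40 would flag the connection as dead at second 32.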
Is there a state of the art solution for this problem?
It sounds to me like you are using SOAP for something that would be much better done using a stateful connection or a server-side push technology.