How do server-sent events actually work? - server-sent-events

So I understand the concept of server-sent events (EventSource):
A client connects to an endpoint via EventSource
Client just listens to messages sent from the endpoint
The thing I'm confused about is how it works on the server. I've had a look at different examples, but the one that comes to mind is Mozilla's: http://hacks.mozilla.org/2011/06/a-wall-powered-by-eventsource-and-server-sent-events/
Now this may be just a bad example, but it kinda makes sense how the server side would work, as I understand it:
Something changes in a datastore, such as a database
A server-side script polls the datastore every Nth second
If the polling script notices a change, a server-sent event is fired to the clients
Does that make sense? Is that really how it works from a barebones perspective?

The HTML5 doctor site has a great write-up on server-sent events, but I'll try to provide a (reasonably) short summary here as well.
Server-sent events are, at its core, a long running http connection, a special mime type (text/event-stream) and a user agent that provides the EventSource API. Together, these make the foundation of a unidirectional connection between a server and a client, where messages can be sent from server to client.
On the server side, it's rather simple. All you really need to do is set the following http headers:
Content-Type: text/event-stream
Cache-Control: no-cache
Connection: keep-alive
Be sure to respond with the code 200 and not 204 or any other code, as this will cause compliant user agents to disconnect. Also, make sure to not end the connection on the server side. You are now free to start pushing messages down that connection. In nodejs (using express), this might look something like the following:
app.get("/my-stream", function(req, res) {
res.status(200)
.set({ "content-type" : "text/event-stream"
, "cache-control" : "no-cache"
, "connection" : "keep-alive"
})
res.write("data: Hello, world!\n\n")
})
On the client, you just use the EventSource API, as you noted:
var source = new EventSource("/my-stream")
source.addEventListener("message", function(message) {
console.log(message.data)
})
And that's it, basically.
Now, in practice, what actually happens here is that the connection is kept alive by the server and the client by means of a mutual contract. The server will keep the connection alive for as long as it sees fit. Should it want to, it may terminate the connection and respond with a 204 No Content next time the client tries to connect. This will cause the client to stop trying to reconnect. I'm not sure if there's a way to end the connection in a way that the client is told not to reconnect at all, thereby skipping the client trying to reconnect once.
As mentioned client will keep the connection alive as well, and try to reconnect if it is dropped. The algorithm to reconnect is specified in the spec, and is fairly straight forward.
One super important bit that I've so far barely touched on however is the mime type. The mime type defines the format of the message coming down the connecting. Note however that it doesn't dictate the format of the contents of the messages, just the structure of the messages themselves. The mime type is extremely straight forward. Messages are essentially key/value pairs of information. The key must be one of a predefined set:
id - the id of the message
data - the actual data
event - the event type
retry - milleseconds the user agent should wait before retrying a failed connection
Any other keys should be ignored. Messages are then delimited by the use of two newline characters: \n\n
The following is a valid message: (last new line characters added for verbosity)
data: Hello, world!
\n
The client will see this as: Hello, world!.
As is this:
data: Hello,
data: world!
\n
The client will see this as: Hello,\nworld!.
That pretty much sums up what server-sent events are: a long running non-cached http connection, a mime type and a simple javascript API.
For more information, I strongly suggest reading the specification. It's small and describes things very well (although the requirements of the server side could possibly be summarized a bit better.) I highly suggest reading it for the expected behavior with certain http status codes, for instance.

You also need to make sure to call res.flushHeaders(), otherwise Node.js won't send the HTTP headers until you call res.end(). See this tutorial for a complete example.

Related

Does 200 mean the request successfully started or successfully finished?

I realize this might sound like an odd question, but I'm seeing some odd results in my network calls so I'm trying to figure out if perhaps I'm misunderstanding something.
I see situations where in isolated incidents, when I'm uploading data, even though the response is 200, the data doesn't appear on the server. My guess is that the 200 response arrives during the initial HTTP handshake and then something goes wrong after the fact. Is that a correct interpretation of the order of events? Or is the 200 delivered once the server has collected whatever data the sending Header tells it is in the request? (cause if so then I'm back to the drawing board as to how I'm seeing what I'm seeing).
It means it has been successfully finished. From the HTTP /1.1 Spec
10.2.1 200 OK
The request has succeeded. The information returned with the response is dependent on the method used in the request, for example:
GET an entity corresponding to the requested resource is sent in the response;
HEAD the entity-header fields corresponding to the requested resource are sent in the response without any message-body;
POST an entity describing or containing the result of the action;
TRACE an entity containing the request message as received by the end server.
Finished. It doesn't make sense otherwise. Something might go wrong and the code could throw an error. How could the server send that error code when it already sent 200 OK.
What you experience might be lag, caching, detached thread running after the server sent the 200 etc.
A success reply, like 200, is not sent until after server has received and processed the full request. Error replies, on the other hand, may be sent before the request is fully received. If that happens, stop sending your request.

What to do with errors when streaming the body of an Http request

How do I handle a server error in the middle of an Http message?
Assuming I already sent the header of the message and I am streaming the
the body of the message, what do I do when I encounter an unexpected error.
I am also assuming this error was caused while generating the content and not a connection error.
(Greatly) Simplified Code:
// I can define any transfer encoding or header fields i need to.
send(header); // Sends the header to the Http client.
// Using an iterable instead of stream for code simplicity's sake.
Iterable<String> stream = getBodyStream();
Iterator<String> iterator = stream.iterator();
while (iterator.hasNext()) {
String string;
try {
string = iterator.next();
catch (Throwable error) { // Oops! an error generating the content.
// What do i do here? (In regards to the Http protocol)
}
send(string);
}
Is there a way to tell the client the server failed and should either retry or abandon the connection or am I sool?
The code is greatly simplified but I am only asking in regards to the protocol and not the exact code.
Thank You
One of the following should do it:
Close the connection (reset or normal close)
Write a malformed chunk (and close the connection) which will trigger client error
Add a http trailer telling your client that something went wrong.
Change your higher level protocol. Last piece of data you send is a hash or a length and the client knows to deal with it.
If you can generate a hash or a length (in a custom header if using http chunks) of your content before you start sending you can send it in a header so your client knows what to expect.
It depends on what you want your client to do with the data (keep it or throw it away). You may not be able to make changes on the client side so the last option will not work for example.
Here is some explanation about the different ways to close. TCP option SO_LINGER (zero) - when it's required.
I think the server should return a response code start with 5xx as per RFC 2616.
Server Error 5xx
Response status codes beginning with the digit "5" indicate cases in which the server is aware that it has erred or is incapable of performing the request. Except when responding to a HEAD request, the server SHOULD include an entity containing an explanation of the error situation, and whether it is a temporary or permanent condition. User agents SHOULD display any included entity to the user. These response codes are applicable to any request method.

Network protocol for consistent mobile clients requests

Is there a protocol (or framework) that ensures that when a request fails, it fails on both the client side (iOS, Android, etc) and server side, and when it succeeds, successes on both sides?
The request might be completed on the server but because of dropped network connection, the client does not receive the response and thinks that the request failed.
The Post-Redirect-Get pattern can be adapted to this. The post part is used to submit the request, and the redirected get will be to a "results" page where the client can acquire the status (in progress, failure, success. etc).
Obviously, a client should not conclude from a network problem that the request failed. It should simply be prepared to wait and/or retry to obtain the status.
The interesting case is where the initial request submission is incomplete, i.e. nothing, not even the redirect comes back. This is where the adaptation comes in. The initial data submission should be after the server has generated a transaction identifier that the client can use as an alternative for status requests. (E.g., a form with a static field "Please save and use this tracking ID for status inquiries".)
If your question was whether this fallback can be automated at the protocol level, the answer unfortunately is no.

Node.js http request pipelining

So, I want to use node.js and http request pipelining, but I want to use HTTP only as a transport, nothing else. I am interested in exploiting the request pipelining feature. However, one problem that I am running into is that till a send a response to a previous request, the next requests's callback isn't fired by node. I want a way to be able to do this. I shall be handling the ordering of results in the application. Is there a way to do this?
The HTTP RFC mentions that the responses should be in order, but I don't see any reason for node.js to not call the next callback till the 1st one is responded to. The application can in theory send the response to the 2nd query as a response to the 1st one (as long as there is some way for the recipient to know that it is a response to the 2nd one).
The HTTP client in NodeJS does not support pipelining. (Slightly old post from Ryan, but I'm fairly sure it still holds.)

Does sending POST data to a server that doesn't accept post data recieve the data?

I am setting up a back end API in a script of mine that contacts one of my sites by sending XML to my web server in the form of POST data. This script will be used by many and I want to limit the bandwidth waste for people that accidentally turn the feature on without a proper access key.
I will be denying requests that do not have the correct access key by maybe generating a 403 access code.
Lets say the POST data is ~500kb of data. Does the server receive all 500kb of data when this attempt is made regardless of the status code?
How about if I made the url contain the key mydomain/api/123456789 and generate 403 status on all bad access keys.
Does the POST data still get sent/received regardless or is it negotiated before the data is finally sent.
Thanks in advance!
Generally speaking, the entire request will be sent, including post data. There is often no way for the application layer to return a response like a 403 until it has received the entire request.
In reality, it will depend on the language/framework used and how closely it is linked to the HTTP server. Section 8.2.2 of RFC2616 HTTP/1.1 specification has this to say
An HTTP/1.1 (or later) client sending
a message-body SHOULD monitor the
network connection for an error status
while it is transmitting the request.
If the client sees an error status, it
SHOULD immediately cease transmitting
the body. If the body is being sent
using a "chunked" encoding (section
3.6), a zero length chunk and empty trailer MAY be used to prematurely
mark the end of the message. If the
body was preceded by a Content-Length
header, the client MUST close the
connection.
So, if you can find a language environemnt closely linked with the HTTP server (for example, mod_perl), you could do this in a way which does comply with standards.
An alternative approach you could take is to make an initial, smaller request to obtain a URL to use for the larger POST. The application can then deny providing the URL to clients without an appropriate key.
Here is great book about RESTful Web Services, where it's explained how HTTP works: http://oreilly.com/catalog/9780596529260
You can consider any request as envelope, where on top of it it's written address (URL), some properties (HTTP Headers) and inside it there's some data (if request is initiated by post method). So as you might guess you can't receive envelope partially.
Oh I forgot, it's when you are using HTTP Post with standard HTTP header "application/x-www-form-urlencoded" but if you are uploading files (correspondingly using ""multipart/form-data") Django gives you control over streamed chunks of files using Middleware classes: http://docs.djangoproject.com/en/dev/topics/http/middleware/

Resources