What's the point of streaming over HTTP (MPEG-DASH)? - http

I was reading about streaming-over-HTTP technologies such as MPEG-DASH but don't really get the point. As I understand it, such protocols divide up the binary data in the media file into chunks, wrap each chunk in some kind of metadata, then stuff these into HTTP messages and send them to the client.
But what's the point of implementing this on top of HTTP instead of just implementing/creating a separate application layer protocol? Doesn't this just introduce more overhead and unnecessarily complicate the encoding/decoding process?

Transporting stuff over HTTP isn't done for the sake of efficiency. HTTP itself (at least until HTTP/2) carries a lot of overhead per request.
The main reasons for using HTTP are simplicity, interoperability and re-usability. It's simple to understand and implement, it already exists in both servers and clients; plus it's well known by networks so it can easily pass through firewalls.
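To make the "chunks over plain HTTP" idea concrete, here is a minimal sketch in Python. It is not MPEG-DASH itself: the fixed segment size, the `seg-N.bin` naming scheme and the manifest layout are all made up for illustration (real DASH describes its segments in an XML manifest). What it tries to show is the re-usability argument: once the media is cut into independently addressable pieces, the client only needs ordinary GET requests, so any existing HTTP server or cache can deliver them.

```python
def segment_media(data: bytes, segment_size: int):
    """Split media bytes into fixed-size segments and describe them in a manifest."""
    if segment_size <= 0:
        raise ValueError("segment_size must be positive")
    segments = {}
    entries = []
    for index, offset in enumerate(range(0, len(data), segment_size)):
        name = f"seg-{index}.bin"  # hypothetical naming scheme
        chunk = data[offset:offset + segment_size]
        segments[name] = chunk
        entries.append({"url": name, "bytes": len(chunk)})
    manifest = {"segment_count": len(entries), "segments": entries}
    return manifest, segments


def play(manifest: dict, http_get) -> bytes:
    """Client side: fetch each segment listed in the manifest, in order."""
    return b"".join(http_get(entry["url"]) for entry in manifest["segments"])


media = bytes(range(256)) * 10  # 2560 bytes of stand-in "video"
manifest, segments = segment_media(media, segment_size=1000)
# A dict lookup stands in for an HTTP GET against a static file server.
reassembled = play(manifest, http_get=segments.__getitem__)
```

In a real deployment `http_get` would be an actual HTTP client call and the segments would simply be files on a web server.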

Related

Difference between ZeroMQ asynchronous http requests and Messages?

How is using asynchronous HTTP Requests different from using Messages when it comes to sending data in ZeroMQ?
An HTTP request is simply the use of the Hypertext Transfer Protocol, normally over TCP/IP, between two machines, client and server. It can be used for moving data in either direction. There are no particular restrictions as to what that data can be. An asynchronous request is simply one where the requester doesn't block waiting for the reply after making the request; it uses some mechanism to rendezvous with the reply later, whenever that happens to come in.
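As a rough illustration of "make the request now, rendezvous with the reply later", here is a small Python sketch using a future. The `slow_fetch` function is a hypothetical stand-in for a blocking HTTP call (no network is involved), so treat this as a picture of the control flow rather than of any particular HTTP library.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_fetch(url: str) -> str:
    """Stand-in for a blocking HTTP request (hypothetical; no network involved)."""
    time.sleep(0.05)  # pretend the server takes a while to answer
    return f"response for {url}"


with ThreadPoolExecutor(max_workers=1) as pool:
    future = pool.submit(slow_fetch, "/events")  # request issued, caller does not wait
    other_work = sum(range(1000))                # carry on with something else
    reply = future.result()                      # rendezvous with the reply later
```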
Sending a message through ZeroMQ can be somewhat similar, specifically with the REQ/REP pattern (request, reply). As with an HTTP request, the requester sends some sort of message and the replier replies in some way, strictly in that order.
ZeroMQ uses its own protocol, ZMTP, to move messages around. Again, there's nothing really limiting what data is in a message. ZeroMQ is inherently asynchronous - it's implementing the Actor programming model (though I notice that some implementations in some languages have eroded ZeroMQ's simplicity w.r.t. that, fitting into the language's own way of being asynchronous rather than using a poll function provided by ZeroMQ).
However, ZeroMQ builds many more data distribution patterns than REQ/REP on top of ZMTP, like PUB/SUB and DEALER/ROUTER, that HTTP simply has no equivalent of. Further differences are that ZeroMQ can use IP, interprocess comms, or in-memory transports; this makes it highly suited for both in-application use and inter-machine distributed applications. I guess that a webserver could be contacted over IPC too, but I've never heard of anyone bothering to do that. HTTP is expected to be used over specific ports (e.g. port 80), whereas ZMQ gets used on whatever ports the developer wants (obeying the normal port allocation rules if they want a quiet life).

Why use Server-Sent Events instead of simple HTTP chunked streaming?

I just read RFC-6202 and couldn't figure out benefits of using SSEs instead of simply requesting a chunked stream. As an example use case imagine you want to implement client and server, where the client wants to "subscribe" to events at the server using pure HTTP technology. What would be a drawback of the server keeping the initial HTTP request open and then occasionally sending new chunks as new events come up?
I found some arguments against this kind of streaming, which include the following:
Since Transfer-Encoding is hop-by-hop instead of end-to-end, a proxy in between might try to consolidate the chunks before forwarding the response to the client.
A TCP connection needs to be kept open between client and server the whole time.
However, in my understanding, both arguments also apply to SSEs. Another potential argument I could imagine is that a JavaScript browser client might have no chance to actually get the respective chunks, since re-combining them is handled on a lower level, transparent to the client. But I don't know if that's actually the case, since video streams must work in some kind of similar way, or not?
EDIT: What I've found in the meantime is that SSE basically is just a chunked stream, encapsulated by an easier-to-use API. Is that right?
And one more thing. This page first tells that SSE doesn't support streaming binary data (for which technical reason?) and then (at the bottom), they say that it is possible but inefficient. Could somebody please clarify that?
Yes, SSE is an API that works on top of HTTP and provides some nice features, such as automatic reconnection and handling of different event types.
If you want to use it for streaming binary data, it is certainly not the right API. The main reason is that SSE is a text-based protocol (events are delimited by '\n's and every line starts with a text field name such as "data:"). If you still want to experiment with binary over SSE, a quick and dirty hack would be to send the binary data encoded as Base64.
If you want to know more about SSE, maybe you can have a look at this simple library: https://github.com/mariomac/jeasse
You are correct: SSE is a nice API on top of chunked HTTP. The API is good, and it also has support for reconnection.
With regards to your question about binary over SSE, I've got no experience of doing that. However, you can send binary over HTTP. So I see no reason why you can't do this. Although, you may end up having to convert it in JavaScript.
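The Base64 workaround mentioned in the answers can be sketched like this in Python. The helper names `format_sse` and `parse_sse` are made up for this example, and the parser only understands frames produced by the formatter; it is not a complete event-stream parser. The point is that the wire format is line-oriented text, so raw bytes (which may contain newlines) have to be encoded into text first, at the cost of roughly one third more data.

```python
import base64


def format_sse(data: str, event: str = "") -> str:
    """Serialise one event in the text/event-stream format."""
    lines = []
    if event:
        lines.append(f"event: {event}")
    # Each line of the payload gets its own "data:" field.
    lines.extend(f"data: {line}" for line in data.split("\n"))
    return "\n".join(lines) + "\n\n"  # a blank line terminates the event


def parse_sse(frame: str) -> dict:
    """Parse a single frame produced by format_sse (not a full SSE parser)."""
    event, data = "message", []
    for line in frame.rstrip("\n").split("\n"):
        field, _, value = line.partition(": ")
        if field == "event":
            event = value
        elif field == "data":
            data.append(value)
    return {"event": event, "data": "\n".join(data)}


payload = bytes([0, 255, 10, 13, 128])  # binary, includes newline bytes
frame = format_sse(base64.b64encode(payload).decode("ascii"), event="blob")
decoded = base64.b64decode(parse_sse(frame)["data"])
```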

How do XMPP/HTML/etc. *really* work?

This might be a dumb question, however, I have been continually frustrated by what seems to be a big gap in every explanation I've seen of protocols like XMPP or HTML. So basically, when I've read documentation on either, in general, it will describe the structure of the data sent back and forth through the protocol, but it does not explain exactly how this data is transferred. It's one thing to provide an example of, say, a generic HTTP request, but it is something else to explain how this text is actually sent to the server.
I guess posed another way, what resources are there out there for learning best practices for implementing text-based protocols? At their core, are all text-based protocols basically the exact same thing? How, for example, would it differ at the binary level, were I to say send the text content of an HTTP request over IRC vs however it is done natively by HTTP?
If I wanted to develop my own simple textual protocol, what would be the best way to send the text to a client? Does the content itself even really matter? What I mean is that, obviously, HTTP and XMPP are rather different protocols, but do they differ in terms of how the text is transferred from computer to computer?
HTTP, IRC and XMPP are all sent on top of TCP, which is a protocol that provides a bidirectional stream between two endpoints (IP address + port). Under the hood, the data you send is split into separate packets, sent across the network, and reassembled on the other end, so that the recipient just sees a stream of incoming data - except when something goes wrong; there is a somewhat accessible description here.
What that means is that while the application protocol (HTTP, XMPP etc) is different, the underlying transport mechanism is exactly the same. It would be possible (perhaps even interesting) to implement HTTP on top of IRC: an HTTP/IRC client enters a channel, sends the HTTP request as messages to the channel, line by line, a server is present in the channel, reads the request and sends the response the same way - but transporting HTTP over IRC is fundamentally different from transporting HTTP over TCP. The former means layering an application protocol over another application protocol (and the IRC connection needs to go over TCP anyway), while the latter is an application protocol over a transport protocol, which is the way things usually are done (except for various kinds of proxies).
Hope that makes some sense...
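To show how little magic there is, here is a small Python sketch that "speaks HTTP" by hand over a byte stream. It uses `socket.socketpair()` as a stand-in for a real TCP connection so that it runs without a network; a real client would connect to a server with something like `socket.create_connection`. The host name and path are placeholders.

```python
import socket

client, server = socket.socketpair()  # two connected stream sockets, no network

request = (
    "GET /index.html HTTP/1.1\r\n"
    "Host: example.com\r\n"
    "\r\n"
)
client.sendall(request.encode("ascii"))  # the "protocol" is just these bytes

received = b""
while not received.endswith(b"\r\n\r\n"):  # a blank line ends the HTTP headers
    received += server.recv(1024)

# The receiving side gives the bytes meaning by parsing them as text.
request_line = received.split(b"\r\n", 1)[0].decode("ascii")
method, path, version = request_line.split(" ")

server.sendall(b"HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nhi")
server.close()

response = b""
while chunk := client.recv(1024):  # read until the peer closes the stream
    response += chunk
client.close()
```

An XMPP or IRC exchange would look the same at this level: different text, same stream of bytes underneath.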

Discussion: Chat server via node.js: HTTP or TCP?

I was considering doing a chat server using node.js/socket.io. Should I make it a tcp server or a http server? I'd imagine tcp server would be more efficient, but can you send other stuff to it like file attachments etc? If tcp is more efficient, how much more so? Also, just wondering how many concurrent connections can one node.js server handle? Is it more work to do TCP or HTTP?
You are talking about 2 totally different approaches here - TCP is a transport layer protocol and HTTP is an application layer protocol. HTTP (usually) operates over TCP, so whichever option you choose, it will still be operating over TCP.
The efficiency question is sort of a moot point, because you are talking about different OSI layers. If you went for raw TCP sockets, your solution would probably be more efficient - in bandwidth at least - since HTTP contains a whole bunch of extra data (the headers) that would likely be irrelevant to your purposes (depending on the scale of the chat program). What you are talking about developing there is your own application layer protocol.
You can send anything you like over TCP - after all HTTP can send attachments, and that operates over TCP. FTP also operates over TCP, and that is designed purely for transferring "attachments". In order to do this, you would need to write your protocol so that it was able to tell the remote party that the following data was a file, then send the file data, then tell the remote party that the transfer is complete. Implementations of this are many and varied (the HTTP approach is completely different from the FTP approach) and your options are pretty much infinite.
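One possible shape for such a home-grown protocol is sketched below in Python. Everything about the frame layout (a 1-byte type, a 4-byte length, then the payload) and the type codes is invented for this example; it is just one of the "pretty much infinite" options. An in-memory `io.BytesIO` stands in for the TCP connection; with a real socket you would need to loop until the full header and payload have arrived, since a single read may return fewer bytes than requested.

```python
import io
import struct

# Hypothetical frame layout: 1-byte type, 4-byte big-endian length, payload.
TEXT, FILE_START, FILE_DATA, FILE_END = 1, 2, 3, 4
HEADER = struct.Struct("!BI")


def write_frame(stream, kind: int, payload: bytes = b"") -> None:
    """Write one frame: header (type + length) followed by the payload."""
    stream.write(HEADER.pack(kind, len(payload)) + payload)


def read_frames(stream):
    """Yield (kind, payload) pairs until the stream is exhausted."""
    while header := stream.read(HEADER.size):
        kind, length = HEADER.unpack(header)
        yield kind, stream.read(length)


def send_file(stream, name: str, content: bytes, chunk_size: int = 4) -> None:
    """Announce a file, send its data in chunks, then mark the transfer complete."""
    write_frame(stream, FILE_START, name.encode("utf-8"))
    for offset in range(0, len(content), chunk_size):
        write_frame(stream, FILE_DATA, content[offset:offset + chunk_size])
    write_frame(stream, FILE_END)


wire = io.BytesIO()  # stands in for a TCP connection
write_frame(wire, TEXT, b"hello")
send_file(wire, "cat.png", b"0123456789")
wire.seek(0)
frames = list(read_frames(wire))
file_bytes = b"".join(payload for kind, payload in frames if kind == FILE_DATA)
```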
I don't know for sure about the node.js connection limit, but I can say with a fair amount of confidence that it is limited by the operating system. This might help you get to grips with the answer to that question.
It is debatable whether it is more work to do it with TCP or HTTP - it's a lot of work to do it in both. I would probably lean more toward the TCP option being your best bet. While TCP would require you to design a protocol rather than/as well as an application, HTTP is not particularly suited to live, 2-way applications like chat servers. There are many implementations of chat over HTTP that use AJAX, but I can tell you from painful experience that they are a complete pain in the rear-end.
I would say that you should only be looking at HTTP if you are intending the endpoint (i.e. the client) to be a browser. If you are going to write a desktop app for the endpoint, a direct TCP link would definitely be the way to go. The main reason for this is that HTTP works in a request-response manner, where the client sends a request to the server, and the server responds. Over TCP you can open a single TCP stream, that can be used for bi-directional communication. This means that the server can push an event to the client instantly, while over HTTP you have to wait for the client to send a request, so you can respond with an event. If you were intending to use a browser as the client, it will make the whole file transfer thing much more tricky (the sending at least).
There are ways to implement this over HTTP using long-polling and server push (read this) but it can be a real pain to implement.
If you are going to implement this on a LAN (or possibly even over the internet) it is worth considering UDP over TCP - in a chat application it is not usually absolutely mission critical that messages arrive in the right order, and even if it was, users would probably not be able to type faster than the variations in network latency (probably <100ms). Then for file transfers you could either negotiate a separate TCP socket for the data exchange (like FTP), or implement some kind of UDP ACK system (like TFTP).
I feel there is a lot more to say on this subject but right now I can't put it into words - I may extend this answer at some point.
Chat servers are the Hello World program in node. Use http.
As far as the question of how many concurrent connections can it handle, that all depends on your system. Set up a simple chat server and then try benchmarking it.
Also, check out http://search.npmjs.org/ and search for chat for a few pointers.

Remoteobject (AMFPHP)or HttpService? Which is the best to choose from?

Please explain which of the two is more secure, more powerful, and fast enough at sending data to and receiving requested data from the server in Flex!
I prefer working with remoting object with AMFPHP rather than HTTPService
Check out James Ward Census Application for information on performance and data transfer size.
For performance, use RemoteObject.
However, since you ask for "fast enough", it really depends on your application and the amount of data.
Either channel is as secure as the other. HTTPS would make it more secure. I don't think anything can prevent packet sniffers from getting at the data in transit.
AMF (Remote Objects) – Why it's Better
It's a binary protocol.
But it's still encapsulated in HTTP, so there is no concern with firewalls or client issues and we can use our normal web debugging methods.
HTTP headers with a binary body:
HTTP/1.1 200 OK
Date: Tue, 28 Jun 2011 12:55:26 GMT
Content-Type: application/x-amf
Server: stackoverflow.com
(binary AMF body here)
Because it is binary, it can use pointers.
-Circular references
-Objects only transmitted one time.
Common strings, for example, are only sent once; all other references to that string contain only a pointer instead of being re-transmitted.
Same behavior for all objects.
Its transmitted binary format (spec) is the same format the Flash Player uses to store its objects in memory.
-No de-marshaling (expensive)
-No de-serializing
-Bits from HTTP stream flow ~> into Flash Player memory
James Ward Census Data
-A Flex application that is built to use several different transport mechanisms while transferring the same data. Shows comparative timings of each stage of data transfer.
-James Ward Census
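The "strings are only sent once" point can be illustrated with a toy Python sketch. This is not the AMF wire format; the `("str", ...)` and `("ref", ...)` tuples are invented here purely to show the idea of a reference table that both sides build up in the same order.

```python
def encode_strings(values):
    """Replace repeated strings with an index into a table of strings already sent."""
    table = {}
    encoded = []
    for value in values:
        if value in table:
            encoded.append(("ref", table[value]))  # send a small pointer only
        else:
            table[value] = len(table)
            encoded.append(("str", value))         # first occurrence: send in full
    return encoded


def decode_strings(encoded):
    """Rebuild the original list by keeping the same table on the receiving side."""
    table = []
    decoded = []
    for kind, item in encoded:
        if kind == "str":
            table.append(item)
            decoded.append(item)
        else:
            decoded.append(table[item])
    return decoded


rows = ["California", "Texas", "California", "California", "Texas"]
wire = encode_strings(rows)
restored = decode_strings(wire)
```

With data sets that repeat the same field names and values many times, this kind of referencing is one reason a binary format can end up much smaller than the equivalent XML or JSON.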
AMF is supposedly always going to be faster, but HTTPService using XML or JSON is probably used more often. If it's only a small project, or if it's going to be using web-based APIs that may be used by other technologies, then HTTPService may be quicker to implement.
If you want to quickly try out AMF PHP using ZendAMF, I put up a tutorial and demo here:
http://bbishop.org/blog/?p=441
Includes all the php and config file details, as well as server setup.
Security has little to do with it. AMF will save you bandwidth costs by using a binary protocol instead of a text one. It's an additional layer of obfuscation, but there are some packet readers that will read AMF anyway. If you plan to have alternatives to the desktop client, say mobile, going AMF may lock you out because those other clients may not be Flash Player based. The advantage of going non-AMF is that you open up the possibility of other clients, but the trade-off is that if the app is bandwidth intensive, HTTP requests with text bodies will weigh more than the AMF binaries.

Resources