We are developing a browser extension which would send all the URLs visited by a logged-in user to backend APIs to be persisted.
Since the number of requests sent to the backend API would be huge, we are torn between creating a persistent connection via WebSocket or making plain HTTP REST API calls.
The data posted to the backend API doesn't need to be real-time, as we would only use it in our models, which don't demand real-time data.
We are inclined towards HTTP REST API calls for the following reasons:
Easy to implement
Easy to scale (using auto-scaling techniques)
Everyone in the team is already comfortable with the rest APIs
But at the same time the cons would be:
At the scale where a lot of POST requests would be going to the server, we are not sure it would be optimal.
It feels like WebSockets could give us a more optimised infrastructure :(
I would love to hear from the community about any pitfalls of going with the REST API option.
So first of all, TCP is the transport layer. It is not possible to use raw TCP as-is; you have to create some protocol on top of it to give meaning to the stream of data.
REST or HTTP or even WebSockets will never be as efficient as a custom-designed protocol on top of raw TCP (or even UDP). However, the gain may not be as spectacular as one might think. I've actually done such a transition once, and we saw only a few percent of performance gain. And it was neither easy to do correctly nor easy to maintain. Of course YMMV.
Why is that? Well, the reason is that HTTP is already quite highly optimized. First of all, you have the "keep-alive" header that keeps the connection open if it is reused, so the default HTTP mechanism already persists connections. Secondly, HTTP handles body compression by default, and HTTP/2 also handles header compression. With HTTP/3 you get even more efficient TLS usage and better behaviour on unstable networks (e.g. mobile).
Another thing: since you do not require real-time data, you can buffer. Instead of sending data each time it is available, gather it for, say, a few seconds, minutes, or maybe even hours, and send it all in one go. With such an approach the difference between HTTP and a custom protocol becomes even less noticeable.
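For illustration, here is a minimal sketch of that buffering idea in Python (the endpoint URL and flush interval are made up; a real extension would do the equivalent in JavaScript):

```python
import json
import threading
import urllib.request

API_URL = "https://api.example.com/visits"  # hypothetical endpoint
FLUSH_INTERVAL = 60  # seconds; tune to taste

_buffer = []
_lock = threading.Lock()

def record_visit(url: str) -> None:
    """Buffer a visited URL instead of sending it immediately."""
    with _lock:
        _buffer.append(url)

def flush() -> None:
    """Send everything gathered so far in one POST, then reschedule."""
    with _lock:
        batch = _buffer[:]
        _buffer.clear()
    if batch:
        req = urllib.request.Request(
            API_URL,
            data=json.dumps({"urls": batch}).encode(),
            headers={"Content-Type": "application/json"},
        )
        urllib.request.urlopen(req)  # error handling/retry omitted
    threading.Timer(FLUSH_INTERVAL, flush).start()

flush()  # start the periodic flush loop
```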
All in all: I advise you to start with the simplest solution there is, which in your case seems to be REST. Design your code so that a transition to another protocol is as simple as possible. Optimize later if needed. Always measure.
Btw, there are lots of valid privacy and security concerns around your extension. For example, I'm quite surprised that you didn't mention TLS at all. It matters not only for security but also for performance: establishing TLS connections is not free (although once established, encryption does not affect performance much).
Putting my discomfort aside (privacy, anyone?)...
Assuming your extension collates the information, you might consider "pushing" to the server every time the browser starts/quits, and then once again every hour or so (users hardly ever quit their browsers these days)... this would make REST much more logical.
If you aren't collating the information on the client side, you might prefer a WebSocket implementation that pushes data in real time.
However, whatever you decide, you would also want to decouple the API from the transmission layer.
This means that (ignoring authentication paradigms) the WebSocket and REST implementations would look largely the same and be routed to the same function that contains the actual business logic... a function you could also call from a script or from the terminal. The network-layer details should be irrelevant as far as the API implementation is concerned.
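A rough sketch of that decoupling, assuming Flask for the REST side and the third-party websockets library for the WebSocket side (both are just example choices):

```python
import json

def store_visits(user_id: str, urls: list[str]) -> int:
    """All the business logic lives here, transport-agnostic.
    Callable from REST, WebSockets, a script, or the terminal."""
    # ... persist the URLs somewhere ...
    return len(urls)

# --- REST adapter (assumes Flask) ---
from flask import Flask, request, jsonify

app = Flask(__name__)

@app.post("/visits")
def rest_handler():
    body = request.get_json()
    return jsonify({"stored": store_visits(body["user_id"], body["urls"])})

# --- WebSocket adapter (assumes the 'websockets' library) ---
# serve with e.g.: websockets.serve(ws_handler, "", 8765)
async def ws_handler(ws):
    async for message in ws:  # same logic, different transport
        body = json.loads(message)
        await ws.send(json.dumps(
            {"stored": store_visits(body["user_id"], body["urls"])}))
```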
As a last note: I would never knowingly install an extension that collects so much data on me. Especially since URLs often contain private information (used for REST API routing). Please reconsider if you want to take part in creating such a product... they cannot violate our privacy if we don't build the tools that make it possible.
I have a scenario where I need to deliver realtime firehoses of events (<30-50/sec max) for dashboard and config-screen type contexts.
While I'll be using WebSockets for some of the scenarios that require bidirectional I/O, I'm having a bit of a bikeshed/architecture-astronaut/analysis paralysis lockup about whether to use Server-Sent Events or Fetch with readable streams for the read-only endpoints I want to develop.
I have no particular vested interest in picking one approach over the other, and the backends aren't using any frameworks or libraries that are opinionated about either approach, so I figure I might as well put my hesitancy to use and ask:
Are there any intrinsic benefits to picking SSE over streaming Fetch?
The only fairly minor caveat I'm aware of with Fetch is that if I'm manually building the HTTP response (say from a C daemon exposing some status info) then I have to manage response chunking myself. That's quite straightforward.
So far I've discovered one unintuitive gotcha of exactly the sort I was trying to dig out of the woodwork:
When using HTTP/1.1 (aka plain HTTP), Chrome specially allows up to 255 WebSocket connections per domain, completely independently of its maximum of 15 normal (Fetch)/SSE connections.
Read all about it: https://chromium.googlesource.com/chromium/src/+/e5a38eddbdf45d7563a00d019debd11b803af1bb/net/socket/client_socket_pool_manager.cc#52
This is of course irrelevant when using HTTP/2, where you typically get 100 parallel streams to work with (that (IIUC) are shared by all types of connections - Fetch, SSE, WebSockets, etc).
I find it remarkable that almost every SO question about connection limits doesn't talk about the 255-connection WebSockets limit! It's been around for 5-6 years!! Use the source, people! :D
I do have to say that this situation is very annoying though. It's reasonably straightforward to "bit-bang" (for want of a better term) HTTP and SSE using printf from C (accept connection, ignore all input, immediately dprintf HTTP header, proceed to send updates), while WebSockets requires handshake processing, SHA-1 hashing, and input XORing (frame masking). For tinkering, prototyping and throwaway code (that'll probably stick around for a while...), the simplicity of SSE is hard to beat. You don't even need to use chunked encoding with it.
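To illustrate just how little SSE needs, here is a minimal sketch using only the Python standard library (port and payload are arbitrary):

```python
import time
from http.server import BaseHTTPRequestHandler, HTTPServer

class SSEHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # SSE needs just these headers; no chunked encoding required.
        self.send_response(200)
        self.send_header("Content-Type", "text/event-stream")
        self.send_header("Cache-Control", "no-cache")
        self.end_headers()
        try:
            while True:
                # Each event is "data: ...\n\n"; that's the whole protocol.
                self.wfile.write(f"data: {time.time()}\n\n".encode())
                self.wfile.flush()
                time.sleep(1)
        except BrokenPipeError:
            pass  # client disconnected

HTTPServer(("", 8000), SSEHandler).serve_forever()
```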
After reviewing the differences between raw TCP and WebSocket, I am thinking of using WebSocket, even though it will be a client/server system with no web browser in the picture. My motivation stems from:
WebSocket is message-oriented, so I do not have to design a protocol on top of the TCP layer to delimit messages myself.
The initial WebSocket handshake fits my use case well, as I can authenticate/authorize the user in this initial request-response exchange.
Performance does matter a lot here, though. I am wondering whether, excluding the WebSocket handshake, there would be a performance loss with WebSocket messages vs. a custom protocol on raw TCP. If not, then WebSocket is the most convenient choice for me, even if I don't use the benefits related to the "web" part.
Also, would using wss change the answer to the above question?
You are basically asking whether using an already-implemented library which perfectly fits your requirements, and which even has the option of secure connections (wss), is better than designing and implementing your own message-based protocol on TCP, assuming that performance and overhead are not relevant for your use case.
If you rephrase your question this way, the answer should be obvious: using an existing implementation which fits your purpose saves you a lot of time and hassle in design, implementation, and testing. It is also easier to train developers to use this protocol, and easier to debug problems, since common tools like Wireshark understand the protocol already.
Apart from this, WebSockets have an established mechanism for working with proxies and use a common protocol, so they pass through firewalls more easily. So you will likely run into fewer problems when rolling out your application.
In other words: I can see no reason why you should not use WebSockets if they fit your purpose.
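As a sketch of how little code the message-oriented and handshake-auth parts take, assuming the third-party Python websockets library (the URL and token are placeholders, and the header parameter name varies between library versions):

```python
import asyncio
import websockets  # assumed: the third-party 'websockets' library

async def main():
    # Authenticate during the opening handshake via a header.
    # (The parameter is 'extra_headers' in older releases and
    #  'additional_headers' in websockets >= 13.)
    async with websockets.connect(
        "wss://example.com/feed",
        additional_headers={"Authorization": "Bearer <token>"},
    ) as ws:
        await ws.send("hello")   # one message, one frame:
        reply = await ws.recv()  # no manual delimiting needed
        print(reply)

asyncio.run(main())
```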
I've just started dabbling in some game development and wanted to create a simple multiplayer game. Is it feasible to use HTTP as the primary communication protocol for a multiplayer game?
My game will not be making several requests a second, but rather a request every few seconds. The client will be a mobile device.
The reason I'm asking is, I thought it may be interesting to try using Tornado, which reportedly scales well, supports non-blocking requests, and can handle "thousands of concurrent users".
So my client could make an HTTP request, and when the game server has something to tell it, it will respond to the request. I believe this illustrates what some people call the COMET design pattern.
I understand that working at the socket level has less overhead, but I am just wondering if this would be feasible at all given my game requirements? Or am I just thinking crazy?
Thanks in advance.
Q: Is it feasible to use HTTP as the primary communication protocol for a multiplayer game?
A: Using HTTP as a communication protocol may or may not make sense for your game; that's for you to decide. I have developed applications for Windows Mobile, Blackberry, Android and iPhone for just over 10 years, all the way back to CE 1.0. With that in mind, here is my opinion.
First I suggest reading RFC 3205 as Teddy suggested. It explains the reasons for my suggestions in detail.
In general HTTP is good because...
If you're developing a browser-based game (Flash or JavaScript) where you don't create the client, then use HTTP, because it's built in anyway and it's likely all you can use.
You can get HTTP server hosting with decent scripting support super cheap anywhere.
There's a ton of tools available and lots of documentation
It's easy to get started
HTTP may be bad because...
HTTP introduces huge overhead in terms of bandwidth compared to a simple TCP service.
For example, Omegle.com sends 420 bytes of header data to deliver a 9-byte payload.
If you really need comet / long polling you'll waste a lot of time figuring out how to make your server talk right instead of working on what it says.
A steady stream of HTTP traffic may overload mobile devices in both processing and bandwidth, giving you fewer resources to devote to your game's performance.
You may not think you know how to create your own TCP server - but it really is easy.
If you're writing the server AND the client, then I would go straight to TCP. If you already know Python, then use the Twisted networking library. You can get a simple server up in an hour or so just following the tutorials.
Check out the LineReceiver example for a super simple server you can test with any telnet client.
http://twistedmatrix.com/projects/core/documentation/howto/servers.html
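For reference, the LineReceiver approach from that tutorial boils down to something like this sketch (the port number is arbitrary):

```python
from twisted.internet import protocol, reactor
from twisted.protocols.basic import LineReceiver

class Echo(LineReceiver):
    # LineReceiver handles the framing: one callback per received line.
    def lineReceived(self, line):
        self.sendLine(b"you said: " + line)

class EchoFactory(protocol.Factory):
    def buildProtocol(self, addr):
        return Echo()

reactor.listenTCP(8123, EchoFactory())  # test with: telnet localhost 8123
reactor.run()
```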
WRT:
"my client could make a HTTP Request, and when the game server has somethign to tell it, it will respond to the request."
This is not how HTTP is supposed to work. So no, HTTP would not be a good choice here. HTTP requests time out if the response is not received within the timeout (60 seconds is a common default, but it depends on the specifics).
Please read RFC 3205: On the use of HTTP as a Substrate, which deals with this.
With the target platform being a mobile device (and the limited bandwidth that entails) HTTP wouldn't be the first tool I would pull out of the box.
If you just fancy playing with all this technology, then you could give it a go. Tornado seems like a reasonable choice, if the example on the web site is anything to go by. But any simple server-side web language would suffice for serving up the responses you need at the rate you have mentioned. The performance characteristics are likely to be irrelevant here.
The COMET method is when you leave an HTTP connection open over a long period. It is primarily there for 'server push' of data. But usually you don't need this; it's usually much more straightforward to make repeated requests and handle the responses individually.
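A minimal sketch of that repeated-requests approach, with a made-up endpoint and a stub handler standing in for the game logic:

```python
import time
import urllib.request

POLL_URL = "http://gameserver.example.com/state"  # hypothetical endpoint

def handle_state(state: bytes) -> None:
    print("server says:", state)  # stand-in for real game logic

# Poll every few seconds instead of holding a COMET connection open.
while True:
    with urllib.request.urlopen(POLL_URL, timeout=10) as resp:
        handle_state(resp.read())
    time.sleep(3)  # a request every few seconds, per the question
```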
I have a web application sitting on IIS, talking with a remote service machine.
I am not sure whether to choose TCP or HTTP as the main protocol.
more details:
I will have more than one service/endpoint
some of them will be one-way
the others will be two-way
the web pages will work in front of the services
we are talking about a high-scale web site
I know the difference pretty well, but I am looking for a good benchmark that shows how much faster TCP is.
HTTP is a layer built on top of the TCP layer to somewhat standardize data transmission, so naturally using TCP sockets will be less heavy than using HTTP. If performance is the only thing you care about, then plain TCP is the best solution for you.
You may want to consider HTTP because of its ease of use and simplicity, which ultimately reduces development time. If you are doing something that might be directly consumed by a browser (through an AJAX call), then you should use HTTP. For a non-modern browser to directly consume TCP connections without HTTP, you would have to use Flash or Silverlight, and this normally happens for rich content such as video and/or audio. However, many modern browsers now (as of 2013) support APIs to access network, audio, and video resources directly via JavaScript. The only thing to consider is the usage rate of modern web browsers among your users; see caniuse.com for the latest info regarding browser compatibility.
As for benchmarks, this is the only thing I found. See page 5; it has the performance graph. Note that it doesn't really compare apples to apples, since it compares the TCP/binary-data option with the HTTP/XML-data option. Which begs the question: what kind of data are your services outputting? Binary (video, audio, files) or text (JSON, XML, HTML)?
In general, performance-oriented systems like those in the military or financial sectors will probably use plain TCP connections, whereas general web-focused companies will opt to use HTTP and host their services on IIS or Apache.
The question you really need an answer for is "will TCP or HTTP be faster for my application". The answer is that it depends on the nature of your application, and on the way that you use TCP and/or HTTP in your application. A generic HTTP vs TCP benchmark won't answer your question, because the chances are that the benchmark won't match your application behaviour.
In theory, an optimally designed / implemented solution using TCP will be faster than one that uses HTTP. But it may also be considerably more work to implement ... depending on the details of your application.
There are other issues that might affect your choice. For example, you are less likely to run into firewall issues if you use HTTP than if you use TCP on some random port. Another is that HTTP would make it easier to implement a load balancer between the IIS server and the backend systems.
Finally, at the end of the day it is probably more important that your system is secure, reliable, maintainable and (maybe) scalable than it is fast. A sensible strategy is to implement the simple version first, but have plans in your head for how to make it faster ... if the simple solution is too slow.
You could always benchmark it.
In general, if what you want to accomplish can be easily done over HTTP (i.e. the only reason you would otherwise think about using raw TCP is for a possible performance boost) you should probably just use HTTP. Sure, you can do socket programming, but why bother? Lots of people have spent a lot of time and effort building HTTP client libraries and servers, and they have spent waaaaaay more time optimizing and testing that code than you will ever be able to possibly spend on your TCP sockets. There are simply so many possible errors that you would have to handle, edge cases, and optimizations that can be done, that it is usually easier and safer to use a library function for HTTP.
Plus, the HTTP specs define all kinds of features (and clients/servers implement, which you get to use "for free", i.e. no extra implementation work) which makes any third-party interoperability that much easier. "Here is my URL, here are the rules for what you send, here are the rules for what I return..."
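To make that concrete, here is roughly all the code an HTTP POST takes when you let a library do the work (the endpoint is hypothetical); compare that with hand-rolling sockets, TLS, and HTTP framing yourself:

```python
import json
import urllib.request

# The library handles sockets, TLS, and HTTP framing for you.
req = urllib.request.Request(
    "https://api.example.com/things",  # hypothetical endpoint
    data=json.dumps({"name": "widget"}).encode(),
    headers={"Content-Type": "application/json"},
    method="POST",
)
with urllib.request.urlopen(req) as resp:
    print(resp.status, resp.read())
```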
I have a self-hosted Windows native C++ server application that uses the Casablanca C++ REST SDK. I can use any client (C#, JavaScript, C++, cURL; basically anything that can send POST, GET, PUT or DEL messages) to send requests to this self-hosted Windows app, and I can even use a plain browser address bar for GET-related requests with various parameters. Currently I only run this system on a private intranet, so it is very fast. I haven't benchmarked it against raw TCP, but on a private intranet I doubt there would be more than a few microseconds' difference. For convenience, ease of development, and the ability to expand to a full-blown internet app, it's a dream come true. It is a dedicated system with a private protocol using small JSON packets, so I'm not certain whether that fits your application's needs. Another nice thing is that this native C++ Windows application could be ported fairly easily to Linux/macOS, as the Casablanca REST SDK is portable to those OSes.
I have an existing standalone application which is going to be extended by a 3rd party, using a network protocol. The capabilities are already implemented; all I need is to expose them to the outside.
Assuming the transport protocol is already chosen (UDP), are there any resources that will help me to design my application protocol?
There seems to be a lot of information about software design, but not on protocol design.
I've already looked at Application Protocol Design.
See the Jabber protocol design guidelines and RFC 4101. Although the latter is aimed at making RFCs easier for reviewers to understand, it provides some interesting advice.
Have you looked at Google Protocol Buffers? It seems like a good way to resolve this issue.
You can create an endpoint that communicates with your existing app and then responds to the outside using the protobuf wire format. It's binary, so it's tiny and fast, and you don't have to write your own protocol manager, 'cause you can use the Google ones. The downside is that it has to be implemented on both sides of the system (on your 'server' side and on the consumer/client side).
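As a sketch of what the client side looks like in Python, assuming a hypothetical Event message compiled with protoc (the schema below is made up):

```python
# Hypothetical schema, compiled with: protoc --python_out=. event.proto
#
#   // event.proto
#   syntax = "proto3";
#   message Event {
#     string name      = 1;
#     int64  timestamp = 2;
#   }

import event_pb2  # hypothetical generated module

msg = event_pb2.Event(name="login", timestamp=1700000000)
wire = msg.SerializeToString()  # tiny binary blob, ready to send

decoded = event_pb2.Event()
decoded.ParseFromString(wire)   # same call on the receiving side
```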
Another recommendation for protocol buffers - nice tight binary with little effort. Note, however, that while the binary protocol is well defined, there isn't yet an agreed RPC standard (several are in progress, tending to lean towards TCP or HTTP).
The spec makes it very easy to have the client and server in different architectures, which is good - plus it is extensible.
Caveat: I'm the author of one of the .NET versions, so I may well be biased ;-p
First off, UDP is primarily a one-way broadcast transport method. It is also potentially lossy, so you need to be able to handle missing packets and out-of-order packets. If you need any level of reliability from UDP, or require two-way connections, you will end up needing just about everything from TCP, so you might as well go with that to start with and let the network stack take care of it.
Next up, if your data is potentially larger than a single IP packet, then you will need some way of identifying the start and end of each packet, and a means of handling illegal or corrupt packets. I would recommend some kind of header with the packet length, some kind of footer, and maybe a checksum, as in the sketch below.
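A sketch of that framing in Python, with a made-up magic number, a length-prefix header, and a CRC32 footer:

```python
import struct
import zlib

MAGIC = 0xABCD  # hypothetical magic number marking the start of a frame

def encode_frame(payload: bytes) -> bytes:
    # Header: magic (2 bytes) + payload length (4 bytes), big-endian.
    header = struct.pack("!HI", MAGIC, len(payload))
    # Footer: CRC32 checksum of the payload (4 bytes).
    footer = struct.pack("!I", zlib.crc32(payload))
    return header + payload + footer

def decode_frame(data: bytes) -> bytes:
    magic, length = struct.unpack("!HI", data[:6])
    if magic != MAGIC:
        raise ValueError("bad magic; not a frame boundary")
    payload = data[6:6 + length]
    (crc,) = struct.unpack("!I", data[6 + length:10 + length])
    if zlib.crc32(payload) != crc:
        raise ValueError("checksum mismatch; corrupt frame")
    return payload
```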
Then you need some way of encoding the messages and responses. There are many RPC protocols around. You could look at SOAP, or design a custom XML-based protocol, or a binary one.
You should really think hard about whether you want to design, document, and maintain your own protocol, or use something that already exists. There is probably already a documented protocol that matches your needs. Depending on what you are doing, it may look like overkill at first, and implementing the full spec will look tedious and a lot less fun than writing your own; but if you intend for your application to still be actively developed in a few years, it should save you a lot of time and money to use something that already exists and is known by third parties. Besides, if you can use an existing library for that protocol, the implementation will be a lot faster.
Designing a new protocol is more fun than implementing one, but less fun than maintaining one, since you have to live with all the defects. No protocol is perfect, but if you have never designed one, you can be assured you will make more mistakes designing it than the people who designed the existing, well-known protocols you could use instead.
In short, leverage what already exists whenever possible.
If you're choosing XML, keep in mind that you will have a giant overhead of markup.
A simple binary protocol will also need far fewer resources to parse compared to XML.
If you do not want to build your protocol from ground up, you should take a look at SOAP. Support varies for different programming languages, but cross language communication is explicitly encouraged.
Unfortunately, SOAP over UDP seems to have been stuck in its infancy; HTTP is the most commonly used transport.
I have an existing standalone application which is going to be extended by a 3rd-party, using a network protocol.
It would help to know a little more about what your program does and what the nature of these 3rd-party extensions is. Maybe some rationale for using UDP?