Best solution to notify back-end server about changes on client side - http

There are about a thousand clients (a mobile application).
There is a server (we are going to use Spring Boot) which saves information about changes on the client side.
Changes from each client should arrive at the server every 5 minutes.
Please tell me the best way to implement it:
use simple HTTP requests to the server
use WebSocket
use long polling
I can’t understand what is more critical: keeping a large number of connections open or responding to a large number of requests.
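For reference, a thousand clients each reporting once every 5 minutes averages only about 3-4 requests per second. Below is a minimal sketch of the plain-HTTP option with Spring Boot; the ChangeReport payload, endpoint path, and handling are made up for illustration, and spring-boot-starter-web plus Java 16+ records are assumed.

import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
@RequestMapping("/api/changes")
public class ChangeReportController {

    // Simple payload each client POSTs every 5 minutes (hypothetical shape).
    public record ChangeReport(String clientId, long timestamp, String payload) {}

    @PostMapping
    public void report(@RequestBody ChangeReport report) {
        // Persist or process the reported change; with ~1000 clients on a
        // 5-minute schedule this averages roughly 3-4 requests per second.
        System.out.println("Change received from " + report.clientId());
    }
}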

Related

How to throttle SignalR clients on the server side

I am using a PersistentConnection for publishing large amounts of data (many small packages) to the connected clients.
It is basically a one-way flow of data (since each client will call endpoints on other servers to set up various subscriptions, they will not push any data back to the server via the SignalR connection).
Is there any way to detect that the client cannot keep up with the messages sent to it?
The example could be a mobile client on a poor connection (e.g. in a roaming situation, where the speed may vary a lot). If we are sending 100 messages per second but the client can only handle 10, we will eventually lose messages (because the message buffer on the server side fills up).
I was looking for a server side event, similar to what has been done on the (SignalR) client, e.g.
protected override Task OnConnectionSlow(IRequest request, string connectionId) {}
but that is not part of the framework (for good reasons, I assume).
I have considered using the approach (suggested elsewhere on Stack Overflow) of letting the client tell the server (e.g. every 10-30 seconds) how many messages it has received; if that number differs a lot from the number of messages sent to the client, it is likely that the client cannot keep up.
The event would be used to tell the distributed backend that the client cannot keep up, and then turn down the data generation rate.
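A rough sketch of that counting idea in plain Java (not SignalR; the class, method names, and threshold are all hypothetical):

import java.util.concurrent.atomic.AtomicLong;

public class ClientThroughputTracker {

    // Messages pushed to this particular client so far.
    private final AtomicLong sent = new AtomicLong();

    // Arbitrary backlog threshold; tune per application.
    private static final long MAX_BACKLOG = 1_000;

    public void onMessageSent() {
        sent.incrementAndGet();
    }

    // Called every 10-30 seconds with the count the client says it has received.
    // Returns true if the gap suggests the client cannot keep up.
    public boolean isFallingBehind(long receivedCountReportedByClient) {
        long backlog = sent.get() - receivedCountReportedByClient;
        return backlog > MAX_BACKLOG;
    }
}

When isFallingBehind returns true, the custom server-side event described above could fire and tell the distributed backend to turn down the data generation rate.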
There's no way to do this right now other than coding something custom. We have discussed this in the past as a potential feature, but it isn't anywhere on the roadmap right now. It's also not clear what "slow" means, as it's up to the application to decide. There'd probably be some kind of bandwidth/time/message-based setting that would make this hypothetical event trigger.
If you want to hook in at a really low level, you could use OWIN middleware to replace the client's underlying stream with one that you own, so that you'd see all of the data going over the wire (you'd have to do the same for websockets though, and that might be non-trivial).
Once you have that, you could write some time-based logic that determines whether the flush is taking too long and kill the client that way.
That's very fuzzy but it's basically a brain dump of how a feature like this could work.
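As a very rough illustration of that flush-timing idea, transposed to Java streams rather than OWIN (the threshold and the decision to throw are assumptions; in practice a slow client often shows up as a blocking write once the socket send buffer fills, but the timing idea is the same):

import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;

public class TimedFlushOutputStream extends FilterOutputStream {

    private final long maxFlushMillis;

    public TimedFlushOutputStream(OutputStream out, long maxFlushMillis) {
        super(out);
        this.maxFlushMillis = maxFlushMillis;
    }

    @Override
    public void flush() throws IOException {
        long start = System.nanoTime();
        super.flush();
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
        if (elapsedMillis > maxFlushMillis) {
            // The client is draining data too slowly; treat it as "slow" and drop it.
            throw new IOException("Slow client: flush took " + elapsedMillis + " ms");
        }
    }
}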

Facebook graph api and keep alive connections

We want to reduce the HTTPS connect latency to the Facebook Graph servers. We have configured our servers to do HTTP keep-alive. However, it looks like the connection gets closed after every call (from the traffic server logs...).
Is there a way to deterministically see whether keep-alive connections are honored by graph.facebook.com? ...or, for that matter, by any server in general?
Based on Facebook's attitude towards efficient queries, I would not expect them to honor any type of persistent connection, especially since your request is going to end up as a performance hit on their servers.
You should look at fetching as much data as you can in one go by combining your queries into a single batch request.
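As an illustration of the batching suggestion, here is a sketch using Java's built-in HttpClient that sends two Graph API queries in one round trip via the batch parameter. The endpoint and parameter names follow Facebook's documented batch API, but check the current Graph API docs (and version prefix) before relying on them; the token is a placeholder.

import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

public class GraphBatchExample {
    public static void main(String[] args) throws Exception {
        String accessToken = "YOUR_ACCESS_TOKEN";   // placeholder
        String batch = "[{\"method\":\"GET\",\"relative_url\":\"me\"},"
                     + "{\"method\":\"GET\",\"relative_url\":\"me/friends?limit=10\"}]";

        String body = "access_token=" + URLEncoder.encode(accessToken, StandardCharsets.UTF_8)
                    + "&batch=" + URLEncoder.encode(batch, StandardCharsets.UTF_8);

        HttpClient client = HttpClient.newHttpClient();   // reuses pooled connections where the server allows it
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("https://graph.facebook.com/"))
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();

        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());   // a JSON array with one entry per sub-request
    }
}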

Is polling the way to go for live chat on web?

I'm trying to implement a custom live chat program on the web, but I'm not sure how to handle the real-time (or near real-time) updates for users. Would it make more sense to send Ajax requests from the client side every second or so, polling the database for new comments?
Is there a way to somehow broadcast from the database each time a comment is added? If this is possible, how would that work? I'm using SQL Server 2008 with ASP.NET (C#).
Thanks!
Use long polling / server-side push / Comet:
http://en.wikipedia.org/wiki/Comet_(programming)
Also see:
http://en.wikipedia.org/wiki/Push_technology
I think when you use long polling you'll also want your web server to provide some support in the form of non-blocking I/O for requests, so that you aren't holding a thread per connection.
You could have each client poll the server, and at the server side keep the connection open without responding.
As soon there is a message detected at server side, this data is returned through the already open connection. On receipt, your client immediately issues a new request.
There's some complexity, as you need to keep track on the server side of which connection is associated with which session, and which should be responded to in order to prevent timeouts.
I never actually did this, but it should be the most resource-efficient way.
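If it helps, here is a minimal long-polling sketch transposed to Java/Spring MVC (the original question is ASP.NET) using DeferredResult, which releases the request thread while the response stays open - the kind of non-blocking support mentioned above. The endpoint names and the single shared waiting list are simplifications, not a production design.

import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.context.request.async.DeferredResult;

import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

@RestController
public class ChatLongPollController {

    // Clients currently waiting for the next message.
    private final Queue<DeferredResult<String>> waiting = new ConcurrentLinkedQueue<>();

    // The client calls this; the connection stays open until a comment arrives or 30 seconds pass.
    @GetMapping("/poll")
    public DeferredResult<String> poll() {
        DeferredResult<String> result = new DeferredResult<>(30_000L, "timeout");
        result.onCompletion(() -> waiting.remove(result));
        waiting.add(result);
        return result;
    }

    // Posting a comment completes every open poll immediately; clients then re-poll.
    @PostMapping("/comment")
    public void comment(@RequestBody String text) {
        DeferredResult<String> r;
        while ((r = waiting.poll()) != null) {
            r.setResult(text);
        }
    }
}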
Nope. Use queuing systems like RabbitMQ or ActiveMQ. Check MongoDB too.
A queuing system will give you publish-subscribe facilities.
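For illustration, a minimal publish-subscribe sketch with the RabbitMQ Java client (a broker on localhost and the com.rabbitmq:amqp-client library are assumed; the exchange name and message are made up):

import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;
import com.rabbitmq.client.DeliverCallback;

import java.nio.charset.StandardCharsets;

public class ChatPubSubExample {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("localhost");

        try (Connection connection = factory.newConnection();
             Channel channel = connection.createChannel()) {

            // Fanout exchange: every bound queue (i.e. every subscriber) receives each message.
            channel.exchangeDeclare("chat", "fanout");

            // Subscriber: a private, auto-named queue bound to the exchange.
            String queue = channel.queueDeclare().getQueue();
            channel.queueBind(queue, "chat", "");
            DeliverCallback onMessage = (consumerTag, delivery) ->
                    System.out.println("Received: " + new String(delivery.getBody(), StandardCharsets.UTF_8));
            channel.basicConsume(queue, true, onMessage, consumerTag -> {});

            // Publisher: broadcast a chat message to all subscribers.
            channel.basicPublish("chat", "", null,
                    "hello everyone".getBytes(StandardCharsets.UTF_8));

            Thread.sleep(1000);   // give the consumer a moment before try-with-resources closes everything
        }
    }
}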

Why HTTP was designed to be a pull protocol?

I was watching many presentations about HTML5 WebSockets, where the server can initiate a connection with the client and push data without a request from the client.
We don't need polling etc.
And I am curious: why was HTTP designed as a "pull" and not a full-duplex protocol in the first place? What were the reasons behind that kind of decision?
Because when HTTP was first designed it was meant to be used to retrieve documents from a server, and the easiest way to do that is for the client to ask the server for a document and get it delivered as the response (or an error in case it does not exist). With a push protocol, the server would need to keep client connections around for potentially a long time, creating more resource-management problems - remember, we are talking about the early 1990s here.
HTTP was designed for simply retrieving hypertext documents from a server. There was no reason to push anything to the client when the pages were just pure, static HTML without scripting capabilities.
Since there was no need at the time to push things back to the client, the protocol was kept simple.
HTTP is mainly a pull protocol—someone loads information on a Web server and users use HTTP to pull the information from the server at their convenience. In particular, the TCP connection is initiated by the machine that wants to receive the file.

Chat Server - persistent TCP or new Connection for each poll

What's the best practice for scalable servers which need to maintain a list of active users?
Should I open a persistent TCP Connection for each client on which the server sends update messages?
This could lead to many open connections and probably no traffic for many seconds. Is this a problem in TCP?
Or would it be better to let the client poll for updates periodically (with a new TCP connection each time)?
How do Chat Servers or large Online Games handle this?
Personally I'd go for a single persistent TCP connection per client to avoid a) the additional work in creating and destroying connections and the additional latency of all the TCP packets involved, and b) creating lots of sockets in TIME_WAIT on either the clients or the server. There's simply no good reason to create and destroy the connections.
Depending on your platform there may be various tricks to deal with the platform-specific problems you might get when you have lots of connections open, and by lots I mean tens of thousands. For example, on Windows, using overlapped I/O and I/O completion ports would be a good design for lots of connections, and if your connections are generally idle most of the time you might find that the 'zero byte read' trick lets you handle more connections on lesser hardware; but it's something you can add once you know you have a problem with the amount of buffer space committed to reads that only complete infrequently.
I wouldn't have the clients polling the server. It's inefficient. Have the server publish data to the clients as and when there is data available. This would allow the server to control the workload somewhat by letting it decide how often to send the data to the clients - it could either send every time new data became available for a client or send after it had batched up some data and waited a short while, etc. If the server is pushing the data then the server (the weak point, the place that might get overwhelmed by client demand) has more control over the work that it will need to do.
If you have each client polling then a) you're generating more network noise as each client sends a message to ask the server if it has anything that it should send it and b) you're generating more work for the server as it needs to respond to the polls. The server knows when there's data for the client, let it be responsible for telling the clients.
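As a rough sketch of the single-persistent-connection, server-push model, here is a minimal java.nio server that keeps many mostly-idle connections on one selector thread and pushes a line to every client on each pass of the loop; framing, partial writes, and per-client error handling are deliberately omitted.

import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class PushServer {
    public static void main(String[] args) throws IOException {
        Selector selector = Selector.open();
        ServerSocketChannel server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress(9000));
        server.configureBlocking(false);
        server.register(selector, SelectionKey.OP_ACCEPT);

        List<SocketChannel> clients = new ArrayList<>();

        while (true) {
            selector.select(1000);   // wake at least once a second so we can push periodic updates
            Iterator<SelectionKey> it = selector.selectedKeys().iterator();
            while (it.hasNext()) {
                SelectionKey key = it.next();
                it.remove();
                if (key.isAcceptable()) {
                    SocketChannel client = server.accept();
                    client.configureBlocking(false);
                    client.register(selector, SelectionKey.OP_READ);
                    clients.add(client);
                } else if (key.isReadable()) {
                    SocketChannel client = (SocketChannel) key.channel();
                    if (client.read(ByteBuffer.allocate(256)) == -1) {   // client disconnected
                        clients.remove(client);
                        key.cancel();
                        client.close();
                    }
                }
            }

            // Push an update to every connected client (here on every pass;
            // a real server would push only when there is something new).
            ByteBuffer update = ByteBuffer.wrap("update\n".getBytes(StandardCharsets.UTF_8));
            for (SocketChannel client : clients) {
                client.write(update.duplicate());
            }
        }
    }
}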
