SockJS and Meteor: what if the load balancer does not support sticky sessions? - meteor

I'm exploring load-balancing options for Meteor. This article looks very cool, and it says that the following should be supported to load balance Meteor:
Mongo oplog tailing. Otherwise, it may take up to ten seconds for one instance of Meteor to get updates from another, because the polling Mongo driver will be used, which polls and diffs the DB every ten seconds.
Websocket. This is clear too: otherwise clients will fall back to HTTP and long polling, which will work, but it's not as cool as Websocket.
Sticky sessions, 'which are required by SockJS'. Here comes the question:
As I understand it, 'sticky sessions support' is something that assigns one client to the same server for the duration of its session. Is it essential? What may happen if I don't configure sticky sessions at all?
Here's what I came up with by myself:
Because Meteor stores all data sent to the client in memory, if a client connects to X servers, then X times more memory will be consumed.
Some minor (or major, if there is no oplog) lag may appear for the same user in, say, different tabs or windows, which may be surprising.
If SockJS reconnects and wants some data to persist across reconnections, it's going to have a bad time. I'm not sure how SockJS works; is this point valid?
What bad things can happen? These three points don't look very bad: the data is valid and available, perhaps at the cost of extra memory consumption.

Basics
Sticky sessions are required to ensure that the browser's in-memory session can be managed correctly by the server.
First let me explain why you need sticky sessions:
Each publish that uses an ordinary publish cursor keeps track of whatever collection data the client may have, so when something changes it knows what to send down to the client. This applies to every Meteor app that needs a DDP connection, whether over websockets or SockJS.
Additionally, there may be other client session state stored in variables, but those would be edge cases (e.g. you store the user's state in a variable).
The problem happens when the server disconnects and reconnects, but the connection somehow gets transferred to another node (without re-establishing a new connection). That node has no idea about the client's data, so the behaviour could turn out a bit weird.
The issue with SockJS & Long Polling
With SockJS there is an additional issue. SockJS uses websocket emulation when it falls back to long polling.
With long polling, a new connection attempt (a new HTTP request) is made every time new data is available.
If sticky sessions are not enabled, each of these connections will be randomly assigned to a different node/dyno.
So with two nodes and random assignment, there is a 50% chance, every time new data is available, that the server has no idea about the client's DDP session.
It would then force the client to renegotiate a connection or ignore the client's DDP commands, and you would end up getting very weird behaviour on the client.
Half of these requests would go to the wrong node.
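To make the 50% figure concrete, here is a rough Python simulation (not Meteor or SockJS code). It assumes two nodes, a session that lives only on node 0, and a balancer that picks a node uniformly at random for each long-polling request; the function name and parameters are made up for illustration.

```python
import random

def count_misses(num_requests, num_nodes, sticky, seed=0):
    """Count long-polling requests that land on a node without the DDP session."""
    rng = random.Random(seed)
    session_node = 0  # the node that holds the client's in-memory session
    misses = 0
    for _ in range(num_requests):
        if sticky:
            # A sticky balancer keeps routing the client to its original node.
            node = session_node
        else:
            # Without stickiness, each new HTTP request may go to any node.
            node = rng.randrange(num_nodes)
        if node != session_node:
            misses += 1
    return misses

random_misses = count_misses(10_000, num_nodes=2, sticky=False)
sticky_misses = count_misses(10_000, num_nodes=2, sticky=True)
print(f"random routing: {random_misses / 10_000:.0%} of requests hit the wrong node")
print(f"sticky routing: {sticky_misses / 10_000:.0%} of requests hit the wrong node")
```

With more than two nodes the share of misrouted requests only grows, since the chance of hitting the right node by luck is 1 in N.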

Related

Isn't the HTTP keep-alive feature against three rules of thumb: asynchronous, reactive programming and scalability?

I know that in HTTP 1.1 keep-alive is the default behavior, unless the client explicitly asks the server to close the connection by including a Connection: close header in its request, or the server decides to include a Connection: close header in its response. I am wondering whether this is an obstacle to scalability when growing servers horizontally.
My scenario: we are developing all new services following microservice patterns in either Java or Python. It is desirable that we design and implement them in such a way that we can scale horizontally. For instance, I can use Docker in order to scale up easily, or use Spring Boot Cloud Config. Whatever the physical host implementation, the basic idea is to favour scalability.
My understanding: I must keep server and client as agnostic as possible. When I set up HTTP keep-alive, I understand there is an advantage in reusing the same HTTP connection (saving some CPU work), but I guess I am forcing the client (e.g. another service) to keep using the same connection, which may reduce the advantage of having several Docker instances of the same service, since I will promote the client to keep consuming the same initial connection.
Assuming my understanding is correct, I assume it is not a good idea, since we develop the service to provide responses that can be reused by different consumers with different approaches: some consumers may consume asynchronously or follow reactive design paradigms, which makes me wonder about keeping the same connection alive. In practical terms: the connection used should be freed as soon as possible in order to really balance the demand over all providers.
***edited after first comment
Let's assume I have multiple different consumer services (CS1, CS2 ... CSn) connecting to a single load balancer instance (LB), which forwards requests to multiple Docker containers running the same provider service (D1, D2 ... Dn). Since keep-alive is the default behaviour in HTTP 1.1, we have keep-alive enabled on all connections (both between CSx and LB and between LB and Dx). As far as I know, the only advantage of keep-alive is saving the CPU work of opening and closing connections. If I send Connection: close after each request, there is no advantage at all to using keep-alive. If I have some logic that decides when to send Connection: close, it means I promote LB to stay connected to a specific Dx over exactly the same connection for a while, right? (I choose the word promote here because I guess force might not be appropriate, since there is a keep-alive timeout and LB might then route to another Dx anyway.) So at some moment I have CS1 -> LB -> D1 kept alive for a while, right?
Coming back to my original question: isn't that against the idea of the asynchronous/parallel/reactive paradigm? For instance, I have a scenario where a single consumer service calls another service a few times before returning a single answer to a page. Today we do it sequentially, but suppose we decide to make the calls in parallel: depending on the first answer there may already be an answer for the page, or we compose an answer for the page without caring about the order. The caller service will wait for all the answers before returning to a controller, and the order doesn't matter. Isn't it strange that I have keep-alive = true?
I am forcing the client (eg. another service) to keep using same connection
You are not forcing. The client can easily avoid persistent connections by sending HTTP/1.0 and/or Connection: close. There are many practical HTTP applications that work just like that.
keep using same connection which may downgrade the advantage of several docker instances of same service since I will promote the client to keep consuming the same initial connection
Assuming your load balancer works well, it will usually distribute connections evenly across your instances. Of course, this may not work when you only have a few connections altogether, but a few connections can hardly pose a scalability or performance problem.
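As a rough illustration of that point, the Python sketch below assigns persistent connections to instances and counts the requests each instance ends up serving. Round-robin assignment is an assumption here (real balancers may use least-connections or other policies), and the instance names and request counts are made up.

```python
from collections import Counter
from itertools import cycle

def distribute(num_connections, requests_per_connection, instances):
    """Assign each persistent connection to one instance, round-robin,
    and count how many requests each instance ends up serving."""
    load = Counter({name: 0 for name in instances})
    chooser = cycle(instances)
    for _ in range(num_connections):
        instance = next(chooser)
        # Every request on a kept-alive connection goes to the same instance.
        load[instance] += requests_per_connection
    return dict(load)

instances = ["D1", "D2", "D3"]
many = distribute(300, 50, instances)  # many consumers: load evens out
few = distribute(2, 50, instances)     # two consumers: one instance stays idle
print(many)
print(few)
```

Keep-alive pins requests to an instance only per connection, so the imbalance shows up only when there are fewer connections than instances, which is also when total load is small.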

What is the behavior of observeChanges when the internet connection is temporarily lost?

Say my client code is observing document additions using the added callback mechanism. By design and in perfect conditions, if documents 1,2,3...N are added, client callback should be fired N times.
Now let's say the network connection is lost for periods of time. Are there any guarantees/invariants about the number of times that the client added callback will be fired? I.e., can callbacks be logically "lost" due to network issues?
Excellent question! I was experimenting with this extensively recently as I was writing the serversync package. What I found is this:
If the timeout of the DDP ping is not reached, i.e., the disconnection is only very brief:
    Business as usual, no events happen. I believe the default DDP ping timeout from the client is 30 seconds.
Else:
As long as you do not use the autopublish package on the server, the client will not fire its observeChanges handlers again upon reconnection. If you use autopublish, then added events will fire for each document of the collection.
You might be interested in knowing that under the hood, DDP does send added messages to the client upon each reconnection, i.e., on each reconnect DDP sends a full copy of the collection. This happens regardless of whether you use autopublish or not. However, the client seems to be smart enough to know which of these documents it already has, so it doesn't fire the actual observeChanges events for documents it already has, only for new, changed, and removed ones. This behavior of DDP may still be relevant to you if you anticipate a lot of disconnections, because it will increase your network traffic considerably -- especially when publishing large collections. I believe this is a ramification of the following note in the docs of onConnection:
Currently when a client reconnects to the server (such as after temporarily losing its Internet connection), it will get a new connection each time. The onConnection callbacks will be called again, and the new connection will have a new connection id.
In the future, when client reconnection is fully implemented, reconnecting from the client will reconnect to the same connection on the server: the onConnection callback won’t be called for that connection again, and the connection will still have the same connection id.
Hope this helps!
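To picture the behaviour described above, here is a toy Python model of a client-side cache that receives a full re-send after a reconnect but only fires its added callback for documents it does not already hold. This is not Meteor's actual implementation; the class and method names are invented, and changed and removed handling is left out.

```python
class ClientCache:
    """Toy model of a client-side document cache (not Meteor's actual code)."""

    def __init__(self, on_added):
        self.docs = {}
        self.on_added = on_added

    def receive_added(self, doc_id, fields):
        # Fire the callback only for documents the cache does not hold yet.
        is_new = doc_id not in self.docs
        self.docs[doc_id] = fields
        if is_new:
            self.on_added(doc_id, fields)

fired = []
cache = ClientCache(on_added=lambda doc_id, fields: fired.append(doc_id))

# Initial subscription: the server sends three documents.
for doc_id in ("a", "b", "c"):
    cache.receive_added(doc_id, {"n": 1})

# After a reconnect the server re-sends everything, plus one new document.
for doc_id in ("a", "b", "c", "d"):
    cache.receive_added(doc_id, {"n": 1})

print(fired)
```

Seven added messages cross the wire in this model, but the callback fires only four times, which matches the observation that the extra cost is network traffic rather than duplicate events.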

ASP.Net MVC Delayed requests arriving long after client browser closed

I think I know what is happening here, but would appreciate a confirmation and/or reading material that can turn that "think" into just "know", actual questions at the end of post in Tl,DR section:
Scenario:
I am in the middle of testing my MVC application for a case where one of the internal components is stalling (timeouts on connections to our database).
On one of my web pages there is a jQuery DataTable which queries for an update via ajax every half a second. My current task is to display the correct error if that data request times out. So to test, I made a stored procedure that asks the DB server to wait 3 seconds before responding, which is longer than the configured timeout setting, so this guarantees a timeout exception for me to trap.
I am testing in Chrome browser, one client. Application is being debugged in VS2013 IIS Express
Problem:
Did not expect the following symptoms to show up when my purposeful slow down is activated:
1) After launching the page with the rigged datatable, the application slowed down in handling all requests from the client browser. There are 3 other components that send ajax update requests in parallel to the one I purposefully broke, and the same slowdown also applied to any action I made in the web application that would generate a request (like navigating to other pages). The browser's debugger showed the requests were being sent on time, but the corresponding breakpoints on the server side were getting hit much later (delays of over 10 seconds, up to several minutes).
2) My server kept processing requests even after I closed the tab with the application. I closed the browser and made sure that the chrome.exe process was terminated, but breakpoints on various Controller actions were still getting hit for 20 minutes afterward, mostly on the actions that were "triggered" by automatically looping ajax requests from several pages I was trying to visit during my tests. Breakpoints were also hit on main pages I was trying to navigate to. On the second test I used RawCap to monitor the loopback interface, to make sure that nothing was still running in the background and actually making requests.
Theory I would like confirmed or denied with an alternate explanation:
So the above scenario was making looped requests at a frequency that the server couldn't handle - the client datatable loop was sending them every .5 seconds, and each one would take at least 3 seconds to generate the timeout. And obviously somewhere in IIS express there has to be a limit of how many concurrent requests it is able to handle...
What was a surprise for me was that I sort of assumed that if that limit (which I also assumed to exist) was reached, then requests would be denied - instead it appears they were queued for an absolutely useless amount of time to be processed later - I mean, under what scenario would it be useful to process a queued web request half an hour later?
So my questions so far are these:
Tl,DR questions:
Does IIS Express (that comes with Visual Studio 2013) have a concurrent connection limit?
If yes :
{
Is this limit configurable somewhere, and if yes, where?
How does IIS express handle situations where that limit is reached - is that handling also configurable somewhere? ( i mean like queueing vs. immediate error like server is busy)
}
If no:
{
How does the server handle scenarios when requests are coming faster than they can be processed and can that handling be configured anywhere?
}
Here - http://www.iis.net/learn/install/installing-iis-7/iis-features-and-vista-editions
I found that IIS7 at least allowed an unlimited number of simultaneous connections, but how does that actually work if the server is just not fast enough to process all requests? Can a limit be configured anywhere, as well as the handling of that limit being reached?
Would appreciate any links to online reading material on the above.
First, here's a brief web server 101. Production-class web servers are multithreaded, and roughly one thread = one request. You'll typically see some sort of setting for your web server called its "max requests", and this, again, roughly corresponds to how many threads it can spawn. Each thread has overhead in terms of CPU and RAM, so there's a very real upward limit to how many a web server can spawn given the resources the machine it's running on has.
When a web server reaches this limit, it does not start denying requests, but rather queues requests to be handled once threads free up. For example, suppose a web server has a max requests of 1000 (typical) and it suddenly gets bombarded with 1500 requests. The first 1000 will be handled immediately and the remaining 500 will be queued until some of the initial requests have been responded to, freeing up threads and allowing some of the queued requests to be processed.
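The queueing effect can be sketched with a small deterministic simulation in Python. It uses the numbers from the question (a request every 0.5 seconds, each taking 3 seconds) and made-up worker counts of 2 and 6; it models a generic thread-limited server, not IIS Express specifically.

```python
import heapq

def queue_waits(num_requests, arrival_interval, service_time, workers):
    """Return how long each request waits in the queue before a worker picks it up."""
    free_at = [0.0] * workers  # time at which each worker becomes idle
    heapq.heapify(free_at)
    waits = []
    for i in range(num_requests):
        arrival = i * arrival_interval
        earliest_free = heapq.heappop(free_at)
        # A request starts when it has arrived AND a worker is available.
        start = max(arrival, earliest_free)
        waits.append(start - arrival)
        heapq.heappush(free_at, start + service_time)
    return waits

overloaded = queue_waits(100, arrival_interval=0.5, service_time=3.0, workers=2)
keeping_up = queue_waits(100, arrival_interval=0.5, service_time=3.0, workers=6)
print(f"2 workers: last request waited {overloaded[-1]:.0f}s")
print(f"6 workers: longest wait {max(keeping_up):.0f}s")
```

When requests arrive faster than the workers can finish them, the wait grows with every new request, which is consistent with breakpoints being hit long after the browser was closed.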
A related topic area here is async, which in the context of a web application, allows threads to be returned to the "pool" when they're in a wait-state. For example, if you were talking to an API, there's a period of waiting, usually due to network latency, between sending the request and getting a response from the API. If you handled this asynchronously, then during that period, the thread could be returned to the pool to handle other requests (like those 500 queued up requests from the previous example). When the API finally responded, a thread would be returned to finish processing the request. Async allows the server to handle resources more efficiently by using threads that otherwise would be idle to handle new requests.
Then, there's the concept of client-server. In protocols like HTTP, the client makes a request and the server responds to that request. However, there's no persistent connection between the two. (This is somewhat untrue as of HTTP 1.1. Connections between the client and server are sometimes persisted, but this is only to allow faster future requests/responses, as the time it takes to initiate the connection is not a factor. However, there's no real persistent communication about the status of the client/server still in this scenario). The main point here is that if a client, like a web browser, sends a request to the server, and then the client is closed (such as closing the tab in the browser), that fact is not communicated to the server. All the server knows is that it received a request and must respond, and respond it will, even though there's technically nothing on the other end to receive it, any more. In other words, just because the browser tab has been closed, doesn't mean that the server will just stop processing the request and move on.
Then there are timeouts. Both clients and servers have some timeout value they abide by. The distributed nature of the Internet (enabled by protocols like TCP/IP and HTTP) means that nodes in the network are assumed to be transient. There's no persistent connection (aside from the same note above), and network interruptions could occur between the client making a request and the server responding to it. If the client and server did not plan for this, they could simply sit there waiting forever. However, these timeouts can vary widely. A server will usually time out in responding to a request within 30 seconds (though the timeout could potentially be set to indefinite). Clients like web browsers tend to be a bit more forgiving, having timeouts of 2 minutes or longer in some cases. When the server hits its timeout, the request will be aborted. Depending on why the timeout occurred, the client may receive various error responses. When the client times out, however, there's usually no notification to the server. That means that if the server's timeout is higher than the client's, the server will continue trying to respond, even though the client has already moved on. Closing a browser tab could be considered an immediate client timeout, but again, the server is none the wiser and keeps trying to do its job.
So, what all this boils down to is this. First, when polling (which is what you're doing by submitting an AJAX request repeatedly at some interval), you need to build in a cancellation scheme. For example, if the last 5 requests have timed out, you should stop polling, at least for some period of time. Even better would be to have the response of one AJAX request initiate the next. So, instead of using something like setInterval, you could use setTimeout and have the AJAX callback initiate it. That way, the requests only continue if the chain is unbroken. If one AJAX request fails, the polling stops immediately. However, in that scenario, you may need some fallback to re-initiate the request chain after some period of time. This prevents bombarding your already failing server endlessly with new requests. Also, there should always be some upper limit on how long polling should continue. If the user leaves the tab open for days, not using it, should you really keep polling the server for all that time?
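The chained-polling idea can be sketched outside the browser too. The Python version below is only an analogy to the setTimeout approach: the next request is issued only after the previous one has finished, polling stops after a number of consecutive failures, and there is an upper bound on total polls. The function name, the limits, and the fake fetch function are all hypothetical.

```python
import time

def poll_chained(fetch, interval, max_failures=5, max_polls=100, sleep=time.sleep):
    """Poll by issuing the next request only after the previous one finishes.

    Stops after `max_failures` consecutive failures or `max_polls` total polls.
    Returns the list of successful results.
    """
    results = []
    consecutive_failures = 0
    for _ in range(max_polls):
        try:
            results.append(fetch())
            consecutive_failures = 0
        except Exception:
            consecutive_failures += 1
            if consecutive_failures >= max_failures:
                break  # the server looks unhealthy; stop hammering it
        sleep(interval)
    return results

calls = []

def flaky_fetch():
    """Hypothetical data source that starts timing out after three updates."""
    calls.append(1)
    if len(calls) > 3:
        raise TimeoutError("simulated database timeout")
    return f"update {len(calls)}"

# The sleep is stubbed out so the demonstration finishes instantly.
results = poll_chained(flaky_fetch, interval=0.5, sleep=lambda seconds: None)
print(results, len(calls))
```

Because requests never overlap, a slow server sees at most one outstanding request from this client instead of a growing backlog.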
On the server-side, you can use async with cancellation tokens. This does two things: 1) it gives your server a little more breathing room to handle more requests and 2) it provides a way to unwind the request if some portion of it should time out. More information about that can be found at: http://www.asp.net/mvc/overview/performance/using-asynchronous-methods-in-aspnet-mvc-4#CancelToken
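The same pattern can be shown in Python's asyncio, purely as an analogy (this is not the ASP.NET API described at the link above). A slow operation is wrapped in a timeout; when the timeout fires, the operation is cancelled and gets a chance to clean up. The function names and durations are made up.

```python
import asyncio

cleaned_up = []

async def slow_query():
    """Stand-in for a database call that takes longer than the caller will wait."""
    try:
        await asyncio.sleep(1.0)
        return "rows"
    except asyncio.CancelledError:
        cleaned_up.append(True)  # a real handler would release resources here
        raise

async def handle_request(timeout):
    try:
        # wait_for cancels slow_query() if it does not finish within `timeout`.
        return await asyncio.wait_for(slow_query(), timeout=timeout)
    except asyncio.TimeoutError:
        return "timed out"

result = asyncio.run(handle_request(timeout=0.05))
print(result)
```

While the query is awaiting, the event loop is free to serve other requests, and the cancellation unwinds the work instead of letting it run to completion for nobody.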

How to throttle SignalR clients on the server side

I am using a PersistentConnection for publishing large amounts of data (many small packages) to the connected clients.
It is basically a one way direction of data (since each client will call endpoints on other servers to set up various subscriptions, so they will not push any data back to the server via the SignalR connection).
Is there any way to detect that the client cannot keep up with the messages sent to it?
The example could be a mobile client on a poor connection (e.g. in a roaming situation, the speed may vary a lot). If we are sending 100 messages per second, but the client can only handle 10, we will eventually lose the messages (due to the message buffer on the server side).
I was looking for a server side event, similar to what has been done on the (SignalR) client, e.g.
protected override Task OnConnectionSlow(IRequest request, string connectionId) {}
but that is not part of the framework (for good reasons, I assume).
I have considered using the approach (suggested elsewhere on Stack Overflow) of letting the client tell the server (e.g. every 10-30 seconds) how many messages it has received; if that number differs a lot from the number of messages sent to the client, it is likely that the client cannot keep up.
The event would be used to tell the distributed backend that the client cannot keep up, and then turn down the data generation rate.
There's no way to do this right now other than coding something custom. We have discussed this in the past as a potential feature, but it isn't anywhere on the roadmap right now. It's also not clear what "slow" means, as it's up to the application to decide. There'd probably be some kind of bandwidth/time/message-based setting that would make this hypothetical event trigger.
If you want to hook in at a really low level, you could use OWIN middleware to replace the client's underlying stream with one that you own, so that you'd see all of the data going over the wire (you'd have to do the same for websockets though, and that might be non-trivial).
Once you have that, you could write some time based logic that determined if the flush was taking too long and kill the client that way.
That's very fuzzy but it's basically a brain dump of how a feature like this could work.
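For the custom route, the acknowledgement-counting idea from the question could look roughly like the Python sketch below. It is not SignalR code: the class, its methods, and the lag threshold of 200 messages are assumptions, and the transport that carries the client's periodic report is left out.

```python
class ClientProgress:
    """Tracks how far a client lags behind what the server has sent it."""

    def __init__(self, max_lag):
        self.max_lag = max_lag
        self.sent = 0
        self.acked = 0

    def record_sent(self, count=1):
        self.sent += count

    def record_ack(self, received_total):
        # The client periodically reports its running total of received messages.
        self.acked = max(self.acked, received_total)

    def is_slow(self):
        return self.sent - self.acked > self.max_lag

progress = ClientProgress(max_lag=200)
progress.record_sent(1000)   # e.g. 100 messages/second for 10 seconds
progress.record_ack(100)     # the client only handled 10 messages/second
slow_before = progress.is_slow()
progress.record_ack(950)     # the client caught up
slow_after = progress.is_slow()
print(slow_before, slow_after)
```

What counts as "slow" stays an application decision here: the threshold is the one knob, and the backend can lower its generation rate whenever is_slow() returns true.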

Server -> Many Clients: Simultaneous Events

Not sure what category this question falls into; perhaps general networking / design / algorithms.
For a project I am looking at having one server with multiple connected clients. After some time, when all clients have connected, the server should send a message to each client instructing them to take some action. I need to guarantee that each client will execute this action at exactly the same time. Theoretically, how can this be done? What are the practical complications I will come up against? My target platform is mobile.
One solution I can think of:
The server actively and continuously keeps track of the round-trip latency for each client. Provided this latency doesn't change too fast over time, the server should be able to compensate for each client's lag and send messages to each such that they all start execution at roughly the same time. Is there a better way?
One not-really related question: Client side and server side events not firing simultaneously
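The latency-compensation idea in the question can be sketched in a few lines of Python. The client names and latency figures are invented, and estimating one-way latency as half the round-trip time is an assumption that does not hold on asymmetric links.

```python
def send_delays(one_way_latency):
    """Given estimated one-way latency per client (seconds), return how long the
    server should wait before sending to each, so all messages arrive together."""
    slowest = max(one_way_latency.values())
    return {client: slowest - latency for client, latency in one_way_latency.items()}

# Hypothetical estimates, e.g. half of each client's measured round-trip time.
latency = {"phone_a": 0.25, "phone_b": 0.125, "phone_c": 0.5}
delays = send_delays(latency)

# Expected arrival time at each client, relative to the moment the server starts.
arrival = {client: delays[client] + latency[client] for client in latency}
print(delays)
print(arrival)
```

The slowest client is sent its message first and everyone else is held back, so the result is only as good as the latency estimates at the moment of sending.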
It can easily be done.
You don't need to care about latency, nor do you need the same machine time on the clients.
The key here is to create a precise appointment.
Since clients communicate to the server, and not vice versa (you didn't say anything about it though), I can give you the following solution:
When a client connects to the server, it should send its local time.
When the server thinks it's time for the event to be set, it should send an appointment event to each client, with the time expressed in that client's local time. The server can calculate this.
Then, each client knows exactly when it needs to do something, by setting a timer until the time for its appointment comes.
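The appointment calculation might look like this in Python. The clock values are invented, and the offset estimate ignores the time the client's message spent in transit, so the error in each appointment is roughly that one-way latency.

```python
def clock_offset(client_local_time, server_time_at_receipt):
    """Estimate client clock minus server clock, ignoring network latency."""
    return client_local_time - server_time_at_receipt

def appointment_for(server_event_time, offset):
    """Translate an event time on the server clock into the client's local clock."""
    return server_event_time + offset

# Hypothetical clocks: client A runs 120 s ahead of the server, client B 45 s behind.
offset_a = clock_offset(client_local_time=1120.0, server_time_at_receipt=1000.0)
offset_b = clock_offset(client_local_time=960.0, server_time_at_receipt=1005.0)

event_on_server_clock = 2000.0
appointment_a = appointment_for(event_on_server_clock, offset_a)
appointment_b = appointment_for(event_on_server_clock, offset_b)
print(appointment_a, appointment_b)
```

Each client then sets a timer for its own appointment value, so the two different local times refer to the same instant on the server clock.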
In theory, yes, you can, but not in real life.
At the least, you should add a validity time slot. Every action must happen within that predefined time slot in order to be valid.
So basically "same moment" = "a predefined time slot".
A predefined time slot can be any value that is close enough to the same moment, i.e. to real time.

Resources