How can I keep my JVM from exiting while a Netty client connection is open?

I have an API that uses Netty to open a client connection to a TCP server. The server may send data to the client at any time. I'm facing the following scenario:
Client connects to server
Sends data to server
Disconnects and the JVM exits (not sure which happens first)
This is what I expect:
Client connects to server
Sends data to server
Client simply keeps the connections open, waiting to receive data or for the user of client API to send data.
This is an outline of my connection method (obviously there is a much larger API around it):
```
public FIXClient connect(String host, int port) throws Throwable {
...
ChannelPipeline pipe = org.jboss.netty.channel.Channels.pipeline(...);
ChannelFactory factory = new NioClientSocketChannelFactory(
Executors.newCachedThreadPool(),
Executors.newCachedThreadPool());
ClientBootstrap bootstrap = new ClientBootstrap(factory);
bootstrap.setPipeline(pipe);
bootstrap.setOption("tcpNoDelay", true);
bootstrap.setOption("keepAlive", true);
ChannelFuture future = bootstrap.connect(new InetSocketAddress(host, port));
//forcing the connect call to block
//don't want clients to deal with async connect calls
future.awaitUninterruptibly();
if(future.isSuccess()){
this.channel = future.getChannel();
//channel.getCloseFuture();//TODO notifies whenever channel closes
}
else{
throw future.getCause();//wrap this in a more specific exception
}
return this;
}
```

That has nothing to do with Netty. You need to make sure your "main" method does not exit if you call it from there. Otherwise it is the job of the container.

There are a couple of ways you can do this, but one thing I have observed is that with this code:
ChannelFactory factory = new NioClientSocketChannelFactory(
Executors.newCachedThreadPool(),
Executors.newCachedThreadPool());
... if you make a successful connection, your JVM will not shut down of its own accord for some time, until you force it (for example with kill) or you call releaseExternalResources() on your channel factory. This is because:
The threads created by Executors.newCachedThreadPool() are non-daemon threads.
At least one thread is created once you submit your connection request.
The cached thread pool threads have a keep-alive time of 60 seconds, meaning they don't go away until they've been idle for 60 seconds, so that would be 60 seconds after your connect and send (assuming that they both completed).
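The non-daemon claim is easy to check without Netty. The following is a minimal sketch; the class and method names (PoolThreadCheck, workerIsDaemon) are made up for illustration and are not part of the original API.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class PoolThreadCheck {
    /** Runs one task on a fresh cached pool and reports whether its worker thread is a daemon. */
    public static boolean workerIsDaemon() throws Exception {
        ExecutorService pool = Executors.newCachedThreadPool();
        try {
            Callable<Boolean> task = () -> Thread.currentThread().isDaemon();
            return pool.submit(task).get();
        } finally {
            // Without this, the idle worker lingers until its 60-second keep-alive expires.
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("daemon worker? " + workerIsDaemon());
    }
}
```

If the pool is not shut down, the idle non-daemon worker is what keeps the JVM running for a while after main returns.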
So I'm not sure you're diagnosing the issue correctly. Having said that, I recommend you handle the task this way:
Boot the application in your main method (on the main thread).
Launch all your actual useful work in new threads.
Once the useful threads have been launched, call Thread.currentThread().join() in the main thread. Since main is always non-daemon, this makes sure the JVM will not shut down until you're good and ready.
At some point, unless you want kill -9 as your shutdown strategy, you will want a controlled shutdown, so you can add a shutdown hook that shuts down Netty and then interrupts the main thread.
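The steps above can be sketched roughly as follows. Note that this sketch swaps Thread.currentThread().join() plus interrupt for a CountDownLatch, which blocks the main thread in the same way but is released explicitly. The class and method names (KeepAliveMain, requestShutdown, awaitShutdown, installShutdownHook) are hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class KeepAliveMain {
    // Released once, when the application should stop.
    private final CountDownLatch shutdown = new CountDownLatch(1);

    /** Lets any thread blocked in awaitShutdown() continue. */
    public void requestShutdown() {
        shutdown.countDown();
    }

    /** Blocks the calling (main) thread until shutdown is requested. */
    public void awaitShutdown() throws InterruptedException {
        shutdown.await();
    }

    /** Timed variant; returns false if the timeout elapsed first. */
    public boolean awaitShutdown(long timeout, TimeUnit unit) throws InterruptedException {
        return shutdown.await(timeout, unit);
    }

    /** Registers a JVM shutdown hook that runs the cleanup, then releases the latch. */
    public void installShutdownHook(Runnable cleanup) {
        Runtime.getRuntime().addShutdownHook(new Thread(() -> {
            cleanup.run(); // e.g. release the Netty channel factory here
            requestShutdown();
        }));
    }
}
```

A main method would start its worker threads, call installShutdownHook(...) with whatever releases the Netty resources, and then call awaitShutdown() as its last statement.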
I hope that's helpful.

Related

Qt: Detect a QTcpSocket disconnection in a console app when the user closes it

My question title should be enough. I already tried (without success):
Using a C-style destructor in a function: __attribute__((destructor)):
void sendToServerAtExit() __attribute__((destructor)) {
mySocket->write("$%BYE_CODE%$");
}
The application destructor is called, but the socket is already disconnected and I can't write to the server.
Using the standard C function atexit(), but the TCP connection is already lost so I can't send anything to the server.
atexit(sendToServerAtExit); // the same function as in point 1
The solution I found is to check every second whether all connected sockets are still connected, but I don't want to do something so inefficient. It's only a temporary solution. Also, I want other apps (even web ones) to be able to join the chat room of my console app, and I don't want to request data every second.
What should I do?
Handle the signal below (QTcpSocket inherits from QAbstractSocket):
void QAbstractSocket::stateChanged(QAbstractSocket::SocketState socketState)
Inside the slot called, check if socketState is QAbstractSocket::ClosingState.
QAbstractSocket::ClosingState indicates the socket is about to close.
http://doc.qt.io/qt-5/qabstractsocket.html#SocketState-enum
You can connect a slot to the disconnected signal.
connect(m_socket, &QTcpSocket::disconnected, this, &Class::clientDisconnected);
Check the documentation.
You can also know which user has been disconnected using a slot like this:
void Class::clientDisconnected()
{
QTcpSocket* client = qobject_cast<QTcpSocket*>(sender());
if(client)
{
// Do something
client->deleteLater();
}
else
{
// Handle error
}
}
This method is useful if you have a connection pool. You can use it as well if you have a single connection, but do not forget to set your pointer to nullptr after client->deleteLater().
If I understand your question correctly, you want to send data over TCP to notify the remote computer that you are closing the socket.
Technically this can be done in Qt by listening to the QIODevice::aboutToClose() or QAbstractSocket::stateChanged() signals.
However, if you exit your program gracefully, closing the QTcpSocket already sends a FIN packet to the remote computer. This means that the program running on the remote computer is notified that the TCP connection has finished. For instance, if the remote program is also using QTcpSocket, the QAbstractSocket::disconnected() signal will be emitted.
The real issues arise when one of the programs does not exit gracefully (crash, hardware issue, cable unplugged, etc.). In this case, the TCP FIN packet will not be sent and the remote computer will never be notified that the other side of the TCP connection has disconnected. The TCP connection will just time out after a few minutes.
However, in this case you cannot send your final piece of data to the server either.
In the end the only solution is to send an "I am here" packet every now and then. Even though you claim it is inefficient, it is a widely used technique and it also has the advantage that it works.
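On the receiving side, the "I am here" technique usually comes down to remembering when each peer was last heard from and dropping peers that stay silent too long. Here is a minimal, language-neutral sketch of that bookkeeping, written in Java rather than Qt; HeartbeatTracker and its methods are hypothetical names, and timestamps are passed in explicitly so the logic is easy to test.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Tracks the last time each peer sent an "I am here" message. */
public class HeartbeatTracker {
    private final Map<String, Long> lastSeenMillis = new ConcurrentHashMap<>();
    private final long timeoutMillis;

    public HeartbeatTracker(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    /** Call whenever a heartbeat (or any other data) arrives from the peer. */
    public void beat(String peerId, long nowMillis) {
        lastSeenMillis.put(peerId, nowMillis);
    }

    /** Removes and returns every peer that has been silent for longer than the timeout. */
    public List<String> expire(long nowMillis) {
        List<String> dead = new ArrayList<>();
        for (Map.Entry<String, Long> e : lastSeenMillis.entrySet()) {
            boolean silentTooLong = nowMillis - e.getValue() > timeoutMillis;
            // remove(key, value) only succeeds if no newer heartbeat arrived in the meantime
            if (silentTooLong && lastSeenMillis.remove(e.getKey(), e.getValue())) {
                dead.add(e.getKey());
            }
        }
        return dead;
    }
}
```

A periodic timer would call expire(...) with the current time and treat the returned peers as disconnected, which costs one small packet per interval instead of polling every socket.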

Asynchronous web socket application server with two event loops

I'm trying to make a distributed RPC-type web application that uses websockets for its main interface. I want to use a queuing system (like RabbitMQ) in order to distribute the expensive jobs that are requested through the websocket connections.
Basically, the flow would go like this:
A client sends a job via websocket connection to the server
The server would send this message to a RabbitMQ exchange to be processed by a worker
The worker would execute the job and add the result of the job to a response queue
The server would check the response queue and send the result of the job back to the client via websocket connection.
As far as I can tell, on the server I need two event loops that share memory. The websocket server needs to be listening for incoming jobs, and a RabbitMQ consumer needs to be listening for job results to send back to the clients.
What are the appropriate technologies for me to use here? I've considered the following:
multithreading the application and starting one event loop on each thread
using two processes with shm (shared memory)
using two processes that communicate via socket (either a unix socket or maybe even set up the workers as special websocket clients)
hooking into the websocket server's event loop to check the result queue
I'm new to both websockets and distributed computing, so I really have no idea which of these (or maybe something I didn't think of) would work best for me.
As far as I can tell, on the server I need two event loops that share memory. The websocket server needs to be listening for incoming jobs, and a RabbitMQ consumer needs to be listening for job results to send back to the clients.
Since you can have multiple clients sending jobs concurrently, you will need a multithreaded server, unless your application processes clients one at a time. There are multiple approaches to implementing a multithreaded server, each with its own advantages and disadvantages. Take a look at multithreading through :
A thread per request (+ : throughput potentially maximized, - : threads are expensive to create, must manage concurrency)
A thread per client (+ : less thread management overhead, - : doesn't scale to many many connections, still manage concurrency)
A thread pool (+ : Avoids overhead of thread creation, scalable up to N concurrent connections (N = size of thread pool), - : Manage concurrency between N threads)
It's up to you to choose one of the above approaches (I would opt for a thread per client as it is relatively easy to implement and the chance that you'll have tens of thousands of clients is relatively small).
Notice that this is a multithreaded approach and not an event-driven approach ! But since you are not limited to one thread (in which case it should be event driven in order to be able to process multiple clients "concurrently") I wouldn't go for that option as it is more difficult to implement. (Programmers sometimes speak about a "callback hell" in an event-driven approach).
This is how I would implement it (one thread per client, Java) :
Basically, the flow would go like this:
A client sends a job via websocket connection to the server
Server part :
public class Server {
private static ServerSocket server_skt;
private static ... channel; // channel to communicate with the rabbitMQ distributed priority queue.
// Constructor
Server(int port) {
server_skt = new ServerSocket(port);
/*
* Set up connection with the distributed queue
* channel = ...;
*/
}
public static void main(String argv[]) {
Server server = new Server(5555); // Make server instance
while(true) {
// Always waiting for new clients to connect
try {
System.out.println("Waiting for a client to connect...");
// Spawn new thread for communication with client (hence one thread per client approach)
new CommunicationThread(server_skt.accept(), server.channel).start(); // Will listen for new jobs and send them
} catch(IOException e) {
System.out.println("Exception occurred: " + e);
}
}
}
}
The server would send this message to a RabbitMQ exchange to be processed by a worker
...
The server would check the response queue and send the result of the job back to the client via websocket connection.
public class CommunicationThread extends Thread {
private Socket client_socket;
private InputStream client_in;
private OutputStream client_out;
private ... channel; // Channel to communicate with rabbitMQ
private ... resultQueue;
private volatile boolean active = true; // set to false to stop the loop in run()
public CommunicationThread(Socket socket, ... channel) { // replace ... by type of the rabbitMQ channel
try {
this.client_socket = socket;
this.client_in = client_socket.getInputStream();
this.client_out = client_socket.getOutputStream();
this.channel = channel;
this.resultQueue = ...;
System.out.println("Client connected : " + client_socket.getInetAddress().toString());
} catch(IOException e) {
System.out.println("Could not initialize communication properly. -- CommunicationThread.\n");
}
}
public yourJobType readJob() {
// Read input from client (e.g. read a String from "client_in")
// Make a job from it (e.g. map String to a job)
// return the job
}
@Override
public void run() {
while(active) {
/*
* Always listen for incoming jobs (sent by client) and for results (to be sent back to client)
*/
// Read client input (only if available, else it would be blocking!)
if(client_in.available() > 0) {
yourJobType job = readJob();
channel.basicPublish(...); // Send job to rabbitMQ
}
/* Check result queue (THIS is why reading client input MUST be NON-BLOCKING! Else while loop could be blocked on reading input
* and the result queue won't be checked until next job arrives)
*/
ResultType next_result = resultQueue.poll(); // Could be "null" if the queue is empty
if(next_result != null) {
// There is a result
client_out.write(next_result.toByteArray());
client_out.flush();
}
}
client_in.close();
client_out.close();
}
}
Note that when reading from the result queue it is important that you only read results of jobs sent by that client.
If you have one result queue containing the results of the jobs of all clients and you retrieve a result as in the code above, that result could belong to another client's job, so it would be sent back to the wrong client.
To fix this you could poll() the result queue with a filter and a wildcard (*), or have a result queue for each client, so that a result retrieved from a client's queue is known to belong to that client.
(*) : You could assign an ID to every client. When receiving a job from a client, pair the job with the client ID (e.g. in a tuple < clientID, job >) and put it in the queue. And do the same for the results (pair the result with the client ID and put it in the result queue). Then in the run() method of CommunicationThread you would have to poll the result queue only for results of the form < clientID, ? >.
Important : You'll also have to assign an ID to every job! Sending job A and then job B doesn't guarantee that the result of job A will come before the result of job B. (Job B could be less time-consuming than job A, so its result could be sent back to the client before job A's result.)
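The "one result queue per client" variant, combined with job IDs, could look roughly like this. It is only a sketch: ResultRouter, Result, deliver and poll are hypothetical names, and the RabbitMQ consumer that would call deliver(...) is left out.

```java
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

/** Routes job results into one queue per client, keyed by client ID. */
public class ResultRouter {
    /** A result tagged with the client and the job it belongs to. */
    public static final class Result {
        public final String clientId;
        public final long jobId;
        public final String payload;

        public Result(String clientId, long jobId, String payload) {
            this.clientId = clientId;
            this.jobId = jobId;
            this.payload = payload;
        }
    }

    private final Map<String, BlockingQueue<Result>> queues = new ConcurrentHashMap<>();

    private BlockingQueue<Result> queueFor(String clientId) {
        return queues.computeIfAbsent(clientId, id -> new LinkedBlockingQueue<>());
    }

    /** Called by whatever consumes the shared response queue. */
    public void deliver(Result result) {
        queueFor(result.clientId).add(result);
    }

    /** Non-blocking; returns null when this client has no pending result. */
    public Result poll(String clientId) {
        return queueFor(clientId).poll();
    }
}
```

Each CommunicationThread would then poll only its own client ID, and the jobId on the result tells the client which job finished, regardless of completion order.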
(PS : It's up to you to see how to implement the workers (executed by server with one thread for each worker? Or executed by other processes?))
The above answer is a possible, multithreaded solution. I only discussed the server part, the client part should send jobs and wait for results (how to implement this depends on your goals, do clients first send all jobs and then receive the results of each job or can this be mixed ?).
There are other ways it could be implemented, but for a beginner in distributed computing I think this is the easiest solution (using thread pools, ... would make it trickier).

Client Reconnection

My understanding of the (JavaScript) hub client is that if a connection is lost, it enters a 'Reconnecting...' phase which attempts to reconnect. If it can't do so, it will enter a 'Disconnected' state which is where it'll stay until asked to start again.
How long is the 'Reconnecting...' phase meant to last before it gives up? I've read 40 seconds before, but my client seems to take much less time - about 10, maybe less. [EDIT: Never mind this part, I had configured a 10-second disconnect timeout on the server as a test... and forgot. I understand this is set by the server during the negotiate. Makes sense!] ... I'd prefer to have the client continually retry until it is told to abort - can this be done, and would it cause issues?
Another question; during the Reconnecting... phase, if I attempt to call a hub method (again, in JS) it never seems to complete. I'm using the returned Deferred to check for 'done' and 'fail' events, but neither seems to get called. Is this by design?
Thanks.
You can definitely have it continually reconnect.
Handle the disconnected event on the client and call connection.start:
$.connection.hub.disconnected(function() {
setTimeout(function() {
$.connection.hub.start();
}, 5000); // Re-start connection after 5 seconds
});
The only issue this would cause is that client machines could end up sending requests indefinitely to a server that isn't there. This becomes even more troublesome when you introduce the mobile market into the situation (drains battery like crazy).
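One common way to soften that problem is to lengthen the delay after every failed attempt instead of retrying every 5 seconds forever. The calculation is sketched below in Java; ReconnectBackoff and delayMillis are hypothetical names, and the doubling-with-cap policy is an assumption, not something SignalR prescribes.

```java
/** Computes a capped, doubling delay for successive reconnect attempts. */
public class ReconnectBackoff {
    /**
     * @param attempt    number of failed attempts so far (0 for the first retry)
     * @param baseMillis delay used for the first retry
     * @param maxMillis  upper bound for any delay
     */
    public static long delayMillis(int attempt, long baseMillis, long maxMillis) {
        long delay = baseMillis;
        // Double once per failed attempt, but stop as soon as the cap is reached.
        for (int i = 0; i < attempt && delay < maxMillis; i++) {
            delay *= 2;
        }
        return Math.min(delay, maxMillis);
    }
}
```

With a 5-second base and a 60-second cap, the delays grow as 5, 10, 20, 40 and then stay at 60 seconds; the same arithmetic can feed the timeout passed to setTimeout in the handler above.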
When you attempt to call a hub method while reconnecting, SignalR will try to send your command. Since there are two channels, one for receiving data and one for sending (for all transports except web sockets), in some cases it can still be possible to send requests while you're offline. Therefore SignalR does not know that a request has failed until the browser tells it that it could not make the request.
Hope this helps!
I might have a clue... Touching the Web.config triggers an appPool recycle, meaning that a new worker process will be created for new requests while the existing process continues for a while until the remaining requests end or the timeout is reached. Requests that do not end within the timeout period are terminated.
The SignalR client reconnects to the new process while the long-running task is still running in the old process, so when the long-running task calls
GlobalHost.ConnectionManager.GetHubContext<ForceHub>();
you actually get a reference for "old" hub while the client is connected to the "new" hub.
That's why the test performed by Wasp worked: he was making a new request to publish on the SignalR hub, and that request was processed in the newly created worker process.
You could try to configure a SignalR backplane (https://www.asp.net/signalr/overview/performance/scaleout-in-signalr); it's really easy to configure using SQL Server (https://www.asp.net/signalr/overview/performance/scaleout-with-sql-server). The backplane should be capable of connecting the two worker processes, and hopefully you will get the notification on the client.
If this is the problem, notifications generated by new requests will work even without the backplane. Note that the real purpose of the backplane is to scale out SignalR, that is, to connect the web servers of a farm to each other.
Also keep in mind that running a long-running task inside IIS is hard to get right because, among other things, IIS does regular appPool recycles and has timeout limits for request execution. I recommend that you read the following post: http://www.hanselman.com/blog/HowToRunBackgroundTasksInASPNET.aspx
“If you think you can just write a background task yourself, it's likely you'll get it wrong. I'm not impugning your skills, I'm just saying it's subtle. Plus, why should you have to?”
Hope this helps

netty client + keep-alive=true

I'm confused about how to deal with lots of connections in netty (3.6.2.FINAL) with keep-alive=true.
I'm working on a netty client that acts as a server-side connector, making HTTP calls to another service, and it should always keep the connection open for performance (keep-alive=true).
The issue: there is a hard limit on the number of open channels, after which the client hangs when attempting to open a channel. Why does it just hang instead of throwing an exception? Is this a setting in terms of channel timeout?
I can't seem to understand Netty in terms of overall managing of connections within worker threads:
With a blocking write/read client ChannelHandler (http request/response), how do you detect that the connection pool is empty?
The handler can receive ChannelEvent(s) but nothing about the overall count available in the connection pool (it's very non-deterministic anyway). And if the channel is not open, does it make sense for the handler to initiate opening a new channel given it's running in a worker thread?
But if the connection pool is exhausted, how do you go and clean up some idle connections (within the handler)?
I had to completely rip apart my handler to get the client blocking call to work without hanging. The issue was mostly resolved by not holding onto local channel ref within the handler.
Now we just pass a ConnectionInterface#openConnection() [returns a new ChannelFuture] into the shared custom ChannelHandler#call( ConnectionInterface connectionInterface, HttpRequest request ).
It is better to open the channel within the handler's call method and to check the channel's state before channel.write(x): if !channel.isWritable(), recycle the channel (take one from a new client connection, e.g. ConnectionInterface#openConnection()) and retry the write. There isn't even a need to close the channel (it gets handled in the pool).
Just ran it with 500 threads / 5000 requests and it succeeds fine.

Detecting aborted requests in a HttpServlet

Is there a way to find out if a HttpServletRequest is aborted?
I'm writing an instant browser application (some kind of chat): the client asks for new events in a loop using AJAX HTTP requests. The server (Tomcat) handles the requests in an HttpServlet. If there are no new events for this client, the server delays the reply until a new event arrives or a timeout occurs (30 sec).
Now I want to identify clients that are no longer polling. Therefore, I start a kick-Timer at the end of a request which is stopped when a new request arrives. If the client closes the browser window the TCP-Connection is closed and the HTTP-Request is aborted.
Problem: The client does not run into the kick-Timeout because the Servlet still handles the event request - sleeping and waiting for an event or timeout.
It would be great if I could somehow listen for connection abort events and then notify the waiting request in order to stop it. But I couldn't find anything like that in the HttpServletRequest or HttpServletResponse...
This probably won't help the OP any more, but it might help others trying to detect aborted HTTP connections in HttpServlet in general, as I was having a similar problem and finally found an answer.
The key is that when the client cancels the request, normally the only way for the server to find out is to send some data back to the client, which will fail in that case. I wanted to detect when a client stops waiting for a long computation on the server, so I ended up periodically writing a single character to the response body through HttpServletResponse's writer. To force sending the data to the client, you must call HttpServletResponse.flushBuffer(), which on Tomcat throws ClientAbortException (an IOException) if the connection is aborted.
You are probably using some sort of thread notification (semaphores or Object.wait) to hold and release the servlet threads. How about adding a timeout (~10s) to the wait, then somehow checking whether the connection is still alive, and continuing the wait for another 10s if the connection is still there?
I don't know whether there are reliable ways to poll the "liveness" of the connection (e.g. resp.getOutputStream not throwing an Exception) and if so, which way is the best (most reliable, least CPU intense).
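The "wait in slices and re-check the connection" idea could be sketched like this. The liveness check itself is deliberately left as a caller-supplied BooleanSupplier, since how to test the connection is exactly the open question; LongPollWaiter and awaitEvent are hypothetical names, and a BlockingQueue stands in for whatever notification mechanism the servlet really uses.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

public class LongPollWaiter {
    /**
     * Waits for the next event in short slices, re-checking liveness between slices.
     * Returns the event, or null if the client went away or the total timeout elapsed.
     */
    public static <T> T awaitEvent(BlockingQueue<T> events, BooleanSupplier clientAlive,
                                   long sliceMillis, long totalMillis) throws InterruptedException {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(totalMillis);
        while (clientAlive.getAsBoolean()) {
            long remaining = TimeUnit.NANOSECONDS.toMillis(deadline - System.nanoTime());
            if (remaining <= 0) {
                return null; // overall timeout (30 s in the question) reached
            }
            T event = events.poll(Math.min(sliceMillis, remaining), TimeUnit.MILLISECONDS);
            if (event != null) {
                return event;
            }
        }
        return null; // liveness check failed, stop holding the request
    }
}
```

In the servlet, clientAlive could wrap the flush-and-catch approach described above, with a slice of about 10 seconds and a total of 30 seconds.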
It seems like having waiting requests could degrade the performance of your system pretty quickly. The threads that respond to requests would get used up fast if requests are held open. You could try completing all requests (and returning "null" to your clients if there is no message), and having a thread on the back-end that keeps track of how long it's been since clients have polled. The thread could mark a client as being inactive.
