BlazeDS+ActiveMQ: non-graceful disconnection of Flex client from a durable topic does not remove it from ActiveMQ

I'm trying to make a Flex-based desktop application consume messages from an ActiveMQ topic with a durable subscription, using the JMS bridge of BlazeDS. The basic scenario is as follows:
1. Messages are produced by other producers in the topic to which the Flex client is subscribed.
2. The Flex client may go offline from time to time, but when it connects to BlazeDS again it must receive all the messages it missed while offline. (Of course, the Flex client connects with the same client ID every time.)
3. It cannot be guaranteed that the Flex client is shut down gracefully.
Everything works fine if I explicitly disconnect my consumer on the Flex side by calling disconnect() - I do it in the exit handler of the application. However, due to #3 above, it is not guaranteed that disconnect() is always called. When the Flex client shuts down without calling disconnect(), it seems that the subscription of the "proxy JMS client" that BlazeDS creates and associates with the Flex client stays active towards ActiveMQ, so ActiveMQ still thinks that the client is logged in. When the Flex app starts up the next time, it is unable to log in to BlazeDS because ActiveMQ refuses its subscription, claiming that the client ID is already taken. Why is this so, and what can I do to ensure that BlazeDS takes the "proxy JMS client" offline in ActiveMQ when its real Flex counterpart terminates unexpectedly?
More detailed information: some debugging revealed the following:
BlazeDS does become aware of the termination of the Flex client: when running in debug mode, it prints a few exceptions to the console. The messages are as follows:
[BlazeDS]23:18:13.688 [WARN] Endpoint with id 'my-streaming-amf' is closing the streaming connection to FlexClient with id '71E6466F-D91F-201C-F60A-A6CB52F95D9F' because endpoint encountered a socket write error, possibly due to an unresponsive FlexClient.
ClientAbortException: java.net.SocketException: Broken pipe
at org.apache.catalina.connector.OutputBuffer.doFlush(OutputBuffer.java:319)
at org.apache.catalina.connector.OutputBuffer.flush(OutputBuffer.java:288)
at org.apache.catalina.connector.Response.flushBuffer(Response.java:542)
at org.apache.catalina.connector.ResponseFacade.flushBuffer(ResponseFacade.java:279)
at flex.messaging.endpoints.BaseStreamingHTTPEndpoint.handleFlexClientStreamingOpenRequest(BaseStreamingHTTPEndpoint.java:818)
at flex.messaging.endpoints.BaseStreamingHTTPEndpoint.serviceStreamingRequest(BaseStreamingHTTPEndpoint.java:1055)
at flex.messaging.endpoints.BaseStreamingHTTPEndpoint.service(BaseStreamingHTTPEndpoint.java:460)
at flex.messaging.MessageBrokerServlet.service(MessageBrokerServlet.java:353)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:803)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:175)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:263)
at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:844)
at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:584)
at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)
at java.lang.Thread.run(Thread.java:680)
Caused by: java.net.SocketException: Broken pipe
at java.net.SocketOutputStream.socketWrite0(Native Method)
at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
at org.apache.coyote.http11.InternalOutputBuffer.realWriteBytes(InternalOutputBuffer.java:737)
at org.apache.tomcat.util.buf.ByteChunk.flushBuffer(ByteChunk.java:434)
at org.apache.coyote.http11.InternalOutputBuffer.flush(InternalOutputBuffer.java:299)
at org.apache.coyote.http11.Http11Processor.action(Http11Processor.java:963)
at org.apache.coyote.Response.action(Response.java:183)
at org.apache.catalina.connector.OutputBuffer.doFlush(OutputBuffer.java:314)
... 20 more
[BlazeDS]23:18:13.689 [DEBUG] Streaming thread 'http-8400-1' for endpoint with id 'my-streaming-amf' is releasing connection and returning to the request handler pool.
[BlazeDS]23:18:13.689 [INFO] Number of streaming clients for FlexSession with id '5BC5E8D604A361BCA673B05AC624CCC1' is 0.
[BlazeDS]23:18:13.689 [DEBUG] Number of streaming clients for endpoint with id 'my-streaming-amf' is 0.
At this stage, the subscriptions are still shown on the ActiveMQ web admin interface as being active.
Killing BlazeDS (more precisely, the Tomcat server that hosts it) with kill -9 from the console makes ActiveMQ realize immediately that the "proxy JMS client" is gone, and it becomes offline on the ActiveMQ web admin interface. This made me conclude that BlazeDS is keeping the proxy JMS client alive explicitly: kill -9 gives BlazeDS no chance to unsubscribe the client, yet the client still goes offline in ActiveMQ.
So, the question once again: what can I do to ensure that BlazeDS takes the "proxy JMS client" offline in ActiveMQ when its real Flex counterpart terminates unexpectedly? Is this a bug in BlazeDS, or am I just missing some hidden configuration setting that would make it work?
Version information: BlazeDS 4.0 and ActiveMQ 5.5.0, both freshly downloaded today. I'm using the Tomcat server from the BlazeDS turnkey distribution, but ActiveMQ is installed separately because the turnkey ships with ActiveMQ 4.1.1 only. By the way, that version of ActiveMQ has the same issue.

The problem is that there is no way for BlazeDS to detect that your Flex client was shut down; you will have to implement your own mechanism. My suggestion is to use a heartbeat implemented with messaging: if no message is received from the client within a given time interval, you can assume that the Flex client is gone and do the disconnect (or you can use the session timeout mechanism on the server and do the disconnect on session expiry, as sketched below).
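A minimal sketch of the session-expiry variant, assuming the proxy client's JMS connection has been stored as a session attribute under a hypothetical name ("proxyJmsConnection") when the subscription was set up; the listener would be registered via a <listener> element in web.xml:

import javax.jms.Connection;
import javax.servlet.http.HttpSessionEvent;
import javax.servlet.http.HttpSessionListener;

// Hypothetical cleanup listener: closes the JMS connection associated with a
// Flex client's session so ActiveMQ releases the durable subscriber's client ID.
public class JmsCleanupListener implements HttpSessionListener {

    public void sessionCreated(HttpSessionEvent se) {
        // nothing to do when a session starts
    }

    public void sessionDestroyed(HttpSessionEvent se) {
        // "proxyJmsConnection" is an assumed attribute name; store the proxy
        // client's JMS connection under it when the subscription is created
        Connection proxy = (Connection) se.getSession().getAttribute("proxyJmsConnection");
        if (proxy != null) {
            try {
                proxy.close(); // ActiveMQ now sees the durable subscriber as offline
            } catch (Exception ignored) {
                // the session is gone anyway; nothing sensible to do here
            }
        }
    }
}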
What you have seen (the exception caught when the streaming channel is closed) is not enough to say with 100% certainty that the Flex client is gone. The streaming is implemented using an HTTP connection kept open forever (used to send server messages) and periodic HTTP POST calls (initiated by the client to send messages). In some networks a firewall can decide to kill the HTTP connection after a couple of seconds, and you will receive the same error as the one you posted. However, that does not mean that the Flex client was killed - the Flex client can use a fallback strategy and switch to short/long polling in this case. Actually, it would be a bug if BlazeDS automatically did the JMS disconnect in this case.

Related

gRPC call, channel, connection and HTTP/2 lifecycle

I read the gRPC Core concepts, architecture and lifecycle article, but it doesn't go into the depth I'd like to see. There are the RPC call, the gRPC channel, the gRPC connection (not described in the article) and the HTTP/2 connection (not described in the article).
I'm interested in knowing how these come together. For example, what happens to the channel when an RPC throws an exception? What happens to the gRPC connection when the channel is closed? When is the channel closed? When is the gRPC connection closed? Heartbeats? What if the deadline is exceeded?
Can anyone answer these questions, or point me to resources that can?
The connection is not a gRPC concept. It is not part of the normal API and is an implementation detail. This should be seen as fairly normal, like HTTP libraries providing details about HTTP exchanges but not exposing connections.
It is best to view RPCs and connections as two mostly-separate systems.
The only real guarantee is that "connections are managed by channels," for varying definitions of "managed." You must shut down channels when no longer used if you want connections and other resources to be freed. Other details are either an implementation detail or an advanced API detail.
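As a sketch of that lifecycle in grpc-java (the host and port are placeholders):

import io.grpc.ManagedChannel;
import io.grpc.ManagedChannelBuilder;
import java.util.concurrent.TimeUnit;

public class ChannelLifecycleDemo {
    public static void main(String[] args) throws InterruptedException {
        // One long-lived channel; the connections behind it are the channel's concern.
        ManagedChannel channel = ManagedChannelBuilder
                .forAddress("example.com", 443)  // placeholder target
                .build();
        try {
            // ... create stubs on this channel and issue RPCs ...
        } finally {
            channel.shutdown();                             // stop accepting new RPCs
            channel.awaitTermination(5, TimeUnit.SECONDS);  // let in-flight RPCs finish
        }
    }
}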
There is no "gRPC connection." A "gRPC connection" would just be a standard "HTTP/2 connection." Except that is even an implementation detail of the transport in many gRPC implementations. That allows having alternative "connection" types like "inprocess" or QUIC (via Cronet, where there is not a classic "connection" at all).
It is the channel's job to hold all the connections and reconnect as necessary. It delegates part of that responsibility to load balancers and the load balancing APIs do have a concept of connections (subchannels). By not exposing connections to the application, load balancers have a lot of freedom to operate.
I'll note that gRPC C-core based implementations share connections across channels.
What happens to the channel when an RPC throws an exception?
The channel and connection are not impacted by a failed RPC. Note that connection-level failures typically cause RPCs to fail. But things like retries could allow the RPC to be re-sent on a new connection.
What happens to the gRPC connection when the channel is closed?
The connections are closed, eventually. Channel shutdown isn't instantaneous because existing RPCs can continue, and connection shutdown isn't instantaneous either. But once all RPCs complete, the connections are closed. Although C-core won't shut down a connection until no channels are using it.
When is the channel closed?
Only when the user closes it.
When is the gRPC connection closed?
Lots of times. The client may close it when no longer needed. For example, let's say the server IP address changes and the client needs to connect to 1.1.1.2 instead of 1.1.1.1. A new connection will be created and new RPCs will go to the new IP address. The client may also close connections it thinks are dead (e.g., via keepalive timeouts).
Servers have a lot of say in when to close connections. They may close them simply because they are old, or because they have been idle, or because the server is overloaded. But those are simply use cases; the server can shut down a connection at will.
What if the deadline is exceeded?
Deadline only applies to RPCs and doesn't impact the channel or a connection.
I was actually waiting for Eric to answer this as he is the expert in this!
I also have been playing with gRPC for a while now, and I would like to add a few things here for beginners. Anyone more experienced, please feel free to edit!
A channel is an abstraction over a long-lived connection! The client application will create a channel on startup. The channel can be reused/shared among multiple threads; it is thread-safe. One channel is enough (for most use cases) for multiple threads and multiplexing concurrent requests. It is the channel's responsibility to close/reconnect/keep the connection alive, etc. We as users do not have to worry about this in general. The client application can close the channel any time it wants. Channel creation is an expensive process, so we would not open/close one for every RPC.
When you use a gRPC load balancer/name resolver for a domain name and the name resolver resolves the domain to multiple IP addresses, the channel creates multiple subchannels, where each subchannel is an abstraction over a connection to one server. So a channel can also represent multiple connections!
Adding some points to note from Eric's comment.
"adding the default load balancer still only creates (approximately) one connection if the name resolver returns multiple addresses, as the default is pick_first. But if you change the load balancer to round_robin or virtually any other policy, then yes, there will be multiple connections in a channel. Even if a name resolver returns one address, the load balancer is free to create multiple connections (e.g., for higher throughput), but that's not common today"
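A sketch of opting into round_robin in grpc-java (the target URI is a placeholder):

import io.grpc.ManagedChannel;
import io.grpc.ManagedChannelBuilder;

// With round_robin the channel keeps a subchannel (connection) per resolved
// address, instead of the single connection the default pick_first policy uses.
ManagedChannel channel = ManagedChannelBuilder
        .forTarget("dns:///my-service.example.com:50051")  // placeholder target
        .defaultLoadBalancingPolicy("round_robin")
        .usePlaintext()
        .build();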
An underlying connection can be closed at any time, for any reason. For example, the remote server may be shutting down gracefully for scheduled maintenance, or the connection may have been idle for too long; in those cases the server can send a GOAWAY signal to the client, and the client might disconnect and reconnect to some other server. Or the server might crash (due to an OOM error, say); in that case the channel will detect the connection failure and retry, establishing a new connection to some other server.
A channel can keep sending PING frames to the server to keep the connection alive. These are all configurable via the channel builder.
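For example, with grpc-java's channel builder (the address and timings are placeholders):

import io.grpc.ManagedChannel;
import io.grpc.ManagedChannelBuilder;
import java.util.concurrent.TimeUnit;

// Keepalive settings: the channel sends HTTP/2 PING frames to detect dead
// connections and keep idle ones open.
ManagedChannel channel = ManagedChannelBuilder
        .forAddress("localhost", 50051)           // placeholder address
        .usePlaintext()
        .keepAliveTime(30, TimeUnit.SECONDS)      // PING after 30s of inactivity
        .keepAliveTimeout(10, TimeUnit.SECONDS)   // drop the connection if no ack within 10s
        .keepAliveWithoutCalls(true)              // keep pinging even with no active RPCs
        .build();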
With the information above, if we look at your questions:
What happens to the channel when an RPC throws an exception?
Nothing happens to the channel. An unhandled exception on the server might fail the RPC on the client side, but the channel is still usable for any RPC calls.
What happens to the gRPC connection when the channel is closed?
The channel is an abstraction over the connection, so the connection will be closed. (Again, there is no "gRPC connection" as such, as Eric mentioned; it would be an HTTP/2 connection.)
When is the channel closed?
Any time you want. But normally when the application shuts down.
When is the gRPC connection closed?
It is not our problem. Channel takes care of this.
Heartbeats?
The channel sends PING frames periodically to keep the connection alive.
What if the deadline is exceeded?
It is something like a timeout on the client side. When the deadline is exceeded, the client cancels the request. Once again, nothing happens to the channel. (It might trigger an exception on the server side, which I noticed a few times: "Received DATA frame for an unknown stream", https://github.com/grpc/grpc-java/issues/3548. It seems to have been fixed now.)
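A sketch of a per-RPC deadline in grpc-java; GreeterGrpc, HelloRequest and HelloReply stand in for whatever generated stub classes you have, and channel is an existing ManagedChannel:

import io.grpc.Status;
import io.grpc.StatusRuntimeException;
import java.util.concurrent.TimeUnit;

// Deadlines are per-RPC: the stub below fails calls after 2 seconds,
// but the channel behind it stays healthy and reusable.
GreeterGrpc.GreeterBlockingStub stub = GreeterGrpc.newBlockingStub(channel)
        .withDeadlineAfter(2, TimeUnit.SECONDS);
try {
    HelloReply reply = stub.sayHello(HelloRequest.newBuilder().setName("world").build());
} catch (StatusRuntimeException e) {
    if (e.getStatus().getCode() == Status.Code.DEADLINE_EXCEEDED) {
        // the RPC was cancelled client-side; the channel remains usable
    }
}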

Rebus retry policy when RabbitMQ is temporarily down

I have a dockerized microservice architecture where I am using Rebus with RabbitMQ as message bus.
One container is running RabbitMQ. Other containers are running services that communicate with each other via Rebus/RabbitMQ.
I want my solution to be resilient to container restarts so if for example the RabbitMQ container restarts I expect the other services to be unaffected by that.
I expect that messages sent while RabbitMQ is down are queued up for delivery by Rebus in the sending service, and that they are delivered when the RabbitMQ connection is restored.
To verify that, I run this test scenario:
1. Service A sends a message to service B via Rebus and RabbitMQ. That works fine.
2. I stop the RabbitMQ container.
3. Service A sends a message to service B via Rebus and RabbitMQ. That fails because RabbitMQ is unavailable.
4. I start the RabbitMQ container again.
5. I can see that Rebus in my services automatically reconnects to RabbitMQ when it is up. That is as expected.
Now that the RabbitMQ connection is restored I would expect that Rebus sends the pending message from Service A to service B, but it does not.
Is this not expected behaviour of Rebus? If not, can I enable this feature?
I have read this topic: https://github.com/rebus-org/Rebus/wiki/Automatic-retries-and-error-handling
and tried to configure Rebus like this:
Configure.With(...)
.Options(b => b.SimpleRetryStrategy(maxDeliveryAttempts: 10))
.(...)
but with no luck.
The "delivery attempts" you're configuring is how you configure how many Rebus should try to consume a received message before giving up (i.e. moving it to the error queue).
If Rebus loses its connection to the broker, it will not be able to receive anything for the entire duration of the outage, so stopping RabbitMQ effectively pauses all message processing (possibly with some exceptions for messages being handled at the instant RabbitMQ goes away).
Since no Rebus handlers will be running while RabbitMQ is down, you will have to deal with outgoing messages sent from other places, e.g. messages sent/published from a web request.
(...) I expect that messages sent while RabbitMQ is down are queued up for delivery by Rebus (...)
...but Rebus cannot queue anything up, because RabbitMQ is down(*).
The natural thing to do for Rebus in this situation is to give you, the caller, the responsibility of deciding what to do about the problem.
In .NET, that usually means throwing an exception back at you. 🙂
This leaves you with the option of
performing some alternative action, or
retrying some more times, or
whatever makes sense in that particular situation
A simple approach to building some resilience into your system in this case would be to use something like Polly to try sending outgoing messages multiple times in cases where it could fail.
I hope that makes sense. Please let me know if anything needs to be elaborated on. 🙂
(*) Of course Rebus could have "cheated" and queued outgoing messages up in memory, but that would make it very hard for you to write resilient code, because you would not know whether an outgoing message had been safely delivered to the broker, or whether it was just sitting in memory waiting to be saved somewhere.

Will the Spring Kafka API automatically attempt reconnection in case of broker failure?

I have a question regarding the Spring Kafka broker failover mechanism. I was testing by deliberately bringing the brokers down, and I kept getting these "Connection to node -1 could not be established. Brokers may not be available" warnings continuously as soon as the brokers went down. I understand that this is because of broker unavailability. Is reconnection attempted automatically by the API itself, and is there a support document confirming that?
This is all handled by the underlying kafka-clients code; it is outside of Spring's responsibility, and Spring is not informed of the situation.
The client will reconnect when the brokers come back up.
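As a sketch, the reconnect behaviour can be tuned through standard kafka-clients consumer properties (the broker address is a placeholder), which you would then pass to Spring's consumer factory:

import java.util.HashMap;
import java.util.Map;
import org.apache.kafka.clients.consumer.ConsumerConfig;

// Reconnection lives in the kafka-clients layer; these properties tune the
// automatic reconnect backoff used while brokers are unavailable.
Map<String, Object> props = new HashMap<>();
props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");  // placeholder
props.put(ConsumerConfig.RECONNECT_BACKOFF_MS_CONFIG, 1000);       // initial backoff
props.put(ConsumerConfig.RECONNECT_BACKOFF_MAX_MS_CONFIG, 10000);  // exponential cap
// e.g. new DefaultKafkaConsumerFactory<>(props, keyDeserializer, valueDeserializer)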

AMQP, RabbitMQ: how does the Push API work?

I'm trying to gain a deep understanding of how the Push API communication between the client and the RabbitMQ server works.
As far as I know (but correct me if I'm wrong), the client opens a TCP connection to the broker (RabbitMQ) and keeps this connection alive until the client decides to close it. During this connection the client can receive messages immediately.
My question is: during this connection, does the client poll the broker to ask for messages, or does the broker, when a message arrives in the queue the client is subscribed to, simply take that connection and push the data to the client?
First case: the client polls the broker for messages.
Second case: the client doesn't need to poll; the broker just pushes the data.
Or something else?
There are two options to receive messages:
1. The client registers a consumer callback (basicConsume) on the channel; the broker then "pushes" messages to the consumer.
2. The client sends the broker a basicGet and receives one message (if present).
The first use case is the most common.
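A sketch of both styles with the plain RabbitMQ Java client (host and queue name are placeholders):

import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;
import com.rabbitmq.client.GetResponse;

public class PushVsPullDemo {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("localhost");  // placeholder host
        try (Connection conn = factory.newConnection();
             Channel channel = conn.createChannel()) {
            channel.queueDeclare("demo-queue", false, false, false, null);

            // Push: register a callback; the broker delivers messages as they arrive.
            channel.basicConsume("demo-queue", true,
                    (consumerTag, delivery) -> System.out.println(new String(delivery.getBody())),
                    consumerTag -> { });

            // Pull: explicitly fetch one message; returns null if the queue is empty.
            GetResponse response = channel.basicGet("demo-queue", true);
        }
    }
}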
Since you tagged the question with spring-amqp I assume you are interested in Spring. For the first case, Spring AMQP has a listener container (and #RabbitListener annotation); for the second case, one of the RabbitTemplate receive operations can be used.
I suggest you look at the tutorials to get a basic understanding. They cover several languages including pure java and Spring AMQP.
You can also look at the Spring AMQP Reference Manual.
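For completeness, a sketch of the first (push) case with Spring AMQP's listener container; the class and queue name are placeholders:

import org.springframework.amqp.rabbit.annotation.RabbitListener;
import org.springframework.stereotype.Component;

// The listener container registers the consumer and "pushes" each message
// into this method as it arrives.
@Component
public class DemoListener {

    @RabbitListener(queues = "demo-queue")  // placeholder queue
    public void onMessage(String body) {
        System.out.println("Received: " + body);
    }
}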

Problem consuming ActiveMQ messages from Flex client

I am unable to consume messages sent via ActiveMQ from my Flex client. Sending messages via the Producer seems to work, and I can also see that the Flex client is connected and subscribed via the properties on the Consumer object; however, the "message" event on the Consumer is never fired, so it seems the messages are not received.
When I look in the ActiveMQ console, I can see the number of subscribers, the number of messages sent and the number of messages received. The strange thing is that the received-messages counter increments and I can also see the log statements in the Tomcat console, but still no messages arrive in the Flex client.
Any ideas?
After rebuilding my app from scratch with a fresh install of Tomcat, everything seems to work. Maybe this was caused by the fact that I was using the BlazeDS Turnkey version that contains a preconfigured instance of Tomcat.
BTW: This is a great tutorial: http://mmartinsoftware.blogspot.com/2008/05/simplified-blazeds-and-jms.html
