spring-kafka listener shutdown

I'm trying to verify that Kafka listeners are closed gracefully when I shut down my application (either gracefully or in some more aggressive way). Can somebody point me to the place where this is handled? I was looking at the destroy and close methods, but I couldn't locate where the Kafka client deregistration actually happens.
What happens if deregistration takes too long? Or if some other Spring bean's shutdown hook takes too long?
Also, what does this actually mean for events that are currently being processed? Can they commit offsets during shutdown if they happen to finish their work?
I'm working with spring-kafka version 1.3.5.

The current 1.3.x version is 1.3.8; you should upgrade. 1.3.9 will be released early January.
When you stop() a listener container, the container stops in an orderly fashion and processes any records that have already been returned by a poll() (but no further polls are made). However, you should ensure your listener can process any remaining records within the shutdownTimeout (default 10 seconds, but configurable). The stop() operation blocks for up to that time, waiting for the container(s) to stop.
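For example, here is a minimal configuration sketch (the bean and variable names are mine, not from the question) that raises the timeout so a slow listener gets more time to drain the records from the last poll:

import org.springframework.context.annotation.Bean;
import org.springframework.kafka.config.ConcurrentKafkaListenerContainerFactory;
import org.springframework.kafka.core.ConsumerFactory;

// inside a @Configuration class
@Bean
public ConcurrentKafkaListenerContainerFactory<String, String> kafkaListenerContainerFactory(
        ConsumerFactory<String, String> consumerFactory) {
    ConcurrentKafkaListenerContainerFactory<String, String> factory =
            new ConcurrentKafkaListenerContainerFactory<>();
    factory.setConsumerFactory(consumerFactory);
    // stop() will wait up to this long for in-flight records to be processed
    factory.getContainerProperties().setShutdownTimeout(30000); // default 10000 ms
    return factory;
}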
With more modern versions (2.2.x, currently 2.2.2) you can use an ApplicationListener or @EventListener method to consume the ContainerStoppedEvent.
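A minimal sketch of that event hook (the class name is hypothetical):

import org.springframework.context.event.EventListener;
import org.springframework.kafka.event.ContainerStoppedEvent;
import org.springframework.stereotype.Component;

@Component
public class ContainerStopLogger {

    @EventListener
    public void onContainerStopped(ContainerStoppedEvent event) {
        // fired after the container and its consumer thread(s) have stopped
        System.out.println("Container stopped: " + event.getSource());
    }
}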

Related

WebAssembly blocks the web worker thread too

This is related to the previous question, WebAssembly in async code.
Basically, that question is about WebAssembly blocking the main thread, and the answer is to move the WebAssembly code to a web worker. That works.
The problem now is that the WebAssembly blocks onmessage() on the worker.
My long running WebAssembly code has functions like play(), pause(), stop(), etc. The play() checks a pause flag and a stop flag periodically to determine if the play() should return. The pause() and the stop() are used to set those flags.
The JavaScript main thread calls postMessage() to send a message to the worker, which further calls the play().
Since onmessage() is blocked, the worker has no chance to receive further messages to do pause() or stop() until the play() has completed. That defeats the very purpose of pause/stop.
It seems the simple play/pause/stop use case cannot be supported by WebAssembly.
Any comments or suggestions?
By the way, that use case is well supported by the defunct Google PNaCl.
Thanks.
In short: web workers do not ignore messages, even when the worker thread is blocked.
All browser events, including web worker postMessage()/onmessage() events, are queued. This is the fundamental concurrency model of JavaScript (onmessage() dispatch is done in JS even if you use WebAssembly). Have a look at "Concurrency model and Event Loop" on MDN for further detail.
So what happens in your case is: while onmessage() is blocked, messages sent from the main thread via postMessage() are queued automatically. When a single onmessage() job finishes, the worker thread checks its event queue and picks up any message that arrived in the meantime. So you don't need to worry about losing messages, unless a single onmessage() job takes something like 10 seconds and hundreds of events pile up in the queue.
This is how asynchronous execution is done everywhere in the browser.
Considering you are targeting recent browsers (WebAssembly), you can most likely rely on SharedArrayBuffer and Atomics. Have a look at the solutions in Is it possible to pause/resume a web worker externally?, which in your case would need to be handled inside the WebAssembly code (the Atomics.wait part).

How to handle errors and retries in spring-kafka

This is a question related to https://github.com/spring-projects/spring-kafka/issues/575
I'm using spring-kafka 1.3.7 and transactions in a read-process-write cycle.
For this purpose, I should use a KafkaTransactionManager (KTM) on the Spring Kafka container to enable transactions over the whole listener process, with automatic handling of the transactional id based on the partition for zombie fencing (a 1.3.7 change).
If I understand issue #575 correctly, I cannot use a RetryTemplate in a container when using a transaction manager.
How am I supposed to handle errors and retries in such a case?
Is the default behavior with transactions infinite retries? That seems really dangerous; an unexpected exception might simply block the whole process in production.
The upcoming 2.2 release adds recovery to the DefaultAfterRollbackProcessor - so you can stop retrying after some number of attempts.
Docs Here, PR here.
It also provides an optional mechanism to send the failed record to a dead-letter topic.
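A hedged wiring sketch, assuming the 2.2 constructors (the container and template variables are mine): after three failed deliveries, the record is published to a dead-letter topic instead of being retried forever.

import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.kafka.listener.AbstractMessageListenerContainer;
import org.springframework.kafka.listener.DeadLetterPublishingRecoverer;
import org.springframework.kafka.listener.DefaultAfterRollbackProcessor;

public void configureRecovery(AbstractMessageListenerContainer<String, String> container,
        KafkaTemplate<Object, Object> template) {
    // the recoverer publishes the failed record to <topic>.DLT by default
    DeadLetterPublishingRecoverer recoverer = new DeadLetterPublishingRecoverer(template);
    // give up (and recover) after 3 attempts instead of retrying indefinitely
    container.setAfterRollbackProcessor(new DefaultAfterRollbackProcessor<>(recoverer, 3));
}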
If you can't move to 2.2 (release candidate due at the end of this week, with GA in October), you can provide a custom AfterRollbackProcessor with similar functionality.
EDIT
Or, you could add code to keep track of how many times the same record has been delivered, and handle the error yourself in the listener or in its listener-level error handler.
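A minimal sketch of that idea for 1.3.x; everything below is hypothetical application code, not a spring-kafka API:

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import org.apache.kafka.clients.consumer.ConsumerRecord;

public class AttemptTrackingListener {

    private static final int MAX_ATTEMPTS = 3;

    // keyed by topic-partition@offset; concurrent in case of concurrency > 1
    private final Map<String, Integer> attempts = new ConcurrentHashMap<>();

    public void onMessage(ConsumerRecord<String, String> record) {
        String key = record.topic() + "-" + record.partition() + "@" + record.offset();
        int attempt = attempts.merge(key, 1, Integer::sum);
        try {
            process(record);          // business logic; may throw
            attempts.remove(key);     // success, forget this record
        } catch (RuntimeException e) {
            if (attempt >= MAX_ATTEMPTS) {
                attempts.remove(key);
                // give up: log/park the record instead of rethrowing, so the
                // transaction commits and the record is not redelivered
            } else {
                throw e;              // roll back; the record will be redelivered
            }
        }
    }

    private void process(ConsumerRecord<String, String> record) {
        // ... real work ...
    }
}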

Stopping ConcurrentMessageListenerContainer in 1.3.5

I have one question, and it's more about trying to understand things so I can implement them the right way. The requirement is to stop the ConcurrentMessageListenerContainer by invoking stop() on the container, which out of the box iterates over its KafkaMessageListenerContainers (based on the concurrency defined) and invokes stop() for each consumer thread.
Just FYI, I am on 1.3.5 and I cannot migrate to 2.x due to Spring Boot 1.5.x.
Configuration:
Let's say I have a topic with 5 partitions, and concurrency defined as 5 as well. I'm using a BatchListener with a batch size of 100 records for each poll.
Question:
When we invoke stop() on the container, it appears that internally it sets running to false for each KafkaMessageListenerContainer and calls wakeup() on the listenerConsumer:
setRunning(false);
this.listenerConsumer.consumer.wakeup();
During testing, I observed that invoking stop() on the container from a separate thread does the following:
1) It stops the listenerConsumer; that is certainly working. No more polling happens after calling stop().
2) If a listenerConsumer has already polled 100 records and is in the middle of processing them, it completes the execution before stopping.
Is #2 by design, i.e. does invoking stop() on the container only send a wakeup to prevent the next poll? I ask because I don't see any handling of the following in KafkaMessageListenerContainer.run():
catch (WakeupException e) {
//Ignore, we're stopping
}
One more thing: even in spring-kafka 2.1, the ContainerStoppingBatchErrorHandler calls the same container stop(), so I guess it's more a question of my understanding how to handle this scenario.
To conclude (if the above is too much detail): I want to terminate the listener thread when stop() is invoked from a separate thread. I use manual offset commits, so replaying the batch is fine.
Hi Gary,
Regarding your suggestion to use a consumer-aware listener: my question is specifically about the listener being stopped through the container. Once the container invokes stop(), should the listener thread (a BatchListener in this case) be interrupted mid-execution? I know the entire poll of records has been received by the listener, and the question is not about losing offsets, since the ack is at the batch level.
It's not clear what your issue is; the container stops when the listener returns after processing the batch.
If you want to manage offsets at the record level, don't use a BatchMessageListener; use a record-level MessageListener instead.
With the former, the whole batch of records returned by the poll is considered a unit of work; offsets for the entire batch are committed, or not.
Even with a record-level listener, the container will not stop until the current batch of records has been sent to the listener. In that case, what you should do depends on the acknowledge mode. With manual acks, simply ignore the records that are received after the stop. If the container manages the acks, throw an exception for the records you want to discard (with ackOnError=false; the default is true).
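For the batch/manual-ack case in this question, a hedged sketch (the stop-coordination flag is my own construct, not a framework feature):

import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.springframework.kafka.listener.BatchAcknowledgingMessageListener;
import org.springframework.kafka.support.Acknowledgment;

public class StoppableBatchListener implements BatchAcknowledgingMessageListener<String, String> {

    private final AtomicBoolean stopRequested = new AtomicBoolean();

    // call this just before container.stop() from the stopping thread
    public void requestStop() {
        this.stopRequested.set(true);
    }

    @Override
    public void onMessage(List<ConsumerRecord<String, String>> records, Acknowledgment ack) {
        if (this.stopRequested.get()) {
            return; // no ack: the whole batch will be redelivered after a restart
        }
        // ... process the batch ...
        ack.acknowledge();
    }
}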
It's unfortunate that you can't move to a more recent release.
With 2.1, there is much more flexibility. For example, the SeekToCurrentErrorHandler and SeekToCurrentBatchErrorHandler are provided.
The SeekToCurrentErrorHandler extends RemainingRecordsErrorHandler, which means the remaining records from the poll are sent to the error handler instead of the listener. When the container detects a RemainingRecordsErrorHandler, the listener won't get the remaining records in the batch, and the handler can decide what to do: stop the container, perform seeks, etc.
So you no longer need to stop the container when a bad record is received.
This is all designed for handling errors/stopping the container due to some data error.
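A wiring sketch for 2.1 (the setGenericErrorHandler method name is assumed from that era's ContainerProperties API):

import org.springframework.kafka.config.ConcurrentKafkaListenerContainerFactory;
import org.springframework.kafka.listener.SeekToCurrentErrorHandler;

// with this handler installed, a listener failure re-seeks the remaining
// records from the poll so they are redelivered; the container keeps running
ConcurrentKafkaListenerContainerFactory<String, String> factory =
        new ConcurrentKafkaListenerContainerFactory<>();
factory.getContainerProperties().setGenericErrorHandler(new SeekToCurrentErrorHandler());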
There currently is no stopNow() option on the containers - immediately stop calling the listener after the current record is processed and discard any remaining records, so that they will be resent on the next start. Currently, you have to discard them yourself as I described above.
We could consider adding such an option but it would be a 2.2 feature at the earliest.
We only provide bug fixes in 1.3.x and, then, only if there is no work around.
It's open source, so you can always fork it yourself, and make contributions back to the framework.

Canceling JavaFX Service when in SCHEDULED state

I am trying to cancel my JavaFX Service using the cancel() method.
Before I call cancel(), I check the Worker state.
System.out.println(service.getState());
System.out.println(service.cancel());
The cancel() method persistently fails when the Worker is in the SCHEDULED state. cancel() returns false, and the Worker proceeds to the RUNNING state before terminating as SUCCEEDED.
I can't find anything in the docs about cancel() not affecting a SCHEDULED Worker. Is this normal behaviour?
If this is the case, then you should file a bug report, as the behavior you describe directly contradicts the specified behavior in the Worker.State javadoc.
A Worker may be cancelled when READY, SCHEDULED, or RUNNING, in which case the final status will be CANCELLED. When a Worker is cancelled in one of these circumstances it will transition immediately to the CANCELLED state.
Could it be some concurrency problem with my Task?
If you want somebody to answer questions about potential issues in your code, you need to edit your question to provide an MCVE.
Is there any sources that outlines reasons for why a cancel() would fail?
javafx-src.zip is included in the Oracle Java 8 JRE; you can investigate the source code to determine some of the reasons for its behavior.
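For reference, a minimal sketch of a cooperatively cancellable Service (not the asker's code); once RUNNING, a task only stops early if its call() body checks isCancelled():

import javafx.concurrent.Service;
import javafx.concurrent.Task;

public class CancellableService extends Service<Void> {

    @Override
    protected Task<Void> createTask() {
        return new Task<Void>() {
            @Override
            protected Void call() throws Exception {
                for (int i = 0; i < 1_000_000 && !isCancelled(); i++) {
                    // ... one unit of work ...
                }
                return null;
            }
        };
    }
}

// usage, on the FX application thread:
// CancellableService service = new CancellableService();
// service.start();                       // READY -> SCHEDULED -> RUNNING
// boolean cancelled = service.cancel();  // per the javadoc, should return true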

EJB or Servlet - how to add a 'kill switch' to force a process/thread to stop

Kind of an open question that I run into once in a while: if you have an EJB stateful or stateless bean, or possibly a direct servlet process, that may (with the wrong parameters) start running long on a production system, how could you effectively add a manual 'kill switch' so an administrator can stop that specific thread/process?
You can't, or at least you shouldn't, interfere with application server threads directly. So a "kill switch" looks decidedly inappropriate to me in a Java EE environment.
I do, however, understand the problem you have, and would rather suggest taking an asynchronous approach where you split your job into smaller work units.
I did that using EJB timers and was happy with the result: an initial timer is created for the first work unit. When the application server executes the timer, it registers a second one corresponding to the 2nd work unit, and so on. Information can be passed from one work unit to the next, because EJB timers support storing custom information. Also, timer execution and registration are transactional, which works well with a database. You can even shut down and restart the application server with this approach. Before each work unit ran, we checked in the database whether the job had been canceled in the meantime.
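A hedged sketch of that pattern (the bean and helper names are hypothetical; the TimerService calls are standard EJB 3.1):

import javax.annotation.Resource;
import javax.ejb.Stateless;
import javax.ejb.Timeout;
import javax.ejb.Timer;
import javax.ejb.TimerConfig;
import javax.ejb.TimerService;

@Stateless
public class ChunkedJobBean {

    @Resource
    private TimerService timerService;

    public void startJob(int totalUnits) {
        // persistent timer: the chain survives an application-server restart
        timerService.createSingleActionTimer(0L, new TimerConfig(totalUnits, true));
    }

    @Timeout
    public void runWorkUnit(Timer timer) {
        int remaining = (Integer) timer.getInfo();
        if (jobCancelled()) {
            return; // the kill switch: an admin flagged the job in the database
        }
        processOneUnit(); // a bounded amount of work, done transactionally
        if (remaining > 1) {
            // register the next work unit, passing state along in the timer info
            timerService.createSingleActionTimer(0L, new TimerConfig(remaining - 1, true));
        }
    }

    private boolean jobCancelled() {
        return false; // e.g. query a job-control table
    }

    private void processOneUnit() {
        // ... business logic ...
    }
}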
