Spring for Apache Kafka idleBetweenPolls param leads to continuous rebalance - spring-kafka

I've got Spring for Apache Kafka KafkaListenerContainerFactory bean configured with idleBetweenPolls param:
@Bean
public KafkaListenerContainerFactory<ConcurrentMessageListenerContainer<String, Event>> dlqKafkaListenerContainerFactory(
        Map<String, Object> kafkaConfigs
) {
    kafkaConfigs.put(ConsumerConfig.MAX_POLL_INTERVAL_MS_CONFIG, 10 * 60 * 1000);
    var factory = new ConcurrentKafkaListenerContainerFactory<String, Event2Extended<?>>();
    factory.setConsumerFactory(new DefaultKafkaConsumerFactory<>(kafkaConfigs));
    factory.getContainerProperties().setIdleBetweenPolls(5 * 60 * 1000L);
    factory.setErrorHandler(new SeekToCurrentErrorHandler(new FixedBackOff(0L, 0L)));
    factory.setMessageConverter(new StringJsonMessageConverter());
    return factory;
}
My application is running in 4 pods in OpenShift and I have a continuous rebalancing problem.
The problem logs from o.a.k.c.c.internals.AbstractCoordinator are:
Attempt to heartbeat with Generation...and group instance id Optional.empty failed due to UNKNOWN_MEMBER_ID, resetting generation
Attempt to heartbeat failed since group is rebalancing
(Re-)joining group
But the problem disappears if I remove:
factory.getContainerProperties().setIdleBetweenPolls(5 * 60 * 1000L);


Spring Kafka got an infinite retry attempt

Version: Spring-Kafka-2.2.4.RELEASE
Listener setting
@Bean
public ConcurrentKafkaListenerContainerFactory<String, String> kafkaListenerContainerFactory() {
    ConcurrentKafkaListenerContainerFactory<String, String> factory =
            new ConcurrentKafkaListenerContainerFactory<>();
    factory.setConsumerFactory(consumerFactory());
    factory.setRetryTemplate(getRetryTemplate());
    factory.setConcurrency(concurrency);
    return factory;
}
A RetryTemplate with a SimpleRetryPolicy is used to set the max attempts.
Kafka Listener
@KafkaListener(topics = "topic")
public void myListener(ConsumerRecord<String, String> record) {
    //Code here
}
A strange issue occurs on the server, where multiple instances are running. Locally, whenever myListener throws an exception, the message is retried up to the max attempts and then ACKed as expected.
But on the server, the following happens:
Node A: Consume #1 -> Retry #1 -> Retry #1 -> Consume #2 -> Retry #2 -> Retry #2 -> Consume #3 ....
Node B:...................................... Consume #1 -> Retry #1 -> Retry #1 -> Consume #2 ....
And so on, without any ACK. After 500 messages (e.g. Consume #500), node A resets back to Consume #1. I'm not sure whether the number 500 is related to max.poll.records.
I still can't reproduce the issue locally, where it works fine (ACK after max attempts).
Is there an explanation or solution for this problem? The current workaround is to catch the exception and bypass the retry mechanism, so that the message is ACKed and the offset moves forward.
Thanks!

Spring cloud stream handling poison pills with Kafka DLT

spring-boot 2.5.2
spring-cloud Hoxton.SR12
spring-kafka 2.6.7 (downgraded due to issue: https://github.com/spring-cloud/spring-cloud-stream-binder-kafka/issues/1079)
I'm following this recipe to handle deserialisation errors: https://github.com/spring-cloud/spring-cloud-stream-samples/blob/main/recipes/recipe-3-handling-deserialization-errors-dlq-kafka.adoc
I created the beans mentioned in the recipe above as:
@Configuration
@Slf4j
public class ErrorHandlingConfig {
    @Bean
    public ListenerContainerCustomizer<AbstractMessageListenerContainer<byte[], byte[]>> customizer(SeekToCurrentErrorHandler errorHandler) {
        return (container, dest, group) -> {
            container.setErrorHandler(errorHandler);
        };
    }
    @Bean
    public SeekToCurrentErrorHandler errorHandler(DeadLetterPublishingRecoverer deadLetterPublishingRecoverer) {
        return new SeekToCurrentErrorHandler(deadLetterPublishingRecoverer);
    }
    @Bean
    public DeadLetterPublishingRecoverer publisher(KafkaOperations bytesTemplate) {
        return new DeadLetterPublishingRecoverer(bytesTemplate);
    }
}
configuration file:
spring:
  cloud:
    stream:
      default:
        producer:
          useNativeEncoding: true
        consumer:
          useNativeDecoding: true
      bindings:
        myInboundRoute:
          destination: some-destination.1
          group: a-custom-group
        myOutboundRoute:
          destination: some-destination.2
      kafka:
        binder:
          brokers: localhost
          defaultBrokerPort: 9092
          configuration:
            application:
              security: PLAINTEXT
        bindings:
          myInboundRoute:
            consumer:
              autoCommitOffset: true
              startOffset: latest
              enableDlq: true
              dlqName: my-dql.poison
              dlqProducerProperties:
                configuration:
                  value.serializer: myapp.serde.MyCustomSerializer
              configuration:
                value.deserializer: org.springframework.kafka.support.serializer.ErrorHandlingDeserializer
                spring.deserializer.value.delegate.class: myapp.serde.MyCustomSerializer
          myOutboundRoute:
            producer:
              configuration:
                key.serializer: org.apache.kafka.common.serialization.StringSerializer
                value.serializer: myapp.serde.MyCustomSerializer
I was expecting the DLT to be called my-dql.poison. This topic is in fact created fine, however I also get a second topic auto created called some-destination.1.DLT
Why does it create this as well as the one I have named in the config with dlqName ?
What am I doing wrong? When I poll for messages, the message is in the auto created some-destination.1.DLT and not my dlqName
You should not configure DLT processing in the binding if you configure the SeekToCurrentErrorHandler (STCEH) in the container. Also set maxAttempts=1 to disable retries there.
You need to configure a destination resolver in the DeadLetterPublishingRecoverer (DLPR) to use a different name.
/**
 * Create an instance with the provided template and destination resolving function,
 * that receives the failed consumer record and the exception and returns a
 * {@link TopicPartition}. If the partition in the {@link TopicPartition} is less than
 * 0, no partition is set when publishing to the topic.
 * @param template the {@link KafkaOperations} to use for publishing.
 * @param destinationResolver the resolving function.
 */
public DeadLetterPublishingRecoverer(KafkaOperations<? extends Object, ? extends Object> template,
        BiFunction<ConsumerRecord<?, ?>, Exception, TopicPartition> destinationResolver) {
    this(Collections.singletonMap(Object.class, template), destinationResolver);
}
See https://docs.spring.io/spring-kafka/docs/current/reference/html/#dead-letters
There is an open issue to configure the DLPR with the binding's DLT name.
https://github.com/spring-cloud/spring-cloud-stream-binder-kafka/issues/1031
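As a rough sketch of the destination-resolver suggestion, the publisher bean from the question could pass a resolver to the two-argument constructor quoted above. The class name DltNamingConfig is hypothetical, the topic name is the one from the question, and a negative partition is used because, per the Javadoc above, no partition is then set when publishing. This is an untested fragment that assumes spring-kafka and kafka-clients are on the classpath:

```java
import org.apache.kafka.common.TopicPartition;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.core.KafkaOperations;
import org.springframework.kafka.listener.DeadLetterPublishingRecoverer;

@Configuration
public class DltNamingConfig { // hypothetical class name, for illustration only

    @Bean
    public DeadLetterPublishingRecoverer publisher(KafkaOperations<?, ?> bytesTemplate) {
        // Resolve every failed record to the custom topic instead of "<original topic>.DLT".
        // Partition -1 (< 0) leaves the partition unset, as described in the constructor's Javadoc.
        return new DeadLetterPublishingRecoverer(bytesTemplate,
                (record, exception) -> new TopicPartition("my-dql.poison", -1));
    }
}
```

With this in place, the binding-level enableDlq and dlqName settings would presumably be removed, in line with the advice above not to configure DLT processing in both places.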

How can I send the records causing a DeserializationException to a DLT while consuming from a Kafka topic using SeekToCurrentErrorHandler?

I'm using spring boot 2.1.7.RELEASE and spring-kafka 2.2.8.RELEASE. We are in the process of upgrading the spring boot version but for now, we are using this spring-kafka version.
I'm using the @KafkaListener annotation to create a consumer with all default settings, together with the configuration below, as specified in the Spring for Apache Kafka documentation.
// other props
props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, ErrorHandlingDeserializer2.class);
props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, ErrorHandlingDeserializer2.class);
props.put(ErrorHandlingDeserializer.KEY_DESERIALIZER_CLASS, StringDeserializer.class);
props.put(ErrorHandlingDeserializer.VALUE_DESERIALIZER_CLASS, AvroDeserializer.class.getName());
return new DefaultKafkaConsumerFactory<>(props);
I've implemented a custom error handler by extending SeekToCurrentErrorHandler, to capture the records causing a deserialization exception and send them to a DLT.
The problem is that when I test this logic with 30 messages, where every other message causes a deserialization exception, the list passed to the handle method contains all 30 messages instead of only the 15 that cause the exception. How can I get just the records with exceptions? Please suggest.
Here is my custom SeekToCurrentErrorHandler code
@Component
public class MySeekToCurrentErrorHandler extends SeekToCurrentErrorHandler {
    private final MyDeadLetterRecoverer deadLetterRecoverer;
    @Autowired
    public MySeekToCurrentErrorHandler(MyDeadLetterRecoverer deadLetterRecoverer) {
        super(-1);
        this.deadLetterRecoverer = deadLetterRecoverer;
    }
    @Override
    public void handle(Exception thrownException, List<ConsumerRecord<?, ?>> data, Consumer<?, ?> consumer, MessageListenerContainer container) {
        if (thrownException instanceof DeserializationException) {
            //Improve to support multiple records
            DeserializationException deserializationException = (DeserializationException) thrownException;
            deadLetterRecoverer.accept(data.get(0), deserializationException);
            ConsumerRecord<?, ?> consumerRecord = data.get(0);
            System.out.println(consumerRecord.key());
            System.out.println(consumerRecord.value());
        } else {
            //Calling super method to let the 'SeekToCurrentErrorHandler' do what it is actually designed for
            super.handle(thrownException, data, consumer, container);
        }
    }
}
We have to pass all the remaining records, so that the SeekToCurrentErrorHandler (STCEH) can re-seek all partitions for the records that weren't processed.
After you recover the failed record, use SeekUtils to seek the remaining records (remove the one that you have recovered from the list).
Set recoverable to false so that doSeeks() doesn't try to recover the new first record.
/**
 * Seek records to earliest position, optionally skipping the first.
 * @param records the records.
 * @param consumer the consumer.
 * @param exception the exception
 * @param recoverable true if skipping the first record is allowed.
 * @param skipper function to determine whether or not to skip seeking the first.
 * @param logger a {@link Log} for seek errors.
 * @return true if the failed record was skipped.
 */
public static boolean doSeeks(List<ConsumerRecord<?, ?>> records, Consumer<?, ?> consumer, Exception exception,
        boolean recoverable, BiPredicate<ConsumerRecord<?, ?>, Exception> skipper, Log logger) {
You won't need all this code when you move to a more recent version (Boot 2.1 and Spring for Apache Kafka 2.2 are no longer supported).
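A possible shape for the handle method following that advice is sketched below. It is an untested fragment meant to replace the method inside the MySeekToCurrentErrorHandler class from the question: MyDeadLetterRecoverer is the asker's own type, the LOG field is an assumed commons-logging Log declared in that class, and the import for SeekUtils depends on the spring-kafka version in use. Offset-commit behavior for the recovered record is not covered here.

```java
// Assumed field in the same class (commons-logging, matching the Log parameter of doSeeks):
// private static final Log LOG = LogFactory.getLog(MySeekToCurrentErrorHandler.class);

@Override
public void handle(Exception thrownException, List<ConsumerRecord<?, ?>> data,
        Consumer<?, ?> consumer, MessageListenerContainer container) {
    if (thrownException instanceof DeserializationException && !data.isEmpty()) {
        // The first record in the list is the one that failed; recover (publish) it.
        deadLetterRecoverer.accept(data.get(0), (DeserializationException) thrownException);
        // The rest of the list is unprocessed, not failed: drop the recovered record
        // and seek back to the remaining ones so they are redelivered by the next poll.
        List<ConsumerRecord<?, ?>> remaining = new ArrayList<>(data.subList(1, data.size()));
        // recoverable = false, so doSeeks() does not try to recover the new first record;
        // the skipper is therefore never consulted and simply returns false.
        SeekUtils.doSeeks(remaining, consumer, thrownException, false, (record, ex) -> false, LOG);
    } else {
        super.handle(thrownException, data, consumer, container);
    }
}
```

This also explains the observation in the question: the list holds the failed record followed by everything else returned by the poll, so its size says nothing about how many records are actually bad.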

Options to reduce processing rate using Spring Kafka

I'm using Spring Boot 2.2.7 and Spring Kafka. I have a KafkaListener which is continuously processing stats data from a topic and writing the data into MongoDB and Elasticsearch (using Spring Data).
My configuration is as follows:
@Configuration
public class StatListenerConfig {
    @Autowired
    private KafkaConfig kafkaConfig;
    @Bean
    public ConsumerFactory<String, StatsRequestDto> statsConsumerFactory() {
        return new DefaultKafkaConsumerFactory<>(kafkaConfig.statsConsumerConfigs());
    }
    @Bean
    public ConcurrentKafkaListenerContainerFactory<String, StatsRequestDto> kafkaStatsListenerContainerFactory() {
        ConcurrentKafkaListenerContainerFactory<String, StatsRequestDto> factory = new ConcurrentKafkaListenerContainerFactory<>();
        factory.setConsumerFactory(statsConsumerFactory());
        factory.getContainerProperties().setAckMode(AckMode.RECORD);
        return factory;
    }
}
@Service
public class StatListener {
    private static final Logger LOGGER = LoggerFactory.getLogger(StatListener.class);
    @Autowired
    private StatsService statsService;
    @KafkaListener(topics = "${kafka.topic.stats}", containerFactory = "kafkaStatsListenerContainerFactory")
    public void receive(@Payload StatsRequestDto data) {
        Stat stats = statsService.convertToStats(data);
        statsService.save(stats).get();
    }
}
The save method is an async method.
The problem I am having is that when the queue is being processed, Elasticsearch CPU consumption is around 250%. This leads to sporadic timeout errors across the application. I am looking into how I can optimise Elasticsearch, as indexing can cause CPU spikes.
I wanted to check that if I use an async method (like above), the next message from the topic will not be processed until the previous one has completed. If that is correct, what options are there in Spring Kafka that I could use to relieve pressure on a downstream operation that might take time to complete?
Any advice would be much appreciated.
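On the first part of the question, a small plain-Java sketch (no Kafka involved) may help illustrate why blocking on the returned future serializes processing. It assumes save returns a future, as the .get() call in the listener suggests; saveAsync and receive are hypothetical stand-ins for statsService.save and the listener method, and join() is used in place of get() only to avoid checked exceptions, since both block the calling thread.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;

public class BlockingSaveSketch {

    // Hypothetical stand-in for statsService.save(...): the work runs on another thread.
    static CompletableFuture<Void> saveAsync(int id, List<String> events) {
        return CompletableFuture.runAsync(() -> {
            try {
                Thread.sleep(20); // simulated slow downstream write
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            events.add("saved-" + id);
        });
    }

    // Hypothetical stand-in for the listener: it blocks until the save completes.
    static void receive(int id, List<String> events) {
        events.add("received-" + id);
        saveAsync(id, events).join(); // blocks the calling thread, like get()
    }

    // Delivers records one at a time, as a single listener thread would.
    static List<String> run(int count) {
        List<String> events = Collections.synchronizedList(new ArrayList<>());
        for (int id = 1; id <= count; id++) {
            receive(id, events);
        }
        return new ArrayList<>(events);
    }

    public static void main(String[] args) {
        System.out.println(run(3));
    }
}
```

Because each call waits for its own save, every "received" event is followed by the matching "saved" event before the next record is handled. Without the blocking call, the listener would return immediately and saves could pile up downstream.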
In version 2.3, we added the idleBetweenPolls container property.
With earlier versions, you could simulate that by, say, sleeping in the consumer for some time after some number of records.
You just need to be sure the sleep+processing time for the records returned by a poll does not exceed max.poll.interval.ms, to avoid a rebalance.
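The timing constraint can be checked with simple arithmetic. The sketch below is plain Java with hypothetical helper names (PollPacing, fitsMaxPollInterval, batchTimeMs); it only models the budget and does not call any Kafka or Spring API.

```java
public final class PollPacing {

    private PollPacing() {
    }

    // True if the idle time plus the time spent processing one poll's records
    // stays below max.poll.interval.ms.
    static boolean fitsMaxPollInterval(long idleMs, long processingMs, long maxPollIntervalMs) {
        return idleMs + processingMs < maxPollIntervalMs;
    }

    // For the pre-2.3 workaround: total time for one batch when the listener
    // sleeps pauseMs after every `every` records.
    static long batchTimeMs(int records, long perRecordMs, int every, long pauseMs) {
        long pauses = records / every;
        return records * perRecordMs + pauses * pauseMs;
    }

    public static void main(String[] args) {
        long maxPollIntervalMs = 10 * 60 * 1000L;
        long idleMs = 5 * 60 * 1000L;
        long processingMs = batchTimeMs(500, 100, 100, 1000);
        System.out.println("batch time ms: " + processingMs);
        System.out.println("fits: " + fitsMaxPollInterval(idleMs, processingMs, maxPollIntervalMs));
    }
}
```

For example, a 5-minute idle with a 10-minute max.poll.interval.ms leaves less than 5 minutes to process everything returned by a single poll; lowering max.poll.records is one way to shrink that processing time.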

Configure kafka topic retention policy during creation in spring-mvc?

Configure retention policy of all topics during creation
I'm trying to configure retention.ms using Spring, as I get this error:
Caused by: java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.PolicyViolationException: Invalid retention.ms specified. The allowed range is [3600000..2592000000]
From what I've read, the new value is -1 (infinity), which is outside that range.
Following "How to configure kafka topic retention policy during creation in spring-mvc?", I added the code below, but it seems to have no effect.
Any ideas or hints on how I might solve this?
ApplicationConfigurationTest.java
@Test
public void kafkaAdmin () {
    KafkaAdmin admin = configuration.admin();
    assertThat(admin, instanceOf(KafkaAdmin.class));
}
ApplicationConfiguration.java
@Bean
public KafkaAdmin admin() {
    Map<String, Object> configs = new HashMap<>();
    configs.put(TopicConfig.RETENTION_MS_CONFIG, "1680000");
    return new KafkaAdmin(configs);
}
Found the solution: set the value
spring.kafka.streams.topic.retention.ms: 86400000
in application.yml.
Our application uses Spring MVC, hence the Spring notation.
topic.retention.ms is the value that needs to be set in the streams config.
86400000 is an arbitrary value, chosen only because it is within the range [3600000..2592000000].
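To make the range check concrete, here is a small plain-Java sketch that tests candidate retention.ms values against the bounds quoted in the PolicyViolationException. The class and method names are hypothetical; the bounds are copied from the error message and correspond to 1 hour and 30 days.

```java
public final class RetentionRange {

    // Bounds taken from the broker's error message: [3600000..2592000000]
    static final long MIN_MS = 3_600_000L;      // 1 hour
    static final long MAX_MS = 2_592_000_000L;  // 30 days

    private RetentionRange() {
    }

    // True if the value would pass the broker's retention.ms policy range.
    static boolean isAllowed(long retentionMs) {
        return retentionMs >= MIN_MS && retentionMs <= MAX_MS;
    }

    public static void main(String[] args) {
        long[] candidates = {-1L, 1_680_000L, 86_400_000L};
        for (long value : candidates) {
            System.out.println(value + " allowed: " + isAllowed(value));
        }
    }
}
```

Note that the value 1680000 used in the KafkaAdmin snippet from the question (28 minutes) is itself below the allowed minimum, whereas 86400000 (one day) is inside the range.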
