How to analyze and solve the lag problem of Spring-Kafka Consumer? - spring-kafka

I'm using spring-kafka to build my consumer. The consumer seems to work fine, but most of the time it lags behind by about 1000 records, and the lag sometimes reaches 3000 records. From the INFO logs I can see that the consumer frequently calls KafkaConsumer.seek(), sometimes seeking to the same offset several times. I'm not sure whether that is normal. Is there a good way to find the main bottleneck that causes the lag?
@Configuration
public class KafkaConsumerConfig {

    @Value("${spring.kafka.bootstrap-servers}")
    private String bootstrapServers;

    public Map<String, Object> consumerConfig() {
        Map<String, Object> props = new HashMap<>();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
        // Deserializers (not serializers) are required on the consumer side.
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        return props;
    }

    @Bean
    public ConsumerFactory<String, String> consumerFactory() {
        return new DefaultKafkaConsumerFactory<>(consumerConfig());
    }

    @Bean
    public KafkaListenerContainerFactory<ConcurrentMessageListenerContainer<String, String>> factory(
            ConsumerFactory<String, String> consumerFactory) {
        ConcurrentKafkaListenerContainerFactory<String, String> factory = new ConcurrentKafkaListenerContainerFactory<>();
        factory.setConsumerFactory(consumerFactory);
        return factory;
    }
}
Edit:
I found the problem. The lag was caused by too many retries. I disabled retries after exceptions and the problem is solved.
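For reference, a minimal sketch of how retries can be disabled (assuming spring-kafka 2.8+, which uses DefaultErrorHandler; earlier versions use SeekToCurrentErrorHandler instead; the bean shown is the container factory from above):

@Bean
public KafkaListenerContainerFactory<ConcurrentMessageListenerContainer<String, String>> factory(
        ConsumerFactory<String, String> consumerFactory) {
    ConcurrentKafkaListenerContainerFactory<String, String> factory = new ConcurrentKafkaListenerContainerFactory<>();
    factory.setConsumerFactory(consumerFactory);
    // FixedBackOff(0L, 0L) = no delay and zero retries: a failed record is
    // logged and skipped instead of being re-seeked and replayed.
    factory.setCommonErrorHandler(new DefaultErrorHandler(new FixedBackOff(0L, 0L)));
    return factory;
}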

With the seek() function, the Kafka API lets you seek to a specific offset, instead of using seekToBeginning(TopicPartition tp) or seekToEnd(TopicPartition tp).
When a consumer starts, or when new partitions are assigned, it can call seek() to a specific offset and start processing messages from there.
So in your case, your consumer may be seeking to a particular offset and skipping events (going back a few messages or skipping ahead a few), and it may also not have a solid commit process in place; hence you see the consumer lag.
Find the reason/use case for the seek in your code. You can even remove it and test once without it.
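In spring-kafka, explicit seeks are usually done through a ConsumerSeekAware listener, so that is the kind of code to look for if the logs show repeated seek() calls (a minimal sketch, assuming spring-kafka 2.3+ where the other ConsumerSeekAware methods have defaults; the topic name and the seek-to-beginning logic are illustrative):

@Component
public class SeekingListener implements ConsumerSeekAware {

    @Override
    public void onPartitionsAssigned(Map<TopicPartition, Long> assignments, ConsumerSeekCallback callback) {
        // Explicit seeks happen here on every assignment; repeated seeks to the
        // same offset can also come from an error handler re-seeking after a
        // failed record, as in the edit above.
        assignments.keySet().forEach(tp -> callback.seek(tp.topic(), tp.partition(), 0L));
    }

    @KafkaListener(topics = "my-topic", containerFactory = "factory")
    public void listen(String message) {
        // process the record
    }
}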

Related

How to manually commit the offset of a record that was not recovered but was already sent to the DLT through CommonErrorHandler

I am building a simple example with Spring Kafka.
If an exception occurs in the service layer, I want to commit the original offset after the retries are exhausted and the record has been published to the dead letter topic.
However, while the dead letter topic is populated properly, the original message stays pending in Kafka because the offset commit is not processed.
My code is as follows.
KafkaConfig.java
...
@Bean
public KafkaListenerContainerFactory<ConcurrentMessageListenerContainer<String, String>> kafkaListenerContainerFactory() {
    ConcurrentKafkaListenerContainerFactory<String, String> factory = new ConcurrentKafkaListenerContainerFactory<>();
    factory.setConsumerFactory(consumerFactory());
    factory.setCommonErrorHandler(kafkaListenerErrorHandler());
    factory.getContainerProperties().setAckMode(AckMode.MANUAL_IMMEDIATE);
    return factory;
}

private CommonErrorHandler kafkaListenerErrorHandler() {
    DefaultErrorHandler defaultErrorHandler = new DefaultErrorHandler(
            new DeadLetterPublishingRecoverer(template, DEAD_TOPIC_DESTINATION_RESOLVER),
            new FixedBackOff(1000, 3));
    defaultErrorHandler.setCommitRecovered(true);
    defaultErrorHandler.setAckAfterHandle(true);
    defaultErrorHandler.setResetStateOnRecoveryFailure(false);
    return defaultErrorHandler;
}
...
KafkaListener.java
...
@KafkaListener(topics = TOPIC_NAME, containerFactory = "kafkaListenerContainerFactory", groupId = "stock-adjustment-0")
public void subscribe(final String message, Acknowledgment ack) throws IOException {
    log.info(String.format("Message Received : [%s]", message));

    StockAdjustment stockAdjustment = StockAdjustment.deserializeJSON(message);
    if (stockService.isAlreadyProcessedOrderId(stockAdjustment.getOrderId())) {
        log.info(String.format("AlreadyProcessedOrderId : [%s]", stockAdjustment.getOrderId()));
    } else {
        if (stockAdjustment.getAdjustmentType().equals("REDUCE")) {
            stockService.decreaseStock(stockAdjustment);
        }
    }

    ack.acknowledge(); // <<< does not work!
}
...
StockService.java
...
if (stockAdjustment.getQty() > stock.getAvailableStockQty()) {
    throw new RuntimeException(String.format("Stock decreased Request [decreasedQty: %s][availableQty : %s]", stockAdjustment.getQty(), stock.getAvailableStockQty()));
}
...
When a RuntimeException occurs in the service layer as above, the record is published to the DLT through the CommonErrorHandler according to the Kafka settings.
However, even after the record is published to the DLT, the original message remains unacknowledged in Kafka, so I need a solution.
I looked into it and found that my configuration is processed through SeekUtils.seekOrRecover(); if the record is still not processed after the maximum number of attempts, an exception is thrown and the consumer seeks back to the original offset without committing.
According to the documentation, the AfterRollbackProcessor handles the rollback with the default settings, but I don't know how to write the code so that the offset is committed even when processing fails.
EDITED
The above code and settings actually work normally.
I thought consumer lag would occur, but when I stepped through the actual logic (SeekUtils.seekOrRecover()) and checked the offset commits and the lag, I confirmed that everything works as expected.
I think the confusion was my own mistake.

Records are never removed from the topic (until they expire); only the consumer's committed offset is updated.
Use kafka-consumer-groups.sh to describe the group and see the committed offset for the failed record that was sent to the DLT.
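For example, assuming a local broker and the group id used by the listener above:

kafka-consumer-groups.sh --bootstrap-server localhost:9092 --describe --group stock-adjustment-0

This lists each partition's committed offset, the log-end offset, and the resulting lag for the group.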

Spring Integration with Kafka not sending messages

I am working on Spring Kafka with the Java DSL and I see that the messages are not being produced/sent to the Kafka topic.
The code I have been using is:
@Bean
public IntegrationFlow sendToKafkaFlow() {
    return IntegrationFlows.from(kafkaPublishChannel)
            .handle(kafkaMessageHandler())
            .get();
}

private KafkaProducerMessageHandlerSpec<String, Object, ?> kafkaMessageHandler() {
    return Kafka
            .outboundChannelAdapter(_kafkaProducerFactory.getKafkaTemplate().getProducerFactory())
            .messageKey(m -> m
                    .getHeaders()
                    .getId())
            //.headerMapper(mapper())
            .topic(_topicConfiguration.getCheProgressUpdateTopic())
            .configureKafkaTemplate(t -> t.getTemplate());
}

@Bean
public DefaultKafkaHeaderMapper mapper() {
    return new DefaultKafkaHeaderMapper();
}
The producer configurations I am using are:
private ProducerFactory<String, Object> producerFactory() {
    final Map<String, Object> producerProps = new HashMap<>();
    producerProps.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, kafkaProducerConfiguration.getKafkaServerProducerHost());
    producerProps.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
    producerProps.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
    producerProps.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, kafkaProducerConfiguration.isKafkaProducerIdempotentEnabled());
    producerProps.put(ProducerConfig.ACKS_CONFIG, kafkaProducerConfiguration.getKafkaProducerAcks());
    producerProps.put(ProducerConfig.RETRIES_CONFIG, kafkaProducerConfiguration.getKafkaProducerRetries());
    producerProps.put(ProducerConfig.BATCH_SIZE_CONFIG, kafkaProducerConfiguration.getKafkaProducerBatchSize());
    producerProps.put(ProducerConfig.LINGER_MS_CONFIG, kafkaProducerConfiguration.getKafkaProducerLingerMs());
    return new DefaultKafkaProducerFactory<>(producerProps);
}
Not sure why I am not seeing the messages in the kafka topic. Can you please help me out here?
Try to use sync(true) on that Kafka.outboundChannelAdapter(). I believe errors should surface if you don't see any progress during sending. You may also consider using a DEBUG logging level for the org.springframework.integration category to see how your messages travel through your integration flow.
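Applied to the handler spec from the question, that would look roughly like this (a sketch reusing the same identifiers; sync(true) makes the adapter block until the send result is known):

private KafkaProducerMessageHandlerSpec<String, Object, ?> kafkaMessageHandler() {
    return Kafka
            .outboundChannelAdapter(_kafkaProducerFactory.getKafkaTemplate().getProducerFactory())
            .messageKey(m -> m.getHeaders().getId())
            .topic(_topicConfiguration.getCheProgressUpdateTopic())
            // Wait for the send result so broker/serialization errors are thrown
            // into the flow instead of being silently dropped.
            .sync(true)
            .configureKafkaTemplate(t -> t.getTemplate());
}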

Options to reduce processing rate using Spring Kafka

I'm using Spring Boot 2.2.7 and Spring Kafka. I have a KafkaListener that continuously processes stats data from a topic and writes it into MongoDB and Elasticsearch (using Spring Data).
My configuration is as follows:
@Configuration
public class StatListenerConfig {

    @Autowired
    private KafkaConfig kafkaConfig;

    @Bean
    public ConsumerFactory<String, StatsRequestDto> statsConsumerFactory() {
        return new DefaultKafkaConsumerFactory<>(kafkaConfig.statsConsumerConfigs());
    }

    @Bean
    public ConcurrentKafkaListenerContainerFactory<String, StatsRequestDto> kafkaStatsListenerContainerFactory() {
        ConcurrentKafkaListenerContainerFactory<String, StatsRequestDto> factory = new ConcurrentKafkaListenerContainerFactory<>();
        factory.setConsumerFactory(statsConsumerFactory());
        factory.getContainerProperties().setAckMode(AckMode.RECORD);
        return factory;
    }
}

@Service
public class StatListener {

    private static final Logger LOGGER = LoggerFactory.getLogger(StatListener.class);

    @Autowired
    private StatsService statsService;

    @KafkaListener(topics = "${kafka.topic.stats}", containerFactory = "kafkaStatsListenerContainerFactory")
    public void receive(@Payload StatsRequestDto data) {
        Stat stats = statsService.convertToStats(data);
        statsService.save(stats).get();
    }
}
The save method is an async method.
The problem I am having is that while the queue is being processed, Elasticsearch CPU consumption sits around 250%. This leads to sporadic timeout errors across the application. I am looking into how I can optimise Elasticsearch, as indexing can cause CPU spikes.
I wanted to check that, with an async method like the one above, the next message from the topic is not processed until the previous one has completed. If that is correct, what options does Spring Kafka offer to relieve pressure on a downstream operation that might take time to complete?
Any advice would be much appreciated.
In version 2.3, we added the idleBetweenPolls container property.
With earlier versions, you could simulate that by, say, sleeping in the consumer for some time after some number of records.
You just need to be sure the sleep plus processing time for the records returned by a poll does not exceed max.poll.interval.ms, to avoid a rebalance.
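On 2.3+, that could look roughly like this in the container factory from the question (the 5-second pause is just an illustration):

@Bean
public ConcurrentKafkaListenerContainerFactory<String, StatsRequestDto> kafkaStatsListenerContainerFactory() {
    ConcurrentKafkaListenerContainerFactory<String, StatsRequestDto> factory = new ConcurrentKafkaListenerContainerFactory<>();
    factory.setConsumerFactory(statsConsumerFactory());
    factory.getContainerProperties().setAckMode(AckMode.RECORD);
    // Wait 5 seconds between polls to throttle consumption and relieve
    // pressure on the downstream MongoDB/Elasticsearch writes.
    factory.getContainerProperties().setIdleBetweenPolls(5000L);
    return factory;
}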

Custom metrics confusion

I added https://micrometer.io to our staging server in Google Cloud. The metric does not show up under the "Cloud Run Revision" resource type; it is only visible if I select "Global", as seen here...
The instructions were very simple and very clear (much unlike OpenCensus, which has an over-designed API). In fact, unlike OpenCensus, it worked out of the box, except that it is not recording into "Cloud Run Revision".
I can't even choose the service_name in the filter, so once I deploy to production, the metric will record BOTH prod and staging, which is not what we want.
How do I debug Micrometer further?
If anyone knows offhand what the issue might be, that would be great as well (though I don't mind learning Micrometer and debugging it a bit more).
For now the only available monitored-resource types in your custom metrics are:
aws_ec2_instance: Amazon EC2 instance.
dataflow_job: Dataflow job.
gce_instance: Compute Engine instance.
gke_container: GKE container instance.
generic_node: User-specified computing node.
generic_task: User-defined task.
global: Use this resource when no other resource type is suitable. For most use cases, generic_node or generic_task are better choices than global.
k8s_cluster: Kubernetes cluster.
k8s_container: Kubernetes container.
k8s_node: Kubernetes node.
k8s_pod: Kubernetes pod.
So, global is the correct monitored-resource type in this case, since there is no Cloud Run monitored-resource type yet.
To identify the metrics better, you can create metric descriptors, either automatically or manually.
For completeness, I have it recording all the JVM stats now, but I have a new post on aggregation on Google's site here, which seems to be a new issue...
Google Cloud Metrics and MicroMeter JVM reporting (is this a Micrometer bug or?)
My code that did the trick was the following (and using revisionName is CRITICAL for not getting errors!):
String projectId = MetadataConfig.getProjectId();
String service = System.getenv("K_SERVICE");
String revisionName = System.getenv("K_REVISION");
String config = System.getenv("K_CONFIGURATION");
String zone = MetadataConfig.getZone();
Map<String, String> map = new HashMap<>();
map.put("namespace", service);
map.put("job", "nothing");
map.put("task_id", revisionName);
map.put("location", zone);
log.info("project="+projectId+" svc="+service+" r="+revisionName+" config="+config+" zone="+zone);
StackdriverConfig stackdriverConfig = new OurGoogleConfig(projectId, map);
//figure out how to put in template better
MeterRegistry googleRegistry = StackdriverMeterRegistry.builder(stackdriverConfig).build();
Metrics.addRegistry(googleRegistry);
//This is what would be used in Development Server
//Metrics.addRegistry(new SimpleMeterRegistry());
//How to expose on #backend perhaps at /#metrics
CompositeMeterRegistry registry = Metrics.globalRegistry;
new ClassLoaderMetrics().bindTo(registry);
new JvmMemoryMetrics().bindTo(registry);
new JvmGcMetrics().bindTo(registry);
new ProcessorMetrics().bindTo(registry);
new JvmThreadMetrics().bindTo(registry);
and then the config is simple...
private static class OurGoogleConfig implements StackdriverConfig {

    private String projectId;
    private Map<String, String> resourceLabels;

    public OurGoogleConfig(String projectId, Map<String, String> resourceLabels) {
        this.projectId = projectId;
        this.resourceLabels = resourceLabels;
    }

    @Override
    public String projectId() {
        return projectId;
    }

    @Override
    public String get(String key) {
        return null;
    }

    @Override
    public String resourceType() {
        return "generic_task";
    }

    @Override
    public Map<String, String> resourceLabels() {
        // they call this EVERY time, so save on memory by passing the same
        // map every time instead of re-creating it...
        return resourceLabels;
    }
}

From consumer end, Is there an option to create a topic with custom configurations?

I'm writing a Kafka consumer using the 'org.springframework.kafka.annotation.KafkaListener' (@KafkaListener) annotation. This annotation expects the topic to already exist at the time of subscribing, and it tries to create the topic if it is not present.
In my case, I don't want the consumer to create a topic with the default configuration; it should create the topic with custom configuration (such as the number of partitions, the cleanup policy, etc.). Is there any option for this in spring-kafka?
See the documentation on configuring topics.
If you define a KafkaAdmin bean in your application context, it can automatically add topics to the broker. To do so, you can add a NewTopic @Bean for each topic to the application context. The following example shows how to do so:
@Bean
public KafkaAdmin admin() {
    Map<String, Object> configs = new HashMap<>();
    configs.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG,
            StringUtils.arrayToCommaDelimitedString(embeddedKafka().getBrokerAddresses()));
    return new KafkaAdmin(configs);
}

@Bean
public NewTopic topic1() {
    return new NewTopic("thing1", 10, (short) 2);
}

@Bean
public NewTopic topic2() {
    return new NewTopic("thing2", 10, (short) 2);
}
By default, if the broker is not available, a message is logged, but the context continues to load. You can programmatically invoke the admin’s initialize() method to try again later. If you wish this condition to be considered fatal, set the admin’s fatalIfBrokerNotAvailable property to true. The context then fails to initialize.
If the broker supports it (1.0.0 or higher), the admin increases the number of partitions if it is found that an existing topic has fewer partitions than the NewTopic.numPartitions.
If you are using Spring Boot, you don't need a KafkaAdmin bean, because Boot automatically configures one for you.
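To also control topic-level settings such as the cleanup policy, the NewTopic can be built with custom configs; a sketch using spring-kafka's TopicBuilder (available since 2.3; the topic name and values are illustrative):

@Bean
public NewTopic compactedTopic() {
    return TopicBuilder.name("my-topic")
            .partitions(6)
            .replicas(2)
            // topic-level configuration, e.g. a compacted cleanup policy
            .config(TopicConfig.CLEANUP_POLICY_CONFIG, TopicConfig.CLEANUP_POLICY_COMPACT)
            .build();
}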
