Infinite loop when using @RetryableTopic - spring-kafka

I'm trying to use two @KafkaListener beans; the first uses @RetryableTopic:
@Component
@Slf4j
public class RetryableKafkaListener {

    @RetryableTopic(
            attempts = "4",
            backoff = @Backoff(delay = 1000, multiplier = 2.0),
            autoCreateTopics = "false",
            topicSuffixingStrategy = TopicSuffixingStrategy.SUFFIX_WITH_INDEX_VALUE)
    @KafkaListener(topics = "orders")
    public void listen(String in, @Header(KafkaHeaders.RECEIVED_TOPIC) String topic) {
        log.info(in + " from " + topic);
        throw new RuntimeException("test");
    }

    @DltHandler
    public void dlt(String in, @Header(KafkaHeaders.RECEIVED_TOPIC) String topic) {
        log.info(in + " from " + topic);
    }
}
and the second is a plain listener:
@Component
@Slf4j
public class BasicKafkaListener {

    @KafkaListener(topics = "another-orders")
    public void listen(String in, @Header(KafkaHeaders.RECEIVED_TOPIC) String topic) {
        log.info(in + " from " + topic);
        throw new RuntimeException("another test");
    }
}
Both listeners throw a RuntimeException while processing records.
The first one, RetryableKafkaListener, works as expected: after 4 attempts I get the record on the DLT topic.
The second one should stop after 10 failures (the DefaultErrorHandler default), but instead I get an infinite loop. I tried to set a FixedBackOff manually, but it's not working.
When I remove the first RetryableKafkaListener, the BasicKafkaListener works as expected and stops after 10 failures.
My question is: how do I get both listeners working as expected? Should I define 2 ListenerContainers?
I pushed a project to GitHub: https://github.com/elfelli/fork-spring-kafka-non-blocking-retries-and-dlt
To reproduce, just run the app and push a record to the topic "another-orders":
kafka-console-producer --broker-list localhost:9092 --topic another-orders
I'm not sure if this is a bug or incorrect usage.
I'm using:
spring_boot_version=2.6.2
spring_dependency_management_version=1.0.11.RELEASE
spring_kafka_version=2.8.1
Thank you for your help.

First - you should not specify a spring-kafka version; Spring Boot 2.6.x uses Spring for Apache Kafka 2.8.x. You should let Boot bring in the correct version. The sample pulls in 2.7.0.
It is by design - you should use a different container factory when mixing listener types.
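For the plain listener, that can look something like the following minimal sketch (the bean name and backoff values here are illustrative assumptions, not from the thread): give the record listener its own container factory with its own DefaultErrorHandler.

// Sketch: a dedicated factory for the plain listener (bean name and backoff
// values are illustrative). FixedBackOff(1000L, 9L) means 9 retries plus the
// initial attempt = 10 failures, after which the DefaultErrorHandler gives up.
@Bean
public ConcurrentKafkaListenerContainerFactory<String, String> basicListenerContainerFactory(
        ConsumerFactory<String, String> consumerFactory) {

    ConcurrentKafkaListenerContainerFactory<String, String> factory =
            new ConcurrentKafkaListenerContainerFactory<>();
    factory.setConsumerFactory(consumerFactory);
    factory.setCommonErrorHandler(new DefaultErrorHandler(new FixedBackOff(1000L, 9L)));
    return factory;
}

Then reference it from the plain listener, e.g. @KafkaListener(topics = "another-orders", containerFactory = "basicListenerContainerFactory"), leaving the default factory to the @RetryableTopic listener.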

Related

Retrying Kafka errors using @RetryableTopic

Is there a way to specify the DLT used when retrying with spring-kafka @RetryableTopic?
I use a listener with the following configuration:
@RetryableTopic(
        attempts = "4",
        backoff = @Backoff(delay = 1000),
        autoCreateTopics = "false",
        topicSuffixingStrategy = TopicSuffixingStrategy.SUFFIX_WITH_DELAY_VALUE,
        fixedDelayTopicStrategy = FixedDelayStrategy.SINGLE_TOPIC)
@KafkaListener(topics = "${spring.kafka.template.default-topic}")
This retries using a single topic, but uses my main topic name + "_dlt" for exhausted retries, even though I have a dead letter topic with a different name configured at:
spring:
  kafka:
    consumer:
    template:
      dead-letter-topic: its_dead_jim
I've used a DeadLetterPublishingRecoverer in the past and have implemented the DLT resolver function, but I don't see a way to override the default behavior in the documentation for RetryableTopic. I've looked at RetryTopicConfigurationBuilder and RetryTopicConfigurer, but nothing seems applicable to change the DLT name.
I am not sure why you think there is such a property on the template.
See the documentation.
Extend RetryTopicConfigurationSupport in a @Configuration class and...
Custom naming strategies
More complex naming strategies can be accomplished by registering a bean that implements RetryTopicNamesProviderFactory. The default implementation is SuffixingRetryTopicNamesProviderFactory and a different implementation can be registered in the following way:
@Override
protected RetryTopicComponentFactory createComponentFactory() {
    return new RetryTopicComponentFactory() {

        @Override
        public RetryTopicNamesProviderFactory retryTopicNamesProviderFactory() {
            return new CustomRetryTopicNamesProviderFactory();
        }
    };
}
As an example, the following implementation, in addition to the standard suffix, adds a prefix to retry/DLT topic names:
public class CustomRetryTopicNamesProviderFactory implements RetryTopicNamesProviderFactory {

    @Override
    public RetryTopicNamesProvider createRetryTopicNamesProvider(
            DestinationTopic.Properties properties) {

        if (properties.isMainEndpoint()) {
            return new SuffixingRetryTopicNamesProvider(properties);
        }
        else {
            return new SuffixingRetryTopicNamesProvider(properties) {

                @Override
                public String getTopicName(String topic) {
                    return "my-prefix-" + super.getTopicName(topic);
                }
            };
        }
    }
}
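Putting the pieces together, the createComponentFactory() override from the docs lives in the @Configuration class the answer refers to. A sketch (the class name is illustrative; RetryTopicConfigurationSupport is available from spring-kafka 2.9):

// Sketch: registering the custom naming factory by extending
// RetryTopicConfigurationSupport (class name is illustrative).
@Configuration
public class MyRetryTopicConfiguration extends RetryTopicConfigurationSupport {

    @Override
    protected RetryTopicComponentFactory createComponentFactory() {
        return new RetryTopicComponentFactory() {

            @Override
            public RetryTopicNamesProviderFactory retryTopicNamesProviderFactory() {
                return new CustomRetryTopicNamesProviderFactory();
            }
        };
    }
}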

contractVerifierMessaging.receive is null

I'm setting up contract tests for Kafka messaging with Testcontainers, following spring-cloud-contract-samples/producer_kafka_middleware/. It works fine with Embedded Kafka, but not with Testcontainers.
When I try to run the generated ContractVerifierTest:
public void validate_shouldProduceKafkaMessage() throws Exception {
    // when:
    triggerMessageSent();
    // then:
    ContractVerifierMessage response = contractVerifierMessaging.receive("kafka-messages",
            contract(this, "shouldProduceKafkaMessage.yml"));
the following is thrown:
Cannot invoke "org.springframework.messaging.Message.getPayload()" because "receive" is null
The Kafka container is running and the topic is created. When debugging the receive method, I see the message is null in message(destination);
Contract itself:
label("triggerMessage")
input {
    triggeredBy("triggerMessageSent()")
}
outputMessage {
    sentTo "kafka-messages"
    body(file("kafkaMessage.json"))
}
Base test configuration:
@SpringBootTest(webEnvironment = SpringBootTest.WebEnvironment.NONE,
        classes = {TestConfig.class, ServiceApplication.class})
@Testcontainers
@AutoConfigureMessageVerifier
@ActiveProfiles("test")
public abstract class BaseClass {
What am I missing? Maybe a point of communication between the container and the ContractVerifierMessage methods?
Resolved the issue by adding a specific topic name to the listen() method in the KafkaMessageVerifier implementation class.
So instead of @KafkaListener(id = "listener", topicPattern = ".*"), it works with:
@KafkaListener(topics = {"my-messages-topic"})
public void listen(ConsumerRecord payload, @Header(KafkaHeaders.RECEIVED_TOPIC) String topic) {
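For reference, a sketch of what the surrounding KafkaMessageVerifier listener can look like (the field and message building below are assumptions modeled on the spring-cloud-contract Kafka sample, not the asker's exact code):

// Sketch: queue received records per topic so that receive(destination)
// can poll them later; field and helper names are illustrative.
private final Map<String, BlockingQueue<Message<?>>> broker = new ConcurrentHashMap<>();

@KafkaListener(topics = {"my-messages-topic"})
public void listen(ConsumerRecord<?, ?> payload,
        @Header(KafkaHeaders.RECEIVED_TOPIC) String topic) {
    this.broker.computeIfAbsent(topic, t -> new LinkedBlockingQueue<>())
            .add(MessageBuilder.withPayload(payload.value()).build());
}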

How to get a ThreadLocal for a concurrent consumer?

I am developing a Spring Kafka consumer. Due to message volume, I need to use concurrency to ensure throughput, and because of the concurrency I use a ThreadLocal object to hold per-thread data. Now I need to remove this ThreadLocal value after use.
The Spring documentation (link below) suggests implementing an event listener that listens for ConsumerStoppedEvent, but it does not show any sample listener code that gets the ThreadLocal object and removes the value. How can I get the ThreadLocal instance in this case?
Code samples will be appreciated.
https://docs.spring.io/spring-kafka/docs/current/reference/html/#thread-safety
Something like this:
@SpringBootApplication
public class So71884752Application {

    public static void main(String[] args) {
        SpringApplication.run(So71884752Application.class, args);
    }

    @Bean
    public NewTopic topic2() {
        return TopicBuilder.name("topic1").partitions(2).build();
    }

    @Component
    static class MyListener implements ApplicationListener<ConsumerStoppedEvent> {

        private static final ThreadLocal<Long> threadLocalState = new ThreadLocal<>();

        @KafkaListener(topics = "topic1", groupId = "my-consumer", concurrency = "2")
        public void listen() {
            long id = Thread.currentThread().getId();
            System.out.println("set thread id to ThreadLocal: " + id);
            threadLocalState.set(id);
        }

        @Override
        public void onApplicationEvent(ConsumerStoppedEvent event) {
            System.out.println("Remove from ThreadLocal: " + threadLocalState.get());
            threadLocalState.remove();
        }
    }
}
So, I have two concurrent listener containers for the two partitions of the topic. Each of them calls my @KafkaListener method, where I store the thread id into the ThreadLocal; a simple use-case just to test the feature.
Then I implement ApplicationListener<ConsumerStoppedEvent>, which is emitted on the appropriate consumer thread. That lets me extract the ThreadLocal value and clean it up at the end of the consumer's life.
The test against embedded Kafka looks like this:
@SpringBootTest
@EmbeddedKafka(bootstrapServersProperty = "spring.kafka.bootstrap-servers")
@DirtiesContext
class So71884752ApplicationTests {

    @Autowired
    KafkaTemplate<String, String> kafkaTemplate;

    @Autowired
    KafkaListenerEndpointRegistry kafkaListenerEndpointRegistry;

    @Test
    void contextLoads() throws InterruptedException {
        this.kafkaTemplate.send("topic1", "1", "foo");
        this.kafkaTemplate.send("topic1", "2", "bar");
        this.kafkaTemplate.flush();
        Thread.sleep(1000); // Give it a chance to consume the data
        this.kafkaListenerEndpointRegistry.stop();
    }
}
Right, it doesn't verify anything, but it demonstrates how that event happens.
I see something like this in log output:
set thread id to ThreadLocal: 125
set thread id to ThreadLocal: 127
...
Remove from ThreadLocal: 125
Remove from ThreadLocal: 127
So, what the docs say is correct.

Integrating Spring Reactive with Spring MVC + MySQL

I'm trying to figure out whether I can use Spring Reactive (Flux/Mono) together with Spring MVC.
The microservices are built with Spring MVC + Feign Client, Eureka Server (Netflix OSS), Hystrix, and a MySQL database.
My first microservice, addDistanceClient, adds data to the database.
Here is an example controller:
@RequestMapping("/")
@RestController
public class RemoteMvcController {

    @Autowired
    EmployeeService service;

    @GetMapping(path = "/show")
    public List<EmployeeEntity> getAllEmployeesList() {
        return service.getAllEmployees();
    }
}
Here I can use Mono/Flux, I think there will be no problems.
My second microservice is showDistanceClient; it is not directly connected to the database.
It has a method that calls the method shown above on the first microservice to retrieve data from the database.
It uses the Feign Client.
Second microservice controller:
@Controller
@RequestMapping("/")
public class EmployeeMvcController {

    private ServiceFeignClient serviceFeignClient;

    @RequestMapping(path = "/getAllDataFromAddService")
    public String getData2(Model model) {
        List<EmployeeEntity> list = ServiceFeignClient.FeignHolder.create().getAllEmployeesList();
        model.addAttribute("employees", list);
        return "resultlist-employees";
    }
}
and ServiceFeignClient itself, with which we call the method on the first microservice, looks like this:
@FeignClient(name = "add-client", url = "http://localhost:8081/", fallback = Fallback.class)
public interface ServiceFeignClient {

    class FeignHolder {

        public static ServiceFeignClient create() {
            return HystrixFeign.builder()
                    .encoder(new GsonEncoder())
                    .decoder(new GsonDecoder())
                    .target(ServiceFeignClient.class, "http://localhost:8081/",
                            new FallbackFactory<ServiceFeignClient>() {

                @Override
                public ServiceFeignClient create(Throwable throwable) {
                    return new ServiceFeignClient() {

                        @Override
                        public List<EmployeeEntity> getAllEmployeesList() {
                            System.out.println(throwable.getMessage());
                            return null;
                        }
                    };
                }
            });
        }
    }

    @RequestLine("GET /show")
    List<EmployeeEntity> getAllEmployeesList();
}
It is working properly now: if both microservices are OK, I get data from the database.
If the first microservice (addDistanceClient) is dead and I call the method on the second microservice (showDistanceClient) that fetches data from the database through the first one (using its Feign Client), I get a page with a spinning spinner and a message that the service is unavailable, try again later. All works perfectly.
My goal: using Spring Reactive (not sure if this will help me, but I think I'm heading in the right direction), make the "service is currently unavailable" message and the spinning spinner on the second microservice disappear automatically, and the data from the database appear, as soon as the first microservice (addDistanceClient) comes back to life, without re-sending the request (i.e. without reloading the page).
Will I be able to do this with Spring WebFlux?
I know that Spring WebFlux works with a stream that itself notifies us when data appears in it, so we don't need to resubmit the request.
I started thinking about this and cannot figure out how to do it:
1) Using Spring Reactive. In this case I need to bring Flux/Mono into the MVC model of the second microservice, showDistanceClient, which returns HTML. I don't understand how; I know how to do this with REST.
2) If the first option is wrong, maybe I need to use a WebSocket for this?
If so, please share useful links with examples. I will be very grateful.
Indeed, this topic is very interesting to me and I want to understand it.
I will be very grateful for your help. Thanks everyone!
UPDATED POST:
I updated both controllers with REST + WebFlux. Everything works for me.
The first addDistanceClient service and its controller:
@RestController
@RequestMapping("/")
public class BucketController {

    @Autowired
    private BucketRepository bucketRepository;

    // Stream all Buckets from the database (one record every 5 seconds)
    @GetMapping(value = "/stream/buckets/delay", produces = MediaType.TEXT_EVENT_STREAM_VALUE)
    public Flux<Bucket> streamAllBucketsDelay() {
        return bucketRepository.findAll().delayElements(Duration.ofSeconds(5));
    }
}
It pulls all the records from the database with an interval of 5 seconds between records; I added the interval just for testing.
The second service is showDistanceClient and its controller.
Here I used WebClient instead of Feign Client.
@RestController
@RequestMapping("/")
public class UserController {

    @Autowired
    private WebClient webClient;

    @Autowired
    private WebClientService webClientService;

    // Using WebClient
    @GetMapping(value = "/getDataByWebClient", produces = MediaType.TEXT_EVENT_STREAM_VALUE)
    public Flux<Bucket> getDataByWebClient() {
        return webClientService.getDataByWebClient();
    }
}
and its Service layer (WebClientService):
@Service
public class WebClientService {

    private static final String API_MIME_TYPE = "application/json";
    private static final String API_BASE_URL = "http://localhost:8081";
    private static final String USER_AGENT = "User Service";

    private static final Logger logger = LoggerFactory.getLogger(WebClientService.class);

    private WebClient webClient;

    public WebClientService() {
        this.webClient = WebClient.builder()
                .baseUrl(API_BASE_URL)
                .defaultHeader(HttpHeaders.CONTENT_TYPE, API_MIME_TYPE)
                .defaultHeader(HttpHeaders.USER_AGENT, USER_AGENT)
                .build();
    }

    public Flux<Bucket> getDataByWebClient() {
        return webClient.get()
                .uri("/stream/buckets/delay")
                .exchange()
                .flatMapMany(clientResponse -> clientResponse.bodyToFlux(Bucket.class));
    }
}
Now everything works in a reactive environment. Fine.
But my original problem remains unsolved.
My goal: everything works, but if I call the method on the second service that uses WebClient to fetch data from the first service, and the first service is dead at that moment, I get the "service is temporarily unavailable" message; then, when the first service comes back to life, I want to receive all the data instead of the unavailability message (important: without reloading the page).
How do I achieve this?
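One way to approach this (a sketch, not from the original thread): let the reactive pipeline re-subscribe by itself until the upstream service is reachable again, e.g. with Reactor's retry operators (reactor.util.retry.Retry), so the open event-stream connection to the browser eventually starts emitting data without a page reload.

// Sketch: keep retrying the upstream call with backoff while the first
// service is down; the browser's event-stream subscription stays open and
// starts receiving Buckets once the service is back. Values are illustrative.
public Flux<Bucket> getDataByWebClient() {
    return this.webClient.get()
            .uri("/stream/buckets/delay")
            .retrieve()
            .bodyToFlux(Bucket.class)
            .retryWhen(Retry.backoff(Long.MAX_VALUE, Duration.ofSeconds(2))
                    .maxBackoff(Duration.ofSeconds(30)));
}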

Kafka consumer health check

Is there a simple way to tell whether a consumer (created with Spring Boot and @KafkaListener) is operating normally?
This includes: can it access and poll the broker, does it have at least one partition assigned, etc.
I see there are ways to subscribe to various lifecycle events, but this seems like a very fragile solution.
Thanks in advance!
You can use the AdminClient to get the current group status...
@SpringBootApplication
public class So56134056Application {

    public static void main(String[] args) {
        SpringApplication.run(So56134056Application.class, args);
    }

    @Bean
    public NewTopic topic() {
        return new NewTopic("so56134056", 1, (short) 1);
    }

    @KafkaListener(id = "so56134056", topics = "so56134056")
    public void listen(String in) {
        System.out.println(in);
    }

    @Bean
    public ApplicationRunner runner(KafkaAdmin admin) {
        return args -> {
            try (AdminClient client = AdminClient.create(admin.getConfig())) {
                while (true) {
                    Map<String, ConsumerGroupDescription> map = client
                            .describeConsumerGroups(Collections.singletonList("so56134056"))
                            .all()
                            .get(10, TimeUnit.SECONDS);
                    System.out.println(map);
                    System.in.read();
                }
            }
        };
    }
}
{so56134056=(groupId=so56134056, isSimpleConsumerGroup=false, members=(memberId=consumer-2-32a80e0a-2b8d-4519-b71d-671117e7eaf8, clientId=consumer-2, host=/127.0.0.1, assignment=(topicPartitions=so56134056-0)), partitionAssignor=range, state=Stable, coordinator=localhost:9092 (id: 0 rack: null))}
We have been thinking about exposing getLastPollTime() to the listener container API.
getAssignedPartitions() has been available since 2.1.3.
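For the partition-assignment part of the question, something like this should work (a sketch; the listener id matches the example above):

// Sketch: check that the listener container has at least one assigned partition.
// "registry" is an injected KafkaListenerEndpointRegistry.
MessageListenerContainer container = registry.getListenerContainer("so56134056");
Collection<TopicPartition> assigned = container.getAssignedPartitions();
boolean hasAssignment = assigned != null && !assigned.isEmpty();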
I know you haven't mentioned it in your post, but beware of adding items like this to a health check if you then deploy in AWS and use such a health check for your ELB scaling environment.
For example, one scenario that can happen is that your app loses connectivity to Kafka: your health check turns RED, and Elastic Beanstalk then begins killing and re-starting your instances (which will happen continually until your Kafka instances are available again). This could be costly!
There is also a more general philosophical question of whether health checks should "cascade failures" or not, e.g. Kafka is down, so the app connected to Kafka claims it is down, the next app in the chain does the same, and so on. This is more normally implemented via circuit breakers, which are designed to minimise slow calls destined for failure.
You could use the AdminClient to check the topic description:
final AdminClient client = AdminClient.create(kafkaConsumerFactory.getConfigurationProperties());
final String topic = "someTopicName";
final DescribeTopicsResult describeTopicsResult = client.describeTopics(Collections.singleton(topic));
final KafkaFuture<TopicDescription> future = describeTopicsResult.values().get(topic);

try {
    // for health-check purposes we only need the topic description to be retrievable
    future.get(10, TimeUnit.SECONDS);
} catch (final InterruptedException | ExecutionException | TimeoutException e) {
    throw new RuntimeException("Failed to retrieve topic description for topic: " + topic, e);
}
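If you want to surface this via Spring Boot Actuator, the same check can be wrapped in a HealthIndicator. A sketch, subject to the cascading-failure caveats above (the class and topic names are placeholders):

// Sketch: an Actuator HealthIndicator built around the topic-description check.
// Keep the ELB / cascading-failure caveats above in mind before wiring this
// into load-balancer health checks.
@Component
public class KafkaTopicHealthIndicator implements HealthIndicator {

    private final ConsumerFactory<?, ?> kafkaConsumerFactory;

    public KafkaTopicHealthIndicator(ConsumerFactory<?, ?> kafkaConsumerFactory) {
        this.kafkaConsumerFactory = kafkaConsumerFactory;
    }

    @Override
    public Health health() {
        final String topic = "someTopicName"; // placeholder
        try (AdminClient client = AdminClient.create(kafkaConsumerFactory.getConfigurationProperties())) {
            client.describeTopics(Collections.singleton(topic))
                    .values().get(topic).get(10, TimeUnit.SECONDS);
            return Health.up().build();
        }
        catch (Exception e) {
            return Health.down(e).build();
        }
    }
}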
