I have a Kafka consumer application that currently uses a custom CommonErrorHandler to handle @KafkaListener exceptions when they are thrown. I have a custom FixedBackOff strategy in place that retries up to 3 times and then publishes the record to a DLQ topic, but every retry prints the entire stack trace of the error. Can I suppress that, perhaps to DEBUG level, so it does not clutter the console output? This is what I currently have in place (Kotlin):
factory.setCommonErrorHandler(
    DefaultErrorHandler({ record, exception ->
        val thrownException = exception.cause ?: exception.localizedMessage
        log.error(
            "Kafka record offset=${record.offset()} is being sent to the DLQ due to=" +
                "$thrownException"
        )
        val producerRecord = ProducerRecord<String, String>(dlqTopic, record.value().toString())
        producerRecord.headers().add("dlq-failure-reason", thrownException.toString().toByteArray())
        kafkaTemplate?.send(producerRecord)
    }, FixedBackOff(0L, 2L))
)
I have looked here: https://docs.spring.io/spring-kafka/docs/current/reference/html/#error-handlers, and it mentions: "All of the framework error handlers extend KafkaExceptionLogLevelAware which allows you to control the level at which these exceptions are logged." However, I have tried setting the logging level by extending that class and setting its log level explicitly, but to no avail. Am I missing something?
You don't need to subclass it; just set the property.
All that KafkaExceptionLogLevelAware does is set the log level on the KafkaException thrown by the error handler; it's the container that then logs it via the exception's selfLog method.
You are logging it yourself at error level (during recovery).
EDIT
Works fine for me...
@SpringBootApplication
public class So71373630Application {

    public static void main(String[] args) {
        SpringApplication.run(So71373630Application.class, args).close();
    }

    @Bean
    DefaultErrorHandler eh() {
        DefaultErrorHandler eh = new DefaultErrorHandler((r, ex) -> System.out.println("Failed:" + r.value()),
                new FixedBackOff(0L, 3L));
        eh.setLogLevel(Level.DEBUG);
        return eh;
    }

    @KafkaListener(id = "so71373630", topics = "so71373630")
    void listen(String in) {
        System.out.println(in);
        throw new RuntimeException("test");
    }

    @Bean
    public NewTopic topic() {
        return TopicBuilder.name("so71373630").partitions(1).replicas(1).build();
    }

    @Bean
    ApplicationRunner runner(KafkaTemplate<String, String> template) {
        return args -> {
            template.send("so71373630", "foo");
            Thread.sleep(5000);
        };
    }

}
2022-03-07 10:10:35.460 INFO 31409 --- [o71373630-0-C-1] o.s.k.l.KafkaMessageListenerContainer : so71373630: partitions assigned: [so71373630-0]
foo
2022-03-07 10:10:35.492 INFO 31409 --- [o71373630-0-C-1] o.a.k.clients.consumer.KafkaConsumer : [Consumer clientId=consumer-so71373630-1, groupId=so71373630] Seeking to offset 3 for partition so71373630-0
foo
2022-03-07 10:10:35.987 INFO 31409 --- [o71373630-0-C-1] o.a.k.clients.consumer.KafkaConsumer : [Consumer clientId=consumer-so71373630-1, groupId=so71373630] Seeking to offset 3 for partition so71373630-0
foo
2022-03-07 10:10:36.490 INFO 31409 --- [o71373630-0-C-1] o.a.k.clients.consumer.KafkaConsumer : [Consumer clientId=consumer-so71373630-1, groupId=so71373630] Seeking to offset 3 for partition so71373630-0
foo
Failed:foo
I am developing a Spring Kafka consumer. Due to the message volume, I need to use concurrency to maintain throughput, and because of that concurrency I use a ThreadLocal object to store thread-based data. Now I need to remove this ThreadLocal object after using it.
The Spring documentation (linked below) suggests implementing an EventListener that listens for the ConsumerStoppedEvent, but it does not show any sample event listener code that gets the ThreadLocal object and removes its value. Could you please let me know how to get the ThreadLocal instance in this case?
Code samples would be appreciated.
https://docs.spring.io/spring-kafka/docs/current/reference/html/#thread-safety
Something like this:
@SpringBootApplication
public class So71884752Application {

    public static void main(String[] args) {
        SpringApplication.run(So71884752Application.class, args);
    }

    @Bean
    public NewTopic topic2() {
        return TopicBuilder.name("topic1").partitions(2).build();
    }

    @Component
    static class MyListener implements ApplicationListener<ConsumerStoppedEvent> {

        private static final ThreadLocal<Long> threadLocalState = new ThreadLocal<>();

        @KafkaListener(topics = "topic1", groupId = "my-consumer", concurrency = "2")
        public void listen() {
            long id = Thread.currentThread().getId();
            System.out.println("set thread id to ThreadLocal: " + id);
            threadLocalState.set(id);
        }

        @Override
        public void onApplicationEvent(ConsumerStoppedEvent event) {
            System.out.println("Remove from ThreadLocal: " + threadLocalState.get());
            threadLocalState.remove();
        }

    }

}
So, there are two concurrent listener containers for the two partitions of that topic, and each of them calls this @KafkaListener method. I store the thread id in the ThreadLocal, just as a simple use case to test the feature.
Then I implement ApplicationListener<ConsumerStoppedEvent>, which is emitted on the appropriate consumer thread. That lets me extract the ThreadLocal value and clean it up at the end of the consumer's life.
The test against embedded Kafka looks like this:
@SpringBootTest
@EmbeddedKafka(bootstrapServersProperty = "spring.kafka.bootstrap-servers")
@DirtiesContext
class So71884752ApplicationTests {

    @Autowired
    KafkaTemplate<String, String> kafkaTemplate;

    @Autowired
    KafkaListenerEndpointRegistry kafkaListenerEndpointRegistry;

    @Test
    void contextLoads() throws InterruptedException {
        this.kafkaTemplate.send("topic1", "1", "foo");
        this.kafkaTemplate.send("topic1", "2", "bar");
        this.kafkaTemplate.flush();
        Thread.sleep(1000); // Give it a chance to consume data
        this.kafkaListenerEndpointRegistry.stop();
    }

}
Right. It doesn't verify anything, but it demonstrates how that event can happen.
I see something like this in log output:
set thread id to ThreadLocal: 125
set thread id to ThreadLocal: 127
...
Remove from ThreadLocal: 125
Remove from ThreadLocal: 127
So, what the docs say is correct.
We have an integration test where we use EmbeddedKafka and produce a message to a topic, our app processes that message, and the result is sent to a second topic where we consume and assert the output. In CI this works maybe 2/3 of the time, but we hit cases where KafkaTestUtils.getSingleRecord throws java.lang.IllegalStateException: No records found for topic (see [1] below).
To try to resolve this, I added ContainerTestUtils.waitForAssignment for each listener container in the registry (see [2] below). After a few successful runs in CI, I saw a new exception: java.lang.IllegalStateException: Expected 1 but got 0 partitions. This now has me wondering whether this was actually the root cause of the original "no records found" exception.
Any ideas what could help with the random failures here? I would appreciate any suggestions on how to troubleshoot.
We are on spring-kafka and spring-kafka-test v2.6.4.
Edit: Added newConsumer for reference.
Example of our setup:
@SpringBootTest
@RunWith(SpringRunner.class)
@DirtiesContext
@EmbeddedKafka(
        topics = { "topic1", "topic2" },
        partitions = 1,
        brokerProperties = { "listeners=PLAINTEXT://localhost:9099", "port=9099" })
public class IntegrationTest {

    @Autowired
    private EmbeddedKafkaBroker embeddedKafkaBroker;

    @Autowired
    private KafkaListenerEndpointRegistry kafkaListenerEndpointRegistry;

    @Test
    public void testExample() {
        try (Consumer<String, String> consumer = newConsumer()) {
            for (MessageListenerContainer messageListenerContainer : kafkaListenerEndpointRegistry.getListenerContainers()) {
                // [2]
                ContainerTestUtils.waitForAssignment(messageListenerContainer, embeddedKafkaBroker.getPartitionsPerTopic());
            }
            try (Producer<String, String> producer = newProducer()) {
                embeddedKafkaBroker.consumeFromAnEmbeddedTopic(consumer, "topic2"); // [1]
                producer.send(new ProducerRecord<>(
                        "topic1",
                        "test payload"));
                producer.flush();
            }
            String result = KafkaTestUtils.getSingleRecord(consumer, "topic2").value();
            assertEquals(result, "expected result");
        }
    }

    private Consumer<String, String> newConsumer() {
        Map<String, Object> consumerProps = KafkaTestUtils.consumerProps("groupId", "false", embeddedKafkaBroker);
        ConsumerFactory<String, AssetTransferResponse> consumerFactory = new DefaultKafkaConsumerFactory<>(
                consumerProps,
                new StringDeserializer(),
                new CustomDeserializer<>());
        return consumerFactory.createConsumer();
    }

}
I am sending a message to Kafka with KafkaTemplate, and I wanted to test an exception, so I provided a wrong topic name. When I run the code it says "Error while fetching metadata with correlation id 2 : {ocf-oots-gr-outbound_123=LEADER_NOT_AVAILABLE}", yet the topic itself gets created in Kafka (I can also see it through Kafka Tool), and no exception is thrown even when the broker is stopped.
Code:
KafkaTemplate<String, Object> kafkaTemplate = (KafkaTemplate<String, Object>) CommonAppContextProvider.getApplicationContext().getBean("kafkaTemplate");

// kafkaTemplate.send(CommonAppContextProvider.getApplicationContext().getEnvironment().getProperty("kafka.transalators.outbound.topic"), kafkaMessageFormat);

ListenableFuture listenableFuture = kafkaTemplate.send(CommonAppContextProvider.getApplicationContext()
        .getEnvironment().getProperty("kafka.transalators.outbound.topic"), kafkaMessageFormat);

listenableFuture.addCallback(new ListenableFutureCallback<SendResult<?, ?>>() {

    @Override
    public void onSuccess(SendResult<?, ?> result) {
        System.out.println("Sent");
    }

    @Override
    public void onFailure(Throwable ex) {
        throw new KafkaException();
    }
});
}
It should throw an exception, maybe KafkaException, TimeoutException, InterruptedException, etc.
You have to call get(time, timeUnit) on the future to get the result (success or otherwise).
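For example (a minimal sketch built on the raw listenableFuture from the snippet above; the 10-second timeout is an arbitrary choice):

try {
    SendResult<?, ?> result = (SendResult<?, ?>) listenableFuture.get(10, TimeUnit.SECONDS);
    System.out.println("Sent to " + result.getRecordMetadata().topic());
} catch (ExecutionException e) {
    // the real send failure (e.g. a TimeoutException while waiting for topic metadata) is the cause
    throw new KafkaException("send failed", e.getCause());
} catch (java.util.concurrent.TimeoutException e) {
    // get() itself timed out before the producer reported a result
    throw new KafkaException("send did not complete within 10 seconds", e);
} catch (InterruptedException e) {
    Thread.currentThread().interrupt();
    throw new KafkaException("interrupted while waiting for the send result", e);
}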
I'm trying to set up Hangfire recurring jobs on application startup (in Startup.cs of a .NET Core web application), so that we can be sure they are always registered. I managed to get it to work for a dummy job, but I get a strange error for a more realistic test job:
System.InvalidOperationException: Expression object should be not null.
at Hangfire.Common.Job.FromExpression(LambdaExpression methodCall, Type explicitType)
at Hangfire.RecurringJob.AddOrUpdate(String recurringJobId, Expression`1 methodCall, Func`1 cronExpression, TimeZoneInfo timeZone, String queue)
at WsApplication.Web.Startup.Startup.ConfigureServices(IServiceCollection services) in C:\Projects\boxman\aspnet-core\src\WsApplication.Web.Host\Startup\Startup.cs:line 163
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at Microsoft.AspNetCore.Hosting.ConventionBasedStartup.ConfigureServices(IServiceCollection services)
at Microsoft.AspNetCore.Hosting.Internal.WebHost.EnsureApplicationServices()
at Microsoft.AspNetCore.Hosting.Internal.WebHost.BuildApplication()
This is what I have in my startup class:
// Hangfire (Enable to use Hangfire instead of default job manager)
services.AddHangfire(config =>
{
    config.UseSqlServerStorage(_appConfiguration.GetConnectionString("Default"));
});
JobStorage.Current = new SqlServerStorage(_appConfiguration.GetConnectionString("Default"));

// Register recurring jobs on startup
var sp = services.BuildServiceProvider();
var offHireJob = sp.GetService<UnsetContainerIsBookedPerOffHireJob>();

// Test job 1 - this gets registered fine!
RecurringJob.AddOrUpdate("Test", () => Console.Write("Test"), Cron.Minutely);

// Test job 2 - this line triggers the error!
RecurringJob.AddOrUpdate(UnsetContainerIsBookedPerOffHireJob.JobId, () => offHireJob.Run(), Cron.Minutely);
And this is my actual test job:
public class UnsetContainerIsBookedPerOffHireJob : ApplicationService
{
    public readonly static string JobId = nameof(UnsetContainerIsBookedPerOffHireJob);

    private readonly IRepository<ContractLineMovement, int> _contractLineMovementRepository;

    public UnsetContainerIsBookedPerOffHireJob(
        IRepository<ContractLineMovement, int> contractLineMovementRepository
    )
    {
        _contractLineMovementRepository = contractLineMovementRepository;
    }

    public void Run()
    {
        Logger.Debug("Running UnsetContainerIsBookedPerOffHireJob.Run()");

        var offHirequery = from contractLineMovement in _contractLineMovementRepository.GetAll()
                           where contractLineMovement.ContractWorkFlowEventId == (int)ContractWorkFlowEventValues.OffHire
                           select contractLineMovement;

        foreach (var offHire in offHirequery)
        {
            Logger.Debug("Processing offhire with ID: " + offHire.Id + " and Date: " + offHire.HireDate);
        }
    }
}
I've found the piece of code in Hangfire's source code that raises this exception, but I still don't really understand why this expression
() => offHireJob.Run()
would cause a problem. It looks basic to me. Maybe I am missing something basic?
Or is there a better way to register my recurring jobs once at the application level?
We are using a Spring @KafkaListener which acknowledges each record after it is processed to the DB. If we have problems writing to the DB, we don't acknowledge the record, so that offsets are not committed for the consumer; this works fine. Now we want to get the failed messages in the next poll so we can retry them. We added an error handler to our listener (a ConsumerAwareListenerErrorHandler) and tried to do consumer.seek() for the failed message's offset. The expectation is that during the next poll we should receive the failed messages again. This is not happening: the next poll fetches only the new messages, not the failed ones. The code snippet is given below.
@Service
public class KafkaConsumer {

    @KafkaListener(topics = ("${kafka.input.stream.topic}"), containerFactory = "kafkaManualAckListenerContainerFactory", errorHandler = "listen3ErrorHandler")
    public void onMessage(ConsumerRecord<Integer, String> record,
            Acknowledgment acknowledgment) throws Exception {
        try {
            msg = JaxbUtil.convertJsonStringToMsg(record.value());
            onHandList = DCMUtil.convertMsgToOnHandDTO(msg);
            TeradataDAO.updateData(onHandList);
            acknowledgment.acknowledge();
            recordSuccess = true;
            LOGGER.info("Message Saved in Teradata DB");
        } catch (Exception e) {
            LOGGER.error("Error Processing On Hand Data ", e);
            recordSuccess = false;
        }
    }

    @Bean
    public ConsumerAwareListenerErrorHandler listen3ErrorHandler() throws InterruptedException {
        return (message, exception, consumer) -> {
            this.listen3Exception = exception;
            MessageHeaders headers = message.getHeaders();
            consumer.seek(new org.apache.kafka.common.TopicPartition(
                    headers.get(KafkaHeaders.RECEIVED_TOPIC, String.class),
                    headers.get(KafkaHeaders.RECEIVED_PARTITION_ID, Integer.class)),
                    headers.get(KafkaHeaders.OFFSET, Long.class));
            return null;
        };
    }

}
Container Class
@Bean
public Map<Object, Object> consumerConfigs() {
    Map<Object, Object> props = new HashMap<Object, Object>();
    props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
    props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
    props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
    props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
    props.put(ConsumerConfig.GROUP_ID_CONFIG, "example-1");
    props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, false);
    return props;
}

@SuppressWarnings({ "rawtypes", "unchecked" })
@Bean
public ConsumerFactory consumerFactory() {
    return new DefaultKafkaConsumerFactory(consumerConfigs());
}

@SuppressWarnings("unchecked")
@Bean
KafkaListenerContainerFactory<ConcurrentMessageListenerContainer<Integer, String>> kafkaManualAckListenerContainerFactory() {
    ConcurrentKafkaListenerContainerFactory<Integer, String> factory =
            new ConcurrentKafkaListenerContainerFactory<>();
    factory.setConsumerFactory(consumerFactory());
    factory.getContainerProperties().setAckMode(AckMode.MANUAL);
    return factory;
}
It's supposed to work like this:
The error handler needs to throw an exception if you want to discard additional records from the previous poll.
Since you are "handling" the error, the container knows nothing and will continue to call the listener with the remaining records from the poll.
That said, I see that the container is also ignoring an exception thrown by the error handler (it will discard if the error handler throws an Error not an exception). I will open an issue for this.
Another work-around would be to add the Consumer to the listener method signature and do the seek there (and throw an exception); if there is no error handler, the rest of the batch is discarded. A rough sketch is shown below.
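Something like this (built from the names in the question; a sketch, not tested here):

@KafkaListener(topics = "${kafka.input.stream.topic}", containerFactory = "kafkaManualAckListenerContainerFactory")
public void onMessage(ConsumerRecord<Integer, String> record, Acknowledgment acknowledgment,
        Consumer<?, ?> consumer) throws Exception {
    try {
        msg = JaxbUtil.convertJsonStringToMsg(record.value());
        onHandList = DCMUtil.convertMsgToOnHandDTO(msg);
        TeradataDAO.updateData(onHandList);
        acknowledgment.acknowledge();
    } catch (Exception e) {
        // rewind to the failed record so the next poll returns it again,
        // then rethrow so the container discards the rest of this poll
        consumer.seek(new org.apache.kafka.common.TopicPartition(record.topic(), record.partition()), record.offset());
        throw e;
    }
}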
Correction
If the container has no ErrorHandler, any Throwable thrown by a ListenerErrorHandler will cause the remaining records to be discarded.
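Applied to the listen3ErrorHandler in the question, that means rethrowing after the seek instead of returning null (again, just a sketch):

@Bean
public ConsumerAwareListenerErrorHandler listen3ErrorHandler() {
    return (message, exception, consumer) -> {
        MessageHeaders headers = message.getHeaders();
        consumer.seek(new org.apache.kafka.common.TopicPartition(
                headers.get(KafkaHeaders.RECEIVED_TOPIC, String.class),
                headers.get(KafkaHeaders.RECEIVED_PARTITION_ID, Integer.class)),
                headers.get(KafkaHeaders.OFFSET, Long.class));
        throw exception; // let the container discard the remaining records from this poll
    };
}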
Please try using SeekToCurrentErrorHandler. The docs say: "This allows implementations to seek all unprocessed topic/partitions so the current record (and the others remaining) will be retrieved by the next poll. The SeekToCurrentErrorHandler does exactly this. The container will commit any pending offset commits before calling the error handler."
https://docs.spring.io/autorepo/docs/spring-kafka-dist/2.1.0.BUILD-SNAPSHOT/reference/htmlsingle/#_seek_to_current_container_error_handlers
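Wiring it into the container factory from the question might look roughly like this (a sketch; it assumes a Spring Kafka version in which the handler is registered via setGenericErrorHandler on the container properties):

@Bean
KafkaListenerContainerFactory<ConcurrentMessageListenerContainer<Integer, String>> kafkaManualAckListenerContainerFactory() {
    ConcurrentKafkaListenerContainerFactory<Integer, String> factory =
            new ConcurrentKafkaListenerContainerFactory<>();
    factory.setConsumerFactory(consumerFactory());
    factory.getContainerProperties().setAckMode(AckMode.MANUAL);
    // seek back to the failed record (and any unprocessed ones) so the next poll retrieves them again
    factory.getContainerProperties().setGenericErrorHandler(new SeekToCurrentErrorHandler());
    return factory;
}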