@KafkaListener skipping messages when using Acknowledgment.nack() - spring-kafka

Spring Kafka version - 2.8.5
```java
@KafkaListener
public void consumeMessages(@Payload List<String> messages, Acknowledgment ack) {
    // Process the records on a separate thread and acknowledge them.
    // If no thread is available, nack after 2 seconds.
}
```
Nacked records should be reprocessed by the KafkaListener after 2 seconds. However, the nacked records are skipped and never reprocessed; the missing messages are only consumed again after restarting the Spring Boot app.

You can't call nack() from another thread, only from the listener thread:
    @Override
    public void nack(long sleepMillis) {
        Assert.state(Thread.currentThread().equals(ListenerConsumer.this.consumerThread),
                "nack() can only be called on the consumer thread");
        // ... (rest of the method omitted)
    }
Even when called from the listener thread, nack() does not support "skipping" records; when using manual commits, it is the application's responsibility to commit the offset to skip a record (by calling acknowledge()).
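One way to respect that constraint is to keep both acknowledge() and nack() on the thread that invoked the listener and hand only the processing to a worker pool. The following is a minimal, JDK-only sketch under stated assumptions: `Ack` is a hypothetical stand-in for the two Acknowledgment calls discussed above, `ListenerThreadNack` and `onMessages` are illustrative names, and no Spring classes are involved. When the pool rejects the task because no worker is free, the rejection surfaces on the calling thread, so nack is issued there.

```java
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.RejectedExecutionException;
import java.util.function.Consumer;

final class ListenerThreadNack {

    /** Hypothetical stand-in for the Acknowledgment calls used in this sketch. */
    interface Ack {
        void acknowledge();
        void nack(long sleepMillis);
    }

    /** Meant to be invoked on the listener (consumer) thread. */
    static void onMessages(List<String> messages, ExecutorService workers,
                           Consumer<List<String>> process, Ack ack) {
        Future<?> task;
        try {
            task = workers.submit(() -> process.accept(messages));
        } catch (RejectedExecutionException e) {
            // No worker accepted the task; we are still on the calling thread.
            ack.nack(2_000);
            return;
        }
        try {
            task.get();          // wait here so the ack decision stays on this thread
            ack.acknowledge();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            ack.nack(2_000);
        } catch (ExecutionException e) {
            ack.nack(2_000);     // processing failed on the worker
        }
    }
}
```

Waiting on the worker blocks the listener thread, so long-running work may call for a larger max.poll.interval.ms; that trade-off is outside the scope of this sketch.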

Related

How to retry in Kafka with separate retry topics and exponential backoff

I want to implement a retry architecture with backoffs in Spring Kafka (Kafka version 2.3.6).
Requirement:
1. The event is first published to main-topic.
2. If processing fails, it is pushed to retry-topic-2s after 2 s. This retry topic has its own processing code, separate from the main topic's (and consequently a different listener function).
3. If processing fails in retry-topic-2s, it is pushed to retry-topic-6s after 6 s. This retry topic also has its own processing code, separate from the previous retry topic's.
4. Finally, if processing fails again, the event is pushed to the DLT (dead-letter-topic).
Problem:
How can I push events into a custom topic with a delay (without using Thread.sleep)?
I have tried a ContainerCustomizer that directs messages to retry-topic-2s immediately, without any delay, and applies the retry backoffs in that topic. But this does not satisfy the requirement.
    private ContainerCustomizer<String, String, ConcurrentMessageListenerContainer<String, String>> customizer(
            KafkaTemplate<String, String> template) {
        return container -> {
            ExponentialBackOff exponentialBackOff = new ExponentialBackOff(retryFixedBackoff, retryMultiplier);
            exponentialBackOff.setMaxElapsedTime(retryMaxElapsedTime);
            if (container.getContainerProperties().getTopics()[0].equals(callbackRetryTopic)) {
                container.setErrorHandler(new SeekToCurrentErrorHandler(
                        new DeadLetterPublishingRecoverer(template,
                                (cr, ex) -> new TopicPartition(callbackDeadLetterTopic, cr.partition())),
                        exponentialBackOff));
            }
        };
    }
Can someone help me with this use case?
This is non-trivial; there is a lot of code involved in supporting it. It is now supported natively by all supported versions of Spring for Apache Kafka:
https://docs.spring.io/spring-kafka/docs/current/reference/html/#retry-topic
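The requirement in the question can be written down as data, separate from whatever mechanism enforces the delay. The following is a minimal, JDK-only sketch of just the routing decision. `RetryRoute`, `Hop`, `nextAfterFailure`, and `dueAt` are hypothetical names; the sketch does not reproduce the framework's retry-topic feature and performs no delayed delivery. It only records, for each topic, where a failed record goes next and the earliest time it should be processed there.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.List;
import java.util.Optional;

final class RetryRoute {

    /** One stage of the chain: a topic and the delay applied before processing on it. */
    static final class Hop {
        final String topic;
        final Duration delay;

        Hop(String topic, Duration delay) {
            this.topic = topic;
            this.delay = delay;
        }

        /** Earliest time a record that failed at failedAt should be processed on this hop. */
        Instant dueAt(Instant failedAt) {
            return failedAt.plus(delay);
        }
    }

    // Topic names and delays follow the question: 2 s, then 6 s, then the DLT.
    private static final List<Hop> CHAIN = List.of(
            new Hop("main-topic", Duration.ZERO),
            new Hop("retry-topic-2s", Duration.ofSeconds(2)),
            new Hop("retry-topic-6s", Duration.ofSeconds(6)),
            new Hop("dead-letter-topic", Duration.ZERO));

    /** Hop to publish to after a failure on currentTopic; empty for the DLT or an unknown topic. */
    static Optional<Hop> nextAfterFailure(String currentTopic) {
        for (int i = 0; i < CHAIN.size() - 1; i++) {
            if (CHAIN.get(i).topic.equals(currentTopic)) {
                return Optional.of(CHAIN.get(i + 1));
            }
        }
        return Optional.empty();
    }
}
```

Carrying a due time with the record, rather than sleeping before publishing, lets the consumer of the retry topic decide when the record is ready.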

Can I use CompletableFuture.runAsync inside a spring kafka batch listener?

Consider the following code:
    @KafkaListener(..)
    public void receive(
            List<ConsumerRecord<String, String>> records,
            Acknowledgment ack) {
        // Each record is handed to an async task; nothing waits for the tasks to finish.
        records.forEach(r -> CompletableFuture.runAsync(() -> ConsumerService.process(r)));
        ack.acknowledge();
    }
What are the pitfalls? Is this good code?
My process method republishes to Kafka if it fails, so I can commit whether or not an error occurs...
You run the risk of losing messages if there is a failure (server crash, power failure, etc.), because you are committing the offsets before the async tasks complete.
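A sketch of one alternative, assuming it is acceptable to block the listener thread until the whole batch is done: collect the futures, wait for all of them, and only then acknowledge. `AckAfterAsync` and `processThenAck` are hypothetical names, and the `Runnable` stands in for `ack.acknowledge()`; this is JDK-only code, not Spring Kafka API.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Consumer;

final class AckAfterAsync {

    /** Runs process for every record asynchronously and calls ack only after all have finished. */
    static <T> void processThenAck(List<T> records, Consumer<T> process, Runnable ack) {
        CompletableFuture<?>[] tasks = records.stream()
                .map(r -> CompletableFuture.runAsync(() -> process.accept(r)))
                .toArray(CompletableFuture[]::new);

        // Blocks until every task has completed; throws CompletionException if any failed.
        CompletableFuture.allOf(tasks).join();

        // Reached only when all tasks completed normally.
        ack.run();
    }
}
```

If any task fails, `join()` throws and the acknowledgment is skipped, so in this sketch the offsets for that batch would not be committed.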

Why does Vert.x throw a warning even with the blocking attribute?

I have a Quarkus application where I use the event bus.
The code in question looks like this:
    @ConsumeEvent(value = "execution-request", blocking = true)
    @Transactional
    @TransactionConfiguration(timeout = 3600)
    public void consume(final Message<ExecutionRequest> msg) {
        try {
            execute(...);
        } catch (final Exception e) {
            // some logging
        }
    }
    private void execute(...)
            throws InterruptedException {
        // it actually runs a long running task, but for
        // this example this has the same effect
        Thread.sleep(65000);
    }
Why do I still get the following warning?
WARN [io.ver.cor.imp.BlockedThreadChecker] (vertx-blocked-thread-checker) Thread Thread[vert.x-worker-thread-0,5,main] has been blocked for 63066 ms, time limit is 60000 ms: io.vertx.core.VertxException: Thread blocked
Am I doing something wrong? Is the blocking parameter of the ConsumeEvent annotation not enough to have the handler run on a separate worker?
Your annotation is working as designed; the method is running in a worker thread. You can tell both by the thread name, "vert.x-worker-thread-0", and by the 60-second time limit that elapsed before the warning was logged. The event loop thread has a much shorter limit (2 seconds by default in Vert.x).
The default Vert.x worker thread pool is not designed for "very" long running blocking code, as stated in their docs:
Warning:
Blocking code should block for a reasonable amount of time (i.e no more than a few seconds). Long blocking operations or polling operations (i.e a thread that spin in a loop polling events in a blocking fashion) are precluded. When the blocking operation lasts more than the 10 seconds, a message will be printed on the console by the blocked thread checker. Long blocking operations should use a dedicated thread managed by the application, which can interact with verticles using the event-bus or runOnContext
That message says that blocking for more than 10 seconds triggers a warning, but I think that's a typo; the default limit for worker threads is actually 60 seconds.
To avoid the warning, you'll need to create a dedicated WorkerExecutor (via vertx.createSharedWorkerExecutor) configured with a very high maxExecuteTime. However, it does not appear that you can tell the @ConsumeEvent annotation to use it instead of the default worker pool, so you'd need to create an event bus consumer manually as well, or use a regular @ConsumeEvent annotation and call workerExecutor.executeBlocking inside it.

ChangeFeedProcessorBuilder checkpointing after unsuccessful processing

I was investigating the behavior of a processor (processor1) built with ChangeFeedProcessorBuilder when its delegate throws an exception or the host goes down while processing a particular change. Upon recovery, the same change is not picked up again. Is there any way to checkpoint only after the notification has been processed successfully?
The delegate is as follows:
    var builder = container.GetChangeFeedProcessorBuilder("migrationProcessor",
        (IReadOnlyCollection<object> input, CancellationToken cancellationToken) =>
        {
            Console.WriteLine(input.Count + " Changes Received by " + a);
            // just the first try will fail (a is a static variable)
            if (a++ == 0)
            {
                throw new Exception();
            }
            return Task.CompletedTask;
        });
Thank you!
The default behavior of the Change Feed Processor is to checkpoint after a successful delegate execution: https://learn.microsoft.com/azure/cosmos-db/change-feed-processor#processing-life-cycle
The normal life cycle of a host instance is:
1. Read the change feed.
2. If there are no changes, sleep for a predefined amount of time (customizable with WithPollInterval in the Builder) and go to #1.
3. If there are changes, send them to the delegate.
4. When the delegate finishes processing the changes successfully, update the lease store with the latest processed point in time and go to #1.
If your delegate handler throws an unhandled exception, there is no checkpoint.
Adding from the comments: the only scenario in which the batch might not be retried is when the batch that throws is the first one ever (the lease has no Continuation), because when the host picks up the lease again to reprocess, it has no point in time to retry from. Based on the official documentation, a lease is owned by a single instance, so no other instance could have picked up the same lease and be processing it in parallel (within the same Deployment Unit context).
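The life cycle described above can be illustrated with a small simulation. This is a JDK-only sketch, not the Cosmos DB SDK: `CheckpointAfterSuccess` and `pollOnce` are hypothetical names, an integer index stands in for the lease's continuation, and the first-batch exception mentioned in the comments is not modeled. It shows only that a batch whose delegate throws is delivered again, because the checkpoint moves only after the delegate returns normally.

```java
import java.util.List;
import java.util.function.Consumer;

final class CheckpointAfterSuccess {

    // Index of the next unprocessed change; stands in for the lease's continuation.
    private int checkpoint = 0;

    int checkpoint() {
        return checkpoint;
    }

    /**
     * One pass of the life cycle: read, hand the batch to the delegate, and
     * checkpoint only if the delegate returns normally.
     * Returns true when the checkpoint advanced.
     */
    boolean pollOnce(List<String> feed, int batchSize, Consumer<List<String>> delegate) {
        if (checkpoint >= feed.size()) {
            return false; // no changes: a real host would sleep and poll again
        }
        int end = Math.min(checkpoint + batchSize, feed.size());
        List<String> batch = List.copyOf(feed.subList(checkpoint, end));
        try {
            delegate.accept(batch);
        } catch (RuntimeException e) {
            return false; // unhandled exception: no checkpoint, same batch on the next pass
        }
        checkpoint = end;
        return true;
    }
}
```

With a delegate that throws on its first invocation only, the first pass leaves the checkpoint where it was and the second pass receives the same batch again.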

How to move ASP.NET Core request execution from the initial pooled thread to a non-pooled thread?

Consider the normal scenario in which an ASP.NET Core Web API application executes a controller action. Instead of doing all the work on the same thread pool thread until the response is created, I would like to use non-pooled threads (ideally pre-created) for the main work: either start one of these threads from the action's initial pooled thread, freeing the pooled thread to serve other incoming requests, or pass the job to a pre-created non-pooled thread.
Among other reasons, the main reason for wanting these non-pooled, long-running threads is that some requests may be prioritized while the threads of others are put on hold (synchronized). Holding them on non-pooled threads would not block new incoming API requests through thread pool starvation. Older requests on hold could later be woken up and rejected, with some sort of callback to the thread pool to return the web response to the client.
In summary, the ideal solution would use a synchronization mechanism (like .NET's RegisterWaitForSingleObject) in which the pooled thread hooks onto the wait handle but is freed up for other thread pool work, while a non-pooled thread is created or reused to carry on the execution, ideally taken from a list of pre-created, idle non-pooled threads.
It seems that async-await only works with Tasks and threads from the .NET thread pool, not with other threads. Also, most techniques for creating non-pooled threads do not allow the pooled thread to be freed and returned to the pool.
Any ideas? I'm using .NET Core and the latest versions of the tools and frameworks.
Thank you for the comments provided. The suggestion to look at TaskCompletionSource was fundamental. My goal was to accept potentially hundreds or thousands of API requests in ASP.NET Core while serving only a portion of them at any given time (due to backend constraints), choosing which ones to serve first and holding the others until the backends are free, or rejecting them later. Doing all this with thread pool threads is a bad idea: they would be blocked or held, and accepting thousands of requests in a short time would make the thread pool grow.
The design goal was for request jobs to move their processing from the ASP.NET threads to non-pooled threads. I plan to have these pre-created in reasonable numbers to avoid the overhead of creating them all the time. These threads implement a generic request processing engine and can be reused for subsequent requests. Blocking these threads to manage request prioritization (using synchronization) is not a problem: most of them will not be using CPU at any given time, and the memory footprint is manageable. Most importantly, thread pool threads are used only at the very start of the request and released right away, and are used again only once the request is completed, to return the response to the remote client.
The solution is to create a TaskCompletionSource object and pass it to an available non-pooled thread that processes the request. This can be done by queuing the request data together with the TaskCompletionSource object on the right queue, depending on the type of service and the priority of the client, or by passing it to a newly created thread if none is available. The ASP.NET controller action awaits TaskCompletionSource.Task; once the main processing thread sets the result on this object, the rest of the controller action is executed by a pooled thread, which returns the response to the client. Meanwhile, the main processing thread can either terminate or go and fetch more request jobs from the queues.
    using System;
    using System.Threading;
    using System.Threading.Tasks;
    using Microsoft.AspNetCore.Mvc;
    namespace MyApi.Controllers
    {
        [Route("api/[controller]")]
        public class ValuesController : Controller
        {
            public static readonly object locker = new object();
            public static DateTime time;
            public static volatile TaskCompletionSource<string> tcs;
            // GET api/values
            [HttpGet]
            public async Task<string> Get()
            {
                time = DateTime.Now;
                ShowThreads("Starting Get Action...");
                // Using await frees the pooled thread until a Task result is available. It basically
                // returns a Task to ASP.NET, which is a "promise" to have a result in the future.
                string result = await CreateTaskCompletionSource();
                // This code is only executed once a Task result is available: the non-pooled thread
                // completes processing and signals (TrySetResult) the TaskCompletionSource object.
                ShowThreads($"Signaled... Result: {result}");
                Thread.Sleep(2_000);
                ShowThreads("End Get Action!");
                return result;
            }
            public static Task<string> CreateTaskCompletionSource()
            {
                ShowThreads($"Start Task Completion...");
                string data = "Data";
                tcs = new TaskCompletionSource<string>();
                // Create a non-pooled thread (LongRunning). Alternatively, place the job data into a
                // queue or similar and do not create a thread, because the threads would already have
                // been pre-created and be waiting for jobs from the queues. The point is that it is
                // not mandatory to create a thread here.
                Task.Factory.StartNew(s => Workload(data), tcs,
                    CancellationToken.None, TaskCreationOptions.LongRunning, TaskScheduler.Default);
                ShowThreads($"Task Completion created...");
                return tcs.Task;
            }
            public static void Workload(object data)
            {
                // This Sleep is here to give some time to show that the ASP.NET pooled
                // thread was freed and went back to the pool when the workload starts.
                Thread.Sleep(100);
                ShowThreads($"Started Workload... Data is: {(string)data}");
                Thread.Sleep(10_000);
                ShowThreads($"Going to signal...");
                // Signal the TaskCompletionSource that work has finished, which will force a pooled
                // thread to be scheduled to execute the final part of the ASP.NET controller action.
                // tcs.TrySetResult("Done!");
                Task.Run((() => tcs.TrySetResult("Done!")));
                // The only reason TrySetResult is wrapped in a task is to free this non-pooled thread
                // immediately; otherwise the following line would only be executed after ASP.NET had
                // finished processing the response. This briefly activates a pooled thread just to
                // execute TrySetResult. If it is acceptable to wait for ASP.NET to complete the
                // response, we can do it synchronously and avoid using another pooled thread.
                Thread.Sleep(1_000);
                ShowThreads("End Workload");
            }
            public static void ShowThreads(string message = null)
            {
                int maxWorkers, maxIos, minWorkers, minIos, freeWorkers, freeIos;
                lock (locker)
                {
                    double elapsed = DateTime.Now.Subtract(time).TotalSeconds;
                    ThreadPool.GetMaxThreads(out maxWorkers, out maxIos);
                    ThreadPool.GetMinThreads(out minWorkers, out minIos);
                    ThreadPool.GetAvailableThreads(out freeWorkers, out freeIos);
                    Console.WriteLine($"Used WT: {maxWorkers - freeWorkers}, Used IoT: {maxIos - freeIos} - "+
                        $"+{elapsed.ToString("0.000 s")} : {message}");
                }
            }
        }
    }
I have included the whole sample code so that anyone can create an ASP.NET Core API project and test it without any changes. Here is the resulting output:
MyApi> Now listening on: http://localhost:23145
MyApi> Application started. Press Ctrl+C to shut down.
MyApi> Used WT: 1, Used IoT: 0 - +0.012 s : Starting Get Action...
MyApi> Used WT: 1, Used IoT: 0 - +0.015 s : Start Task Completion...
MyApi> Used WT: 1, Used IoT: 0 - +0.035 s : Task Completion created...
MyApi> Used WT: 0, Used IoT: 0 - +0.135 s : Started Workload... Data is: Data
MyApi> Used WT: 0, Used IoT: 0 - +10.135 s : Going to signal...
MyApi> Used WT: 2, Used IoT: 0 - +10.136 s : Signaled... Result: Done!
MyApi> Used WT: 1, Used IoT: 0 - +11.142 s : End Workload
MyApi> Used WT: 1, Used IoT: 0 - +12.136 s : End Get Action!
As you can see, the pooled thread runs until the await on the TaskCompletionSource task, and by the time the Workload starts processing the request on the non-pooled thread, zero ThreadPool threads are in use; none are used for the entire duration of the processing. When Task.Run executes TrySetResult, it uses a pooled thread for a brief moment to trigger the rest of the controller action code, which is why the worker thread count is 2 for a moment. A fresh pooled thread then runs the rest of the ASP.NET controller action and finishes with the response.
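For readers who think in JVM terms, the queue-based variant of this hand-off (queuing the request data together with the completion object) can be sketched in Java, where `CompletableFuture` plays a role loosely comparable to `TaskCompletionSource`. This is an illustrative analogue, not a translation of the sample above: `DedicatedWorker`, `Job`, and `submit` are hypothetical names, and the "processing" is just string concatenation.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.LinkedBlockingQueue;

final class DedicatedWorker {

    /** A queued job: the request data plus the future the caller is waiting on. */
    private static final class Job {
        final String data;
        final CompletableFuture<String> result;

        Job(String data, CompletableFuture<String> result) {
            this.data = data;
            this.result = result;
        }
    }

    private final BlockingQueue<Job> queue = new LinkedBlockingQueue<>();
    private final Thread thread;

    /** Pre-creates one non-pooled thread that waits for jobs. */
    DedicatedWorker(String name) {
        thread = new Thread(this::loop, name);
        thread.setDaemon(true);
        thread.start();
    }

    /** Called by the request thread; returns immediately with a future to wait on. */
    CompletableFuture<String> submit(String data) {
        CompletableFuture<String> result = new CompletableFuture<>();
        queue.add(new Job(data, result));
        return result;
    }

    private void loop() {
        try {
            while (true) {
                Job job = queue.take(); // blocks this dedicated thread only
                try {
                    job.result.complete(Thread.currentThread().getName() + ":" + job.data);
                } catch (RuntimeException e) {
                    job.result.completeExceptionally(e);
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // stop the loop when interrupted
        }
    }
}
```

In this sketch, a continuation attached with `thenApplyAsync` would be handed to a pool thread, which is roughly the role `Task.Run(() => tcs.TrySetResult(...))` plays in the sample above; a plain `thenApply` registered before completion would run on the worker thread instead.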

Resources