@KafkaListener skipping messages when using Acknowledgment.nack() after RecordInterceptor changed consumer record - spring-kafka

Assume the following RecordInterceptor, which simply returns a copy of the received consumer record:
class CustomRecordInterceptor : RecordInterceptor<Any, Any> {
    override fun intercept(record: ConsumerRecord<Any, Any>): ConsumerRecord<Any, Any>? {
        return with(record) {
            ConsumerRecord(
                topic(),
                partition(),
                offset(),
                timestamp(),
                timestampType(),
                checksum(),
                serializedKeySize(),
                serializedValueSize(),
                key(),
                value(),
                headers(),
                leaderEpoch())
        }
    }
}
With such an interceptor in place, we experience lost records with the following Kafka listener.
Note: record is the result returned by the interceptor.
@KafkaListener(topics = ["topic"])
fun listenToEvents(
    record: ConsumerRecord<SpecificRecordBase, SpecificRecordBase?>,
    ack: Acknowledgment
) {
    if (shouldNegativelyAcknowledge()) {
        ack.nack(2_000L)
        return
    }
    processRecord(record)
    ack.acknowledge()
}
Whenever shouldNegativelyAcknowledge() is true, we would expect that record to be reprocessed by the listener after more than 2 seconds. We are using ackMode = MANUAL.
What we see instead is that the skipped record is never reprocessed by the listener: processRecord is never invoked for that record, and after a while the consumer group has a lag of 0.
While debugging, we found this code block in KafkaMessageListenerContainer.ListenerConsumer#handleNack:
if (next.equals(record) || list.size() > 0) {
    list.add(next);
}
Here, next is the record after the interceptor treatment (so it's the copy of the original record), and record is the record before the interceptor treatment.
Note that next and record can never be equal, because ConsumerRecord does not override equals.
Could this be the cause of the unexpectedly skipped records - maybe even a bug?
Or is it a misuse of the record interceptor to return a different ConsumerRecord object that is not equal to the original?

It's a bug, and it does explain why the remaining records are not sent to the listener. Please open an issue on GitHub:
https://github.com/spring-projects/spring-kafka/issues
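Until that is fixed, a possible workaround - a minimal sketch, not an official recommendation - is to keep returning the same ConsumerRecord instance from the interceptor and, where the use case allows it, apply changes through the record's mutable Headers instead of copying the record; handleNack's next.equals(record) check then holds because both references point at the same object. The interceptor name and header key below are made up:
import org.apache.kafka.clients.consumer.ConsumerRecord
import org.springframework.kafka.listener.RecordInterceptor

class InPlaceRecordInterceptor : RecordInterceptor<Any, Any> {
    override fun intercept(record: ConsumerRecord<Any, Any>): ConsumerRecord<Any, Any> {
        // Kafka's Headers are mutable, so the record can be changed in place:
        // the container still sees the identical instance, and the nack()
        // bookkeeping in handleNack keeps working.
        record.headers().add("intercepted", byteArrayOf(1)) // hypothetical marker header
        return record
    }
}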

Related

I don't understand the order in which code executes when calling onAppear

I have been getting this problem a few times now when I'm coding, and I think I just don't understand the order in which SwiftUI executes code.
I have a method in my context model that gets data from Firebase, which I call in .onAppear. But the method doesn't execute the last line of the method after running the whole for loop.
And when I set breakpoints in different places, it seems that the code first just runs through without executing the for loop, then it returns to the method and does one run of the for loop, then it jumps to some other strange place, and then back to the method again...
I guess I just don't get it?
Does it have something to do with the main/background thread? Can you help me?
Here is my code.
This is the part of my UI view that calls the method getTeachersAndCoursesInSchool:
VStack {
    //Title
    Text("Settings")
        .font(.title)
    Spacer()
    NavigationView {
        VStack {
            NavigationLink {
                ManageCourses()
                    .onAppear {
                        model.getTeachersAndCoursesInSchool()
                    }
            } label: {
                ZStack {
                    // ...
                }
            }
        }
    }
}
Here is the for-loop of my method:
//Get a reference to the teacher list of the school
let teachersInSchool = schoolColl.document("TeacherList")
//Get teacherlist document data
teachersInSchool.getDocument { docSnapshot, docError in
    if docError == nil && docSnapshot != nil {
        //Create temporary modelArr to append teachermodel to
        var tempTeacherAndCoursesInSchoolArr = [TeacherModel]()
        //Loop through all FB teachers collections in local array and get their teacherData
        for name in teachersInSchoolArr {
            //Get reference to each teachers data document and get the document data
            schoolColl.document("Teachers").collection(name).document("Teacher data").getDocument { teacherDataSnapshot, teacherDataError in
                //check for error in getting snapshot
                if teacherDataError == nil {
                    //Load teacher data from FB
                    //check for snapshot is not nil
                    if let teacherDataSnapshot = teacherDataSnapshot {
                        do {
                            //Set local variable to teacher data
                            let teacherData: TeacherModel = try teacherDataSnapshot.data(as: TeacherModel.self)
                            //Append teacher to total contentmodel array of teacherdata
                            tempTeacherAndCoursesInSchoolArr.append(teacherData)
                        } catch {
                            //Handle error
                        }
                    }
                } else {
                    //TODO: Error in loading data, handle error
                }
            }
        }
        //Assign all teachers and their courses to contentmodel data
        self.teacherAndCoursesInSchool = tempTeacherAndCoursesInSchoolArr
    } else {
        //TODO: handle error in fetching teacher data
    }
}
The method assigns data correctly to tempTeacherAndCoursesInSchoolArr, but it doesn't assign tempTeacherAndCoursesInSchoolArr to self.teacherAndCoursesInSchool in the last line. Why doesn't it do that?
Most of Firebase's API calls are asynchronous: when you ask Firestore to fetch a document for you, it needs to communicate with the backend, and - even on a fast connection - that will take some time.
To deal with this, you can use two approaches: callbacks and async/await. Both work fine, but you might find that async/await is easier to read. If you're interested in the details, check out my blog post Calling asynchronous Firebase APIs from Swift - Callbacks, Combine, and async/await | Peter Friese.
In your code snippet, you use a completion handler for handling the document that getDocument returns once the asynchronous call completes:
schoolColl.document("Teachers").collection(name).document("Teacher data").getDocument { teacherDataSnapshot, teacherDataError in
// ...
}
However, the code that assigns tempTeacherAndCoursesInSchoolArr to self.teacherAndCoursesInSchool is outside of the completion handler, so it runs before the completion handler is even called.
You can fix this in a couple of ways:
Use Swift's async/await for fetching the data, and then use a task group (see Paul's excellent article about how they work) to fetch all the teachers' data in parallel and aggregate it once all the data has been received.
You might also want to consider using a collection group query - it seems like your data is structured in a way that should make this possible.
Generally, iterating over the elements of a collection and performing a Firestore query for each element is considered bad practice, as it drags down the performance of your app: it performs N+1 network requests when it could instead send one single network request (using a collection group query).

Kotlin Execution order and result issues

I'm trying to get all the files from Firebase Storage through listAll. Here is my code:
storageReference.listAll().addOnSuccessListener { listResult ->
    val image_task: FileDownloadTask
    for (fileRef in listResult.items) {
        fileRef.downloadUrl.addOnSuccessListener { Uri ->
            image_list.add(Uri.toString())
            println("size1 : " + image_list.size)
        }
    }
    println("size2 : " + image_list.size)
} //addOnSuccessListener
(Screenshot of the log output: the "size2" line is printed before any of the "size1" lines.)
Why is the execution order like this?
How do I solve it?
When you add a listener or callback to something, the code inside the listener will not be called until sometime later. Everything else in the current function will happen first.
You are adding listeners for each item using your for loop. No code in the listeners is running yet. Then your "size2" println call is made after the for loop. At some later time, all your listeners will fire.
If you want asynchronous code like this to be written sequentially, then you need to use coroutines. That's a huge topic, but your code would look something like this (but probably a little more involved than this if you want to properly handle errors). I'm using lifecycleScope from an Android Activity or Fragment for this example. If you're not on Android, you need to use some other CoroutineScope.
The calls to await() are an alternative to adding success and failure listeners. await() suspends the coroutine and then returns a result or throws an exception on failure.
lifecycleScope.launch {
    val results = try {
        storageReference.listAll().await()
    } catch (e: Exception) {
        println("Failed to get list: ${e.message}")
        return@launch
    }
    val uris = try {
        results.items.map { it.downloadUrl.await().toString() }
    } catch (e: Exception) {
        println("Failed to get at least one URI: ${e.message}")
        return@launch
    }
    image_list.addAll(uris)
}
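One note on the sketch above: results.items.map { it.downloadUrl.await() } fetches the URLs one after another. If you only need all the results at the end, a variant of the same idea (my sketch, reusing the lifecycleScope and image_list assumptions from above) can start all the fetches concurrently with async and gather them with awaitAll():
lifecycleScope.launch {
    try {
        val listResult = storageReference.listAll().await()
        // Start every downloadUrl fetch at once; awaitAll() suspends until all
        // of them finish and rethrows the first failure, if any.
        val uris = listResult.items
            .map { fileRef -> async { fileRef.downloadUrl.await().toString() } }
            .awaitAll()
        image_list.addAll(uris)
    } catch (e: Exception) {
        println("Failed to load image URLs: ${e.message}")
    }
}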
There is nothing wrong with the execution order here.
fileRef.downloadUrl.addOnSuccessListener { Uri ->
downloadUrl is an asynchronous action, which means the code doesn't wait for the action to actually complete before moving along.
You receive the result in the success listener (at least in this case).
If you want to deal with it in a sequential way, look at coroutines.

DocumentDB select document at specific index

Is it possible to select a document at a specific index?
I have a document import process: I get a page of items from my data source (250 items at once) and then import them into DocumentDB concurrently. If I get an error inserting these items into DocumentDB, I won't be sure which individual item (or items) failed. (I could work it out, but I don't want to.) It would be easier to just upsert all the items from the page again.
The items I'm inserting have an ascending id. So if I query DocumentDB (ordered by id) and select the id at position (count of all ids - page size), I can start importing from that point forward again.
I know SKIP is not implemented; I want to check whether there is another option.
You could try a bulk import stored procedure. The sproc creation code below is from Azure's GitHub repo. This sproc reports back the number of docs created in the batch and continues trying to create docs in multiple batches if the sproc times out.
Since the sproc is ACID, you will have to retry from the beginning (or the last successful batch) if any exceptions are thrown.
You could also change the createDocument function to upsertDocument if you just want to retry the entire batch process whenever an exception is thrown.
{
    id: "bulkImportSproc",
    body: function bulkImport(docs) {
        var collection = getContext().getCollection();
        var collectionLink = collection.getSelfLink();

        // The count of imported docs, also used as current doc index.
        var count = 0;

        // Validate input.
        if (!docs) throw new Error("The array is undefined or null.");

        var docsLength = docs.length;
        if (docsLength == 0) {
            getContext().getResponse().setBody(0);
            return;
        }

        // Call the CRUD API to create a document.
        tryCreate(docs[count], callback);

        // Note that there are 2 exit conditions:
        // 1) The createDocument request was not accepted.
        //    In this case the callback will not be called, we just call setBody and we are done.
        // 2) The callback was called docs.length times.
        //    In this case all documents were created and we don't need to call tryCreate anymore. Just call setBody and we are done.
        function tryCreate(doc, callback) {
            var isAccepted = collection.createDocument(collectionLink, doc, callback);

            // If the request was accepted, callback will be called.
            // Otherwise report current count back to the client,
            // which will call the script again with the remaining set of docs.
            // This condition will happen when this stored procedure has been running too long
            // and is about to get cancelled by the server. This will allow the calling client
            // to resume this batch from the point we got to before isAccepted was set to false.
            if (!isAccepted) getContext().getResponse().setBody(count);
        }

        // This is called when collection.createDocument is done and the document has been persisted.
        function callback(err, doc, options) {
            if (err) throw err;

            // One more document has been inserted, increment the count.
            count++;

            if (count >= docsLength) {
                // If we have created all documents, we are done. Just set the response.
                getContext().getResponse().setBody(count);
            } else {
                // Create next document.
                tryCreate(docs[count], callback);
            }
        }
    }
}
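As a client-side counterpart to the comments in the sproc (report the count, let the caller resume with the remaining docs), a rough sketch of the resume loop might look like this. executeBulkImportSproc is a hypothetical wrapper for whatever SDK call you use to execute the stored procedure and read back the count it set via setBody():
// Hypothetical wrapper: runs bulkImportSproc with the given docs and returns
// the number of docs the sproc reported through setBody().
suspend fun executeBulkImportSproc(docs: List<Any>): Int =
    TODO("execute the stored procedure via your DocumentDB SDK")

suspend fun importPage(pageDocs: List<Any>) {
    var created = 0
    // The sproc may stop early when the server is about to cancel it; it then
    // reports how many docs it created, so the client resumes with the tail.
    // With upsertDocument instead of createDocument, retrying a tail after an
    // exception is also safe.
    while (created < pageDocs.size) {
        created += executeBulkImportSproc(pageDocs.subList(created, pageDocs.size))
    }
}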

How to make multiple API requests with RxJava and combine them?

I have to make N REST API calls and combine the results of all of them, or fail if at least one of the calls failed (returned an error or timed out).
I want to use RxJava, and I have some requirements:
Be able to configure a retry of each individual API call under some circumstances. I mean, if I have retry = 2 and I make 3 requests, each one has to be retried at most 2 times, with at most 6 retries in total.
Fail fast! If one API call has failed N times (where N is the configured number of retries), it doesn't matter that the remaining requests haven't finished; I want to return an error.
If I wish to make all the requests with a single thread, I would need an async HTTP client, wouldn't I?
Thanks.
You could use the zip operator to zip all requests together once they end, and check there whether all of them succeeded.
private Scheduler scheduler;
private Scheduler scheduler1;
private Scheduler scheduler2;

/**
 * Since every observable in the zip is created to subscribeOn a different thread,
 * it means all of them will run in parallel.
 * By default Rx is not async, only if you explicitly use subscribeOn.
 */
@Test
public void testAsyncZip() {
    scheduler = Schedulers.newThread();
    scheduler1 = Schedulers.newThread();
    scheduler2 = Schedulers.newThread();
    long start = System.currentTimeMillis();
    Observable.zip(obAsyncString(), obAsyncString1(), obAsyncString2(),
                   (s, s2, s3) -> s.concat(s2).concat(s3))
              .subscribe(result -> showResult("Async in:", start, result));
}
private Observable<String> obAsyncString() {
    return Observable.just("Request1")
                     .observeOn(scheduler)
                     .doOnNext(val -> System.out.println("Thread " + Thread.currentThread().getName()))
                     .map(val -> "Hello");
}

private Observable<String> obAsyncString1() {
    return Observable.just("Request2")
                     .observeOn(scheduler1)
                     .doOnNext(val -> System.out.println("Thread " + Thread.currentThread().getName()))
                     .map(val -> " World");
}

private Observable<String> obAsyncString2() {
    return Observable.just("Request3")
                     .observeOn(scheduler2)
                     .doOnNext(val -> System.out.println("Thread " + Thread.currentThread().getName()))
                     .map(val -> "!");
}
In this example we just concat the results, but instead of doing that, you can check the results and apply your business logic there.
You can also consider merge or concat.
You can take a look at more examples here: https://github.com/politrons/reactive
I would suggest using an Observable to wrap all the calls.
Let's say you have your function to call the API:
fun restAPIcall(request: Request): Single<HttpResponse>
And you want to call this n times. I am assuming that you want to call it with a list of values:
val valuesToSend: List<Request>

Observable
    .fromIterable(valuesToSend)
    .flatMapSingle { valueToSend: Request ->
        restAPIcall(valueToSend)
    }
    .toList() // This converts: Observable<Response> -> Single<List<Response>>
    .map { responses: List<Response> ->
        // Do something with the responses
    }
So with this you can call the REST API for each element of your list and get the results as a list.
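If you prefer the zip semantics from the first answer (results combined in request order, failing as soon as any source fails), note that Single.zip also accepts an Iterable of sources, so it scales to an N known only at runtime. A sketch reusing restAPIcall and valuesToSend from above:
val calls: List<Single<HttpResponse>> = valuesToSend.map { restAPIcall(it) }

Single.zip(calls) { results ->
    // zip hands the results back as Array<Any>, in the same order as `calls`.
    results.map { it as HttpResponse }
}.subscribe(
    { responses -> /* all N calls succeeded; combine them here */ },
    { error -> /* at least one call failed */ }
)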
The other problem is the retries. You said you want to retry when an individual cap is reached. This is tricky; I believe there is nothing out of the box in RxJava for this.
You can use retry(n), which retries n times in total, but that is not what you wanted.
There's also retryWhen { error -> ... }, where you can do something given an exception, but you wouldn't know which element threw the error (unless you add the element to the exception, I think).
I have not used the retries before; nevertheless, it seems that they retry the whole observable, which is not ideal.
My first approach would be something like the following, where you keep a count for each element in a dictionary (or something like that) and only retry while no single element has exceeded your limit. This means you have to keep a counter and check each time whether any of the elements has exceeded it.
val counter = valuesToSend.associateWith { 0 }.toMutableMap()

yourObservable
    .map { value ->
        counter[value] = (counter[value] ?: 0) + 1 // Update the counter
        value // Return the value again so you can use it later for the API call
    }
    .map { restAPIcall(it) }
    // Find a way to take yourObservable and re-add the element if it doesn't exceed
    // your limit (maybe in an `onErrorResumeNext` or something). Else throw the error.
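For the per-call retry requirement there is a simpler option than a manual counter (a sketch, again assuming restAPIcall and valuesToSend from above): apply retry(n) to each inner Single rather than to the outer stream. Each call then retries independently, and since flatMapSingle propagates the first error immediately, exhausting one call's retries fails the whole stream - the fail-fast behaviour you asked for:
Observable
    .fromIterable(valuesToSend)
    .flatMapSingle { request ->
        restAPIcall(request)
            // Retries only THIS call: up to 2 extra attempts (3 in total).
            .retry(2L)
        // For "under some circumstances", use the predicate overload instead:
        // .retry { count, error -> count <= 2 && error is IOException }
    }
    .toList()
    .subscribe(
        { responses -> /* every call eventually succeeded */ },
        { error -> /* one call failed after its own retries: fail fast */ }
    )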

Sending multiple messages in the same saga

Here's my scenario:
A modal fires that sends a message to NServiceBus; this modal can fire x times, BUT I only need to send the latest message. I could do this using multiple sagas (one per message), but for cleanliness I want to do it in one saga.
Here's my Bus.Send:
busService.Send(new PendingMentorEmailCommand()
{
    PendingMentorEmailCommandId = mentorshipData.CandidateMentorMenteeMatchID,
    MentorshipData = mentorshipData,
    JobBoardCode = Config.JobBoardCode
});
Command handler:
public void Handle(PendingMentorEmailCommand message)
{
    Data.PendingMentorEmailCommandId = message.PendingMentorEmailCommandId;
    Data.MentorshipData = message.MentorshipData;
    Data.JobBoardCode = message.JobBoardCode;
    RequestTimeout<PendingMentorEmailTimeout>(TimeSpan.FromSeconds(PendingMentorEmailTimeoutValue));
}
Timeout:
public void Timeout(PendingMentorEmailTimeout state)
{
    Bus.Send(new PendingMentorEmailMessage
    {
        PendingMentorEmailCommandId = Data.PendingMentorEmailCommandId,
        MentorshipData = Data.MentorshipData,
        JobBoardCode = Data.JobBoardCode
    });
}
Message handler:
public void Handle(PendingMentorEmailMessage message)
{
    ResendPendingNotification(message);
}
Inside my Resend method, I need to send an email based on a check:
// is there another (newer) message in the queue?
if (currentMentorShipData.DateMentorContacted == message.MentorshipData.DateMentorContacted)
currentMentorShipData is a database pull that gets the values at the time of the message.
So if I run message one at 10:22, I expect it to fire at 10:25 if I do nothing. But when I send a second message at 10:24, I want only one message to fire, at 10:27 (the updated one), and nothing to fire at 10:25, because my if condition should fail at that point. I think what's happening is that the saga data object gets overwritten by the second message, causing both timeouts to fire with DateMentorContacted = 10:24 for both the first and the second message. So my question is: how can I persist each message's data individually?
Let me know if I can explain anything else; I'm new to NServiceBus and have tried to provide as much detail as possible.
Hearing the statement "I only need to send the latest message", I assume that would be true for a given application-specific ID (maybe CandidateMentorMenteeMatchID in your case).
I would use that ID as a correlation ID in your saga, so that you end up with one saga instance per ID.
Next, I'd have the saga itself filter out the unnecessary message sending.
This can be done by having a kind of sequence number that you store on the saga and pass back in the timeout data. Then, in your timeout handler, you can compare the sequence number currently on the saga against the one that came in the timeout data, which tells you whether another message was received during the timeout. Only if the sequence numbers match would you send the message that ultimately causes the email to be sent.
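As a rough, framework-neutral illustration of that sequence-number idea (sketched in Kotlin for brevity; the real implementation would be a C# NServiceBus saga, and every name here is invented):
data class TimeoutState(val sequenceNumber: Long)

class PendingMentorEmailSaga(
    private val requestTimeout: (TimeoutState) -> Unit, // schedules the timeout, carrying state
    private val sendEmailMessage: () -> Unit            // sends the final message
) {
    // Persisted on the saga data, one instance per correlation ID.
    private var sequenceNumber = 0L

    fun handle(/* PendingMentorEmailCommand */) {
        sequenceNumber++ // each new command supersedes the previous one
        requestTimeout(TimeoutState(sequenceNumber))
    }

    fun timeout(state: TimeoutState) {
        // Only the timeout scheduled by the most recent command still matches;
        // timeouts from superseded commands see a newer number and do nothing.
        if (state.sequenceNumber == sequenceNumber) sendEmailMessage()
    }
}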
