Kafka transactional producer example - spring-kafka

I get a message from a source topic. Then I split the message into 3 parts and send each part to 3 different topics.
Now 2 messages are delivered successfully to the first 2 topics, but while sending the 3rd message we get an exception (e.g. ProducerFencedException | OutOfOrderSequenceException | AuthorizationException | RecordLengthException).
How do I roll back / revert the 2 messages already sent to the previous 2 topics?
A full Java code example would be very helpful. I don't want to use producer.initTransactions() kind of methods.
I also referred to this - Transactional Kafka Producer -
but I doubt we really need to write all the @Bean definitions for the producer, template, factory and transaction manager, because those can easily be provided in application.yml.

See the documentation https://docs.spring.io/spring-kafka/docs/current/reference/html/#transactions
Spring Boot will automatically configure a KafkaTransactionManager bean and add it to the listener container factory when the transactionIdPrefix property is set; the container will then start the transaction, and if any of the publishing operations fails, they will all be rolled back.
@Bean
@ConditionalOnProperty(name = "spring.kafka.producer.transaction-id-prefix")
@ConditionalOnMissingBean
public KafkaTransactionManager<?, ?> kafkaTransactionManager(ProducerFactory<?, ?> producerFactory) {
    return new KafkaTransactionManager<>(producerFactory);
}
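For completeness, here is a minimal sketch of the listener side under those assumptions: spring.kafka.producer.transaction-id-prefix is set in application.yml (so the transaction manager above is auto-configured), and the topic names and split logic are placeholders, not part of the original question. All three sends run in the container-started transaction, so if the third one throws, the first two are rolled back and the source offset is not committed.
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.stereotype.Component;

@Component
public class SplittingListener {

    private final KafkaTemplate<String, String> template;

    public SplittingListener(KafkaTemplate<String, String> template) {
        this.template = template;
    }

    @KafkaListener(topics = "source-topic", groupId = "splitter")
    public void listen(String message) {
        // hypothetical split logic - the delimiter and topic names are placeholders
        String[] parts = message.split("\\|", 3);
        this.template.send("topic-1", parts[0]);
        this.template.send("topic-2", parts[1]);
        this.template.send("topic-3", parts[2]); // an exception here rolls back the first two sends
    }
}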

Related

What's the difference between Spring Kafka MessageListenerContainer start/stop and pause/resume? - spring-kafka

for (MessageListenerContainer e : containers) {
    if (!e.isContainerPaused()) {
        e.pause();
        e.stop();
    } else {
        e.resume();
        e.start();
    }
}
In my Spring Boot Kafka project I want to control these containers, but I don't know what the difference is between pause/stop and resume/start.
Thanks.
See documentation for more details: https://docs.spring.io/spring-kafka/docs/current/reference/html/#pause-resume
The pause/resume is, essentially, a KafkaConsumer feature to prevent a rebalance when we cannot keep up within max.poll.interval.ms. See more in its docs: https://medium.com/@vaibhav.vb24/kafka-confluent-pause-and-resume-consumer-cda7305944bf.
The start/stop is a Spring lifecycle management feature. You subscribe to the topic and start consuming from it when start() is called. You stop consuming and unsubscribe from the topic when stop() is called. This way a rebalance happens for the consumer group. See more info in the docs:
https://docs.spring.io/spring-kafka/docs/current/reference/html/#connecting
https://docs.spring.io/spring-framework/docs/current/reference/html/core.html#beans-factory-lifecycle-processor
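To make the difference concrete, here is a small sketch (not part of the original answer), assuming a listener registered with id "myListenerId" and an injected KafkaListenerEndpointRegistry:
MessageListenerContainer container = registry.getListenerContainer("myListenerId");

// pause/resume: the consumer keeps polling and keeps its partition assignment,
// it just stops delivering records - no rebalance happens in the group
container.pause();
container.resume();

// stop/start: the consumer is closed and its partitions are revoked (rebalance);
// start() creates a new consumer that subscribes and joins the group again
container.stop();
container.start();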

How to avoid hitting all retryable topics for fatal-by-default exceptions?

My team is writing a service that leverages the retryable topics mechanism offered by Spring Kafka (version 2.8.2). Here is a subset of the configuration:
@Bean
public ConsumerFactory<String, UploadMessage> consumerFactory() {
    return new DefaultKafkaConsumerFactory<>(
        this.springProperties.buildConsumerProperties(),
        new StringDeserializer(),
        new ErrorHandlingDeserializer<>(new KafkaMessageDeserializer()));
}

@Bean
public RetryTopicConfiguration retryTopicConfiguration(KafkaTemplate<String, Object> kafkaTemplate) {
    final var retry = this.applicationProperties.retry();
    return RetryTopicConfigurationBuilder.newInstance()
        .doNotAutoCreateRetryTopics()
        .suffixTopicsWithIndexValues()
        .maxAttempts(retry.attempts())
        .exponentialBackoff(retry.initialDelay(), retry.multiplier(), retry.maxDelay())
        .dltHandlerMethod(DeadLetterTopicProcessor.ENDPOINT_HANDLER_METHOD)
        .create(kafkaTemplate);
}
KafkaMessageDeserializer is a custom deserialiser that decodes protobuf-encoded messages and may throw a SerializationException in case of a failure. This exception is correctly captured and transformed into a DeserializationException by Spring Kafka. What I find a bit confusing is that the intercepted poison pill message then hits all of the retry topics before eventually reaching the dead letter one. Obviously it fails with exactly the same error at every step.
I know that RetryTopicConfigurationBuilder::notRetryOn may be used to skip the retry attempts for particular exception types, but what if I want to use exactly the same list of exceptions as in ExceptionClassifier::configureDefaultClassifier? Is there a way to programmatically access this information without basically duplicating the code?
That is a good suggestion; it probably should be the default behavior (or at least an option).
Please open a feature request on GitHub.
There is a, somewhat, related discussion here: https://github.com/spring-projects/spring-kafka/discussions/2101
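Until such an option exists, one possible workaround is to pass the fatal-by-default types yourself via notRetryOn, which the question already mentions. This is only a sketch: it assumes the List overload of notRetryOn and the fatal-by-default exception types listed in the 2.8 reference documentation, so it does duplicate that list rather than reading it from ExceptionClassifier:
@Bean
public RetryTopicConfiguration retryTopicConfiguration(KafkaTemplate<String, Object> kafkaTemplate) {
    final var retry = this.applicationProperties.retry();
    return RetryTopicConfigurationBuilder.newInstance()
        .doNotAutoCreateRetryTopics()
        .suffixTopicsWithIndexValues()
        .maxAttempts(retry.attempts())
        .exponentialBackoff(retry.initialDelay(), retry.multiplier(), retry.maxDelay())
        // hand-maintained copy of the documented fatal-by-default exceptions
        .notRetryOn(List.of(
            DeserializationException.class,
            MessageConversionException.class,
            ConversionException.class,
            MethodArgumentResolutionException.class,
            NoSuchMethodException.class,
            ClassCastException.class))
        .dltHandlerMethod(DeadLetterTopicProcessor.ENDPOINT_HANDLER_METHOD)
        .create(kafkaTemplate);
}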

How to pause a specific kafka consumer thread when concurrency is set to more than 1?

I am using spring-kafka 2.2.8 and setting concurrency to 2 as shown below, and I am trying to understand how to pause a consumer thread/instance when a particular condition is met.
@KafkaListener(id = "myConsumerId", topics = "myTopic", concurrency = "2")
public void listen(String in) {
    System.out.println(in);
}
Now, I have two questions.
Would my consumer spawn two different polling threads to poll the records?
If I'm setting an id on the consumer as shown above, how can I pause a specific consumer thread (with concurrency set to more than 1)?
Please suggest.
Use the KafkaListenerEndpointRegistry.getListenerContainer(id) method to get a reference to the container.
Cast it to a ConcurrentMessageListenerContainer and call getContainers() to get a list of the child KafkaMessageListenerContainers; you can then pause/resume them individually.
You can determine which topics/partitions each one has using getAssignedPartitions().
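A minimal sketch of that approach, assuming the listener id "myConsumerId" from the question and an injected KafkaListenerEndpointRegistry; the partition-0 condition is just an illustrative placeholder:
MessageListenerContainer container = registry.getListenerContainer("myConsumerId");
ConcurrentMessageListenerContainer<String, String> concurrent =
    (ConcurrentMessageListenerContainer<String, String>) container;

for (KafkaMessageListenerContainer<String, String> child : concurrent.getContainers()) {
    // pause only the child consumer that currently owns partition 0 of myTopic
    Collection<TopicPartition> assigned = child.getAssignedPartitions();
    if (assigned != null && assigned.stream()
            .anyMatch(tp -> tp.topic().equals("myTopic") && tp.partition() == 0)) {
        child.pause();   // call child.resume() later to continue delivery
    }
}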

TransactionId prefix for producer-only and read-process-write - ProducerFencedException

Background: We have been getting ProducerFencedException in our producer-only transactions, and want to introduce uniqueness to our prefix to prevent this issue.
In this discussion, Gary mentions that in the case of read-process-write, the prefix must be the same in all instances and after each restart.
How to choose Kafka transaction id for several applications, hosted in Kubernetes?
While digging into this issue, I came to the realisation that we are sharing the same prefixId for both producer-only and read-process-write.
In our TopicPublisher class wrapping kafkaTemplate, we already have publish() and publishInTransaction() methods for the read-process-write and producer-only use cases respectively.
I am thinking of having 2 sets of kafkaTemplates/TransactionManagers/ProducerFactories, one with a fixed prefixId to be used by the publish() method and one with a unique prefix to be used in publishInTransaction().
My questions are:
Does the prefix for producer-only transactions need to be the same after a pod is restarted? Can we just append some UUID or k8s podId? Someone mentioned there may be delays with aborting transactions.
Is there a clean way to detect if the TopicPublisher is being called from a KafkaListener, so we can have just 1 publish method that uses the correct kafkaTemplate as needed?
Actually, there is no issue using the same transactionIdPrefix, at least with recent versions.
The factory gets a txIdPrefix.
For read-process-write, we create (and cache) a producer whose transactional.id is built from:
private String zombieFenceTxIdSuffix(String topic, int partition) {
    return this.consumerGroupId + "." + topic + "." + partition;
}
which is suffixed onto the prefix.
For producer-only transactions, we create (and cache) a producer with the prefix and a simple numeric suffix.
In the upcoming 2.3 release, there is also an option to assign a producer to a thread so the same thread always uses the same transactional.id.
I believe it needs to be the same, unless you don't mind waiting for transaction.timeout.ms (default 1 minute).
The maximum amount of time in ms that the transaction coordinator will wait for a transaction status update from the producer before proactively aborting the ongoing transaction. If this value is larger than the transaction.max.timeout.ms setting in the broker, the request will fail with an InvalidTransactionTimeout error.
This is what we do in spring-integration-kafka:
if (this.transactional
        && TransactionSynchronizationManager.getResource(this.kafkaTemplate.getProducerFactory()) == null) {
    sendFuture = this.kafkaTemplate.executeInTransaction(t -> {
        return t.send(producerRecord);
    });
}
else {
    sendFuture = this.kafkaTemplate.send(producerRecord);
}
You can also use String suffix = TransactionSupport.getTransactionIdSuffix(); which is what the factory uses when it is asked for a producer - if it is null, you are not running on a transactional consumer thread.
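Putting the two answers together, a single publish method could look roughly like this. This is only a sketch: readProcessWriteTemplate (fixed prefix) and producerOnlyTemplate (unique prefix) are the two templates proposed in the question, not existing beans.
public void publish(ProducerRecord<String, Object> record) {
    if (TransactionSupport.getTransactionIdSuffix() != null) {
        // we are on a transactional listener thread - participate in the container's transaction
        this.readProcessWriteTemplate.send(record);
    }
    else {
        // producer-only path - run a local transaction with the unique prefix
        this.producerOnlyTemplate.executeInTransaction(t -> t.send(record));
    }
}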

Does Speedment support transactions?

I have implemented the persistence layer using Speedment and I would like to test the code using Spring Boot unit tests. I have annotated my unit tests with the following annotations:
@RunWith(SpringRunner.class)
@SpringBootTest
@Transactional
public class MovieServiceTest {
    ...
}
By default, Spring will start a new transaction surrounding each test method and its @Before/@After callbacks, rolling the transaction back at the end. With Speedment, however, this does not seem to work.
Does Speedment support transactions across several invocations, and if yes, how do I configure Spring to use the Speedment transactions, or how do I configure Speedment to use the data source provided by Spring?
Transaction support was added in Speedment 3.0.17. However, it does not integrate with the Spring @Transactional annotation yet, so you will have to wrap the code you want to execute as a single transaction as shown here:
txHandler.createAndAccept(tx -> {
    Account sender = accounts.stream()
        .filter(Account.ID.equal(1))
        .findAny()
        .get();
    Account receiver = accounts.stream()
        .filter(Account.ID.equal(2))
        .findAny()
        .get();
    accounts.update(sender.setBalance(sender.getBalance() - 100));
    accounts.update(receiver.setBalance(receiver.getBalance() + 100));
    tx.commit();
});
It is likely that you are streaming over a table and then conducting an update/remove operation while the stream is still open. Most databases cannot handle having an open ResultSet on a Connection and then performing update operations on the same connection.
Luckily, there is an easy workaround: consider collecting the entities you would like to modify into an intermediate Collection (such as a List or Set) and then using that Collection to perform the desired operations.
This case is described in the Speedment User's Guide here:
txHandler.createAndAccept(
    tx -> {
        // Collect to a list before performing actions
        List<Language> toDelete = languages.stream()
            .filter(Language.LANGUAGE_ID.notEqual((short) 1))
            .collect(toList());
        // Do the actual actions
        toDelete.forEach(languages.remover());
        tx.commit();
    }
);
AFAIK it does not (yet) - correction: it seems to set up one transaction per stream / statement.
See this article: https://dzone.com/articles/best-java-orm-frameworks-for-postgresql
But it should be possible to implement by writing a custom extension: https://github.com/speedment/speedment/wiki/Tutorial:-Writing-your-own-extensions
Edit:
According to a Speedment developer, one stream maps to one transaction: https://www.slideshare.net/Hazelcast/webinar-20150305-speedment-2
