JMS session and JPA transaction with XA - EJB

I'm using WebSphere 8.5 with EJB 3.1 and JMS Generic provider.
I need to write messages to a queue using a stateless session bean as the producer. The EJB is annotated with TransactionAttributeType.REQUIRED because I need to perform some "DB insert" operations before I send messages to the queue, and then consume these messages, reading the records written by the producer.
The problem is that if I define a non-XA JDBC datasource, the producer writes the messages to the queue, but the server complains about a failed 2-phase commit of a local resource (the datasource itself, I think) and doesn't call the onMessage method of the MDB. If I define an XA JDBC datasource, everything works.
My questions:
Is the JMS session required to be an XA resource by default? And why?
What happens if I configure my JMS connection factory to create a non-XA JMS session in a JTA transaction? Is that bad practice?
What happens if the consumer starts consuming messages while the producer is still finishing its database operations? Would the consumer see the database changes because they are in the same transaction?
Thanks in advance, regards

Is the JMS session required to be an XA resource by default? And why?
You need both resources to be XA. This is a distributed transaction spanning two different resources: the database and the JMS queue. To participate in one and the same transaction, both must be XA (there is an option to have one non-XA resource in the transaction using last participant support, but I wouldn't recommend that).
If your resources are not XA, you may set the bean to NOT_SUPPORTED and handle the transactions yourself, which means managing two separate transactions: first to the database and then to the JMS queue. However, since the database transaction is committed first, you would have to code a compensating action for the case where sending the message fails (as you can no longer roll back), to avoid a situation where the database state has changed but the message was never sent. A sketch of that approach is shown below.
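A minimal sketch of the NOT_SUPPORTED-plus-compensation approach, assuming a WebSphere-style setup; OrderProducerBean, OrderDao, Order, the JNDI names and the compensating delete are all hypothetical, and the DAO methods are assumed to run with REQUIRES_NEW so the insert commits in its own transaction:

import javax.annotation.Resource;
import javax.ejb.EJB;
import javax.ejb.EJBException;
import javax.ejb.Stateless;
import javax.ejb.TransactionAttribute;
import javax.ejb.TransactionAttributeType;
import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.JMSException;
import javax.jms.Queue;
import javax.jms.Session;

@Stateless
@TransactionAttribute(TransactionAttributeType.NOT_SUPPORTED)
public class OrderProducerBean {

    @EJB
    private OrderDao orderDao; // hypothetical DAO whose methods run with REQUIRES_NEW

    @Resource(lookup = "jms/MyConnectionFactory") // non-XA connection factory
    private ConnectionFactory connectionFactory;

    @Resource(lookup = "jms/MyQueue")
    private Queue queue;

    public void createOrderAndNotify(Order order) {
        // transaction 1: the insert commits when the REQUIRES_NEW DAO call returns
        orderDao.insert(order);
        try {
            // transaction 2: a local (non-XA) JMS transaction, committed explicitly
            Connection connection = connectionFactory.createConnection();
            try {
                Session session = connection.createSession(true, Session.SESSION_TRANSACTED);
                session.createProducer(queue).send(session.createTextMessage(order.getId()));
                session.commit();
            } finally {
                connection.close();
            }
        } catch (JMSException e) {
            // the DB commit already happened, so compensate instead of rolling back
            orderDao.delete(order); // hypothetical compensating action
            throw new EJBException("Message send failed, DB insert compensated", e);
        }
    }
}

Note that the compensating delete can itself fail, which is exactly why the XA setup is the simpler option when it is available.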
What happens if I configure my JMS connection factory to create a non-XA JMS session in a JTA transaction?
If another resource is part of that transaction (e.g. the database), you will get an exception about missing 2-phase commit support.
What happens if the consumer starts consuming messages while the producer is still finishing its database operations?
It's not entirely clear to me what you are asking. If the producer first writes to the database and then writes to the queue in one XA transaction, both are committed at the same time, so the consumer cannot see the message first.
However, if you create two separate transactions (one for the database access, a second for the queue access) and commit the queue first, the consumer could read the message before the database changes are committed. In that case, the consumer will not see the database changes until they are committed.
Would the consumer see the database changes because they are in the same transaction?
The producer and consumer are not in the same transaction (the producer creates the message and commits; the consumer starts a separate transaction to read it).

Handling defunct deferred (timeout) messages?

I am new to Rebus and am trying to get up to speed with some patterns we currently use in Azure Logic Apps. The current target implementation would use Azure Service Bus with Saga storage preferably in Cosmos DB (still investigating that sample implementation). Maybe even use Rebus Mongo DB with Cosmos DB using the Mongo DB API (not sure if that is possible though).
One major use case we have is an event/timeout pattern, and after doing some reading of samples/forums/Stack Overflow this is not uncommon. The tricky part is that our Sagas would behave more as a Finite State Machine vs. a Directed Acyclic Graph. This mainly happens because dates are externally changed and therefore timeouts for events change.
The Defer() method does not return a timeout identifier, which we assume is an implementation restriction (Azure Service Bus returns a long). Since we must ignore timeouts that had been scheduled for an event which has now shifted in time, we see a way of having those timeouts "ignored" (since they cannot be cancelled) as follows:
Use a Dictionary<string, Guid> in our own SagaData-derived base class, where the key is some derivative of the timeout message type, and the Guid is the identifier given to the timeout message when it was created. I don't believe this needs to be a concurrent dictionary but that is why I am here...
On receipt of the event message, remove the corresponding timeout message type key from the above dictionary;
On receipt of the timeout message:
Ignore it if its timeout message type key is not present or the Guid does not match the stored value; else
Process it. We could also remove the dictionary key at this point.
When event rescheduling occurs, simply add the timeout message type/Guid dictionary entry, or update the Guid with the new timeout message Guid.
Is this on the right track, or is there a more 'correct' way of handling defunct timeout (deferred) messages?
You are on the right track 🙂
I don't believe this needs to be a concurrent dictionary but that is why I am here...
Rebus lets your saga handler work on its own copy of the saga data (using optimistic concurrency), so you're free to model the saga data as if it's only being accessed by one handler at a time.

What's the basic difference between single record kafka consumer and kafka batch consumer?

I am using spring-kafka 2.2.8 and trying to understand the main difference between a single record consumer and a batch consumer.
As far as I understand, reading messages/bytes from a topic wouldn't be any different for a single record consumer vs. a batch consumer. The only difference is how the offsets are committed, and hence how errors are handled. Is my understanding correct? Please confirm.
With a record-based listener, the records returned by the poll are handed to the listener one at a time. The container can be configured to commit the offsets one-at-a-time, or after all records are processed (default).
With a batch listener, the records returned by the poll are all handed to the listener in one call.
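As a rough illustration, here are the two listener styles side by side, assuming spring-kafka 2.2.x; the topic names, group ids and the "batchFactory" container factory bean (configured with factory.setBatchListener(true)) are made up for the example:

import java.util.List;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.stereotype.Component;

@Component
public class ExampleListeners {

    // record listener: called once per ConsumerRecord returned by the poll
    @KafkaListener(topics = "demo-topic", groupId = "record-group")
    public void listenSingle(ConsumerRecord<String, String> record) {
        System.out.println("one record: " + record.value());
    }

    // batch listener: the whole poll result is handed over in a single call;
    // requires a container factory with setBatchListener(true)
    @KafkaListener(topics = "demo-topic", groupId = "batch-group", containerFactory = "batchFactory")
    public void listenBatch(List<ConsumerRecord<String, String>> records) {
        System.out.println("batch of " + records.size() + " records");
    }
}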

How can we pause Kafka consumer polling/processing records when there is an exception because of downstream system

I'm using Spring Boot 2.1.7.RELEASE and spring-kafka 2.2.8.RELEASE. I'm using the @KafkaListener annotation to create a consumer, and I'm using all default settings for the consumer.
Now, in my consumer, the processing logic includes a DB call, and I'm sending the record to a DLT if there is an error/exception during processing.
With this setup, if the DB is down for a few minutes for some reason, I want to pause/stop my consumer from consuming more records; otherwise it keeps consuming messages, hits the DB exception every time, and eventually fills up my DLT, which I don't want unless the DB is back (based on some health check).
Now I have a few questions here.
Does spring-kafka provide an option to trigger infinite retry based on the exception type (in this case a DB exception, but I want to add a few more exception types based on my consumer logic)?
Does spring-kafka provide an option to trigger the message consumption based on a condition?
There is a ContainerStoppingErrorHandler but it will stop the container for all exceptions.
You would need to create a custom error handler that stops (or pauses) the container after a specific failure as well as some mechanism to restart (or resume) the container.
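A hedged sketch of such a custom error handler for spring-kafka 2.2.x, registered on the container factory with factory.setErrorHandler(...); the DataAccessException check and the fixed 30-second resume are stand-ins for whatever exception types and health check you actually use:

import java.time.Instant;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

import org.apache.kafka.clients.consumer.Consumer;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.TopicPartition;
import org.springframework.dao.DataAccessException;
import org.springframework.kafka.KafkaException;
import org.springframework.kafka.listener.ContainerAwareErrorHandler;
import org.springframework.kafka.listener.MessageListenerContainer;
import org.springframework.scheduling.TaskScheduler;

public class PauseOnDbErrorHandler implements ContainerAwareErrorHandler {

    private final TaskScheduler scheduler;

    public PauseOnDbErrorHandler(TaskScheduler scheduler) {
        this.scheduler = scheduler;
    }

    @Override
    public void handle(Exception thrownException, List<ConsumerRecord<?, ?>> records,
            Consumer<?, ?> consumer, MessageListenerContainer container) {

        // reposition the consumer so the failed record (and the rest of this poll)
        // is fetched again once processing continues
        Map<TopicPartition, Long> firstOffsets = new LinkedHashMap<>();
        for (ConsumerRecord<?, ?> record : records) {
            firstOffsets.putIfAbsent(
                    new TopicPartition(record.topic(), record.partition()), record.offset());
        }
        firstOffsets.forEach(consumer::seek);

        // pause only for "DB down"-style failures; everything else is simply retried
        if (thrownException.getCause() instanceof DataAccessException) {
            container.pause();
            // naive resume after 30 seconds; a real health check could decide instead
            scheduler.schedule(container::resume, Instant.now().plusSeconds(30));
        }

        throw new KafkaException("Processing failed, records re-seeked", thrownException);
    }
}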

Datastore: Failed transactions and rollbacks: What happens if rollback is not called or fails?

What happens if a transaction fails and the application crashes for other reasons and the transaction is not rolled back?
Also, what happens and how should rollback failures be treated?
You don't have to worry about the impact of your app's crashes on transaction rollbacks (or any other stateful datastore operation).
The application just sends RPC requests for the operations. The actual execution of the operation steps/sequence happens on the datastore backend side, not inside your application.
From Life of a Datastore Write:
We'll dive into a bit more detail in terms of what new data is placed in the datastore as part of write operations such as inserts, deletions, updates, and transactions. The focus is on the backend work that is common to all of the runtimes.
...
When we call put or makePersistent, several things happen behind the scenes before the call returns and sets the entity's key:
The my_todo object is converted into a protocol buffer.
The appserver makes an RPC call to the datastore server, sending the entity data in a protocol buffer.
If a key name is not provided, a unique ID is determined for this entity's key. The entity key is composed of app ID | ancestor keys | kind name | key name or ID.
The datastore server processes the request in two phases that are executed in order: commit, then apply. In each phase, the datastore server identifies the Bigtable tablet servers that should receive the data.
Now, depending on the client library you use, transaction rollback could be entirely automatic (in the ndb Python client library, for example) or could be your app's responsibility. But even if it is your app's responsibility, it's a best-effort attempt anyway. Crashing without requesting a rollback would simply mean that some potentially pending operations on the backend side will eventually time out instead of being actively ended. See also the related GAE: How to rollback a transaction?
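For the case where rollback is your app's responsibility, the usual best-effort pattern looks like the sketch below (shown here with the Cloud Datastore Java client; the "Task" kind and property are made up). A failed rollback can simply be logged, since a transaction that is never rolled back just expires on the backend:

import com.google.cloud.datastore.Datastore;
import com.google.cloud.datastore.DatastoreOptions;
import com.google.cloud.datastore.Entity;
import com.google.cloud.datastore.Key;
import com.google.cloud.datastore.Transaction;

public class BestEffortRollback {

    public static void main(String[] args) {
        Datastore datastore = DatastoreOptions.getDefaultInstance().getService();
        Key key = datastore.newKeyFactory().setKind("Task").newKey("sample-task");

        Transaction txn = datastore.newTransaction();
        try {
            txn.put(Entity.newBuilder(key).set("done", false).build());
            txn.commit();
        } finally {
            // best effort: if the commit failed or was never reached, try to roll back;
            // if this fails too (or the app crashes), the backend times the txn out
            if (txn.isActive()) {
                try {
                    txn.rollback();
                } catch (RuntimeException e) {
                    System.err.println("Rollback failed, relying on backend timeout: " + e);
                }
            }
        }
    }
}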

Handling Repository and Kafka Transactions

I have a use case where I need to consume from a Kafka topic, do some work, produce to another Kafka topic with exactly-once semantics, and save to a Mongo database. After going through the docs, what I figure is that the Kafka transaction and the Mongo transaction can be synchronized, but they are still two different transactions. In the scenario below, if the Mongo commit fails, is there a way to roll back the Kafka record that was committed to the topic and replay it from the consumer?
producer.send()
producer.sendOffsetsToTransaction()
mongoDao.commit()
If the listener throws an exception, the Kafka transaction will be rolled back and the record redelivered.
If the Mongo commit succeeds and the Kafka commit fails, you will need to deal with a duplicate delivery.
If you wire the KafkaTransactionManager (or a ChainedKafkaTransactionManager containing one) into the listener container, you don't need to send the offsets to the transaction; the container will do it for you before committing.
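A rough sketch of that wiring, assuming Spring Data MongoDB's MongoTransactionManager and a KafkaTransactionManager/ConsumerFactory already defined elsewhere in the context; the bean names are illustrative:

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.mongodb.MongoTransactionManager;
import org.springframework.kafka.config.ConcurrentKafkaListenerContainerFactory;
import org.springframework.kafka.core.ConsumerFactory;
import org.springframework.kafka.transaction.ChainedKafkaTransactionManager;
import org.springframework.kafka.transaction.KafkaTransactionManager;

@Configuration
public class KafkaTxConfig {

    @Bean
    public ChainedKafkaTransactionManager<Object, Object> chainedTm(
            KafkaTransactionManager<Object, Object> kafkaTm, MongoTransactionManager mongoTm) {
        // commits run in reverse order of registration: Mongo first, Kafka last,
        // matching the behavior described above (a late Kafka failure means
        // redelivery and a possible duplicate Mongo write to deal with)
        return new ChainedKafkaTransactionManager<>(kafkaTm, mongoTm);
    }

    @Bean
    public ConcurrentKafkaListenerContainerFactory<String, String> kafkaListenerContainerFactory(
            ConsumerFactory<String, String> consumerFactory,
            ChainedKafkaTransactionManager<Object, Object> chainedTm) {
        ConcurrentKafkaListenerContainerFactory<String, String> factory =
                new ConcurrentKafkaListenerContainerFactory<>();
        factory.setConsumerFactory(consumerFactory);
        // with the transaction manager on the container, sendOffsetsToTransaction()
        // in the listener is no longer needed; the container sends the offsets itself
        factory.getContainerProperties().setTransactionManager(chainedTm);
        return factory;
    }
}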
