How to control the sequence of Kafka messages sent asynchronously

I have developed an application using Kafka version 0.9.0.1.
One very important aspect of this application is that messages need to be consumed in the correct sequence. This is because I am propagating database rows between two databases, which means I need to ensure that all records within each Unit Of Work (UOW) are sent, and that UOW-Insert arrives before UOW-Update.
If I use asynchronous message production, how can I guarantee the messages are consumed in the correct sequence?
I will employ the Kafka Producer send with a Callback to be notified whether or not each message was successfully sent.
I will set acks=all, retries=0, and batch.size=16384, and my Kafka topics have a single partition.
My consumer can cope with duplicate messages, e.g. in the case where retries are necessary; however, it cannot cope with messages out of sequence.
My retry approach will be to fail fast: as soon as a message fails to send, I report the record's Log Record Sequence Number (LRSN) or Relative Byte Address (RBA) and stop sending any further messages.
I then reset the source database log reader to the reported LRSN or RBA and restart message production.
For example, I send messages:
Message UOW
M1 uow-0000
M2 uow-0000
M3 uow-0000
M4 uow-0001
M5 uow-0001
M6 uow-0001
M7 uow-0001
M8 uow-0002
When message M5 fails to send successfully, I stop sending any more messages. However, I have an issue: the consumer will already have received messages M1, M2, M3, M4, M6, M7, and M8.
To recover from this situation, I reset my source database log reader to the reported LRSN or RBA of M5 and start resending messages from that point.
Now the consumer receives:
Message UOW Duplicate
M5 uow-0001 No
M6 uow-0001 Yes
M7 uow-0001 Yes
M8 uow-0002 Yes
With this approach I get the speed of asynchronous messaging and, hopefully, all messages consumed in the desired sequence.
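A minimal sketch of this fail-fast approach in Java, assuming the standard Kafka producer client. The broker address, topic name, and the in-memory record array (whose index stands in for the LRSN/RBA) are illustrative placeholders:

    import java.util.Properties;
    import java.util.concurrent.atomic.AtomicBoolean;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.Producer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class FailFastProducer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092"); // placeholder broker
            props.put("acks", "all");
            props.put("retries", 0);
            props.put("batch.size", 16384);
            props.put("key.serializer",
                    "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer",
                    "org.apache.kafka.common.serialization.StringSerializer");

            // Stand-in for records read from the database log; the array index
            // plays the role of the LRSN/RBA here.
            String[] records = {"M1:uow-0000", "M2:uow-0000", "M3:uow-0000",
                                "M4:uow-0001", "M5:uow-0001", "M6:uow-0001"};
            AtomicBoolean failed = new AtomicBoolean(false);

            try (Producer<String, String> producer = new KafkaProducer<>(props)) {
                for (int lrsn = 0; lrsn < records.length; lrsn++) {
                    if (failed.get()) break; // fail fast: stop producing on first error
                    final int position = lrsn;
                    producer.send(new ProducerRecord<>("uow-topic", records[lrsn]),
                            (metadata, exception) -> {
                                if (exception != null && failed.compareAndSet(false, true)) {
                                    // Report the position so the log reader can be
                                    // reset here before production is restarted.
                                    System.err.println("Send failed at position "
                                            + position + ": " + exception);
                                }
                            });
                }
                producer.flush(); // drain in-flight sends so every callback has fired
            }
        }
    }

Note that fail-fast alone cannot recall messages that were already in flight when M5 failed (M6, M7, and M8 above), which is exactly why the consumer must tolerate the duplicates shown in the replay table.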

Related

Does MPI_Ssend/MPI_Issend use a system buffer?

According to the documentation, MPI_Ssend and MPI_Issend are blocking and non-blocking send operations, respectively, and both are synchronous. The MPI specification says that a synchronous send completes when the receiver has started to receive the message, and that after that it is safe to update the send buffer:
The functions MPI_WAIT and MPI_TEST are used to complete a nonblocking
communication. The completion of a send operation indicates that the
sender is now free to update the locations in the send buffer (the
send operation itself leaves the content of the send buffer
unchanged). It does not indicate that the message has been received,
rather, it may have been buffered by the communication subsystem.
However, if a synchronous mode send was used, the completion of the
send operation indicates that a matching receive was initiated, and
that the message will eventually be received by this matching receive.
Bearing in mind that a synchronous send is considered complete as soon as the receiver has merely started to receive it, I am not sure about the following:
Is it possible that only part of the data has been read from the send buffer at the moment MPI_Ssend or MPI_Issend signals send completion? For example, the first N bytes have been sent and received while the next M bytes are still being sent.
How can it be safe for the caller to modify the data before the whole message has been received? Does it mean that the data is necessarily copied to a system buffer? As far as I understand, the MPI standard permits the use of a system buffer but does not require it. Moreover, I have read elsewhere that MPI_Issend() doesn't ever buffer data locally.
MPI_Ssend() (or the MPI_Wait() associated with MPI_Issend()) returns when:
the receiver has started to receive the message
and the send buffer can be reused
The second condition is met either when the message has been fully received or when the MPI library has buffered the data locally.
I have found nothing in the MPI standard that prohibits data buffering.
From the standard (MPI 3.1, chapter 3.4, page 37):
A send that uses the synchronous mode can be started whether or not a matching
receive was posted. However, the send will complete successfully only if a matching receive is
posted, and the receive operation has started to receive the message sent by the synchronous
send. Thus, the completion of a synchronous send not only indicates that the send buffer
can be reused, but it also indicates that the receiver has reached a certain point in its
execution, namely that it has started executing the matching receive. If both sends and
receives are blocking operations then the use of the synchronous mode provides synchronous
communication semantics: a communication does not complete at either end before both
processes rendezvous at the communication. A send executed in this mode is non-local.
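As a small illustration of the buffer-reuse rule, here is a sketch using Open MPI's Java bindings (iSend, waitFor, and recv are those bindings' names; MPI_Issend in C obeys the same rule, with the extra guarantee that completion implies a matching receive has started):

    import mpi.MPI;
    import mpi.MPIException;
    import mpi.Request;

    public class SendBufferRule {
        public static void main(String[] args) throws MPIException {
            MPI.Init(args);
            int rank = MPI.COMM_WORLD.getRank();
            int[] buf = new int[4];

            if (rank == 0) {
                buf[0] = 42;
                // Nonblocking send: MPI now owns the buffer, and the caller
                // must not touch it until the request completes.
                Request req = MPI.COMM_WORLD.iSend(buf, buf.length, MPI.INT, 1, 0);
                // buf[0] = 99;  // WRONG: modifying the buffer here is illegal
                req.waitFor();   // completion: the buffer may now be reused, whether
                                 // the data was fully received or merely buffered
                buf[0] = 99;     // safe now
            } else if (rank == 1) {
                MPI.COMM_WORLD.recv(buf, buf.length, MPI.INT, 0, 0);
            }
            MPI.Finalize();
        }
    }

In other words, the caller is never entitled to modify the buffer before completion; completion itself is what makes reuse safe, regardless of whether the implementation chose to buffer the data.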

RabbitMq .net core client to handle multiple messages in parallel (not one by one)

Let's say I have one publisher and 2 consumers.
Each consumer should consume 5 messages at a time (in parallel).
(one exchange, bound to one queue, direct mode)
Publisher produces messages (1,2,3,...14,15)
Consumer A consumes (1,3,5,7,9)
Consumer B consumes (2,4,6,8,10)
Consumer A finished processing message 1 and receives message 11
... etc
How can I achieve this behaviour?
I realized that the consumer.Receive event is only fired when the previous message has been processed.
When reading the RabbitMQ docs, this seemed to be exactly what I need:
https://www.rabbitmq.com/consumer-prefetch.html
but apparently that setting has no impact on the behaviour described above (messages are still processed serially).
Any ideas?
setting prefetch, messages are still processed serially
That is because messages on a single channel are processed serially. So you have two options:
consume on a single channel and spawn multiple tasks/threads to handle the messages, or
open multiple consumer channels, and process messages in each channel's thread.
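A sketch of the first option using the RabbitMQ Java client (the host, queue name, and pool size are illustrative; the same shape applies to the .NET client). basicQos(5) caps unacknowledged deliveries at five, and the executor processes them in parallel while acks are funnelled back through the single channel:

    import com.rabbitmq.client.Channel;
    import com.rabbitmq.client.Connection;
    import com.rabbitmq.client.ConnectionFactory;
    import com.rabbitmq.client.DeliverCallback;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    public class ParallelConsumer {
        public static void main(String[] args) throws Exception {
            ConnectionFactory factory = new ConnectionFactory();
            factory.setHost("localhost");             // placeholder host
            Connection connection = factory.newConnection();
            Channel channel = connection.createChannel();

            channel.basicQos(5);                      // at most 5 unacked deliveries
            ExecutorService workers = Executors.newFixedThreadPool(5);

            DeliverCallback onDeliver = (consumerTag, delivery) -> {
                long tag = delivery.getEnvelope().getDeliveryTag();
                workers.submit(() -> {
                    process(new String(delivery.getBody())); // runs in parallel
                    try {
                        synchronized (channel) {      // Channel is not thread-safe
                            channel.basicAck(tag, false);
                        }
                    } catch (Exception e) {
                        e.printStackTrace();
                    }
                });
            };
            // autoAck=false so unacked messages count against the prefetch limit
            channel.basicConsume("work-queue", false, onDeliver, consumerTag -> { });
        }

        private static void process(String body) {
            System.out.println("processing " + body); // stand-in for real work
        }
    }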

How do we monitor transactions in transit in the MQ, on both the sending and receiving side, in Corda?

We understand that there is port tear-down during transactions, and that different ports may be used when sending messages over to the counterparties. When a node goes down, messages are still sent but are queued in the MQ. Is there a recommended way to monitor these transactions/messages?
Unfortunately, you can't currently monitor these messages.
This is because Artemis does not store its queued messages in a human-readable/queryable format. Instead, the queued messages are stored as a high-performance journal containing a lot of information that would be required if the message queue's state ever needed to be restored after a hard bounce.
I approached this by following the documentation here: https://docs.corda.net/node-administration.html#monitoring-your-node
which illustrates Corda flow metrics visualized using hawtio.
I just needed to download and start hawtio, connect it to the net.corda.node.Corda process (any node, or a specific node PID), and by going to the JMX tab I could see the messages in the queue.

Reliable WCF Service with MSMQ + Order processing web application. One way calls delivery

I am trying to implement Reliable WCF Service with MSMQ based on this architecture (http://www.devx.com/enterprise/Article/39015)
A message may be lost if the queue is not available (even a cluster doesn't provide zero downtime).
Take a look at this simple order-processing workflow:
A user enters credit card details and makes a payment
Application receives a success result from payment gateway
Application sends a message as a “fire and forget”/“one way” call to a backend service via the WCF MSMQ binding
The user will be redirected to the “success” page
Message is stored in a REMOTE transactional queue (windows cluster)
The backend service dequeues and processes the message, completes the complex order-processing workflow and, as a result, sends an email confirmation to the user
Everything looks fine, as expected.
What I cannot understand is how we can guarantee that all “one way” calls will be delivered to the queue.
Duplex communication is not an option, because the user should be redirected to the result web page ASAP.
Imagine the case where a user receives the “success” page saying “… Your payment was made, your order has started processing, and you will receive email notifications later…”, but the message itself is lost.
How can durability be implemented for step 3?
One possible solution that I can see is:
3a. Create a database record with the transaction details, marked as uncompleted, just to have some record of the transaction. This record can be used as a starting point for reprocessing the lost message in case it is never saved in the queue (see the sketch below).
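A rough sketch of idea 3a, written in Java for neutrality. OrderStore and QueueClient are hypothetical stand-ins for the database and the MSMQ-bound one-way call, not real WCF APIs:

    // Record the transaction as uncompleted before the one-way send, so a
    // reconciliation job can replay anything that never reached the queue.
    interface OrderStore {
        void savePending(String orderId, String payload); // marked "uncompleted"
        void markEnqueued(String orderId);                // flipped after a successful send
        Iterable<String> findStalePending();              // pending too long => possibly lost
        String payloadOf(String orderId);
    }

    interface QueueClient {
        void sendOneWay(String payload);                  // stand-in for the MSMQ-bound call
    }

    class OrderSubmitter {
        private final OrderStore store;
        private final QueueClient queue;

        OrderSubmitter(OrderStore store, QueueClient queue) {
            this.store = store;
            this.queue = queue;
        }

        void submit(String orderId, String payload) {
            store.savePending(orderId, payload); // step 3a: durable record first
            queue.sendOneWay(payload);           // step 3: fire-and-forget send
            store.markEnqueued(orderId);         // best effort; reconciliation covers crashes
        }

        // Run periodically: re-send orders that stayed "uncompleted", i.e.
        // whose message may have been lost before reaching the queue.
        void reconcile() {
            for (String orderId : store.findStalePending()) {
                queue.sendOneWay(store.payloadOf(orderId));
                store.markEnqueued(orderId);
            }
        }
    }

Because reconciliation may re-send a message that did in fact arrive, the backend service must treat duplicates as no-ops (e.g. by keying processing on the order ID).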
I read this post
The main thing to understand about transactional MSMQ is that there
are three distinct transactions involved in a transactional send to a
remote queue.
The sender writes the message to a local queue.
The queue manager on the sender's machine transmits the message across the wire to the queue manager on the recipient's machine
The receiver service processes the queue message and then removes the message from the queue.
But it doesn't solve the described issue: as far as I know, WCF netMsmqBinding doesn't use a local queue to send messages to a remote one.
But it doesn't solve the described issue: as far as I know, WCF netMsmqBinding
doesn't use a local queue to send messages to a remote one.
Actually this is not correct. MSMQ always sends to a remote queue via a local queue, regardless of whether you are using WCF or not.
If you send a message to a remote queue and then look in Message Queuing in Server Management, you will see under Outbound queues that a queue has been created with the address of the remote queue. This is a temporary queue which is automatically created for you. If the remote queue were for some reason unavailable, the message would sit in the local queue until it became available, and then it would be transmitted.
So durability is provided because of the three-phase commit:
transactionally write message locally
transactionally transmit message
transactionally receive and process message
There are instances where you may drop messages, for example if your message processing happens outside the scope of the dequeue transaction. There are also instances where it is not possible to know whether the processing was successful (e.g. a back-end web service call times out), and of course you could have a badly formed message which will never succeed in processing. But in all these cases it should be possible to design for the failure.
If you're using public queues in a clustered environment then I think there may be more scope for failure, as clustering MSMQ introduces complexity (I have not really used it, so I don't know), so try to avoid that if possible.

BizTalk Zombies - any way to explicitly REMOVE a subscription from within a BizTalk orchestration

Background:
We make use of a lot of aggregation, singleton and multiton orchestrations, similar to Seroter's Round Robin technique described here (BizTalk 2009).
All of these orchestration types have fairly arbitrary exit or continuation points (for aggregations), usually defined by a timer, i.e. if an orchestration hasn't received any more messages within X minutes then proceed with the batching, and if Y more minutes elapse with no more messages then quit. (We also exit our Single/N-Tons due to concerns about degraded performance after large numbers of messages have been subscribed to the singleton over a period.)
As much as we've tried to mitigate against zombies, e.g. by starting any continuation processing in an asynchronously refactored orchestration, there is always a point of weakness where a 'well-timed' message could cause a zombie (i.e. further incoming messages arriving correlated to the already-completed shapes of an orchestration).
If a message causes a zombie on one of the subscriptions, the message does not appear to be propagated to OTHER subscribers either (i.e. orchestrations totally decoupled from the zombie-causing orchestration); in other words, the zombie-causing message is not processed at all.
Question
So I would be very interested to hear whether anyone has another way, programmatic or otherwise, to explicitly remove a correlated subscription from a running orchestration once the orchestration has 'progressed' beyond the point where it is interested in this correlated message (such a new message would then typically start a new orchestration with its own correlations, etc.).
At this point we would consider even a hack solution, such as a reflected BizTalk API call or a direct SQL delete against the MsgBoxDB.
No, you can't explicitly remove the subscription in an orchestration.
The subscription will be removed as the orchestration is tearing itself down, but a message arriving at that exact instant will still be routed to the orchestration; the orchestration will end without processing it, and that's your zombie.
Microsoft article about zombies: http://msdn.microsoft.com/en-us/library/bb203853.aspx
I once also had to build a receive, debatch, aggregate, send pattern: receiving enveloped messages from multiple senders, debatching them, and aggregating by intended recipient (based on two rules, number of messages or time delay, whichever occurred first).
This scenario was ripe for zombies, and once I had read about them I designed it so they would not occur. This was for BizTalk 2004.
I debatched the messages and inserted them into a database. A receive port polled a stored procedure that worked out whether there was a batch to send; if there was, it triggered an orchestration that took that message and routed it dynamically.
Since neither orchestration had to wait for another message, they could end gracefully and there would be no zombies.
