RPC semantics what exactly is the purpose - networking

I was going through RPC semantics, specifically at-least-once and at-most-once. How do they work?
I couldn't understand how they are implemented.

In both cases, the goal is to invoke the function once. However, the difference is in their failure modes. In "at-least-once", the system will retry on failure until it knows that the function was successfully invoked, while "at-most-once" will not attempt a retry (or will ensure that there is a negative acknowledgement of the invocation before retrying).
As to how these are implemented, this can vary, but the pseudo-code might look like this:
At least once:

    request_received = false
    while not request_received:
        send RPC
        wait for acknowledgement with timeout
        if acknowledgement received and acknowledgement.is_successful:
            request_received = true

At most once:

    request_sent = false
    while not request_sent:
        send RPC
        request_sent = true
        wait for acknowledgement with timeout
        if acknowledgement received and not acknowledgement.is_successful:
            request_sent = false

An example case where you want "at-most-once" would be something like payments (you wouldn't want to accidentally bill someone's credit card twice), whereas an example case for "at-least-once" would be something like updating a database with a particular value (if you happen to write the same value to the database twice in a row, that really isn't going to have any effect on anything). You almost always want to use "at-least-once" for idempotent operations, that is, operations where repeating the call leaves the system in the same state as performing it once (this includes all non-mutating operations); by contrast, most non-idempotent mutations (ones that incrementally mutate the state and are thus dependent on the current/prior state when applying the mutation) would need "at-most-once".
I should add that it is fairly common to implement "at most once" semantics on top of an "at least once" system by including an identifier in the body of the RPC that uniquely identifies it and by ensuring on the server that each ID seen by the system is processed only once. You can think of the sequence numbers in TCP packets (ensuring the packets are delivered once and in order) as a special case of this pattern. This approach, however, can be somewhat challenging to implement correctly on distributed systems where retries of the same RPC could arrive at two separate computers running the same server software. (One technique for dealing with this is to record the transaction where the RPC is received, but then to aggregate and deduplicate these records using a centralized system before redistributing the requests inside the system for further processing; another technique is to opportunistically process the RPC, but to reconcile/restore/rollback state when synchronization between the servers eventually detects this duplication... this approach would probably not fly for payments, but it can be useful in other situations like forum posts).
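To make the request-ID pattern concrete, here is a minimal single-server sketch in Python. Everything in it is an assumption for illustration: the names `DedupServer`, `charge` and `call_at_least_once` are hypothetical, the "network" is a direct method call, and lost acknowledgements are simulated with a counter. It does not address the multi-server case discussed above.

```python
import uuid


class DedupServer:
    """Processes each request ID at most once and replays the cached result on retries."""

    def __init__(self):
        self.results = {}   # request_id -> result of the first execution
        self.balance = 0

    def charge(self, request_id, amount):
        if request_id in self.results:      # duplicate delivery: do not charge again
            return self.results[request_id]
        self.balance += amount              # the non-idempotent side effect
        self.results[request_id] = self.balance
        return self.balance


def call_at_least_once(server, amount, lost_acks, max_attempts=5):
    """Retry until an acknowledgement arrives, reusing one request ID for every attempt."""
    request_id = str(uuid.uuid4())
    for attempt in range(max_attempts):
        result = server.charge(request_id, amount)   # the server always receives the request
        if attempt < lost_acks:                      # simulate the ACK being lost in transit
            continue                                 # the client times out and retries
        return result
    raise TimeoutError("no acknowledgement received")


server = DedupServer()
final_balance = call_at_least_once(server, amount=10, lost_acks=2)
```

The client sends the request three times here, but the server applies the charge only once because all three attempts carry the same ID.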


How to keep order when consuming async messages (such as SQS or any other messaging service)

I've encountered this problem a few times and now I wonder what the industry best practice is. The context: we have a data store which aggregates pieces of information taken from multiple micro-services. The data comes to us through messages broadcast by every source when there is a change.
The problem is how to guarantee that our data will be eventually consistent and that the updates are applied in the order they were meant to be received. For example, let's say we have an entity User:
    User {
        display_name: String,
        email: String,
        bio: String
    }
We listen for changes on those users to keep "display_name" updated in our data store. The messages come in a format such as:
    event: "UserCreated",
    id: 1000,
    display_name: "MyNewUser"

    event: "UserChanged",
    id: 1000,
    display_name: "MyNewUser2"
There is a scenario where "UserChanged" reaches our listeners before "UserCreated", so our code won't be able to find the user with id 1000 and will fail both transactions. This is where a mechanism to order those two events is desired. We have considered:
Timestamps: The problem with timestamps is that although we know the last time we read an update we don't know how many events happened between the last event seen and the one we are currently processing
Sequence numbers: This is slightly better but if a sequence is lost then we won't update our storage unless we relax the rules a little bit, we could say that after some time if a sequence hasn't been seen then proceed with the rest of operations
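A minimal sketch of the relaxed sequence-number idea, in Python. It assumes one sequence per entity coming from a single source, starting at 1; the class name `Resequencer` and the `max_wait` parameter are hypothetical. Time is passed in explicitly so the behaviour is easy to follow, and gaps are only re-checked when another event arrives (a real consumer would probably also want a timer).

```python
class Resequencer:
    """Releases events in sequence order; skips a missing number after max_wait seconds."""

    def __init__(self, max_wait=5.0, first_seq=1):
        self.next_seq = first_seq
        self.pending = {}          # seq -> event, held until it is their turn
        self.gap_since = None      # time at which the current gap was first noticed
        self.max_wait = max_wait

    def accept(self, seq, event, now):
        """Buffer one event and return the list of events that are now safe to apply."""
        if seq >= self.next_seq:
            self.pending[seq] = event
        ready = []
        while self.pending:
            if self.next_seq in self.pending:
                ready.append(self.pending.pop(self.next_seq))
                self.next_seq += 1
                self.gap_since = None
            elif self.gap_since is None:
                self.gap_since = now                 # start waiting for the missing number
                break
            elif now - self.gap_since >= self.max_wait:
                self.next_seq = min(self.pending)    # give up on the gap and move on
                self.gap_since = None
            else:
                break
        return ready


r = Resequencer(max_wait=5.0)
first = r.accept(2, "UserChanged", now=0.0)    # held back: 1 has not arrived yet
second = r.accept(1, "UserCreated", now=1.0)   # releases both, in order
```

Note that an event arriving after its number has been skipped is silently dropped in this sketch, which is exactly the trade-off of relaxing the rule.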
If anyone knows common design patterns that tackle this sort of issue, it would be great to know. I'm also open to suggestions on data modeling, etc. Bottom line: I'm pretty sure this is a common software problem that has been solved many times before.
Thanks a lot for the help!
My first thought here would be to jump directly to a sequence-number-based approach, but this works when you have 1-to-1 communication, like in TCP-oriented communication. In your case the communication is many-to-one, so without coordination between the senders it would be challenging to implement this approach correctly (e.g. two senders can use the same sequence number).
Yes, losing the messages would be problematic, but I don't think that's the case of SQS or other cloud-based message queues (of course, it depends on the scale you're working on), because they're known for data duplication instead of data loss (AFAIK).
One idea I can think of right now is to add a new layer between the senders and the consumer, which will orchestrate the events. It can be the consumer itself, but it can be another service in front of it, let's call it orchestrator.
The orchestrator is connected with each sender (individually) via 2 queues:
The first queue is used to get the actual events from the sender
The second queue is used to signal back to the sender an ACK-like event (the message has been received, validated and successfully passed downstream to the consumer (or consumed directly)).
The way it works is the following:
The orchestrator gets the event from the sender A
It tries to execute a validation-like operation specific to the message (an update on a nonexistent user). The operation fails, so it sends a NACK message back to sender A, signaling that its message could not be processed successfully. Sender A will try to resend the message after some time.
In the meantime, it gets the "create user" message from sender B, and the message gets passed downstream to the consumer.
Finally, it will get the message from sender A (after some retries).
This solution ensures message ordering in a pretty basic way, without keeping the events in memory, but rather in the queues. It may work, but it depends on a lot of factors, like number of events, number of senders, etc.
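Here is a rough in-memory sketch of that flow in Python, just to show the ACK/NACK loop. The two queues per sender are collapsed into a single `deque`, a NACK is modelled as putting the event back at the end of the queue, and the names `try_apply`, `orchestrate` and `max_retries` are made up for the example.

```python
from collections import deque

users = {}   # the consumer's data store: id -> display_name


def try_apply(event):
    """Validate and apply one event; return True for ACK, False for NACK."""
    if event["event"] == "UserCreated":
        users[event["id"]] = event["display_name"]
        return True
    if event["event"] == "UserChanged" and event["id"] in users:
        users[event["id"]] = event["display_name"]
        return True
    return False   # e.g. an update for a user that does not exist yet


def orchestrate(incoming, max_retries=3):
    """Process events; a NACKed event goes back to the end of the queue, as if re-sent."""
    queue = deque((event, 0) for event in incoming)
    dead_letters = []
    while queue:
        event, attempts = queue.popleft()
        if try_apply(event):
            continue                                  # ACK
        if attempts + 1 < max_retries:
            queue.append((event, attempts + 1))       # NACK: the sender retries later
        else:
            dead_letters.append(event)                # give up after too many NACKs
    return dead_letters


out_of_order = [
    {"event": "UserChanged", "id": 1000, "display_name": "MyNewUser2"},
    {"event": "UserCreated", "id": 1000, "display_name": "MyNewUser"},
]
failed = orchestrate(out_of_order)
```

This only orders an update relative to the creation it depends on; two "UserChanged" events for the same user would still need a version or sequence field to be ordered against each other.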

Handling Race Conditions / Concurrency in Network Protocol Design

I am looking for possible techniques to gracefully handle race conditions in network protocol design. I find that in some cases, it is particularly hard to synchronize two nodes to enter a specific protocol state. Here is an example protocol with such a problem.
Let's say A and B are in an ESTABLISHED state and exchange data. All messages sent by A or B use a monotonically increasing sequence number, such that A can know the order of the messages sent by B, and B can know the order of the messages sent by A. At any time in this state, either A or B can send an ACTION_1 message to the other, in order to enter a different state where a strictly sequential exchange of messages needs to happen:
send ACTION_1
recv ACTION_2
send ACTION_3
However, it is possible that both A and B send the ACTION_1 message at the same time, causing both of them to receive an ACTION_1 message, while they would expect to receive an ACTION_2 message as a result of sending ACTION_1.
Here are a few possible ways this could be handled:
1) change state after sending ACTION_1 to ACTION_1_SENT. If we receive ACTION_1 in this state, we detect the race condition, and proceed to arbitrate who gets to start the sequence. However, I have no idea how to fairly arbitrate this. Since both ends are likely going to detect the race condition at about the same time, any action that follows will be prone to other similar race conditions, such as sending ACTION_1 again.
2) Duplicate the entire sequence of messages. If we receive ACTION_1 in the ACTION_1_SENT state, we include the data of the other ACTION_1 message in the ACTION_2 message, etc. This can only work if there is no need to decide who is the "owner" of the action, since both ends will end up doing the same action to each other.
3) Use absolute time stamps, but then, accurate time synchronization is not an easy thing at all.
4) Use Lamport clocks, but from what I understood these are only useful for events that are causally related. Since in this case the ACTION_1 messages are not causally related, I don't see how it could help solve the problem of figuring out which one happened first to discard the second one.
5) Use some predefined way of discarding one of the two messages on receipt by both ends. However, I cannot find a way to do this that is unflawed. A naive idea would be to include a random number on both sides, and select the message with the highest number as the "winner", discarding the one with the lowest number. However, we have a tie if both numbers are equal, and then we need another way to recover from this. A possible improvement would be to deal with arbitration once at connection time and repeat similar sequence until one of the two "wins", marking it as favourite. Every time a tie happens, the favourite wins.
Does anybody have further ideas on how to handle this?
Here is the current solution I came up with. Since I couldn't find 100% safe way to prevent ties, I decided to have my protocol elect a "favorite" during the connection sequence. Electing this favorite requires breaking possible ties, but in this case the protocol will allow for trying multiple times to elect the favorite until a consensus is reached. After the favorite is elected, all further ties are resolved by favoring the elected favorite. This isolates the problem of possible ties to a single part of the protocol.
As for fairness in the election process, I wrote something rather simple based on two values sent in each of the client/server packets. In this case, this number is a sequence number starting at a random value, but they could be anything as long as those numbers are fairly random to be fair.
When the client and server have to resolve a conflict, they both call this function with the send (their value) and the recv (the other value) values. The favorite calls this function with the favorite parameter set to TRUE. This function is guaranteed to give the opposite result on both ends, such that it is possible to break the tie without retransmitting a new message.
    BOOL ResolveConflict(BOOL favorite, UINT32 sendVal, UINT32 recvVal)
    {
        BOOL winner;
        UINT32 sendDiff; /* unsigned: the differences may not fit in an int */
        UINT32 recvDiff;
        UINT32 xorVal;

        xorVal = sendVal ^ recvVal;

        /* absolute distance of each value from xorVal */
        sendDiff = (xorVal < sendVal) ? sendVal - xorVal : xorVal - sendVal;
        recvDiff = (xorVal < recvVal) ? recvVal - xorVal : xorVal - recvVal;

        if (sendDiff != recvDiff)
            winner = (sendDiff < recvDiff) ? TRUE : FALSE; /* closest value to xorVal wins */
        else
            winner = favorite; /* break tie, make favorite win */

        return winner;
    }
Let's say that both ends enter the ACTION_1_SENT state after sending the ACTION_1 message. Both will receive the ACTION_1 message in the ACTION_1_SENT state, but only one will win. The loser accepts the ACTION_1 message and enters the ACTION_1_RCVD state, while the winner discards the incoming ACTION_1 message. The rest of the sequence continues as if the loser had never sent ACTION_1 in a race condition with the winner.
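As a quick sanity check of the "opposite result on both ends" property, here is the intended logic of the function ported to Python (a sketch for experimenting, not part of the protocol code). `exactly_one_winner` is a helper invented for this check; it evaluates the function as each end would, with A as the favorite.

```python
def resolve_conflict(favorite, send_val, recv_val):
    """Return True if the local end wins the tie-break."""
    xor_val = send_val ^ recv_val
    send_diff = abs(send_val - xor_val)
    recv_diff = abs(recv_val - xor_val)
    if send_diff != recv_diff:
        return send_diff < recv_diff    # the value closest to xor_val wins
    return favorite                     # tie: the elected favorite wins


def exactly_one_winner(a_val, b_val):
    """Evaluate both ends (A is the favorite) and check that they disagree."""
    a_wins = resolve_conflict(True, a_val, b_val)
    b_wins = resolve_conflict(False, b_val, a_val)
    return a_wins != b_wins
```

Each end swaps the roles of the two values, so whenever the distances differ the comparison flips, and when they are equal only the favorite returns true.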
Let me know what you think, and how this could be further improved.
To me, this whole idea that this ACTION_1 - ACTION_2 - ACTION_3 handshake must occur in sequence with no other message intervening is very onerous, and not at all in line with the reality of networks (or distributed systems in general). The complexity of some of your proposed solutions gives reason to step back and rethink.
There are all kinds of complicating factors when dealing with systems distributed over a network: packets which don't arrive, arrive late, arrive out of order, arrive duplicated, clocks which are out of sync, clocks which go backwards sometimes, nodes which crash/reboot, etc. etc. You would like your protocol to be robust under any of these adverse conditions, and you would like to know with certainty that it is robust. That means making it simple enough that you can think through all the possible cases that may occur.
It also means abandoning the idea that there will always be "one true state" shared by all nodes, and the idea that you can make things happen in a very controlled, precise, "clockwork" sequence. You want to design for the case where the nodes do not agree on their shared state, and make the system self-healing under that condition. You also must assume that any possible message may occur in any order at all.
In this case, the problem is claiming "ownership" of a shared clipboard. Here's a basic question you need to think through first:
1) If all the nodes involved cannot communicate at some point in time, should a node which is trying to claim ownership just go ahead and behave as if it is the owner? (This means the system doesn't freeze when the network is down, but it means you will have multiple "owners" at times, and there will be divergent changes to the clipboard which have to be merged or otherwise "fixed up" later.)
2) Or, should no node ever assume it is the owner unless it receives confirmation from all other nodes? (This means the system will freeze sometimes, or just respond very slowly, but you will never have weird situations with divergent changes.)
If your answer is #1: don't focus so much on the protocol for claiming ownership. Come up with something simple which reduces the chances that two nodes will both become "owner" at the same time, but be very explicit that there can be more than one owner. Put more effort into the procedure for resolving divergence when it does happen. Think that part through extra carefully and make sure that the multiple owners will always converge. There should be no case where they can get stuck in an infinite loop trying to converge but failing.
If your answer is #2: here be dragons! You are trying to do something which butts up against some fundamental limitations.
Be very explicit that there is a state where a node is "seeking ownership", but has not obtained it yet.
When a node is seeking ownership, I would say that it should send a request to all other nodes, at intervals (in case another one misses the first request). Put a unique identifier on each such request, which is repeated in the reply (so delayed replies are not misinterpreted as applying to a request sent later).
To become owner, a node should receive a positive reply from all other nodes within a certain period of time. During that wait period, it should refuse to grant ownership to any other node. On the other hand, if a node has agreed to grant ownership to another node, it should not request ownership for another period of time (which must be somewhat longer).
If a node thinks it is owner, it should notify the others, and repeat the notification periodically.
You need to deal with the situation where two nodes both try to seek ownership at the same time, and both NAK (refuse ownership to) each other. You have to avoid a situation where they keep timing out, retrying, and then NAKing each other again (meaning that nobody would ever get ownership).
You could use exponential backoff, or you could make a simple tie-breaking rule (it doesn't have to be fair, since this should be a rare occurrence). Give each node a priority (you will have to figure out how to derive the priorities), and say that if a node which is seeking ownership receives a request for ownership from a higher-priority node, it will immediately stop seeking ownership and grant it to the high-priority node instead.
This will not result in more than one node becoming owner, because if the high-priority node had previously ACKed the request sent by the low-priority node, it would not send a request of its own until enough time had passed that it was sure its previous ACK was no longer valid.
You also have to consider what happens if a node becomes owner, and then "goes dark" -- stops responding. At what point are other nodes allowed to assume that ownership is "up for grabs" again? This is a very sticky issue, and I suspect you will not find any solution which eliminates the possibility of having multiple owners at the same time.
Probably, all the nodes will need to "ping" each other from time to time. (Not referring to an ICMP echo, but something built in to your own protocol.) If the clipboard owner can't reach the others for some period of time, it must assume that it is no longer owner. And if the others can't reach the owner for a longer period of time, they can assume that ownership is available and can be requested.
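The two timeouts in that last paragraph could be sketched like this in Python. The constants, the `Node` class and its method names are all invented for illustration, and time is passed in as a plain number. As said above, this narrows the window for two simultaneous owners but does not eliminate it (clocks drift, and messages can be delayed).

```python
OWNER_TIMEOUT = 5.0     # owner steps down after this long without hearing from peers
CLAIM_TIMEOUT = 15.0    # peers wait this much longer before treating ownership as available


class Node:
    def __init__(self, is_owner, now):
        self.is_owner = is_owner
        self.last_heard = now       # last time a protocol-level ping was received

    def on_ping(self, now):
        self.last_heard = now

    def tick(self, now):
        """Called periodically; returns True if this node may now request ownership."""
        silence = now - self.last_heard
        if self.is_owner:
            if silence >= OWNER_TIMEOUT:
                self.is_owner = False    # cannot reach the others: assume ownership is lost
            return False
        return silence >= CLAIM_TIMEOUT


owner = Node(is_owner=True, now=0.0)
peer = Node(is_owner=False, now=0.0)
```

Because `CLAIM_TIMEOUT` is longer than `OWNER_TIMEOUT`, a silent owner has already stepped down by the time a peer is allowed to request ownership, assuming both measure time at roughly the same rate.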
Here is a simplified answer for the protocol of interest here.
In this case, there is only a client and a server, communicating over TCP. The goal of the protocol is to synchronize two system clipboards. The regular state when outside of a particular sequence is simply "CLIPBOARD_ESTABLISHED".
Whenever one of the two systems pastes something onto its clipboard, it sends a ClipboardFormatListReq message, and transitions to the CLIPBOARD_FORMAT_LIST_REQ_SENT state. This message contains a sequence number that is incremented when sending the ClipboardFormatListReq message. Under normal circumstances, no race condition occurs and a ClipboardFormatListRsp message is sent back to acknowledge the new sequence number and owner. The list contained in the request is used to expose clipboard data formats offered by the owner, and any of these formats can be requested by an application on the remote system.
When an application requests one of the data formats from the clipboard owner, a ClipboardFormatDataReq message is sent with the sequence number, and format id from the list, the state is changed to CLIPBOARD_FORMAT_DATA_REQ_SENT. Under normal circumstances, there is no change of clipboard ownership during that time, and the data is returned in the ClipboardFormatDataRsp message. A timer should be used to timeout if no response is sent fast enough from the other system, and abort the sequence if it takes too long.
Now, for the special cases:
If we receive ClipboardFormatListReq in the CLIPBOARD_FORMAT_LIST_REQ_SENT state, it means both systems are trying to gain ownership at the same time. Only one owner should be selected; in this case, we can keep it simple and elect the client as the default winner. With the client as the default owner, the server should respond to the client with ClipboardFormatListRsp and consider the client as the new owner.
If we receive ClipboardFormatDataReq in the CLIPBOARD_FORMAT_LIST_REQ_SENT state, it means we have just received a request for data from the previous list of data formats, since we have just sent a request to become the new owner with a new list of data formats. We can respond with a failure right away, and sequence numbers will not match.
Etc, etc. The main issue I was trying to solve here is fast recovery from such states, without going into a loop of retrying until it works. The main issue with immediate retrial is that it is going to happen with timing likely to cause new race conditions. We can solve the issue by expecting such inconsistent states, as long as we can move back to proper protocol states when detecting them. The other part of the problem is electing a "winner" that will have its request accepted without resending new messages. A default winner can be chosen, such as the client or the server, or some sort of random voting system can be implemented with a default favorite to break ties.
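A stripped-down sketch of those state transitions in Python, with the client as the default winner. The state and message names come from the description above; the `Endpoint` class and its method names are hypothetical, messages are plain strings, and sequence numbers and the data-request timer are left out to keep it short.

```python
ESTABLISHED = "CLIPBOARD_ESTABLISHED"
LIST_REQ_SENT = "CLIPBOARD_FORMAT_LIST_REQ_SENT"


class Endpoint:
    def __init__(self, is_client):
        self.is_client = is_client      # the client is the default winner of a race
        self.state = ESTABLISHED
        self.is_owner = False

    def local_clipboard_changed(self):
        """The local clipboard changed: announce the new format list."""
        self.state = LIST_REQ_SENT
        return "ClipboardFormatListReq"

    def on_list_req(self):
        """Peer announced a format list; returns the reply to send, or None to ignore."""
        if self.state == LIST_REQ_SENT and self.is_client:
            return None                 # race detected, we win: discard the peer's request
        # Normal case, or race detected and we lose: accept the peer as owner.
        self.state = ESTABLISHED
        self.is_owner = False
        return "ClipboardFormatListRsp"

    def on_list_rsp(self):
        if self.state == LIST_REQ_SENT:
            self.state = ESTABLISHED
            self.is_owner = True

    def on_data_req(self):
        """A data request that raced with our new list refers to the old list: fail it."""
        if self.state == LIST_REQ_SENT or not self.is_owner:
            return "ClipboardFormatDataRsp(failure)"
        return "ClipboardFormatDataRsp(data)"


client, server = Endpoint(is_client=True), Endpoint(is_client=False)
client.local_clipboard_changed()
server.local_clipboard_changed()          # both requests cross on the wire
client_reply = client.on_list_req()       # client wins and ignores the server's request
server_reply = server.on_list_req()       # server loses and acknowledges the client
client.on_list_rsp()
```

Both ends return to CLIPBOARD_ESTABLISHED after a single round trip, with no retransmission needed to break the tie.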

How do I know which UDP packets I have already received?

I am making a game using a client-server model with UDP. Here's how I have implemented it so far:
All packets include a sequence number and a flag specifying whether they are "important".
Important message types require acknowledgement and will be re-sent after a delay if no acknowledgement is received.
Most message types are "unimportant" - that is, they do not require an acknowledgement, and if such a message is received with an older sequence number than the latest, it is dropped.
My dilemma is this: if an "important" message arrives twice, I only want to process it once. But how will I know that I have already received it, without keeping an ever-expanding list in memory?
1) Just remember the last X "important" messages received - the likelihood of receiving a VERY old message is slim (not ideal as it's not 100% reliable).
2) Use TCP for "important" messages (not ideal due to the complications and overhead involved in managing 2 protocols simultaneously).
3) Have a separate sequence number for "important" messages and ensure that these are always received in order, so only the most recent message needs to be remembered (I'm leaning towards this).
Any other ideas?
Use #3. The fact that you are ACK-ing the important messages provides the mechanism to ensure they are received in order, i.e. don't ACK an out-of-sequence one, and just remember the sequence number of the last one you ACK-ed.
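A small receiver-side sketch of this in Python, assuming the sender numbers its "important" messages from 0 and keeps retransmitting until it gets an ACK. The class and method names are made up. Re-ACKing a duplicate is my own addition (otherwise a sender whose ACK was lost would retransmit forever).

```python
class OrderedReceiver:
    """Accepts "important" messages strictly in order; only the last ACKed number is stored."""

    def __init__(self):
        self.last_acked = -1     # assumes important messages are numbered from 0
        self.processed = []

    def on_important(self, seq, payload):
        """Return the sequence number to ACK, or None to stay silent."""
        if seq == self.last_acked + 1:
            self.processed.append(payload)   # first time we see the next expected message
            self.last_acked = seq
            return seq
        if seq <= self.last_acked:
            return seq       # duplicate: our ACK was probably lost, so ACK again
        return None          # arrived too early: no ACK, the sender will retransmit


receiver = OrderedReceiver()
```

The memory cost is a single integer, at the price of stalling all later important messages until the missing one is retransmitted.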
Have a separate sequence number for "important" messages (starting from zero), and the following variables:
a variable min_recv, indicating that you received all "important" messages from 0 to min_recv (excluded);
a list of the "important" sequence number that you already have received.
At any time (e.g. after receiving another "important" message), you store its sequence number in the list; then you can check if you can compact the list:
    while list contains `min_recv`:
        remove `min_recv` from list
        increment min_recv
In this way you consume minimal memory, because even when you receive out-of-order important messages (and the size of the list will start to grow), eventually you will receive the missing message, because it will be retransmitted if lost, and you will empty the list.
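The same idea as runnable Python, using a set instead of a list so the membership test is cheap. `DuplicateFilter` and `is_new` are names chosen for this sketch.

```python
class DuplicateFilter:
    def __init__(self):
        self.min_recv = 0        # every sequence number below this has been received
        self.seen = set()        # received numbers that are >= min_recv

    def is_new(self, seq):
        """Record seq and return True only the first time it is seen."""
        if seq < self.min_recv or seq in self.seen:
            return False
        self.seen.add(seq)
        while self.min_recv in self.seen:    # compact the contiguous prefix
            self.seen.remove(self.min_recv)
            self.min_recv += 1
        return True


f = DuplicateFilter()
```

Process an "important" message only when `is_new` returns True, and send the ACK in either case.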

A MailboxProcessor that operates with a LIFO logic

I am learning about F# agents (MailboxProcessor).
I am dealing with a rather unconventional problem.
I have one agent (dataSource) which is a source of streaming data. The data has to be processed by an array of agents (dataProcessor). We can consider dataProcessor as some sort of tracking device.
Data may flow in faster than the speed with which the dataProcessor may be able to process its input.
It is OK to have some delay. However, I have to ensure that the agent stays on top of its work and does not get piled under obsolete observations
I am exploring ways to deal with this problem.
The first idea is to implement a stack (LIFO) in dataSource. dataSource would send over the latest observation available when dataProcessor becomes available to receive and process the data. This solution may work but it may get complicated as dataProcessor may need to be blocked and re-activated; and communicate its status to dataSource, leading to a two way communication problem. This problem may boil down to a blocking queue in the consumer-producer problem but I am not sure..
The second idea is to have dataProcessor take care of message sorting. In this architecture, dataSource will simply post updates in dataProcessor's queue. dataProcessor will use Scan to fetch the latest data available in its queue. This may be the way to go. However, I am not sure if in the current design of MailboxProcessor it is possible to clear a queue of messages, deleting the older obsolete ones. Furthermore, here, it is written that:
Unfortunately, the TryScan function in the current version of F# is
broken in two ways. Firstly, the whole point is to specify a timeout
but the implementation does not actually honor it. Specifically,
irrelevant messages reset the timer. Secondly, as with the other Scan
function, the message queue is examined under a lock that prevents any
other threads from posting for the duration of the scan, which can be
an arbitrarily long time. Consequently, the TryScan function itself
tends to lock-up concurrent systems and can even introduce deadlocks
because the caller's code is evaluated inside the lock (e.g. posting
from the function argument to Scan or TryScan can deadlock the agent
when the code under the lock blocks waiting to acquire the lock it is
already under).
Having the latest observation bounced back may be a problem.
The author of this post, Jon Harrop, suggests that
I managed to architect around it and the resulting architecture was actually better. In essence, I eagerly Receive all messages and filter using my own local queue.
This idea is surely worth exploring but, before starting to play around with code, I would welcome some inputs on how I could structure my solution.
Thank you.
Sounds like you might need a destructive scan version of the mailbox processor, I implemented this with TPL Dataflow in a blog series that you might be interested in.
My blog is currently down for maintenance but I can point you to the posts in markdown format.
You can also check out the code on github
I also wrote about the issues with scan in my lurking horror post
Hope that helps...
tl;dr I would try this: take Mailbox implementation from FSharp.Actor or Zach Bray's blog post, replace ConcurrentQueue by ConcurrentStack (plus add some bounded capacity logic) and use this changed agent as a dispatcher to pass messages from dataSource to an army of dataProcessors implemented as ordinary MBPs or Actors.
tl;dr2 If workers are a scarce and slow resource and we need to process a message that is the latest at the moment when a worker is ready, then it all boils down to an agent with a stack instead of a queue (with some bounded capacity logic) plus a BlockingQueue of workers. The dispatcher dequeues a ready worker, then pops a message from the stack and sends this message to the worker. After the job is done the worker enqueues itself to the queue when it becomes ready (e.g. before let! msg = inbox.Receive()). The dispatcher's consumer thread then blocks until any worker is ready, while the producer thread keeps the bounded stack updated. (A bounded stack could be done with an array + offset + size inside a lock; the one below is more complex than needed.)
MailboxProcessor is designed to have only one consumer. This is even commented in the source code of MBP here (search for the word 'DRAGONS' :) )
If you post your data to MBP then only one thread could take it from internal queue or stack.
In your particular use case I would use ConcurrentStack directly or better wrapped into BlockingCollection:
It will allow many concurrent consumers
It is very fast and thread safe
BlockingCollection has a BoundedCapacity property that allows you to limit the size of a collection. It throws on Add, but you could catch it or use TryAdd. If A is a main stack and B is a standby, then TryAdd to A, on false Add to B and swap the two with Interlocked.Exchange, then process needed messages in A, clear it, make a new standby - or use three stacks if processing A could take longer than it takes B to become full again; in this way you do not block and do not lose any messages, but can discard unneeded ones in a controlled way.
BlockingCollection has methods like AddToAny/TakeFromAny, which work on arrays of BlockingCollections. This could help, e.g.:
dataSource produces messages to a BlockingCollection with ConcurrentStack implementation (BCCS)
another thread consumes messages from BCCS and sends them to an array of processing BCCSs. You said that there is a lot of data. You may sacrifice one thread to be blocking and dispatching your messages indefinitely
each processing agent has its own BCCS or implemented as an Agent/Actor/MBP to which the dispatcher posts messages. In your case you need to send a message to only one processorAgent, so you may store processing agents in a circular buffer to always dispatch a message to least recently used processor.
Something like this:
                  (data stream produces 'T)
                     [dispatcher's BCCS]
(a dispatcher thread consumes 'T and pushes to processors, manages capacity of BCCS and LRU queue)
              |                                   |
[processor1's BCCS/Actor/MBP]   ...   [processorN's BCCS/Actor/MBP]
              |                                   |
          (process)                           (process)
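To show the shape of the dispatcher without committing to any particular .NET type, here is a language-neutral sketch in Python. It is not a port of BlockingCollection: `BoundedLifo` and `dispatch` are invented names, plain lists stand in for the processing agents, and "least recently used" is approximated by simple round-robin.

```python
import threading
from collections import deque


class BoundedLifo:
    """Thread-safe stack that keeps only the newest `capacity` items."""

    def __init__(self, capacity):
        self._items = deque(maxlen=capacity)   # appending when full drops the oldest item
        self._cond = threading.Condition()

    def push(self, item):
        with self._cond:
            self._items.append(item)
            self._cond.notify()

    def pop_latest(self, timeout=None):
        """Block until an item is available, then return the newest one (None on timeout)."""
        with self._cond:
            if not self._cond.wait_for(lambda: len(self._items) > 0, timeout):
                return None
            return self._items.pop()


def dispatch(stack, processors, count):
    """Hand up to `count` items to the processors in round-robin order."""
    for i in range(count):
        item = stack.pop_latest(timeout=1.0)
        if item is None:
            break
        processors[i % len(processors)].append(item)


stack = BoundedLifo(capacity=3)
for observation in [1, 2, 3, 4, 5]:
    stack.push(observation)        # 1 and 2 are dropped as obsolete
processors = [[], []]
dispatch(stack, processors, count=3)
```

In a real setup the producer and the dispatcher would run on separate threads; the condition variable is what lets the dispatcher block until new data arrives.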
Instead of ConcurrentStack, you may want to read about heap data structure. If you need your latest messages by some property of messages, e.g. timestamp, rather than by the order in which they arrive to the stack (e.g. if there could be delays in transit and arrival order <> creation order), you can get the latest message by using heap.
If you still need Agents semantics/API, you could read several sources in addition to Dave's links, and somehow adopt implementation to multiple concurrent consumers:
An interesting article by Zach Bray on efficient Actors implementation. There you do need to replace (under the comment // Might want to schedule this call on another thread.) the line execute true by a line async { execute true } |> Async.Start or similar, because otherwise producing thread will be consuming thread - not good for a single fast producer. However, for a dispatcher like described above this is exactly what needed.
FSharp.Actor (aka Fakka) development branch and FSharp MBP source code (first link above) here could be very useful for implementation details. The FSharp.Actor library has been in a freeze for several months but there is some activity in the dev branch.
Don't miss the discussion about Fakka in Google Groups in this context.
I have a somewhat similar use case and for the last two days I have researched everything I could find on the F# Agents/Actors. This answer is a kind of TODO for myself to try these ideas, of which half were born during writing it.
The simplest solution is to greedily eat all messages in the inbox when one arrives and discard all but the most recent. Easily done using TryReceive:
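    // Keep draining the inbox without waiting; return the last message found.
    let rec readLatestLoop oldMsg =
        async {
            let! newMsg = inbox.TryReceive 0
            match newMsg with
            | None -> return oldMsg
            | Some newMsg -> return! readLatestLoop newMsg
        }

    // Wait for at least one message, then skip ahead to the most recent one.
    let readLatest () =
        async {
            let! msg = inbox.Receive()
            return! readLatestLoop msg
        }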
When faced with the same problem I architected a more sophisticated and efficient solution I called cancellable streaming and described in an F# Journal article here. The idea is to start processing messages and then cancel that processing if they are superseded. This significantly improves concurrency if significant processing is being done.

Biztalk Ordered Delivery failure

We have a BizTalk application where the order of messages being inputted is very important and has to be kept, meaning they have to be outputted in the same order. Normally ordered delivery would do the trick here.
However, I read that ordered delivery is only guaranteed when you connect a receive location directly to a send port. The moment you use orchestrations, ordered delivery isn't guaranteed anymore. Is there a way to work around or fix this? Because this kind of ruins our whole application and we've been working on this for months.
I read a workaround from Microsoft where they use an extra field which has a counter and where they use an end orchestration which checks the counters. But this is way too much work for us to do now, so this workaround is a no-go. Plus, not all messages are translated, which creates holes in our flow, and not all messages are coming from the same source either, which makes this workaround useless anyway.
Any other ideas?
Check out this page.
It explains that if you have an orchestration that follows the singleton pattern to ensure only one instance of the orchestration exists, and you make sure you set the orchestration's receive port to ordered delivery, then you should get a valid end-to-end ordered delivery scenario.
To provide end-to-end ordered delivery the following conditions must be met:
Messages must be received with an adapter that preserves the order of the messages when submitting them to BizTalk Server. In BizTalk Server 2006, examples of such adapters are MSMQ, MQSeries, and MSMQT. In addition, HTTP or SOAP adapters can be used to submit messages in order, but in that case the HTTP or SOAP client needs to enforce the order by submitting messages one at a time.
You must subscribe to these messages with a send port that has the Ordered Delivery option set to True.
If an orchestration is used to process the messages, only a single instance of the orchestration should be used, the orchestration should be configured to use a sequential convoy, and the Ordered Delivery property of the orchestration's receive port should be set to True.
Resequencing strategies for ordered delivery in BizTalk:
I recently responded to a LinkedIn user's question regarding ordered delivery options in BizTalk.
I thought it would be useful for people to understand some of the strategies for re-sequencing messages using BizTalk.
Often, as a BizTalk Developer, you are required to integrate with line-of-business systems which are unchangeable. This can be for one or more of many different reasons. As an example, the cost of changing a system can be too high, or the vendor license states that support may be withdrawn if the system is changed.
This would not normally represent a problem where the vendor has provided a thoughtfully designed API as a point-of-integration endpoint. However, as many Integration Developers quickly learn, this is very rarely the case.
What do I mean by a thoughtfully designed API? Well, aside from all the SODA principles (service composition, fault contracts etc.), the most important feature of an API is to support the consumption of data which arrives in the wrong order.
This is a fairly simple thing to do. For example, if you are a vendor and you provide an HTTP operation as your integration point, then one of the fields you could expose on your operation is a time-stamp or, even better, a sequence number. This means that if a call is made with an out-of-date payload, the relevant compensating mechanism can kick in, which can be as simple as discarding the data.
This article discusses what to do when the vendor has not built this feature into an API, and as an integrator you therefore are forced to implement end-to-end ordered delivery as part of your integration solution.
As stated in my response to the user's post on LinkedIn (see link above), in BizTalk ordered delivery in any but the simplest of scenarios is complicated at best and at worst can represent a huge cost in increased complexity, both in terms of development and support. The basic reason is that BizTalk is designed to be massively concurrent to enable high throughput, and there is a direct and unavoidable conflict between concurrency and ordering. Shoe-horning E2E ordered delivery into a BizTalk solution relies on artefacts such as singleton orchestrations which introduce complexity and increase both failure rate and cost-per-failure numbers.
A far better solution is to maintain concurrent processing to as near as possible to the line-of-business system endpoints, and then implement what is called a re-sequencer wrapper around each of the endpoints which require data to be delivered in the correct order.
How to implement such a wrapper in BizTalk depends on some factors, which are outlined in the following table:
|Sequencing field|Messages are deltas?|Database integration?|Wrapper strategy|
|---|---|---|---|
|n of a total m| N | Y |Stored procedure |
|n of a total m| N | N |Singleton orchestration |
|n of a total m| Y | Y |Batched singleton orchestration |
|n of a total m| Y | N |Batched singleton orchestration |
|Timestamp | N | Y |Stored procedure |
|Timestamp | N | N |Singleton orchestration |
|Timestamp | Y | Y |Buffer table with staggered reader|
|Timestamp | Y | N |Buffer table with staggered reader|
The first factor, "Sequencing field", relates to the idea that in order to implement any kind of re-sequencer wrapper, as a minimum you will require that your message data contains some sequencing information. This can take the form of a source time-stamp; however, a better, though rarer, kind of sequencing information consists of a sequence number combined with the total number of messages, for example, message 1 of 10, 2 of 10, etc.
The second factor, "Messages are deltas?", relates to whether the payload of your message contains a single state change to the data or the sum of all past changes to the data. Put another way, is it possible to reconstruct the full current state of the data from this message? If the message payload contains just a single change then it may not be possible to reconstruct the state of the data from the single message, and in this instance your message is a delta.
The third factor, "Database integration?", relates to whether or not the integration entry point to a system is a database. The reason this matters is that integrating at the database layer is a fairly common integration scenario, and if available it can greatly simplify handling re-sequencing.
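Read as a decision rule, the table collapses to two questions: for non-delta messages only the database factor matters, and for delta messages only the sequencing field matters. A small Python sketch of that reading (the function name and the string labels are chosen here purely for illustration):

```python
def choose_wrapper(sequencing_field: str, is_delta: bool, database_integration: bool) -> str:
    """Return the wrapper strategy the table lists for the given three factors."""
    if sequencing_field not in ("n of m", "timestamp"):
        raise ValueError("sequencing_field must be 'n of m' or 'timestamp'")
    if not is_delta:
        # Full-state messages: stale ones can be discarded, wherever that check lives.
        return "Stored procedure" if database_integration else "Singleton orchestration"
    if sequencing_field == "n of m":
        # Deltas with a known batch size can be collected and replayed in order.
        return "Batched singleton orchestration"
    # Deltas with only a time-stamp: the batch size is unknown, so buffer and wait.
    return "Buffer table with staggered reader"
```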
The strategies from the above table are described in detail below:
Stored procedure wrapper
This is the simplest of the resequencing strategies. A new stored procedure is created which queries the target data before making a decision about whether to update the target data. The decision can be as simple as "Is the data I have newer than the data in the target system?"
Of course, in order to implement this strategy, the target data also has to include the sequencing field of the source data, although an approximation can be made if necessary by relying on existing time-stamps which may already exist in the target data. The stored procedure wrapper can be contained either in the target database or ideally in a separate database.
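The query-then-decide logic of such a procedure might look like the sketch below. It uses Python with an in-memory SQLite database as a stand-in for the real target database and a real stored procedure; the `target` table, its columns and both function names are assumptions made for the example.

```python
import sqlite3

def create_target_table(conn: sqlite3.Connection) -> None:
    # Hypothetical target table that carries the source sequencing field.
    conn.execute(
        "CREATE TABLE target (id TEXT PRIMARY KEY, sequence INTEGER NOT NULL, payload TEXT)"
    )

def apply_if_newer(conn: sqlite3.Connection, entity_id: str, sequence: int, payload: str) -> bool:
    """Write the payload only if it is newer than what the target already holds."""
    with conn:  # one transaction: the check and the write succeed or fail together
        row = conn.execute(
            "SELECT sequence FROM target WHERE id = ?", (entity_id,)
        ).fetchone()
        if row is not None and sequence <= row[0]:
            return False  # stale message: discard it
        conn.execute(
            "INSERT OR REPLACE INTO target (id, sequence, payload) VALUES (?, ?, ?)",
            (entity_id, sequence, payload),
        )
        return True
```

Because the check happens next to the data, BizTalk can keep processing messages concurrently and simply call the procedure in whatever order they arrive.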
Singleton orchestration wrapper
The idea behind this strategy is the singleton orchestration. This is a pattern you can implement to ensure that only a single instance of the orchestration will exist at any one time. There are many articles on the web demonstrating how to implement this pattern in BizTalk.
The core of the idea is that the singleton simply keeps track of the most recent successfully processed message sequence (or time-stamp). If the singleton receives a message which is older than the most recent sequence, it is simply discarded. This works because the messages are non-deltas, so the target system can commit only the most recent of a number of messages and the data will be in the most recent state. Only when data is committed successfully is the most recent sequence held by the singleton updated.
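The bookkeeping the singleton performs can be sketched outside BizTalk in a few lines of Python. The class name and the `commit` callable (assumed to raise an exception when the target system rejects the write) are hypothetical.

```python
class LatestOnlyResequencer:
    """Tracks the newest committed sequence and drops anything older."""

    def __init__(self, commit):
        self._commit = commit        # hypothetical callable: writes to the target, raises on failure
        self.last_sequence = None    # most recent successfully committed sequence

    def handle(self, sequence, payload) -> bool:
        if self.last_sequence is not None and sequence < self.last_sequence:
            return False             # older than what is already committed: discard
        self._commit(payload)        # if this raises, last_sequence stays unchanged
        self.last_sequence = sequence
        return True
```

Updating `last_sequence` only after the commit returns is what lets a message that failed to commit be retried later without being mistaken for stale data.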
Batched singleton orchestration wrapper
This strategy is based on the Singleton orchestration wrapper above, except it is more complex. Rather than only keep the most recent sequence information in memory the singleton is required to create and hold a working set of messages in memory which it will re-order and then process once all expected messages from the batch have arrived. This is because the messages are deltas so the target system MUST receive each message in the order they were intended. Once the batch has been sent successfully the singleton can terminate.
To do this it is a requisite of the source data that it contain a correlation identifier of some description which allows the batch of messages to be defined. For example, processing a defined set of orders from a customer, the inbound messages must contain an identifier for the customer. This can then be used to route the messages to the singleton orchestration instance correlated with this customer. Furthermore the message sequence field available must be of the n of a total m form.
Once the singleton is initialised it assembles a working set of messages in memory and proceeds to populate it as new messages arrive. One way I have seen this done is using a System.Collections.Generic.List as the container for the working set. Once the list has been fully populated (list length = m) then it is assumed all messages in the batch have been received and the orchestration then loops over the working set in sequence and processes the messages into the target system.
One of the benefits of the batched singleton orchestration wrapper is it allows concurrent processing by correlation identifier. In the example above this means that messages from two customers would be processed concurrently.
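A minimal Python sketch of the working set, under a few assumptions: it keeps one working set per correlation identifier in a single object rather than one orchestration instance each, it stores messages in a dictionary keyed by n instead of the List mentioned above (so a repeated n overwrites rather than inflating the count), and `deliver` is a hypothetical callable standing in for the send to the target system.

```python
from collections import defaultdict

class BatchResequencer:
    """Collects 'n of m' messages per correlation id and releases each batch in order."""

    def __init__(self, deliver):
        self._deliver = deliver               # hypothetical callable: (correlation_id, payload)
        self._batches = defaultdict(dict)     # correlation id -> {n: payload}

    def handle(self, correlation_id, n: int, m: int, payload) -> bool:
        batch = self._batches[correlation_id]
        batch[n] = payload
        if len(batch) < m:
            return False                      # still waiting for the rest of the batch
        for position in sorted(batch):        # replay the deltas in their intended order
            self._deliver(correlation_id, batch[position])
        del self._batches[correlation_id]     # batch sent: this working set can go away
        return True
```

For example, if customer A's messages arrive as 2 of 3, 3 of 3, 1 of 3, nothing is delivered until the third arrival, at which point all three are sent as 1, 2, 3. A batch for another customer completes independently of A's.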
Buffer table with staggered reader wrapper
Arguably the most complex of the strategies presented, this solution is to be used when you have delta messaging with a time-stamp-based sequencing field. It can be implemented with a database of some description which acts as a re-sequencing buffer.
It is worth noting here that this re-sequencing wrapper does not guarantee ordered delivery, but used well it makes ordered delivery highly likely.
As messages arrive, they are written into the buffer and in the same operation the buffer is reordered, so that the order of messages held in the buffer is always correct.
To create the buffer reader, have a receive location which reads the messages in the buffer before passing the messages to a send port with ordered delivery enabled, which then will process the messages into the target system. You can also use a singleton orchestration as an intermediary if your target system's API semantics are too complex for a send port.
However, using this wrapper as I have described it above will not enable ordered delivery, as the messages will almost certainly be committed to the buffer in the wrong order, which will result in the messages being processed into the target system in the same (wrong) order. This is where the staggered query comes in. This is a fancy way of saying your buffer query needs to only select data at intervals of time T, AND only select those rows where the row-number is lower than buffer total row count minus C.
This has the effect of allowing sequencing to occur over an appropriate timespan. T will be familiar to most BizTalk developers as the polling interval of some adapters (such as the WCF-SQL adapter). C is slightly more difficult to set, but by increasing this number you are reducing the chances that when you poll, you will miss a message older than the most recent one in your retrieved data set.
What T and C should be depends on many things, although these values should be based on your latency SLA and your message volume (or throughput). As a guideline, if you have an SLA to deliver data into your target system within 30 seconds and you process 10 messages per second, then T should be around 10 seconds and C should be around 100 rows.
Of course this only works if your messages for a given correlation id are sent by the source system during a short space of time (ideally back-to-back). The longer the interval between sends, the greater the required value of C, and the less effective the wrapper becomes.
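The buffer and its staggered read can be sketched in Python as below. An in-memory sorted list stands in for the buffer table, the class and method names are hypothetical, and "row-number lower than total minus C" is interpreted here as "everything except the newest C rows"; the exact boundary depends on how rows are numbered. Calling `poll` once every T seconds is left to the caller.

```python
import bisect

class ResequencingBuffer:
    """Keeps rows ordered by time-stamp and holds back the newest C rows on each poll."""

    def __init__(self, hold_back: int):
        self.hold_back = hold_back            # C: rows kept back in case stragglers arrive
        self._rows = []                       # (timestamp, payload), always sorted

    def write(self, timestamp, payload) -> None:
        # Insert in time-stamp order, so the buffer is reordered as part of the write.
        bisect.insort(self._rows, (timestamp, payload))

    def poll(self) -> list:
        """Return and remove every row except the newest hold_back rows."""
        ready_count = max(0, len(self._rows) - self.hold_back)
        ready = self._rows[:ready_count]
        self._rows = self._rows[ready_count:]
        return [payload for _, payload in ready]
```

Under the guideline figures, C = 100 rows at 10 messages per second means a message is held back for roughly 10 seconds, and it may then wait up to T = 10 seconds for the next poll, giving a worst case of about 20 seconds against the 30 second SLA. This estimate assumes a steady message rate.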
One of the benefits of this strategy is you can also perform de-duplication of messages in the buffer if your data source is prone to sending duplicate messages and your target system endpoint is not idempotent. You can also use the buffer to implement FILO and other non-standard queueing semantics.
The strategies I have discussed here are ways of bending BizTalk to a task which it wasn't designed to do. As a result, each has caveats around cost and complexity to support, and may not work in certain scenarios. I would like to hear from anyone who has implemented other patterns for ordered delivery in BizTalk.
