Release memory by removing consumed messages in BizTalk Orchestration? - biztalk

I have built an orchestration with a loop that retrieves paged data from a REST web service. Using the page size and an offset, I call the service for the "next page" of data. I then debatch the page, map it to an internal format and process it further. When one page has been processed, I request the next page from the REST web service.
As it turns out, memory in the host running the orchestration and send ports grows constantly while all the data is processed, and the host eventually goes into throttling.
Why is memory not released when I am done with one iteration of the page loop? Is it the "consumed" messages stored in the orchestration that build up the memory? Is it possible to clear these "consumed" messages from the orchestration to release the memory used?
(No message tracking active on the orchestration, or send ports.)

Apparently, there is no way to prevent a BizTalk orchestration from building up a list of messages, including used/processed/consumed ones. Putting things in a Scope does not prevent this behaviour.
Hence, a long-running orchestration can accumulate a lot of messages. This is especially true for singleton orchestrations, where the generally proposed solution is to make sure the orchestration shuts down once in a while (e.g. when idle).
My solution was to split the orchestration in two: the initial orchestration starts the second one with the Start Orchestration shape, and the second one in turn starts itself recursively for each following page, until the last page is received and the last orchestration instance ends.
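The hand-off pattern can be sketched outside BizTalk. The Python below is only an illustration of the control flow, not orchestration code: each call to process_one_page stands in for one instance of the second orchestration, and the loop in run_chain stands in for the Start Orchestration hand-off. All names, and the in-memory fake_fetch_page that replaces the REST service, are hypothetical.

```python
from typing import Callable, List, Optional


def fake_fetch_page(offset: int, page_size: int) -> List[int]:
    """Hypothetical stand-in for the paged REST call; serves 8 records in total."""
    data = list(range(8))
    return data[offset:offset + page_size]


def process_one_page(
    offset: int,
    page_size: int,
    fetch: Callable[[int, int], List[int]],
    handle: Callable[[int], None],
) -> Optional[int]:
    """Plays the role of ONE instance of the second orchestration.

    It fetches and processes a single page, then returns the offset that the
    next instance should start from, or None once the last page was handled.
    Everything it allocated for the page goes out of scope when it returns.
    """
    page = fetch(offset, page_size)
    for record in page:
        handle(record)
    if len(page) < page_size:  # a short (or empty) page means we are done
        return None
    return offset + page_size


def run_chain(
    page_size: int,
    fetch: Callable[[int, int], List[int]],
    handle: Callable[[int], None],
) -> int:
    """Stands in for the chain of Start Orchestration hand-offs.

    Only the next offset is carried from one instance to the next.
    Returns the number of instances that ran.
    """
    instances = 0
    offset: Optional[int] = 0
    while offset is not None:
        offset = process_one_page(offset, page_size, fetch, handle)
        instances += 1
    return instances
```

The point of the design is that the only state crossing from one instance to the next is the offset, so no instance ever holds more than one page's worth of messages.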

Yes, what you need to do is use scopes and make sure the messages are initialised inside the scope (green highlight below) rather than at the top level (yellow). That means they will also be disposed of at the end of the scope. Note: it also means those messages can't be used outside of the scope.
However, if you are just re-using the same messages in the loop, then I wouldn't expect memory usage to increase, so there is possibly something else going on. I suspect that you are adding each page to a message, and that is what is growing.

Related

BizTalk 2013R2: Why does my orchestration initialise after being terminated according to the Orchestration Debugger?

I am currently diagnosing some recurring issues in a BizTalk environment, at the moment the issue of zombie messages. I am aware of the conditions that create these errors. While examining the orchestration with the Orchestration Debugger, I see that when a message hits a Terminate shape, it is followed by an initialisation.
The general structure of the orchestration is as follows:
The first scope is a long-running transaction. Within the loop after that scope, there is a Listen shape that waits 10 seconds for a message. If a message arrives in time, it enters another long-running transaction. It's like a singleton in a way? Both scopes share the same logical receive port and are correlated; the only odd part is how the first scope is repeated within the loop that's inside the Listen shape. (The orchestration is part of a behemoth of an application that I didn't write.)
Would this initialisation after a termination (what actually causes it?) cause zombies? If so, are the structure of the orchestration and its transactions a cause of this? Or am I looking in the wrong place?
Let me know if there's any extra information that can help!
The Orchestration Debugger shows both when something starts and when it ends, with slightly different icons. So what you are seeing is the end of the orchestration.
No, that will not cause zombies. Zombies occur after the orchestration has finished with the logical receive that listens for a message (and is tearing down the instance subscription), when another message matching that subscription arrives before the orchestration has fully ended.

BizTalk message box full of BTXTimerMessage messages

We are using BizTalk (2013 R2 CU 6) EDI functionality to batch EDI files. This uses the Microsoft.BizTalk.Edi.BatchingOrchestration.BatchingService orchestration, which is always running in a waiting state (for lack of a better term), dehydrated most of the time. While running, the orchestration builds up BTXTimerMessage instances in the Queued (awaiting processing) state. As far as I can tell, these messages are never removed or processed. This eventually pushes us past the 50k message threshold, and throttling starts.
As far as I can tell, there is no way to set up a recurring schedule for the batcher; it must always run or be started manually. If we leave the batcher off, we get routing errors.
Currently the only way we have to eliminate these messages is to terminate the EDI batcher for each party, then restart it.
Is there a better way to purge these messages from the system, or to stop them from being generated altogether?

Deeply control orchestration throttling and dispatching in BizTalk based on message batch size

I have a BizTalk orchestration which processes a single message. Each such message is actually a batch of messages. Most of the time, the batch size n is small (<1,000), but once in a while there are very large batches (>50,000). We have a high throughput of messages as well.
The orchestration takes an amount of system memory that grows linearly, O(n), with the batch size. I know by observation that a single server can process an accumulated batch size of up to ~250k in parallel before it runs out of system memory and only returns OutOfMemoryExceptions. (This kills the BizTalk host instance, and the orchestrations start up on another host, which ultimately breaks as well, leaving our BizTalk group in a broken state that can currently only be recovered by manual intervention.)
Small batches are common, large batches are rare but kind of deadly if there is more than one at the same time.
I know the batch size in advance, so I could tell BizTalk about it, but I see no way to interact with throttling. By the time throttling detects a lack of system memory, it is already too late.
Do I have to build my own queueing and dispatching on top of BizTalk to achieve my goals?
Our current solution is a semaphore with a value of 8: every large message (n > 1,000) needs to get a semaphore slot before it is allowed to start processing. We had an edge case the other day where even this was too much. We reduced the value from 8 to 4 to resolve this, but that noticeably reduced general throughput.
Any idea or hint is welcomed!
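For readers unfamiliar with the gate described above, here is a minimal sketch of the idea in Python rather than in BizTalk. It assumes a single process and uses a thread semaphore; the class name LargeBatchGate and its parameters are hypothetical, and the real solution would need a gate shared across host instances.

```python
import threading
from typing import Callable, TypeVar

T = TypeVar("T")

LARGE_BATCH_THRESHOLD = 1000  # batches above this size count as "large"
LARGE_BATCH_SLOTS = 8         # number of large batches allowed at once


class LargeBatchGate:
    """Lets small batches through freely; large ones must hold a slot."""

    def __init__(self, slots: int = LARGE_BATCH_SLOTS,
                 threshold: int = LARGE_BATCH_THRESHOLD) -> None:
        self._slots = threading.BoundedSemaphore(slots)
        self._threshold = threshold

    def run(self, batch_size: int, work: Callable[[], T]) -> T:
        """Run `work`, waiting for a free slot first if the batch is large."""
        if batch_size <= self._threshold:
            return work()      # small batch: no gating at all
        with self._slots:      # blocks while all slots are taken
            return work()      # the slot is released when work() returns or raises
```

Lowering the slot count from 8 to 4 in such a gate limits memory use but also caps how many large batches make progress at once, which matches the throughput drop described above.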
Don't use XmlDocument within your processing. It will further exacerbate your memory issues. Prefer XmlReader for sure here. However, I'd still try to move processing outside of your orchestration. Even if you can get the streaming working in a .NET component called from the orchestration, you can still end up with an orchestration instance that runs for a long time and consumes lots of memory, which should be avoided whenever possible. Therefore...
Avoid letting the orchestration get messages that large to begin with. It may be possible to debatch the message using the out-of-the-box XmlDisassembler if you can mark the schema as an envelope schema; if not, you may need to create a custom disassembler component to do your debatching (just remember to promote/write the proper context properties from the original message to the newly created messages). If you use some streaming techniques (see https://www.microsoft.com/en-us/download/details.aspx?id=20375) in the pipeline, you can greatly reduce the memory footprint and have much greater control there. Again, use XmlReader to actually parse and debatch the message (it shouldn't be super difficult: look into ReadToFollowing and ReadSubtree, as in the question "Splitting large xml files in to sub files without memory contention"). You might get away with doing this in an orchestration rather than a pipeline component, but in a pipeline component it should be easier to control memory usage. You may also look into promoting things like a batch ID if you need to correlate the messages back together.
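To make the streaming idea concrete, here is a rough Python analogue of the ReadToFollowing/ReadSubtree approach. It is not pipeline component code: it uses the standard library's iterparse, and it assumes the batch items are direct children of the envelope root and that the item tag does not occur nested inside an item. The function name debatch is hypothetical.

```python
import io
import xml.etree.ElementTree as ET
from typing import BinaryIO, Iterator


def debatch(stream: BinaryIO, item_tag: str) -> Iterator[str]:
    """Yield each batch item as its own XML string without loading the whole batch.

    Assumption: items are direct children of the envelope root element.
    For namespaced documents, item_tag must use the "{namespace}local" form.
    """
    events = ET.iterparse(stream, events=("start", "end"))
    _, root = next(events)              # first event is the start of the envelope root
    for event, elem in events:
        if event == "end" and elem.tag == item_tag:
            elem.tail = None            # drop whitespace that followed the item
            yield ET.tostring(elem, encoding="unicode")
            root.clear()                # release items that were already emitted


if __name__ == "__main__":
    batch = b"<Batch><Order id='1'/><Order id='2'/><Order id='3'/></Batch>"
    for message in debatch(io.BytesIO(batch), "Order"):
        print(message)
```

Because the generator hands out one item at a time and clears the root after each one, memory use stays roughly proportional to the largest single item rather than to the batch size.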
If you get a large batch, you will still need to throttle the number of concurrent orchestrations; you could do so as Richard Seroter suggests here, which uses multiple convoys that correlate on instance IDs to prevent too many from running at once. Alternatively, you could use ordered delivery on the receive shape (see MSDN), which would probably be my preferred option as it takes significantly less work and won't face the concerns around zombie messages that are possible with convoys.
Basically: try to think small and lean as much as possible and BizTalk will be happier. BizTalk would much rather process 1000 small messages in a second than 1 very large message in a minute.

How to call Biztalk orchestrations without using the messagebox

Is there any way to call a BizTalk orchestration without placing a message in the MessageBox? The point here is to use an orchestration that is stored and configured in BizTalk but avoid the performance loss of using a database to trigger it.
The message box is an integral part of BizTalk Server, and no transaction can occur without it. In most cases it works great, and having the message box provides lots of benefits for message delivery and processing. If you are having performance issues, I would recommend that you measure your solution's performance and identify bottlenecks. Some key points to look at:
Orchestration persistence points.
BizTalk host settings: reducing the polling intervals for messaging and orchestrations from 500 ms to 50 ms does help.
If the message box is a bottleneck (which is usually not the case until your volume is very high), scale out the message box by adding slave message boxes. In that case one message box serves as the master and the others as slaves to process the requests.
The answer to the question is NO, but, you are probably laboring under a false assumption.
There is no 'performance loss' due to the MessageBox. If you can prove the MessageBox causes you to miss an SLA, then you should be considering a completely different app platform such as a Windows Service. However, many of us have implemented very low latency apps with BizTalk without issue.
So, unless your SLA approaches the definition of 'real-time', I wouldn't worry about it.

Should I use BizTalk Orchestration

I am currently in the process of porting an existing application (BizTalk 2004) to a newer version of BizTalk. The current solution takes multiple types of EDI documents, modifies them if necessary and sends them to our legacy system to be loaded and processed.
This process is developed using a combination of Receive Ports, Pipeline component, Send Ports and Maps, Schema and Message Queue Components. This solution uses 10 send & receive ports to handle various aspects of the process such as Bursting EDI into individual messages, Transforming Messages, Error handling, EDI Validation and Batching of EDI Messages. All the modification of EDI is done using Message Queue Components.
This solution does NOT use orchestration at all. I am considering implementing the current solution as a BizTalk orchestration. I have read up a bit on orchestrations and worked through a few sample applications, but I am still very confused about the benefit of using an orchestration if a solution can be developed without one. I am sure I am missing something here. What additional benefit does an orchestration give that the current solution does not?
Edit: I should clarify the question. I can build this app without orchestration, using content-based routing and maps. My question is whether I am missing something by not using orchestration.
If you can perform your task at hand with message based routing, an orchestration is overkill.
Orchestrations will help you with calling rules, or handling transactions. The following points can help you decide whether to use orchestration or not:
Is the handling transactional?
Is the ordering of messages important?
Are you going to process the message using business rules?
Do you have to call external assemblies?
A quote from "Microsoft BizTalk Server Pattern":
Orchestrations come at a considerable cost. Many of these costs manifest themselves as roundtrips to the messagebox, which means crossing a process boundary and writing to and reading from a database (the messagebox).
An orchestration can potentially take twice as long for the same process. For example: A simple process of receiving a message and sending it will make 2 message hops with the messaging approach vs 4 with the orchestration.
Here are the steps for a messaging only solution
Receive the message via the adapter save it to the message box
Retrieve the message for the send port
vs:
Steps for Orchestration approach
Receive the message via the adapter and save it to the message box
Retrieve the message to start the orchestration
Do your mapping if you need to, then save the result back to the message box
Retrieve the item again for the send port.
Choose wisely
It sounds like you could re-implement the solution as a messaging-only solution and don't need an orchestration. If you can, that's great: we prefer messaging-only solutions as they are simpler to maintain and generally more efficient. Orchestrations are useful if you need a workflow of multiple actions, or special error handling that you can't easily do with a messaging-only solution.
