We are using BizTalk (2013 R2 CU 6) EDI functionality to batch EDI files. This uses the Microsoft.BizTalk.Edi.BatchingOrchestration.BatchingService Orchestration, which is always running in a waiting state (for lack of a better term), dehydrated most of the time. While running, the orchestration builds up instances of BTXTimerMessages in the Queued (awaiting processing) state. As far as I can tell, these messages are never removed or processed. This eventually causes us to pass the 50k message threshold and start throttling.
As far as I can tell there is no way to set up a recurring schedule for the batcher; it must always run or be manually started. If we leave the batcher off, we get routing errors.
Currently the only way we have to eliminate these messages is to terminate the EDI batcher for each party, then restart it.
Is there a better way to purge these messages from the system, or stop them from being generated altogether?
Related
I have built an orchestration with a loop to retrieve paged data from a REST web service. Using page size and offset, I am able to call the service for the "next page" of data. Then I debatch it, map it to an internal format and process it further. When one page is processed, I request the next page from the REST web service.
As it turns out, the host running the orchestration and send ports causes the memory to constantly grow during processing of all the data, and eventually hit the throttling mode.
Why is memory not released when I am done with one page loop? Is it the "consumed" messages that are stored in the orchestration that builds up the memory? Is it possible to clear orchestration from these "consumed" messages, to release the memory used?
(No message tracking active on the orchestration, or send ports.)
Apparently, there is no way to prohibit BizTalk Orchestrations from building up a list of messages in Orchestration, including used/processed/consumed messages. Putting things in Scope does not prohibit this behaviour.
Hence, for long-running Orchestrations there can be a lot of messages building up. Especially for singleton Orchestrations, where the general solution proposed to deal with this problem is to make sure Orchestration shuts down once in a while (when idle, e.g.).
My solution was to split the Orchestration into two, and have the initial Orchestration start the second Orchestration with the Start Orchestration shape. The second Orchestration in turn starts itself recursively, and so on, until the last page is received and the last Orchestration ends.
Yes, what you need to do is to have scopes, and make sure the messages are initialised in the scope (green highlight below) rather than at the top level (yellow). That means they will also be disposed of at the end of the scope. Note: that means those messages can't be used outside of the scope.
However, if you are just re-using the same messages in the loop, then I wouldn't expect it to increase memory usage. So there is possibly something else going on. I suspect that you are adding each page to a message, and that is what is growing.
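As an analogy only (plain Python, not BizTalk or XLANG/s code), the sketch below pages through a data source by offset and page size while keeping each page local to a single loop iteration. Memory stays flat as long as pages are not appended to a longer-lived accumulator. `fetch_page` and its `total` parameter are hypothetical stand-ins for the REST call.

```python
from typing import Callable, Iterator, List

Page = List[int]


def fetch_page(offset: int, page_size: int, total: int = 25) -> Page:
    """Hypothetical stand-in for the REST call: returns up to page_size items."""
    return list(range(offset, min(offset + page_size, total)))


def iter_pages(fetch: Callable[[int, int], Page], page_size: int) -> Iterator[Page]:
    """Yield one page at a time; an empty page marks the end of the data."""
    offset = 0
    while True:
        page = fetch(offset, page_size)
        if not page:
            return
        yield page
        offset += len(page)


def process_all(page_size: int = 10) -> int:
    """Process each page and keep only a running count, never the pages."""
    processed = 0
    for page in iter_pages(fetch_page, page_size):
        # Debatching, mapping and sending would happen here. The page
        # becomes unreachable as soon as the next iteration rebinds it.
        processed += len(page)
    return processed
```

If the loop instead appended every page to a list that outlives the loop, memory would grow with the number of pages, which is the kind of accumulation suspected above.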
I am currently diagnosing some recurring issues within a BizTalk environment, and at the moment the issue is zombie messages. I am aware of the conditions that create these errors. Whilst diagnosing the orchestration with the Orchestration Debugger, I see that when a message has hit a terminate shape, it is followed by an initialisation.
The general structure of the orchestration is as follows:
The first scope is a long-running transaction and, within the loop after that scope, there is a listen shape that waits for a message for 10 seconds. If a message comes in time, it enters another long-running transaction. It's like a singleton in a way? Both scopes share the same logical receive port and are correlated. The only odd part is how the first scope is repeated within the loop that's inside the listen shape. (The orchestration is part of a behemoth of an application that wasn't written by myself.)
Would this initialisation after a termination (what actually causes this to happen?) cause zombies? If so, is the structure of the orchestration and the transactions a cause of this? Or am I looking in the wrong place?
Let me know if there's any extra information that can help!
In the Orchestration Debugger, it will show when something starts and also when it ends, with slightly different icons. So what you are seeing is the end of the Orchestration.
No, that will not cause zombies. Zombies occur when the Orchestration has passed the logical receive that listens for a message (and is tearing down the instance subscription) and another message matching that subscription arrives before the Orchestration has fully ended.
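The timing window described here can be modelled with a small sketch. This is a simplified illustration in Python, not BizTalk code: the `InstanceTimeline` type, its field names and the result labels are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstanceTimeline:
    """Hypothetical timestamps (in seconds) for one orchestration instance."""
    last_receive_done_at: float  # instance has moved past its final receive/listen
    instance_ended_at: float     # instance has fully ended, subscription is gone


def classify_arrival(arrival: float, timeline: InstanceTimeline) -> str:
    """Classify a correlated message by when it arrives."""
    if arrival < timeline.last_receive_done_at:
        return "consumed"  # the instance is still listening
    if arrival < timeline.instance_ended_at:
        return "zombie"    # routed to the instance, but never received by it
    return "no_matching_instance"  # the subscription no longer exists
```

For example, with `InstanceTimeline(10.0, 10.5)` a message arriving at 10.2 falls into the zombie window, while one arriving at 9.0 is consumed normally.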
I have an orchestration following the parallel convoy pattern that receives and waits for a bunch of messages before kicking off the flow. We are moving to a high availability environment with multiple active-active BizTalk servers. Would this be a problem? Is the MessageBox smart enough to figure out which host instance it should push the message to?
It's no problem at all. Don't even think about it.
An Orchestration can run on any available Host Computer, but a specific instance of an Orchestration will only run on one at a time.
It doesn't matter which one.
Scenario:
A biztalk application is deployed with a receive port, orchestration and send port. Messages flow correctly.
At some point, a bug is found in the orchestration, causing messages to suspend. The orchestration must be fixed and redeployed.
Question:
Because you can't redeploy an orchestration with suspended instances, how would you go about retaining those messages, terminating the instances, redeploying and then resending those messages through the fixed orchestration? Is there a process or tool for this?
If the bug doesn't require major modification (i.e. no new orchestrations, no new schemas, no new promoted fields etc.), then a short-term 'hack' is possible: simply reinstall the fixed MSIs (and GAC the assemblies) on your servers, and restart the host instances (using NLB if applicable), i.e. without importing the MSIs into BizTalk.
You should then be able to resume any suspended (resumable) orchs. Then schedule some downtime at a less busy time, put your app into partially stopped to prevent new orchs starting, wait for all running orchs to complete, and then import the fixed MSI (consider bumping up the buggy orch assembly version with the hotfix)
Building a custom tool with the ability to audit all messages going in and out of Biztalk is useful, so you can replay them. This will allow you to terminate orchs, reinstall, and then replay.
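A minimal sketch of such an audit-and-replay store, written in Python purely to show the idea. The `MessageArchive` class, the port names and the on-disk JSON layout are all hypothetical, and a real tool would hook into BizTalk ports or pipelines rather than being called directly.

```python
import json
import time
import uuid
from pathlib import Path
from typing import Callable, Dict, List


class MessageArchive:
    """Stores every message body on disk so it can be replayed later."""

    def __init__(self, root: Path) -> None:
        self.root = root
        self.root.mkdir(parents=True, exist_ok=True)

    def record(self, port: str, body: str) -> str:
        """Persist one message with the port it passed through."""
        message_id = uuid.uuid4().hex
        entry = {"id": message_id, "port": port,
                 "recorded_at": time.time(), "body": body}
        path = self.root / f"{message_id}.json"
        path.write_text(json.dumps(entry), encoding="utf-8")
        return message_id

    def entries(self, port: str) -> List[Dict]:
        """Return the archived entries for one port, oldest first."""
        found = []
        for path in self.root.glob("*.json"):
            entry = json.loads(path.read_text(encoding="utf-8"))
            if entry["port"] == port:
                found.append(entry)
        return sorted(found, key=lambda e: e["recorded_at"])

    def replay(self, port: str, submit: Callable[[str], None]) -> int:
        """Hand each archived body for a port to submit(); return the count."""
        count = 0
        for entry in self.entries(port):
            submit(entry["body"])
            count += 1
        return count
```

In this sketch `submit` is whatever puts a message back into the system, for example a function that writes the body to a file receive location's folder.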
You can also fix the orchestration and, while building it, increment the version of the assembly. This way you can deploy both versions of the orchestration side by side. You can unenlist the existing one after deploying the newer version.
I have an application with a file receive location. After the host instance has been running for a few hours the receive location fails to identify new files dropped into the folder that it is monitoring. It doesn't forget about them altogether, it's just that performance grinds to a crawl. The receive location is configured to poll the target folder every 60 seconds but after host instance has been running for an hour or so, then it seems that the target folder is being polled only every thirty minutes. If I restart the host instance then the files waiting in the target folder are collected right away and performance is fine for the next hour or so.
The same application runs fine in a different environment.
There are no obvious entries in the event log related to the problem.
All the BizTalk SQL jobs are running fine except for Backup BizTalk Server (BizTalkMgmtDb).
Any suggestions gratefully received.
Thanks
Rob
Here are some additional tools which may help you identify and diagnose BizTalk database issues.
BizTalk MsgBox Viewer
Here is a tool to repair identified errors:
Terminator
Use at your own risk... read the blogs and docs. Start with the MsgBox Viewer and let us know your results.
Without more details, the biggest tell is that your Backup Job is failing. If the backup job is failing, it may not be properly configured. If it is properly configured and still failing, then you've got other issues. Can you give us some more information about your BizTalk install.
What version are you running?
What are your database sizes?
What are your purge and archive settings like?
Are there any long-running blocks in your SQL Server DB coming from BizTalk?
Another thing to consider is the user accounts the send, receive and orchestration hosts are running under. Please check the BizTalk Administration Console. If they are all running under the same account, sometimes the orchestrations can starve the send and receive processes of CPU time. I believe priority is given to orchestrations, then receive, then send. Even if you are just developing, it is useful to use separate accounts for this. This also improves security.
The Wrox BizTalk Server 2006 book also supplies tuning advice.
What other things are going on with the server? Is BizTalk pegged otherwise or is it idle?
You mention that the solution does not have any problems in another environment, so it's likely that there is a configuration problem.
Check the following:
** On SQL Server, set some upper memory limit for SQL Server. By default, SQL Server uses whatever it can get and then hangs onto it, so set a reasonable limit so that your system can operate without spending a lot of time paging memory onto and from your hard drive(s).
** Ensure that you have available disk space - maybe you are running low - this can lead to all kinds of strange problems.
** Try to split up the system's paging file among its physical drives (if you have more than one drive on the system). Also consider using a faster drive, or if you have lots of cash lying around, get a SAN.
** In BizTalk, is tracking enabled? If so, are you also tracking message bodies? Disable tracking or message body tracking and see if there is a difference.
** Start performance monitor and monitor the following counters when running your solution
Object: BizTalk Messaging
Instance: (select the receiving host) %%
Counter: Documents Received/Sec
Object: BizTalk Messaging
Instance: (select the transmitting host) %%
Counter: Documents Sent/Sec
Object: XLANG/s Orchestrations
Instance: (select the processing host) %%
Counter: Orchestrations Completed/Sec.
%% You may have only one host, so just use it. Since BizTalk configurations vary, I am using generic names for hosts.
The preceding counters monitor the most basic aspects of your server, but may help to narrow down places to look further. You can, of course, add CPU and Memory too. If you have time (days...maybe weeks) you could monitor for processes that allocate memory and never release it. Use the following counter...
Object: Memory
Counter: Pool Nonpaged Bytes
A slow, steady rise of this counter indicates that a process is not releasing memory, which affects everything on the system.
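If the counter readings are exported from Performance Monitor, a slow drift can be easier to spot as a number than by eye. The sketch below is a plain least-squares slope over evenly spaced samples; the function name is an assumption for illustration, and nothing here reads counters directly.

```python
from typing import Sequence


def slope_per_sample(samples: Sequence[float]) -> float:
    """Least-squares slope of evenly spaced readings, in units per sample."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    # Covariance of index and value, divided by the variance of the index.
    numerator = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(samples))
    denominator = sum((i - mean_x) ** 2 for i in range(n))
    return numerator / denominator
```

A slope that stays clearly away from zero over days of samples is the kind of slow drift worth investigating; a slope near zero suggests the counter is stable.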
Let us know how things turn out!
I had the same problem: when my orchestration had been idle for some time, it took a long time to process the first message. An article by EvYoung helped me solve this problem.
"This is caused by application domain unloading within the BizTalk host process. If an AppDomain is shutdown after idle, the next message that comes needs to wait for the Orchestration to compile again. Depending on the complexity of your design, this can be a noticeable wait. To prevent this in low latency requirement scenario, you can modify the BTSNTSVC.EXE.config file and set SecondsIdleBeforeShutdown property to -1. This will prevent AppDomain shutdown due to idle."
You can find the article here:
http://blogs.msdn.com/b/biztalkcpr/archive/2008/05/08/thoughts-on-orchestration-performance.aspx
It took me too long to respond, but I thought I might help someone. Cheers :)
Some good suggestions from others. I will add:
Do you have any custom receive pipeline components on the receive location? If so, perhaps one is leaking memory, or calling some external component (e.g. a database) that is taking a long time?
How big are the files you are receiving?
On the File transport properties of your receive location, set "file renaming" on. Do the files get renamed within 60s?
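To check from outside BizTalk whether files are sitting in the folder longer than the polling interval, a small script can list files older than a given age. This is a hedged sketch in Python; the function name is an assumption, and the folder path you pass in would be your receive location's folder.

```python
import os
import time
from typing import List, Optional


def files_older_than(folder: str, max_age_seconds: float,
                     now: Optional[float] = None) -> List[str]:
    """Return names of files whose last-modified time exceeds the given age."""
    now = time.time() if now is None else now
    stale = []
    with os.scandir(folder) as entries:
        for entry in entries:
            # Compare each file's modification time against the allowed age.
            if entry.is_file() and now - entry.stat().st_mtime > max_age_seconds:
                stale.append(entry.name)
    return sorted(stale)
```

Running this every minute with `max_age_seconds=60` while the host instance degrades would show whether files are waiting well beyond one polling interval.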