Biztalk Message restore - biztalk

Requirement: Updating BizTalk application to a new version
Problem: The MSI import does not go through if there are running/suspended instances. Termination would result in loss of messages
What did I try:
I had about 100+ messages in messagebox some active, some with suspended resumable status.
I took the back up of BizTalkMsgBoxDb, I then terminated all instances from BTAdmin console and then restored the BizTalkMsgBoxDb.
I was expecting the messages to be back in BizTalkMsgBoxDb but when I queried from BiztalkAdmin console I don't find any of the message back.
Did I miss anything?

if your changes do not contain any changes to ports etc. try and replace the assemblies in GAC and then restart your host instances.

Doing a backup of just one of the BizTalk databases and restoring it is a very dangerous practice and I would strongly advise against it as it can cause some very nasty side effects.
The normal process of a deployment would be to switch of the receive locations and allow any running processes to finish and to resume or terminate any messages/orchestrations as appropriate.
Once there were no longer any suspended and running processes/messages would you unenlist all the Orchestrations and do the deployment.
If there are some long running processes that cannot be completed or terminated inside the deployment window then you would have to look at doing a side-by-side deployment. That involves changing the version number of all the DLLs, deploying this and then switching of the receive locations of the old version and switching on the new one.
When the old version has finished you stop that and un-deploy it.

Related

How to poll an unix server reboot in an automated way?

I am performing an install on a server, which requires the server to reboot to complete the installation. My query is how to find out when the reboot completes so that I can run a basic smoke test on the server and confirm the deployment status.
This is a non-trivial task in general, and the solution will be depending on the operating system of the server that's restarted.
In general, it will be best to let the server decide itself on when the startup process has completed, and have it send some notification to interested entities.
If you can't do that, you can check for services that are typically available after reboot (like, HTTP port reachable, exported file shares mountable,...). Adding some additional time on top of that may be useful.
On the other, more unreliable end of possible solutions, you can simply wait for a certain amount of time that's typically required for completing a reboot.

Managing suspended/dehydrated instances when redeploying orchestrations

Scenario:
A biztalk application is deployed with a receive port, orchestration and send port. Messages flow correctly.
At some point, a bug is found in the orchestration, causing messages to suspend. The orchestration must be fixed and redeployed.
Question:
Because you can't redploy an orchestration with suspended instances, how would you go about retaining those messages, terminating the instances, redeploying and then resending those messages through the fixed orchestration? Is there a process or tool for this?
If the bug doesn`t require major modification - i.e no new orchestrations, no new schemas, no new promoted fields etc, then a short term 'hack' is possible, viz by simply reinstalling the fixed MSIs (and GAC) on your servers, and restarting the host instances (using NLB if applicable) (i.e. without importing the MSI's into BizTalk).
You should then be able to resume any suspended (resumable) orchs. Then schedule some downtime at a less busy time, put your app into partially stopped to prevent new orchs starting, wait for all running orchs to complete, and then import the fixed MSI (consider bumping up the buggy orch assembly version with the hotfix)
Building a custom tool with the ability to audit all messages going in and out of Biztalk is useful, so you can replay them. This will allow you to terminate orchs, reinstall, and then replay.
You can as well fix the orchstration and while building it, increment the version of the assembly. This way you can have parallel deployment of the orchestrations. You can unenlist the existing one after deploying the newer versioned orchstration.

impact of disabling recycling of worker process in IIS application pool

I have a WCF service hosted in IIS which takes a long time (around 5 hours) to execute. the WCF service basically generates some reports using SSRS (SQL server reporting services) and saves them to a locaton on the server. this service was actually stopping after generating few reports, so I disabled the "recycling of worker process", "shut down worker processes after being idle" and "limit kernel request queue" in application pool and that fixed the issue and all the reports were generated regardless of the amount taken to generate them. but I am not sure if this is the right fix for this and I would like to know what is the impact of unchecking these settings in application pool for the WCF service in IIS? and is there any better way to get around this problem?
For any long running process it is much better to do it outside of IIS.
In this case I would have a regular windows service running that monitored a request queue. When a request comes in to generate a report, it would then spin off a thread to perform the generation.
The web service would be responsible for 3 things. First, adding an item to the queue to be handled. Second, checking status on the queue as to whether the report is ready. Third, sending the completed report back to the calling client.
This would allow the client to essentially do a fire and forget on the report request and call back later to check on it's status. Further it would mean that if IIS recycled for whatever reason you are still OK.
For bonus points I would add some error handling code that when the windows service restarted it could restart report jobs that were in the middle of execution. This would make it a bit more robust and allow you to reboot the server at any point.
I have disable also all the automatic shut down process from iis for my application with out any issue. I have monitor the memory limits and of course the program work smoothly with out any issues on memory.
I think that this automatic shut down triggers are designed mostly for process that keep too many web sites together and possible some of them have not good programming. But if you are the master of your iis, and you have check your program that have not memory issue, then is better to not shut it down, or at least control the shut down process with some way.
Ok is better to make long running process outside IIS, but is not so simple to develop it, not so simple to install it, not so simple to check it out.

MSMQ Inconsistent State After Restart

I'm seeing a really strange error that I'm having a difficult time
tracking down. I think its related to my configuration of Rhino ESB, though I'm not sure
if RSB is actually causing it, so I figured I'd ask and see if
anyone else has come across this in any other usages of MSMQ.
I'm using RSB as a client in a web app (ASP.NET, the client runs in the background). The client talks to a windows service via the MSMQ binding for RSB. Restarting the service never appears to have an effect on MSMQ, neither does restarting IIS by hand. However, whenever I actually restart the computer itself, MSMQ always refuses to start back up, claiming that a "queue is in an inconsistent state". Attempting to start MSMQ manually results in the same error, effectively rendering the MSMQ install completely useless. The only way to solve it is to actually remove then reinstall MSMQ.
The only information I've found via the almighty Google are references to a problem in MSMQ 2.0 (this problem is occurring in MSMQ 4.0). I've verified that Dispose is being called on on the bus at shutdown, in both the service and the web site.
Does anyone have any idea why this could be occurring? Thanks!
I faced the same issue on Window 2008 Server (Virtual Machine). Although the environment was not related to rhino tools.
The error in the event log:
"The Message Queuing service cannot start because a queue is in an inconsistent state. For more information, see Microsoft Knowledge Base article 827493 at support.microsoft.com."
As Roy pointed out, this is happening every 2-3 days. Every time we would follow the steps below to recover - instead re-installing the MSMQ.
1) Stop all applications and services that uses MSMQ.
2) Kill the mqsvc.exe from the Task Manager
3) Go to C:\Windows\System32\msmq\storage and delete any .mq files
4) Start the MSMQ Service
4) Start your application
In my scenario I've been able to fix "queue is in an inconsistent state" error after MSMQ service restart.
Turns out the computer name was too long, so changing computer name to a name with less than 15 characters fixed the issue.
My team is experiencing a similar issue, with MSMQ getting called by NSB 2.5. The issue came up recently after Infrastructure moved our VM to another physical server and for some reason lowered available RAM. We think the issue may be memory-related.
EDIT
After a week of no more issues with this, I can confidently say that raising RAM on the server solved our MSMQ's "Inconsistent state" issue. Mind you, we did have to re-install MSMQ first -- but the issue never came back, and before the RAM update the issue popped up every 2 days.
Regularly on Windows 2008RC2, MSMQ cannot start after reboot.
The two regular issues for me are:
"The Message Queuing service cannot start because a queue is in an inconsistent state"
and
"The dependency service does not exist or has been marked for deletion"
Sometimes, the following has helped (although we are seeking a more solid answer)
rename msmq folder to msmq_old
net stop wuauserv
net stop bits
Delete “%windir%\softwaredistribution” directory
Reboot
This has occurred 5 times this year, and each time, a variety of the above with plenty of reboots.
Sometimes, we revert to Remove Feature / Add Feature, however you may get yourself in a loop. As it boots up, a rollback occurs in the windows update service, so the Feature is never uninstalled, and the problem is never repaired.
Following the steps above can help with that.

Slow BizTalk File Receive

I have an application with a file receive location. After the host instance has been running for a few hours the receive location fails to identify new files dropped into the folder that it is monitoring. It doesn't forget about them altogether, it's just that performance grinds to a crawl. The receive location is configured to poll the target folder every 60 seconds but after host instance has been running for an hour or so, then it seems that the target folder is being polled only every thirty minutes. If I restart the host instance then the files waiting in the target folder are collected right away and performance is fine for the next hour or so.
The same application runs fine in a different environment.
There are now obvious entries in the event log related to the problem.
All the BizTalk SQL jobs are running fine except for Backup BizTalk Server (BizTalkMgmtDb).
Any suggestions gratefully received.
Thanks
Rob
Here are some additional tools which may help you identify and diagnose BizTalk database issues.
BizTalk MsgBox Viewer
Here is a tool to repair identified errors:
Terminator
Use at your own risk... read the glogs and docs. Start with the message box viewer and let us know our results.
Without more details, the biggest tell is that your Backup Job is failing. If the backup job is failing, it may not be properly configured. If it is properly configured and still failing, then you've got other issues. Can you give us some more information about your BizTalk install.
What version are you running?
What are our database sizes?
What are your purge and archive settings like?
Is there any long running blocks in your SQL Server DB coming from BizTalk?
Another thing to consider is the user accounts the send, receive and orchestration hosts are running under. Please check the BizTalk Administration Console. If they are all running the same account, sometimes the orchestrations can starve the send and receive processes of CPU time. I believe priority is given to orchestrations then receive, then send. Even if you are just developing, it is useful to use separate accounts for this. This also improves security.
The Wrox BizTalk Server 2006 will also supply tuning advice.
What other things are going on with the server? Is BizTalk pegged otherwise or is it idle?
You mention that the solution does not have any problems in another environment, so it's likely that there is a configuration problem.
Check the following:
** On SQL Server, set some upper memory limit for SQL Server. By default, SQL Server uses whatever it can get and then hangs onto it, so set a reasonable limit so that your system can operate without spending a lot of time paging memory onto and from your hard drive(s).
** Ensure that you have available disk space - maybe you are running low - this can lead to all kinds of strange problems.
** Try to split up the system's paging file among its physical drives (if you have more than one drive on the system). Also consider using a faster drive, or if you have lots of cash laying around, get a SAN.
** In BizTalk, is tracking enabled? If so, are you also tracking message bodies? Disable tacking or message body tracking and see if there is a difference.
** Start performance monitor and monitor the following counters when running your solution
Object: BizTalk Messaging
Instance: (select the receiving host) %%
Counter: Documents Received/Sec
Object: BizTalk Messaging
Instance: (select the transmitting host) %%
Counter: Documents Sent/Sec
Object: XLANG/s Orchestrations
Instance: (select the processing host) %%
Counter: Orchestrations Completed/Sec.
%% You may have only one host, so just use it. Since BizTalk configurations vary, I am using generic names for hosts.
The preceding counters monitor the most basic aspects of your server, but may help to narrow down places to look further. You can, of course, add CPU and Memory too. If you have time (days...maybe weeks) you could monitor for processes that allocate memory and never release it. Use the following counter...
Object: Memory
Counter: Pool Nonpaged Bytes
Slow decline of this counter indicates that a process is not releasing memory, which affects everything on the system.
Let us know how things turn out!
I had the same problem with, when my orchestration was idle for some time it took a long time to process the first msg. A article of EvYoung helped me solve this problem.
"This is caused by application domain unloading within the BizTalk host process. If an AppDomain is shutdown after idle, the next message that comes needs to wait for the Orchestration to compile again. Depending on the complexity of your design, this can be a noticeable wait. To prevent this in low latency requirement scenario, you can modify the BTSNTSVC.EXE.config file and set SecondsIdleBeforeShutdown property to -1. This will prevent AppDomain shutdown due to idle."
You can find the article in here:
http://blogs.msdn.com/b/biztalkcpr/archive/2008/05/08/thoughts-on-orchestration-performance.aspx
It took me to long to respond but i thought i might help someone. cheers :)
Some good suggestions from others. I will add :
Do you have any custom receive pipeline components on the receive location ? If so perhaps one is leaking memory, calling some external component eg database which is taking a long time ?
How big are the files you are receiving ?
On the File transport properties of your receive location, set "file renaming" on, do the files get renamed within 60s.

Resources