BizTalk orchestration slows down on QA machine but not on DEV

I have a BizTalk application which loops over an XML message and sends data to a SQL Server database. The orchestration runs fine on the DEV machine throughout the process and its speed is consistent. But if I process the same file on the QA machine, it starts at the same speed and then performance keeps degrading. There is no issue with the database objects, and the throttling settings are the same as on DEV. I have restarted the machine. Not sure why QA is reacting this way for this application.
What are the areas to be checked?

There are various factors which can cause this and affect your solution's overall performance:
Is QA a shared environment, i.e. are there other solutions on it which may cause the slowdown?
If the host your orchestration runs on is shared, that host might be throttling for various reasons, such as memory pressure. Use the performance counters to monitor the host throttling state (see the sketch after this list).
You may have too many persistence points in the orchestration, since you are looping and sending a message to the SQL database inside the loop. A Send shape causes a persistence point per send in the loop, which will degrade performance considerably.
Isolate the issue, i.e. whether the orchestration is running slowly or the send to SQL is taking the time.
Tracking is turned on but the DTA jobs are not running.
Message cleanup jobs are not running as expected in QA.
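For the host throttling check mentioned above, here is a minimal sketch that polls the standard throttling counters from the "BizTalk:Message Agent" category; the host instance name is a placeholder for your own:

using System;
using System.Diagnostics;
using System.Threading;

// Sketch only: polls the BizTalk host throttling counters.
// Replace the host instance name with the host your orchestration runs on.
class ThrottlingMonitor
{
    static void Main()
    {
        const string hostName = "BizTalkServerApplication";   // placeholder host instance name

        using (var delivery = new PerformanceCounter(
            "BizTalk:Message Agent", "Message delivery throttling state", hostName))
        using (var publishing = new PerformanceCounter(
            "BizTalk:Message Agent", "Message publishing throttling state", hostName))
        {
            while (true)
            {
                // 0 = not throttling; any other value identifies the throttling trigger
                Console.WriteLine("delivery={0} publishing={1}",
                    delivery.NextValue(), publishing.NextValue());
                Thread.Sleep(5000);
            }
        }
    }
}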

I wrote a blog about how to use SQL Server Profiler to capture the RPC call from BizTalk to SQL Server. You could isolate whether SQL is causing the issue that way; capture the RPC call on DEV or QA, and then try running just the stored procedure on QA. If it doesn't run as quickly as on DEV, that's your problem. If it does, look at your BizTalk artifacts.
Here's the blog: http://blog.tallan.com/2015/01/09/capturing-and-debugging-a-sql-stored-procedure-call-from-biztalk/

The BizTalk host throttled because the database size exceeded the configured throttling limit. Also, the SQL Server Agent was not running on the server, so the purge jobs did not run. This looks to have built up the database size over time until BizTalk throttled the application due to low resources.

Related

Some BizTalk Receive Locations are disabled after server reboot

We found that some BizTalk receive locations are disabled after a server reboot (BizTalk Server and SQL Server are installed on separate physical servers).
Any idea what could cause this scenario? Is it due to the boot sequence or some other issue?
I will assume that, once you enable the receive locations manually, they are working correctly.
Typically, when FILE receive locations fail while pointing to an external server/share, it is because the share is no longer available.
Make sure that, during the night, there are no network issues or planned/unplanned downtime of the share or of your SQL Server. A BizTalk receive location will retry accessing a share for quite a while before disabling itself. Check the event log(s) for more information; you would want to look for errors/warnings there indicating an issue with connectivity between BizTalk and SQL.
Another issue might be that there are too many connections between your BizTalk server and SQL server. You can provide a maximum number of connections in the FILE share properties.
Also, you could try this link: https://serverfault.com/questions/235032/intermittent-connection-to-windows-7-shared-folder-from-windows-xp-workstations
It describes a potential fix for optimizing throughput for file sharing, although it depends on your operating system.
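If it turns out you simply need the locations back up after a reboot, you can also re-enable them programmatically once the databases are reachable. Below is a minimal sketch using the ExplorerOM API (class and property names as I recall them from Microsoft.BizTalk.ExplorerOM; the management database connection string is a placeholder, and ExplorerOM only works from a 32-bit process):

using System;
using Microsoft.BizTalk.ExplorerOM;   // reference Microsoft.BizTalk.ExplorerOM.dll; 32-bit process only

class EnableReceiveLocations
{
    static void Main()
    {
        var catalog = new BtsCatalogExplorer
        {
            // Placeholder connection string to the BizTalk management database
            ConnectionString = "Server=.;Database=BizTalkMgmtDb;Integrated Security=SSPI;"
        };

        foreach (ReceivePort port in catalog.ReceivePorts)
        {
            foreach (ReceiveLocation location in port.ReceiveLocations)
            {
                if (!location.Enable)
                {
                    Console.WriteLine("Enabling " + location.Name);
                    location.Enable = true;
                }
            }
        }

        catalog.SaveChanges();
    }
}

Scheduling something like this to run a few minutes after startup is only a stop-gap while you track down the root cause.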

Calling SqlConnection.ClearAllPools() in Application_Start & Application_End?

We are trying to diagnose an issue that occurred in our production environment last week. Long story short, the database connection pool seemed to be full of active connections from our ASP.NET 3.5 app that would not clear, even after restarting the application pool and IIS.
The senior DBA said that because the network connections occur at the operating system level, recycling the app and IIS did not sever the actual network connections, so SQL Server left the database connections to continue running, and our app was still unable to reach the database.
In looking up ways to force a database connection pool to reset, I found the static method SqlConnection.ClearAllPools(), with documentation explaining what it does, but little to nothing explaining when to call it. It seems like calling it at the beginning of Application_Start and the end of Application_End in my global.asax.cs is a good safety measure to protect the app from poisoned connection pools, though it would of course incur a performance hit on startup/shutdown times.
Is what I've described a good practice? Is there a better one? The goal is to allow a simple app restart to reset an app's mangled connection pool without having to restart the OS or the SQL Server service, which would affect many other apps.
Any guidance is much appreciated.
When a process dies, all network connections are always, always, always closed immediately. That's at the TCP level. It has nothing to do with ADO.NET and goes for all applications. Kill the browser, and all downloads stop. Kill the FTP client and all connections are closed immediately.
Also, the connection pool is per process. So clearing it when starting the app is useless because the pool is empty. Clearing it at shutdown is not necessary because all connections will (gracefully) shut down any moment.
Probably, your app is not returning connections to the pool. You must dispose of all connections after use in all cases. If you fail to do that, dangling connections will accumulate for an indefinite amount of time.
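As a minimal sketch of that pattern (the table name is hypothetical), wrap both the connection and the command in using blocks so the connection goes back to the pool even when an exception is thrown:

using System.Data.SqlClient;

static class OrderRepository
{
    // Sketch: the using blocks guarantee Dispose() runs, which closes the connection
    // and returns it to the ADO.NET pool even if ExecuteScalar throws.
    public static int GetOrderCount(string connectionString)
    {
        using (var connection = new SqlConnection(connectionString))
        using (var command = new SqlCommand("SELECT COUNT(*) FROM dbo.Orders", connection))   // hypothetical table
        {
            connection.Open();
            return (int)command.ExecuteScalar();
        }
    }
}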
Clearing the pool does not free up dangling connections because those appear to be in use. How could ADO.NET tell that you'll never use them again? It can't.
Look at sys.dm_exec_connections to see who is holding connections open. You might increase the ADO.NET pool size as a stop-gap measure. SQL Server can take over 30k connections per instance. You'll normally never saturate that.
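If you want a quick way to run that check, here is a hedged sketch of the DMV query (it needs VIEW SERVER STATE permission and joins sys.dm_exec_connections to sys.dm_exec_sessions to group connections by host and program):

using System;
using System.Data.SqlClient;

static class ConnectionAudit
{
    public static void PrintConnectionOwners(string connectionString)
    {
        const string sql = @"
            SELECT s.host_name, s.program_name, COUNT(*) AS connection_count
            FROM sys.dm_exec_connections AS c
            JOIN sys.dm_exec_sessions AS s ON s.session_id = c.session_id
            GROUP BY s.host_name, s.program_name
            ORDER BY connection_count DESC;";

        using (var connection = new SqlConnection(connectionString))
        using (var command = new SqlCommand(sql, connection))
        {
            connection.Open();
            using (var reader = command.ExecuteReader())
            {
                while (reader.Read())
                    Console.WriteLine("{0} / {1}: {2}",
                        reader["host_name"], reader["program_name"], reader["connection_count"]);
            }
        }
    }
}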

Can low memory on IIS server cause SQL Timeouts (SQL Server on separate box)?

I have an IIS web server that hosts 400 web applications (distributed across 30 application pools). They are a mix of ASP.NET applications and WCF service endpoints. The server has 32GB of RAM and usually runs fast, although it's running at 95% memory usage. Worker processes each take between 500MB and 1.5GB of RAM.
I also have another box running SQL Server. That one has plenty of free memory.
Sometimes, the web server starts throwing SQL timeout exceptions. A few per minute at first, rapidly increasing to hundreds per minute, effectively taking the server down. This problem affects applications in all pools. Some requests still complete but most of them don't. While this happens, the CPU usage on the server is around 30% (which is the normal load on that box).
While this is happening, we can still use SQL Server Management Studio (from the IIS Server) to execute requests successfully (and fast).
The fix is to restart IIS. And then everything goes back to normal until the next time.
Because the server is running with very low memory, I feel like this is the cause. But I cannot explain the relationship between low memory and sudden bursts of SQL Timeout exceptions.
Any idea?
Memory pressure can trigger paging and garbage collection. Both introduce latency which would not be present otherwise.
GC'ing 32GB of data can take seconds. Why would all app processes GC at the same time? Because at about 95% memory utilization Windows sets a "low memory" event that the CLR listens to. It will try to release memory to help other processes.
If the applications get into a paging frenzy that would also explain huge delays in normal execution.
This is just guessing, though. You can try proving it by watching the Memory\Pages/sec counter (hard page faults resolved from disk) and the .NET CLR Memory\# Gen 2 Collections counter.
The fix would be to run with a bigger margin below the physical memory limit.
The first problem is to discover where the timeout is happening. Can you tell from the stack trace if the timeout is happening when executing a request against the database, or when connecting to the database? (Or even connecting to the web server?)
Timeouts executing database requests can have a variety of causes. The problem might be in the database with blocking processes, database maintenance (also locking), deadlocks, etc. When apps are running slowly, do you see a lot of entries in sys.dm_exec_requests, and if so, what are their wait_types?
Even if you can run SQL in the query window while the web server is timing out, that doesn't mean there isn't massive blocking or deadlocking going on.
If it is a timeout connecting to the database, then it is possible the ADO connection pools are overwhelmed and not getting cleaned up, or the database has a connection limit, and the web services are timing out waiting for a connection.
One of the best ways to find out what is going on is to capture a memory dump of the w3wp.exe process and analyze it. Even if you aren't adept at a debugger like WinDbg, Microsoft's DebugDiag tool can produce some nice reports with helpful information.
SqlCommand.CommandTimeout
This property is the cumulative time-out for all network reads during command execution or processing of the results. A time-out can still occur after the first row is returned, and does not include user processing time, only network read time.
It is a client-based timeout. If requests are getting queued due to memory constraints, that could cause a timeout.
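If the queries are legitimately long-running, you can raise that client-side limit explicitly; a minimal sketch (the stored procedure name is hypothetical):

using System.Data;
using System.Data.SqlClient;

static class ReportRunner
{
    public static void RunLongReport(string connectionString)
    {
        using (var connection = new SqlConnection(connectionString))
        using (var command = new SqlCommand("dbo.usp_GetReport", connection))   // hypothetical procedure
        {
            command.CommandType = CommandType.StoredProcedure;
            command.CommandTimeout = 120;   // seconds; the ADO.NET default is 30, 0 means wait indefinitely
            connection.Open();
            command.ExecuteNonQuery();
        }
    }
}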
Are you retrieving a lot of data from these queries?
If some queries return a lot of data, consider breaking them up and giving the user next and previous buttons.
Have you considered async calls like BeginExecuteReader?
The advantage is that the command timeout does not apply to asynchronous executions.
It does not block the calling thread.
// Requires "Asynchronous Processing=true" in the connection string on older .NET versions.
isExecutingFTSindexWordOnce = true;
sqlCmdFTSindexWordOnce.BeginExecuteNonQuery(callbackFTSindexWordOnce, sqlCmdFTSindexWordOnce);
// isExecutingFTSindexWordOnce is set back to false in the callback
Debug.WriteLine("Calling thread active");
But I agree with your comment about how to respond to the request, since the answer does not come back on the calling thread.
Sorry, I am used to WPF, where I just update a public property in the callback.

BizTalk Send Port OutOfMemory Exception

I have a BizTalk solution deployed at a customer site.
(BizTalk 2006 R2 with the non WCF adapters)
One of the orchestrations issues an RFC call to a SAP system and returns a big message.
The message is mapped to a SQL Adapter Send Port schema which is calling a stored procedure.
Recently I have been running into an OutOfMemory exception when the message is sent to the SQL adapter.
I did a trace with SQL Profiler, which resulted in a 90MB+ trace file for just a few seconds.
As I imagined, the stored procedure is executed rapidly, one call after another.
But even this does not explain why I get the error message. The server has 32GB of RAM and I can't even see it peaking...
Any suggestions what else I could check?
Have you looked at how much memory the host of the Send Port is configured to use? 32GB of memory is irrelevant if the host is not set to take advantage of it. That being said, I think the default is 25% of the available memory. So barring any change this may not be the problem.
You could also try playing with the throttling settings for the host. Although I have never had much luck with it myself.
One way to get more information is to hook up Perfmon and look at the memory around that specific host process.
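If leaving Perfmon running on the customer's server is awkward, a rough equivalent is a small console app polling the host processes; a sketch, assuming the standard BTSNTSvc host executable name:

using System;
using System.Diagnostics;
using System.Linq;
using System.Threading;

class HostMemoryWatch
{
    static void Main()
    {
        while (true)
        {
            // BTSNTSvc / BTSNTSvc64 are the standard BizTalk host process names
            foreach (var p in Process.GetProcesses()
                .Where(proc => proc.ProcessName.StartsWith("BTSNTSvc", StringComparison.OrdinalIgnoreCase)))
            {
                Console.WriteLine("{0} (PID {1}): {2:N0} MB private bytes",
                    p.ProcessName, p.Id, p.PrivateMemorySize64 / (1024 * 1024));
            }
            Thread.Sleep(5000);
        }
    }
}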

Slow BizTalk File Receive

I have an application with a file receive location. After the host instance has been running for a few hours the receive location fails to identify new files dropped into the folder that it is monitoring. It doesn't forget about them altogether, it's just that performance grinds to a crawl. The receive location is configured to poll the target folder every 60 seconds but after host instance has been running for an hour or so, then it seems that the target folder is being polled only every thirty minutes. If I restart the host instance then the files waiting in the target folder are collected right away and performance is fine for the next hour or so.
The same application runs fine in a different environment.
There are no obvious entries in the event log related to the problem.
All the BizTalk SQL jobs are running fine except for Backup BizTalk Server (BizTalkMgmtDb).
Any suggestions gratefully received.
Thanks
Rob
Here are some additional tools which may help you identify and diagnose BizTalk database issues.
BizTalk MsgBox Viewer
Here is a tool to repair identified errors:
Terminator
Use at your own risk... read the blogs and docs. Start with the MsgBox Viewer and let us know your results.
Without more details, the biggest tell is that your Backup job is failing. If the backup job is failing, it may not be properly configured. If it is properly configured and still failing, then you've got other issues. Can you give us some more information about your BizTalk install?
What version are you running?
What are your database sizes?
What are your purge and archive settings like?
Are there any long-running blocks in your SQL Server DB coming from BizTalk?
Another thing to consider is the user accounts the send, receive and orchestration hosts are running under. Please check the BizTalk Administration Console. If they are all running under the same account, the orchestrations can sometimes starve the send and receive processes of CPU time. I believe priority is given to orchestrations, then receive, then send. Even if you are just developing, it is useful to use separate accounts for this. This also improves security.
The Wrox BizTalk Server 2006 book will also supply tuning advice.
What other things are going on with the server? Is BizTalk pegged otherwise or is it idle?
You mention that the solution does not have any problems in another environment, so it's likely that there is a configuration problem.
Check the following:
** On SQL Server, set some upper memory limit for SQL Server. By default, SQL Server uses whatever it can get and then hangs onto it, so set a reasonable limit so that your system can operate without spending a lot of time paging memory onto and from your hard drive(s).
** Ensure that you have available disk space - maybe you are running low - this can lead to all kinds of strange problems.
** Try to split up the system's paging file among its physical drives (if you have more than one drive on the system). Also consider using a faster drive, or if you have lots of cash laying around, get a SAN.
** In BizTalk, is tracking enabled? If so, are you also tracking message bodies? Disable tracking or message body tracking and see if there is a difference.
** Start performance monitor and monitor the following counters when running your solution
Object: BizTalk Messaging
Instance: (select the receiving host) %%
Counter: Documents Received/Sec
Object: BizTalk Messaging
Instance: (select the transmitting host) %%
Counter: Documents Sent/Sec
Object: XLANG/s Orchestrations
Instance: (select the processing host) %%
Counter: Orchestrations Completed/Sec.
%% You may have only one host, so just use it. Since BizTalk configurations vary, I am using generic names for hosts.
The preceding counters monitor the most basic aspects of your server, but may help to narrow down places to look further. You can, of course, add CPU and Memory too. If you have time (days...maybe weeks) you could monitor for processes that allocate memory and never release it. Use the following counter...
Object: Memory
Counter: Pool Nonpaged Bytes
A slow, steady rise in this counter indicates that a process is not releasing memory, which affects everything on the system.
Let us know how things turn out!
I had the same problem: when my orchestration was idle for some time, it took a long time to process the first message. An article by EvYoung helped me solve this problem.
"This is caused by application domain unloading within the BizTalk host process. If an AppDomain is shutdown after idle, the next message that comes needs to wait for the Orchestration to compile again. Depending on the complexity of your design, this can be a noticeable wait. To prevent this in low latency requirement scenario, you can modify the BTSNTSVC.EXE.config file and set SecondsIdleBeforeShutdown property to -1. This will prevent AppDomain shutdown due to idle."
You can find the article in here:
http://blogs.msdn.com/b/biztalkcpr/archive/2008/05/08/thoughts-on-orchestration-performance.aspx
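For reference, that setting lives in the xlangs section of BTSNTSvc.exe.config. A sketch of what the section looks like, assuming the standard cross-process configuration handler; verify the exact shape against the article before editing a production config:

<configuration>
  <configSections>
    <section name="xlangs"
             type="Microsoft.XLANGs.BizTalk.CrossProcess.XmlSerializationConfigurationSectionHandler, Microsoft.XLANGs.BizTalk.CrossProcess" />
  </configSections>
  <xlangs>
    <Configuration>
      <AppDomains>
        <!-- -1 = never unload the AppDomain because it has been idle -->
        <DefaultSpec SecondsIdleBeforeShutdown="-1" SecondsEmptyBeforeShutdown="-1" />
      </AppDomains>
    </Configuration>
  </xlangs>
</configuration>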
It took me too long to respond, but I thought it might help someone. Cheers :)
Some good suggestions from others. I will add:
Do you have any custom receive pipeline components on the receive location? If so, perhaps one is leaking memory, or is calling some external component, e.g. a database, that is taking a long time?
How big are the files you are receiving?
On the File transport properties of your receive location, turn "file renaming" on; do the files get renamed within 60s?
