Unexpected data found error during BizTalk Simultaneous Receive

I have a receive port with two FILE receive locations polling the same network share. The only difference between the receive locations is that they use a different file mask. They both use a custom pipeline with a single Flat File Disassembler component. I have a send port subscribing to the receive port. (This is just the minimal setup where I can reproduce the problem.)
When processing a group of files (up to 1 MB in size), the pipeline occasionally throws an error. This only occurs when more than one file is copied to the receive location file share at once, and it occurs irregularly. The error generally reads:
An error occurred when parsing the incoming document: "Unexpected data
found while looking for: '\r\n' The current definition being parsed is
GIRMFile. The stream offset where the error occured is 491540. The
line number where the error occured is 2446. The column where the
error occured is 199.".
Examining the suspended message at the line number shown, there is consistently a 512-byte block of data that differs from the incoming message. This 512-byte block always matches data from one of the other input files consumed at the same time! Or, in a few rare cases, the incorrect 512 bytes come from a file consumed at the same time but after it had been processed by the pipeline (i.e. the suspended flat file contains a 512-byte chunk of XML!). The 512 bytes are never in a consistent location within the suspended messages.
Thinking the BizTalk databases were corrupted in some way, I deleted them and re-configured. The problem returned after a few hundred files were processed successfully.
This only occurs on our test box (a VMWare vm) so I suspect the machine is the problem in some way. But it seems odd that the machine isn't reporting other errors in other processes.

Interesting - I recall seeing similar things in BizTalk 2004 but haven't seen anything like that with BT2006.
It sounds like the pipeline may be running into threading issues - perhaps due to receiving the files from the same file location.
Have you tried any of the advanced file receive location properties?
I'm thinking in particular of the 'Rename files while reading' checkbox. Perhaps if the issue is with non-threadsafe stream reads, this process of creating a renamed file (which I think just uses standard IO libraries) will allow BizTalk to get a clean stream.
Only guessing though - please report back if you find a solution!

This only occurs on our test box (a VMWare vm)
If you do not succeed in reproducing this on another machine with the same configuration, I'd mark this off as a non-issue, or external. I agree with the aforementioned that concurrency problems are highly unlikely.

I have to say I find this very strange; I would find it very hard to believe that 5 years into BizTalk (counting from 2004 :-)), the FILE adapter and the standard disassemblers have threading issues.
Are the files coming into the pickup location over the network? What file masks are you using? Is there a chance that one of the receive locations is picking up the files before their transfer is complete?

You said the receive location is a network share - perhaps it's a network problem? Can you reproduce this on a local drive?

A few more thoughts... is the share a DFS share? Can you put the receive locations on different hosts and see what happens then?

We have similar issues with programs running on VMWare VMs accessing shares. For some reason files will appear to be corrupt.
This was not BizTalk related, it was happening with an in house developed application.
Rebooting the VM fixes our issue for a while. In our case we were able to reconfigure our process to not use shares. We never did pursue finding a solution to the real problem.

Related

BizTalk Server: maximum number of receive locations per host

I have more than 900 receive locations associated with the same host.
All receive locations are enabled but sometimes some of them are not working (and are still enabled).
When I disable and re-enable one, that receive location works, but another one then runs into trouble.
Are there any known limitations of the number of receive locations that can be associated with the same host in BizTalk 2016?
I don't know if there is a hard limit, but if you associate all the receive locations with the same host, your problems are probably due to the throttling mechanism.
While there are no hard limits to Receive Locations or Send Ports, there are still practical limits based on available resources.
900 is a lot for a single Host. Even if everything was running perfectly, I would still break that up across ~3 Hosts.
If these are File Receive Locations, there are other techniques to reduce the amount even more. Some options:
1. Use a Windows Scheduler task to move files from various locations to fewer, or maybe one, location. If 'source' information is necessary, you can add a tag to the file name which can be extracted in a custom Pipeline Component (see the sketch after this list).
2. Modify the sample File Adapter in the SDK to scan sub-folders as well. You can combine this with option 1 if you cannot modify the filename for some reason.
3. Similar to option 1, the script can write a meta-data file before moving the file with any data you need to preserve. The meta-data can then be read in a Pipeline Component.
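As a rough sketch of the tagging idea in option 1 (the class name, tag format, and property namespace are all assumptions here, and a production component would also implement IBaseComponent, IComponentUI and IPersistPropertyBag), a receive pipeline component can read the original file name from the message context and promote the prefix as a routing property:

    using System.IO;
    using Microsoft.BizTalk.Component.Interop;
    using Microsoft.BizTalk.Message.Interop;

    // Hypothetical sketch: assumes the mover script prefixes each file
    // name with "<SOURCE>_", e.g. "WAREHOUSE_invoice42.txt".
    public class SourceTagExtractor : IComponent
    {
        // The FILE adapter writes the original path to this context property.
        private const string FilePropsNs =
            "http://schemas.microsoft.com/BizTalk/2003/file-properties";

        // Assumed namespace; promoting (rather than just writing) the value
        // requires a matching deployed property schema.
        private const string SourceTagNs = "https://example.org/schemas/source-tag";

        public IBaseMessage Execute(IPipelineContext pContext, IBaseMessage pInMsg)
        {
            var receivedFileName =
                pInMsg.Context.Read("ReceivedFileName", FilePropsNs) as string;

            if (!string.IsNullOrEmpty(receivedFileName))
            {
                string name = Path.GetFileName(receivedFileName);
                int sep = name.IndexOf('_');
                if (sep > 0)
                {
                    // Make the tag available to subscriptions/routing.
                    pInMsg.Context.Promote("SourceTag", SourceTagNs,
                                           name.Substring(0, sep));
                }
            }
            return pInMsg;
        }
    }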

EDI Receive Pipeline performance issue

I have a file receive location with the EdiReceive pipeline configured to receive incoming HIPAA 5010 837 files.
The normal incoming file size is 4 to 6 megabytes and contains 3K to 5K records. The 837 schema deployed is the "multiple" version, which has subdocument_break="yes", so each file processed will generate 3K to 5K messages.
The pipeline works fine and can split the file into multiple messages as expected. For a single file, BizTalk takes less than 5 minutes to process.
The problem is when more than 10 files are put into the incoming folder at the same time: BizTalk will start processing these files in parallel, but it takes hours to process them and the BizTalk host consumes more than 10 GB of memory.
Some other info:
The BizTalk host is a dedicated 64bit receive host
No file lock by other applications found
Batching setting in file adapter is Num of Msgs in a batch = 1; Max batch size = 10240000
Rename file while reading is checked.
My question is: Is this performance normal? how can I improve it?
You are correct: 5K messages is not really the issue; it's 5 batches of 5K messages arriving at the same time that's causing the problem.
To serialize the debatching, you can use an Ordered Delivery two-way Send Port with a Loopback Adapter which debatches the EDI on the receive side. In this case, the initial Receive Location would be a PassThrough.
You can find several Loopback Adapters here: http://social.technet.microsoft.com/wiki/contents/articles/12824.biztalk-server-list-of-custom-adapters.aspx#jjj
BizTalk isn't really made to process multiple large files at once, and the file adapter doesn't have any built-in way to limit how many files it will pull at once.
There's a commercial solution available to help handle this (disclosure: I work for Tallan and work on this solution) called the T-Connect EDI Splitter (https://www.tallan.com/products/t-connect/edi-file-splitter/). The use case is splitting the files on a pipeline into more manageable chunks to be consumed elsewhere. This is not a trivial task, unfortunately.
If your files are small enough to process without splitting them before they hit the EDI receive pipeline (you don't need to split them further, you just need to process them one at a time), you'll have to come up with a more complicated messaging flow to deal with that - receive them using a PassThrough pipeline, send them somewhere that can just consume them, then poll them using a second receive location that offers more precise control of polling.
You could also just write your own adapter that offers polling and interval settings, but that's much more complicated and messy.

Can uploads be too fast?

I'm not sure if this is the right place to ask this, but I'll do it anyway.
I have developed an uploader for the Umbraco CMS that lets people upload a queue of files in one go. This uses some simple flash app that just calls a .NET ashx to upload the files one at a time. When one is done, the next one starts.
Recently I've had a user hit a problem where 1 or 2 uploads will go up fine, but then the rest fail. This happens for himself and a client of his. After some debugging, he thinks he's found the problem, but it seems weird so was wondering if anyone else has had this problem?
Both he and his client are on fibre-optic broadband connections, so they have really fast upload speeds. When it was tested on a slower broadband connection, all the files uploaded with no problem. According to one of his developer friends, they had apparently come across this before and had to put a slight delay in the upload script to make it work.
Does this sound possible? Had anyone else hit this problem? Is there a known workaround to prevent the uploads from failing?
I have not struck this precise problem before, but I have done a lot of DSL and broadband troubleshooting, so I will do my best to answer this.
There are 2 possible causes for this particular symptom, both generally outside of your network control (I would have thought).
1) Packet loss
Of course, where some links receive a very high volume of traffic, they can choose to just dump a lot of data (e.g. anything over that link's maximum set size), but TCP/IP should be controlling that, and it also expects that sort of drop from time to time, so this seems less likely.
2) The receiving server
There may be some HTTP bottleneck into that server, or the receiving server's CPU / RAM etc. may be at capacity.
From a troubleshooting perspective, even if these symptoms shouldn't (in theory) exist, the fact that they do means they are worth chasing down.
Next steps if you really need to understand how it is all working might be to get some sort of packet sniffer (like WireShark) to try to work out at a packet level what exactly is happening.
Socket programming also lets you talk directly to the TCP/IP sockets, so you would be working at the lower network layers and seeing the responses and timeouts etc.
Also if you control the receiving server, then you can do the same from that end, or at least review the error logs to see what is getting thrown up as a problem.
A really basic method could be to run pathping against the receiving server, if that is possible; that might highlight slow nodes on the way to the server, or packet loss between your local machine and the end server.
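For example (the host name below is just a placeholder), pathping combines tracert and ping and reports per-hop loss over a sampling period of a few minutes:

    pathping upload.example.com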
The upshot? Put in a slow down function in the upload code, and that should at least make the code work.
Get in touch if you need any analysis of the WireShark stuff.
I have run into a similar problem with an MVC2 website using Flash uploader and Firefox. The servers were load balanced with a Big-IP load balancer. What we found out in debugging this is that Flash, in Firefox, did not send the session ID on continuation requests and the load balancer would send continuation requests off to another server. Because the user had no session on the new server, the request failed.
If a file could be sent in one chunk, it would upload fine. If it required a second chunk, it failed. Because of this the upload would fail after an undetermined number of files being uploaded.
To fix it, I wrote a Silverlight uploader.

What Causes "Internal connection fatal errors"

I've got a number of ASP.Net websites (.Net v3.5) running on a server with a SQL 2000 database backend. For several months, I've been receiving seemingly random InvalidOperationExceptions with the message "Internal connection fatal error". Sometimes there's a few days in between, while other times there are multiple errors per day.
The exception is not limited to one site in particular, though they share business and data access assemblies. The error seems to always be thrown from SqlClient.TdsParser.Run(). It sometimes is thrown from old-school direct SqlCommand.Execute() calls, while other times it is thrown from Linq2Sql code.
I've been assured by the network guys that there are no errors or packets lost on their end. Has anyone else experienced this? Could it be a driver problem? We have been unable as of yet to pinpoint a specific trigger for this exception.
We're running IIS 6 on Windows Server 2003.
After a few months of ignoring this issue, it started to reach a critical mass as traffic gradually increased. Under heavy load, including some crawlers, things got crazy and these errors poured in nonstop.
Through trial and error, we eventually tracked down a handful of SqlCommand or LINQ queries whose SqlConnection wasn't closed immediately after use. Instead, through some sloppy programming originating from a misunderstanding of LINQ connections, the DataContext objects were disposed (and connections closed) only at the end of a request rather than immediately.
Once we refactored these methods to immediately close the connection with a C# "using" block (freeing up that pool for the next request), we received no more errors. While we still don't know the underlying reason that a connection pool would get so mixed up, we were able to cease all errors of this type. This problem was resolved in conjunction with another similar error I posted, found here: Why is my SqlCommand returning a string when it should be an int?
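A minimal sketch of that refactoring, assuming a LINQ to SQL DataContext subclass (MyDataContext and the query are illustrative, not our actual code):

    using System.Linq;

    public int GetOrderCount(int customerId)
    {
        // Dispose the DataContext as soon as the results are materialized,
        // so the underlying SqlConnection returns to the pool immediately
        // instead of being held until the end of the HTTP request.
        using (var db = new MyDataContext())
        {
            // Count() executes the query now, inside the using block.
            return db.Orders.Count(o => o.CustomerId == customerId);
        }
    }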
Sounds like the database connection is getting dropped or timing out.
We recently had similar issues moving to IIS 6 from IIS 5 connecting to SQL 2000. Our issue was solved by increasing the number of ephemeral ports available.
Look at the usage of the ephemeral ports by the IIS server. The default max no. of ports available is normally 4000. You might want to consider increasing this if the sites on your server are particularly busy or your application is making a lot of database calls.
You can monitor these first to see if you are going over the max limit.
Search the Microsoft Knowledge Base for "MaxUserPort" and "TcpTimedWaitDelay" and make the necessary registry changes. Make sure you back up the registry or snapshot the server before making the changes. You will need to reboot for the changes to take effect.
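For reference, both values live under the Tcpip\Parameters key; the numbers below are the documented maximum port range and minimum TIME_WAIT from KB 328476, not a recommendation for every server:

    Windows Registry Editor Version 5.00

    ; Back up the registry first; a reboot is required for these to take effect.
    [HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters]
    ; Raise the top of the ephemeral port range from the default 5000
    ; to the maximum 65534 (0xFFFE).
    "MaxUserPort"=dword:0000fffe
    ; Shorten TIME_WAIT from the default 240 seconds to the minimum 30 (0x1E).
    "TcpTimedWaitDelay"=dword:0000001e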
You should double-check that your database and recordset connections are being closed after use. Not closing them will use up this port range unnecessarily.
Check the efficiency of your stored procedures anyway, as they might be taking longer than they need to.
"If you rapidly open and close 4000 sockets in less than four minutes, you will reach the default maximum setting for client anonymous ports, and new socket connection attempts fail until the existing set of TIME_WAIT sockets times out." - from http://support.microsoft.com/kb/328476
Check your server's LOG folder (\program files\Microsoft SQL Server\MSSQL.1\MSSQL\LOG or similar) for files named SqlDump*.mdmp and SqlDump*.txt. If you do find any you'll have to take it to Product Support.
I was creating a new EF Core project and was trying to create the database on an external Linux server instead of a Windows server or a local one. After hours of searching, I found out that I was using MySQL instead of Microsoft SQL Server.
I found it weird that everyone was using 1433 instead of the usual 3306. So to fix my 'Internal connection fatal error' I had to set up a Docker instance of SQL Server bound to its default port of 1433.
It literally was that simple. In the Docker repo, look for "microsoft-mssql-server" and run the image as described in its description. Everything works now and I am able to push my database from my EF Core project to an external server.
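For anyone trying the same thing, this is roughly the run command from Microsoft's quickstart (the image name and tag vary by SQL Server version; replace the password placeholder with your own):

    docker run -e "ACCEPT_EULA=Y" -e "MSSQL_SA_PASSWORD=<YourStrong!Passw0rd>" \
        -p 1433:1433 --name sql1 -d mcr.microsoft.com/mssql/server:2019-latest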

Slow BizTalk File Receive

I have an application with a file receive location. After the host instance has been running for a few hours the receive location fails to identify new files dropped into the folder that it is monitoring. It doesn't forget about them altogether, it's just that performance grinds to a crawl. The receive location is configured to poll the target folder every 60 seconds but after host instance has been running for an hour or so, then it seems that the target folder is being polled only every thirty minutes. If I restart the host instance then the files waiting in the target folder are collected right away and performance is fine for the next hour or so.
The same application runs fine in a different environment.
There are no obvious entries in the event log related to the problem.
All the BizTalk SQL jobs are running fine except for Backup BizTalk Server (BizTalkMgmtDb).
Any suggestions gratefully received.
Thanks
Rob
Here are some additional tools which may help you identify and diagnose BizTalk database issues:
BizTalk MsgBox Viewer
And here is a tool to repair identified errors:
Terminator
Use at your own risk... read the blogs and docs. Start with the MsgBox Viewer and let us know your results.
Without more details, the biggest tell is that your Backup job is failing. If the backup job is failing, it may not be properly configured. If it is properly configured and still failing, then you've got other issues. Can you give us some more information about your BizTalk install?
What version are you running?
What are your database sizes?
What are your purge and archive settings like?
Are there any long-running blocks in your SQL Server database coming from BizTalk?
Another thing to consider is the user accounts that the send, receive and orchestration hosts are running under. Please check the BizTalk Administration Console. If they are all running under the same account, sometimes the orchestrations can starve the send and receive processes of CPU time. I believe priority is given to orchestrations, then receive, then send. Even if you are just developing, it is useful to use separate accounts for this. This also improves security.
The Wrox BizTalk Server 2006 book will also supply tuning advice.
What other things are going on with the server? Is BizTalk pegged otherwise or is it idle?
You mention that the solution does not have any problems in another environment, so it's likely that there is a configuration problem.
Check the following:
** On SQL Server, set some upper memory limit for SQL Server. By default, SQL Server uses whatever it can get and then hangs onto it, so set a reasonable limit so that your system can operate without spending a lot of time paging memory onto and from your hard drive(s).
** Ensure that you have available disk space - maybe you are running low - this can lead to all kinds of strange problems.
** Try to split up the system's paging file among its physical drives (if you have more than one drive on the system). Also consider using a faster drive, or if you have lots of cash laying around, get a SAN.
** In BizTalk, is tracking enabled? If so, are you also tracking message bodies? Disable tracking or message body tracking and see if there is a difference.
** Start performance monitor and monitor the following counters when running your solution
Object: BizTalk Messaging
Instance: (select the receiving host) %%
Counter: Documents Received/Sec
Object: BizTalk Messaging
Instance: (select the transmitting host) %%
Counter: Documents Sent/Sec
Object: XLANG/s Orchestrations
Instance: (select the processing host) %%
Counter: Orchestrations Completed/Sec.
%% You may have only one host, so just use it. Since BizTalk configurations vary, I am using generic names for hosts.
The preceding counters monitor the most basic aspects of your server, but may help to narrow down places to look further. You can, of course, add CPU and Memory too. If you have time (days...maybe weeks) you could monitor for processes that allocate memory and never release it. Use the following counter...
Object: Memory
Counter: Pool Nonpaged Bytes
A slow but steady increase in this counter indicates that a process is not releasing memory, which affects everything on the system.
Let us know how things turn out!
I had the same problem: when my orchestration was idle for some time, it took a long time to process the first message. An article by EvYoung helped me solve this problem.
"This is caused by application domain unloading within the BizTalk host process. If an AppDomain is shutdown after idle, the next message that comes needs to wait for the Orchestration to compile again. Depending on the complexity of your design, this can be a noticeable wait. To prevent this in low latency requirement scenario, you can modify the BTSNTSVC.EXE.config file and set SecondsIdleBeforeShutdown property to -1. This will prevent AppDomain shutdown due to idle."
You can find the article in here:
http://blogs.msdn.com/b/biztalkcpr/archive/2008/05/08/thoughts-on-orchestration-performance.aspx
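For reference, a sketch of what that BTSNTSVC.exe.config change might look like, based on the orchestration engine configuration documentation (verify the section handler type and the other defaults against your BizTalk version):

    <configuration>
      <configSections>
        <section name="xlangs"
                 type="Microsoft.XLANGs.BizTalk.CrossProcess.XmlSerializationConfigurationSectionHandler, Microsoft.XLANGs.BizTalk.CrossProcess" />
      </configSections>
      <xlangs>
        <Configuration>
          <AppDomains>
            <!-- -1 = never unload an AppDomain just because it is idle -->
            <DefaultSpec SecondsIdleBeforeShutdown="-1"
                         SecondsEmptyBeforeShutdown="1800" />
          </AppDomains>
        </Configuration>
      </xlangs>
    </configuration>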
It took me too long to respond, but I thought it might help someone. Cheers :)
Some good suggestions from others. I will add:
Do you have any custom receive pipeline components on the receive location? If so, perhaps one is leaking memory, or calling some external component (e.g. a database) which is taking a long time?
How big are the files you are receiving ?
On the File transport properties of your receive location, turn "file renaming" on; do the files get renamed within 60 seconds?
