Continuously running send pipeline instance - BizTalk

An instance of a BizTalk send pipeline has started to run continuously. On 09/12/2021 an attempt was made to send a file via SFTP, which retried several times but ultimately failed due to a network issue. The error from the event logs is:
The adapter failed to transmit message going to send port "Deliver Outgoing - SFTP" with URL "sftp://xxx.xxxxxx.co.nz:22/To_****/%SourceFileName%". It will be retransmitted after the retry interval specified for this Send Port. Details:"WinSCP.SessionRemoteException: Network error: Software caused connection abort.
For some reason BizTalk made another send attempt at 1:49pm on 10/12/2021, which succeeded, as confirmed by the administrator of the SFTP site. Despite this, BizTalk continued making intermittent send attempts and the pipeline instance is still running. The same file has now been sent 4 times to the SFTP server.
In theory the pipeline instance should have suspended at 9:47pm on 09/12/2021. I have not been able to confirm definitively whether anybody resumed it, but it seems unlikely at this stage. In any case, after a successful send the pipeline instance should have terminated and should not be re-executing intermittently.
Does anybody know what could account for this behaviour? This is occurring on BTS2020 with CU2 applied.

I've seen this when sending messages over SFTP where WinSCP's handling of the date-modified attribute doesn't work with a specific type of SFTP server.
In the WinSCP GUI a dialogue box appears and you can disregard this error, but that option isn't available through BizTalk. The error appears when a file with the same filename already exists on the server and is supposed to be overwritten.
My solution was to create a pipeline component that removes the existing %SourceFileName% on the server before the send. The pipeline component (just like the WinSCP GUI) can disregard the modified-date error.

How to recover with a retry from Gremlin NoHostAvailableException

I am using the Gremlin Java driver to connect to a local Gremlin Server.
Simple code flow:
Create the client:
Cluster cluster = Cluster.build().addContactPoint(<endp>).port(<port>).enableSsl(false).create();
Client client = cluster.connect();
Submit Script
client.submit("g.V().count()");
If the Gremlin Server is down the first time I submit, then on subsequent retries after bringing the Gremlin Server back up, the connection still fails.
Exception First attempt when Gremlin Server is down:
org.apache.tinkerpop.gremlin.driver.exception.NoHostAvailableException: All hosts are considered unavailable due to previous exceptions. Check the error log to find the actual reason
Exception After Gremlin server is brought back up:
tinkerpop.gremlin.driver.exception.NoHostAvailableException: All hosts are considered unavailable due to previous exceptions
One thing to note: I do not recreate the client on retry, I just do
Submit Script
client.submit("g.V().count()");
It is quite possible that the Gremlin Server may go down at any time; how do I recover in such circumstances? Fundamentally, is NoHostAvailableException recoverable?
Normally, the Client should attempt to reconnect to a host that was previously marked unavailable. By default, it should retry the host at 1 second intervals, as governed by the connectionPool.reconnectInterval configuration. In your case, however, I think you've uncovered a bug where the reconnect attempts are never started because the Client was never able to reach the host in the first place. As of 3.4.11, you can only remedy this by recreating the Client, as you noted in your comments. I've created an issue to track this problem here: TINKERPOP-2569
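Until that is fixed, a minimal sketch of the recreate-the-Client workaround might look like the following. The endpoint, port, retry budget and sleep are placeholder assumptions, not values from the question: on NoHostAvailableException the Cluster and Client are rebuilt before the next submit instead of being reused.

import org.apache.tinkerpop.gremlin.driver.Client;
import org.apache.tinkerpop.gremlin.driver.Cluster;
import org.apache.tinkerpop.gremlin.driver.exception.NoHostAvailableException;

public class GremlinSubmitWithRecreate {
    public static void main(String[] args) throws Exception {
        Cluster cluster = buildCluster();
        Client client = cluster.connect();
        try {
            for (int attempt = 1; attempt <= 5; attempt++) {   // retry budget is arbitrary
                try {
                    long count = client.submit("g.V().count()").one().getLong();
                    System.out.println("Vertex count: " + count);
                    break;
                } catch (NoHostAvailableException e) {
                    // Workaround for the behaviour described above: if the first contact
                    // ever failed, the driver (up to 3.4.11) never starts its background
                    // reconnects, so rebuild the Cluster and Client before retrying
                    // rather than re-submitting on the original Client.
                    client.close();
                    cluster.close();
                    Thread.sleep(5000);
                    cluster = buildCluster();
                    client = cluster.connect();
                }
            }
        } finally {
            client.close();
            cluster.close();
        }
    }

    private static Cluster buildCluster() {
        // Endpoint and port are placeholders; substitute your own server details.
        return Cluster.build()
                .addContactPoint("localhost")
                .port(8182)
                .enableSsl(false)
                .reconnectInterval(1000)   // connectionPool.reconnectInterval, in ms
                .create();
    }
}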

BizTalk: Knowing when a send port has finished writing a file

I have a problem with a send port and an application: The process cannot access the file because another process has locked a portion of the file.
I guess the problem is that while the BizTalk send port is writing the file, the application picks it up and starts processing it.
My scenario:
I have an orchestration with a file send port that writes a file to a location.
After this port I have another send port that calls an application, which picks up the written file and processes it.
I think that while the file send port is writing and has not yet finished, the orchestration does not wait but continues to the next step (calling the application), and this leads to the above error.
Is my assumption correct?
And how can I solve this problem?
You are absolutely correct: your orchestration basically throws the message at your send port and continues. But you can change this behaviour, and I'll give you a really simple solution. Here it is:
* Set the Delivery Notification property of your logical send port to Transmitted.
Now your orchestration will wait for the delivery ACK.
* To make things cleaner, create a scope and catch Microsoft.XLANGs.BaseTypes.DeliveryFailureException, which is thrown when you don't get an ACK.
* Also add a Suspend Orchestration shape in your catch block so you can resume your orchestration if your message doesn't reach its destination :)
This works with both the File and FTP adapters (I didn't test others).

SQL to resume suspended messages in order

We have an upcoming deploy for a system that processes a lot of messages through BizTalk. Because those messages are cumulative updates, they need to be queued up during the deployment outage and then processed in order when the deploy is finished. Since there may be a large number of them, it's difficult to do this manually.
One possible solution is to leave the send port stopped and let the messages suspend. We can then resume them in order when the deployment is completed.
Is it possible to run a SQL script (or a tool) against the BizTalk messagebox database that will resume suspended messages, for a specific port, in order of receipt?
If you have an ordered requirement (you either do or don't), then the Send Port should be marked for Ordered Delivery.
If so, then when you Start a Stopped Send Port, the messages will be processed in the same order they were submitted.
If you stop the port (but leave it subscribed) and start it again afterwards, it should resume the messages itself; if not, it is simple enough to go into the Administration Console and batch resume them.
However, if the response messages of the send port are also subscribed to by running Orchestrations, you will not be able to un-deploy the Orchestrations until they have all completed, so stopping the send port would not work in this scenario.
Sometimes one option, if the initiating port is a one-way receive, is to stop the receive location and let everything complete. You can then stop the application, redeploy and restart it, and the send port will pick up all the waiting messages to process.
If the above is not possible, you may want to look at doing a side-by-side deployment, where you increment the version numbers of all the assemblies in the solution so that both versions can be deployed at the same time; you can then allow the old version to finish running while the new version processes any new messages.
The better option is to send the messages to MSMQ; usually there is no extra coding required for this. You can route the messages to MSMQ using the MSMQ adapter and then, after deployment, receive them in order, as the MSMQ adapter supports ordered delivery. Just make sure you do a small test in your QA environment before doing it in production.

Network error (tcp_error)

I am inside a network where I need proxy settings to access the internet.
I have a weird problem.
The internet is working fine, but there is one particular case where I get this error:
Network Error (tcp_error)
A communication error occurred: "Operation timed out"
The Web Server may be down, too busy, or experiencing other problems preventing it from responding to requests. You may wish to try again at a later time.
For assistance, contact your network support team.
This happens when I use Hadoop in local mode.
I can access the web UI and I can see the jobs running, but when I try to view the logs of each task I am not able to access them.
UI --> job --> map --> task --> all <-- this is where the error occurs.
Any clues?
Thanks
I'm not sure exactly what your TCP action is, or about Hadoop or your proxy setup, but if you can reliably repeat the error, the timeout happens at approximately the same time on each test, and that time is on the order of minutes, my guess would be that you've got a genuine processing delay (perhaps caused by blocking somewhere) at the server, though not necessarily.

Sandbox violation on second socket send

I have a Flex client using a Flash binary (TCP) socket for communication with a Java server. I have a localhost (Apache) server providing a crossdomain.xml file which is wide open just while I am testing.
My code successfully loads the policy file on startup.
I then connect the socket to the server without any difficulty and send a message and get a response. All good so far.
However, when I send a second message through the same socket I get a pause of about 12 seconds then a sandbox violation error:
Security Error: Error #2048: Security sandbox violation: file:///C:/apache_root/ttt1/ttt1.swf cannot load data from localhost:45455.
This is the same port and socket through which the first message succeeded.
I tried re-loading the policy file before every send, but I get the same result.
Any idea why this might be happening? I clearly have an open socket at one point. I am flushing the socket after each send and I tried doing that after each read as well, but the same result.
Thanks in advance
EDIT:
If I recreate the socket prior to every call my code works. I am struggling to believe that this is correct, but maybe there is a Socket setting I am missing.
As far as I know, if you're using binary sockets the crossdomain.xml is not loaded via HTTP.
Have you checked your Apache access logs to see whether the crossdomain file is even being requested?
You might instead see a TCP connection from Flash asking your Java server for the policy file (not over HTTP; it just sends the string "<policy-file-request/>" or similar). Look out for those requests. If you don't answer them within 3 seconds (or so), Flash throws a sandbox violation.
The first thing you have to do when you want to make a socket connection is to load the policy file. This only has to be done once per load of the SWF.
Security.allowDomain(host);
Security.loadPolicyFile("xmlsocket://"+host+":"+port);
The request will be made on the assigned port (45455 in your case); your server will have to listen on that port for the request "<policy-file-request/>" (without the quotes).
When that request is received, you need to return the crossdomain.xml to the client, with the node <allow-access-from domain="*" to-ports="*" />.
After the crossdomain file is sent, you need to close the socket on the server side.
On the client side you can ignore the policy response, as Flex will handle that; at that point you can reconnect to the socket server.
At this time you can do your data send/receive.
I have a feeling the reason it actually worked for you is that you were using the policy-file connection to transmit your data before it timed out.
I would suggest reading up on the newer style of crossdomain policies and also on the protocol you are using for your socket server.
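To illustrate the handshake described above, here is a minimal sketch of a Java socket policy-file responder. The port, policy contents and buffer size are assumptions for illustration; a real server would also have to speak your actual data protocol on the same socket once the policy exchange is out of the way.

import java.io.InputStream;
import java.io.OutputStream;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

public class PolicyFileResponder {
    // Flash expects the policy XML followed by a NUL terminator.
    private static final String POLICY =
            "<?xml version=\"1.0\"?>"
            + "<cross-domain-policy>"
            + "<allow-access-from domain=\"*\" to-ports=\"*\" />"
            + "</cross-domain-policy>\0";

    public static void main(String[] args) throws Exception {
        try (ServerSocket server = new ServerSocket(45455)) {   // port assumed from the question
            while (true) {
                try (Socket socket = server.accept()) {
                    InputStream in = socket.getInputStream();
                    OutputStream out = socket.getOutputStream();
                    byte[] buf = new byte[128];
                    int read = in.read(buf);
                    String request = read > 0
                            ? new String(buf, 0, read, StandardCharsets.UTF_8)
                            : "";
                    if (request.startsWith("<policy-file-request/>")) {
                        // Answer the Flash runtime's policy request; try-with-resources
                        // then closes the socket on the server side, as described above.
                        out.write(POLICY.getBytes(StandardCharsets.UTF_8));
                        out.flush();
                    }
                    // A real server would otherwise hand this socket over to the
                    // normal data protocol here.
                }
            }
        }
    }
}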
I think it depends on the sandbox policy you used when compiling your SWF, not on your crossdomain.xml... maybe this documentation helps you: Security sandboxes.
But I'm not 100% sure.
This sort of sounds like a cache problem. Perhaps you're pulling the first socket connection out of cache and the second one gets rejected because it's getting a 200 from the server.
You might want to add localhost to your Flash security exceptions list for debugging. That will quiet the sandbox errors until you get your piece to its production environment.
