Avoiding Zombies in BizTalk 2006 Orchestration - biztalk

I'm running BizTalk 2006, and I have an orchestration that receives a series of messages (orders) that are correlated on BTS.MessageType. On my delay shape, I check the time until midnight, which is the batch cutoff. Occasionally I receive a message after the loop ends, and this creates zombie messages. I still need these messages to be processed, but in a new instance of the orchestration. I need some ideas on how to handle this gracefully.

One option would be to correlate on the date (in addition to BTS.MessageType).
You would have to create a pipeline component that promotes the date without the time. But there could be a time window in which messages go "randomly" to either the old or the new instance (for example, if you have multiple BizTalk servers with slightly different times, or if the system clock is resynchronized with an NTP source). To be safe, wait a few minutes before ending the previous day's instance.
If that window of overlap between the old and new instances is a problem, you should instead correlate on another value that changes only once a day, such as a GUID stored in a database and promoted by a pipeline component.
Otherwise, I've successfully used your "hackish" solution in past projects, as long as you can tolerate a small window of a few minutes every day in which messages are queued and not processed immediately. In my case it was fine because messages were produced by American users during their work day and sent by FTP or MSMQ. However, if you have international users who send messages by web services, then you may not have a time of day when you are unlikely to receive anything, and the web services won't be able to queue the messages for later processing.
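The date-based correlation idea can be sketched outside BizTalk as follows. This is only a minimal illustration in Python, not pipeline component code: the function names and the 5-minute grace period are assumptions (the answer only says "a few minutes"), and the datetimes are naive local times.

```python
from datetime import date, datetime, time, timedelta

# Assumed grace period; the answer only says "a few minutes".
GRACE = timedelta(minutes=5)


def batch_key(received_at: datetime) -> str:
    """Date-only value to promote for correlation (time of day dropped)."""
    return received_at.date().isoformat()


def remaining_delay(batch_day: date, now: datetime,
                    grace: timedelta = GRACE) -> timedelta:
    """How long the instance for batch_day should keep listening.

    The instance stays alive until midnight at the end of batch_day plus
    the grace period, so late messages carrying the old date still find it.
    """
    cutoff = datetime.combine(batch_day + timedelta(days=1), time.min) + grace
    return max(cutoff - now, timedelta(0))
```

A message received at 23:59:59 and one received at 00:00:01 get different keys, so the second one starts a new instance instead of becoming a zombie of the old one.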

Related

Facing latency issue in Azure EventHub consumer

How can I avoid latency when consuming data from EventHub?
My Architecture (data flow): IOTHub -> EventHub -> BlobStorage (No deviation from IOTHub packet to Blob Storage JSON packet)
Deviation occurs only on the consumer application side (the listener receives events with a delay of 30-50 seconds).
Azure Configuration: 4 Partitions with standard S2 tier subscription.
Publisher: 3000 packets per minute.
My question: BlobStorage has the proper data without deviation, so why is the listener receiving it with latency? How can I overcome this?
I tried EventProcessorClient with the respective handlers, as suggested in the GitHub sample code. It works without errors but has huge latency. I tried EventHubProducerClient as well, with the same latency issue.
I can't speak to how IoT Hub manages data internally or what its expected latency is between IoT data being received and when IoT Hub itself publishes to Event Hubs.
With respect to Event Hubs, you should expect to see individual events with varying degrees of end-to-end latency. Event Hubs is optimized for throughput (the number of events that flow through the system) and not for the latency of an individual event (the amount of time it takes for it to flow from publisher to consumer).
What I'd suggest monitoring is the backlog of events available to be read in a partition. If there are ample events already available in the partition and you’re not seeing them flow consistently through the processor as fast as you’re able to process them, that’s something we should look to tune.
Additional Event Hubs context
When an event is published, by IoT Hub or another producer, the operation completes when the service acknowledges receipt of the event. At this point, the service has not yet committed the event to a partition, and it is not available to be read. The time that it takes for an event to become available for reading varies and has no SLA associated with it. Most often it’s milliseconds, but it can be several seconds in some scenarios, for example if a partition is moving between nodes.
Another thing to keep in mind is that networks are inherently unreliable. The Event Hubs consumer types, including EventProcessorClient, are resilient to intermittent failures and will retry or recover, which will sometimes entail creating a new connection, opening a new link, performing authorization, and positioning the reader. This is also the case when scaling up/down and partition ownership is moving around. That process may take a bit of time and varies depending on the environment.
Finally, it's important to note that overall throughput is also limited by the time that it takes for you to process events. For a given partition, your handler is invoked and the processor will wait for it to complete before it sends any more events for that partition. If it takes 30 seconds for your application to process an event, that partition will only see 2 events per minute flow through.
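The arithmetic behind that last point can be written out as a short sketch. It assumes the handler processes one event at a time per partition and that events are spread evenly across partitions; the function names are illustrative only.

```python
def max_events_per_minute(seconds_per_event: float, partitions: int = 1) -> float:
    """Upper bound on throughput when each partition handles events one by one."""
    return partitions * 60.0 / seconds_per_event


def max_seconds_per_event(events_per_minute: float, partitions: int) -> float:
    """Largest per-event handler time that still keeps up with the publisher."""
    return 60.0 * partitions / events_per_minute


# The example from the answer: a 30-second handler on one partition.
print(max_events_per_minute(30))        # 2.0 events per minute

# The setup from the question: 3000 packets per minute over 4 partitions.
print(max_seconds_per_event(3000, 4))   # 0.08 seconds per event
```

Under those assumptions, a handler that takes longer than about 80 ms per event cannot keep up with 3000 packets per minute on 4 partitions, and a backlog builds up that looks like latency on the consumer side.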

Queue system recommendation approach

We have a bus reservation system running in GKE in which we handle the creation of reservations with different threads. Because of that, CRUD Java methods can sometimes run simultaneously against the same bus, so that only the LAST simultaneous update is saved in our DB (and the other simultaneous updates are lost).
Even if the probability is low (the simultaneous updates need to be really close, within 1-2 seconds), we need to avoid this. My question is about how to approach a solution:
Lock the bus object and return error to the other simultaneous requests
In-memory map or Redis cache to track the bus requests
Use GCP Pub/Sub, Kafka or RabbitMQ as a queue system.
Try to focus the efforts on reducing the simultaneous time window (reduce it from 1-2 seconds down to milliseconds)
Others?
Also, we are worried that the scalability of request handling in GKE may become an issue in the future. If we manage a considerably higher number of buses, would we need to implement a queue system between the client and the server? Or will the GKE load balancer and Ambassador already manage this for us? In case we need a queue system in the future, could it also be used for the collision problem we are facing now?
Last, the reservation requests from the client often take a while. Therefore, we are changing the requests to be handled asynchronously, with a long polling approach from the client to know the task status. Could we link this solution to the current problem? For example, by using the Redis cache or the queue system to know the task status? Or should we try to keep the requests synchronous and focus on reducing the processing time (which may be quite difficult)?
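For reference, the first option in the list (lock the bus and return an error to the other simultaneous requests) could look roughly like the sketch below. It is written in Python for brevity rather than Java, the class and method names are hypothetical, and it only protects threads inside a single process; with several GKE pods, the lock would have to live in something shared, such as the Redis cache mentioned above.

```python
import threading
from collections import defaultdict


class BusLocks:
    """One non-blocking lock per bus id (hypothetical helper)."""

    def __init__(self):
        self._guard = threading.Lock()             # protects the lock table
        self._locks = defaultdict(threading.Lock)  # bus id -> lock

    def try_update(self, bus_id, update):
        """Run update() if no other update for bus_id is in progress.

        Returns True if update() ran, or False if the bus was busy, in
        which case the caller can return an error or ask the client to retry.
        """
        with self._guard:
            lock = self._locks[bus_id]
        if not lock.acquire(blocking=False):
            return False
        try:
            update()
            return True
        finally:
            lock.release()
```

Updates for different buses do not block each other; only a second update for the same bus that arrives while the first is still running is rejected.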

Release memory by removing consumed messages in BizTalk Orchestration?

I have built an orchestration with a loop to retrieve paged data from a REST web service. Using page size and offset, I am able to call the service for the "next page" of data. Then I debatch it, map it to an internal format and process it further. When one page is processed, I request the next page from the REST web service.
As it turns out, the host running the orchestration and send ports causes the memory to constantly grow during processing of all the data, and eventually hit the throttling mode.
Why is memory not released when I am done with one page loop? Is it the "consumed" messages that are stored in the orchestration that builds up the memory? Is it possible to clear orchestration from these "consumed" messages, to release the memory used?
(No message tracking active on the orchestration, or send ports.)
Apparently, there is no way to prevent BizTalk orchestrations from building up a list of messages in the orchestration, including used/processed/consumed messages. Putting things in a Scope does not prevent this behaviour.
Hence, for long-running orchestrations there can be a lot of messages building up. This is especially true for singleton orchestrations, where the general solution proposed for this problem is to make sure the orchestration shuts down once in a while (when idle, for example).
My solution was to split the orchestration into two, and have the initial orchestration start the second one with the Start Orchestration shape. The second orchestration in turn starts itself recursively, and so on, until the last page is received and the last orchestration ends.
Yes, what you need to do is to use scopes and make sure that the messages are initialised inside the scope rather than at the top level of the orchestration. That means they will also be disposed of at the end of the scope. Note: this means those messages can't be used outside of the scope.
However, if you are just re-using the same messages in the loop, then I wouldn't expect memory usage to increase, so there is possibly something else going on. I suspect that you are adding each page to a message, and that is what is growing.
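The difference between handling one page at a time and accumulating every page can be illustrated outside BizTalk. The sketch below is a Python analogy, not orchestration code: `fetch_page` and `handle_record` are hypothetical callables standing in for the REST call and the downstream processing.

```python
from typing import Callable, Iterator, List


def iter_pages(fetch_page: Callable[[int, int], List[dict]],
               page_size: int) -> Iterator[List[dict]]:
    """Yield one page at a time, advancing the offset by the page size."""
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        if not page:
            return
        yield page
        if len(page) < page_size:  # a short page means it was the last one
            return
        offset += page_size


def process_all(fetch_page, handle_record, page_size: int = 100) -> int:
    """Process every record while holding at most one page in memory."""
    processed = 0
    for page in iter_pages(fetch_page, page_size):
        for record in page:
            handle_record(record)
        processed += len(page)
        # `page` is rebound on the next iteration, so the old page can be
        # freed; nothing here collects the pages into a growing structure.
    return processed
```

If each page were instead appended to a list (or, in the orchestration, to a message that keeps growing), memory use would rise with every loop iteration, which matches the symptom described in the question.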

How to blank out and read complete AnyOfferChangeNotification in SQS?

The number of AnyOfferChanged notifications varies over time, and we don't have any specific way to read multiple notifications together.
So I am reading them one by one and saving some of the information in a SQL Server database. This takes so long that I can never finish reading all the notifications.
What is the best possible way to achieve this?
Here's what I did. I started by clearing out the queue. Then I started my Windows service, which polled the queue every few seconds. I think I pulled back 10 messages at a time. I would get a total count of messages and then spin up a number of threads that could handle the number of messages I had waiting. One by one, I would read a message, add it to my SQL database, then delete the message from SQS.
Over time, I understood better how many threads to spin up and how often to poll my queue. As long as my service was running, I would maintain just a handful of SQS messages in the queue at a time and I would quickly read and process them. Occasionally, due to bad programming (yeah, it happens), my service would crash and I wouldn't know about it. Tens of thousands of messages would become queued up and I would put my service in "crisis" mode, which polled at an increasing rate and essentially maxed out the number of calls I could make to SQS. Usually within a few hours my service would catch up, and then I would increase the polling interval again. Sometimes, though, I would just dump the queue and start over, as I'd have potentially hundreds of price changes on a single SKU and didn't want to waste the processing time to go through them. But most of the time, things ran smoothly.
Why can't you read more than one notification together? Like I said, I believe I read 10 at a time on each thread. Once I got the 10 messages, I processed them in a loop and dumped them to a SQL database. Once the 10 were processed, I sent a request to SQS to delete them.
I ran this for several years on an account with over 10,000 SKUs. We had up-to-the-minute price change notifications on all our products and could instantly reprice and update Amazon, if needed.
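The read-10, process, delete loop described above might look like the sketch below. It is not the answerer's actual service: `drain_once` expects an object with the `receive_message` and `delete_message_batch` methods of a boto3 SQS client, `save` is a hypothetical callback that writes to the database, and `FakeSqs` is an in-memory stand-in so the example runs without AWS access.

```python
def drain_once(sqs, queue_url, save):
    """Receive up to 10 messages, save each body, then delete them in one call.

    Returns the number of messages handled (0 when the queue is empty).
    """
    response = sqs.receive_message(QueueUrl=queue_url, MaxNumberOfMessages=10)
    messages = response.get("Messages", [])  # key is absent when nothing came back
    if not messages:
        return 0
    for message in messages:
        save(message["Body"])
    # Delete only after every message in the batch was saved successfully.
    sqs.delete_message_batch(
        QueueUrl=queue_url,
        Entries=[{"Id": m["MessageId"], "ReceiptHandle": m["ReceiptHandle"]}
                 for m in messages],
    )
    return len(messages)


class FakeSqs:
    """In-memory stand-in for an SQS client, for illustration only."""

    def __init__(self, bodies):
        self._queue = [{"MessageId": str(i), "ReceiptHandle": f"rh-{i}", "Body": b}
                       for i, b in enumerate(bodies)]

    def receive_message(self, QueueUrl, MaxNumberOfMessages=1, **_ignored):
        batch = self._queue[:MaxNumberOfMessages]
        return {"Messages": batch} if batch else {}

    def delete_message_batch(self, QueueUrl, Entries):
        handles = {entry["ReceiptHandle"] for entry in Entries}
        self._queue = [m for m in self._queue if m["ReceiptHandle"] not in handles]
        return {"Successful": [{"Id": entry["Id"]} for entry in Entries]}


saved = []
fake = FakeSqs([f"notification-{i}" for i in range(25)])
while drain_once(fake, "queue-url-placeholder", saved.append):
    pass
print(len(saved))  # 25
```

Several worker threads can each run this loop against the same queue. The sketch ignores partial failures reported by the batch delete, which a real service would need to check.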

BizTalk message box full of BTXTimerMessage messages

We are using BizTalk (2013 R2 CU 6) EDI functionality to batch EDI files. This uses the Microsoft.BizTalk.Edi.BatchingOrchestration.BatchingService orchestration, which is always running in a waiting state (for lack of a better term) and is dehydrated most of the time. While running, the orchestration builds up instances of BTXTimerMessage in the Queued (awaiting processing) state. As far as I can tell, these messages are never removed or processed. This eventually causes us to pass the 50k message threshold and start throttling.
As far as I can tell, there is no way to set up a recurring schedule for the batcher; it must always run or be started manually. If we leave the batcher off, we get routing errors.
Currently the only way we have to eliminate these messages is to terminate the EDI batcher for each party, then restart it.
Is there a better way to purge these messages from the system, or to stop them from being generated altogether?

Resources