Handle RPC and inter-node communication timeouts - Corda

I have a Corda private network with 5 nodes and 1 notary node, and a web client with a UI and RESTful services.
There are quite a few states, with their attributes, that users manage through the UI. I need to understand how to handle timeouts and avoid duplicate updates or errors.
Scenario 1
The user is viewing a specific unconsumed current state of a feature.
The user performs an edit and updates the state.
Upon receiving the request, the RESTful component uses CordaRPCClient to start the flow, setting a timeout value of e.g. 2 seconds.
The Corda flow runs the configured rules and has to collect signatures from all the participating nodes (4). The complete processing takes more than 2 seconds (some file processing, multiple state updates, etc.). I can raise the timeout for specific use cases, but an overrun can still happen at any time, so I need to understand the recommended way of handling it.
As the time taken exceeds the timeout, CordaRPCClient throws an exception. From the RESTful service's / user's point of view, the transaction has failed.
Behind the scenes, however, Corda keeps processing, collecting signatures and updating nodes. From Corda's perspective everything looks fine, and the change set is committed to the ledger.
Question:
Is there a way to know that the submitted transaction is still in progress, so the RESTful service can wait?
If the user submits again, we check that the transaction hash is the latest one associated with the unconsumed state and reject the request if it is not (the hash was provided to the UI when it queried the state).
Is there a recommended way of handling this?
Scenario 2
The user is viewing a specific unconsumed current state of a feature.
The user performs an edit and updates the state.
Upon receiving the request, the RESTful component uses CordaRPCClient to start the flow, setting a timeout value of e.g. 2 seconds.
The Corda flow runs the configured rules and has to collect signatures from all the participating nodes (4). One of the nodes is down or unreachable, so the flow hangs, waiting for the node to come back up.
The RESTful service / UI receives a timeout exception. The user refreshes the view and submits the change again. Querying the current node returns the old data, so the user makes the change again and resubmits. The same happens at the Corda layer: the new transaction is built on the latest unconsumed state (the tx hash check passes because the first transaction was never committed), so it proceeds and likewise hangs waiting for the node to come back up. It keeps waiting for a long time; I waited for over a minute and it did not stop trying.
Now the node comes up and syncs with its peers. The notary throws an exception because there are two pending transactions competing to form the next state in the chain, and the transaction fails.
Question:
Is there a way to know that the submitted transaction is still in progress, so the RESTful service can wait?
Is there a recommended way of handling this?
Is there a way to set timeout values for node-to-node communication?
Do I need to keep monitoring whether each node is active and tailor the user experience accordingly?
I appreciate any help with the above issues. Please let me know if any additional information is needed.

Timeouts
As of Corda 3.3, there is no way to set a timeout either on a Corda RPC request, a flow, or a message to another node. If another node is down when you try to contact it as part of a flow, the message will simply remain in your outbound message queue until it can be successfully delivered.
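What you can do is apply the timeout on the client side, against the future returned by the flow handle; the flow itself keeps running on the node regardless of whether the client is still waiting. A minimal sketch (UpdateFeatureFlow and featureId are illustrative names):
import java.time.Duration
import java.util.concurrent.TimeoutException
import net.corda.core.utilities.getOrThrow

val flowHandle = cordaRPCOps.startFlowDynamic(UpdateFeatureFlow::class.java, featureId)
val flowId = flowHandle.id // persist this so progress can be checked after a timeout
try {
    val stx = flowHandle.returnValue.getOrThrow(Duration.ofSeconds(2))
    // The flow completed within the timeout; report success to the UI.
} catch (e: TimeoutException) {
    // Only the client-side wait has expired; the flow is still running on the node.
    // Track its progress as described below instead of resubmitting.
}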
Checking flow progress
Each flow has a unique run ID. When you start a flow via RPC (e.g. using CordaRPCOps.startFlowDynamic), you get back a FlowHandle. The flow's unique run ID is then available via FlowHandle.id. Once you have this ID, you can check whether the flow is still in progress by checking whether it is still present in the list of current state machines (i.e. flows):
val flowInProgress = flowHandle.id in cordaRPCOps.stateMachinesSnapshot().map { it.id }
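For example, a status check that the RESTful service could expose might look like this (flowId being the FlowHandle.id persisted when the flow was started; the surrounding endpoint wiring is omitted):
import net.corda.core.flows.StateMachineRunId

fun isFlowInProgress(flowId: StateMachineRunId): Boolean =
    cordaRPCOps.stateMachinesSnapshot().any { it.id == flowId }
// If this returns false, the flow has finished (successfully or not);
// query the vault to determine which.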
You can also monitor the state machine manager to wait until the flow completes, then get its result:
val flowUpdates = cordaRPCOps.stateMachinesFeed().updates
flowUpdates.subscribe {
    if (it.id == flowHandle.id && it is StateMachineUpdate.Removed) {
        val result = it.result.getOrThrow()
        // Handle the flow's result.
    }
}
Handling duplicate requests
The flow will throw an exception if you try to consume the same state twice, either when you query the vault to retrieve the state or when you try to notarise the transaction. I'd suggest letting the user start the flow again, then handling any double-spend errors appropriately and reflecting them on the front-end (e.g. via an error message and an automatic refresh).
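A hedged sketch of catching the double-spend on the REST side (NotaryException is Corda's real exception type for this; the HTTP response helpers are illustrative):
import net.corda.core.flows.NotaryException
import net.corda.core.utilities.getOrThrow

try {
    val stx = flowHandle.returnValue.getOrThrow()
    ok(stx.id.toString()) // hypothetical 200 response helper
} catch (e: NotaryException) {
    // An input state was already consumed: a competing update won the race.
    conflict("State was updated by another request; please refresh and retry.") // hypothetical 409 helper
}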

Related

Handling queries alongside subscription queries using Axon Server

I'm currently using Axon Framework alongside Axon Server for my event-based microservices application. Recently I encountered the need to wait for a saga to fully complete and return its execution result (success or error) to the client. I solved it by creating a subscription query before dispatching the command that triggers the saga; the query then waits for updates dispatched from the saga and returns the result to the client.
That worked a treat for reporting the saga completion status to the client, but now I've stumbled upon another, seemingly connected, problem. Every time a client queries our system's API, we perform an existence check on the client's account, by dispatching the corresponding query before we perform any business logic. Since I introduced the subscription query, when the client receives the response about the saga completion status they immediately send a query for an updated list of certain entities, and the query that checks for the account's existence fails with org.axonframework.queryhandling.NoHandlerForQueryException: No handler for query: ..., which is returned by Axon Server, despite the fact that there definitely is a handler registered for it and it handled exactly the same query during the client's previous request. This started happening after I added the inner subscription query mechanism to the equation.
This error disappears if we repeat the exact same query a bit later, or put a delay of a couple of hundred milliseconds between the calls, but that's certainly not a solution: what if our clients start to send loads of requests simultaneously, what will happen to the account-checking query? Are we unable to process some types of query while the subscription is not closed? I close the subscription in doFinally of the Mono returned from SubscriptionQueryResult, but is there a chance that it doesn't actually get closed in Axon Server before the next query arrives? Or, which I think is closer to the truth, do I need to somehow tune the query-handling capacity of Axon Server? The documentation is rather concise on this topic, IMHO, especially concerning queries as opposed to events/commands.
From the sound of it, you've hit a bug, @Ivan. I assume you are on Axon Server SE? It wouldn't hurt to open an issue there describing the problem at hand.
As far as I know, the Subscription Query does not impact the capability to send other queries. So, it is not like a Query Handler suddenly unregisters once a subscription query is active. Or, it definitely shouldn't.
By the way, would you be able to share the versions of Axon Framework and Axon Server you are using? That helps with deducing the issue but also helps others that might hit the same problem.
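For reference, the subscription-query-before-command pattern described in the question, with the subscription explicitly closed, might look roughly like this in Kotlin (the query/command/status types are illustrative stand-ins):
import org.axonframework.commandhandling.gateway.CommandGateway
import org.axonframework.queryhandling.QueryGateway
import reactor.core.publisher.Mono

data class SagaStatusQuery(val sagaId: String)   // illustrative
data class StartSagaCommand(val sagaId: String)  // illustrative
enum class SagaStatus { SUCCEEDED, FAILED }      // illustrative

fun awaitSagaOutcome(queryGateway: QueryGateway, commandGateway: CommandGateway, sagaId: String): Mono<SagaStatus> {
    // Subscribe before dispatching, so the completion update cannot be missed.
    val subscription = queryGateway.subscriptionQuery(
        SagaStatusQuery(sagaId),
        SagaStatus::class.java, // initial response type
        SagaStatus::class.java  // update type
    )
    commandGateway.send<Any>(StartSagaCommand(sagaId))
    return subscription.updates()
        .next()                             // first update signals saga completion
        .doFinally { subscription.close() } // always release the server-side subscription
}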

How to handle client view synchronization with signal r when a client gets offline for a short period of time and some messages are lost?

I am using SignalR in my web api to provide real-time functionality to my client apps (mobile and web). Everything works ok but there is something that worries me a bit:
The clients get updated when different things happen in the backend. For example, when one of the clients performs a CRUD operation on a resource, the others are notified via SignalR. But what happens when something happens on the client side, say in the mobile app, and the device's data connection drops?
It could happen that another client performs an action on a resource and, when SignalR broadcasts the message, it doesn't reach that client. That client will then have an old view state.
From what I have read, it seems there is no way to know whether a message has been sent and received correctly by all clients. So, besides checking the network state and doing a full reload of the resource list when this happens, is there any way to be sure message synchronization has been accomplished correctly on all clients?
As you've suggested, ASP.NET Core SignalR places the responsibility for message buffering, if that's required, on the application.
If an eventually consistent view is an issue (because order of operations is important, for example) and the full reload proves to be an expensive operation, you could manage some persistent queue of message events as far back as it makes sense to do so (until a full reload would be preferable) and take a page from message buses and event sourcing, with an onus on the client in a "dumb broker/smart consumer"-style approach.
It's not an exact match of your case, but credit where credit is due, there's a well thought out example of handling queuing up SignalR events here: https://stackoverflow.com/a/56984518/13374279 You'd have to adapt that some and give a numerical order to the queued events.
The initial state load and any subsequent events could have an aggregate version attached to them. At any time that the client receives an event from SignalR, it can compare its currently known state against what was received and determine whether it has missed events, be it from a disconnection or from a delay in the hub connection starting up after the initial fetch. If the client's version is out of date and within the depth of your queue, you can issue a request to the server to replay the events out to that connection to bring the client back up to sync.
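A minimal, transport-agnostic sketch of that client-side gap check (the event shape and the replay request are illustrative):
// Each event carries the aggregate's version; the client detects gaps
// and asks the server to replay the missed range.
data class VersionedEvent(val version: Long, val payload: String)

class SyncTracker(
    private var knownVersion: Long,
    private val requestReplay: (fromVersion: Long) -> Unit
) {
    fun onEvent(event: VersionedEvent) {
        when {
            event.version <= knownVersion -> Unit    // duplicate or stale; ignore
            event.version == knownVersion + 1L -> {  // contiguous; apply it
                applyToView(event)
                knownVersion = event.version
            }
            else -> requestReplay(knownVersion + 1)  // gap detected: events were missed
        }
    }
    private fun applyToView(event: VersionedEvent) { /* update the local view state */ }
}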
Some reading into immediate consistency vs eventual consistency may be helpful to come up with a plan. Hope this helps!

Corda: Making HTTP request within a responder flow?

Is it okay to make HTTP requests to a counter party's external service from within a responder flow?
My use case is that a party invokes a "request-token" flow with an exchange node. That exchange node makes an HTTP request (in the responder flow) to move cash from that party's account to an exchange account in the external payment system. The event of the funds actually hitting the account, and hence the issuance of the tokens, would happen in another flow.
If it is not okay, what may be an alternative design to achieve the task?
It is not always a good idea to make an HTTP request in that way, unless you think very carefully about what happens when the previous checkpoint is replayed: dedupe and idempotence are key considerations. Consider also what happens if the target is down, and note that blocking calls may exhaust the thread pool on which the fibers operate.
Flows run on fibers; CordaServices can spawn their own threads.
Threads can block on I/O; fibers can only do so for short periods, and we make no guarantees about freeing resources, or about ordering, except for the DB. Threads can also register observables.
The real challenge is restartability, and for that you need to test the hell out of your code with random kills.
You need to be aware that steps can be replayed in the event of a crash; this is true of any server-side work-based system that restarts work.
Effectively, you should:
Step 1) Execute an on-ledger Corda transaction to move one or more assets into a locked state (analogous to an XA 'prepare'). When it is successfully notarised,
Step 2) Execute the off-ledger transaction with an idempotent call that succeeds or fails. Once you know whether it succeeded or failed,
Step 3) Execute a second Corda transaction that either reverts the status of the asset or moves it to its intended final state.
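A hedged sketch of a flow following those three steps (LockAssetFlow, SettleOrRevertFlow and PaymentService are illustrative names, not Corda APIs):
import co.paralleluniverse.fibers.Suspendable
import net.corda.core.contracts.UniqueIdentifier
import net.corda.core.flows.FlowLogic
import net.corda.core.flows.StartableByRPC

@StartableByRPC
class ExchangeFlow(private val assetId: UniqueIdentifier) : FlowLogic<Unit>() {
    @Suspendable
    override fun call() {
        // Step 1: notarised on-ledger transaction moving the asset into a locked state.
        val lockTx = subFlow(LockAssetFlow(assetId))
        // Step 2: off-ledger call. It must be idempotent, because after a crash the
        // flow resumes from the last checkpoint and may replay this step.
        val succeeded = serviceHub.cordaService(PaymentService::class.java)
            .transferIdempotent(requestId = lockTx.id.toString())
        // Step 3: second on-ledger transaction that finalises or reverts the lock.
        subFlow(SettleOrRevertFlow(assetId, succeeded))
    }
}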

Strategy for sending delayed pushes

I have an application that should notify the user based on an interval pattern like:
Event
> Pushes
Pattern: immediately - 3 days - 7 days - 12 days
If the user acts on an event, pushes for that event should stop. There can be multiple events of the same type, each of which should send a push when it occurs.
I also do not want to bother the user: for example, when they have 5 events, rather than sending 5x as many pushes I would group the pushes due within the next day (or some other interval) into a single push, e.g. 'Reminder: you have 5 events'.
So for now I have settled on this solution: when an event occurs, insert into the DB all the pushes for that event that should be sent later, each with its send datetime. If the user takes action, the pushes for that event are marked as redundant. Before sending, analyse the interval, e.g. take all pushes due in the next 24 hours, send one, and mark the others as already sent.
Is this OK, or are there better solutions?
I have experience building a similar application. What I do is:
CMS -> Redis -> worker
CMS: used for creating the push notification content, including the time at which that content should be sent.
Redis: used for storing the delayed job data.
Worker: a PHP application that pulls delayed job data from Redis. I use Laravel here, taking advantage of Laravel's delayed queue dispatching.
Previously I tried using a database, and then the SQS message broker, as the queue driver. Why did I switch to Redis? First, using the database was too costly, because the volume of my queue data is very large. SQS was better than the database, but it cannot hold delayed data that is weeks old. So my final choice was Redis. Of course, another service such as RabbitMQ could be used instead.
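The answer's stack is PHP/Laravel, but the Redis part of the idea is simple enough to sketch in Kotlin with Jedis: score a sorted set by send time, then have the worker poll for due entries (the key name and function shapes are illustrative):
import redis.clients.jedis.Jedis

// Schedule a push by storing its id in a sorted set, scored by its send time.
fun schedulePush(jedis: Jedis, pushId: String, sendAtEpochSec: Long) {
    jedis.zadd("delayed:pushes", sendAtEpochSec.toDouble(), pushId)
}

// Worker loop body: fetch everything due by now, remove it, and hand it off for sending.
fun popDuePushes(jedis: Jedis, nowEpochSec: Long): List<String> {
    val due = jedis.zrangeByScore("delayed:pushes", 0.0, nowEpochSec.toDouble()).toList()
    due.forEach { jedis.zrem("delayed:pushes", it) }
    return due
}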

How to force the current message to be suspended and be retried later on from within a custom BizTalk **send** pipeline component?

Here is my scenario. BizTalk needs to transfer a file from a shared/central document library. First BizTalk receives an incoming message with a reference/path to this document in the library. Then it simply needs to read it out from this library and send it (potentially through different adapters). This is, in essence, a scenario not so remote from the Claim Check EAI pattern.
Some ways to implement a claim check have been documented, notably BizTalk ESB Toolkit Claim Check, and BizTalk 2009: Dealing with Extremely Large Messages, Part I & Part II. These implementations do, however, assume that the send pipeline can immediately read the stream that has been "checked in".
That is not my case: the document will take some time before it is available in the shared library, and I cannot delay the initial received message. That leaves me with 2 options: either introduce some delay via an orchestration or ensure the send port will retry later if the document is not there yet.
(A delay can only be introduced via an orchestration; there are no time-based subscriptions in BizTalk. Right?)
Since this is a message-only flow, I figured I could skip the orchestration. I have seen ways of implementing "Custom Retry Logic in Message Only Solution Using Pipeline", but what I need is not only a way to control the retry behavior (as performed by the adapter) but also to enforce it right from within the pipeline…
Every attempt I made so far just ended up with a suspended message that won’t be automatically retried even though the send adapter had retry configured… If this is indeed possible, then where/what should I do?
Oh right… and there is queuing… but unfortunately neither on premises nor in the cloud ;)
OK I may be pushing the limits… but just out of curiosity…
Many thanks for your help and suggestions!
I'm puzzled as to how this could be done without an orchestration. The only way I can think of would be along the lines of:
The receive port for the initial messages just 'eats' them, e.g. by subscribing them to a dummy send port with the Null Adapter and ignoring them totally.
You monitor the shared document library with a receive port, looking for any new document there.
Any located documents are subscribed to by a send port and sent downstream.
An orchestration-based approach would be along the lines of:
The orch is triggered by receipt of the initial notification of an 'upcoming' new file for the library. If the initial notification is request-response (e.g. an exposed web service), you can immediately and synchronously issue the response.
Another receive port is used to monitor the availability of, and retrieve, the file from the shared library, correlating it to the original notification message (e.g. by filename or another key).
A mechanism to handle the retry if the document isn't available, and potentially an eventual timeout, e.g. if the document never makes it to the shared library.
And on success, a send port to then send the document downstream
Placing the delay shape in the orch will offer more scalability than e.g. using Thread.Sleep() or similar in custom adapter or pipeline code, since BTS just calculates and stamps the 'awaken' timestamp on the SQL record and can then dehydrate the orch, freeing up the thread.
The 'is the file there yet?' check can be done with a retry loop, delaying after each failed check, with a parallel branch with a timeout e.g. after an hour or so.
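Outside BizTalk, the control flow of that retry loop with a parallel timeout branch is easy to make concrete; a rough Kotlin-coroutines equivalent, with fileAvailable as a hypothetical availability check:
import kotlinx.coroutines.delay
import kotlinx.coroutines.withTimeoutOrNull

// Illustrative only: a retry loop racing against an overall timeout,
// mirroring the orchestration's delay shape plus parallel timeout branch.
suspend fun awaitDocument(fileAvailable: suspend () -> Boolean): Boolean =
    withTimeoutOrNull(60 * 60 * 1000L) { // overall timeout, e.g. one hour
        while (!fileAvailable()) {
            delay(30_000L)               // delay after each failed check
        }
        true
    } ?: false                           // timed out: the document never arrived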
The polling interval can be controlled in the receive location, so I do not understand what you mean by there being no time-based subscriptions in BizTalk. You also have a schedule window.
One way to introduce a delay is to send the initial message to an internal web service, which simply posts the message back to BizTalk after a specified time interval.
There are also loopback adapters, which simply post the message back into the MessageBox. These can be amended to add a delay.
