A Queue is a collection for holding elements prior to processing - collections

The Javadoc says a Queue is a collection for holding elements prior to processing. But every collection needs some data inserted before it can be processed, so why is this mentioned only for the Queue interface? ArrayList, LinkedList and the rest also need data inserted before the collection is processed. Can anyone help me with this?

I think it depends on the requirement for how the data should be processed.
Let's say you have a queue at a cinema for buying tickets. The tickets should be allocated to people in the order they arrived, so in this case a QUEUE is the preferred data structure as it maintains FIFO (First In, First Out) order.
But in some other scenario you might want data to be processed in order of "priority"; in that case a plain QUEUE might not come in handy, and you would want some priority-based ordering in the data structure before processing it.
So there are different data structures that handle data in different ways depending on the requirement.
You can read up on different data structures, their processing and data storage, to find which one best suits your need.
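To make the contrast concrete, here is a minimal Java sketch (the ticket-buyer names are just illustrative) showing a FIFO queue versus a PriorityQueue, which hands elements out by ordering rather than arrival order:

    import java.util.ArrayDeque;
    import java.util.PriorityQueue;
    import java.util.Queue;

    public class QueueOrderDemo {
        public static void main(String[] args) {
            // FIFO: elements come out in arrival order, like a ticket line.
            Queue<String> ticketLine = new ArrayDeque<>();
            ticketLine.offer("Alice");
            ticketLine.offer("Bob");
            ticketLine.offer("Carol");
            System.out.println(ticketLine.poll()); // Alice (first in, first out)

            // Priority: elements come out by their ordering, not arrival order.
            Queue<Integer> byPriority = new PriorityQueue<>();
            byPriority.offer(3);
            byPriority.offer(1);
            byPriority.offer(2);
            System.out.println(byPriority.poll()); // 1 (smallest first)
        }
    }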

A queue is a waiting line. The Queue interface was added to the Java Collections Framework with Java 5.0. Reflecting that waiting-line idea (think of a print queue), the Javadoc states that a Queue is a collection for holding elements prior to processing.

Think of an element of the queue as being in one of two states: "on hold" (OH) and "ready for processing" (RFP). The head of the queue is in the RFP state and all other elements are in the OH state. In other words, the queue holds its OH elements prior to their becoming the RFP element (the head). Once an element becomes RFP it can be popped off and processed.
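In Java terms (a minimal sketch), peek() shows the RFP element without removing it, while poll() removes the head for processing; everything else stays on hold until it reaches the head:

    import java.util.ArrayDeque;
    import java.util.Queue;

    public class HeadDemo {
        public static void main(String[] args) {
            Queue<String> jobs = new ArrayDeque<>();
            jobs.offer("job-1");
            jobs.offer("job-2");
            jobs.offer("job-3");

            System.out.println(jobs.peek()); // "job-1" is RFP; job-2 and job-3 are on hold
            while (!jobs.isEmpty()) {
                String job = jobs.poll(); // remove the head and process it
                System.out.println("processing " + job);
            }
        }
    }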

Related

Two conflicting long lived process managers

Let's assume we have two long-lived process managers. Both sagas operate over, say, 10 million items. The first saga adds something to each item. The second saga removes it from each item. Given that each process manager needs a few minutes to complete its job, I get into trouble if I run them simultaneously.
Some of the items would end up holding the value while the rest would not. The result is actually close to random and depends on the order of the commands affecting each particular item. I wondered whether redispatching the "Remove" command on failure would solve the problem; that is, if you try to remove a non-existing value, you wait for the first saga to add it. But while the process managers are working, someone else may dispatch a "Remove" or "Add" command, in which case my approach would fail.
How can I solve such a problem? :)
It seems that you would want the second saga not to run while the first saga is running (and presumably not until whatever the first saga adds is actually there). So the apparent solution would be to have a component (it could be a microservice, or a record in a strongly consistent datastore like zookeeper/etcd/consul) that gives permission for the sagas to start executing. An example protocol might look like this:
Saga sends a message to the component identifying the saga and conveying the intention to start
Component validates that no sagas might be running which would prevent this saga from running
Component responds with permission to start running
Subsequent saga attempts result in rejection until the running saga tells the component it's OK to run the other saga
Assuming that this component is reliably durable, the failure mode to worry about is that permission is granted but this component never processes the message that the saga finished (causes of this could include the permission message not getting delivered/processed or the saga crashing). No amount of acknowledgements or extra messages can solve this (it's basically the Two Generals' Problem).
A mitigation is to have this component (or something watching this component) alert if it seems that too much time has passed without saga completion. Whatever/whoever is responsible for ensuring liveness would then investigate to see if the saga is still running and if none is running, inform the component that it's OK to run the other saga. Note that this is not foolproof: it's quite possible for the decider in question to make what turns out to be the wrong decision.
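A minimal sketch of such a permission component, assuming a single in-process coordinator (the SagaCoordinator name and the saga ids are hypothetical; a real deployment would back this with a strongly consistent store such as zookeeper/etcd/consul):

    import java.util.Optional;

    // Grants at most one saga permission to run at a time.
    public class SagaCoordinator {
        private String runningSaga; // null when no saga holds permission

        // A saga identifies itself and asks to start; returns true if permission is granted.
        public synchronized boolean requestStart(String sagaId) {
            if (runningSaga != null) {
                return false; // another saga is (or may be) running: reject
            }
            runningSaga = sagaId;
            return true;
        }

        // The running saga reports completion, allowing the other saga to start.
        public synchronized void reportFinished(String sagaId) {
            if (sagaId.equals(runningSaga)) {
                runningSaga = null;
            }
        }

        // For the liveness watcher: which saga, if any, currently holds permission?
        public synchronized Optional<String> currentlyRunning() {
            return Optional.ofNullable(runningSaga);
        }
    }

The "too much time has passed" alert described above would sit outside this class, periodically checking currentlyRunning() against a deadline and escalating to whoever is responsible for liveness.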
I feel like I need more context. Whilst you don't say it explicitly, is the problem that the second saga tries to remove values that haven't been added by the first?
If that was true, a simple solution would be to just use a third state.
What I mean by that is to define and declare item state more explicitly. You currently seem to have two states, "with value" and "without value", but nothing to indicate whether an item is ready to be processed by the second saga because the first saga has already done its work on the item in question.
So all that needs to happen is that the second saga keeps looking for items where:
(with_value == true & ready_for_saga2 == true)
Call it ready_for_saga2 or "Saga 1 processing complete", whichever seems more appropriate in your context.
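As a small illustrative sketch (the Item type and flag names are hypothetical), the second saga would simply filter on both flags:

    import java.util.List;
    import java.util.stream.Collectors;

    public class Saga2Selector {
        // Hypothetical item carrying the two flags described above.
        record Item(String id, boolean withValue, boolean readyForSaga2) {}

        // Items the second saga may process: value present AND saga 1 has finished with them.
        static List<Item> selectForSaga2(List<Item> items) {
            return items.stream()
                    .filter(i -> i.withValue() && i.readyForSaga2())
                    .collect(Collectors.toList());
        }
    }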
I'd say that the solution would vary based on which actual problem we're trying to solve.
Say it's an inventory, where "Add" means items are added to the inventory and "Remove" means items are requested for delivery. Then the order of commands does not matter that much, because you can simply process the request for delivery when new items are added to the inventory.
This would lead to an aggregate root with two collections: Items and PendingOrders.
One process manager adds new inventory to Items - if any orders are pending, it will complete these orders in the same transaction and remove both the item and the order from the collections.
If the other process manager adds an order (tries to remove an item), it will either do it right away, if there are any items left, or it will add the order to the pending orders to be processed when new items arrive (and maybe notify someone about the delay, while we're at it).
This way we end up with the same state regardless of the order of commands, but the actual real-world problem has great influence on the model chosen.
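A minimal sketch of that aggregate, with hypothetical Inventory, item and order identifiers (not tied to any particular framework):

    import java.util.ArrayDeque;
    import java.util.Deque;

    // Aggregate root holding both available items and orders waiting for stock.
    public class Inventory {
        private final Deque<String> items = new ArrayDeque<>();         // available item ids
        private final Deque<String> pendingOrders = new ArrayDeque<>(); // orders waiting for stock

        // "Add" command: new stock arrives; complete a pending order in the same step if one waits.
        public void addItem(String itemId) {
            if (!pendingOrders.isEmpty()) {
                ship(itemId, pendingOrders.poll());
            } else {
                items.add(itemId);
            }
        }

        // "Remove" command: an order arrives; fulfil it now or park it until stock arrives.
        public void requestItem(String orderId) {
            if (!items.isEmpty()) {
                ship(items.poll(), orderId);
            } else {
                pendingOrders.add(orderId); // and maybe notify someone about the delay
            }
        }

        private void ship(String itemId, String orderId) {
            System.out.println("Shipping item " + itemId + " for order " + orderId);
        }
    }

Either ordering of addItem/requestItem converges to the same end state, which is the point of the model.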
If we have other real-world problems, we can model those too.
Let's say you have two users that each starts a process that bulk updates titles on inventory items. In this case you - and the users - have to decide how best to resolve this conflict - what will lead to the best real world outcome.
If you want consistency across all the items - either all or none of the items should be updated by a single bulk update - I would embed this knowledge in a new model. Let's call it UpdateTitlesProcesses. We have only one instance of this model in the system, and its state is shared between processes. This model is effectively a command queue: when a user initiates the bulk operation, it adds all the commands to the queue and starts processing each item one at a time.
When the second user initiates another title update, the business logic in our models will reject this, as there's already another update started. Or if the experts say that the last write should win, then we ditch the remaining commands from the first process and add the new ones (and similarly we should decide what should happen if a user issues a single title update, not bulk - should it be rejected, prioritized or put on hold?).
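A rough sketch of that single shared command queue (UpdateTitlesProcesses and the reject-while-busy policy come from the description above; the types are otherwise hypothetical):

    import java.util.ArrayDeque;
    import java.util.List;
    import java.util.Queue;

    // One instance per system; serialises bulk title updates.
    public class UpdateTitlesProcesses {
        record UpdateTitleCommand(String itemId, String newTitle) {}

        private final Queue<UpdateTitleCommand> queue = new ArrayDeque<>();
        private String activeUser; // null when no bulk update is in progress

        // Reject a second bulk update while one is running (alternative policy: last write wins).
        public synchronized boolean startBulkUpdate(String userId, List<UpdateTitleCommand> commands) {
            if (activeUser != null) {
                return false; // business logic rejects the concurrent bulk update
            }
            activeUser = userId;
            queue.addAll(commands);
            return true;
        }

        // Called by a worker; hands out one command at a time.
        public synchronized UpdateTitleCommand nextCommand() {
            UpdateTitleCommand next = queue.poll();
            if (next == null) {
                activeUser = null; // bulk update finished
            }
            return next;
        }
    }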
So in short I'd say:
Make it clear which real world problem we are solving - and thus which conflict resolution outcome is best (probably a trade off, often also something that requires user interaction or notification).
Model this explicitly (where processes, actions and conflict handling are also part of the model).

Ordering of DynamoDB stream for transaction write operation

I have a DynamoDB transaction which appends more than one record at a time to a single DynamoDB table using transactWrite. For example, in a single transaction I can append records A, B, and C. Note that in my case the operations are always append-only (inserts only).
The records are then passed to a DynamoDB stream and on to a lambda for processing. However, the lambda sometimes receives the events out of order. I think I understand that behavior: from DynamoDB's point of view, all 3 records were written at the same timestamp, so there is no ordering between them. But if these events arrive as part of the same batch, I can always reorder them in the lambda before processing.
However, that is where the problem is. Even though these records are written in a single transaction, they don't always appear together in the same batch in the lambda. Sometimes I receive C as the only event and then A and B arrive in a later batch. I think that behavior is somewhat reasonable. Is there a way to guarantee that I receive all the records written in a transaction in one single batch?
Your items may be written in a single transaction, but each item could land in a separate stream shard. Streams are made up of shards, so even though the items are written at the same time, they can end up on different stream shards. Ordering is per item, not across the whole keyspace: "For each item that is modified in a DynamoDB table, the stream records appear in the same sequence as the actual modifications to the item." Ordered updates to each individual item are guaranteed, but if you need consistency across all updates in the keyspace, that has to be designed on the reader side.
All that said, I wonder if there is an opportunity to denormalize these three items into one item on the base table and skip using TransactWriteItem altogether.
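For illustration, a hedged sketch of that denormalization with the AWS SDK for Java v2 (the table and attribute names are made up): the A, B and C records become one item, so a single PutItem replaces the transaction and the stream emits a single record carrying all three.

    import java.util.List;
    import java.util.Map;
    import software.amazon.awssdk.services.dynamodb.DynamoDbClient;
    import software.amazon.awssdk.services.dynamodb.model.AttributeValue;
    import software.amazon.awssdk.services.dynamodb.model.PutItemRequest;

    public class DenormalizedWrite {
        public static void main(String[] args) {
            DynamoDbClient ddb = DynamoDbClient.create();

            // One item carrying all three appended records as a list attribute.
            Map<String, AttributeValue> item = Map.of(
                    "pk", AttributeValue.builder().s("order#123").build(),
                    "records", AttributeValue.builder().l(List.of(
                            AttributeValue.builder().s("A").build(),
                            AttributeValue.builder().s("B").build(),
                            AttributeValue.builder().s("C").build())).build());

            // A single PutItem produces a single stream record, so the lambda
            // sees A, B and C together and in order.
            ddb.putItem(PutItemRequest.builder()
                    .tableName("my-table")
                    .item(item)
                    .build());
        }
    }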

Whats the best way to generate ledger change Events that include the Transaction Command?

The goal is to generate events on every participating node when a state is changed that includes the business action that caused the change. In our case, Business Action maps to the Transaction command and provides the business intent or what the user is doing in business terms. So in our case, where we are modelling the lifecycle of a loan, an action might be to "Close" the loan.
We model Event at a state level as follows: Each Event encapsulates a Transaction Command and is uniquely identified by a (TxnHash, OutputIndex) and a created/consumed status.
We would prefer a polling mechanism to generate events on demand, but an async approach to generate events on ledger changes would be acceptable. Either way, our challenge is in getting the Command from the Transaction.
We considered querying the States using the Vault Query API vaultQueryBy() for the polling solution (or vaultTrackBy() for the async Observable Stream solution). We were able to create a flow that gets the transaction for a state. This had to be done in a flow, as Corda deprecated the function that would have allowed us to do this in our Spring Boot client. In the client we use vaultQueryBy() to get a list of States. Then we call a flow that iterates over the states, gets the txHash from each StateRef and calls serviceHub.validatedTransactions.getTransaction(txHash) to get the SignedTransaction, from which we can ultimately retrieve the Command. Is this the best or recommended approach?
Alternatively, we have also thought of generating events off the Transaction by querying for transactions and then building the Event for each input and output state in the transaction. If we go this route, what's the best way to query transactions from the vault? Is there an Observable Stream-based option?
I assume this mapping of states to command is a common requirement for observers of the ledger because it is standard to drive contract logic off the transaction command and quite natural to have the command map to the user intent.
What is the best way to generate events that encapsulate the transaction command for each state created or consumed on the ledger?
If I understand correctly, you're attempting to get notified when certain types of ledger updates occur (open, approved, closed, etc.).
First: Asynchronous notifications are best practice in Corda, polling should be avoided due to the added weight it puts on the node for constant querying and delays. Corda provides several mechanisms for Observables which you can use: https://docs.corda.net/api/kotlin/corda/net.corda.core.messaging/-corda-r-p-c-ops/vault-track-by.html
Second: Avoid querying transactions from the database as these are intended to be internal to the node. See this answer for background on why to avoid transaction querying. In general only tables that begin with "VAULT_*" are intended to be queried.
One way to solve your use case would be a "status" field which reflects the command that was used to produce the current state. For example: if a "Close" command was used to produce the state, its status field could be "closed". This way you could use the above vaultTrackBy to look at each state's status field and infer the action that occurred.
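A minimal Java sketch of that approach over Corda RPC (LoanState and its status field are hypothetical; RPC setup and error handling are omitted):

    import net.corda.core.messaging.CordaRPCOps;
    import net.corda.core.messaging.DataFeed;
    import net.corda.core.node.services.Vault;

    public class LoanEventListener {
        // 'proxy' is an already-connected CordaRPCOps instance (e.g. from CordaRPCClient.start(...)).
        public static void listen(CordaRPCOps proxy) {
            DataFeed<Vault.Page<LoanState>, Vault.Update<LoanState>> feed =
                    proxy.vaultTrack(LoanState.class);

            // Each update carries the consumed and produced states of a ledger transaction.
            feed.getUpdates().subscribe(update ->
                    update.getProduced().forEach(stateAndRef -> {
                        LoanState loan = stateAndRef.getState().getData();
                        // The status field mirrors the command that produced the state ("closed", etc.).
                        System.out.println("Loan " + loan.getLinearId() + " is now " + loan.getStatus());
                    }));
        }
    }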
Just to finish up on my comment: while the approach met the requirements, the problem with this solution is that we have to add and maintain our own code across all relevant states to capture transaction-level information that is already tracked by the platform. I would think a better solution would be for the platform to give consumers access to transaction-level information (selectively, perhaps), just as it does for states. After all, the transaction is, in part, a business/functional construct that is meaningful at the client application level. For example, if I am "transferring" a loan, that may be a complex business transaction that involves many input and output states and may be an important construct/notion for the client application to manage.

Can I create a flow to migrate state to new version instead of using contract upgrade?

When I upgrade a contract there are 2 steps, Authorise and Initiate. Since (as I understand it) both steps need to be done state by state, it takes a very long time when I have a large amount of data.
I ended up calling the API in a loop to query batches of data and then calling ContractUpgradeFlow in a loop, one state at a time.
The result is that it took more than 11 hours and still did not finish upgrading.
So the question is: what if I create a flow A that queries a list of StateV1 as input and creates a list of StateV2 as output?
Would it reduce the work of the contract upgrade?
Should it be faster?
Would this be considered the same result as a contract upgrade?
Would there be any effect on the next contract upgrade of StateV2 if I then want to use the Corda contract upgrade instead of flow A?
Yes, correct: with an explicit upgrade, if there is a lot of data it is going to take time, as a lot happens behind the scenes.
Each and every unconsumed state is taken, a new transaction is created, the old state with the old contract and the new state with the new contract are added to this transaction, the transaction is sent to each signer for signing, the appropriate constraints are set, and finally the fully signed transaction is sent to the notary.
"So the question is: what if I create a flow A that queries a list of StateV1 as input and creates a list of StateV2 as output?"
Yes, you can very well create a flow that queries a list of StateV1 as input and creates a list of StateV2 as output, but keep in mind that you will also have to take care of all the steps mentioned above, which are currently handled by ContractUpgradeFlow.
"Would it reduce the work of the contract upgrade?"
No, I don't think so, as you will have to handle all the steps mentioned above, which are currently handled by ContractUpgradeFlow.
"Should it be faster?"
No, it will take roughly the same time as ContractUpgradeFlow.
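For illustration, a minimal sketch of what such a flow A might look like in Java. StateV1, StateV2, StateV2Contract and its Migrate command are hypothetical placeholders, and signature collection from other participants is glossed over; it mainly shows that you end up re-implementing the steps ContractUpgradeFlow already performs:

    import co.paralleluniverse.fibers.Suspendable;
    import java.util.Collections;
    import net.corda.core.contracts.StateAndRef;
    import net.corda.core.flows.FinalityFlow;
    import net.corda.core.flows.FlowException;
    import net.corda.core.flows.FlowLogic;
    import net.corda.core.flows.InitiatingFlow;
    import net.corda.core.flows.StartableByRPC;
    import net.corda.core.identity.Party;
    import net.corda.core.transactions.SignedTransaction;
    import net.corda.core.transactions.TransactionBuilder;

    @InitiatingFlow
    @StartableByRPC
    public class MigrateToV2Flow extends FlowLogic<SignedTransaction> {
        private final StateAndRef<StateV1> input;

        public MigrateToV2Flow(StateAndRef<StateV1> input) {
            this.input = input;
        }

        @Suspendable
        @Override
        public SignedTransaction call() throws FlowException {
            Party notary = input.getState().getNotary();
            StateV2 output = StateV2.fromV1(input.getState().getData()); // hypothetical conversion

            // Old state in, new state out - the same shape of transaction ContractUpgradeFlow builds.
            TransactionBuilder builder = new TransactionBuilder(notary)
                    .addInputState(input)
                    .addOutputState(output, StateV2Contract.ID)
                    .addCommand(new StateV2Contract.Commands.Migrate(),
                            getOurIdentity().getOwningKey());

            builder.verify(getServiceHub());
            SignedTransaction signed = getServiceHub().signInitialTransaction(builder);

            // Assumes we are the only participant; a real flow would collect the other
            // required signatures and open flow sessions before finalising.
            return subFlow(new FinalityFlow(signed, Collections.emptyList()));
        }
    }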

Traversing the Ledger

I have a cordapp set up that uploads an attachment with each transaction. The attachment is a zipped file of a list of unique identifiers related to the tx. I am trying to implement logic that forbids the same unique identifier to appear again in a subsequent transaction. Let's say I have an initial tx with an attachment listing A,B,C,D,E and it passes. Then I have Tx 2a with attachment F,G,H and Tx 2b with attachment C,F,G,H. I would want 2a to be accepted but 2b to be rejected.
I'm trying to figure out the best way to store and query the history of identifiers. I know that the attachment will be saved to the tx history, but traversing the ledger and opening/reading all attachments to ensure there are no duplicates seems extremely intensive as we scale (the attachments are more likely to list thousands of unique identifiers rather than 5).
Is it practical to create a table on the db - perhaps even the off-ledger portion of the vault - that just contains all of the ids that have been used? The node responsible for checking redundancy could read the incoming attachment, query the table, check redundancy, sign the tx, and then insert the new ids into the table? Or is there something better we can do that involves actually traversing the ledger?
Thank you
Assuming there are not millions of identifiers, and if you don't mind all of the past identifiers being carried in the current version of the state, then you can accumulate them inside the state, in a Set. The Set ensures there are no dupes. The benefit of this approach is that you can then perform the checking logic inside the contract.
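A rough sketch of that contract check in Java (TrackerState with its usedIds set is hypothetical, and parsing the zipped attachment is elided):

    import static net.corda.core.contracts.ContractsDSL.requireThat;

    import java.util.Collections;
    import java.util.HashSet;
    import java.util.Set;
    import net.corda.core.contracts.Contract;
    import net.corda.core.transactions.LedgerTransaction;

    public class TrackerContract implements Contract {
        @Override
        public void verify(LedgerTransaction tx) {
            // TrackerState is a hypothetical state carrying a Set<String> of used identifiers.
            TrackerState in = tx.inputsOfType(TrackerState.class).get(0);
            TrackerState out = tx.outputsOfType(TrackerState.class).get(0);

            // The identifiers listed in this transaction's attachment.
            Set<String> newIds = parseIdsFromAttachment(tx);

            Set<String> expected = new HashSet<>(in.getUsedIds());
            expected.addAll(newIds);

            requireThat(req -> {
                req.using("an identifier in the attachment has already been used",
                        Collections.disjoint(newIds, in.getUsedIds()));
                req.using("the output state must accumulate the old ids plus the new ones",
                        out.getUsedIds().equals(expected));
                return null;
            });
        }

        // Hypothetical helper: read the zipped attachment and extract its identifiers.
        private static Set<String> parseIdsFromAttachment(LedgerTransaction tx) {
            throw new UnsupportedOperationException("attachment parsing elided in this sketch");
        }
    }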
If you don't care about performing these checks inside the contract then you can do one of the approaches you suggested:
"traversing the ledger" is really just performing a bunch of inefficient database queries queries as you rightly note
the other approach you suggested seems like a good idea. Keep an off-ledger DB table with the identifiers in. Currently working on a feature to make this much easier. In the meantime you can use ServiceHub.jdbcConnection to execute queries against the DB.
Which one you choose really depends on other aspects of your use-case.
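A hedged sketch of that off-ledger table approach, using the JDBC session the ServiceHub exposes inside a flow or service (in current Corda versions the call is serviceHub.jdbcSession(); the USED_IDENTIFIERS table is made up and would need to be created separately):

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.util.Set;
    import net.corda.core.node.ServiceHub;

    public class UsedIdStore {
        private final ServiceHub serviceHub;

        public UsedIdStore(ServiceHub serviceHub) {
            this.serviceHub = serviceHub;
        }

        // True if any of the incoming identifiers has been seen before.
        public boolean anyAlreadyUsed(Set<String> ids) throws SQLException {
            Connection conn = serviceHub.jdbcSession();
            try (PreparedStatement stmt =
                         conn.prepareStatement("SELECT 1 FROM USED_IDENTIFIERS WHERE ID = ?")) {
                for (String id : ids) {
                    stmt.setString(1, id);
                    try (ResultSet rs = stmt.executeQuery()) {
                        if (rs.next()) {
                            return true;
                        }
                    }
                }
            }
            return false;
        }

        // Record the identifiers once the transaction has been accepted.
        public void recordUsed(Set<String> ids) throws SQLException {
            Connection conn = serviceHub.jdbcSession();
            try (PreparedStatement stmt =
                         conn.prepareStatement("INSERT INTO USED_IDENTIFIERS (ID) VALUES (?)")) {
                for (String id : ids) {
                    stmt.setString(1, id);
                    stmt.executeUpdate();
                }
            }
        }
    }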
One thing you could try is maintaining a Bloom filter inside your state object. That way you get a space-efficient data structure and quick set-membership checks. You'll have to update the filter each time an identifier is added. Could be something to look at.
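As a small illustration with Guava's BloomFilter (the sizing numbers are arbitrary; remember a Bloom filter can return false positives, so a hit should be double-checked against an authoritative store):

    import com.google.common.hash.BloomFilter;
    import com.google.common.hash.Funnels;
    import java.nio.charset.StandardCharsets;

    public class IdentifierFilterDemo {
        public static void main(String[] args) {
            // Expect up to 1,000,000 identifiers with a ~1% false-positive rate.
            BloomFilter<String> seen = BloomFilter.create(
                    Funnels.stringFunnel(StandardCharsets.UTF_8), 1_000_000, 0.01);

            seen.put("A");
            seen.put("B");

            System.out.println(seen.mightContain("A")); // true
            System.out.println(seen.mightContain("F")); // almost certainly false
        }
    }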
Cheers

Resources