How to update PersistentState when LinearState is updated - corda

QueryableState.generateMappedObject defines the data to be inserted when a state is created. However, I want to update the table defined by the PersistentState when the LinearState changes. How can I keep the PersistentState table data consistent with the ContractState vault data?

As far as I know, you can't actually do this.
I'd be curious about the use case, as I've never heard of a situation where someone wanted a single state to be both changing and unchanging.
Take a look at this question on the distinctions between the states as that may also be helpful for you.
Corda - Persistence State V/S Linear State

Related

Debezium initial data snapshot and related entities order

After the first launch Debezium will do initial data snapshot of the already existing data.
Let's say I have two tables, A and B. Table B has a NOT NULL FK constraint referencing A. By default, Debezium will create two separate Kafka topics for the data from tables A and B.
As I understand it, there is a good chance that I will try to create a record in the new table B before the corresponding record exists in the new table A, and so run into a constraint violation error.
Do I need to use some internal or third-party buffer and organize the order of inserts into the sink database myself, or is there a standard mechanism in Debezium for handling such situations?
For example, can I use Debezium Topic Routing https://debezium.io/documentation/reference/configuration/topic-routing.html to fix this issue? I could configure Topic Routing to send all dependent events (from tables A and B in my example above) to the same topic. With a single-partition Kafka topic, all events should be ordered correctly. Will that work, and will it give me the correct order of related entities for the initial snapshot data load?
The IBM IDR (Data Replication) Product solved this with a solution that allows for exactly once semantics and re-creates the ordering of operations within a transaction and ordering of transactions.
Kafka's built-in exactly-once features have some limitations beyond performance: you don't inherently get the transaction re-ordered by operation, which is important for things like applying changes under referential integrity constraints.
So in our product we have a proper and a poor man's way to solve the problem. The poor man's is to send all the data for all the tables to a single topic. Obviously this is sub-optimal, but our product will produce data in operation order from a single producer if you do this. You'd probably want idempotence to avoid batches showing up out of order.
Now the pro-level way to solve this is a feature called the TCC (Transactionally Consistent Consumer).
I'm not sure whether you need an enterprise-level solution, performance- and feature-wise.
If this is a non-critical project, you might find the following discussion useful for seeing how we approach delivering the features you're looking for.
https://www.confluent.io/kafka-summit-sf18/a-solution-for-leveraging-kafka-to-provide-end-to-end-acid-transactions/
And here's our docs on the feature for reference.
https://www.ibm.com/support/knowledgecenter/en/SSTRGZ_11.4.0/com.ibm.cdcdoc.cdckafka.doc/concepts/kafkatcc.html
That should give background as to why this problem is hard to solve and what goes into a solution hopefully.
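For the do-it-yourself buffering mentioned in the question, a minimal sketch of the idea might look like the following. It is not a Debezium or IBM feature; the row shapes and the function name orderForInsert are hypothetical, and it assumes all events for A and B arrive in one stream on the consumer side. Child rows are held back until their parent has been seen.

```typescript
// Hypothetical row shapes: table A is the parent, table B has a NOT NULL FK to A.
interface ParentRow { table: "A"; id: string; }
interface ChildRow { table: "B"; id: string; parentId: string; }
type SnapshotEvent = ParentRow | ChildRow;

// Returns events in an order that is safe to insert: every child after its parent.
// Children whose parent never shows up are reported separately as orphans.
function orderForInsert(events: SnapshotEvent[]): { ordered: SnapshotEvent[]; orphans: ChildRow[] } {
  const seenParents: Record<string, boolean> = {};
  const waiting: Record<string, ChildRow[]> = {};
  const ordered: SnapshotEvent[] = [];

  events.forEach((event) => {
    if (event.table === "A") {
      seenParents[event.id] = true;
      ordered.push(event);
      // Release any children that were buffered while waiting for this parent.
      (waiting[event.id] || []).forEach((child) => ordered.push(child));
      delete waiting[event.id];
    } else if (seenParents[event.parentId]) {
      ordered.push(event);
    } else {
      // Parent not seen yet: hold the child back instead of violating the FK.
      const queue = waiting[event.parentId] || [];
      queue.push(event);
      waiting[event.parentId] = queue;
    }
  });

  const orphans: ChildRow[] = [];
  Object.keys(waiting).forEach((parentId) => {
    (waiting[parentId] || []).forEach((child) => orphans.push(child));
  });
  return { ordered: ordered, orphans: orphans };
}
```

A real sink would also have to bound the buffer and decide what to do with orphans; this sketch only shows the ordering step.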

Can i use TrackBy to track a particular state in Corda Vault?

https://docs.corda.net/api-vault-query.html specifies that "TrackBy updates do not take into account the full criteria specification due to different and more restrictive syntax in observables filtering (vs full SQL-92 JDBC filtering as used in snapshot views). Specifically, dynamic updates are filtered by contractStateType and stateType (UNCONSUMED, CONSUMED, ALL) only"
Does that mean that I cannot track a particular record (state) in my vault based on its properties other than stateType?
This is what I have noticed as well. I used a LinearStateQueryCriteria based on externalId, but instead of updates for that one record, I got updates for all the records of the particular contractStateType.
Looking for confirmation so I can try another strategy.
To achieve what you want, you could use trackBy to monitor the state type you need, and then filter inside the observable so that only states with the externalId you want are included. This might not be ideal, but it will achieve the goal you are looking for.
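The filtering step can be sketched independently of the Corda API. The types below (LinearStateLike, VaultUpdateLike) are simplified, hypothetical stand-ins and not Corda classes; they only assume that each update carries a set of consumed and a set of produced states. The helper onlyFor is likewise an invented name.

```typescript
// Hypothetical, simplified shapes; these are NOT Corda types.
interface LinearStateLike { externalId: string | null; linearId: string; }
interface VaultUpdateLike { consumed: LinearStateLike[]; produced: LinearStateLike[]; }

// Keeps only the states in an update that carry the wanted externalId.
function filterByExternalId(update: VaultUpdateLike, externalId: string): VaultUpdateLike {
  const matches = (s: LinearStateLike): boolean => s.externalId === externalId;
  return {
    consumed: update.consumed.filter(matches),
    produced: update.produced.filter(matches),
  };
}

// Wraps a subscriber so that it only fires for updates touching the wanted externalId.
function onlyFor(
  externalId: string,
  handler: (update: VaultUpdateLike) => void
): (update: VaultUpdateLike) => void {
  return (update) => {
    const filtered = filterByExternalId(update, externalId);
    if (filtered.consumed.length > 0 || filtered.produced.length > 0) {
      handler(filtered);
    }
  };
}
```

In a real CorDapp the same predicate would be applied to the updates observable returned by trackBy, for example through the observable's own filter operator before subscribing.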

3 column query in DynamoDB using DynamooseJs

My table is (device, type, value, timestamp), where (device,type,timestamp) makes a unique combination ( a candidate for composite key in non-DynamoDB DBMS).
My queries can range between any of these three attributes, such as
GET (value)s from (device) with (type) having (timestamp) greater than <some-timestamp>
I'm using dynamoosejs/dynamoose. From most of my searches, I believe I'm supposed to use a combination of the three fields (as a single field: device-type-timestamp) as the id. However, the set: function of Schema doesn't let me use the object's properties (such as this.device), and for various reasons I cannot compute the value externally.
The closest I got was (id:uuidv4:hashKey, device:string:GlobalSecIndex, type:string:LocalSecIndex, timestamp:Date:LocalSecIndex)
and
(id:uuidv4:rangeKey, device:string:hashKey, type:string:LocalSecIndex, timestamp:Date:LocalSecIndex)
and so on..
However, when using a Query, it becomes difficult to fetch results for a particular device and type, because the id (hashKey or rangeKey) is not known at query time.
So the question: how would you design the keys for this kind of table?
Note that this table is meant to gather data from IoT devices, each of which generates a record every 5 minutes on average.
I'm curious why you are choosing DynamoDB for this task. Advanced queries like this seem to be much better suited for a SQL based database as opposed to a NoSQL database. Due to the advanced nature of SQL queries, this task in my experience is a lot easier in SQL databases. So I would encourage you to think about if DynamoDB is truly the right system for what you are trying to do here.
If you determine it is, you might have to restructure your data a little bit. You could do something like having a property that is device-type and that will be the device and type values combined. Then set that as an index, and query based on that and sort by the timestamp, and filter out the results that are not greater than the value you want.
You are correct that currently, Dynamoose does not pass in the entire object into the set function. This is something that personally I'm open to exploring. I'm a member on the GitHub project, and if you would like to submit a PR adding that feature I would be more than happy to help explore that option with you and get that into the codebase.
The other thing you might want to explore is having a DynamoDB stream, that will set that device-type property whenever it gets added to your DynamoDB table. That would abstract that logic out of DynamoDB and your application. I'm not sure if it's necessary for what you are doing to decouple it to that level, but it might be something you want to explore.
Finally, depending on your setup, you could figure out which attribute is more unique, device or type, and set up an index on that property. Then query on that index and filter out the results whose other property doesn't match. That will of course work, but I'm not sure how many items you will have in your table, and at a certain size there are questions about scalability. One way to address some of those scalability questions might be to set a TTL on your items if you know that the timestamp you are querying for is constant, or predictable ahead of time.
Overall there are a lot of ways to achieve what you are looking to do. Without more detail about how many items, what exactly those properties will be doing, the amount of scalability you require, which of those properties will be most unique, etc. it's hard to give a good solution. I would highly encourage you to think about if NoSQL is truly the best way to go. That query you are looking to do seems a LOT more like a SQL query. Not saying it's impossible in DynamoDB, but it will require some thought about how you want to structure your data model, and such.
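The combined device-type attribute suggested above can be illustrated with a small in-memory sketch. It does not call DynamoDB or Dynamoose; the "#" separator, the deviceType attribute name and the function names are assumptions made for the example. The filter-and-sort in valuesSince stands in for "query the index on deviceType, sort by timestamp, keep timestamps greater than the given one".

```typescript
interface Reading { device: string; type: string; timestamp: number; value: number; }
interface StoredReading extends Reading { deviceType: string; }

// The separator is an assumption; choose one that cannot occur in device or type.
function deviceTypeKey(device: string, type: string): string {
  return device + "#" + type;
}

// Adds the combined attribute before the item is written to the table.
function toStored(r: Reading): StoredReading {
  return {
    device: r.device,
    type: r.type,
    timestamp: r.timestamp,
    value: r.value,
    deviceType: deviceTypeKey(r.device, r.type),
  };
}

// In-memory stand-in for: query by deviceType, sorted by timestamp, timestamp > since.
function valuesSince(items: StoredReading[], device: string, type: string, since: number): number[] {
  const key = deviceTypeKey(device, type);
  return items
    .filter((item) => item.deviceType === key && item.timestamp > since)
    .sort((a, b) => a.timestamp - b.timestamp)
    .map((item) => item.value);
}
```

With deviceType as the hash key of the table or of an index and timestamp as the range key, the timestamp condition could become part of the key condition rather than a post-query filter.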
Considering the opinion of @charlie-fish, I decided to jump into Dynamoose and modify the code to pass the model to the set function of the attribute. However, I discovered that the model is already being passed to the default function of the attribute. So I changed my Schema to the following:
id:hashKey;default: function(model){ return model.device + "" + model.type; }
timestamp:rangeKey
For anyone landing here on this answer, please note that the default and set functions can access the attribute options and the schema instance using this. However, both of those functions should be regular functions rather than arrow functions.
I'm keeping this here as an answer, but I won't accept it for some time, as I want to wait for someone else to come up with a better approach.
I also want to make sure that a value passed in for the id field is not used. For this I could use set to ignore the incoming value, but I don't yet know how.

Corda State Management Best Practice

A strategic question: when a state is going to have one-to-many data, should we always create a collection under the parent state object, or create a separate state object for the child with a reference to the parent? Examples: (Employer 1:M Employee) or (Employer 1:M Location). How do we decide which strategy to use? I've listed some pros and cons for each, but I'm not sure when to use which. Looking for some feedback.
Adding child as collection
PROS
=====
- Easier to manage from coding standpoint
- Easy access to child data as it will always be available when querying parent from vault
CONS
=====
- As each collection object is going to be represented as a separate table in the database, each time a new state is created the child data is also replicated even though the child may not have changed, which will cause the database to grow unnecessarily
- If we have too many such collection objects, the serialized transaction size could be huge, so performance could be worse
Adding child as Separate State Object
PROS
=====
- Child data is not replicated each time the parent state is updated
- When there is an update to any of the child data, only that state needs to be communicated to the other participants
CONS
=====
- More coding needed in order to manage child state object separately
- Child data won't be available when querying parent from vault
- Each state needs to have its own contract so child objects can't be validated on the same parent contract
Since each state is persisted to a single table on the backend, it is difficult to manage child collections. At present, I think you would need to leave the collection property unbound (i.e. not mapped to a database column and marked as transient so that the class can still be serializable) and then do a separate filtered query for the child records that can be assigned to the collection property of the state. Then when any change is persisted, it will not try to persist the child records. Changes to child records should be done individually through their own state transactions. It would be nice if Corda supported JPA-style joins between tables, such as @OneToMany. This would facilitate queries, but persisting state changes would still need to be handled separately. There may be a way of doing this that I am unaware of.
It's an old question, but there does not seem to be an accepted answer, so I'll have a go.
Firstly, the Corda node is not just a back-end for your application; it's a node in a decentralised transaction processing system. The latter must be a key requirement for you, otherwise you wouldn't use Corda.
Secondly, Corda implements the UTXO (Unspent Transaction Output) paradigm for evolving distributed state through a series of transactions, whereby a collection of objects representing input states are 'spent' (or consumed, i.e. become unavailable) and replaced by another collection of objects representing output states. The state objects themselves may have complex structures, but when they evolve they are meant to be swapped as a whole. That is unlike, say, Ethereum or Hyperledger, where the global state is basically a large collection of unrelated key-value pairs that can change arbitrarily. The UTXO model makes it easy to implement features that are very hard to achieve with the global state model, such as transaction privacy. The important point here is that Corda can be made to emulate the global state model, but it will be inefficient at it and lose most of its benefits.
Because of this, the way states are modelled must be based on the intended evolution of the distributed state of a CorDapp. The questions to ask yourself therefore are probably the following:
- Will the child states 'live their own life', i.e. evolve independently from the parent states? An example of a 'yes' would be the Corda Token SDK, whereby the token type and the tokens themselves are separate states despite the obvious parent-child relationship. Network participants can trade the child states without controlling the parent state.
- Will there be a need to withhold the information in a parent state, but not the child (or vice versa), from a party participating in a transaction? An example of a 'yes' would be the use of an Oracle to sign off a child state object output without being shown the parent. The Corda IRS example does something similar with transaction tear-offs when the Oracle fixes the interest rate.
- Will there be a situation whereby a network participant may need to know (i.e. keep in the vault) the child state, but not the parent state? An example of a 'yes' would be an off-ledger settlement workflow similar to the Corda Settler example, whereby paying agents can settle obligations without necessarily knowing the details of the agreements that led to the obligations arising.
If the answer to any of the above questions is 'yes', then you have to represent the children as separate states, otherwise it is better to leave them embedded into the parent state.

Corda dynamic data handling at the time creating a contract

I am looking for a way to provide dynamic data / a list to all nodes or to specified nodes at the time of creating a contract in Corda. I don't think an Oracle is a good approach in my case, for the following reasons:
The data can be a list of, for example, legal entity names; they are not from the outside world, and not a single value;
The list depends on the particular field(s) selected, so there will probably need to be a centralized place to maintain the data relationships;
Appreciate if anyone can help on this. Thanks.
Kwan
This question is a little difficult to answer without further details on your use-case. However, on the surface, an Oracle doesn't sound like a bad solution:
The data provided by an oracle can be a list
The term "outside world" simply refers to any information not included in the transaction itself. This term should not be taken too literally.
Ultimately, you can think of an Oracle as a provider of "official" data. You request a command including the data from the oracle, include it in the transaction, and the oracle will sign over the transaction if and only if it agrees that the data in the command is true. As long as the Oracle is trusted by all parties involved, this allows data from outside the transaction to be included in the transaction in a reliable way.

Resources