I have multiple questions about the disk space Corda needs over time and could not find any information online.
How much disk space does a Corda transaction need?
How much disk space does Corda need over the course of 10 years with 4.5 million transactions per month on average (without attachments etc.)?
The size of a transaction is not fixed. It will depend on the states, contracts, attachments and other components used.
We do not have any rough guides currently, but we will likely be doing some tests shortly in the run-up to the release of Corda's enterprise version. This will give an idea of the storage requirements of running a node.
As was said, the answer is that it depends on the transaction size. The average Bitcoin transaction runs about 560 bytes, giving around 2,000 transactions per 1 MB block. Ethereum runs an average of about 2 KB per transaction, so it can store around 500 per 1 MB block, and from the best numbers I can get, Hyperledger runs about 5 KB per transaction, or around 205 per block. Assuming Corda will be somewhere in this spectrum, and assuming you follow the less-is-more axiom (store as little as possible on the ledger, defer everything else to a side DB or off-chain storage), let's choose something easy to calculate with and say Corda averages 1 KB per transaction. That is 1,000 transactions per MB. With the 1 KB size, multiply transactions per second * seconds of processing in a day * actual processing days per year to get your number. In your case (4,500,000 * 1024 * 12 * 10) / (1024^3) should give you the figure in gigabytes (it comes to about 515 gigabytes at a 1 KB transaction size).
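To make the arithmetic explicit, here is the same estimate spelled out (the 1 KB average transaction size is the assumption chosen above, not a measured Corda figure):

    4,500,000 tx/month x 1,024 bytes x 12 months x 10 years = 552,960,000,000 bytes
    552,960,000,000 / 1024^3                                ≈ 515 GB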
I tried the CorDapp example of an ultra-simple IOU transaction to measure this. A single IOU transaction contains the identities of two counterparties and one notary, plus a single double value (requiring 8 bytes).
Looking at the database I see that the serialised transaction requires 11 kB.
I am asking for alternative ways for serialisation in: Corda: Large serialized transaction size: Are there alternatives to current serialization design?
Context & goal
I need to periodically create snapshots of CosmosDB partitions. That is:
export all documents from a single CosmosDB partition (ca. 100-10k docs per partition, 1 KB-200 KB per doc, entire partition JSON usually < 50 MB);
each document must be handled separately, with its id known.
Host the process in an Azure Function app, using the consumption plan (so memory/CPU/duration matter).
And run this for thousands of partitions.
Using Microsoft.Azure.Cosmos v3 C# API.
What I've tried
I can skip deserialization using the Container.*StreamAsync() tools in the API and avoid parsing the document contents. This should notably reduce the CPU/memory needs, and it also avoids accidentally touching the documents being exported with a serialization round trip. The tricky part is how to combine it with having 10k documents per partition.
Query individually x 10k
I could query item ids per partition using SQL and just send separate ReadItemStreamAsync(id) requests.
This skips deserialization, I still have the ids, I could control how many docs are in memory at a given time, etc.
It would work, but it smells too chatty, sending 1+10k requests to CosmosDB per partition, which adds up to millions of requests overall. Also, in my experience SQL-querying large documents is usually RU-wise cheaper than loading those documents by point reads, which would add up at this scale. It would be nice to be able to pull N documents with a single (SQL query) request.
Query all as stream x 1.
There is Container.GetItemQueryStreamIterator(), to which I could just pass select * from c where c.partition = #key. It would be simpler, use fewer RUs, I could control the batch size with MaxItemCount, and it sends just a minimal number of requests to Cosmos (query + continuations). All is good, except ..
.. it would return a single JSON array for all documents in the batch, and I would need to deserialize it all to split it into individual documents and map them to their ids, defeating the purpose of loading them as a Stream.
Similarly, ReadManyItemsStreamAsync(..) would return the items as single response stream.
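For illustration, here is a minimal sketch of that "query all as stream x 1" variant (container and partitionKeyValue are placeholders; the parameterized @key form is used instead of the #key shorthand above):

    using System.Threading.Tasks;
    using Microsoft.Azure.Cosmos;

    // Sketch only: every page comes back as one raw stream containing a JSON array.
    static async Task DumpPartitionAsStreamPagesAsync(Container container, string partitionKeyValue)
    {
        var query = new QueryDefinition("SELECT * FROM c WHERE c.partition = @key")
            .WithParameter("@key", partitionKeyValue);

        FeedIterator iterator = container.GetItemQueryStreamIterator(
            query,
            requestOptions: new QueryRequestOptions
            {
                PartitionKey = new PartitionKey(partitionKeyValue),
                MaxItemCount = 100 // page size; each page is still a single JSON array
            });

        while (iterator.HasMoreResults)
        {
            using ResponseMessage page = await iterator.ReadNextAsync();
            // page.Content wraps a "Documents" JSON array for the whole page,
            // so splitting it back into individual documents requires parsing it.
        }
    }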
Question
Does the CosmosDB API provide a better way to download a lot of individual raw JSON documents without deserializing?
Preferably with some control over how much data is buffered in the client.
While I agree that designing the solution around streaming documents with the change feed is promising and might have better scalability and cost characteristics on the CosmosDB side, to answer the original question ..
The chattiness of solution "Query individually x 10k" could be reduced with Bulk mode.
That is:
Prepare a bulk CosmosClient with AllowBulkExecution option
query document ids to export (select c.id from c where c.partition = #key)
(Optionally) split the ids to batches of desired size to limit the number of documents loaded in memory.
For each batch:
Load all documents in the batch concurrently using ReadItemStreamAsync(id, partition); this avoids deserialization but retains the link to the id.
Write all documents to the destination before starting the next batch, to release memory.
Since all reads go to the same partition, bulk mode will internally merge the underlying requests to CosmosDB, reducing the network "chattiness" and trading this for some internal (hidden) complexity and a slight increase in latency.
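A rough sketch of the steps above (connection string, database/container names and partitionKeyValue are placeholders; error handling, retries and the actual write to the destination are omitted):

    using System.Collections.Generic;
    using System.Linq;
    using System.Threading.Tasks;
    using Microsoft.Azure.Cosmos;

    // 1) Client with bulk mode enabled.
    var bulkClient = new CosmosClient(connectionString,
        new CosmosClientOptions { AllowBulkExecution = true });
    Container container = bulkClient.GetContainer("mydb", "mycontainer");

    // 2) Query only the ids of the documents to export.
    var ids = new List<string>();
    var idQuery = new QueryDefinition("SELECT VALUE c.id FROM c WHERE c.partition = @key")
        .WithParameter("@key", partitionKeyValue);
    FeedIterator<string> idIterator = container.GetItemQueryIterator<string>(idQuery);
    while (idIterator.HasMoreResults)
        ids.AddRange(await idIterator.ReadNextAsync());

    // 3) Point-read the documents as streams in batches; bulk mode merges the
    //    concurrent reads to the same partition into fewer physical requests.
    const int batchSize = 100; // matches the internal bulk batch size
    foreach (string[] batch in ids.Chunk(batchSize)) // Chunk() is .NET 6+
    {
        ResponseMessage[] responses = await Task.WhenAll(
            batch.Select(id => container.ReadItemStreamAsync(id, new PartitionKey(partitionKeyValue))));

        foreach (ResponseMessage response in responses)
        {
            using (response)
            {
                // response.Content is the raw JSON of one document (never deserialized here);
                // write it to the destination before moving on to the next batch.
            }
        }
    }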
It's worth noting that:
It is still doing the 1+10k requests to CosmosDB, plus their RU cost; they are just compacted on the network.
Batching the ids and waiting on batch completion is required, as otherwise Bulk would send all internal batches concurrently (see: Bulk support improvements for Azure Cosmos DB .NET SDK). Or don't, if you prefer to max out throughput and don't care about the memory footprint. In this case the partitions are small enough that it does not matter much.
Bulk has a separate internal batch size; most likely it's best to use the same value. This seems to be 100, which is a rather good chunk of data to process anyway.
Bulk may add latency to requests while waiting for the internal batch to fill up before dispatching (100 ms). IMHO this is largely negligible in this case and could be avoided by fully filling the internal Bulk batch bucket where possible.
This solution is not optimal, for example due to the burst load it puts on CosmosDB, but the main benefit is simplicity of implementation, and the logic can be run in any host, on demand, with no infra setup required.
There isn't anything out of the box that provides a means of doing on-demand, per-partition batch copying of data from Cosmos to blob storage.
However, before looking at other ways you can do this as a batch job, another approach you may consider is to stream your data from Cosmos to blob storage using the change feed. The reason is that, for a database like Cosmos, throughput (and cost) is measured on a per-second basis. The more you can amortize the cost of some set of operations over time, the less expensive it is. One other major thing to point out is that wanting to do this on a per-partition basis means the throughput and cost required for the batch operation will be the per-partition throughput multiplied by the number of physical partitions for your container. This is because when you increase throughput on a container, the throughput is allocated evenly across all physical partitions, so if I need 10k RU/s of additional throughput to do some work on data in one container with 10 physical partitions, I will need to provision 100k RU/s to do the same work in the same amount of time.
Streaming data is often a less expensive solution when the amount of data involved is significant. Streaming effectively amortizes cost over time, reducing the amount of throughput required to move that data elsewhere. In scenarios where the data is being moved to blob storage, when you need the data is often not important, because blob storage is very cheap (about $0.01 USD/GB) compared to Cosmos (about $0.25 USD/GB).
As an example, if I have 1M (1 KB) documents in a container with 10 physical partitions and I need to copy them from one container to another, the total RU needed will be 1M RU to read every document (1 RU each), plus approximately 10 RU per document (with no indexes) to write it into the other container.
Here is the breakdown of how much incremental throughput I would need and the cost of that throughput (in USD) if I ran this as a batch job over a given period of time. Keep in mind that Cosmos DB charges for the maximum provisioned throughput in each hour.
Complete in 1 second = 11M RU/s: $880 USD x 10 partitions = $8,800 USD
Complete in 1 minute = 183K RU/s: $14 USD x 10 partitions = $140 USD
Complete in 10 minutes = 18.3K RU/s: $1 USD x 10 partitions = $10 USD
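For context, these figures line up with a provisioned-throughput rate of roughly $0.008 per 100 RU/s per hour (my assumption, derived from the numbers above), billed on the peak hour, e.g.:

    11,000,000 RU/s / 100 x $0.008 per hour ≈ $880 for that hour
    $880 x 10 physical partitions           ≈ $8,800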
However, if I streamed this job over the course of a month, the incremental throughput required would be only about 4 RU/s, which can be done without provisioning any additional RU at all. Another benefit is that it is usually less complex to stream data than to handle it as a batch: handling exceptions and dead-letter queuing is easier to manage. Although, because you are streaming, you will need to first look up the document in blob storage and then replace it, due to the data being streamed over time.
There are two simple ways you can stream data from Cosmos DB to blob storage. The easiest is Azure Data Factory. However, it doesn't really give you the ability to capture costs on a per logical partition basis as you're looking to do.
To do this you'd need to write your own utility using the change feed processor. Then, within the utility, as you read and write each item, you can capture the amount of throughput used to read the data (usually about 1 RU per document) and calculate the cost of writing it to blob storage based on the per-unit cost and the monthly hosting cost of the Azure Function that hosts the process.
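A minimal sketch of such a utility, using the change feed processor from the same .NET SDK together with Azure.Storage.Blobs (connection strings, database/container/lease/blob names and the "partition" property are placeholders; RU accounting and error handling are omitted):

    using System;
    using System.Collections.Generic;
    using System.Threading;
    using System.Threading.Tasks;
    using Azure.Storage.Blobs;
    using Microsoft.Azure.Cosmos;
    using Newtonsoft.Json.Linq;

    var cosmos = new CosmosClient(cosmosConnectionString);
    Container monitored = cosmos.GetContainer("mydb", "mycontainer");
    Container leases = cosmos.GetContainer("mydb", "leases"); // lease container must already exist
    var blobs = new BlobContainerClient(storageConnectionString, "cosmos-snapshots");

    ChangeFeedProcessor processor = monitored
        .GetChangeFeedProcessorBuilder<JObject>("copy-to-blob",
            async (IReadOnlyCollection<JObject> changes, CancellationToken ct) =>
            {
                foreach (JObject doc in changes)
                {
                    // One blob per document, keyed by partition + id, overwritten on later versions.
                    string name = $"{doc["partition"]}/{doc["id"]}.json";
                    await blobs.GetBlobClient(name)
                        .UploadAsync(BinaryData.FromString(doc.ToString()), overwrite: true, ct);
                }
            })
        .WithInstanceName(Environment.MachineName)
        .WithLeaseContainer(leases)
        .Build();

    await processor.StartAsync(); // runs until StopAsync() is called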
As I prefaced, this is only a suggestion. But given the amount of data and the fact that it is on a per-partition basis, it may be worth exploring.
My application will query DynamoDB at 500 queries/second, and for each query the estimated response data will be 300 bytes. My application will keep up this frequency every second, meaning it will continuously make 500 queries/second. What's the right number I should pick for read capacity units in my case? Thanks.
From the docs...
Read capacity unit (RCU): Each API call to read data from your table is a read request. Read requests can be strongly consistent, eventually consistent, or transactional. For items up to 4 KB in size, one RCU can perform one strongly consistent read request per second. Items larger than 4 KB require additional RCUs. For items up to 4 KB in size, one RCU can perform two eventually consistent read requests per second.
So if eventually consistent is good enough, then 250 RCU is all that is needed.
If you need strongly consistent, then you'd need 500 RCU.
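Spelling the arithmetic out:

    item size 300 bytes -> rounds up to one 4 KB unit per read
    strongly consistent:   500 reads/s x 1 RCU   = 500 RCU
    eventually consistent: 500 reads/s x 0.5 RCU = 250 RCU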
I did a load test by sending around a few million records over a period of 12 hours; here is the analysis.
In hour 1 the transaction commits are very fast, within a few hundred milliseconds. As the hours go by and the number of transactions committed to the Corda DB increases, the performance of the Corda node degrades accordingly.
After around 2 million transactions are committed, node efficiency drops to a few seconds per transaction. After a DB refresh of the nodes, i.e. resetting the DB to a version with no data, transactions execute within the milliseconds range again.
My questions are the following:
Does the MQ in the Corda node impact this?
Is there any Corda query that is causing the drop in performance?
P.S.: I am working with Corda Enterprise 3.3.
There might be many factors to consider in this question: for example, how your CorDapp is written, what the size of your node is, whether your flows are linear transactions, etc. Furthermore, performance has been boosted considerably since Corda 4.x.
You can find more information on sizing and performance: https://docs.corda.r3.com/sizing-and-performance.html
It appears to me that the current version of Corda (3.1) stores a (signed) transaction as a BLOB containing a serialized byte array of the Java class SignedTransaction. (The SignedTransaction wraps a WireTransaction, i.e. it contains a byte array representing the serialized transaction.)
For some projects this approach might pose a challenge, as it seems comparatively wasteful with respect to memory and hence throughput.
Is this the standard way Corda will serialize transactions? What options exist to change the serialization to reduce memory requirements?
Example
Trying the CorDapp “IOU” example, which has a simple IOUState and a simple transaction, a single transaction creates a single entry in the table NODE_TRANSACTIONS, where the size of TRANSACTION_VALUE reported by select length(TRANSACTION_VALUE) from NODE_TRANSACTIONS is 11 kilobytes. It appears as if these 11 kilobytes consist of 9 kilobytes for the serialized WireTransaction and 2 kilobytes for the signatures. The IOUState contains a single double (and info on the two counterparties).
Using BlobInspector to deserialize the binary format of TRANSACTION_VALUE reveals a JSON file of only 2 kBytes - much smaller than the 11 kBytes of the binary BLOB, but still having data which could be massively reduced if stored with a different model.
Considering 170 transactions per second (a figure quoted for Corda), the simple IOU example would require roughly 50 terabytes per year (365 days, 24 hours) in each (participating) node.
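For reference, the arithmetic behind that figure, using the 11 kB per transaction measured above:

    170 tx/s x 11 kB               = 1,870 kB/s
    1,870 kB/s x 31,536,000 s/year ≈ 5.9 x 10^10 kB ≈ 55 TB per year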
Note also: the fact that the size of the blob is much larger than the deserialized JSON (by at least a factor of 5) is counterintuitive. Maybe I did something wrong here...
Although the blob appears to be very large, note that it also contains:
The schema/description of the thing being serialised, allowing the object to be reconstructed without the original class definitions (e.g. for use in GUIs or if data needs to be inspected many years into the future)
Transformation headers to allow various versions of state to be deserialised
However, optimisations are possible and we will look to implement them in future releases of Corda. For example, one option is to slice off the schema if you know that your counterparty already has it.
I am creating an online crowd-driven game. I expect the read/write requests to fluctuate (like 50, 50, 50, 1500, 50, 50, 50) every second, and I need to process 100% of the requests with strong consistency.
I am planning to move from the GAE datastore to AWS's DynamoDB for its strong consistency. I have the doubts below, for which I could not get clear answers in other discussions.
1. If the item size for a write action is just 4 B, will that be rounded up to 1 KB and consume a write unit?
2. Financially it is not wise to set the provisioned throughput capacity around the expected peak value. Alarms can warn us, but in the case of a sudden rise, the requests could already be throttled by the time we receive the alarm. Is DynamoDB really designed to handle highly fluctuating read/write loads?
3. I read about Dynamic DynamoDB, which updates the read/write throughput capacity for us. When we add some read/write units, how long will it take to allocate them? If it takes too long, what's the use of raising the bar after the tide hits?
Google App Engine bills just for the number of requests that happen in that month. If I can make AWS work like, "Whatever the request count may be, I will expand and contract myself and charge you only for the used read/write units", I will go for AWS.
Please advise. Don't hesitate to ask if I am not being clear in parts.
Thanks,
Karthick.
Yes. Item sizes are rounded up and the throughput is used. From the Provisioned Throughput in Amazon DynamoDB documentation:
The total number of read operations necessary is the item size, rounded up to the next multiple of 4 KB, divided by 4 KB.
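Applied to question 1 (writes are rounded up to 1 KB units, reads to 4 KB units):

    4 B write -> rounded up to 1 KB -> 1 write capacity unit per write
    4 B read  -> rounded up to 4 KB -> 1 RCU (strongly consistent) or 0.5 RCU (eventually consistent)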
It can handle some bursting, but it is generally intended to be used for uniform workloads. Here is a section from the Guidelines for Working with Tables documentation and some other helpful links about the best practices:
A temporary non-uniformity in a workload can generally be absorbed by the bursting allowance, as described in Use Burst Capacity Sparingly. However, if your application must accommodate non-uniform workloads on a regular basis, you should design your table with DynamoDB's partitioning behavior in mind (see Understand Partition Behavior), and be mindful when increasing and decreasing provisioned throughput on that table.
Query and Scan guidelines for avoiding bursts of read activity
The Table Best Practices section
Use Burst Capacity Sparingly
This one is going to depend on how much data your table has, because DynamoDB will have to repartition the data if you are scaling up. See the Consider Workload Uniformity When Adjusting Provisioned Throughput documentation for more information about the partitioning.