I just need one point of clarification on your Matrix. Your FAQ says:
One Transaction is counted for every 20 Elements in a Large-Scale Matrix Routing API Request. The number of Elements equals the number of start points times the number of destination points. Large-Scale Matrix Routing Requests generating less than 100,000 individual elements are always counted as 5,000 Transactions.
So if we had 317 origins and 317 destinations (thereby around 100,000 individual elements), can you confirm that this will be treated as 5,000 'Transactions'?
The number of elements in your request given 317 origins and 317 destinations will be 100489 which is more than 100,000 and therefore will not be counted as 5,000 transactions.
The page you quoted states that only elements less than 100,000 will be counted as 5,000 transactions.
https://developer.here.com/faqs#payment-and-subscription
For specific pricing details and plans please contact our Sales:
https://developer.here.com/help#how-can-we-help-you
Related
In AWS documentation, it stated that
"For provisioned mode tables, you specify throughput capacity in terms of read capacity units (RCUs) and write capacity units (WCUs):
One read capacity unit represents **one strongly consistent read per second**, or two eventually consistent reads per second, for an item up to 4 KB in size."
But what count as one read? If I loop through different partitions to read from dynamodb, will each loop count as one read? Thank you.
Reference: https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/HowItWorks.ReadWriteCapacityMode.html
For a GetItem and BatchGetItem operation which read an individual item, the size of the entire item is used to calculate the amount of RCU (read capacity units) used, even if you only ask to read specific parts from this item. As you quoted, this size is than rounded up to a multiple of 4K: If the item is 3.9K you'll pay one RCU for a strongly-consistent read (ConsistentRead=true), and two RCUs for a 4.1K item. Again, as you quoted, if you asked for an eventual-consistent read (ConsistentRead=false) the number of RCUs would be halved.
For transactions (TransactGetItems) the number of RCUs is double what it would have been with consistent reads.
For scans - Scan or Query - the cost is calculated the same as reading a single item, except for one piece of good news: The rounding up happens for the entire size read, not for each individual item. This is very important for small items - for example consider that you have items of 100 bytes each. Reading each one individually costs you one RCU even though it's only 100 bytes, not 4K. But if you Query a partition that has 40 of these items, the total size of these 40 items is 4000 bytes so you pay just one RCU to read all 40 items - not 40 RCUs. If the length of the entire partion is 4 MB, you'll pay 1024 RCUs when ConsistentRead=true, or 512 RCUs when ConsistentRead=false, to read the entire partition - regardless of how many items this partition contains.
my application will query dynamodb 500queries/second, and for each query the estimated response data will be 300bytes. and my application will keep this frequency every second meaning it will continuous make 500queries/second. what's the right number I should pick for read capacity unit in my case? Thanks
From the docs...
Read capacity unit (RCU): Each API call to read data from your table is a read request. Read requests can be strongly consistent, eventually consistent, or transactional. For items up to 4 KB in size, one RCU can perform one strongly consistent read request per second. Items larger than 4 KB require additional RCUs. For items up to 4 KB in size, one RCU can perform two eventually consistent read requests per second.
So if eventually consistent is good enough, then 250 RCU is all that is needed.
If you need strongly consistent, then you'd need 500 RCU.
I have a question...
If I have 1000 item having same partition key in a table... And if I made a query for this partition key with limit 10 then I want to know does it take read capacity unit for 1000 items or for just 10 items
Please clear my doubt
I couldn't find the exact point in the DynamoDB documentation. From my experience it uses only the returned limit for consumed capacity which is 10 (Not 1000).
You can quickly evaluate this also using the following approach.
However, you can specify the ReturnConsumedCapacity parameter in
a Query request to obtain this information.
The limit option will limit the number of results returned. The capacity consumed depends on the size of the items, and how many of them are accessed (I say accessed because if you have filters in place, more capacity may be consumed than the number of items actually returned would consume if there are items that get filtered out) to produce the results returned.
The reason I mention this is because, for queries, each 4KB of returned capacity is equivalent to 1 read capacity unit.
Why is this important? Because if your items are small, then for each capacity unit consumed you could return multiple items.
For example, if each item is 200 bytes in size, you could be returning up to 20 items for each capacity unit.
According to the aws documentation:
The maximum number of items to evaluate (not necessarily the number of matching items). If DynamoDB processes the number of items up to the limit while processing the results, it stops the operation and returns the matching values up to that point, and a key in LastEvaluatedKey to apply in a subsequent operation, so that you can pick up where you left off.
It seems to me that it means that it will not consume the capacity units for all the items with the same partition key. According to your example the consumed capacity units will be for your 10 items.
However since I did not test it I cannot be sure, but that is how I understand the documentation.
How to calculate RCU and WCU with the data given as: reading throughput of 32 GB/s and writing throughput of 16 GB/s.
DynamoDB Provisioned Throughput is based upon a certain size of units, and the number of items being written:
In DynamoDB, you specify provisioned throughput requirements in terms of capacity units. Use the following guidelines to determine your provisioned throughput:
One read capacity unit represents one strongly consistent read per second, or two eventually consistent reads per second, for items up to 4 KB in size. If you need to read an item that is larger than 4 KB, DynamoDB will need to consume additional read capacity units. The total number of read capacity units required depends on the item size, and whether you want an eventually consistent or strongly consistent read.
One write capacity unit represents one write per second for items up to 1 KB in size. If you need to write an item that is larger than 1 KB, DynamoDB will need to consume additional write capacity units. The total number of write capacity units required depends on the item size.
Therefore, when determining your desired capacity, you need to know how many items you wish to read and write per second, and the size of those items.
Rather than seeking a particular GB/s, you should be seeking a given number of items that you wish to read/write per second. That is the functionality that your application would require to meet operational performance.
There are also some DynamoDB limits that would apply, but these can be changed upon request:
US East (N. Virginia) Region:
Per table – 40,000 read capacity units and 40,000 write capacity units
Per account – 80,000 read capacity units and 80,000 write capacity units
All Other Regions:
Per table – 10,000 read capacity units and 10,000 write capacity units
Per account – 20,000 read capacity units and 20,000 write capacity units
At 40,000 read capacity units x 4KB x 2 (eventually consistent) = 320MB/s
If my calculations are correct, your requirements are 100x this amount, so it would appear that DynamoDB is not an appropriate solution for such high throughputs.
Are your speeds correct?
Then comes the question of how you are generating so much data per second. A full-duplex 10GFC fiber runs at 2550MB/s, so you would need multiple fiber connections to transmit such data if it is going into/out of the AWS cloud.
Even 10Gb Ethernet only provides 10Gbit/s, so transferring 32GB would require 28 seconds -- and that's to transmit one second of data!
Bottom line: Your data requirements are super high. Are you sure they are realistic?
if you click on capacity tab of your dynamodb table there is a capacity calcuator link next to Estimated cost. you can use that to determine the read and write capacity units along with estimated cost.
read capacity units are dependent on the type of read that you need (strongly consistent/eventually consistent), item size and throughput that you desire.
write capacity units are determined by throughput and item size only.
for calculating item size you can refer this and below is a screenshot of the calculator
what constitutes an actual read in DynamoDB?
is it reading every line in a table or what data is returned?
is this why a scan is so expensive - you read the entire table and are charged for every table line that is read?
Can you put ElasticCache (Memcached) in front of DynamoDB to keep the cost down?
Finally are you charged for a query that yields no results?
See this link: http://aws.amazon.com/dynamodb/faqs/
1 Write = 1 Write per second for an item up to 1Kb in size.
1 Read = 2 Reads per second for an item up to 1Kb in size, or 1 per second if you required fully consistent results.
For example, if your items are 512 bytes and you need to read 100
items per second from your table, then you need to provision 100 units
of Read Capacity.
If your items are larger than 1KB in size, then you should calculate
the number of units of Read Capacity and Write Capacity that you need.
For example, if your items are 1.5KB and you want to do 100
reads/second, then you would need to provision 100 (read per second) x
2 (1.5KB rounded up to the nearest whole number) = 200 units of Read
Capacity.
Note that the required number of units of Read Capacity is determined
by the number of items being read per second, not the number of API
calls. For example, if you need to read 500 items per second from your
table, and if your items are 1KB or less, then you need 500 units of
Read Capacity. It doesn’t matter if you do 500 individual GetItem
calls or 50 BatchGetItem calls that each return 10 items.
The above applies to all the usual methods, GET, BATCH X & QUERY.
SCAN is a little different, they don't document exactly how they calculate the usage but they do offer the following:
The Scan API will iterate through your entire dataset and apply the
filter conditions to every row. Since only 1MB of data can be scanned
at a time, you may need to do multiple round trips (using a
continuation token) to complete the scan. Further, using this API may
consume much of your provisioned read throughput. Hence, this method
has limited scaling characteristics and we do not recommend that you
use it as a part of your application’s regular behavior.
So to answer your question directly: The calculation is made on what data is returned in all cases except for SCAN, where there isn't really any clear indication on how they charge. A query that yields no results will not cost you anything.
You can definitely set up a caching system infront of Dynamo, definitely recommend you look into that if you want to keep your reads down.
Hope that helps!