I want to write a Terraform module that will create DynamoDB tables. The attributes are expected to be read from .tfvars or from default variable values instead of being hard-coded in the .tf file as in the resource guide here.
To explain further, say a list of attributes is being used to achieve this pseudo-code:
resource "aws_dynamodb_table" "basic-dynamodb-table" {
name = "GameScores"
... #Other required feilds
...
...
# attributes is a list of names
for(attribute_name:${length(var.attributes)}){
attribute {
name = "${var.attributes[i]}"
type = "N"
}
}
}
How can I iterate over the attribute list and create the attribute { } blocks during terraform plan/apply? The number of attribute blocks cannot be static as shown in the Terraform docs, and their names must be read from variables.
Very late, but this may be useful for other people who are looking at this.
I used this doc, and with it I can write something like this:
dynamic "attribute" {
for_each = var.table_attributes
content {
name = tag.value["name"]
type = tag.value["type"]
}
}
In my variables.tf I have the table_attributes variable declared as a map of objects.
Maybe it helps someone.
When you create a DynamoDB table, the only attributes you need to specify are the partition key and, optionally, the sort key. All other attributes are stored as part of each document (or item) you store in the table.
The same applies to Global Secondary Indexes as well. You only need to specify the partition key and sort key for each index.
If you don't have static attributes, then you can't create a table. The names of the partition and sort keys must be the same for the lifetime of a table/index.
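To illustrate this point outside of Terraform, here is a minimal sketch with the AWS SDK for Java: only the key attributes appear in the attribute definitions, and every other attribute simply travels with the items you write. The key names UserId and GameTitle mirror the GameScores example in the docs referenced above.
import com.amazonaws.services.dynamodbv2.AmazonDynamoDB;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClientBuilder;
import com.amazonaws.services.dynamodbv2.model.*;

public class CreateGameScores {
    public static void main(String[] args) {
        AmazonDynamoDB client = AmazonDynamoDBClientBuilder.defaultClient();

        client.createTable(new CreateTableRequest()
                .withTableName("GameScores")
                // Only the partition and sort key attributes are defined here;
                // all other attributes are stored per item, with no schema needed.
                .withAttributeDefinitions(
                        new AttributeDefinition("UserId", ScalarAttributeType.S),
                        new AttributeDefinition("GameTitle", ScalarAttributeType.S))
                .withKeySchema(
                        new KeySchemaElement("UserId", KeyType.HASH),
                        new KeySchemaElement("GameTitle", KeyType.RANGE))
                .withProvisionedThroughput(new ProvisionedThroughput(5L, 5L)));
    }
}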
Finally, it's not clear from the question, but please don't use Terraform to load data into your table. It's not the right tool for that!
First of all, I have a table structure like this:
Users: {
  UserId
  Name
  Email
  SubTable1: [{
    Column-111
    Column-112
  },
  {
    Column-121
    Column-122
  }]
  SubTable2: [{
    Column-211
    Column-212
  },
  {
    Column-221
    Column-222
  }]
}
As I am new to DynamoDB, I have a couple of questions regarding this, as follows:
1. Can I create a structure like this?
2. Can we set a primary key for subtables?
3. Luckily, I found a DynamoDB helper class to do some operations on my DB:
https://www.gopiportal.in/2018/12/aws-dynamodb-helper-class-c-and-net-core.html
But I don't know how to fetch only a particular subtable.
4. Can we fetch only specific columns from my main table? Also, I need a suggestion for subtables.
Note: I am using .NET Core (C#) to communicate with DynamoDB.
Can I create a structure like this?
Yes.
Can we set a primary key for subtables?
No, a hash key can be set on top-level scalar attributes only (String, Number, etc.).
Luckily, I found a DynamoDB helper class to do some operations on my DB:
https://www.gopiportal.in/2018/12/aws-dynamodb-helper-class-c-and-net-core.html
But I don't know how to fetch only a particular subtable.
When you say subtables, I assume that you are referring to the Array data type in the above sample table. In order to fetch data from a DynamoDB table, you need the hash key to use the Query API. If you don't have the hash key, you can use the Scan API, which scans the entire table. The Scan API is a costly operation.
A GSI (Global Secondary Index) can be created to avoid the scan operation. However, it can be created on scalar attributes only. A GSI can't be created on an Array attribute.
The other option is to redesign the table to match your query access pattern.
Can we fetch only specific columns from my main table? Also, I need a suggestion for subtables.
Yes, you can fetch specific columns using a ProjectionExpression. This way you get only the required attributes in the result set.
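As a rough illustration (shown in Java, since the AWS docs examples linked elsewhere here use the Java SDK, while the question itself uses .NET Core), here is a minimal sketch of querying the Users table by its hash key and projecting only a few attributes plus one of the nested lists. The key name UserId and the attribute names come from the sample structure above and may need adjusting:
import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClientBuilder;
import com.amazonaws.services.dynamodbv2.document.DynamoDB;
import com.amazonaws.services.dynamodbv2.document.Item;
import com.amazonaws.services.dynamodbv2.document.ItemCollection;
import com.amazonaws.services.dynamodbv2.document.QueryOutcome;
import com.amazonaws.services.dynamodbv2.document.Table;
import com.amazonaws.services.dynamodbv2.document.spec.QuerySpec;
import com.amazonaws.services.dynamodbv2.document.utils.NameMap;
import com.amazonaws.services.dynamodbv2.document.utils.ValueMap;

public class FetchSpecificColumns {
    public static void main(String[] args) {
        DynamoDB dynamoDB = new DynamoDB(AmazonDynamoDBClientBuilder.defaultClient());
        Table table = dynamoDB.getTable("Users");

        // Query by the hash key and return only Name, Email and the SubTable1 list.
        // "#n" is a placeholder in case Name clashes with DynamoDB's reserved words.
        QuerySpec spec = new QuerySpec()
                .withKeyConditionExpression("UserId = :uid")
                .withProjectionExpression("#n, Email, SubTable1")
                .withNameMap(new NameMap().with("#n", "Name"))
                .withValueMap(new ValueMap().withString(":uid", "user-123"));

        ItemCollection<QueryOutcome> items = table.query(spec);
        for (Item item : items) {
            System.out.println(item.toJSONPretty());
        }
    }
}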
I have seen that the Parquet format uses dictionaries to store some columns and that these dictionaries can be used to speed up filters if useDictionaryFilter() is used on the ParquetReader.
Is there any way to access these dictionaries from Java code?
I'd like to use them to create a list of distinct members of my column, and thought that it would be faster to read only the dictionary values than to scan the whole column.
I have looked into the org.apache.parquet.hadoop.ParquetReader API but did not find anything.
The methods in org.apache.parquet.column.Dictionary allow you to:
Query the range of dictionary indexes: Between 0 and getMaxId().
Look up the entry corresponding to any index, for example for an int field you can use decodeToInt().
Once you have a Dictionary, you can iterate over all indexes to get all entries, so the question boils down to getting a Dictionary. To do that, use ColumnReaderImpl as a guide:
private static Dictionary getDictionary(ColumnDescriptor path, PageReader pageReader) throws IOException {
    DictionaryPage dictionaryPage = pageReader.readDictionaryPage();
    if (dictionaryPage != null) {
        return dictionaryPage.getEncoding().initDictionary(path, dictionaryPage);
    }
    return null; // this column chunk has no dictionary page
}
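Once you have the Dictionary, collecting the distinct entries is a simple loop. A minimal sketch, assuming a String/Binary column (the helper class below is just for illustration):
import java.util.HashSet;
import java.util.Set;
import org.apache.parquet.column.Dictionary;

public class DictionaryValues {
    // Collect the distinct values of a dictionary-encoded String/Binary column.
    public static Set<String> distinctValues(Dictionary dictionary) {
        Set<String> values = new HashSet<>();
        // Dictionary indexes run from 0 to getMaxId(), inclusive.
        for (int id = 0; id <= dictionary.getMaxId(); id++) {
            // For numeric columns use decodeToInt(), decodeToLong(), etc. instead.
            values.add(dictionary.decodeToBinary(id).toStringUsingUTF8());
        }
        return values;
    }
}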
Please note that a column chunk may contain a mixture of data pages, some dictionary-encoded and some not, because if the dictionary "gets full" (reaches the maximum allowed size), then the writer outputs the dictionary page and the dictionary-encoded data pages and switches to not using dictionary-encoding for the rest of the data pages.
Suppose I have an object like:
{
  epochTime: 1527174282,
  action: create,
  state: fail
}
From the AWS documentation, to query:
You must specify the partition key name and value as an equality condition.
You can optionally provide a second condition for the sort key (if present).
In the case where I just want to query some items from A to B across the whole dataset (I just want to use the sort key), how should I choose the hash key and query so that this kind of data access works effectively?
You need to define a secondary index with that attribute as its primary (hash) key and perform a scan on the secondary index.
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/SecondaryIndexes.html
Working with scans
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Scan.html
Also refer to these Java SDK examples for working with secondary indexes and scans: https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/GSIJavaDocumentAPI.Example.html
https://aws.amazon.com/about-aws/whats-new/2015/02/10/secondary-index-scan-a-simpler-way-to-scan-dynamodb-table/
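As a rough sketch of the suggestion above, using the AWS SDK for Java Document API: scan a secondary index with a filter expression on the epochTime range. The table name Events, the index name epochTime-index, and the epoch values are assumptions for illustration; note that a scan with a filter still reads the whole index and only trims the returned results.
import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClientBuilder;
import com.amazonaws.services.dynamodbv2.document.DynamoDB;
import com.amazonaws.services.dynamodbv2.document.Index;
import com.amazonaws.services.dynamodbv2.document.Item;
import com.amazonaws.services.dynamodbv2.document.ItemCollection;
import com.amazonaws.services.dynamodbv2.document.ScanOutcome;
import com.amazonaws.services.dynamodbv2.document.spec.ScanSpec;
import com.amazonaws.services.dynamodbv2.document.utils.ValueMap;

public class ScanByTimeRange {
    public static void main(String[] args) {
        DynamoDB dynamoDB = new DynamoDB(AmazonDynamoDBClientBuilder.defaultClient());
        // Table and index names are assumptions for this sketch.
        Index index = dynamoDB.getTable("Events").getIndex("epochTime-index");

        // Keep only items whose epochTime falls between A and B.
        ScanSpec spec = new ScanSpec()
                .withFilterExpression("epochTime BETWEEN :a AND :b")
                .withValueMap(new ValueMap()
                        .withNumber(":a", 1527000000)
                        .withNumber(":b", 1527174282));

        ItemCollection<ScanOutcome> items = index.scan(spec);
        for (Item item : items) {
            System.out.println(item.toJSON());
        }
    }
}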
I'm trying to create an index on a nested field, using the Dashboard in AWS Developer Console. E.g. if I have the following schema:
{
  'id': 1,
  'nested': {
    'mode': 'mode1',
    'text': 'nice text'
  }
}
I was able to create the index on nested.mode, but whenever I then go to query by the index, nothing ever comes back. It makes me think that DynamoDB created the index on a field named nested.mode instead of the mode field of nested. Any hints regarding what I might be doing wrong?
You cannot (currently) create a secondary index off of a nested attribute. From the Improving Data Access with Secondary Indexes in DynamoDB documentation (emphasis mine):
For each secondary index, you must specify the following:
...
The key schema for the index. Every attribute in the index key schema must be a top-level attribute of type String, Number, or Binary. Nested attributes and multi-valued sets are not allowed. Other requirements for the key schema depend on the type of index:
You can, however, create an index on any top-level JSON element.
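As a hedged illustration of that workaround: copy the nested value into a top-level attribute when writing items (here called mode, an assumed name), then add a GSI on it. A sketch with the AWS SDK for Java, where the table and index names are also assumptions:
import com.amazonaws.services.dynamodbv2.AmazonDynamoDB;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClientBuilder;
import com.amazonaws.services.dynamodbv2.model.*;

public class CreateModeIndex {
    public static void main(String[] args) {
        AmazonDynamoDB client = AmazonDynamoDBClientBuilder.defaultClient();

        // Add a GSI keyed on the top-level "mode" attribute to an existing table.
        client.updateTable(new UpdateTableRequest()
                .withTableName("MyTable")
                .withAttributeDefinitions(new AttributeDefinition("mode", ScalarAttributeType.S))
                .withGlobalSecondaryIndexUpdates(new GlobalSecondaryIndexUpdate()
                        .withCreate(new CreateGlobalSecondaryIndexAction()
                                .withIndexName("mode-index")
                                .withKeySchema(new KeySchemaElement("mode", KeyType.HASH))
                                .withProjection(new Projection().withProjectionType(ProjectionType.ALL))
                                .withProvisionedThroughput(new ProvisionedThroughput(5L, 5L)))));
    }
}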
What will happen if I create a PutItem request like this:
{
  "Expected": {
    "testAttribute": {
      "Exists": "false"
    }
  },
  "Item": {
    "testAttribute": {
      "S": "testValue"
    }
  },
  "TableName": "TableName"
}
where "testAttribute" is not part of the primary key.
Will DynamoDB scan the table to see if there is an item with attribute "testAttribute" == "testValue"?
If not, how will DynamoDB determine the presence of a "testAttribute" == "testValue" ?
I can't find anything in the docs describing how this works.
According to the documentation of the PutItem action, you are not allowed to issue that request. It says:
Item: A map of attribute name/value pairs, one for each attribute. Only the primary key attributes are required; you can optionally provide other attribute name-value pairs for the item.
(emphasis mine)
You must provide a value for each attribute of the primary key whenever you use PutItem.
This way, as you will surely agree, it is very simple and fast for DynamoDB to check the condition you defined on the Expected clause: no scan is needed, it just has to look at the single item that could match the request. Otherwise, as you noted, DynamoDB would need to perform a full table scan (and it would possibly be very slow, and they would certainly charge you for that) or it would need to maintain a consistent index of every single item in a table, and they would charge you for the SSD space used to store it!
Also, note that the meaning of the expected clause is a little bit different than what you described in the question. Supposing you fix your request and add all the primary key attributes, the request would mean:
"If the item identified by this primary key does not exist, create it; if it does exist and does not contain an attribute named testAttribute, replace the item with the one whose attributes are described in this request; if the item does exist and does contain an attribute named testAttribute, do nothing".
Your description says that DynamoDB would check whether the value of testAttribute is testValue, but that is not what happens when you use the Expected/Exists clause. To achieve the effect you described, you need to use the Expected/Value clause and specify the value you are expecting there; the value given for the attribute in the Item property of the request is only used to define the new value of the attribute, if an update (or insert) is to occur.
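To make this concrete, here is a hedged sketch with the AWS SDK for Java (low-level API) of the corrected request: the primary key is included and the Expected/Exists=false condition is attached. The key attribute name id is an assumption:
import com.amazonaws.services.dynamodbv2.AmazonDynamoDB;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClientBuilder;
import com.amazonaws.services.dynamodbv2.model.AttributeValue;
import com.amazonaws.services.dynamodbv2.model.ConditionalCheckFailedException;
import com.amazonaws.services.dynamodbv2.model.ExpectedAttributeValue;
import com.amazonaws.services.dynamodbv2.model.PutItemRequest;

public class ConditionalPut {
    public static void main(String[] args) {
        AmazonDynamoDB client = AmazonDynamoDBClientBuilder.defaultClient();

        PutItemRequest request = new PutItemRequest()
                .withTableName("TableName")
                // The item must include every primary key attribute ("id" is assumed here).
                .addItemEntry("id", new AttributeValue("item-1"))
                .addItemEntry("testAttribute", new AttributeValue("testValue"))
                // Expected/Exists=false: succeed only if the item identified by the key
                // does not already contain a testAttribute.
                .addExpectedEntry("testAttribute", new ExpectedAttributeValue(false));

        try {
            client.putItem(request);
        } catch (ConditionalCheckFailedException e) {
            // The item exists and already contains testAttribute; nothing was written.
            System.out.println("Condition failed: " + e.getMessage());
        }
    }
}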