GetItem from Secondary Index with DynamoDB - amazon-dynamodb

I'm just getting started with DynamoDB and have set up an 'Accounts' table.
I've set up a secondary index so I can query by an API user name and API key.
Neither of these values is the primary key, as both are volatile and can be changed.
The Table is built with
TableName: "Accounts",
KeySchema: [
{ AttributeName: "id", KeyType: "HASH" },
{ AttributeName: "email", KeyType: "RANGE" }
],
AttributeDefinitions: [
{ AttributeName: "id", AttributeType: "S" },
{ AttributeName: "email", AttributeType: "S" }
]
And the Index is
TableName: 'Accounts',
AttributeDefinitions: [
{AttributeName: 'name', AttributeType: 'S'},
{AttributeName: 'apiKey', AttributeType: 'S'}
],
GlobalSecondaryIndexUpdates: [
{
Create: {
IndexName: "ApiAccounts",
ProvisionedThroughput: {
ReadCapacityUnits: 1, WriteCapacityUnits: 1
},
KeySchema: [
{AttributeName: 'name', KeyType: "HASH"},
{AttributeName: 'apiKey', KeyType: "RANGE"}
],
Projection: {
ProjectionType: "KEYS_ONLY"
},
I'm now trying to get a user's account by querying the ApiAccounts index.
I'm trying
dynamoClient.get({
TableName: 'Accounts',
IndexName: 'ApiAccounts',
Key: {
name: nameKeyArray[0],
apiKey: nameKeyArray[1]
}
}, callback)
But I am getting the error "One of the required keys was not given a value", which leads me to believe I can't do a 'get' on an index, or that I'm not referring to the index properly. Can somebody clarify this for me?
The name and API key are unique together, so I would like to avoid a query or scan if possible.

This isn't very clear from the official docs: you can perform a Scan or Query operation on a GSI, but not a GetItem operation.
Every item in a table must have a unique combination of HASH and RANGE keys.
For example:
// assume dummy api putItem(id, email, name, apiKey)
account.putItem("1", "abc#email.com", "john", "key1") // OK
account.putItem("1", "abc#email.com", "john", "key1") // does NOT add a second item: id and email are the table HASH and RANGE keys and must be unique, so this put replaces the first item
But in an index, the HASH and RANGE keys do not have to be unique, so the index may contain several items with the same key values.
For example:
// assume dummy api putItem(id, email, name, apiKey)
account.putItem("1", "abc#email.com", "john", "key1") // OK
account.putItem("1", "bcd#email.com", "john", "key1") // OK
Or:
// assume dummy api putItem(id, email, name, apiKey)
account.putItem("1", "abc#email.com", "john", "key1") // OK
account.putItem("2", "abc#email.com", "john", "key1") // OK
Java
http://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/dynamodbv2/document/Index.html
Index implements QueryApi and ScanApi but not GetItemApi.
JavaScript
http://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/DynamoDB.html#getItem-property
GetItem does not accept IndexName as a parameter.
http://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/DynamoDB.html#query-property
Query accepts IndexName as a parameter.
http://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/DynamoDB.html#scan-property
Scan accepts IndexName as a parameter.
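A minimal sketch of the Query-based lookup for the table in the question, assuming the AWS SDK for JavaScript v2 DocumentClient that the question uses. The helper name `buildApiAccountQuery` is hypothetical. `name` is a DynamoDB reserved word, so the expression goes through `ExpressionAttributeNames` placeholders. Because the index projects KEYS_ONLY, the result should contain only id, email, name and apiKey; a follow-up get on the table with id and email would be needed for the remaining attributes.

```javascript
// Hypothetical helper: builds Query parameters for the ApiAccounts index.
function buildApiAccountQuery(name, apiKey) {
  return {
    TableName: 'Accounts',
    IndexName: 'ApiAccounts',
    // Both index key attributes are matched exactly.
    KeyConditionExpression: '#n = :n AND #k = :k',
    // Placeholders avoid a clash with the reserved word "name".
    ExpressionAttributeNames: { '#n': 'name', '#k': 'apiKey' },
    ExpressionAttributeValues: { ':n': name, ':k': apiKey }
  };
}

const apiAccountParams = buildApiAccountQuery('john', 'key1');
console.log(JSON.stringify(apiAccountParams, null, 2));

// With a DocumentClient instance (assumed to be `dynamoClient`, as in the question):
// dynamoClient.query(apiAccountParams, (err, data) => {
//   // data.Items may hold zero, one or several matches, since the index
//   // does not enforce uniqueness.
// });
```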

Related

What are the options for querying a DynamoDB GSI to find matching substrings?

I have a dynamodb table with a structure something like:
{
Type: 'AWS::DynamoDB::Table',
Properties: {
KeySchema: [
{
AttributeName: 'itemID',
KeyType: 'HASH'
}
],
AttributeDefinitions: [
{
AttributeName: 'itemID',
AttributeType: 'S'
},
{
AttributeName: 'label',
AttributeType: 'S'
}
],
GlobalSecondaryIndexes: [
{
IndexName: 'byLabel',
KeySchema: [
{
AttributeName: 'label',
KeyType: 'HASH'
}
],
Projection: {
ProjectionType: 'ALL'
}
}
]
}
}
For one of my access patterns I want to get all items whose label contains a specific substring. I'm very new to DynamoDB, so I am unsure of the best way to achieve this.
I have it working for equality, as shown below. However, I would prefer more results, so returning items whose label contains the input would be ideal.
Current request VTL template:
{
"version": "2018-05-29",
"operation": "Query",
"query": {
"expression": "label = :label",
"expressionValues": {
":label": $util.dynamodb.toDynamoDBJson($context.arguments.label)
}
},
"index": "byLabel"
}
I am looking for advice on how this is usually done with DynamoDB.
When you do a Query in DynamoDB, you must provide one exact (complete) hash key value.
If you want to do a substring match on your hash key, you cannot use a Query. Instead you will need to do a Scan that includes a FilterExpression.
Whilst this Scan will return only the items you want, it will evaluate every single item in your table, so it will cost you more RCUs and may slow down as your table grows (although you can offset this by using a parallel scan).
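The Scan with a FilterExpression described above could be sketched as follows. This is an illustration under assumptions, not the asker's resolver: the table name 'Items' and both helper names are hypothetical, and `scanPage` stands in for whatever performs one Scan call (for example `p => docClient.scan(p).promise()` with the SDK v2 DocumentClient). The `contains` function is case-sensitive, and the filter is applied after each page is read, so the loop follows LastEvaluatedKey until the whole table has been scanned.

```javascript
// Hypothetical helper: Scan parameters that keep only items whose label
// contains the given substring. 'Items' is an assumed table name.
function buildLabelContainsScan(substring, exclusiveStartKey) {
  const params = {
    TableName: 'Items',
    FilterExpression: 'contains(#label, :sub)',
    ExpressionAttributeNames: { '#label': 'label' },
    ExpressionAttributeValues: { ':sub': substring }
  };
  // Continue from the previous page when a start key is supplied.
  if (exclusiveStartKey) params.ExclusiveStartKey = exclusiveStartKey;
  return params;
}

// Collects matches from every page. scanPage is any function that takes
// Scan parameters and returns a promise of { Items, LastEvaluatedKey }.
async function scanAllByLabel(scanPage, substring) {
  const items = [];
  let startKey;
  do {
    const page = await scanPage(buildLabelContainsScan(substring, startKey));
    items.push(...(page.Items || []));
    startKey = page.LastEvaluatedKey;
  } while (startKey);
  return items;
}

console.log(buildLabelContainsScan('foo'));
```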

Query condition missed key schema element with a secondary index field

This is my first time using DynamoDB and the Serverless Framework. I am trying to create a simple todo app, with todoId as the primary key and userId as a secondary index. This is my table definition in serverless.yaml, but when I try to get the todo list of a user, I get the above error.
resources:
Resources:
GroupsDynamoDBTable:
Type: AWS::DynamoDB::Table
Properties:
AttributeDefinitions:
- AttributeName: todoId
AttributeType: S
- AttributeName: userId
AttributeType: S
KeySchema:
- AttributeName: todoId
KeyType: HASH
BillingMode: PAY_PER_REQUEST
TableName: ${self:provider.environment.TODOLIST_TABLE}
GlobalSecondaryIndexes:
- IndexName: ${self:provider.environment.USER_ID_INDEX}
KeySchema:
- AttributeName: userId
KeyType: HASH
Projection:
ProjectionType: ALL
query:
const result = await docClient
.query({
TableName: toDoListTable,
KeyConditionExpression: 'userId = :userId',
ExpressionAttributeValues: {
':userId': 5
},
ScanIndexForward: false
})
.promise()
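Since you are querying a global secondary index, you must specify the name of the index in IndexName. Without it, the key condition is checked against the table's own key schema (todoId), which produces this error. A ProjectionExpression listing the attributes to return is optional.
The query should look like this: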
const result = await docClient
.query({
TableName: toDoListTable,
IndexName: "UserIdIndex", // the name specified here: self:provider.environment.USER_ID_INDEX
KeyConditionExpression: "userId = :userId",
ExpressionAttributeValues: {
":userId": "5" // userId is declared with AttributeType S, so the value must be a string
},
ProjectionExpression: "todoId, userId",
ScanIndexForward: false
})
.promise()

DynamoDB,how to query with BEGINS_WITH

I'm using DocumentClient for queries, together with the Serverless Framework and DynamoDB.
I'm trying to query with BEGINS_WITH without providing any primary key.
Here is what my data looks like:
[
{
id: 1,
some_string: "77281829121"
},
{
id: 2,
some_string: "7712162hgvh"
},
{
id: 3,
some_string: "7212121"
}
]
Here is my serverless.yml (i.e. the table config, I guess):
Resources:
IPRecord:
Type: 'AWS::DynamoDB::Table'
Properties:
TableName: ${file(./serverless.js):Tables.IPRecord.name}
BillingMode: PAY_PER_REQUEST
AttributeDefinitions:
- AttributeName: 'id'
AttributeType: 'S'
- AttributeName: 'some_string'
AttributeType: 'S'
KeySchema:
- AttributeName: 'id'
KeyType: 'HASH'
GlobalSecondaryIndexes:
- IndexName: ${file(./serverless.js):Tables.IPRecord.index.ID}
KeySchema:
# ...some more index goes here
- AttributeName: 'some_string'
KeyType: 'RANGE'
Projection:
ProjectionType: 'ALL'
Q:
Using DocumentClient, I want to query with the first few characters of some_string,
which should return all the matching docs.
For example, in this case I want to query {some_string:"77"} and have it return
[{
id: 1,
some_string: "77281829121"
},
{
id: 2,
some_string: "7712162hgvh"
}]
Currently my query looks like this (it gives an error; running in the local DynamoDB JS shell):
var params = {
TableName: '<TABLE_NAME>',
IndexName: '<INDEX_NAME>',
KeyConditionExpression: 'begins_with(some_string,:value)',
ExpressionAttributeValues: {
':value': '77'
}
};
docClient.query(params, function(err, data) {
if (err) ppJson(err);
else ppJson(data);
});
It seems the above query needs a primary key, which in my case is id. If I pass that, it will point to a single doc.
Here is what I have achieved so far:
var params = {
TableName: '<TABLE_NAME>',
FilterExpression: 'begins_with(some_string,:value)',
ExpressionAttributeValues: {
':value': '77'
},
Select:'COUNT' //as i only required COUNT
};
docClient.scan(params, function(err, data) {
if (err) ppJson(err);
else ppJson(data);
});
The above scan does what I want, but any better approach or solution is always welcome.
If the number of characters in your begins_with query is always going to be arbitrary, I don't see a way to solve it with DynamoDB.
But let's say there are always going to be at least 3 characters. Then you can do the following.
Update your DynamoDB schema to
IPRecord:
Type: 'AWS::DynamoDB::Table'
Properties:
TableName: ${file(./serverless.js):Tables.IPRecord.name}
BillingMode: PAY_PER_REQUEST
AttributeDefinitions:
- AttributeName: 'id'
AttributeType: 'S'
- AttributeName: 'some_string'
AttributeType: 'S'
KeySchema:
- AttributeName: 'id'
KeyType: 'HASH'
- AttributeName: 'some_string'
KeyType: 'RANGE'
And instead of storing
[
{
id: 1,
some_string: "77281829121"
},
{
id: 2,
some_string: "7712162hgvh"
},
{
id: 3,
some_string: "7212121"
}
]
store as
[
{
id: 772,
uniqueid:1,
some_string: "77281829121"
},
{
id: 771,
uniqueid:2,
some_string: "7712162hgvh"
},
{
id: 721,
uniqueid:3,
some_string: "7212121"
}
]
Where id is always the first 3 characters of the original some_string.
Now let's say you have to query all items that start with abcx. You can do
select * where id=abc and some_string startswith abcx
But you should try to use as many characters as possible in id so that load is spread evenly across partitions. For example, with 36 possible characters per position, 2 characters allow only 36*36 ids, while 3 characters allow 36*36*36.
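The pseudo-query above could be expressed with DocumentClient parameters roughly as follows. This is a sketch assuming the revised schema (id as HASH, some_string as RANGE) and a 3-character prefix; the helper names and the `PREFIX_LENGTH` constant are hypothetical. Note that under this schema two items with the same some_string would share one primary key.

```javascript
// Assumption taken from the answer: queries always carry at least 3 characters.
const PREFIX_LENGTH = 3;

// Shapes an item for storage: id is the first PREFIX_LENGTH characters of some_string.
function toStoredItem(uniqueid, someString) {
  return {
    id: someString.slice(0, PREFIX_LENGTH),
    uniqueid: uniqueid,
    some_string: someString
  };
}

// Builds Query parameters for "all items whose some_string starts with prefix".
function buildPrefixQuery(tableName, prefix) {
  if (prefix.length < PREFIX_LENGTH) {
    throw new Error('prefix must have at least ' + PREFIX_LENGTH + ' characters');
  }
  return {
    TableName: tableName,
    // Exact match on the hash key, begins_with on the range key.
    KeyConditionExpression: 'id = :id AND begins_with(some_string, :prefix)',
    ExpressionAttributeValues: {
      ':id': prefix.slice(0, PREFIX_LENGTH),
      ':prefix': prefix
    }
  };
}

console.log(toStoredItem(1, '77281829121'));
console.log(buildPrefixQuery('<TABLE_NAME>', '7728'));
// With the DocumentClient from the question:
// docClient.query(buildPrefixQuery('<TABLE_NAME>', '7728'), callback);
```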

How to query table by non primary key attribute in Amazons DynamoDB with DocumentClient.get

I seem to be having a lot of difficulty querying data with AWS.DynamoDB.DocumentClient().get().
I am using Serverless and set up my serverless.yml with this schema:
resources:
Resources:
ShortUrlsTable:
Type: "AWS::DynamoDB::Table"
Properties:
AttributeDefinitions:
- AttributeName: id
AttributeType: S
- AttributeName: longUrl
AttributeType: S
- AttributeName: shortPath
AttributeType: S
KeySchema:
- AttributeName: id
KeyType: HASH
GlobalSecondaryIndexes:
- IndexName: longUrlIndex
KeySchema:
- AttributeName: longUrl
KeyType: HASH
ProvisionedThroughput:
ReadCapacityUnits: 1
WriteCapacityUnits: 1
Projection:
ProjectionType: ALL
- IndexName: shortPathIndex
KeySchema:
- AttributeName: shortPath
KeyType: HASH
Projection:
ProjectionType: ALL
ProvisionedThroughput:
ReadCapacityUnits: 1
WriteCapacityUnits: 1
ProvisionedThroughput:
ReadCapacityUnits: 1
WriteCapacityUnits: 1
TableName: ${self:custom.tableName}
What I want to be able to do is search the DB for a shortUrlItem using either longUrl or shortPath.
So far I have this set up:
dynamoDb = new AWS.DynamoDB.DocumentClient()
app.get("/:longUrl", (req, res) => {
const {longUrl} = req.params
const getParams = {
TableName: SHORT_URLS_TABLE,
Key: {longUrl},
}
dynamoDb.get(getParams, (error, result) => {
res.send({...error, ...result})
})
})
All I seem to be getting is this error message returned to me:
"message":"The provided key element does not match the schema","code":"ValidationException","time":"2018-08-17T20:39:27.765Z","requestId":"4RKNVG7ET1ORVF10H71M7AUABRVV4KQNSO5AEMVJF66Q9ASUAAJG","statusCode":400,"retryable":false,"retryDelay":21.513795782119505,"TableName":"short-urls-table-dev"
I cannot seem to figure out if I am querying correctly or setting up my schema correctly for the secondary index to be the searchable key in my table.
I can see two mistakes.
1: Your getParams are wrong. You make a get request against the primary key, but you provide the GSI key in the Key section. It should be like
const getParams = {
TableName: SHORT_URLS_TABLE,
Key: {
id: id, // Because id is the attribute of your HASH key.
}
}
This is the reason for the error: your table's hash key is id, not longUrl.
2: In any case, you can't make a get request on a GSI. That's not how GSIs are designed. A GSI does not enforce uniqueness, so there can be multiple items with the same GSI hash key; therefore you can only query, not get.
What you are trying to do is something like
const queryParams = {
TableName: SHORT_URLS_TABLE,
IndexName: 'longUrlIndex',
KeyConditionExpression: 'longUrl = :longUrlValue',
ExpressionAttributeValues: {
':longUrlValue': longUrl
}
};
dynamoDb.query(queryParams, (error, result) => {
res.send({...error, ...result})
})

Number of attributes in key schema must match the number of attributes defined in attribute definitions

I’m trying to create a simple table using DynamoDB JavaScript shell and I’m getting this exception:
{
"message": "The number of attributes in key schema must match the number of attributes defined in attribute definitions.",
"code": "ValidationException",
"time": "2015-06-16T10:24:23.319Z",
"statusCode": 400,
"retryable": false
}
Below is the table I’m trying to create:
var params = {
TableName: 'table_name',
KeySchema: [
{
AttributeName: 'hash_key_attribute_name',
KeyType: 'HASH'
}
],
AttributeDefinitions: [
{
AttributeName: 'hash_key_attribute_name',
AttributeType: 'S'
},
{
AttributeName: 'attribute_name_1',
AttributeType: 'S'
}
],
ProvisionedThroughput: {
ReadCapacityUnits: 1,
WriteCapacityUnits: 1
}
};
dynamodb.createTable(params, function(err, data) {
if (err) print(err);
else print(data);
});
However, if I add the second attribute to the KeySchema, it works fine. Below is the working table:
var params = {
TableName: 'table_name',
KeySchema: [
{
AttributeName: 'hash_key_attribute_name',
KeyType: 'HASH'
},
{
AttributeName: 'attribute_name_1',
KeyType: 'RANGE'
}
],
AttributeDefinitions: [
{
AttributeName: 'hash_key_attribute_name',
AttributeType: 'S'
},
{
AttributeName: 'attribute_name_1',
AttributeType: 'S'
}
],
ProvisionedThroughput: {
ReadCapacityUnits: 1,
WriteCapacityUnits: 1
}
};
dynamodb.createTable(params, function(err, data) {
if (err) print(err);
else print(data);
});
I don’t want to add the range to key schema. Any idea how to fix it?
TL;DR: Don't include any non-key attribute definitions in AttributeDefinitions.
DynamoDB is schemaless (except for the key schema).
That is to say, you do need to specify the key schema (attribute name and type) when you create the table, but you don't need to specify any non-key attributes. You can put an item with any attributes later (it must include the keys, of course).
From the documentation page, AttributeDefinitions is defined as:
An array of attributes that describe the key schema for the table and indexes.
When you create a table, the AttributeDefinitions field is used for the hash and/or range keys only. In your first case, there is only a hash key (1 attribute) while you provide 2 AttributeDefinitions. This is the root cause of the exception.
If you put a non-key attribute in AttributeDefinitions, you must use it in an index; otherwise it goes against the way DynamoDB works.
So there is no need to put a non-key attribute in AttributeDefinitions if you're not going to use it as an index key or primary key.
var params = {
TableName: 'table_name',
KeySchema: [ // The type of schema. Must start with a HASH type, with an optional second RANGE.
{ // Required HASH type attribute
AttributeName: 'UserId',
KeyType: 'HASH',
},
{ // Optional RANGE key type for HASH + RANGE tables
AttributeName: 'RemindTime',
KeyType: 'RANGE',
}
],
AttributeDefinitions: [ // The names and types of all primary and index key attributes only
{
AttributeName: 'UserId',
AttributeType: 'S', // (S | N | B) for string, number, binary
},
{
AttributeName: 'RemindTime',
AttributeType: 'S', // (S | N | B) for string, number, binary
},
{
AttributeName: 'AlarmId',
AttributeType: 'S', // (S | N | B) for string, number, binary
},
// ... more attributes ...
],
ProvisionedThroughput: { // required provisioned throughput for the table
ReadCapacityUnits: 1,
WriteCapacityUnits: 1,
},
LocalSecondaryIndexes: [ // optional (list of LocalSecondaryIndex)
{
IndexName: 'index_UserId_AlarmId',
KeySchema: [
{ // Required HASH type attribute - must match the table's HASH key attribute name
AttributeName: 'UserId',
KeyType: 'HASH',
},
{ // alternate RANGE key attribute for the secondary index
AttributeName: 'AlarmId',
KeyType: 'RANGE',
}
],
Projection: { // required
ProjectionType: 'ALL', // (ALL | KEYS_ONLY | INCLUDE)
},
},
// ... more local secondary indexes ...
],
};
dynamodb.createTable(params, function(err, data) {
if (err) ppJson(err); // an error occurred
else ppJson(data); // successful response
});
Declare attributes in AttributeDefinitions only if you are going to use them in KeySchema
OR
if they are going to be used in GlobalSecondaryIndexes or LocalSecondaryIndexes.
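The rule above can be checked before calling createTable. The following is a small sketch, not part of any SDK: `findUnusedAttributeDefinitions` is a hypothetical helper that lists the attributes declared in AttributeDefinitions but not used by the table's KeySchema or by any index's KeySchema. The sample parameters are the failing ones from the question.

```javascript
// Hypothetical helper: returns names declared in AttributeDefinitions that
// no key schema (table, GSI or LSI) actually uses.
function findUnusedAttributeDefinitions(params) {
  const used = new Set();
  const collect = (keySchema) =>
    (keySchema || []).forEach((k) => used.add(k.AttributeName));

  collect(params.KeySchema);
  (params.GlobalSecondaryIndexes || []).forEach((idx) => collect(idx.KeySchema));
  (params.LocalSecondaryIndexes || []).forEach((idx) => collect(idx.KeySchema));

  return (params.AttributeDefinitions || [])
    .map((d) => d.AttributeName)
    .filter((name) => !used.has(name));
}

// The failing parameters from the question: one key, two definitions.
const failingParams = {
  TableName: 'table_name',
  KeySchema: [{ AttributeName: 'hash_key_attribute_name', KeyType: 'HASH' }],
  AttributeDefinitions: [
    { AttributeName: 'hash_key_attribute_name', AttributeType: 'S' },
    { AttributeName: 'attribute_name_1', AttributeType: 'S' }
  ]
};

console.log(findUnusedAttributeDefinitions(failingParams)); // [ 'attribute_name_1' ]
```

An empty result means every declared attribute is used as a key somewhere, which is what the error message in the question asks for.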
For anybody using YAML files:
Example 1:
Let's say you have 3 attributes: id, status, createdAt.
Here id is the only key attribute (the KeySchema).
AuctionsTable:
Type: AWS::DynamoDB::Table
Properties:
TableName: AuctionsTable
BillingMode: PAY_PER_REQUEST
AttributeDefinitions:
- AttributeName: id
AttributeType: S
KeySchema:
- AttributeName: id
KeyType: HASH
Example 2:
For the same attributes (i.e. id, status and createdAt), if you have GlobalSecondaryIndexes or LocalSecondaryIndexes as well, then your YAML file looks like:
AuctionsTable:
Type: AWS::DynamoDB::Table
Properties:
TableName: AuctionsTable-${self:provider.stage}
BillingMode: PAY_PER_REQUEST
AttributeDefinitions:
- AttributeName: id
AttributeType: S
- AttributeName: status
AttributeType: S
- AttributeName: endingAt
AttributeType: S
KeySchema:
- AttributeName: id
KeyType: HASH
GlobalSecondaryIndexes:
- IndexName: statusAndEndDate
KeySchema:
- AttributeName: status
KeyType: HASH
- AttributeName: endingAt
KeyType: RANGE
Projection:
ProjectionType: ALL
We have included status and endingAt in AttributeDefinitions only because we have a GlobalSecondaryIndex which uses those attributes.
Reason: DynamoDB only cares about the Primary Key, GlobalSecondaryIndex and LocalSecondaryIndex. You don't need to specify any other types of attributes which are not part of the above mentioned trio.
DynamoDB is only concerned with Primary Key, GlobalSecondaryIndex and LocalSecondaryIndex for partitioning. It doesn't care what other attributes you have for an item.
I also had this problem and I'll post here what went wrong for me in case it helps someone else.
In my CreateTableRequest, I had an empty array for the GlobalSecondaryIndexes.
CreateTableRequest createTableRequest = new CreateTableRequest
{
TableName = TableName,
ProvisionedThroughput = new ProvisionedThroughput { ReadCapacityUnits = 2, WriteCapacityUnits = 2 },
KeySchema = new List<KeySchemaElement>
{
new KeySchemaElement
{
AttributeName = "Field1",
KeyType = KeyType.HASH
},
new KeySchemaElement
{
AttributeName = "Field2",
KeyType = KeyType.RANGE
}
},
AttributeDefinitions = new List<AttributeDefinition>()
{
new AttributeDefinition
{
AttributeName = "Field1",
AttributeType = ScalarAttributeType.S
},
new AttributeDefinition
{
AttributeName = "Field2",
AttributeType = ScalarAttributeType.S
}
},
//GlobalSecondaryIndexes = new List<GlobalSecondaryIndex>
//{
//}
};
Commenting out these lines in the table creation solved my problem. So I guess the list has to be null, not empty.
Do not include every attribute in --attribute-definitions and --key-schema. Include only the HASH and RANGE keys when creating the table.
When you insert an item into DynamoDB, it will also accept other attributes that were not defined in the above attribute definitions/schema.
For example:
Creating table:
aws dynamodb create-table \
--table-name Orders \
--attribute-definitions \
AttributeName=id,AttributeType=S \
AttributeName=sid,AttributeType=S \
--key-schema \
AttributeName=id,KeyType=HASH \
AttributeName=sid,KeyType=RANGE \
--provisioned-throughput \
ReadCapacityUnits=5,WriteCapacityUnits=5 \
--endpoint-url=http://localhost:4566
Now you can insert an item containing other attributes too; only id and sid have to be present in the item.
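To illustrate that last point, here is a minimal sketch that checks an item against the Orders key schema before it is put. `missingKeyAttributes` is a hypothetical helper and the item contents (total, note) are made up for the example; only id and sid come from the table definition above.

```javascript
// Key schema of the Orders table created above.
const ordersKeySchema = [
  { AttributeName: 'id', KeyType: 'HASH' },
  { AttributeName: 'sid', KeyType: 'RANGE' }
];

// Hypothetical helper: lists key attributes the item does not provide.
function missingKeyAttributes(item, keySchema) {
  return keySchema
    .map((k) => k.AttributeName)
    .filter((name) => item[name] === undefined || item[name] === null);
}

// Extra, undeclared attributes (total, note) are fine as long as the keys are present.
const order = { id: 'o-1', sid: 's-1', total: 42, note: 'gift wrap' };
console.log(missingKeyAttributes(order, ordersKeySchema));

// This item lacks sid, so a put would be rejected.
console.log(missingKeyAttributes({ id: 'o-2', total: 10 }, ordersKeySchema));
```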

Resources