Dynamoose + DynamoDB query logic - amazon-dynamodb

I have a DynamoDB table with a primary key (_id) being a simple int. I want to get the highest value for the primary key.
How do I return the item in the table with the highest _id?
I can use either the Amazon JavaScript API or the Dynamoose library.

Partition keys are not stored in order. You would need to scan the entire table, stream over the items, map to the _id attribute and then return the maximum value.
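As a rough sketch of that scan-and-reduce approach (assuming `docClient` is an `AWS.DynamoDB.DocumentClient` from SDK v2 and that `_id` is a number; the helper names `maxId` and `findMaxId` are made up for illustration):

```javascript
// Fold one page of scanned items into the running maximum _id.
function maxId(items, currentMax) {
  return items.reduce(
    (max, item) => (max === null || item._id > max ? item._id : max),
    currentMax
  );
}

// Scan every page of the table and return the largest _id (null if the table is empty).
// docClient is assumed to be an AWS.DynamoDB.DocumentClient; tableName is your table's name.
async function findMaxId(docClient, tableName) {
  let max = null;
  let startKey; // undefined on the first request
  do {
    const page = await docClient.scan({
      TableName: tableName,
      // "_id" starts with an underscore, so it is referenced through a placeholder
      ProjectionExpression: '#id',
      ExpressionAttributeNames: { '#id': '_id' },
      ExclusiveStartKey: startKey,
    }).promise();
    max = maxId(page.Items, max);
    startKey = page.LastEvaluatedKey; // present while more pages remain
  } while (startKey);
  return max;
}
```

This still reads the whole table, so it costs read capacity in proportion to the table size; it only avoids holding all the items in memory at once.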

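You can create a Global Secondary Index in which _id is the sort key (here with status as the partition key), then query that index in descending order and take the first item:
var params = {
  TableName: 'Devices',
  IndexName: 'status-_id-index', // hypothetical name; use your GSI's actual name
  // "status" is a DynamoDB reserved word, so it is referenced through a placeholder
  KeyConditionExpression: '#status = :status',
  ExpressionAttributeNames: { '#status': 'status' },
  ExpressionAttributeValues: {
    ':status': status
  },
  ScanIndexForward: false, // true = ascending, false = descending
  Limit: 1 // only the item with the highest _id
};
docClient.query(params, function(err, data) {});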

Related

Dynamo DB leader board within an Alexa Skill

I am attempting to create a leader board using DynamoDB for a quiz-style Alexa skill. I have set up the table, and users are added to it with their data, e.g.:
Item: {
"PlatformId": 2,
"UserId": 12345,
"Score": 100,
"NickName": "scott",
"Sport": "football",
}
In my table the partition key is the UserId and the sort key is the PlatformId (this is the same for all users). I have a global secondary index that uses PlatformId as the partition key and Score as the sort key.
In this leader board I want users to be ranked, with the highest scorer at number 1. My first attempt was to scan the table using the secondary index. This returned all the users sorted by score, but with potentially thousands of users on the leader board, I discovered that scanning a table with 10000+ users exceeds the 8 second response time that Alexa skills have. This causes the skill to error and close.
Before the response time was exceeded, I was using the LastEvaluatedKey to perform an extra scan if the first one didn't cover the entire table, but it was on this second scan that the response time limit was exceeded. It is simply taking too long to scan the table.
dbHelper.prototype.scanGetUsers = (ad, newParams = null) => {
return new Promise((resolve, reject) => {
let params = {};
if (newParams != null) {
params = newParams
} else {
params = {
TableName: tableName,
IndexName: 'PlatformId-Score-index',
FilterExpression: "Score >= :s AND PlatformId = :p",
ProjectionExpression: `NickName, Sport, Score`,
// Limit: 10,
ExpressionAttributeValues: {
":p": User.PlatformId,
":s": User.Score,
},
}
}
docClient.scan(params, function (err, data) {
if (err || !data) {
console.error("Unable to read item. Error JSON:", JSON.stringify(err, null, 2));
return reject(JSON.stringify(err, null, 2))
} else {
console.log("scan users data succeeded:", JSON.stringify(data, null, 2));
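if (data.LastEvaluatedKey) {
console.log("found a LastEvaluatedKey, continuing scan");
params.ExclusiveStartKey = data.LastEvaluatedKey;
// scanGetUsers returns a promise, so wait for the next page and merge its Items
return dbHelper.prototype.scanGetUsers(ad, params).then((next) => {
data.Items = data.Items.concat(next.Items);
resolve(data);
}, reject);
}
resolve(data);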
}
});
});
}
Is there a way to work around these issues that I haven't explored yet? Or a way to create a leader board with DynamoDB that can be structured more simply?
You can try using a sort key:
When you combine a partition key and sort key, they create a composite key, and that composite key is the primary key for individual items in a table. With a composite key, you gain the ability to use queries with a KeyConditionExpression against the sort key. In a query, you can use KeyConditionExpression to write conditional statements by using comparison operators that evaluate against a key and limit the items returned. In other words, you can use special operators to include, exclude, and match items by their sort key values.
The article explains how to set this up and use it.
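Applied to the leader board from the question, a Query against the existing index might look like the sketch below. The index and attribute names are taken from the question; the helper names `topScoresParams` and `rankParams` are invented for illustration, and `docClient` is assumed to be an `AWS.DynamoDB.DocumentClient`.

```javascript
// Params for the top N scores on one platform, highest first.
function topScoresParams(tableName, platformId, limit) {
  return {
    TableName: tableName,
    IndexName: 'PlatformId-Score-index',
    KeyConditionExpression: 'PlatformId = :p',
    ExpressionAttributeValues: { ':p': platformId },
    ProjectionExpression: 'NickName, Sport, Score',
    ScanIndexForward: false, // sort key (Score) in descending order
    Limit: limit,            // stop reading after the first N items
  };
}

// Params that count players with a strictly higher score; rank = Count + 1.
// For very large result sets the count may span several pages (LastEvaluatedKey).
function rankParams(tableName, platformId, score) {
  return {
    TableName: tableName,
    IndexName: 'PlatformId-Score-index',
    KeyConditionExpression: 'PlatformId = :p AND Score > :s',
    ExpressionAttributeValues: { ':p': platformId, ':s': score },
    Select: 'COUNT', // return only the number of matching items
  };
}

// Usage, with a DocumentClient instance:
// docClient.query(topScoresParams('Users', 2, 10), function (err, data) { /* data.Items */ });
```

Because the key condition is evaluated against the index itself, DynamoDB only reads the items it returns (or counts), rather than every user in the table.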

Using IN operator in DynamoDB from Boto3

I'm new to DynamoDB and trying to query a table based off the presence of a list of certain values for a field.
I have a field doc_id, which is also a secondary index, and I'd like to return all results where doc_id is contained in a list of values.
I'm trying something like this:
response = table.query(
IndexName='doc_id-index',
FilterExpression=In(['27242226'])
)
But clearly that is not correct.
Can anyone point me in the right direction?
Thanks!
With the Query operation:
A FilterExpression does not allow key attributes. You cannot define a filter expression based on a partition key or a sort key.
So your doc_id field, which is the partition key of doc_id-index, cannot be used in a FilterExpression.
Note
A FilterExpression is applied after the items have already been read; the process of filtering does not consume any additional read capacity units.
To show how to implement the IN operation with Query, I'm assuming you have another field such as userId:
var params = {
TableName: 'tbl',
IndexName: 'doc_id-index',
KeyConditionExpression: 'doc_id= :doc_id',
FilterExpression: 'userId IN (:userId1,:userId2)',//you can add more userId here
ExpressionAttributeValues: {
':doc_id':100,
':userId1':11,
':userId2':12
}
};
If you have more userId values, build the FilterExpression dynamically.
In your case, however, you can use the Scan operation:
var params = {
TableName : "tbl",
FilterExpression : "doc_id IN (:doc_id1, :doc_id2)",
ExpressionAttributeValues : {
":doc_id1" :100,
":doc_id2" :101
}
};
You can even build the FilterExpression dynamically, as below:
var documentsId = ["100", "101","200",...];
var documentsObj = {};
var index = 0;
documentsId.forEach((value)=> {
index++;
var documentKey = ":doc_id"+index;
documentsObj[documentKey.toString()] = value;
});
var params = {
TableName: 'job',
FilterExpression: 'doc_id IN ('+Object.keys(documentsObj).toString()+')',
ExpressionAttributeValues: documentsObj,
};
Note: be careful when using the Scan operation; it is less efficient than Query.
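If the Scan is too expensive, one alternative is to issue a separate Query per doc_id against the index, since a KeyConditionExpression only accepts equality on the partition key and IN is not allowed there. The sketch below assumes a DocumentClient passed in as `docClient`; the helper names are hypothetical and pagination is left out for brevity.

```javascript
// Build one Query request per doc_id.
function queryParamsPerDocId(tableName, indexName, docIds) {
  return docIds.map((docId) => ({
    TableName: tableName,
    IndexName: indexName,
    KeyConditionExpression: 'doc_id = :d',
    ExpressionAttributeValues: { ':d': docId },
  }));
}

// Run the queries in parallel and flatten the results into one array.
// docClient is assumed to be an AWS.DynamoDB.DocumentClient.
// LastEvaluatedKey is ignored here, so each doc_id is limited to one page of results.
async function queryByDocIds(docClient, tableName, indexName, docIds) {
  const pages = await Promise.all(
    queryParamsPerDocId(tableName, indexName, docIds)
      .map((params) => docClient.query(params).promise())
  );
  return pages.reduce((items, page) => items.concat(page.Items), []);
}

// Usage:
// queryByDocIds(docClient, 'tbl', 'doc_id-index', ['27242226']).then(console.log);
```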

convert a dynamodb scan to query

I've set up an API Gateway endpoint that scans a DynamoDB table and returns values based on a condition; my code is below.
var params = {
TableName: 'CarsData',
FilterExpression: '#market_category = :market_category and #vehicle_size = :vehicle_size and #transmission_type = :transmission_type and #price_range = :price_range and #doors = :doors',
ExpressionAttributeNames: {
"#market_category": "market_category",
"#vehicle_size": "vehicle_size",
"#transmission_type": "transmission_type",
"#price_range": "price_range",
"#doors": "doors"
},
ExpressionAttributeValues: {
":market_category": body.market_category,
":vehicle_size": body.vehicle_size,
":transmission_type": body.transmission_type,
":price_range": body.price_range,
":doors": body.doors
}
}
dynamodb.scan(params).promise().then(function (data) {
var uw = data.Items;
console.log(data + "\n" + JSON.stringify(data) + "\n" + JSON.stringify(data.Items));
var res = {
"statusCode": 200,
"headers": {},
"body": JSON.stringify(uw)
};
ctx.succeed(res);
}).catch(function (err) {
console.log(err);
var res = {
"statusCode": 404,
"headers": {},
"body": JSON.stringify({ "status": "error" })
};
ctx.succeed(res);
});
When I run this code, I get the expected result. But while going through some online forums, I learned that scanning is expensive compared to querying, and I can't work out how to change my request from a scan to a query. My primary key is ID. Please let me know how I can do this.
Thanks
A Scan operation is more expensive than a Query operation, in terms of both performance and cost. DynamoDB calculates cost based on the number of read capacity units consumed during processing, not on the number of records returned.
A Query operation finds items based on the primary key (hash key) or a composite primary key (hash key and sort key).
Your schema should be redesigned with a composite primary key (hash key and sort key).
It's not necessary to have an Id column as the primary key, as in an old-school RDBMS. If you are not using Id effectively, remove that column from your schema and redefine the key with other attributes. For example, here I am using Market Category (market_category) as the hash key and Price Range (price_range) as the range key.
var params = {
"TableName": 'CarsData',
"ConsistentRead": true,
//Composite Primary Key in Key Condition Expression
"KeyConditionExpression": "#market_category = :market_category AND #price_range = :price_range",
//Remaining column in filter expression
"FilterExpression": '#vehicle_size = :vehicle_size and #transmission_type = :transmission_type and #doors = :doors',
"ExpressionAttributeNames": {
"#market_category": "market_category",
"#vehicle_size": "vehicle_size",
"#transmission_type": "transmission_type",
"#price_range": "price_range",
"#doors": "doors"
},
"ExpressionAttributeValues": {
":market_category": body.market_category,
":vehicle_size": body.vehicle_size,
":transmission_type": body.transmission_type,
":price_range": body.price_range,
":doors": body.doors
}
}
dynamodb.query(params).promise()
.then(function (data) {
console.log(data);
}).catch(function (err) {
console.log(err);
});
I hope this example gives you some insight into using a composite primary key.
Based on your usage, choose the most widely used columns for the hash and range keys.

Query List of Maps in DynamoDB

I am trying to filter a list of maps from a DynamoDB table that has the following format.
{
id: "Number",
users: {
{ userEmail: abc#gmail.com, age:"23" },
{ userEmail: de#gmail.com, age:"41" }
}
}
I need to get the data of the user with userEmail as "abc#gmail.com". Currently I am doing it using the following DynamoDB query. Is there a more efficient way to solve this?
var params = {
TableName: 'users',
Key:{
'id': id
}
};
var docClient = new AWS.DynamoDB.DocumentClient();
docClient.get(params, function (err, data) {
if (!err) {
const users = data.Item.users;
const filtered = users.filter(function (user) {
return user.userEmail == userEmail;
});
// filtered has the required user in it
}
});
The only way to fetch a single item from DynamoDB directly is by its key, so you need a table whose partition key is the email. The table would look like:
Email (string) - partition key
Id (some-type) - user id
...other relevant user data
Unfortunately, since a nested field cannot be a partition key, you will have to maintain a separate table here and won't be able to use an index in DynamoDB (neither an LSI nor a GSI).
Duplicating data is a common pattern in NoSQL, so there is nothing unusual about it. If you were using Java, you could use the transactions library to ensure that both tables stay in sync.
If you are not going to use Java, you could read the DynamoDB stream of the original table (where emails are nested fields) and update the new table (where emails are partition keys) whenever the original table is updated.
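A stream-driven sync could be sketched roughly as follows. This assumes the stream is configured to include new images, that `users` is stored as a list of maps, and that the caller passes in a DocumentClient plus an unmarshalling function such as `AWS.DynamoDB.Converter.unmarshall`. The function names and the lookup table layout are illustrative only, and removing emails that were deleted from an item is not handled.

```javascript
// Turn one item from the original table into rows for the email-keyed table.
function toEmailItems(item) {
  return (item.users || []).map((u) => ({
    Email: u.userEmail, // partition key of the lookup table
    Id: item.id,        // id of the item in the original table
    age: u.age,
  }));
}

// Sketch of a handler for DynamoDB stream events (for example in a Lambda function).
// docClient: assumed AWS.DynamoDB.DocumentClient
// unmarshall: converts DynamoDB JSON to a plain object
// lookupTable: name of the table keyed by Email (hypothetical)
async function syncEmails(event, docClient, unmarshall, lookupTable) {
  for (const record of event.Records) {
    if (record.eventName === 'REMOVE') continue; // deletions are not handled in this sketch
    const item = unmarshall(record.dynamodb.NewImage);
    for (const row of toEmailItems(item)) {
      await docClient.put({ TableName: lookupTable, Item: row }).promise();
    }
  }
}
```

With the lookup table in place, finding a user by email becomes a single `get` on the Email key instead of loading and filtering the whole `users` list.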

Update Multiple Items with same Hash Key in DynamoDb

I have a DynamoDB table that stores users' videos.
It's structured like this:
{
"userid": 324234234234234234, // Hash key
"videoid": 298374982364723648, // Range key
"user": {
"username": "mario"
}
}
I want to update the username for all videos of a specific user. Is this possible with a single update, or do I have to scan the complete table and update one item at a time?
var params = {
TableName: DDB_TABLE_SCENE,
Key: {
userid: userid,
},
UpdateExpression: "SET username = :username",
ExpressionAttributeValues: { ":username": username },
ReturnValues: "ALL_NEW",
ConditionExpression: 'attribute_exists (userid)'
};
docClient.update(params, function(err, data) {
if (err) fn(err, null);
else fn(err, data.Attributes.username);
});
I receive the following error; I suppose the range key is necessary.
ValidationException: The provided key element does not match the schema
DynamoDB does not support write operations across multiple items (i.e. a single update that touches more than one item). You will have to first scan or query the table, or otherwise generate a list of all the items you'd like to update, and then update them one by one, supplying both the hash key and the range key each time.
DynamoDB does provide a batching API (BatchWriteItem), but it only groups up to 25 put or delete requests per call and does not perform updates. It is not a substitute for the multi-item update you are trying to achieve.
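A minimal sketch of the query-then-update loop, assuming `docClient` is an `AWS.DynamoDB.DocumentClient` and that every item already has a `user` map as in the question (the function names are made up for illustration):

```javascript
// Params for updating the nested username on a single video item.
function updateUsernameParams(tableName, userid, videoid, username) {
  return {
    TableName: tableName,
    Key: { userid: userid, videoid: videoid }, // both hash and range key are required
    // Placeholders avoid clashes with reserved words such as "user"
    UpdateExpression: 'SET #user.#username = :username',
    ExpressionAttributeNames: { '#user': 'user', '#username': 'username' },
    ExpressionAttributeValues: { ':username': username },
  };
}

// Query all videos for one user (by hash key), then update them one at a time.
// Returns the number of items updated.
async function renameUser(docClient, tableName, userid, username) {
  let startKey; // undefined on the first request
  let updated = 0;
  do {
    const page = await docClient.query({
      TableName: tableName,
      KeyConditionExpression: 'userid = :u',
      ExpressionAttributeValues: { ':u': userid },
      ProjectionExpression: 'videoid', // only the range key is needed
      ExclusiveStartKey: startKey,
    }).promise();
    for (const item of page.Items) {
      await docClient
        .update(updateUsernameParams(tableName, userid, item.videoid, username))
        .promise();
      updated += 1;
    }
    startKey = page.LastEvaluatedKey;
  } while (startKey);
  return updated;
}
```

Since userid is the hash key, a Query is enough to list the user's videos; a full table Scan is not needed.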
