Using IN operator in DynamoDB from Boto3

I'm new to DynamoDB and trying to query a table based off the presence of a list of certain values for a field.
I have a field doc_id, which is also a secondary index, and I'd like to return all results where doc_id is contained in a list of values.
I'm trying something like this:
response = table.query(
IndexName='doc_id-index',
FilterExpression=In(['27242226'])
)
But clearly that is not correct.
Can anyone point me in the right direction?
Thanks!

With the Query operation
A FilterExpression does not allow key attributes. You cannot define a filter expression based on a partition key or a sort key.
So your doc_id field, being the partition key of doc_id-index, cannot be used in a FilterExpression; it has to go in the KeyConditionExpression.
Note
A FilterExpression is applied after the items have already been read; the process of filtering does not consume any additional read capacity units.
To show how the IN operator works with Query, I'll assume you have another, non-key attribute such as userId:
var params = {
TableName: 'tbl',
IndexName: 'doc_id-index',
KeyConditionExpression: 'doc_id= :doc_id',
FilterExpression: 'userId IN (:userId1,:userId2)',//you can add more userId here
ExpressionAttributeValues: {
':doc_id':100,
':userId1':11,
':userId2':12
}
};
If you have more userId values, build the FilterExpression dynamically.
In your case, though, you can use the Scan operation:
var params = {
TableName : "tbl",
FilterExpression : "doc_id IN (:doc_id1, :doc_id2)",
ExpressionAttributeValues : {
":doc_id1" :100,
":doc_id2" :101
}
};
You can also build the FilterExpression dynamically, like this:
var documentsId = ["100", "101","200",...];
var documentsObj = {};
var index = 0;
documentsId.forEach((value)=> {
index++;
var documentKey = ":doc_id"+index;
documentsObj[documentKey.toString()] = value;
});
var params = {
TableName: 'job',
FilterExpression: 'doc_id IN ('+Object.keys(documentsObj).toString()+')',
ExpressionAttributeValues: documentsObj,
};
Note: be careful with the Scan operation; it reads the whole table and is less efficient than Query.
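Because doc_id is the partition key of doc_id-index, one alternative to a Scan is to run one Query per doc_id and merge the results. Below is a minimal sketch of building those requests, assuming the table name 'tbl' and index name 'doc_id-index' used above; buildDocIdQueries is a hypothetical helper, not part of the AWS SDK.

```javascript
// Hypothetical helper: build one Query request per doc_id so each lookup
// uses the index's partition key instead of scanning the whole table.
function buildDocIdQueries(tableName, indexName, docIds) {
  // De-duplicate so the same partition is not queried twice.
  const uniqueIds = [...new Set(docIds)];
  return uniqueIds.map((docId) => ({
    TableName: tableName,
    IndexName: indexName,
    KeyConditionExpression: 'doc_id = :doc_id',
    ExpressionAttributeValues: { ':doc_id': docId },
  }));
}

const queries = buildDocIdQueries('tbl', 'doc_id-index', ['100', '101', '100']);
console.log(queries.length); // 2
```

Each params object could then be passed to docClient.query(...) and the returned Items arrays concatenated. Each individual Query may still need pagination if a doc_id matches a lot of data.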

Related

Variable for Key in DynamoDB params?

I have the following code. I was able to use a variable for the table name, and for the search param. But I don't seem to be able to get tableKey as a variable to work. When I run the code I get "Error Lambda: ValidationException: The provided key element does not match the schema". tableKey is set in this case to "PhoneNumber" which is the name of the field in my DDB table.
Is this possible to use a variable in this location?
async function handleRequest(searchParam,dbTable,tableKey) {
let Details = {
TableName: dbTable,
Key: {
tableKey: searchParam,
}
};
return docClient.get(Details).promise();
}
You cannot pass in the keys as a variable, only the values.
Actually, I was able to figure it out.
async function handleRequest(searchParam, dbTable, tableKey) {
  let Details = {
    TableName: dbTable,
    Key: {
    }
  };
  // Bracket notation uses the value of tableKey as the attribute name.
  Details.Key[tableKey] = searchParam;
  return docClient.get(Details).promise();
}
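The same result can be had in one step with a computed property name. This is a small sketch; buildGetParams and the 'Contacts' table name are made up for illustration, while 'PhoneNumber' is the key name from the question.

```javascript
// Build get() parameters whose key attribute name comes from a variable.
function buildGetParams(dbTable, tableKey, searchParam) {
  return {
    TableName: dbTable,
    // [tableKey] is evaluated, so the property is named by the variable's value.
    Key: { [tableKey]: searchParam },
  };
}

const details = buildGetParams('Contacts', 'PhoneNumber', '555-0100');
console.log(JSON.stringify(details.Key)); // {"PhoneNumber":"555-0100"}
```

Writing Key: { tableKey: searchParam } without the brackets creates an attribute literally named "tableKey", which is why the original call failed schema validation.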

400 error when upsert using Cosmos SP

I'm trying to execute the below SP
function createMyDocument() {
var collection = getContext().getCollection();
var doc = {
"someId": "123134444",
};
var options = {};
options['PartitionKey'] = ["someId"];
var isAccepted = collection.upsertDocument(collection.getSelfLink(), doc, options, function (error, resources, options) {
});
}
and Cosmos DB keeps complaining that there's something wrong with the partition key:
{ code: 400,
body: '{"code":"BadRequest","message":"Message: {\\"Errors\\":
[\\"PartitionKey extracted from document doesn\'t match the one specified in the header\\"]}
}
Does anyone have any idea how to pass the partition key in options so that it gets past this validation?
Figured it out. The error was with how we call the stored proc.
How we were doing it
client.executeStoredProcedure('dbs/db1/colls/coll-1/sprocs/createMyDocument',
{},
{} //Here you have to pass in the partition key
);
How it has to be
client.executeStoredProcedure('dbs/db1/colls/coll-1/sprocs/createMyDocument',
{},
{"partitionKey": "43321"}
);
I think you misunderstand the meaning of the partitionKey property in options.
For example, the partition key of my collection is "name". You can check your own collection's partition key.
My documents look like this:
{
"id": "1",
"name": "jay"
}
{
"id": "2",
"name": "jay2"
}
My partition key is 'name', so here I have two partitions: 'jay' and 'jay2'.
So in your question you should set the partitionKey property to '123134444', not 'someId'.
See the Cosmos DB documentation on partition keys for more details.
Hope it helps you.
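To make the distinction concrete: the option carries the partition key value found in the document, not the name of the property. Here is a rough sketch, assuming the collection is partitioned on /someId (the question does not say so explicitly); partitionKeyValueFor is a hypothetical helper, not part of any Cosmos DB SDK.

```javascript
// Hypothetical helper: read the partition key value out of a document,
// given the collection's partition key path (for example '/someId').
function partitionKeyValueFor(doc, partitionKeyPath) {
  const segments = partitionKeyPath.split('/').filter(Boolean);
  // Walk the path one segment at a time; stop early if a level is missing.
  return segments.reduce(
    (value, segment) => (value == null ? undefined : value[segment]),
    doc
  );
}

const doc = { someId: '123134444' };
const options = { partitionKey: partitionKeyValueFor(doc, '/someId') };
console.log(options.partitionKey); // 123134444
```

The resulting options object has the same shape as the third argument to executeStoredProcedure shown in the earlier answer.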

Query size limits in DynamoDB

I don't get the concept of limits for query/scan in DynamoDb.
According to the docs:
A single Query operation can retrieve a maximum of 1 MB of data. This limit applies before any FilterExpression is applied to the results.
Let's say I have 10k items, 250 KB per item, and all of them match the query params.
If I run a simple query, do I get only 4 items?
If I use ProjectionExpression to retrieve only a single attribute (1 KB in size), will I get 1k items?
If I only need to count items (select: 'COUNT'), will it count all items (10k)?
If I run a simple query, I get only 4 items?
Yes
If I use ProjectionExpression to retrieve only single attribute (1kb in size), will I get 1k items?
No. FilterExpression and ProjectionExpression are applied after the items have been read, so you still get 4 items.
If I only need to count items (select: 'COUNT'), will it count all items (10k)?
No, still just 4
The thing you are probably missing is that you can still get all 10k results, or the 10k count; you just need to fetch the results in pages. Basically, when a query completes, check the LastEvaluatedKey attribute, and if it's not empty, request the next set of results, passing that key as ExclusiveStartKey. Repeat until the attribute is empty, and you know you have all the results.
EDIT: I should say some of the SDKs abstract this away for you. For example the Java SDK has query and queryPage, where query will go back to the server multiple times to get the full result set for you (i.e. in your case, give you the full 10k results).
For any operation that returns items, you can request a subset of attributes to retrieve; however, doing so has no impact on the item size calculations. In addition, Query and Scan can return item counts instead of attribute values. Getting the count of items uses the same quantity of read capacity units and is subject to the same item size calculations. This is because DynamoDB has to read each item in order to increment the count.
Source: Managing Throughput Settings on Provisioned Tables
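To get the full count, the per-page Count values have to be added up across pages. The following is a sketch under assumptions: countAll is a hypothetical helper, and queryPage stands for any function that sends one Query and resolves to its response, for example (p) => docClient.query(p).promise() in the AWS SDK for JavaScript v2. A fake is used here so the example runs without AWS.

```javascript
// Sum Count across every page of a Query that uses Select: 'COUNT'.
async function countAll(queryPage, params) {
  let total = 0;
  let startKey;
  do {
    const page = await queryPage({
      ...params,
      Select: 'COUNT',
      // Only send ExclusiveStartKey once a previous page has returned one.
      ...(startKey ? { ExclusiveStartKey: startKey } : {}),
    });
    total += page.Count;
    startKey = page.LastEvaluatedKey;
  } while (startKey);
  return total;
}

// Stand-in for DynamoDB: three pages holding 4, 4 and 2 matching items.
const fakePages = [
  { Count: 4, LastEvaluatedKey: { id: 'a' } },
  { Count: 4, LastEvaluatedKey: { id: 'b' } },
  { Count: 2 },
];
let call = 0;
const fakeQueryPage = async () => fakePages[call++];
countAll(fakeQueryPage, { TableName: 'tbl' }).then((n) => console.log(n)); // 10
```

Every page still consumes read capacity as if the items had been returned, as the quoted documentation explains.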
Great explanation by @f-so-k.
This is how I am handling the query.
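import AWS from 'aws-sdk';
// Low-level client: attribute values use the typed form, e.g. { S: name }.
const dynamodb = new AWS.DynamoDB();
// Keep requesting pages until one contains items or there are no more pages.
async function loopQuery(params) {
  let keepGoing = true;
  let result = null;
  while (keepGoing) {
    let newParams = params;
    if (result && result.LastEvaluatedKey) {
      newParams = {
        ...params,
        ExclusiveStartKey: result.LastEvaluatedKey,
      };
    }
    result = await dynamodb.query(newParams).promise();
    // The response field is Count (capital C).
    if (result.Count > 0 || !result.LastEvaluatedKey) {
      keepGoing = false;
    }
  }
  return result;
}
// user and name are variables defined elsewhere.
const params = {
  TableName: user,
  IndexName: 'userOrder',
  KeyConditionExpression: 'un=:n',
  ExpressionAttributeValues: {
    ':n': {
      S: name,
    },
  },
  ConsistentRead: false,
  ReturnConsumedCapacity: 'NONE',
  // No ProjectionExpression: the query returns every attribute projected into the index.
};
const result = await loopQuery(params);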
Edit:
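import AWS from 'aws-sdk';
// Low-level client: attribute values use the typed form, e.g. { S: name }.
const dynamodb = new AWS.DynamoDB();
// Follow LastEvaluatedKey until every page has been read, collecting all items.
async function loopQuery(params) {
  let keepGoing = true;
  let result = null;
  let list = [];
  while (keepGoing) {
    let newParams = params;
    if (result && result.LastEvaluatedKey) {
      newParams = {
        ...params,
        ExclusiveStartKey: result.LastEvaluatedKey,
      };
    }
    result = await dynamodb.query(newParams).promise();
    // Append this page's items; the response object itself is not iterable.
    list = [...list, ...result.Items];
    if (!result.LastEvaluatedKey) {
      keepGoing = false;
    }
  }
  return list;
}
// user and name are variables defined elsewhere.
const params = {
  TableName: user,
  IndexName: 'userOrder',
  KeyConditionExpression: 'un=:n',
  ExpressionAttributeValues: {
    ':n': {
      S: name,
    },
  },
  ConsistentRead: false,
  ReturnConsumedCapacity: 'NONE',
  // No ProjectionExpression: the query returns every attribute projected into the index.
};
const result = await loopQuery(params);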

Query List of Maps in DynamoDB

I am trying to filter a list of maps from a DynamoDB table that has the following format.
{
  id: "Number",
  users: [
    { userEmail: "abc@gmail.com", age: "23" },
    { userEmail: "de@gmail.com", age: "41" }
  ]
}
I need to get the data of the user whose userEmail is "abc@gmail.com". Currently I do it with the following DynamoDB query. Is there a more efficient way to solve this?
var params = {
  TableName: 'users',
  Key: {
    'id': id
  }
};
var docClient = new AWS.DynamoDB.DocumentClient();
docClient.get(params, function (err, data) {
  if (!err) {
    const users = data.Item.users;
    // Compare against userEmail, the attribute name used in the item.
    const user = users.filter(function (user) {
      return user.userEmail == userEmail;
    });
    // user now holds the matching entries
  }
});
The only way to fetch a single item directly in DynamoDB is by its key, so to look a user up by email you need a table where the email is the partition key. That table would look like:
Email (string) - partition key
Id (some-type) - user id
...other relevant user data
Unfortunately, since a nested field cannot be a partition key, you will have to maintain a separate table here and won't be able to use an index in DynamoDB (neither an LSI nor a GSI).
Duplicating data is a common pattern in NoSQL, so there is nothing unusual about it. If you were using Java, you could use the transactions library to ensure that both tables stay in sync.
If you are not going to use Java, you could read the DynamoDB stream of the original table (where emails are nested fields) and update the new table (where emails are partition keys) whenever the original table is updated.
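The stream-based approach could look roughly like this. It is only a sketch under assumptions: emailItemsFromRecord is a hypothetical helper, the stream is assumed to include the new image of each item, and users is assumed to be stored as a list of maps with string userEmail values. The Email and Id attribute names follow the table layout suggested above.

```javascript
// Hypothetical helper for a DynamoDB Streams consumer: turn one stream record
// from the original table into items for the email-keyed table.
function emailItemsFromRecord(record) {
  const image = record.dynamodb && record.dynamodb.NewImage;
  if (!image || !image.users || !image.users.L) {
    return []; // e.g. a REMOVE event, or an item without a users list
  }
  return image.users.L.map((entry) => ({
    Email: entry.M.userEmail, // already in attribute-value form, e.g. { S: '...' }
    Id: image.id,
  }));
}

// Sample record in the typed attribute-value format that streams deliver.
const sampleRecord = {
  eventName: 'MODIFY',
  dynamodb: {
    NewImage: {
      id: { N: '1' },
      users: {
        L: [
          { M: { userEmail: { S: 'abc@gmail.com' }, age: { S: '23' } } },
          { M: { userEmail: { S: 'de@gmail.com' }, age: { S: '41' } } },
        ],
      },
    },
  },
};
console.log(emailItemsFromRecord(sampleRecord).length); // 2
```

Each returned item could be written to the second table with a put request. Note that this sketch does not delete entries for users who were removed from the list; that would need a comparison with the old image.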

Dynamoose + DynamoDB query logic

I have a DynamoDB table with a primary key (_id) being a simple int. I want to get the highest value for the primary key.
How do I return the item in the table with the highest _id?
I can use either the Amazon javascript API or the Dynamoose library.
Partition keys are not stored in order. You would need to scan the entire table, stream over the items, map to the _id attribute and then return the maximum value.
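A rough sketch of that scan-and-reduce approach follows. maxId is a hypothetical helper, and scanPage stands for any function that sends one Scan and resolves to its response, for example (p) => docClient.scan(p).promise() with the DocumentClient, so that _id arrives as a plain number. A fake is used here so the example runs without AWS.

```javascript
// Scan every page, keeping only the largest _id seen so far.
async function maxId(scanPage, params) {
  let max = null;
  let startKey;
  do {
    const page = await scanPage({
      ...params,
      // Fetch only _id; the placeholder avoids any naming issues with the underscore.
      ProjectionExpression: '#id',
      ExpressionAttributeNames: { '#id': '_id' },
      ...(startKey ? { ExclusiveStartKey: startKey } : {}),
    });
    for (const item of page.Items) {
      if (max === null || item._id > max) {
        max = item._id;
      }
    }
    startKey = page.LastEvaluatedKey;
  } while (startKey);
  return max; // null when the table is empty
}

// Stand-in for DynamoDB: two pages of items in no particular order.
const fakeScanPages = [
  { Items: [{ _id: 3 }, { _id: 17 }], LastEvaluatedKey: { _id: 17 } },
  { Items: [{ _id: 9 }] },
];
let scanCall = 0;
maxId(async () => fakeScanPages[scanCall++], { TableName: 'Devices' })
  .then((m) => console.log(m)); // 17
```

This reads the whole table on every call, so it is only practical for small tables.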
You can create a Global Secondary Index with _id as the sort key, and then query it like this:
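var params = {
  TableName: 'Devices',
  IndexName: 'status-index', // placeholder: use the name of your GSI
  KeyConditionExpression: '#status = :status',
  // status is a DynamoDB reserved word, so it needs a name placeholder
  ExpressionAttributeNames: {
    '#status': 'status'
  },
  ExpressionAttributeValues: {
    ':status': status
  },
  ScanIndexForward: false, // true = ascending, false = descending
  Limit: 1 // in descending order, the first item has the highest _id
};
docClient.query(params, function(err, data) {});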
