DynamoDB FilterExpression, find in JSON object using value as key - amazon-dynamodb

Is it possible to somehow filter results by a key name that is stored in the same object?
I have a JSON object "keys"; its "default" property stores the key of the object I need. Is it somehow possible to filter like keys[keys.default].type = some_type?
var params = {
  TableName: 'TABLE_NAME',
  IndexName: 'TABLE_INDEX', // optional (if querying an index)
  KeyConditionExpression: 'myId = :value',
  FilterExpression: '#kmap[#kmap.#def].#tp = :keyval',
  ExpressionAttributeNames: { // a map of substitutions for attribute names with special characters
    '#kmap': 'keys',
    '#tp': 'type',
    '#def': 'default'
  },
  ExpressionAttributeValues: { // a map of substitutions for all attribute values
    ':value': '1',
    ':keyval': 'some_type'
  },
  Limit: 10, // optional (limit the number of items to evaluate)
  ProjectionExpression: 'displayName, #kmap',
  ReturnConsumedCapacity: 'TOTAL' // optional (NONE | TOTAL | INDEXES)
};
docClient.query(params, function(err, data) {
  if (err) ppJson(err); // an error occurred
  else ppJson(data);    // successful response
});

I'm pretty sure the answer is no.
This keys[keys.default] is not even a valid expression, as far as I can tell.
Of course, you can do this in two steps:
First, query to get the default key
Then query to get the value
Don't forget, filters are only applied to the result set - the operation still performs a linear traversal as specified by your Query or Scan.
So you can probably more easily run that filter on the client; see the sketch below.
And lastly, if this is a typical query you need to perform, as an optimization you can lift the default key and value to be top-level attributes on the item. Then you can create a GSI on that attribute and do efficient lookups.
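A minimal sketch of that two-step / client-side approach, assuming the same table, index, docClient, and ppJson helper as in the question:

var params = {
  TableName: 'TABLE_NAME',
  IndexName: 'TABLE_INDEX',
  KeyConditionExpression: 'myId = :value',
  ExpressionAttributeNames: { '#kmap': 'keys' },
  ExpressionAttributeValues: { ':value': '1' },
  ProjectionExpression: 'displayName, #kmap'
};

docClient.query(params, function(err, data) {
  if (err) return ppJson(err);
  // Step 2 runs on the client: resolve keys[keys.default] per item
  // and keep only items whose referenced entry has the wanted type.
  var matches = data.Items.filter(function(item) {
    var keys = item.keys || {};
    var def = keys[keys.default]; // the entry named by the "default" property
    return def && def.type === 'some_type';
  });
  ppJson(matches);
});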

Related

How to force the DynamoDB query's ExclusiveStartKey to use exact match?

I'm using DynamoDB for my new serverless RESTful API with Node.js.
The RESTful API supports queries for resources with the limit and lastKey query parameters for key pagination.
Assume there's a table like below:
PK        SK
School    firstSchool
School    secondSchool
School    thirdSchool
PK is the partition key, and SK is the sort key.
I use SK for key pagination.
If I call the api with http://somewhere/api/school?limit=1&lastKey=secondSchool, ExclusiveStartKey in query will be {"PK" : "School", "SK" : "secondSchool"}, and the returned item will be {"PK" : "School", "SK" : "thirdSchool"}.
It works well in that case, but the problem is that the same result is returned for a URL like http://somewhere/api/school?limit=1&lastKey=seco.
In this case, ExclusiveStartKey in the query will be {"PK" : "School", "SK" : "seco"}.
It seems DynamoDB doesn't use an exact match for the SK value in ExclusiveStartKey.
Is there any way to force DynamoDB to use an exact match for ExclusiveStartKey?
I attach my test code below:
const { DynamoDBClient } = require("@aws-sdk/client-dynamodb");
const { DynamoDBDocument } = require("@aws-sdk/lib-dynamodb");

const ddbClient = new DynamoDBClient({
  region: AWS_REGION,
  endpoint: AWS_DYNAMODB_END_POINT,
  credentials: {
    accessKeyId: AWS_ACCESSKEY_ID,
    secretAccessKey: AWS_SECRET_ACCESS_KEY,
  },
});
const ddbDocClient = DynamoDBDocument.from(ddbClient);

(async () => {
  try {
    const data = await ddbDocClient.query({
      TableName: "Table Name",
      KeyConditionExpression: "#pk = :pk",
      ExpressionAttributeNames: {
        "#pk": "PK",
      },
      ExpressionAttributeValues: {
        ":pk": "Test",
      },
      Limit: 1,
      ExclusiveStartKey: { PK: "Test", SK: "Seco" },
    });
    console.log(data);
  } catch (err) {
    console.log("Error", err);
  }
})();
ExclusiveStartKey is meant mainly for paging through large Scan or Query requests - i.e., retrieving the next page of results after the previous page ended with a LastEvaluatedKey - and you are supposed to give exactly that key (not some subset of it...) as the ExclusiveStartKey of the next request.
You are trying to do something different, and to achieve it you can't use ExclusiveStartKey, but you can use something else:
The Query request has a KeyConditionExpression. You can specify sk > :value as a key condition expression (and don't pass ExclusiveStartKey), and you'll get all the sort keys greater than that :value, like your string "seco". Please note, however, that because your sort key is truncated, this result may actually include one or more extra results before the first key you want (e.g., the keys "seco" and "secoaaaa" come before "secondSchool"), so you may need to drop them yourself from the results.
The KeyConditionExpression is implemented efficiently - DynamoDB knows how to skip directly to that sort key in the partition and doesn't charge you for reading the entire partition, so in this respect it is just as good as ExclusiveStartKey.
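A minimal sketch of that key-condition approach, assuming the same SDK v3 ddbDocClient, table name, and PK/SK attribute names as in the question's test code:

(async () => {
  // Sketch only: replace ExclusiveStartKey with a sort-key range condition.
  const data = await ddbDocClient.query({
    TableName: "Table Name",
    KeyConditionExpression: "#pk = :pk AND #sk > :lastKey",
    ExpressionAttributeNames: { "#pk": "PK", "#sk": "SK" },
    ExpressionAttributeValues: {
      ":pk": "School",
      ":lastKey": "seco", // a truncated value may admit extra leading items
    },
  });
  // Drop any items whose SK still precedes the real boundary, then apply your own limit.
  console.log(data.Items);
})();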

Dynamoose model update with hash key

I'm trying to execute an update against a Dynamoose model. Here are the docs on calling model.update:
Model.update(key[, updateObj[, settings]],[ callback])
key can be a string representing the hashKey or an object containing the hashKey & rangeKey.
My schema has both a hash key (partition key) and range key (sort key) like this:
// create model
let model = dynamoose.model(
  "SampleStatus",
  {
    id: {
      type: String,
      hashKey: true,
    },
    date: {
      type: Date,
      rangeKey: true,
    },
    status: String,
  });
I've created an object like this (with a fixed timestamp for demoing):
let timestamp = 1606781220842; // Date.now()
model.create({
  id: "1",
  date: new Date(timestamp),
  status: "pending",
});
I'd like to be able to update the status property by referencing just the id property like this:
model.update({ id: "1" }, { status: "completed" })
// err: The provided key element does not match the schema
model.update("1", { status: "completed" })
// err: Argument of type 'string' is not assignable to parameter of type 'ObjectType'
But both result in the errors shown above.
I can pass in the full composite key if I know the timestamp, so the following will work:
let timestamp = 1606781220842; // Date.now()
model.update({ id: "1", date: timestamp }, { status: "completed" });
However, that requires me to hold onto the timestamp and persist it alongside the id.
The ID field, in my case, should by itself be unique, so I don't need both to form a key, but I wanted to add the date as a range key so the items are sortable. Should I just update my schema so there's only a single hash key? I was thinking the docs saying a "key can be a string representing the hashKey" would let me just pass in the ID, but that throws an error at compile time (in TypeScript).
Any suggestions?
The solution here is to remove the rangeKey from the date property.
This is because in DynamoDB every document/item must have a unique “key”. This can either be the hashKey or hashKey + rangeKey.
Since you mention that your id property is unique, you probably want to use just the hashKey as the key, which should fix the issue.
In your example there could have been many documents with that id, so DynamoDB wouldn’t know which to update.
Don’t forget that this changes the table’s key schema, so you might have to delete and recreate the table. But that should fix the problem you are running into.
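A minimal sketch of that first approach, assuming the model from the question with the rangeKey option simply removed from date:

// Sketch: id alone is the table key; date is a plain attribute.
let model = dynamoose.model(
  "SampleStatus",
  {
    id: {
      type: String,
      hashKey: true,
    },
    date: Date,
    status: String,
  });

// Updating by id alone now matches the key schema.
model.update({ id: "1" }, { status: "completed" });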
Logically there is nothing stopping you from inserting more than one entry into the same partition (in your case the unique id). You could insert more than one item with the same id, as long as each had a different date.
Therefore, if you want to get an item by only its partition key, which is really a unique ID, you need to use a query to retrieve the item (as opposed to a GetItem), but the return signature will be a collection of items. As you know you only have one item in the partition, you can take the first item, and specify a limit of 1 to save RCU; see the query sketch after the schema below.
// create model
let model = dynamoose.model(
  "SampleStatus",
  {
    id: {
      type: String,
      hashKey: true,
      "index": {
        "name": "index_name",
        "rangeKey": "date",
      }
    },
    date: {
      type: Date
    },
    status: String,
  });
You have to tell the schema that the hashKey alone is the table's key, and attach date as the rangeKey of an index instead of making it part of the primary key.
Ref: https://dynamoosejs.com/guide/Schema#index-boolean--object--array
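A minimal sketch of that query-based lookup, assuming Dynamoose's query builder (.query().eq().limit().exec()) and the model defined above:

(async () => {
  // Query the partition by id and take the first (and only) item.
  const results = await model.query("id").eq("1").limit(1).exec();
  const item = results[0]; // undefined if no item with that id exists
  console.log(item);
})();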

DynamoDB - Get Item by Global Secondary Index

I have an existing table which has 10 fields. The fields are like this:
AuthID, UserID, Age, Job, etc.
The table stores my users' data. "AuthID" is the primary key and "UserID" is a Global Secondary Index.
When I get an item by AuthID, everything is fine, but I can't get an item by UserID. I tried the GetItem, Query, and Scan methods and failed with all three.
I need to be able to get data in these 3 ways:
1 - Get user data by AuthID (It's already works fine)
2 - Get user data by UserID
3 - Get user data by AuthID and UserID both
AuthID and UserID are both unique. Can someone point me the right way as to what to do?
I have searched a lot in the documentation and found that even if you only need to get a single item, you can't use the get or getItem method when using a global secondary index. You can use the query method instead. A sample of the query method with a global secondary index is:
let params = {
  TableName: "Users",
  IndexName: "your-index",
  ExpressionAttributeValues: {
    ":v1": "myid"
  },
  KeyConditionExpression: "my_partition_key_in_gsi = :v1",
};

dynamodb.query(params, function(err, data) {
  if (err) console.log(err, err.stack); // an error occurred
  else console.log(data);               // successful response
});

DynamoDB update - "ValidationException: An operand in the update expression has an incorrect data type"

I am trying to append to a string set (array of strings) column, which may or may not already exist, in a DynamoDB table. I referred to SO questions like this and this when writing my UpdateExpression.
My code looks like this.
const AWS = require('aws-sdk')
const dynamo = new AWS.DynamoDB.DocumentClient()

const updateParams = {
  // The table definitely exists.
  TableName: process.env.DYNAMO_TABLE_NAME,
  Key: {
    email: user.email
  },
  // The column may or may not exist, which is why I am combining list_append with if_not_exists.
  UpdateExpression: 'SET #column = list_append(if_not_exists(#column, :empty_list), :vals)',
  ExpressionAttributeNames: {
    '#column': 'items'
  },
  ExpressionAttributeValues: {
    ':vals': ['test', 'test2'],
    ':empty_list': []
  },
  ReturnValues: 'UPDATED_NEW'
}

dynamo.update(updateParams).promise().catch((error) => {
  console.log(`Error: ${error}`)
})
However, I am getting this error: ValidationException: An operand in the update expression has an incorrect data type. What am I doing incorrectly here?
[Update]
Thanks to Nadav Har'El's answer, I was able to make it work by amending the params to use the ADD operation instead of SET.
const updateParams = {
  TableName: process.env.DYNAMO_TABLE_NAME,
  Key: {
    email: user.email
  },
  UpdateExpression: 'ADD items :vals',
  ExpressionAttributeValues: {
    ':vals': dynamo.createSet(['test', 'test2'])
  }
}
A list and a string set are not the same type - a string set can only hold strings, while a list may hold any types (including nested lists and objects), its element types don't need to be the same, and a list can also hold duplicate items. So if your original item is indeed, as you said, a string set and not a list, this explains why this operation cannot work.
To add items to a string set, use the ADD operation, not SET. The parameter you give to ADD should be a set (not a list - I don't know the magic JS syntax to specify this, check your docs) with a bunch of elements. If the attribute already exists, these elements will be added to it (dropping duplicates), and if the attribute doesn't already exist, it will be set to the set of these elements. See the documentation here: https://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_UpdateItem.html#DDB-UpdateItem-request-UpdateExpression
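For illustration, a minimal sketch of how the same two strings are marshalled by the SDK v2 DocumentClient (the dynamo client from the snippet above), which is where the type mismatch comes from:

// A plain JS array is sent as a DynamoDB List:  { L: [ { S: 'test' }, { S: 'test2' } ] }
const asList = ['test', 'test2'];
// createSet produces a DynamoDB String Set:     { SS: ['test', 'test2'] }
const asSet = dynamo.createSet(['test', 'test2']);

// SET #col = list_append(#col, :vals) requires Lists on both sides, while
// ADD #col :vals requires a set - mixing the two triggers the
// "incorrect data type" ValidationException from the question.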

How to set auto increment field in DynamoDB? [duplicate]

I am new to DynamoDB. I want to auto-increment the id value when I use putItem with DynamoDB.
Is it possible to do that?
This is an anti-pattern in DynamoDB, which is built to scale across many partitions/shards/servers. DynamoDB does not support auto-increment primary keys due to scaling limitations, because the counter value cannot be guaranteed across multiple servers.
A better option is to assemble the primary key from multiple indices. A primary key can be up to 2048 bytes. There are a few options:
Use a UUID as your key - possibly a time-based UUID, which makes it unique, evenly distributed, and carries a time value
Use a randomly generated number or timestamp + random (possibly bit-shifted), like: ts << 12 + random_number
Use another service or DynamoDB itself to generate an incremental unique id (requires an extra call)
The following code will auto-increment a counter in DynamoDB, which you can then use as a primary key.
var documentClient = new AWS.DynamoDB.DocumentClient();
var params = {
  TableName: 'sampletable',
  Key: { HashKey: 'counters' },
  UpdateExpression: 'ADD #a :x',
  ExpressionAttributeNames: { '#a': "counter_field" },
  ExpressionAttributeValues: { ':x': 1 },
  ReturnValues: "UPDATED_NEW" // ensures you get the new value back
};
documentClient.update(params, function(err, data) {});
// once you get the new value, use it as your primary key
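A minimal sketch of using the incremented value, assuming the same params as above and a hypothetical items table to write into:

documentClient.update(params, function(err, data) {
  if (err) return console.log(err);
  // ReturnValues: "UPDATED_NEW" hands back the fresh counter value.
  var newId = data.Attributes.counter_field;
  documentClient.put({
    TableName: 'items', // hypothetical target table
    Item: { id: newId, createdAt: Date.now() }
  }, function(putErr) {
    if (putErr) console.log(putErr);
  });
});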
My personal favorite is using timestamp + random, inspired by Instagram's sharding ID generation at http://instagram-engineering.tumblr.com/post/10853187575/sharding-ids-at-instagram
The following function will generate an id for a specific shard (provided as a parameter). This way you can have a unique key assembled from a timestamp, the shard number, and some randomness (0-511).
var CUSTOMEPOCH = 1300000000000; // artificial epoch
function generateRowId(shardId /* range 0-63 for shard/slot */) {
  var ts = new Date().getTime() - CUSTOMEPOCH; // limit to recent
  var randid = Math.floor(Math.random() * 512);
  ts = (ts * 64); // bit-shift << 6
  ts = ts + shardId;
  return (ts * 512) + randid;
}
var newPrimaryHashKey = "obj_name:" + generateRowId(4);
// output is: "obj_name:8055517407349240"
DynamoDB doesn't provide this out of the box. You can generate something in your application such as UUIDs that "should" be unique enough for most systems.
I noticed you were using Node.js (I removed your tag). Here is a library that provides UUID functionality: node-uuid
Example from README
var uuid = require('node-uuid');
var uuid1 = uuid.v1();
var uuid2 = uuid.v1({ node: [0x01, 0x23, 0x45, 0x67, 0x89, 0xab] });
var uuid3 = uuid.v1({ node: [0, 0, 0, 0, 0, 0] });
var uuid4 = uuid.v4();
var uuid5 = uuid.v4();
You probably can use AtomicCounters.
With atomic counters, you can use the UpdateItem operation to implement an atomic counter - a numeric attribute that is incremented, unconditionally, without interfering with other write requests. (All write requests are applied in the order in which they were received.) With an atomic counter, the updates are not idempotent. In other words, the numeric value increments each time you call UpdateItem.
You might use an atomic counter to track the number of visitors to a website. In this case, your application would increment a numeric value, regardless of its current value. If an UpdateItem operation fails, the application could simply retry the operation. This would risk updating the counter twice, but you could probably tolerate a slight overcounting or undercounting of website visitors.
I came across a similar issue where I required an auto-incrementing primary key in my table. We could use some randomization technique to generate a random key and store it using that, but it won't be incremental.
If you require something incremental, you can use Unix time as your primary key. It won't guarantee exact one-by-one incrementation, but every record you put will be ordered incrementally with respect to the time at which it was inserted.
It's not a complete solution, but it works if you don't want to read the entire table, get its last id, and then increment it.
Following is the code for inserting a record in DynamoDB using NodeJS:
.
.
const params = {
  TableName: RANDOM_TABLE,
  Item: {
    ip: this.ip,
    id: new Date().getTime()
  }
}
dynamoDb.put(params, (error, result) => {
  console.log(error, result);
});
.
.
If you are using DynamoDB with Dynamoose, you can easily set a unique id by default. Here is a simple user-creation example:
// User.model.js
const dynamoose = require("dynamoose");
const { v4: uuidv4 } = require("uuid");

const userSchema = new dynamoose.Schema(
  {
    id: {
      type: String,
      hashKey: true,
    },
    displayName: String,
    firstName: String,
    lastName: String,
  },
  { timestamps: true },
);

const User = dynamoose.model("User", userSchema);
module.exports = User;
// User.controller.js
exports.create = async (req, res) => {
  const user = new User({ id: uuidv4(), ...req.body }); // set unique id
  const [err, response] = await to(user.save()); // `to` is an await-to-js style helper resolving to [err, result]
  if (err) {
    return badRes(res, err);
  }
  return goodRes(res, response);
};
Update for 2022:
I was looking into the same issue and came across the following research.
DynamoDB still doesn't support auto-increment of primary keys.
https://aws.amazon.com/blogs/database/simulating-amazon-dynamodb-unique-constraints-using-transactions/
Also, the package node-uuid is now deprecated. They recommend using the uuid package instead, which creates RFC 4122 compliant UUIDs.
npm install uuid
import { v4 as uuidv4 } from 'uuid';
uuidv4(); // ⇨ '9b1deb4d-3b7d-4bad-9bdd-2b0d7b3dcb6d'
For Java developers, there is the DynamoDBMapper, which is a simple ORM. It supports the DynamoDBAutoGeneratedKey annotation. It doesn't increment a numeric value like a typical "Long id", but rather generates a UUID, as other answers here suggest. If you're mapping classes as you would with Hibernate, GORM, etc., this is more natural and requires less code.
I see no caveats in the docs about scaling issues. And it eliminates the under- or over-counting issues you have with auto-incremented numeric values (which the docs do call out).
