How to set auto increment field in DynamoDB? [duplicate] - amazon-dynamodb

I am new to DynamoDB. I want to auto-increment an id value when I use putItem with DynamoDB.
Is it possible to do that?

This is an anti-pattern in DynamoDB, which is built to scale across many partitions/shards/servers. DynamoDB does not support auto-increment primary keys: a counter cannot be guaranteed across multiple servers without sacrificing scalability.
A better option is to assemble the primary key from multiple parts. A primary key can be up to 2048 bytes. There are a few options:
Use a UUID as your key - possibly a time-based UUID, which makes it unique and evenly distributed, and carries a time value
Use a randomly generated number or timestamp + random (possibly bit-shifted), like: ts << 12 + random_number
Use another service or DynamoDB itself to generate an incremental unique id (requires an extra call)
The following code will atomically increment a counter item in DynamoDB; you can then use the returned value as a primary key.
var documentClient = new AWS.DynamoDB.DocumentClient();
var params = {
    TableName: 'sampletable',
    Key: { HashKey: 'counters' },
    UpdateExpression: 'ADD #a :x',
    ExpressionAttributeNames: { '#a': 'counter_field' },
    ExpressionAttributeValues: { ':x': 1 },
    ReturnValues: 'UPDATED_NEW' // ensures you get the new value back
};
documentClient.update(params, function(err, data) {});
// once you get the new value, use it as your primary key
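A minimal usage sketch of the above (same table and attribute names; error handling is assumed to be your own): read the incremented value from data.Attributes and use it as the key of the next item. Note this costs an extra round trip per insert, and the counter item itself can become a hot key under heavy write load.
documentClient.update(params, function(err, data) {
    if (err) throw err;
    var newId = data.Attributes.counter_field; // the freshly incremented value
    documentClient.put({
        TableName: 'sampletable',
        Item: { HashKey: 'item_' + newId /* plus your other attributes */ }
    }, function(err) {
        if (err) throw err;
    });
});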
My personal favorite is using timestamp + random, inspired by Instagram's sharded ID generation: http://instagram-engineering.tumblr.com/post/10853187575/sharding-ids-at-instagram
The following function will generate an id for a specific shard (provided as a parameter). This way you get a unique key assembled from a timestamp, the shard number, and some randomness (0-511).
var CUSTOMEPOCH = 1300000000000; // artificial epoch
function generateRowId(shardId /* range 0-63 for shard/slot */) {
    var ts = new Date().getTime() - CUSTOMEPOCH; // limit to recent
    var randid = Math.floor(Math.random() * 512);
    ts = (ts * 64);   // bit-shift << 6
    ts = ts + shardId;
    return (ts * 512) + randid;
}
var newPrimaryHashKey = "obj_name:" + generateRowId(4);
// output looks like: "obj_name:8055517407349240"
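To see the layout, here is a small decoding sketch (my own illustration, not part of the original answer). One caveat: as the timestamp grows, the composed value approaches Number.MAX_SAFE_INTEGER (about 9.0e15), so at some point you would want BigInt or a string key.
var id = 8055517407349240;                  // sample id from above
var randid  = id % 512;                     // low 9 bits: randomness (0-511)
var shardId = Math.floor(id / 512) % 64;    // next 6 bits: shard (0-63)
var ts      = Math.floor(id / (512 * 64));  // remaining bits: ms since the epoch
var created = new Date(ts + CUSTOMEPOCH);   // CUSTOMEPOCH as defined above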

DynamoDB doesn't provide this out of the box. You can generate something in your application, such as UUIDs, that "should" be unique enough for most systems.
I noticed you were using Node.js (I removed your tag). Here is a library that provides UUID functionality: node-uuid.
Example from the README:
var uuid = require('node-uuid');
var uuid1 = uuid.v1();
var uuid2 = uuid.v1({ node: [0x01, 0x23, 0x45, 0x67, 0x89, 0xab] });
var uuid3 = uuid.v1({ node: [0, 0, 0, 0, 0, 0] });
var uuid4 = uuid.v4();
var uuid5 = uuid.v4();
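A hedged usage sketch (table and attribute names are my own assumptions): the generated UUID simply becomes the partition key of the new item.
var AWS = require('aws-sdk');
var uuid = require('node-uuid');
var documentClient = new AWS.DynamoDB.DocumentClient();

documentClient.put({
    TableName: 'sampletable',   // assumed table name
    Item: {
        id: uuid.v4(),          // random, collision-resistant key
        createdAt: Date.now()
    }
}, function(err) {
    if (err) console.error(err);
});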

You can probably use AtomicCounters:
With AtomicCounters, you can use the UpdateItem operation to implement
an atomic counter—a numeric attribute that is incremented,
unconditionally, without interfering with other write requests. (All
write requests are applied in the order in which they were received.)
With an atomic counter, the updates are not idempotent. In other
words, the numeric value increments each time you call UpdateItem.
You might use an atomic counter to track the number of visitors to a
website. In this case, your application would increment a numeric
value, regardless of its current value. If an UpdateItem operation
fails, the application could simply retry the operation. This would
risk updating the counter twice, but you could probably tolerate a
slight overcounting or undercounting of website visitors.
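A minimal sketch of the visitor-counter example from that quote (the table name, key, and attribute are my assumptions), including the simple retry the docs mention; since the update is not idempotent, a retry can double-count, which this use case tolerates:
var AWS = require('aws-sdk');
var documentClient = new AWS.DynamoDB.DocumentClient();

function incrementVisitors(retriesLeft, callback) {
    documentClient.update({
        TableName: 'sitestats',                 // assumed table name
        Key: { page: 'home' },                  // assumed key
        UpdateExpression: 'ADD visitors :one',  // unconditional atomic increment
        ExpressionAttributeValues: { ':one': 1 },
        ReturnValues: 'UPDATED_NEW'
    }, function(err, data) {
        if (err && retriesLeft > 0) return incrementVisitors(retriesLeft - 1, callback);
        callback(err, data && data.Attributes.visitors);
    });
}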

I came across a similar issue, where I required an auto-incrementing primary key in my table. We could use some randomization technique to generate a random key, but it won't be incremental.
If you need something incremental, you can use Unix time as your primary key. It won't give you exact one-by-one incrementation, but every record you put will get a key that increases with the time at which it was inserted.
It is not a complete solution; an exact increment would require reading the entire table, getting its last id, and incrementing that.
The following is the code for inserting a record into DynamoDB using Node.js:
...
const params = {
    TableName: RANDOM_TABLE,
    Item: {
        ip: this.ip,
        id: new Date().getTime() // millisecond timestamp as the key
    }
};
dynamoDb.put(params, (error, result) => {
    console.log(error, result);
});
...

If you are using DynamoDB with Dynamoose, you can easily set a default unique id. Here is a simple user-create example:
// User.modal.js
const dynamoose = require("dynamoose");
const { v4: uuidv4 } = require("uuid");

const userSchema = new dynamoose.Schema(
    {
        id: {
            type: String,
            hashKey: true,
        },
        displayName: String,
        firstName: String,
        lastName: String,
    },
    { timestamps: true },
);

const User = dynamoose.model("User", userSchema);
module.exports = User;

// User.controller.js
exports.create = async (req, res) => {
    const user = new User({ id: uuidv4(), ...req.body }); // set unique id
    const [err, response] = await to(user.save()); // `to` is an await-to-js style helper
    if (err) {
        return badRes(res, err); // author's error/success response helpers
    }
    return goodRes(res, response);
};

Update for 2022:
I was looking into the same issue and came across the following research.
DynamoDB still doesn't support auto-incrementing primary keys.
https://aws.amazon.com/blogs/database/simulating-amazon-dynamodb-unique-constraints-using-transactions/
Also, the package node-uuid is now deprecated; its authors recommend using the uuid package instead, which creates RFC 4122 compliant UUIDs.
npm install uuid
import { v4 as uuidv4 } from 'uuid';
uuidv4(); // ⇨ '9b1deb4d-3b7d-4bad-9bdd-2b0d7b3dcb6d'

For Java developers, there is the DynamoDBMapper, a simple ORM that supports the DynamoDBAutoGeneratedKey annotation. It doesn't increment a numeric value like a typical "Long id"; rather, it generates a UUID, as other answers here suggest. If you're mapping classes as you would with Hibernate, GORM, etc., this is more natural and requires less code.
I see no caveats in the docs about scaling issues, and it eliminates the under- or over-counting issues you can get with auto-incremented numeric values (which the docs do call out).

Related

How to force the DynamoDB query's ExclusiveStartKey to use exact match?

I'm using DynamoDB for my new serverless RESTful API with Node.js.
The API supports querying resources with limit and lastKey query parameters for key pagination.
Assume there's a table like below, where PK is the partition key and SK is the sort key:

PK        SK
School    firstSchool
School    secondSchool
School    thirdSchool
I use SK for key pagination.
If I call the api with http://somewhere/api/school?limit=1&lastKey=secondSchool, ExclusiveStartKey in query will be {"PK" : "School", "SK" : "secondSchool"}, and the returned item will be {"PK" : "School", "SK" : "thirdSchool"}.
It works well in that case, but the problem is that the same result is returned for a URL like http://somewhere/api/school?limit=1&lastKey=seco.
In this case, the ExclusiveStartKey in the query will be {"PK" : "School", "SK" : "seco"}.
It seems DynamoDB doesn't require an exact match for the SK value in ExclusiveStartKey.
Is there any way to force DynamoDB to use exact match for ExclusiveStartKey?
I attach my test code below:
const { DynamoDBClient } = require("@aws-sdk/client-dynamodb");
const { DynamoDBDocument } = require("@aws-sdk/lib-dynamodb");

const ddbClient = new DynamoDBClient({
    region: AWS_REGION,
    endpoint: AWS_DYNAMODB_END_POINT,
    credentials: {
        accessKeyId: AWS_ACCESSKEY_ID,
        secretAccessKey: AWS_SECRET_ACCESS_KEY,
    },
});
const ddbDocClient = DynamoDBDocument.from(ddbClient);

(async () => {
    try {
        const data = await ddbDocClient.query({
            TableName: "Table Name",
            KeyConditionExpression: "#pk = :pk",
            ExpressionAttributeNames: {
                "#pk": "PK",
            },
            ExpressionAttributeValues: {
                ":pk": "Test",
            },
            Limit: 1,
            ExclusiveStartKey: { PK: "Test", SK: "Seco" },
        });
        console.log(data);
    } catch (err) {
        console.log("Error", err);
    }
})();
ExclusiveStartKey is meant for paging through large Scan or Query results: after a previous page ends with a LastEvaluatedKey, you are supposed to pass exactly that key (not some subset of it) as the ExclusiveStartKey of the next request.
You are trying to do something different, and to achieve it you can't use ExclusiveStartKey, but you can use something else:
The Query request has a KeyConditionExpression. You can specify sk > :value as the key condition (and not pass ExclusiveStartKey), and you'll get all the sort keys greater than :value, like your string "seco". Note, however, that because your sort key is truncated, the result may include one or more extra items before the first key you want (e.g., the keys "seco" and "secoaaaa" sort before "secondSchool"), so you may need to drop them from the results yourself.
The KeyConditionExpression is implemented efficiently: DynamoDB knows how to skip directly to that sort key within the partition and doesn't charge you for reading the entire partition, so in this respect it is just as good as ExclusiveStartKey.
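A minimal sketch of that suggestion against the test code above (same client setup and assumed table; dropping any over-matched leading keys is left to the caller):
const data = await ddbDocClient.query({
    TableName: "Table Name",
    KeyConditionExpression: "#pk = :pk AND #sk > :sk", // skip directly past the truncated key
    ExpressionAttributeNames: { "#pk": "PK", "#sk": "SK" },
    ExpressionAttributeValues: { ":pk": "School", ":sk": "seco" },
    Limit: 1,
});
// NB: "secoaaaa" would also match here, so filter out any keys that
// sort before the real lastKey if exactness matters to you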

findAll() returns empty with WHERE option

First question on StackOverflow; long time reader, first time poster, or whatever people say.
I'm developing a Discord bot in my free time using Discord.js, and I'm using Sequelize to interface with a local SQLite database. I can insert data into it just fine; however, I can't seem to delete any of the records I add. The relevant piece of code is below, which I believe to be self-contradictory:
const query3 = await Towers.findAll({
    attributes: ['channelID']
});
console.log(JSON.stringify(query3));          // returns the one Tower
console.log(query3[0].channelID === channel); // returns true(!)

const query2 = await Towers.findAll({
    attributes: ['channelID'],
    where: { channelID: channel }
});
console.log(JSON.stringify(query2)); // returns empty

// DELETE FROM Towers WHERE channelID = channel;
const query = await Towers.destroy({
    where: { channelID: channel }
});
console.log(query); // returns 0, expected behavior given query2 returns empty
I'm attempting to delete a record from a table named Towers by passing a channel ID to it, which is expected to be unique. However, when I make any query on the database with a WHERE clause, the query returns an empty set-- even when, in this example, I sanity-checked and verified that the value I'm attempting to remove is present in the table. This occurs for both findAll() and findOne() as long as a WHERE clause is present.
(For posterity, I've double and triple checked that channelID was spelled correctly and with the correct capitalization in all instances.)
I'm happy to provide any more information if needed!
EDIT: As requested, the model definition...
const Towers = sequelize.define('Towers', {
    serverID: {
        type: Sequelize.INTEGER,
        allowNull: false,
    },
    channelID: {
        type: Sequelize.INTEGER,
        unique: true,
        allowNull: false,
    },
    pattern: Sequelize.STRING,
    height: Sequelize.INTEGER,
    delay: Sequelize.BOOLEAN,
});
channel in the snippet in the original post is defined as parseInt(interaction.options.getChannel('channel').id).
To anyone who happens to have the same issue I did, the answer is a doozy.
I wanted to store Discord server and channel IDs as integers, even though they're returned to you as strings when calling the API. As it turns out, Discord snowflakes are larger than float64 precision allows, which is what JS uses for numbers. When the strings were parsed into integers for insertion into my table, the value changed from the intended number, and I was creating erroneous records.
In my case (with the actual numbers obfuscated), interaction.options.getChannel('channel').id returned "837512533934092340", while parseInt(interaction.options.getChannel('channel').id) returned 837512533934092300. The number I was adding to the table was somehow 40 less!
I'm not sure if this could be fixed by using BigInt, but since it's going into a different structure anyway, I just shrugged, changed the serverID and channelID types to Sequelize.STRING in the model definition, and removed the parseInt calls. Works like a charm now.
Good opportunity to shake my fist at JS though.
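For anyone curious, a small illustration of the precision cliff (using the obfuscated snowflake from above); BigInt does round-trip the value exactly, at the cost of converting at the storage boundary:
const id = "837512533934092340";           // 18-digit Discord snowflake as a string
console.log(Number(id));                   // 837512533934092300 -> precision lost
console.log(Number.MAX_SAFE_INTEGER);      // 9007199254740991, well below a snowflake
console.log(BigInt(id));                   // 837512533934092340n -> exact
console.log(BigInt(id).toString() === id); // true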

dynamo db FilterExpression, find in json object using value as key

Is it possible to somehow filter results by a key name that is stored in the same object?
I have a JSON object "keys"; its "default" property stores the key of the object I need. Is it somehow possible to filter like keys[keys.default].type = some_type?
var params = {
    TableName: 'TABLE_NAME',
    IndexName: 'TABLE_INDEX', // optional (if querying an index)
    KeyConditionExpression: 'myId = :value',
    FilterExpression: '#kmap[#kmap.#def].#tp = :keyval',
    ExpressionAttributeNames: { // a map of substitutions for names with special characters
        '#kmap': 'keys',
        '#tp': 'type',
        '#def': 'default'
    },
    ExpressionAttributeValues: { // a map of substitutions for all attribute values
        ':value': '1',
        ':keyval': 'some_type'
    },
    Limit: 10, // optional (limit the number of items to evaluate)
    ProjectionExpression: "displayName, #kmap",
    ReturnConsumedCapacity: 'TOTAL', // optional (NONE | TOTAL | INDEXES)
};
docClient.query(params, function(err, data) {
    if (err) ppJson(err); // an error occurred
    else ppJson(data);    // successful response
});
I'm pretty sure the answer is no.
This keys[keys.default] is not even a valid expression, as far as I can tell.
Of course, you can do this in two steps:
First, query to get the default key.
Then query to get the value.
Don't forget, filters are only applied to the result set: the operation still requires a linear traversal as specified by your Query or Scan, so you can probably more easily run the filter on the client.
And lastly, if this is a typical query you need to perform, as an optimization you can lift the default key and value to be top-level attributes on the item. Then you can create a GSI on that attribute and do efficient lookups.
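A hedged sketch of the two-step approach (parameter names taken from the question's params; the item shape is my assumption):
// Step 1: fetch only the "keys" map to learn the default key
docClient.query({
    TableName: 'TABLE_NAME',
    KeyConditionExpression: 'myId = :value',
    ExpressionAttributeValues: { ':value': '1' },
    ExpressionAttributeNames: { '#kmap': 'keys' },
    ProjectionExpression: '#kmap'
}, function(err, data) {
    if (err) return ppJson(err);
    // Step 2: resolve keys[keys.default].type on the client
    var item = data.Items[0];
    var defaultKey = item.keys['default'];
    if (item.keys[defaultKey].type === 'some_type') {
        ppJson(item); // matched
    }
});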

How to publish a view/transform of a collection in Meteor?

I have made a collection
var Words = new Meteor.Collection("words");
and published it:
Meteor.publish("words", function() {
return Words.find();
});
so that I can access it on the client. Problem is, this collection is going to get very large and I just want to publish a transform of it. For example, let's say I want to publish a summary called "num words by length", which is an array of ints, where the index is the length of a word and the item is the number of words of that length. So
wordsByLength[5] = 12;
means that there are 12 words of length 5. In SQL terms, it's a simple GROUP BY/COUNT over the original data set. I'm trying to make a template on the client that will say something like
You have N words of length X
for each length. My question boils down to "I have my data in form A, and I want to publish a transformed version, B".
UPDATE: You can transform a collection on the server like this:
Words = new Mongo.Collection("collection_name");

Meteor.publish("yourRecordSet", function() {
    // transform function: applied to each document before publishing
    var transform = function(doc) {
        doc.date = new Date();
        return doc;
    };
    var self = this;
    var observer = Words.find().observe({
        added: function(document) {
            self.added('collection_name', document._id, transform(document));
        },
        changed: function(newDocument, oldDocument) {
            self.changed('collection_name', oldDocument._id, transform(newDocument));
        },
        removed: function(oldDocument) {
            self.removed('collection_name', oldDocument._id);
        }
    });
    self.onStop(function() {
        observer.stop();
    });
    self.ready();
});
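On the client, a subscription along these lines should pick up the transformed documents (a sketch; the publication and collection names come from the snippet above):
// client side: subscribe and read the transformed documents reactively
Meteor.subscribe('yourRecordSet');
var Words = new Mongo.Collection('collection_name');
Tracker.autorun(function() {
    console.log(Words.find().fetch()); // each doc carries the server-added `date`
});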
To wrap the transformations mentioned in other answers, you could use the package I developed, meteor-middleware. It provides a nice pluggable API for this: instead of providing just one transform, you can stack them on one another. This allows for code reuse, permission checks (like removing or aggregating fields based on permissions), etc. So you could create a class which aggregates documents in the way you want.
But for your particular case you might want to look into the MongoDB aggregation pipeline. If there are really a lot of words, you probably do not want to transfer all of them from the MongoDB server to the Meteor server side. On the other hand, the aggregation pipeline lacks the reactivity you might want, where published counts change as words come and go.
To address that you could use another package I developed, PeerDB. It allows you to specify triggers which are reactively called as data changes, with the results stored in the database. Then you can use normal publishing to send the counts to the client. The downside is that all users share the same collection; it works globally, not per user. But if you are interested in word counts for the whole collection, you could do something like this (in CoffeeScript):
class WordCounts extends Document
  @Meta
    name: 'WordCounts'

class Words extends Document
  @Meta
    name: 'Words'
    triggers: =>
      countWords: @Trigger ['word'], (newDocument, oldDocument) ->
        # Document has been removed.
        if not newDocument._id
          WordCounts.update
            length: oldDocument.word.length
          ,
            $inc:
              count: -1
        # Document has been added.
        else if not oldDocument._id
          WordCounts.update
            length: newDocument.word.length
          ,
            $inc:
              count: 1
        # Word length has changed.
        else if newDocument.word.length isnt oldDocument.word.length
          WordCounts.update
            length: oldDocument.word.length
          ,
            $inc:
              count: -1
          WordCounts.update
            length: newDocument.word.length
          ,
            $inc:
              count: 1
And then you could simply publish WordCounts documents:
Meteor.publish 'counts', ->
  WordCounts.documents.find()
You could assemble the counts by going through each document in Words (cursor forEach):
var countingCursor = Words.find({});
var wordCounts = {};
countingCursor.forEach(function(word) {
    // initialize the bucket for this length on first sight
    wordCounts[word.length] = wordCounts[word.length] || { count: 0, words: [] };
    wordCounts[word.length].count += 1;
    wordCounts[word.length].words.push(word);
});
create a local collection,
var counts = new Meteor.Collection('local-counts-collection', {connection: null});
and insert your answers
var key, value;
for (key in wordCounts) {
    value = wordCounts[key]; // the original had `object[key]`, which was undefined
    counts.insert({
        length: key,
        count: value.count,
        members: value.words
    });
}
Counts is now a collection, just not stored in Mongo.
Not tested!

Meteor duplicate insert conflict resolution

Is there a design pattern in meteor application to handle multiple clients inserting the same logical record 'simultaneously'?
Specifically I have a scoring type application, and multiple clients could create the initial, basically blank, Score record for an Entrant when the entrant is ready to start. The appearance of the record is then used to make it available on the page for editing by the officials, incrementing penalty counts and such.
Stages = new Meteor.Collection("contests");
Entrants = new Meteor.Collection("entrants");
Scores = new Meteor.Collection("scores");
// official picks the next entrant
Scores.insert({ stage_id: xxxx, entrant_id: yyyy });
I am happy with the implications of the conflict resolution of edits to the Score record once it is in the Collection. I am not sure how to deal with multiple clients trying to insert the Score for the same stage_id/entrant_id pair.
In a synchronous app I would tend to use some form of interlocking, or a relational DB key constraint.
Well, according to this answer, the Meteor $upsert flag is still on the enhancement list and seems likely to land in the stable branch after the 1.0 release.
So the first way is, as said there, to add a unique index:
All the implementation ways are listed here. I would recommend you to use native Mongo indexes, not a code implementation.
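For instance, a unique compound index created directly in the Mongo shell makes the database itself reject a second Score for the same pair (the collection name scores is my assumption):
// mongo shell: any second insert for the same stage/entrant pair fails
db.scores.ensureIndex({ stage_id: 1, entrant_id: 1 }, { unique: true });
// (newer MongoDB versions: db.scores.createIndex(...))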
The optimistic concurrency way is much more complicated, given that MongoDB has no transactions.
Here is my implementation of it (be careful, it might be buggy):
var result_callback = function(_id) {
    // callback to call on a successful insert
};

var $set = { stage_id: xxxx, entrant_id: xxxx };
var created_at = Date.now().toFixed();
var $insert = _.extend({}, $set, { created_at: created_at });

Scores.insert($insert, function(error, _id) {
    if (error) {
        // handle it
        return;
    }
    var entries = Scores.find($set, { sort: { created_at: -1 } }).fetch();
    if (entries.length > 1) {
        // keep the oldest entry and remove the newer duplicates
        var duplicates = entries.splice(0, entries.length - 1);
        var duplicate_ids = _.map(duplicates, function(entry) {
            return entry._id;
        });
        Scores.remove({ _id: { $in: duplicate_ids } });
        Scores.update(entries[0]._id, $set, function(error) {
            if (error) {
                // handle it
            } else {
                result_callback(entries[0]._id);
            }
        });
    } else {
        result_callback(_id);
    }
});
Hope this gives you some good ideas!
Sorry, the previous version of my answer was completely incorrect.
