DynamoDB for a users table

From what I understand, DynamoDB has two ways of defining primary keys: simple primary keys that use only a partition key, and composite keys that specify both a partition key and a sort key.
What I'm stuck on right now is whether a users table should have a simple primary key or a composite one.
Here, for example, it says that a users table is a good example of a simple primary key, but I'm not sure how true that is.

As a general rule, you could use the user's email as the primary key, making it the partition key.
When I faced the same problem, I solved it using two tables.
Table 1 held all the user data:
LICENSE_NUMBER, NAME, EMAIL, PASSWORD, etc., where "LICENSE_NUMBER" is the primary key, used to identify the user whenever needed.
Table 2 had the following fields:
EMAIL, LICENSE_NUMBER, etc., where "EMAIL" is the primary key, used to retrieve the "LICENSE_NUMBER" when the user logs in.
Final consideration:
Since DynamoDB does not support JOIN operations, I emulated a join by splitting the data into two tables, each with a simple primary key, tied together by the shared EMAIL and LICENSE_NUMBER fields.
Hope this helps you when choosing the right pattern to build your user table.
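The two-table lookup described above could be sketched roughly as follows. This is only an illustration: the table names "Users" and "UserEmails" are made up, and `docClient` is assumed to be an AWS.DynamoDB.DocumentClient from the AWS SDK for JavaScript v2.

```javascript
// Hypothetical table names; the attribute names follow the answer above.
const USERS_TABLE = 'Users';            // table 1, keyed by LICENSE_NUMBER
const EMAIL_INDEX_TABLE = 'UserEmails'; // table 2, keyed by EMAIL

// Parameters for reading the EMAIL -> LICENSE_NUMBER mapping (table 2).
function emailLookupParams(email) {
  return { TableName: EMAIL_INDEX_TABLE, Key: { EMAIL: email } };
}

// Parameters for reading the full user record (table 1).
function userParams(licenseNumber) {
  return { TableName: USERS_TABLE, Key: { LICENSE_NUMBER: licenseNumber } };
}

// Emulates the "join" with two reads, one per table.
// `docClient` is assumed to be an AWS.DynamoDB.DocumentClient (SDK v2).
async function findUserByEmail(docClient, email) {
  const mapping = await docClient.get(emailLookupParams(email)).promise();
  if (!mapping.Item) return null; // no user registered with this email
  const user = await docClient
    .get(userParams(mapping.Item.LICENSE_NUMBER))
    .promise();
  return user.Item || null;
}
```

Each login costs two reads this way, and both tables have to be written whenever a user is created or changes their email.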


Fetch the last item of an AWS DynamoDB table

I want to fetch the last item/row of my DynamoDB table, but I can't find any resources on how to do it. My primary key is id, a series of incrementing numbers (1, 2, 3, ...), one per row.
This is my function:
async function readMessage() {
  const params = {
    TableName: table,
  };
  return dynamo.getItem(params).promise();
}
I am not sure what I should add to my params.
DynamoDB has two types of primary keys:
Partition key – A simple primary key, composed of one attribute known as the partition key.
Partition key and sort key – Referred to as a composite primary key, this type of key is composed of two attributes. The first attribute is the partition key, and the second attribute is the sort key.
When fetching an item by partition key, you need to specify the exact partition key. You cannot fetch the max/min partition key.
Instead, you may want to create a sort key with a timestamp (or the ID if it's a sequential number) and use the sort key to fetch the last item.
Check out the AWS docs on Choosing the Right Partition Key for more info.
The proper way to design a DynamoDB table is to start from its expected access patterns. If this is something you need, consider using this id as the sort key rather than the partition key (under a partition key whose value you know), then query the table in descending order with a limit of 1.
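A minimal sketch of that descending query, assuming the schema has been reworked so that a hypothetical attribute `pk` is the partition key and the incrementing `id` is the sort key. `docClient` is assumed to be an AWS.DynamoDB.DocumentClient (AWS SDK for JavaScript v2).

```javascript
// Builds Query parameters that return only the item with the highest
// sort key within one partition. The attribute name 'pk' is hypothetical.
function lastItemParams(tableName, partitionValue) {
  return {
    TableName: tableName,
    KeyConditionExpression: '#pk = :pk',
    ExpressionAttributeNames: { '#pk': 'pk' },
    ExpressionAttributeValues: { ':pk': partitionValue },
    ScanIndexForward: false, // read in descending sort-key order
    Limit: 1,                // stop after the first (largest) item
  };
}

// Returns the last item of the partition, or null if it is empty.
async function readLastMessage(docClient, tableName, partitionValue) {
  const result = await docClient
    .query(lastItemParams(tableName, partitionValue))
    .promise();
  return result.Items.length > 0 ? result.Items[0] : null;
}
```

`ScanIndexForward: false` reverses the sort-key order, so combined with `Limit: 1` the query reads a single item rather than the whole partition.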
If you don't want to change the schema of your items, and you don't mind needing at least two operations, you have two non-optimal options:
If none of your items ever gets deleted, count the items first and use that count to work out the id of the latest item written.
Alternatively, you could keep a "special" record in your DynamoDB table that holds a counter, incremented every time one of your "other" items is written. On retrieval, you first read this special record and then use its value to fetch the actual item.
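The counter-record option might look something like the sketch below. It assumes the numeric id 0 is reserved for the counter item and that the counter lives in a hypothetical attribute `lastId`; neither comes from the question. `docClient` is again assumed to be an AWS.DynamoDB.DocumentClient (SDK v2).

```javascript
const COUNTER_ID = 0; // assumption: id 0 is reserved for the counter record

// Parameters that atomically increment the counter and return its new value.
// ADD creates the attribute (starting from 0) if it does not exist yet.
function incrementCounterParams(tableName) {
  return {
    TableName: tableName,
    Key: { id: COUNTER_ID },
    UpdateExpression: 'ADD #n :one',
    ExpressionAttributeNames: { '#n': 'lastId' },
    ExpressionAttributeValues: { ':one': 1 },
    ReturnValues: 'UPDATED_NEW',
  };
}

// Two reads: first the counter record, then the item it points to.
async function readLatest(docClient, tableName) {
  const counter = await docClient
    .get({ TableName: tableName, Key: { id: COUNTER_ID } })
    .promise();
  if (!counter.Item) return null; // nothing has been written yet
  const latest = await docClient
    .get({ TableName: tableName, Key: { id: counter.Item.lastId } })
    .promise();
  return latest.Item || null;
}
```

A writer would send `incrementCounterParams(...)` through `docClient.update` and use the returned value as the id of the new item. The increment and the item write are still two separate operations, so a reader can briefly see a counter value whose item has not been written yet.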
The combination of the partition key and the sort key makes up the primary key of an item in DynamoDB, so the combination must be unique; otherwise the existing item is overwritten.
In almost all my use cases, I choose an attribute of the object as the partition key, such as the brand, an email, or a class, and a timestamp as the sort key. That way you always know the partition key, which is required to retrieve values, and you can then query DynamoDB with conditions on the sort key. For more extensive examples using Python, see the AWS page https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/GettingStarted.Python.04.html, which shows how to query your DynamoDB items.
There are also other ways to define the keys in DynamoDB; for those I advise you to check https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/bp-sort-keys.html

Choosing Primary key for DynamoDB

A bit of context: I am trying to build an inventory to list my AWS resources in various accounts, and I am planning to use DynamoDB to store the data. These will be the columns for my table: ResourceARN, ResourceName, ResourceType, StandardTag, IsDeleted, LastUpdateTime and ResourceCreationDate (this field is available only for a few resource types, such as EC2).
Question: I want to query my DDB table by account ID, resource type, and tag name, but I am stumped on choosing the primary key for the table. Since the primary key should be unique and has to have a 1:many relationship, I cannot use a combination of resourceType and account ID. Nor can I use resourceArn as my primary key, since it is a 1:1 relationship. Using the resourceARN as the sort key does not make sense to me either. I understand that I could use a simple scan operation, but that is very costly and will get slower as I add more data to my DDB table.
I would appreciate any suggestions or guidance on this.
Short answer
Partition key: Account ID
Sort key: <resource type>/<resource ID>
Rationale
It's a common pattern for a sort key to be a string concatenating multiple attributes. Since sort keys can be queried by prefix, you can leverage this in your queries:
Get all account resources: query all sort keys on the Account ID partition key
Get all EC2 instances of an account: query with partition key = <your account ID> and sort key begins_with('ec2-instance').
You may notice that ARNs follow such a hierarchy as well (which is probably not a coincidence). This approach effectively uses a subset of the ARN as the sort key.
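As a rough sketch of the prefix query described above: the attribute names `AccountId` and `ResourceKey` are hypothetical, and the parameters are in the shape accepted by `query` on an AWS.DynamoDB.DocumentClient (AWS SDK for JavaScript v2).

```javascript
// Sort key in the form "<resource type>/<resource ID>", as suggested above.
function resourceSortKey(resourceType, resourceId) {
  return `${resourceType}/${resourceId}`;
}

// Query parameters for all resources of one type within one account.
// 'AccountId' and 'ResourceKey' are hypothetical attribute names.
function resourcesByTypeParams(tableName, accountId, resourceType) {
  return {
    TableName: tableName,
    KeyConditionExpression: '#pk = :account AND begins_with(#sk, :prefix)',
    ExpressionAttributeNames: { '#pk': 'AccountId', '#sk': 'ResourceKey' },
    ExpressionAttributeValues: {
      ':account': accountId,
      // The trailing slash keeps "ec2-instance" from also matching a
      // longer type name that merely starts with the same characters.
      ':prefix': `${resourceType}/`,
    },
  };
}
```

Dropping the `begins_with` condition and keeping only the partition key condition gives the "all resources of an account" query.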
Some notes:
DynamoDB items have attributes rather than fixed columns. You don't need to include ResourceCreationDate in the records that don't have it, and omitting it saves space (see next point).
Attribute names count toward the storage of every record, which affects cost and throughput. For this reason it's common to use short attribute names (rcd instead of ResourceCreationDate, for example).
You can use LSIs (Local Secondary Indexes) to order by creation and update times if you need this.

DynamoDB key made up of 3 fields

Say I have an RDBMS table with a composite primary key, e.g. field1, field2, field3, which together uniquely identify a record in the table. How can I model this in DynamoDB, given that its primary key can only be made up of two fields (a partition key and a sort key)?
You may need to combine them into one value, for example by concatenating them with a field delimiter, such as field1_field2_field3 as the partition key. If you need sorting, you can also use a sort key. Note that a Query must match the partition key exactly. If the concatenated value is used as a sort key instead, you can also match on a leading prefix such as field1_ with begins_with, whereas matching on field2 or _field3 alone cannot be done with a key condition and needs, for example, a Scan with a filter.
Reference: https://aws.amazon.com/blogs/database/choosing-the-right-dynamodb-partition-key/
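The concatenation idea can be sketched with two small helpers. The function names are made up for illustration, and the sketch assumes the delimiter never occurs inside a field value; if it can, a different delimiter or some form of escaping is needed.

```javascript
const DELIMITER = '_'; // assumed never to occur inside a field value

// Joins the fields of a multi-column key into a single key string.
function compositeKey(...fields) {
  for (const field of fields) {
    if (String(field).includes(DELIMITER)) {
      throw new Error(`Field value must not contain "${DELIMITER}": ${field}`);
    }
  }
  return fields.join(DELIMITER);
}

// Splits a key produced by compositeKey back into its fields (as strings).
function splitCompositeKey(key) {
  return key.split(DELIMITER);
}
```

For example, `compositeKey('a', 'b', 'c')` yields `'a_b_c'`. Rejecting values that contain the delimiter keeps two different field combinations from collapsing into the same key.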

Getting the right data model with DynamoDB

I am about to create my first DynamoDB table and can't find a proper way to model my requirements. It sounds very basic, but my brain is probably still too deep in the relational database world.
I want to store something like this:
A user can buy a product (once or several times). What I want to store is username, product_id
The only things I need to query later are:
which products have been purchased by user X
how many times were they purchased
First I considered having an item with two attributes: username and product_id. But then I cannot use username as the primary key (a user can buy more than once), nor can I use username + product_id (a user can buy a product several times).
Now I would go for username, product_id, and counter, taking username + product_id as the primary key. However, I would always need to check first whether a product was already purchased and update it, or otherwise create a new entry. To get all products of a user, I would create a global secondary index on username.
However, I am not very sure if this is the right way. Any feedback would be great!
There are probably a number of ways to do this, and I don't know all of your requirements, so I can't guarantee this is the right answer for you. Based on your description, though, this is what I would do.
First, I'm assuming that each order has some sort of unique order number associated with it. I would use this order number as the primary key of the table, without a range key (sort key). This ensures that the constraint that all primary keys be unique is met. When I write the data to DynamoDB, I would also write the username and the product_id as additional attributes.
Next, I would create a Global Secondary Index that uses the username as the partition (hash) key and the product_id as the range key. Unlike the primary key of the table, GSI keys do not have to be unique, so it is fine if a user purchased a particular product more than once. This GSI would allow me to perform queries such as "find all orders by username" or "find all orders where username purchased product_id".
If you also needed queries like "find all usernames who purchased product_id", you would need another GSI that uses product_id as the partition key and username as the range key.
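A sketch of how the first GSI could answer both of the original questions (which products a user bought, and how many times). The table name "Orders" and the index name "username-product-index" are hypothetical; the parameters are in the shape accepted by `query` on an AWS.DynamoDB.DocumentClient (AWS SDK for JavaScript v2).

```javascript
// Query parameters for the username/product_id GSI described above.
// 'Orders' and 'username-product-index' are hypothetical names.
function ordersByUserParams(username) {
  return {
    TableName: 'Orders',
    IndexName: 'username-product-index',
    KeyConditionExpression: '#u = :u',
    ExpressionAttributeNames: { '#u': 'username' },
    ExpressionAttributeValues: { ':u': username },
  };
}

// Counts purchases per product from the items such a query returns.
function countPurchases(items) {
  const counts = {};
  for (const item of items) {
    counts[item.product_id] = (counts[item.product_id] || 0) + 1;
  }
  return counts;
}
```

Because every order is its own item, nothing needs to be checked or updated at purchase time; the count is derived when reading. Query results are paginated, so a user with many orders may need several calls, following `LastEvaluatedKey`, before the counts are complete.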

Sqlite, is Primary Key important if I don't need auto-increment?

I only use an integer primary key ID for its "auto-increment" function.
What if I don't need auto-increment? Do I still need a primary key if I don't care about the uniqueness of records?
Example: Let's compare this table:
create table if not exists `table1`
(
    name text primary key,
    tel text,
    address text
);
with this:
create table if not exists `table2`
(
    name text,
    tel text,
    address text
);
table1 declares a primary key and table2 doesn't. Will anything bad happen with table2?
I don't need the record to be unique.
SQLite is a relational database system. So it's all about relations. You build relations between tables on keys.
You can have tables without a primary key; it is not necessary for a table to have a primary key. But you will almost always want a primary key to show what makes a record unique in that table and to build relations.
In your example, what would it mean to have two identical records? They would mean the same person, no? Then how would you count how many persons named Anna are in the database? If you count five, how many of them are unique, how many are mere duplicates? Such queries can be done properly, but get overly complicated because of the lacking primary key. And how would you build relations, say the cars a person drives? You would have a car table and then how to link it to the persons table in your example?
There are cases when you want a table without a primary key. These are usually log tables and the like. They are rare. Whenever you are creating a table without a primary key, ask yourself why this is the case. Maybe you are about to build something messy ;-)
You get auto-incrementing primary keys only when a column is declared as INTEGER PRIMARY KEY; other data types result in plain primary keys.
You are not required to declare a PRIMARY KEY.
But even if you do not do this, there will be some column(s) used to identify and look up records.
The PRIMARY KEY declaration helps to document this, enforces uniqueness, and optimizes lookups through the implicit index.
