getting the right data model with dynamodb


I am about to create my first DynamoDB table and can't find a proper way to model my requirements. It sounds very basic, but my brain is probably still too deep in the relational database world.
I want to store something like this:
A user can buy a product (once or several times). What I want to store is username and product_id.
The only things I need to query later are:
which products have been purchased by user X
how many times were they purchased
First I considered having an item with two attributes: username and product_id. But then I cannot use username as the primary key (a user can buy more than once), nor can I use username + product_id (a user can buy the same product several times).
Now I would go for username, product_id, and counter, taking username + product_id as the primary key. However, I would always need to check first whether a product was already purchased and update it, otherwise create a new entry. To get all products of a user, I would create a global secondary index on username.
However, I am not sure this is the right way. Any feedback would be great!
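The check-then-update step may not be needed. As a minimal sketch, assuming a table hypothetically named Purchases with username as the partition key and product_id as the sort key, a single UpdateItem call with an ADD action creates the item on the first purchase and increments the counter afterwards. With that key schema, a plain Query on username lists a user's products, so no extra index is required for that. The helpers below only build request parameters for boto3's low-level client; the table name and helper names are illustrative assumptions.

```python
# Sketch only: "Purchases" and both helper names are hypothetical.
# The dicts match the keyword arguments of boto3's low-level DynamoDB client.

TABLE_NAME = "Purchases"  # assumed: username = partition key, product_id = sort key


def record_purchase_params(username: str, product_id: str) -> dict:
    """Build UpdateItem parameters that add 1 to the purchase counter.

    ADD creates the item (and the attribute) if it does not exist yet,
    so no read-before-write is needed.
    """
    return {
        "TableName": TABLE_NAME,
        "Key": {
            "username": {"S": username},
            "product_id": {"S": product_id},
        },
        # "#c" aliases the attribute name in case it clashes with a reserved word.
        "UpdateExpression": "ADD #c :one",
        "ExpressionAttributeNames": {"#c": "counter"},
        "ExpressionAttributeValues": {":one": {"N": "1"}},
        "ReturnValues": "UPDATED_NEW",
    }


def products_of_user_params(username: str) -> dict:
    """Build Query parameters returning every product item of one user."""
    return {
        "TableName": TABLE_NAME,
        "KeyConditionExpression": "username = :u",
        "ExpressionAttributeValues": {":u": {"S": username}},
    }


if __name__ == "__main__":
    # With AWS credentials configured, the dicts could be used like this:
    #   import boto3
    #   client = boto3.client("dynamodb")
    #   client.update_item(**record_purchase_params("alice", "p-1"))
    #   client.query(**products_of_user_params("alice"))
    print(record_purchase_params("alice", "p-1"))
```

Each returned item would then carry the product_id and its counter, which covers both queries listed above.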

There are probably several ways to do this, and I don't know all of your requirements, so I can't guarantee this is the right answer for you. Based on your description, though, this is what I would do.
First, I'm assuming that each order has some sort of unique order number associated with it. I would use this order number as the primary key of the table, with no range key. This satisfies the constraint that all primary keys be unique. When writing the data to DynamoDB, I would also write the username and the product_id as additional attributes.
Next, I would create a Global Secondary Index (GSI) that uses the username as the hash key and the product_id as the range key. Unlike the primary key of the table, GSI keys do not have to be unique, so it is fine if a user purchased a particular product more than once. This GSI would allow me to perform queries such as "find all orders by username" or "find all orders where username purchased product_id".
If you also needed queries like "find all usernames who purchased product_id", you would need another GSI that uses product_id as the hash key and username as the range key.
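As a rough sketch of the layout described in this answer, the helpers below build the request parameters that boto3's low-level client accepts for CreateTable and Query. The table name, index name, function names, string attribute types, and on-demand billing mode are all assumptions made for illustration.

```python
# Sketch only: table, index, and function names are hypothetical.
from typing import Optional

TABLE_NAME = "Orders"
INDEX_NAME = "username-product_id-index"


def create_orders_table_params() -> dict:
    """CreateTable parameters: order_id keys the table, the GSI is keyed on username + product_id."""
    return {
        "TableName": TABLE_NAME,
        "AttributeDefinitions": [
            {"AttributeName": "order_id", "AttributeType": "S"},
            {"AttributeName": "username", "AttributeType": "S"},
            {"AttributeName": "product_id", "AttributeType": "S"},
        ],
        "KeySchema": [{"AttributeName": "order_id", "KeyType": "HASH"}],
        "GlobalSecondaryIndexes": [
            {
                "IndexName": INDEX_NAME,
                "KeySchema": [
                    {"AttributeName": "username", "KeyType": "HASH"},
                    {"AttributeName": "product_id", "KeyType": "RANGE"},
                ],
                "Projection": {"ProjectionType": "ALL"},
            }
        ],
        "BillingMode": "PAY_PER_REQUEST",  # assumed; avoids throughput settings
    }


def orders_query_params(
    username: str, product_id: Optional[str] = None, count_only: bool = False
) -> dict:
    """Query the GSI for a user's orders, optionally narrowed to one product."""
    condition = "username = :u"
    values = {":u": {"S": username}}
    if product_id is not None:
        condition += " AND product_id = :p"
        values[":p"] = {"S": product_id}
    params = {
        "TableName": TABLE_NAME,
        "IndexName": INDEX_NAME,
        "KeyConditionExpression": condition,
        "ExpressionAttributeValues": values,
    }
    if count_only:
        params["Select"] = "COUNT"  # return only the number of matches
    return params


if __name__ == "__main__":
    # e.g. boto3.client("dynamodb").query(**orders_query_params("alice", "p-1", count_only=True))
    print(orders_query_params("alice", "p-1", count_only=True))
```

With count_only=True the response carries a Count field instead of items. Query results are paginated, so for a user with many orders the counts of all pages would have to be summed.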

Related

Using a GUID as entity Id vs the entity's "actual" Id

In every Cosmos DB repository example I've seen, the id/row key has been generated like this: {partitionKey}:{Guid.newGuid()}. I'm working on a web API where the user won't necessarily have any way of knowing what this random GUID is. But they will know the EmployeeId, ProjectId etc. of the respective object, so I'm wondering if there are any issues with using e.g. EmployeeId as both the partition key and the id?

There's nothing technically wrong with setting the id and the partition key to the same value. However, you will have just one document per partition, which is bad design IMHO, as every read query that spans more than one document (e.g. listing all employees) will be a cross-partition query.
One approach could be to set the partition key as the type of the entity (Employee, Project etc.) and then set the id as the unique identifier of the entity (employee id, project id etc.).

To be honest, if you know the partition key AND the item id, you can do a point read, which is the fastest kind of read.
We also used to use random GUIDs for all item IDs, but this means you always need to know both this id and the partition key. Sometimes a more functional key makes more sense as the item ID, so give it some thought!
And remember, an item ID is not globally unique; it only has to be unique within its partition key.
So you could have two items with the same item ID and different partition keys.
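To make the uniqueness rule concrete, here is a toy in-memory model. It is not the Cosmos DB SDK; the class and method names are invented for illustration, and the only thing it demonstrates is that an item is identified by the pair (partition key, id).

```python
# Toy in-memory model, not the Cosmos DB SDK: it only illustrates that
# an item is identified by the pair (partition key, id).


class ToyContainer:
    def __init__(self) -> None:
        self._items = {}  # maps (partition_key, item_id) -> document body

    def create_item(self, partition_key: str, item_id: str, body: dict) -> None:
        key = (partition_key, item_id)
        if key in self._items:
            # The same id twice inside one partition is a conflict.
            raise ValueError(
                f"id {item_id!r} already exists in partition {partition_key!r}"
            )
        self._items[key] = body

    def point_read(self, partition_key: str, item_id: str) -> dict:
        # Both values are needed to address exactly one item.
        return self._items[(partition_key, item_id)]


container = ToyContainer()
container.create_item("Employee", "42", {"name": "Ada"})
# Same id, different partition key: allowed.
container.create_item("Project", "42", {"title": "Migration"})
print(container.point_read("Employee", "42"))
```

Using the entity type as the partition key and the functional identifier as the id, as suggested above, keeps point reads possible for callers who only know the EmployeeId or ProjectId.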

Dynamodb for users table

From what I understand, DynamoDB has two ways of defining primary keys: simple primary keys that use only a partition key, and composite keys that specify both a partition key and a sort key.
What I'm stuck on right now is whether a users table should have a simple primary key or a composite one.
I have read, for example, that a users table is a good fit for a simple primary key, but I'm not sure how true that is.

As a general rule, you could use the user's email as the primary key, making it the partition key.
When I faced the same problem, I solved it using 2 tables:
Table 1 held all user data, like:
LICENSE_NUMBER, NAME, EMAIL, PASSWORD, ETC... where "LICENSE_NUMBER" is the primary key, mainly used to identify the user when needed.
I also had a table 2 with the following fields:
EMAIL, LICENSE_NUMBER, ETC... where "EMAIL" is the primary key, used to retrieve the "LICENSE_NUMBER" when the user logs in.
Final consideration:
Since DynamoDB does not support JOIN operations, I had to emulate a join by creating 2 tables, each with a simple primary key, tied together by the same EMAIL / LICENSE_NUMBER fields.
Hope this helps you when choosing the right pattern to build your user table.
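The two-step lookup described above could look roughly like the following sketch. The table names UsersByEmail and Users and the function name are hypothetical, and client is assumed to behave like boto3.client("dynamodb").

```python
# Sketch only: table names "UsersByEmail" and "Users" and the function name
# are hypothetical. `client` is expected to behave like boto3.client("dynamodb").


def find_user_by_email(client, email: str):
    """Emulate a join: EMAIL -> LICENSE_NUMBER -> full user record."""
    lookup = client.get_item(
        TableName="UsersByEmail",
        Key={"EMAIL": {"S": email}},
    )
    if "Item" not in lookup:
        return None  # unknown email

    # The value is already in DynamoDB's typed format, e.g. {"S": "LIC-1"},
    # so it can be passed straight into the second key.
    license_number = lookup["Item"]["LICENSE_NUMBER"]

    user = client.get_item(
        TableName="Users",
        Key={"LICENSE_NUMBER": license_number},
    )
    return user.get("Item")  # None if the second table has no such record
```

Keeping the two tables consistent is left to the application: when a user is created or changes their email, both tables have to be written.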

How to insert row into 2 tables with a one-to-one relationship in SQL Server

Suppose I have 2 tables with a one-to-one relation:
tblOrder (orderId, orderName, totalPrice, billId)
tblBill (billId, billAmount, cardNumber)
The orderId in tblOrder and billId in tblBill are the primary keys, and both are identity columns. Also, billId in tblOrder is a unique foreign key.
In the front end, using ASP.NET, I want customers to enter the tblOrder information into the database first, and then the tblBill information. But I want to automate the process of setting the billId foreign key on tblOrder. The problem is: when multiple users use the system at the same time, how can I know which bill belongs to which order?
One solution I thought of was to insert an empty row into tblBill and set tblOrder's billId foreign key to that row's id, then update the bill row when the customer enters the bill information in the front end. But it doesn't seem like an optimal solution, since one empty row insertion will happen for every purchase.

Query on non-key attribute

It appears that DynamoDB's query method must include the partition key as part of the filter. How can a query be performed if you do not know the partition key?
For example, you have a User table with the attribute userid set as the partition key. Now we want to look up a user by their phone number. Is it possible to perform the query without the partition key? Using the scan method, this goal can be achieved, but at the expense of pulling every item from the table before the filter is applied, as far as I know.

You'll need to set up a global secondary index (GSI), using your phoneNumber attribute as the index hash key.
You can create a GSI by calling UpdateTable.
Once you create the index, you'll be able to call Query with your IndexName, to pull user records based on the phone number.
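A minimal sketch of those two calls, expressed as the keyword arguments boto3's low-level client takes for UpdateTable and Query. The table name User, the index name, the helper names, and the assumption that phoneNumber is stored as a string are all illustrative.

```python
# Sketch only: table name, index name, and helper names are hypothetical.

TABLE_NAME = "User"
INDEX_NAME = "phoneNumber-index"


def add_phone_index_params() -> dict:
    """UpdateTable parameters that create a GSI keyed on phoneNumber."""
    return {
        "TableName": TABLE_NAME,
        # Every attribute used as an index key has to be declared here.
        "AttributeDefinitions": [
            {"AttributeName": "phoneNumber", "AttributeType": "S"},
        ],
        "GlobalSecondaryIndexUpdates": [
            {
                "Create": {
                    "IndexName": INDEX_NAME,
                    "KeySchema": [
                        {"AttributeName": "phoneNumber", "KeyType": "HASH"},
                    ],
                    "Projection": {"ProjectionType": "ALL"},
                    # A table in provisioned mode would also need a
                    # "ProvisionedThroughput" entry here.
                }
            }
        ],
    }


def user_by_phone_params(phone_number: str) -> dict:
    """Query parameters that look a user up through the GSI."""
    return {
        "TableName": TABLE_NAME,
        "IndexName": INDEX_NAME,
        "KeyConditionExpression": "phoneNumber = :p",
        "ExpressionAttributeValues": {":p": {"S": phone_number}},
    }


if __name__ == "__main__":
    # With AWS credentials configured:
    #   import boto3
    #   client = boto3.client("dynamodb")
    #   client.update_table(**add_phone_index_params())
    #   client.query(**user_by_phone_params("+15550100"))
    print(user_by_phone_params("+15550100"))
```

Since GSI keys need not be unique, the query may return more than one user if several share a phone number.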

How to design DynamoDB table to facilitate searching by time ranges, and deleting by unique ID

I'm new to DynamoDB - I already have an application where the data gets inserted, but I'm getting stuck on extracting the data.
Requirement:
There must be a unique table per customer
Insert documents into the table (each doc has a unique ID and a timestamp)
Get X number of documents based on timestamp (ordered ascending)
Delete individual documents based on unique ID
So far I have created a table with a composite key (S:id, N:timestamp). However, when I come to query it, I realise that since my id is unique and I can't do a wildcard search on it, I won't be able to extract a range of items...
So, how should I design my table to satisfy this scenario?
Edit: Here's what I'm thinking:
Primary index will be composite: (s:customer_id, n:timestamp) where customer ID will be the same within a table. This will enable me to extract data based on time range.
Secondary index will be hash (s: unique_doc_id) whereby I will be able to delete items using this index.
Does this sound like the correct solution? Thank you in advance.

You can satisfy the requirements like this:
Your primary key will be h:customer_id and r:unique_id. This makes sure all the elements in the table have different keys.
You will also have a timestamp attribute with a Local Secondary Index (LSI) on it.
You will use the LSI for requirement 3, and DeleteItem (or the BatchWriteItem API call to delete several documents at once) for requirement 4.
This solution doesn't need requirement 1: all customers can stay in the same table. (Heads up: there is a default limit of 256 tables per account, beyond which you have to contact AWS.)
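The design in this answer might be sketched as follows, again as parameter dicts for boto3's low-level client. The table name Documents, the index name, the helper names, and the on-demand billing mode are assumptions. Two details worth noting: an LSI has to be defined when the table is created, and timestamp is aliased in the query because it is on DynamoDB's reserved-word list.

```python
# Sketch only: table name, index name, and helper names are hypothetical.

TABLE_NAME = "Documents"
INDEX_NAME = "by-timestamp"


def create_documents_table_params() -> dict:
    """CreateTable parameters: key (customer_id, unique_id) plus an LSI on timestamp."""
    return {
        "TableName": TABLE_NAME,
        "AttributeDefinitions": [
            {"AttributeName": "customer_id", "AttributeType": "S"},
            {"AttributeName": "unique_id", "AttributeType": "S"},
            {"AttributeName": "timestamp", "AttributeType": "N"},
        ],
        "KeySchema": [
            {"AttributeName": "customer_id", "KeyType": "HASH"},
            {"AttributeName": "unique_id", "KeyType": "RANGE"},
        ],
        "LocalSecondaryIndexes": [
            {
                "IndexName": INDEX_NAME,
                # An LSI shares the table's hash key and swaps the range key.
                "KeySchema": [
                    {"AttributeName": "customer_id", "KeyType": "HASH"},
                    {"AttributeName": "timestamp", "KeyType": "RANGE"},
                ],
                "Projection": {"ProjectionType": "ALL"},
            }
        ],
        "BillingMode": "PAY_PER_REQUEST",  # assumed
    }


def documents_in_range_params(customer_id: str, start: int, end: int, limit: int) -> dict:
    """Requirement 3: up to `limit` documents in a time range, oldest first."""
    return {
        "TableName": TABLE_NAME,
        "IndexName": INDEX_NAME,
        "KeyConditionExpression": "customer_id = :c AND #ts BETWEEN :start AND :end",
        "ExpressionAttributeNames": {"#ts": "timestamp"},
        "ExpressionAttributeValues": {
            ":c": {"S": customer_id},
            ":start": {"N": str(start)},  # numbers travel as strings
            ":end": {"N": str(end)},
        },
        "ScanIndexForward": True,  # ascending by the index range key
        "Limit": limit,
    }


def delete_document_params(customer_id: str, unique_id: str) -> dict:
    """Requirement 4: delete one document by its full primary key."""
    return {
        "TableName": TABLE_NAME,
        "Key": {
            "customer_id": {"S": customer_id},
            "unique_id": {"S": unique_id},
        },
    }


if __name__ == "__main__":
    # e.g. boto3.client("dynamodb").query(**documents_in_range_params("acme", 0, 1700000000, 10))
    print(delete_document_params("acme", "doc-1"))
```

Note that deleting a document needs the customer_id as well as the unique_id, since both are part of the table's primary key.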
