DynamoDB Composite Key vs Simple Key Performance

Table: user_group_data
Option 1:
Primary Composite Key:
Partition Key: user_id
Sort Key: group_id
Option 2:
Primary Simple Key:
Partition Key: (user_id + group_id)
Question: Will the lookup speed for a record, given both user_id and group_id, be the same for Option 1 and Option 2?

Yes.
Looking up something by its full primary key is effectively an O(1) operation in every DB I've ever heard about, and in both options you supply the full key.
Given that user_id appears to be unique, it would seem to make more sense to use it alone as the key:
hk: user_id
Or, if you really always have both values and you sometimes need all the users in a group:
hk: group_id
sk: user_id
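To make the comparison concrete, here is a minimal sketch of the key each option would pass to a single-item read. The helper names, the attribute name user_group_id, and the "#" delimiter are assumptions for illustration and do not come from the question. In both cases the read names exactly one complete primary key.

```javascript
// Hypothetical helpers: build the Key for a single-item read (e.g. GetItem)
// against the user_group_data table from the question.

// Option 1: composite primary key (partition key + sort key).
function compositeKey(userId, groupId) {
  return { user_id: userId, group_id: groupId };
}

// Option 2: simple primary key holding both values in one attribute.
// The attribute name "user_group_id" and the "#" delimiter are assumptions;
// a delimiter keeps ("ab", "c") distinct from ("a", "bc").
function simpleKey(userId, groupId) {
  return { user_group_id: `${userId}#${groupId}` };
}

const option1 = { TableName: 'user_group_data', Key: compositeKey('u1', 'g7') };
const option2 = { TableName: 'user_group_data', Key: simpleKey('u1', 'g7') };
console.log(option1, option2);
```

One practical difference remains: with Option 1 you can also query all groups for a given user_id, whereas Option 2 only supports lookups where both values are known.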

Related

DynamoDB NOT EQUALS on GSI sort key

As the title suggests, I'm in a situation where I need to fetch all records from a DynamoDB table GSI, given that I know the hash key and I know the sort key value that I want to avoid.
The table looks like this:
Id - Primary Key,
AId - GSI hash key,
BId - GSI sort key
I need an efficient query to get records by a query like this
AId = 1 and BId != 2.
DynamoDB doesn't support the <> operator in key conditions on hash and sort keys; it is only available in filter expressions, but those are not allowed on any of the primary key fields either.
So what would be the solution here? Scanning is probably not a good idea, unless it would be possible to scan on a partition, but that doesn't seem to be supported either.
So the only solution that is obvious to me at this point is querying by the partition key and then filtering it out client side.
Assuming that your sort key is actually numeric, as shown in your example, your best option would be to issue two separate queries:
AId = 1 and BId < 2
AId = 1 and BId > 2
Actually, as I write this, I think it would work regardless of the type of the sort key.
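The two-query approach could be sketched as follows. This only builds the Query inputs, in the plain-value style used by the DocumentClient. The table name "MyTable" and index name "AId-BId-index" are hypothetical, since the question gives neither.

```javascript
// Builds the two Query inputs that together cover "AId = 1 and BId != 2".
// Table and index names passed in below are hypothetical placeholders.
function notEqualQueries(tableName, indexName, aId, excludedBId) {
  const base = {
    TableName: tableName,
    IndexName: indexName,
    ExpressionAttributeNames: { '#a': 'AId', '#b': 'BId' },
  };
  // One query for sort keys below the excluded value, one for those above it.
  return ['<', '>'].map((op) => ({
    ...base,
    KeyConditionExpression: `#a = :a and #b ${op} :b`,
    ExpressionAttributeValues: { ':a': aId, ':b': excludedBId },
  }));
}

const [below, above] = notEqualQueries('MyTable', 'AId-BId-index', 1, 2);
console.log(below.KeyConditionExpression);
console.log(above.KeyConditionExpression);
```

Each query would be run (and paginated via LastEvaluatedKey) separately, and the two result sets concatenated on the client.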

One to Many Schema in DynamoDB

I am currently designing DynamoDB schemas for the following use case:
A company can have many channels, and each channel has multiple channel cards.
I am thinking of having the following tables:
Channel Table:
Partition Key: CompanyId
Sort Key: Combination of Time Stamp and deleted or not
After getting the channels for a company, I need to fetch their channel cards. For this, I am thinking of the following table schema for ChannelCard.
ChannelCard Table:
Partition Key: channelId
Sort Key: Combination of Time Stamp and deleted or not
Now, to get the channel cards for a company, I need to do the following:
1. Query the channels for the company using the partition key (1 query)
2. Get the channel cards for each channel (one query per channel)
So in this case we will be making many queries. Is there a way to make fewer queries?
Any suggestions for modifying the database tables or about how to query the database are welcome.
You could also have
Channel Table
Partition Key: CompanyId
Sort Key: Deleted+timestamp
Channel Card Table
Partition Key: CompanyId
Sort Key: Deleted+ChannelCardTimeStamp
GSI #1:
Partition Key: ChannelId
Sort Key: Deleted+ChannelCardTimeStamp
This way you can use one query to get the most recent channel cards for any given company, and you can also query the GSI for the most recent channel cards for any channel.
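A rough sketch of those two access patterns follows, under several assumptions that are not in the answer: the table is named "ChannelCard", the GSI is named "ChannelId-index", and the sort key attribute "DeletedTimestamp" stores values such as "0#2024-01-31T12:00:00Z", where "0" means not deleted.

```javascript
// Assumed names: table "ChannelCard", GSI "ChannelId-index", sort key
// attribute "DeletedTimestamp" with values like "0#<ISO 8601 timestamp>"
// ("0" = not deleted, "1" = deleted). None of these come from the answer.
function latestCards(keyName, keyValue, indexName) {
  const params = {
    TableName: 'ChannelCard',
    KeyConditionExpression: '#pk = :pk and begins_with(#sk, :notDeleted)',
    ExpressionAttributeNames: { '#pk': keyName, '#sk': 'DeletedTimestamp' },
    ExpressionAttributeValues: { ':pk': keyValue, ':notDeleted': '0#' },
    ScanIndexForward: false, // descending sort key order, i.e. newest first
    Limit: 20, // arbitrary page size for the example
  };
  if (indexName) params.IndexName = indexName; // query the GSI instead of the base table
  return params;
}

const byCompany = latestCards('CompanyId', 'company-1');
const byChannel = latestCards('ChannelId', 'channel-9', 'ChannelId-index');
console.log(byCompany, byChannel);
```

Putting the deleted flag first in the sort key is what lets begins_with skip deleted cards, while the timestamp that follows keeps the remaining cards in time order.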

Querying with multiple local Secondary Index Dynamodb

My table has a primary partition key, a primary sort key, and 2 LSIs:
Org-ID - primary partition Key
ClientID- primary sort Key
Gender - LSI
Section - LSI
I have no issue querying the table with one LSI, but how do I specify 2 LSIs in a query?
var params = {
  TableName: "MyTable",
  IndexNames: ['ClientID-Gender-index', 'ClientID-Section-index'],
  KeyConditionExpression: '#Key1 = :Value1 and #Key2 = :Value2 and #Key3 = :Value3',
  ExpressionAttributeNames: {
    "#Key1": "Org-ID",
    "#Key2": "Gender",
    "#Key3": "Section"
  },
  ExpressionAttributeValues: {
    ':Value1': "Microsoft",
    ':Value2': "Male",
    ':Value3': "Cloud Computing"
  }
};
Can anyone fix the issue in IndexNames (line 3) or KeyConditionExpression (line 4)? I'm not sure which one is wrong.
Issue
Condition can be of length 1 or 2 only
You can only query a single DynamoDB index at a time. You cannot use multiple indexes in the same query.
A simple alternative is to use a single index and apply a query filter, but this will potentially require a lot of records to be scanned and the filter only reduces the amount of data transferred over the network.
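As a hedged sketch of the single-index-plus-filter alternative, the query from the question could be reshaped like this. It reuses the index name 'ClientID-Gender-index' from the question and assumes that index has Org-ID as partition key and Gender as sort key, with Section projected into it.

```javascript
// Query one LSI by key, then filter on the remaining attribute.
// Assumes 'ClientID-Gender-index' is keyed on (Org-ID, Gender) and that
// Section is projected into the index; neither is confirmed by the question.
const params = {
  TableName: 'MyTable',
  IndexName: 'ClientID-Gender-index', // a single index name, not an array
  KeyConditionExpression: '#org = :org and #gender = :gender',
  FilterExpression: '#section = :section', // applied after items are read
  ExpressionAttributeNames: {
    '#org': 'Org-ID',
    '#gender': 'Gender',
    '#section': 'Section',
  },
  ExpressionAttributeValues: {
    ':org': 'Microsoft',
    ':gender': 'Male',
    ':section': 'Cloud Computing',
  },
};
console.log(params);
```

The key condition now has two clauses (partition key and sort key), which stays within the "length 1 or 2" limit from the error message.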
A more advanced alternative is to make a compound key. You would most likely want to use a GSI, rather than an LSI, for this use case. By adding a single new attribute that is the string concatenation of Key1, Key2, and Key3, you can use this GSI to search all three keys at the same time. This will make each individual record bigger by repeating data, but it allows for a more complex query pattern.
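The compound-key idea might look roughly like this. The attribute name "OrgGenderSection", the GSI name "OrgGenderSection-index", and the "#" delimiter are all assumptions made for the example.

```javascript
// Hypothetical compound attribute used as the partition key of a GSI.
const DELIM = '#'; // assumed delimiter; pick one that cannot occur in the values

function compoundKey(orgId, gender, section) {
  return [orgId, gender, section].join(DELIM);
}

// On write: store the concatenated value alongside the original attributes.
function withCompoundKey(item) {
  return {
    ...item,
    OrgGenderSection: compoundKey(item['Org-ID'], item.Gender, item.Section),
  };
}

// On read: one equality condition on the GSI partition key covers all three.
function queryByAllThree(orgId, gender, section) {
  return {
    TableName: 'MyTable',
    IndexName: 'OrgGenderSection-index', // hypothetical GSI name
    KeyConditionExpression: '#ck = :ck',
    ExpressionAttributeNames: { '#ck': 'OrgGenderSection' },
    ExpressionAttributeValues: { ':ck': compoundKey(orgId, gender, section) },
  };
}

const item = withCompoundKey({
  'Org-ID': 'Microsoft', ClientID: 'c-1', Gender: 'Male', Section: 'Cloud Computing',
});
const query = queryByAllThree('Microsoft', 'Male', 'Cloud Computing');
console.log(item.OrgGenderSection, query);
```

The trade-off is that every write has to keep the compound attribute in sync with the three source attributes.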

DynamoDB : Is it possible to query by hash and range without creating two tables?

If I want to create a DynamoDB table with ItemId and BatchId and I want to be able to query by ItemId and BatchId do I have to create two tables:
Table1: Hash-ItemId Range-BatchId
Table2: Hash-BatchId Range-ItemId
Or is there a way to use secondary indexes to avoid duplication?
How about a global secondary index on Table1 with BatchId as the hash key?
Hey #Nickolay, I saw your comments below. The range key of the base table CAN be used as the hash key of a GSI.
To prove that, I created a table like this:
Base table: HashKey: hash + RangeKey: range
GSI table: HashKey: range + RangeKey: hash
I then inserted some keys and queried both the base table and the GSI.
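For the ItemId/BatchId case from the question, a table definition with such a swapped-key GSI could be sketched as below. The index name, the string attribute types, and the on-demand billing mode are assumptions chosen for the example.

```javascript
// CreateTable input: base table keyed on (ItemId, BatchId), plus a GSI
// keyed on (BatchId, ItemId). Index name, "S" types, and billing mode
// are assumptions, not details from the question.
const createTableParams = {
  TableName: 'Table1',
  AttributeDefinitions: [
    { AttributeName: 'ItemId', AttributeType: 'S' },
    { AttributeName: 'BatchId', AttributeType: 'S' },
  ],
  KeySchema: [
    { AttributeName: 'ItemId', KeyType: 'HASH' },
    { AttributeName: 'BatchId', KeyType: 'RANGE' },
  ],
  GlobalSecondaryIndexes: [
    {
      IndexName: 'BatchId-ItemId-index', // hypothetical name
      KeySchema: [
        { AttributeName: 'BatchId', KeyType: 'HASH' },
        { AttributeName: 'ItemId', KeyType: 'RANGE' },
      ],
      Projection: { ProjectionType: 'ALL' },
    },
  ],
  BillingMode: 'PAY_PER_REQUEST',
};
console.log(JSON.stringify(createTableParams, null, 2));
```

Queries by BatchId would then set IndexName to the GSI, while queries by ItemId go to the base table, so no second table is needed.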

Sqlite, is Primary Key important if I don't need auto-increment?

I only use an integer primary key ID for its "auto-increment" function.
What if I don't need auto-increment? Do I still need a primary key if I don't care about the uniqueness of records?
Example: Let's compare this table:
create table if not exists `table1`
(
    name text primary key,
    tel text,
    address text
);
with this:
create table if not exists `table2`
(
    name text,
    tel text,
    address text
);
table1 declares a primary key and table2 doesn't. Is there any downside to table2?
I don't need the record to be unique.
SQLite is a relational database system. So it's all about relations. You build relations between tables on keys.
You can have tables without a primary key; it is not necessary for a table to have a primary key. But you will almost always want a primary key to show what makes a record unique in that table and to build relations.
In your example, what would it mean to have two identical records? They would mean the same person, no? Then how would you count how many persons named Anna are in the database? If you count five, how many of them are unique, how many are mere duplicates? Such queries can be done properly, but get overly complicated because of the lacking primary key. And how would you build relations, say the cars a person drives? You would have a car table and then how to link it to the persons table in your example?
There are cases when you want a table without a primary key. These are usually log tables and the like. They are rare. Whenever you are creating a table without a primary key, ask yourself why this is the case. Maybe you are about to build something messy ;-)
You get auto-incrementing primary keys only when a column is declared as INTEGER PRIMARY KEY; other data types result in plain primary keys.
You are not required to declare a PRIMARY KEY.
But even if you do not do this, there will be some column(s) used to identify and look up records.
The PRIMARY KEY declaration helps to document this, enforces uniqueness, and optimizes lookups through the implicit index.
