DynamoDB OrderBy operation - amazon-dynamodb

Table : Customer
Item: CustomerId,PurchaseType,Name,mobilenumber,price, createdDate
DATA1: cus001,"online","BBBBB","yourmobilenumber",6000,"01/07/2017 01:12:05"
DATA2: cus002,"online","myname","mymobilenumber",500,"10/07/2017 01:12:01"
DATA3: cus003,"online","AAAAA","yourmobilenumber",6000,"10/07/2017 01:12:06"
DATA4: cus004,"online","yourname","yourmobilenumber",1000,"10/07/2017 02:12:06"
DATA5: cus005,"retail","yourname","yourmobilenumber",1000,"10/07/2017 03:12:06"
GSI: price-index[PurchaseType,price]
Query with index "price-index"
condition: PurchaseType = "online" and price > 500
ScanIndexForward: true
How can I get results that satisfy the following conditions?
PurchaseType = "online"
price > 500
order by Name

You need to create a different GSI:
PurchaseType - partition key of the GSI
Name - sort key of the GSI
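Then you can query that index for all items with the required purchase type; the results come back ordered by Name. Add a filter expression (price > 500) to drop the items whose price is too low. Note that the filter is applied after the items are read, so it does not reduce the read capacity consumed.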
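The query described above could be sketched as follows. This is a minimal sketch, not a definitive implementation: the index name "name-index" is hypothetical, and the function only builds the request parameters so that it can be run without AWS access.

```python
def build_order_by_name_query(purchase_type, min_price):
    """Build keyword arguments for a DynamoDB Query against a GSI.

    Assumes a GSI (hypothetically named "name-index") whose partition key
    is PurchaseType and whose sort key is Name.
    """
    return {
        "IndexName": "name-index",  # hypothetical index name
        # Equality on the GSI partition key is mandatory for a Query.
        "KeyConditionExpression": "#pt = :pt",
        # The filter runs after the key condition and only trims the result.
        "FilterExpression": "#price > :min_price",
        "ExpressionAttributeNames": {"#pt": "PurchaseType", "#price": "price"},
        "ExpressionAttributeValues": {":pt": purchase_type, ":min_price": min_price},
        # True returns items in ascending order of the sort key (Name).
        "ScanIndexForward": True,
    }


params = build_order_by_name_query("online", 500)
print(params)
```

With boto3 installed and credentials configured, the parameters could be passed as `boto3.resource("dynamodb").Table("Customer").query(**params)`.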

Related

How to create composite GSI PK in DynamoDB?

I have tried creating GSI with a PK that uses a composite value of business_id, type_id, partner_id fields. I did it in two different ways in the AWS console:
First: business_id#type_id#partner_id
Second: [business_id]#[type_id]#[partner_id]
and sort key: updated
Here is the query:
SELECT *
FROM "items"."composite_key-index"
WHERE business_id = 435634652 AND type_id = 2 AND partner_id = 69992528
ORDER BY updated ASC
In both cases it throws this error:
ValidationException: Must have at least one non-optional hash key
condition in WHERE clause when using ORDER BY clause.
And if I run it without the order by:
SELECT *
FROM "items"."composite_key-index"
WHERE business_id = 435634652 AND type_id = 2 AND partner_id = 69992528
it doesn't return any items, even though there is data matching those values.
What am I doing wrong here?
To use a composite value as a key, you have to build the value yourself; DynamoDB does not concatenate attributes for you.
Your application has to store the combined value in a single attribute, e.g. GSI_PK, as 435634652#2#69992528, and that attribute must be the partition key of the index.
Then your query would look like
SELECT *
FROM "items"."composite_key-index"
WHERE GSI_PK = '435634652#2#69992528'
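Building the composite value on the application side might look like the sketch below. The helper name `make_gsi_pk` is hypothetical; the attribute name GSI_PK and the `#` separator follow the answer above, and the ids are the ones from the question.

```python
def make_gsi_pk(business_id, type_id, partner_id, sep="#"):
    """Join the three ids into the single string stored in GSI_PK."""
    return sep.join(str(part) for part in (business_id, type_id, partner_id))


# The item must carry the composite value as a real attribute when written.
item = {"business_id": 435634652, "type_id": 2, "partner_id": 69992528}
item["GSI_PK"] = make_gsi_pk(item["business_id"], item["type_id"], item["partner_id"])
print(item["GSI_PK"])
```

The same helper should be used for both writes and queries, so the stored value and the queried value are always formatted identically.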

boto3 resource for querying dynamodb : Query condition missed key schema element

I have a table named AdvgCountries which has two columns:
a. CountryId (String), partition key
b. CountryName (String), sort key
When creating the table, I defined only the partition key, and later added a Global Secondary Index with the following settings:
CountryName-index
Type : GSI
Partition key : CountryId
Sort Key : CountryName
I am able to retrieve CountryName based upon CountryId but unable to retrieve CountryId based upon CountryName. Based upon my reading I found that there are options to do this by providing indexname but I get the following error:
botocore.exceptions.ClientError: An error occurred
(ValidationException) when calling the Query operation: Query
condition missed key schema element: CountryId
import boto3
import json
import os
from boto3.dynamodb.conditions import Key, Attr
def query_bycountryname(pCountryname, dynamodb=None):
    if not dynamodb:
        dynamodb = boto3.resource('dynamodb', endpoint_url="https://dynamodb.us-east-1.amazonaws.com")
    table = dynamodb.Table('AdvgCountires')
    print(f"table")
    attributes = table.query(
        IndexName="CountryName-index",
        KeyConditionExpression=Key('CountryName').eq(pCountryname),
    )
    if 'Items' in attributes and len(attributes['Items']) == 1:
        attributes = attributes['Items'][0]
    print(f"before return")
    return attributes
if __name__ == '__main__':
    CountryName = "India"
    print(f"Data for {CountryName}")
    countries = query_bycountryname(CountryName)
    for country in countries:
        print(country['CountryId'], ":", country['CountryName'])
Any help is appreciated.
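You cannot fetch the partition key value by querying on the sort key alone. DynamoDB does not work like this: a Query must supply an equality condition on the partition key of the table or index being queried, and the partition key of your index is CountryId.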
In Dynamodb, each item’s location is determined by the hash value of
its partition key.
The Query operation in Amazon DynamoDB finds items based on primary
key values.
KeyConditionExpression are used to write conditional statements by
using comparison operators that evaluate against a key and limit the
items returned. In other words, you can use special operators to
include, exclude, and match items by their sort key values.
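One way to look up CountryId by CountryName is a separate GSI whose partition key is CountryName. The sketch below assumes such an index exists under the hypothetical name "CountryNameOnly-index"; it only builds the query parameters and reads a response shaped like the one boto3 returns, so it runs without AWS access.

```python
def build_country_name_query(country_name):
    """Query parameters for a GSI whose partition key is CountryName."""
    return {
        "IndexName": "CountryNameOnly-index",  # hypothetical index name
        "KeyConditionExpression": "#cn = :cn",
        "ExpressionAttributeNames": {"#cn": "CountryName"},
        "ExpressionAttributeValues": {":cn": country_name},
    }


def country_ids(response):
    """Extract CountryId values from a Query response dictionary."""
    return [item["CountryId"] for item in response.get("Items", [])]


params = build_country_name_query("India")

# Stand-in for the dictionary a real Query call would return.
fake_response = {"Items": [{"CountryId": "IN", "CountryName": "India"}]}
print(country_ids(fake_response))
```

With boto3 available, the parameters could be passed as `boto3.resource("dynamodb").Table("AdvgCountries").query(**params)`.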

What's the equivalent DynamoDB solution for this MySQL Query?

I'm familiar with MySQL and am starting to use Amazon DynamoDB for a new project.
Assume I have a MySQL table like this:
CREATE TABLE foo (
id CHAR(64) NOT NULL,
scheduledDelivery DATETIME NOT NULL,
-- ...other columns...
PRIMARY KEY(id),
INDEX schedIndex (scheduledDelivery)
);
Note the secondary Index schedIndex which is supposed to speed-up the following query (which is executed periodically):
SELECT *
FROM foo
WHERE scheduledDelivery <= NOW()
ORDER BY scheduledDelivery ASC
LIMIT 100;
That is: Take the 100 oldest items that are due to be delivered.
With DynamoDB I can use the id column as primary partition key.
However, I don't understand how I can avoid full-table scans in DynamoDB. When adding a secondary index I must always specify a "partition key". However, (in MySQL words) I see these problems:
the scheduledDelivery column is not unique, so it can't be used as a partition key itself AFAIK
adding id as unique partition key and using scheduledDelivery as "sort key" sounds like an (id, scheduledDelivery) secondary index to me, which makes that index practically useless
I understand that MySQL and DynamoDB require different approaches, so what would be an appropriate solution in this case?
It's not possible to avoid a full table scan with this kind of query.
However, you may be able to disguise it as a Query operation, which would allow you to sort the results (not possible with a Scan).
You must first create a GSI. Let's name it scheduled_delivery-index.
We will specify our index's partition key to be an attribute named fixed_val, and our sort key to be scheduled_delivery.
fixed_val will contain any value you want, but it must always be that value, and you must know it from the client side. For the sake of this example, let's say that fixed_val will always be 1.
GSI keys do not have to be unique, so don't worry if there are two duplicated scheduled_delivery values.
You would query the table like this:
var now = Date.now();
//...
{
    TableName: "foo",
    IndexName: "scheduled_delivery-index",
    ExpressionAttributeNames: {
        // must match the name of the GSI partition key attribute
        "#f": "fixed_val",
        "#d": "scheduled_delivery"
    },
    ExpressionAttributeValues: {
        ":f": 1,
        ":d": now
    },
    KeyConditionExpression: "#f = :f and #d <= :d",
    ScanIndexForward: true,
    // mirrors LIMIT 100 in the MySQL query
    Limit: 100
}
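For the query above to find anything, every item has to be written with the two index attributes. A minimal sketch of that write-side step, assuming scheduled_delivery is stored as epoch milliseconds (to match `Date.now()`); the helper names are hypothetical:

```python
from datetime import datetime, timezone

FIXED_VAL = 1  # the constant partition key value shared by every item


def to_epoch_millis(moment):
    """Convert a timezone-aware datetime to epoch milliseconds."""
    return round(moment.timestamp() * 1000)


def with_index_keys(item, scheduled_at):
    """Return a copy of the item carrying the GSI key attributes."""
    tagged = dict(item)
    tagged["fixed_val"] = FIXED_VAL
    tagged["scheduled_delivery"] = to_epoch_millis(scheduled_at)
    return tagged


due = datetime(1970, 1, 1, 0, 0, 1, tzinfo=timezone.utc)
tagged_item = with_index_keys({"id": "abc"}, due)
print(tagged_item)
```

Items that lack either attribute are simply not present in the index, which is why both must be set on every write.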

Make own like system for various content

There are three types of content in my database: Songs, Albums and Playlists. Albums and Playlists are just collections of songs. I want to let users like each of them. I made a table with the columns
LikeId UserId SongId PlaylistId AlbumId
for storing likes. For example, if a user likes a song, I put the song's id into the SongId column and the user's id into the UserId column. The other columns will be null. It works well, but I don't like this solution because it's not normalized.
So I want to ask if there are better solutions for this.
You should just create 3 tables - one for User paired with each of Playlist, Song, and Album. They'd look something like:
CREATE TABLE PlaylistLikes
(
    UserID INT NOT NULL,
    PlaylistID INT NOT NULL,
    PRIMARY KEY (UserID, PlaylistID),
    FOREIGN KEY (UserID) REFERENCES Users (UserID),
    FOREIGN KEY (PlaylistID) REFERENCES Playlists (PlaylistID)
);
CREATE TABLE SongLikes
(
    UserID INT NOT NULL,
    SongID INT NOT NULL,
    PRIMARY KEY (UserID, SongID),
    FOREIGN KEY (UserID) REFERENCES Users (UserID),
    FOREIGN KEY (SongID) REFERENCES Songs (SongID)
);
CREATE TABLE AlbumLikes
(
    UserID INT NOT NULL,
    AlbumID INT NOT NULL,
    PRIMARY KEY (UserID, AlbumID),
    FOREIGN KEY (UserID) REFERENCES Users (UserID),
    FOREIGN KEY (AlbumID) REFERENCES Albums (AlbumID)
);
Here, having both columns in the primary key prevents the user from liking the song/playlist/album more than once (unless you want that to be available - then remove it or maybe keep track of that in a 'number of likes' column).
You should avoid putting all 3 different types of likes in the same table - different tables should be used to represent different things. You want to avoid "One True Lookup Table" - here's one answer detailing why: OTLT
If you want to query against all 3 tables, you can create a view which is the result of a UNION between the 3 tables.
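The effect of the composite primary key can be tried out with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for whatever database the question uses (an assumption), and the Users and Songs tables are reduced to a single id column for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE Users (UserID INTEGER PRIMARY KEY);
CREATE TABLE Songs (SongID INTEGER PRIMARY KEY);
CREATE TABLE SongLikes (
    UserID INT NOT NULL,
    SongID INT NOT NULL,
    PRIMARY KEY (UserID, SongID),
    FOREIGN KEY (UserID) REFERENCES Users (UserID),
    FOREIGN KEY (SongID) REFERENCES Songs (SongID)
);
INSERT INTO Users VALUES (1);
INSERT INTO Songs VALUES (10);
""")


def like_song(user_id, song_id):
    """Record a like; return False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO SongLikes (UserID, SongID) VALUES (?, ?)",
                (user_id, song_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False


first = like_song(1, 10)      # accepted
duplicate = like_song(1, 10)  # rejected by the composite primary key
unknown = like_song(1, 999)   # rejected by the foreign key to Songs
print(first, duplicate, unknown)
```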
How about
LikeId UserId LikeType TargetId
Where LikeType can be "Song", "Playlist" or "Album" ?
Your solution is fine. It has the nice feature that you can set up explicit foreign key relationships to the other tables. In addition, you can verify that exactly one of the values is set by adding a check constraint:
check ((case when SongId is null then 0 else 1 end) +
(case when AlbumId is null then 0 else 1 end) +
(case when PlayListId is null then 0 else 1 end)
) = 1
There is an overhead incurred, of storing NULL values for all three. This is fairly minimal for three values.
You can even add a computed column to get which value is stored:
WhichId = (case when SongId is not null then 'Song'
when AlbumId is not null then 'Album'
when PlayListId is not null then 'PlayList'
end);
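The check constraint from this answer can be exercised with a short sketch. It uses Python's built-in sqlite3 module as a stand-in database (an assumption; the question does not name one), and the foreign keys are left out to keep the example small.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE Likes (
    LikeId INTEGER PRIMARY KEY,
    UserId INT NOT NULL,
    SongId INT,
    PlaylistId INT,
    AlbumId INT,
    -- exactly one of the three target columns must be set
    CHECK ((CASE WHEN SongId IS NULL THEN 0 ELSE 1 END) +
           (CASE WHEN AlbumId IS NULL THEN 0 ELSE 1 END) +
           (CASE WHEN PlaylistId IS NULL THEN 0 ELSE 1 END) = 1)
)
""")


def try_like(user_id, song_id=None, playlist_id=None, album_id=None):
    """Insert a like; return False if the check constraint rejects it."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO Likes (UserId, SongId, PlaylistId, AlbumId) "
                "VALUES (?, ?, ?, ?)",
                (user_id, song_id, playlist_id, album_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False


one_target = try_like(1, song_id=5)               # accepted
no_target = try_like(1)                           # rejected: nothing set
two_targets = try_like(1, song_id=5, album_id=7)  # rejected: two set
print(one_target, no_target, two_targets)
```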
As a glutton for punishment, I would use three tables: UserLikesSongs, UserLikesPlaylists and UserLikesAlbums. Each contains a UserId and an appropriate reference to one of the other tables: Songs, Albums or Playlists.
This also allows adding additional type-specific information. Perhaps Albums will support a favorite track in the future.
You can always use UNION to combine data from the various entity types.

composite index: does the order of columns matter in sql server/linq to sql?

I'm in visual studio, looking to create a composite index on 2 columns for several tables. There are 2 columns: UserID is in all tables and acts as the foreign key; then, each table has its own key to refer to the parts of the object, such as phone, address... Like this:
TablePhones:
PhoneID | UserID | PhonePrefix | PhoneNumber | PhoneExtention
TableAddresses:
AddressID | UserID | AddressStreet1 | AddressStreet2 | AddressCity...
Note that users can have more than 1 address and more than 1 phone number.
I'm using linq to sql, and the where clauses of the queries that fetch the objects look like this:
read queries:
where x.UserID == TheUserID
update/delete queries:
where x.UserID == TheUserID && x.PhoneID == ThePhoneID
At the moment, the primary keys are on PhoneID and AddressID and I'm looking to create composite indexes on PhoneID/UserID and AddressID/UserID. Is the order of the columns in the database fine as it is, or should I move UserID to the first position in all tables?
Thanks for suggestions.
The order of columns in the table doesn't matter, at least for SQL Server. What matters is the order in which the columns are listed in the index: queries with conditions on the leading column(s) benefit greatly from the index.
If your primary key is clustered, you can create an index on UserID alone; there is no need for a composite index, because a nonclustered index already stores the clustered key for each row.
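The leading-column effect can be observed with a small experiment. The sketch below uses Python's built-in sqlite3 module rather than SQL Server (an assumption made so that it runs anywhere; the exact plan text differs between engines and versions), and the index name idx_demo is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Phones ("
    "PhoneID INTEGER PRIMARY KEY, UserID INT NOT NULL, "
    "PhonePrefix TEXT, PhoneNumber TEXT)"
)


def plan_for_user_lookup(index_columns):
    """Rebuild the index with the given column order and return the
    query plan text for a lookup by UserID."""
    conn.execute("DROP INDEX IF EXISTS idx_demo")
    conn.execute(f"CREATE INDEX idx_demo ON Phones ({index_columns})")
    rows = conn.execute(
        "EXPLAIN QUERY PLAN SELECT * FROM Phones WHERE UserID = ?", (1,)
    ).fetchall()
    # The last column of each plan row holds the human-readable detail.
    return " ".join(row[-1] for row in rows)


user_first = plan_for_user_lookup("UserID, PhoneID")
phone_first = plan_for_user_lookup("PhoneID, UserID")
print(user_first)
print(phone_first)
```

With UserID as the leading column the plan is an index search; with PhoneID leading, the same query falls back to scanning the table.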
