Using "OR" and selection by another table model in DynamoDB - amazon-dynamodb

I will explain my question with a concrete example. I have a single DynamoDB table (for this example). The table consists of two models:
- user: {
firstname
lastname
placeId
typeId
}
// List of favourites for each users
- userFavourites {
userId
favouriteId
favouriteType
}
I would like to efficiently find users by the following rule:
placeId = 'XXX' OR typeId = 'YYY' OR the user has any favourite with favouriteId: 'ZZZ' and favouriteType: "Dog" OR the user has any favourite with favouriteType: "Cat"
I'm using onetable to communicate with DynamoDB: https://doc.onetable.io/start/quick-tour/
Is it possible to do this kind of selection in DynamoDB (with multiple ORs and selection by items from another model in the same table), all together in one rule?

To be efficient with your reads you must do a GetItem or Query, which means you have to provide the partition key for the item. That in turn means you cannot do an OR with the native APIs.
You can however do an OR using PartiQL ExecuteStatement where you can say:
SELECT * FROM MYTABLE WHERE PARTITIONKEY IN [1,2,3]
Again this is only useful when it's the partition key.
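As a minimal sketch of that PartiQL call from Python, assuming a boto3 DynamoDB client (boto3.client("dynamodb")) is passed in and that the partition key is a string attribute: the helper name get_items_by_partition_keys and the table and key names are placeholders, not part of the original answer.

```python
def get_items_by_partition_keys(client, table_name, key_name, key_values):
    """Fetch items whose partition key equals any of key_values via PartiQL.

    `client` is assumed to be a boto3 DynamoDB client, e.g. boto3.client("dynamodb").
    """
    if not key_values:
        return []

    # One "?" placeholder per key value: ... IN [?, ?, ?]
    placeholders = ", ".join("?" for _ in key_values)
    statement = f'SELECT * FROM "{table_name}" WHERE "{key_name}" IN [{placeholders}]'
    # Assumption: the partition key is of type String ("S").
    parameters = [{"S": value} for value in key_values]

    items = []
    kwargs = {"Statement": statement, "Parameters": parameters}
    while True:
        response = client.execute_statement(**kwargs)
        items.extend(response.get("Items", []))
        next_token = response.get("NextToken")
        if not next_token:
            return items
        # Results can be paginated; keep going until no token is returned.
        kwargs["NextToken"] = next_token
```

Because every value in the IN list is a partition key, DynamoDB can resolve the statement with key lookups rather than a Scan.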
If you are looking for an OR across multiple different attributes, then I suggest using a more suitable database with more flexible query capability, because doing so with DynamoDB would result in a full table Scan each time you need a single row/item.

Related

How to future-proof a DynamoDB table design against these possible requirement changes (swapping primary key columns)?

I have the following data structure
item_id String
version String
_id String
data String
_id is simply a UUID to identify the item. There is no need to search for a row by this field yet.
As of now, item_id, an id generated by an external system, is the primary key, i.e. given the item_id, I want to be able to retrieve version, _id and data from the DynamoDB table.
item_id -> (version, _id, data)
Therefore I am setting item_id as the partition key.
I have two questions for future-proofing (evolution of) the above "schema":
In the future, if I want to incorporate version (the version number of the item) into the primary key, can I just modify the table and add it to the partition key?
If I also want to make the data searchable by _id, is it feasible to modify the table to assign _id as the partition key (it is a unique value because it is a UUID) and reassign item_id to be a search key?
I want to avoid creating a new DynamoDB table and migrating data to create new key structures, because it may lead to downtime.
You cannot update primary keys in DynamoDB. From the docs:
You cannot use UpdateItem to update any primary key attributes. Instead, you will need to delete the item, and then use PutItem to create a new item with new attributes.
If you wanted to make data searchable by _id, you could introduce a secondary index with the _id field as the partition key of the index.
For example, let's say your data looked like this:
If you defined a secondary index on _id, the index would look like this (same data as the previous example, just a different logical view):
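As a rough sketch of that "different logical view" idea in plain Python: the two items and all of their values below are made up for illustration, and the dictionaries only mimic how a table and an index are keyed. It assumes _id is unique, as the question states.

```python
# Hypothetical items; every attribute value here is invented for illustration.
base_table = [
    {"item_id": "item-1", "version": "1", "_id": "uuid-a", "data": "foo"},
    {"item_id": "item-2", "version": "1", "_id": "uuid-b", "data": "bar"},
]


def logical_view(items, partition_key):
    """Key the same items by a different attribute, as an index would."""
    return {item[partition_key]: item for item in items}


by_item_id = logical_view(base_table, "item_id")  # the table's own key
by_uuid = logical_view(base_table, "_id")         # the secondary index view

# The same item is reachable through either key.
print(by_item_id["item-1"]["data"])
print(by_uuid["uuid-b"]["item_id"])
```

The point is only that the index holds the same data under another key; DynamoDB maintains the real index for you as items are written.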
DynamoDB doesn't currently have any native versioning functionality, so you'll have to incorporate that into your data model. Fortunately, there's lots of discussion about this use case on the web. AWS has a document of DynamoDB "Best Practices", including an example of versioning.

How to use the Partition Key in Cosmos DB via the SDK or via a SELECT query

Consider my sample JSON below.
{
"servletname": "cofaxEmail",
"servlet-class": "org.cofax.cds.EmailServlet",
"init-param": {
"mailHost": "mail1",
"mailHostOverride": "mail2"
}
}
I have chosen "servletname" as my partition key, since I will receive it in every request, and with a few thousand server names it looks like the best PK.
My question is how to make the partition key work for me.
Do I have to specify the partition key option separately, like below
ItemResponse<ServerDto> ServerDtoResponse = await this.container.ReadItemAsync<ServerDto>(bocServerDto.mailHost, new PartitionKey(bocServerDto.servletname));
or
include the partition key in the select query itself, without adding a separate new PartitionKey(), like
select * from r where r.servletname='cofaxEmail' and r.mailHost='mail1';
The crux of the question is: is passing the partition key value in the WHERE condition of the select query enough to utilize the partition key feature?
Thanks
For any CRUD operation you would pass in the value for the partition key. For example, on a point read:
ItemResponse<ServerDto> ServerDtoResponse = await this.container.ReadItemAsync<ServerDto>(bocServerDto.mailHost, new PartitionKey("cofaxEmail"));
For a query, you can either pass it in the queryRequest options or use it in the query as the first filter predicate. Here is an example of using the queryRequest options.
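A minimal sketch of the request-options route, written against the azure-cosmos Python SDK rather than the .NET SDK used in the question. It assumes container is a ContainerProxy obtained elsewhere, and it keeps the r.mailHost path from the question's own query; the function name query_servlet is hypothetical.

```python
def query_servlet(container, servletname, mail_host):
    """Run a query scoped to one logical partition.

    `container` is assumed to be an azure.cosmos ContainerProxy. The partition
    key is supplied as a request option, so the query text only needs the
    remaining filter.
    """
    # Property path taken from the question's query; adjust to your document shape.
    query = "SELECT * FROM r WHERE r.mailHost = @mailHost"
    parameters = [{"name": "@mailHost", "value": mail_host}]
    results = container.query_items(
        query=query,
        parameters=parameters,
        partition_key=servletname,  # routes the query to a single partition
    )
    return list(results)
```

Called as query_servlet(container, "cofaxEmail", "mail1"), this should avoid a cross-partition fan-out because the partition is fixed up front.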
thanks.

Modeling an invite schema with embedded collections with DynamoDB or DocumentDB

I'm investigating whether to use AWS DynamoDB, Azure DocumentDB or Google Cloud for my app, based on price and simplicity, and am wondering what the best approach is for a typical invite schema.
An invite has
userId : key (who created the invite)
gameId : key
invitationList : collection of userIds
The queries I would be running are
Get invites where userId == me
Get invites where my userId is in the invitationList
In Mongo, I would just set an index on the embedded invitationList, and in SQL I would set up a join table of gameId and invited UserIds.
Using DynamoDB or DocumentDB, could I do this in one "table", or would I have to set up a second denormalized table that has one invited userId per row with a set of invited gameIds?
e.g.
A secondary table with
InvitedUserId : key
GameIds : Collection
Similar to hslriksen's answer, if certain criteria are met, I recommend that you denormalize all of this into a single document. Those criteria are:
The invitationList for games cannot grow unbounded.
Even if it's bounded, a maximum-length array must still fit within the document and transaction size limits.
However, different from hslriksen, I recommend that an example document look like this:
{
gameId: <some game key>,
userId: <some user id>,
invitationList: [<user id 1>, <user id 2>, ...]
}
You might also decide to use the built-in id field for games in which case the name above is wrong.
The key difference between what I propose and hslriksen's answer is that the invitationList is a pure array of foreign keys. This will allow indexes to be used for an ARRAY_CONTAINS clause in your query.
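The two queries from the question could then be sketched as follows, again assuming the azure-cosmos Python SDK and a ContainerProxy passed in as container. The function names are hypothetical, and enable_cross_partition_query is set only because the partitioning scheme is not specified here.

```python
def invites_created_by(container, user_id):
    """Invites where userId == me."""
    return list(container.query_items(
        query="SELECT * FROM c WHERE c.userId = @userId",
        parameters=[{"name": "@userId", "value": user_id}],
        enable_cross_partition_query=True,
    ))


def invites_containing(container, user_id):
    """Invites where my userId appears in the invitationList array."""
    return list(container.query_items(
        query="SELECT * FROM c WHERE ARRAY_CONTAINS(c.invitationList, @userId)",
        parameters=[{"name": "@userId", "value": user_id}],
        enable_cross_partition_query=True,
    ))
```

Both functions read from the same collection, which is what lets a single denormalized document type serve both access patterns.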
Note, in DocumentDB, you would tend to store all entity types in the same big bucket and just distinguish them with a string type field or slightly better, an is_my_type boolean field.
For DocumentDB you could probably just keep this in one document per inviting user, where the document Id could equal the key of the inviting user. If you have many games, you could use gameId as partitionKey.
{
"id" : "gameKey+invitingUserKey",
"gameKey" : "someGameKey",
"invitingUserId": "key",
"invites": ["inviteKey1", "inviteKey2"]
}
This is based on a limited number of invites for a user/gameKey. It is however hard to determine the structure without knowing your query patterns. I find that the query patterns often dictate the document structure.

DynamoDB data model secondary index search

Folks,
Given we have to store the following shopping cart data:
userID1 ['itemID1','itemID2','itemID3']
userID2 ['itemID3','itemID2','itemID7']
userID3 ['itemID3','itemID2','itemID1']
We need to run the following queries:
Give me all items (which is a list) for a specific user (easy).
Give me all users which have itemID3 (precisely my question).
How would you model this in DynamoDB?
Option 1, only have the Hash key? ie
HashKey(users) cartItems
userID1 ['itemID1','itemID2','itemID3']
userID2 ['itemID3','itemID2','itemID7']
userID3 ['itemID3','itemID2','itemID1']
Option 2, Hash and Range keys?
HashKey(users) RangeKey(cartItems)
userID1 ['itemID1','itemID2','itemID3']
userID2 ['itemID3','itemID2','itemID7']
userID3 ['itemID3','itemID2','itemID1']
But it seems that range keys can only be strings, numbers, or binary...
Should this be solved by having 2 tables? How would you model them?
Thanks!
Rule 1: The range keys in a DynamoDB table must be scalar, which is why the type must be string, number, or binary. You cannot use a boolean, list, set, or map type.
Rule 2: You cannot (currently) create a secondary index off of a nested attribute; see the Improving Data Access with Secondary Indexes in DynamoDB documentation. Index keys must be top-level scalar attributes, so you cannot index the individual entries inside cartItems. You may need another table for this.
So, the simple answer to your question is another question: how do you use your data?
If you query the users with input item (say itemID3 in your case) infrequently, perhaps a Scan operation with filter expression will work just fine. To model your data, you may use the user id as the HASH key and cartItems as the string set (SS type) attribute. For queries, you need to provide a filter expression for the Scan operation like this:
contains(cartItems, :expectedItem)
and, provide the value itemID3 for the placeholder :expectedItem in parameter valueMap.
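A minimal sketch of that Scan with boto3, assuming table is a boto3 Table resource (for example boto3.resource("dynamodb").Table(...)) and that the hash key attribute is named userId; both the attribute name and the function name are assumptions.

```python
def users_with_item(table, item_id):
    """Scan the whole table and keep users whose cartItems set contains item_id.

    `table` is assumed to be a boto3 DynamoDB Table resource.
    """
    scan_kwargs = {
        "FilterExpression": "contains(cartItems, :expectedItem)",
        "ExpressionAttributeValues": {":expectedItem": item_id},
    }
    users = []
    while True:
        page = table.scan(**scan_kwargs)
        # "userId" is an assumed name for the hash key attribute.
        users.extend(item["userId"] for item in page.get("Items", []))
        last_key = page.get("LastEvaluatedKey")
        if last_key is None:
            return users
        # A Scan is paginated; continue from where the last page stopped.
        scan_kwargs["ExclusiveStartKey"] = last_key
```

Note that the filter is applied after items are read, so the Scan still consumes read capacity for every item in the table.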
If you run many queries like this frequently, perhaps you can create another table taking the item id as the HASH key, and set of users having that item as the string set attribute. In this case, the 2nd query in your question turns out to be the 1st query in the other table.
Be aware that you need to maintain the data in two tables for each CRUD action, which may be trivial with DynamoDB Streams.
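The second table is essentially an inverted view of the first. As a plain-Python sketch using the cart data from the question (the in-memory dictionaries stand in for the two tables; nothing here calls DynamoDB):

```python
from collections import defaultdict

# Cart data from the question: user id -> set of item ids.
carts = {
    "userID1": {"itemID1", "itemID2", "itemID3"},
    "userID2": {"itemID3", "itemID2", "itemID7"},
    "userID3": {"itemID3", "itemID2", "itemID1"},
}


def invert(carts_by_user):
    """Build the second table: item id -> set of users holding that item."""
    users_by_item = defaultdict(set)
    for user_id, items in carts_by_user.items():
        for item_id in items:
            users_by_item[item_id].add(user_id)
    return dict(users_by_item)


users_by_item = invert(carts)
print(sorted(users_by_item["itemID3"]))  # ['userID1', 'userID2', 'userID3']
print(sorted(users_by_item["itemID7"]))  # ['userID2']
```

With this shape, "all users which have itemID3" becomes a single key lookup, at the cost of updating both structures on every cart change.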

Entity Framework - Use join table to return collections of Followers / Following

Using Entity Framework, I am writing a social networking app and trying to setup some relationships so that I can have Users with Followers/Following properties.
I have a Users table:
Users
_______
Id
FirstName
LastName
Email
Then I have a Follows table:
Follows
__________
FollowerId
FolloweeId
On a strongly typed User object, I want to see .Followers and .Following properties that return collections of User objects. So far I've tried making the columns in the Follows table as foreign keys, and both as composite primary keys. The entity model comes out the way I want it (I don't have an Entity in-between, and I just have to rename the navigation properties), but when I populate with data, the Followers and Following collections are empty, so something is not right with the relationships.
I suppose I could separate and have two tables, Followers and Following, but I would have duplicate data and would have to add to two tables when someone follows someone else.
In your query, you need to include the data for the tables that your Users table has relationships with.
A query like this:
using (EntityObject context = new EntityObject())
{
    // Eagerly load the related collections; use your own navigation property names here.
    var user = from x in context.users.Include("Followers").Include("Followees")
               where x.Id == TheUserId
               select x;
}
Will let you have access to the objects for those included tables.
