DynamoDB sub item filter using .Net Core API - amazon-dynamodb

I have a table structure like this:
Users:{
UserId
Name
Email
SubTable1:[{
Column-111
Column-112
},
{
Column-121
Column-122
}]
SubTable2:[{
Column-211
Column-212
},
{
Column-221
Column-222
}]
}
As I am new to DynamoDB, I have a couple of questions about this:
1. Can I create a structure like this?
2. Can we set a primary key for subtables?
3. Luckily, I found a DynamoDB helper class to do some operations on my DB.
https://www.gopiportal.in/2018/12/aws-dynamodb-helper-class-c-and-net-core.html
But I don't know how to fetch only a particular subtable.
4. Can we fetch only specific columns from my main table? I also need suggestions for subtables.
Note: I am using .NET Core (C#) to communicate with DynamoDB.

Can I create structure like this?
Yes
Can we set primary key for subtables?
No, a hash key can be set on top-level scalar attributes only (String, Number etc.).
Luckily, I found DynamoDB helper class to do some operations into my DB.
https://www.gopiportal.in/2018/12/aws-dynamodb-helper-class-c-and-net-core.html
But I don't know how to fetch only a particular subtable.
When you say subtables, I assume that you are referring to the List (array) data type in the sample table above. To fetch data from a DynamoDB table with the Query API, you need the hash key. If you don't have the hash key, you can use the Scan API, which reads the entire table and is therefore a costly operation.
A GSI (Global Secondary Index) can be created to avoid the scan. However, it can be created on scalar attributes only; a GSI can't be created on a List attribute.
The other option is to redesign the table to match your query access pattern.
Can we fetch only specific columns from my main table? Also need suggestion for subtables
Yes, you can fetch specific columns using ProjectionExpression. This way you get only the required attributes in the result set.
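As a rough illustration of ProjectionExpression, the sketch below builds the parameters for a low-level GetItem call as a plain dictionary. The parameter names (TableName, Key, ProjectionExpression, ExpressionAttributeNames) are the ones the DynamoDB API uses; the table name, the assumption that UserId is a string hash key, and the choice of attributes come from the question's sketch and are only illustrative. The .NET SDK request classes expose properties with the same names.

```python
def build_get_item_request(user_id):
    """Build GetItem parameters that return only selected attributes."""
    return {
        "TableName": "Users",  # table name taken from the question's sketch
        "Key": {"UserId": {"S": user_id}},  # assumes UserId is a String hash key
        # Name and Email are top-level attributes; SubTable1[0] addresses the
        # first element of the list, and .#c one attribute inside that element.
        "ProjectionExpression": "#n, Email, SubTable1[0].#c",
        "ExpressionAttributeNames": {
            "#n": "Name",        # NAME is a DynamoDB reserved word, so it is aliased
            "#c": "Column-111",  # hyphenated names cannot appear bare in expressions
        },
    }


request = build_get_item_request("user-1")
print(request["ProjectionExpression"])
```

With boto3, a dictionary like this could be passed as `client.get_item(**request)`. Note that the projection only trims what is returned; the whole item is still read and billed.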

Related

How to filter DynamoDb by object property value

I have a DynamoDB table:
How should I filter entries in the DB table to get all items where access.role = "ADMIN"?
You would be best served by setting up a Global Secondary Index (GSI). You set the Partition Key equal to that attribute, and the Sort Key equal to some other attribute that you can guarantee will be unique. (Index keys must be top-level scalar attributes, so a nested value such as access.role has to be stored as a top-level attribute first.) Then you use your SDK of choice or the Query option in the console, select the index, and query for partition_key = ADMIN.
However, be aware that an index is a replicated copy of the table's data. Dynamo is very good at this and relatively fast at doing so, but GSI updates are eventually consistent, so there is still the possibility that your index will briefly be out of sync with the actual data. If you are not making the call against the index very often, you are pretty much fine. If you are calling it very often, then you should restructure your table.
Dynamo is not SQL. When setting up a Dynamo schema you have to consider how you will access your data: your access patterns. You should design your data with your Partition Key as the data you will have when looking up (i.e. I will always have a user ID number) and your Sort Keys as the individual documents related to that PK (i.e. a user has a document that is his profile data, a document that is his profile picture URL, a document that is a list of his friends' user numbers, and so on).
Then you use indexes for things like your question that you won't be doing very often.
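Index keys have to be top-level scalar attributes, so one way to apply this advice to a nested value like access.role is to duplicate it at the top level when the item is written. The sketch below does that on a plain Python dictionary; the attribute name access_role is hypothetical and simply stands for whatever attribute the GSI would use as its partition key.

```python
import copy


def with_role_index_key(item):
    """Return a copy of the item with access.role duplicated at the top level."""
    result = copy.deepcopy(item)
    role = result.get("access", {}).get("role")
    if role is not None:
        # Hypothetical top-level attribute that a GSI could use as partition key.
        result["access_role"] = role
    return result


admin = with_role_index_key({"id": "1", "access": {"role": "ADMIN"}})
print(admin["access_role"])
```

Items that lack the attribute are simply left out of the index, so items without a role cost nothing there.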

How to fetch multiple rows from DynamoDB using a non primary key

select * from tableName where columnName="value";
How can I fetch a similar result in DynamoDB using Java, without using the primary key as my attribute? (I need to group data based on a value for a particular column.)
I have gone through articles about BatchGetItem and QuerySpec, but all of these require me to pass the primary key.
Can someone give a lead here?
The short answer is that you can't. Whenever you use the Query or GetItem operations in DynamoDB, you must always supply the table or index primary key.
You have two options:
1. Perform a Scan operation on the table and filter by columnName="value". However, this requires DynamoDB to look at every item in the table, so it is likely to be slow and expensive.
2. Add a Global Secondary Index to your table. This will require you to define a primary key for the index that contains the columnName you want to query.
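To make the first option concrete, here is a minimal sketch of a filtered Scan that follows pagination. The request dictionary uses the low-level API parameter names; `scan_page` is a stand-in for whatever function performs one Scan call (for example `client.scan` in boto3), and the fake implementation below exists only so the sketch runs without AWS. The value is assumed to be a String.

```python
def build_scan_request(table, column, value):
    """Scan parameters equivalent to: SELECT * FROM table WHERE column = value."""
    return {
        "TableName": table,
        "FilterExpression": "#col = :val",
        "ExpressionAttributeNames": {"#col": column},
        "ExpressionAttributeValues": {":val": {"S": value}},  # assumes a String
    }


def scan_all(scan_page, request):
    """Call scan_page repeatedly until no LastEvaluatedKey is returned."""
    items = []
    params = dict(request)
    while True:
        page = scan_page(**params)
        items.extend(page.get("Items", []))
        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            return items
        params["ExclusiveStartKey"] = last_key  # resume where the last page ended


# Fake two-page result, used only to exercise the loop locally.
_pages = iter([
    {"Items": [{"id": {"N": "1"}}], "LastEvaluatedKey": {"id": {"N": "1"}}},
    {"Items": [{"id": {"N": "2"}}]},
])
found = scan_all(lambda **kw: next(_pages),
                 build_scan_request("tableName", "columnName", "value"))
print(len(found))
```

The filter is applied after each page is read, which is why a Scan still consumes read capacity for every item in the table.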

DynamoDB - Global Secondary Index on set items

I have a dynamo table with the following attributes :
id (Number - primary key )
title (String)
created_at (Number - long)
tags (StringSet - contains a set of tags say android, ios, etc.,)
I want to be able to query by tags - get me all the items tagged android. How can I do that in DynamoDB? It appears that global secondary index can be built only on ScalarDataTypes (which is Number and String) and not on items inside a set.
If the approach I am taking is wrong, an alternative way for doing it either by creating different tables or changing the attributes is also fine.
DynamoDB is not designed to optimize indexing on set values. Below is a copy of Amazon's relevant documentation (from Improving Data Access with Secondary Indexes in DynamoDB).
The key schema for the index. Every attribute in the index key schema
must be a top-level attribute of type String, Number, or Binary.
Nested attributes and multi-valued sets are not allowed. Other
requirements for the key schema depend on the type of index: For a
global secondary index, the hash attribute can be any scalar table
attribute. A range attribute is optional, and it too can be any scalar
table attribute. For a local secondary index, the hash attribute must
be the same as the table's hash attribute, and the range attribute
must be a non-key table attribute.
Amazon recommends creating a separate one-to-many table for this kind of problem. More info here: Use one to many tables
This is a really old post, sorry to revive it, but I'd take a look at "Single Table Design"
Basically, stop thinking about your data as structured data - embrace denormalization
id (Number - primary key )
title (String)
created_at (Number - long)
tags (StringSet - contains a set of tags say android, ios, etc.,)
Instead of a nosql table with a "header" of this:
id|title|created_at|tags
think of it like this:
pk | sk     | data....
id | id     | {title, created_at}
id | id+tag | {id, tag}   <- create one record per tag
You can still return everything by querying for pk = id and sk begins with id, and then joining the tags to the id records in your app logic.
You can also use a GSI to project id|id+tag into tag|id. That will still require you to write two queries against your data to get the items for a given tag (get the ids, then get the items), but you won't have to duplicate your data, you won't have to scan, and you'll still be able to get your items in one query when your access pattern doesn't rely on tags.
FWIW, I'd start by thinking about all of your access patterns, and from there think about how you can structure composite keys and/or GSIs.
cheers
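The layout described in this answer can be modelled in memory to see how the records fit together. This is only a sketch: plain Python lists stand in for the table and for the tag-keyed GSI, and the `#` separator in the sort key is an assumption (the answer just writes id+tag).

```python
def item_records(item_id, title, created_at, tags):
    """Expand one logical item into a base record plus one record per tag."""
    base = {"pk": item_id, "sk": item_id, "title": title, "created_at": created_at}
    tag_rows = [
        {"pk": item_id, "sk": f"{item_id}#{tag}", "tag": tag}
        for tag in sorted(tags)
    ]
    return [base] + tag_rows


def query_item(records, item_id):
    """Stand-in for: Query pk = id AND begins_with(sk, id)."""
    return [r for r in records if r["pk"] == item_id and r["sk"].startswith(item_id)]


def ids_for_tag(records, tag):
    """Stand-in for a Query on a GSI keyed by tag; returns the matching item ids."""
    return sorted(r["pk"] for r in records if r.get("tag") == tag)


table = (item_records("1", "Post A", 100, {"android", "ios"})
         + item_records("2", "Post B", 200, {"android"}))
print(ids_for_tag(table, "android"))
```

Fetching the full items for a tag is then the two-step process the answer mentions: `ids_for_tag` first, followed by one `query_item` per id.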
You will need to create a separate table for this query.
If you are interested in fetching all items based on a tag then I suggest keeping a table with a primary key:
hash: tag
range: id
This way you can use a very simple Query to fetch all items by tag.
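A possible definition for such a table, again written as plain request dictionaries with the low-level API parameter names. The table name ItemTags and the on-demand billing mode are assumptions made for the example; the key schema (hash: tag, range: id) is the one suggested above.

```python
def build_tag_table_definition(table_name="ItemTags"):
    """CreateTable parameters for a table keyed by (tag, id)."""
    return {
        "TableName": table_name,  # hypothetical name
        "AttributeDefinitions": [
            {"AttributeName": "tag", "AttributeType": "S"},
            {"AttributeName": "id", "AttributeType": "N"},
        ],
        "KeySchema": [
            {"AttributeName": "tag", "KeyType": "HASH"},   # partition key
            {"AttributeName": "id", "KeyType": "RANGE"},   # sort key
        ],
        "BillingMode": "PAY_PER_REQUEST",  # assumption: on-demand capacity
    }


def build_items_by_tag_query(tag, table_name="ItemTags"):
    """Query parameters that return every id stored under one tag."""
    return {
        "TableName": table_name,
        "KeyConditionExpression": "#t = :t",
        "ExpressionAttributeNames": {"#t": "tag"},
        "ExpressionAttributeValues": {":t": {"S": tag}},
    }


print(build_items_by_tag_query("android")["KeyConditionExpression"])
```

The application has to write one row into this table for every tag on every item, and keep those rows in step when tags change.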

Query DynamoDB table using only two Global secondary indices in AWS sdk for .net

I have a DynamoDB table with two global secondary indexes, and I need to query this table based on both of those indexes at once, without using the hash key of the table. Is there any particular way to do this in the AWS SDK for .NET? It seems this is impossible in the high-level API.
e.g. the SQL equivalent query would be: SELECT * FROM TABLE WHERE FIRST_GLOBAL_SECONDARY_INDEX='x' AND SECOND_GLOBAL_SECONDARY_INDEX='y';
You can only query a single GSI at a time, not multiple. You would have to project the second attribute (I'm assuming that by the other GSI you mean another attribute) onto the first index. Which index you choose depends on your usage. You then have both attributes on a single index, which you can query with the hash key of the first attribute, using a FilterExpression for the second attribute.
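A sketch of the request this answer describes: the key condition targets the index's hash key and the filter handles the second attribute. The parameter names are the low-level Query API ones; the table, index and attribute names in the usage line are hypothetical, and both values are assumed to be Strings.

```python
def build_index_query_with_filter(table, index, key_attr, key_value,
                                  filter_attr, filter_value):
    """Query one GSI by its hash key and filter on a projected attribute."""
    return {
        "TableName": table,
        "IndexName": index,
        "KeyConditionExpression": "#k = :k",  # must use the index's hash key
        "FilterExpression": "#f = :f",        # applied after the items are read
        "ExpressionAttributeNames": {"#k": key_attr, "#f": filter_attr},
        "ExpressionAttributeValues": {
            ":k": {"S": key_value},
            ":f": {"S": filter_value},
        },
    }


# Hypothetical names, for illustration only.
request = build_index_query_with_filter(
    "MyTable", "first-index", "first_attr", "x", "second_attr", "y")
print(request["IndexName"])
```

Because the filter runs after the key condition, the query consumes read capacity for every item with first_attr = 'x', so it helps to key the index on the more selective attribute.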

DynamoDB Change Range Key Column

Is it possible to modify the range key column after table creation, such as adding a new column/attribute and assigning it as the range key for the table? I tried searching but couldn't find any articles about changing the range or hash key.
No, unfortunately it's not possible to change the hash key, range key, or indexes after a table is created in DynamoDB. The DynamoDB UpdateTable API documentation is clear about the fact that indexes cannot be modified. I can't find anywhere in the docs that explicitly states that the table keys cannot be modified, but at present they cannot be changed.
Note that DynamoDB is schema-less other than the hash and range key, and you can add other attributes to new items with no problems. Unfortunately, if you need to modify either your hash key or range key, you'll have to make a new table and migrate the data.
Edit (January 2014): DynamoDB now supports adding global secondary indexes on the fly.
To change or create an additional sort key, you will need to create a new table and migrate over to it, as neither action can be done on an existing table.
DynamoDB Streams enable us to migrate tables without any downtime. I've done this to great effect, and the steps I've followed are:
1. Create a new table (let us call this NewTable) with the desired key structure, LSIs, and GSIs.
2. Enable DynamoDB Streams on the original table.
3. Associate a Lambda with the stream that pushes each record into NewTable. (This Lambda should trim off the migration flag set in Step 5.)
4. [Optional] Create a GSI on the original table to speed up scanning items. Ensure this GSI has only these attributes: the primary key and Migrated (see Step 5).
5. Scan the GSI created in the previous step (or the entire table) with the following filter:
FilterExpression = "attribute_not_exists(Migrated)"
Update each item in the table with a migration flag (i.e. "Migrated": { "S": "0" }) using the UpdateItem API, to ensure no data loss occurs. Each update sends the item to DynamoDB Streams.
NOTE: You may want to increase write capacity units on the table during the updates.
6. The Lambda will pick up all items, trim off the Migrated flag, and push them into NewTable.
7. Once all items have been migrated, repoint the code to the new table.
8. Remove the original table and the Lambda function once you are happy all is good.
Following these steps should ensure you have no data loss and no downtime.
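The flag-and-copy flow in the steps above can be simulated in memory to check the logic before touching real tables. This is not the code from the linked blog post: dictionaries stand in for the two tables, a returned list stands in for the stream, and the key attribute name `id` is an assumption.

```python
def mark_unmigrated(table):
    """Step 5: flag every item that lacks Migrated; return the 'stream' images."""
    stream = []
    for item in table.values():
        if "Migrated" not in item:  # mirrors attribute_not_exists(Migrated)
            item["Migrated"] = "0"
            stream.append(dict(item))  # new image as the stream would carry it
    return stream


def handle_stream_record(new_image, new_table, key_attr="id"):
    """Lambda stand-in: drop the flag and write the record into the new table."""
    record = {k: v for k, v in new_image.items() if k != "Migrated"}
    new_table[record[key_attr]] = record


old_table = {"a": {"id": "a", "value": 1}, "b": {"id": "b", "value": 2}}
new_table = {}
for image in mark_unmigrated(old_table):
    handle_stream_record(image, new_table)
print(sorted(new_table))
```

Running `mark_unmigrated` a second time yields nothing, which is the property that lets the scan be retried safely if it is interrupted.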
I've documented this on my blog, with code to assist:
https://www.abhayachauhan.com/2018/01/dynamodb-changing-table-schema/
