DynamoDB 'order by' based on a hash and other column criteria - amazon-dynamodb

I'm using DynamoDB. I was reading the documentation for Class QueryRequest, and it says literally:
You can narrow the scope of the query by using comparison operators on
the range key value, or on the index key. You can use the
ScanIndexForward parameter to get results in forward or reverse order,
by range key or by index key.
But I need to know whether it is possible to sort my data by another attribute (other than the hash or range key).
Thanks in advance.

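Query only sorts by the range key of the table or index it runs against, so to sort by another attribute, that attribute has to be the range key of a Local Secondary Index or a Global Secondary Index. You can only Query a Table of type Hash and Range, or one of those indexes. Local Secondary Indexes must be defined when the Table is created; Global Secondary Indexes can also be added to an existing Table. If the index does not exist, you cannot query on it.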
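As a rough sketch of what such a query looks like (Python, building the parameters for a boto3 client.query call): the table `Orders`, the index `byCreatedAt`, and the attribute `customerId` are made-up names, and the hash key is assumed to be a string. The order comes from the range key of the index named in IndexName, and ScanIndexForward flips it.

```python
def build_sorted_query(table, index, hash_name, hash_value, newest_first=True):
    """Build keyword arguments for a boto3 DynamoDB client.query call.

    Results come back ordered by the range key of `index`; Query offers
    no other ordering.
    """
    return {
        "TableName": table,
        "IndexName": index,  # the index whose range key defines the order
        "KeyConditionExpression": "#h = :h",
        "ExpressionAttributeNames": {"#h": hash_name},
        "ExpressionAttributeValues": {":h": {"S": hash_value}},  # assumes a string key
        "ScanIndexForward": not newest_first,  # False means descending order
    }


# Hypothetical names, for illustration only.
params = build_sorted_query("Orders", "byCreatedAt", "customerId", "c-42")

# With boto3 installed and AWS credentials configured, this could be sent as:
#   import boto3
#   items = boto3.client("dynamodb").query(**params)["Items"]
```

If the attribute you want to sort by is not the range key of any index, the remaining option is to sort the returned items in application code.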

Related

How to get attribute from a list of partition keys in DynamoDB - is scan my only option?

I've got a list of partition keys from one table.
userId["123","456","235"]
I need to get an attribute that they all share, like "username".
What would be the best practice to get them all at once?
Is scan my only option knowing that I know all my partition keys?
Do I know the sort key? yes but only the beginning of it. Therefore I
don't think I could use batchGetItem.
Scan is only appropriate if you don't know the partition keys. Because you know the partition keys you want to search, you can achieve the desired behavior with multiple Query operations.
A Query searches all documents with the specified partition key; you can only query one partition key per request, so you'll need multiple queries, but this will still be significantly more efficient than a single Scan operation.
If you're only looking for documents with a sort key that begins with something, you can include it in your KeyConditionExpression along with the partition key.
For example, if you wanted to only return documents whose sort key begins with a certain string, you could pass something like userId = :user_id AND begins_with(#SortKey, :str) as the key condition expression.
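A minimal sketch of the one-Query-per-partition-key approach, in Python. It assumes a boto3-style client, string keys, and a sort key attribute literally named `sortKey`; the table name and prefix in the usage comment are also made up.

```python
def query_usernames(client, table, user_ids, sort_prefix):
    """Run one Query per partition key and collect the `username` attribute.

    `client` is anything with a boto3-style query(**kwargs) method.
    """
    usernames = {}
    for user_id in user_ids:
        kwargs = {
            "TableName": table,
            "KeyConditionExpression": "userId = :user_id AND begins_with(#SortKey, :str)",
            "ExpressionAttributeNames": {
                "#SortKey": "sortKey",  # hypothetical sort key attribute name
                "#u": "username",
            },
            "ExpressionAttributeValues": {
                ":user_id": {"S": user_id},
                ":str": {"S": sort_prefix},
            },
            "ProjectionExpression": "#u",  # only return the attribute we need
        }
        while True:
            page = client.query(**kwargs)
            for item in page.get("Items", []):
                usernames[user_id] = item["username"]["S"]
            if "LastEvaluatedKey" not in page:
                break  # no more pages for this partition key
            kwargs["ExclusiveStartKey"] = page["LastEvaluatedKey"]
    return usernames


# Usage, assuming boto3 is installed and credentials are configured:
#   import boto3
#   query_usernames(boto3.client("dynamodb"), "Users", ["123", "456", "235"], "profile#")
```

The queries are independent of each other, so they could also be issued concurrently if latency matters.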
You can also achieve the result efficiently with a PartiQL SELECT statement. It lets you match a list of partition keys with the IN operator and apply additional conditions on other attributes without causing a full table scan.
To ensure that a SELECT statement does not result in a full table
scan, the WHERE clause condition must specify a partition key. Use the
equality or IN operator.
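A small sketch of building such a statement in Python, for use with the boto3 client's execute_statement call. The table name `Users` is an assumption, the keys are assumed to be strings, and using `?` placeholders inside the IN list is my reading of the PartiQL reference, so it is worth checking there before relying on it.

```python
def build_select_in(table, attribute, key_name, key_values):
    """Build a parameterized PartiQL SELECT with an IN list on the partition key."""
    if not key_values:
        raise ValueError("key_values must not be empty")
    placeholders = ", ".join("?" for _ in key_values)
    statement = f'SELECT {attribute} FROM "{table}" WHERE {key_name} IN [{placeholders}]'
    parameters = [{"S": value} for value in key_values]  # assumes string keys
    return {"Statement": statement, "Parameters": parameters}


# "Users" is a hypothetical table name.
request = build_select_in("Users", "username", "userId", ["123", "456", "235"])
print(request["Statement"])
# prints: SELECT username FROM "Users" WHERE userId IN [?, ?, ?]

# With boto3 installed and AWS credentials configured, this could be sent as:
#   import boto3
#   items = boto3.client("dynamodb").execute_statement(**request)["Items"]
```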

How to fetch multiple rows from DynamoDB using a non primary key

select * from tableName where columnName="value";
How can I fetch a similar result in DynamoDB using Java, without using the primary key as my attribute? (I need to group data based on the value of a particular column.)
I have gone through articles on BatchGetItem and QuerySpec, but they all require me to pass the primary key.
Can someone give a lead here?
Short answer is you can't. Whenever you use the Query or GetItem operations in DynamoDB you must always supply the table or index primary key.
You have two options:
Perform a Scan operation on the table and filter by columnName="value". However, this requires DynamoDB to read every item in the table, so it is likely to be slow and expensive.
Add a Global Secondary Index to your table. This requires you to define a key schema for the index whose partition key is the columnName you want to query.
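The second option can be sketched as two requests, shown here in Python as parameter builders for boto3's update_table and query calls. The index name is made up and the attribute is assumed to be a string; the request fields mirror the DynamoDB API itself, so they should carry over to the Java SDK.

```python
def build_create_gsi(table, index, attribute):
    """Keyword arguments for client.update_table that add a GSI keyed on `attribute`."""
    return {
        "TableName": table,
        "AttributeDefinitions": [{"AttributeName": attribute, "AttributeType": "S"}],
        "GlobalSecondaryIndexUpdates": [{
            "Create": {
                "IndexName": index,
                "KeySchema": [{"AttributeName": attribute, "KeyType": "HASH"}],
                "Projection": {"ProjectionType": "ALL"},
                # On a provisioned-capacity table, a ProvisionedThroughput
                # entry is also required here.
            }
        }],
    }


def build_index_query(table, index, attribute, value):
    """Keyword arguments for client.query: roughly WHERE attribute = value."""
    return {
        "TableName": table,
        "IndexName": index,
        "KeyConditionExpression": "#c = :v",
        "ExpressionAttributeNames": {"#c": attribute},
        "ExpressionAttributeValues": {":v": {"S": value}},  # assumes a string attribute
    }


# Usage, assuming boto3 is installed and credentials are configured
# ("columnName-index" is a hypothetical index name):
#   import boto3
#   client = boto3.client("dynamodb")
#   client.update_table(**build_create_gsi("tableName", "columnName-index", "columnName"))
#   ...wait for the index to become ACTIVE, then:
#   client.query(**build_index_query("tableName", "columnName-index", "columnName", "value"))
```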

DynamoDB GSI BatchGetItem

Is it possible to retrieve rows from a DynamoDB global secondary index using the BatchGetItem API? My aim is to retrieve data from the main table based on some non-key attribute as well, but in batches of 100 items. Does a GSI not fit here?
Also, is there a BatchGetItem equivalent for Query? Say a table has a partition key and a sort key, and the same partition key can have multiple sort keys. Can I retrieve items for multiple partition keys using BatchGetItem with just the partition key, or does it not fit here?
There is no way to specify the index name in the BatchGetItem API operation according to the docs. That means using BatchGetItem (and GetItem for that matter) on a secondary index isn't possible. Both of these operate on the primary index.
If you want to retrieve data from a secondary index, you need to use Query or Scan. Both support the IndexName attribute according to the documentation. When using Query you have to specify the partition key and can optionally filter based on the sort key. If you don't filter on the sort key, you will get all items with the partition key, which should take care of your second requirement.
To retrieve data from a secondary index based on different partition keys, you'd need to issue multiple Query operations, one for each value of these keys; there is no batching here.
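If the goal is still to read from the main table in batches of 100, one possible pattern is to collect full primary keys first (for instance from Query calls against the index) and then feed them to BatchGetItem. A minimal Python sketch, assuming a boto3-style client and a single table; the retry loop has no backoff, which a real implementation would want.

```python
def chunked(keys, size=100):
    """Yield successive slices of at most `size` keys (BatchGetItem accepts up to 100)."""
    for start in range(0, len(keys), size):
        yield keys[start:start + size]


def batch_get_all(client, table, keys):
    """Fetch items by full primary key, 100 at a time, retrying UnprocessedKeys.

    Each entry in `keys` must contain the complete primary key
    (partition key, plus sort key if the table has one).
    """
    items = []
    for chunk in chunked(keys):
        request = {table: {"Keys": chunk}}
        while request:
            response = client.batch_get_item(RequestItems=request)
            items.extend(response.get("Responses", {}).get(table, []))
            # Anything DynamoDB could not serve comes back in UnprocessedKeys,
            # in the same shape as RequestItems. No backoff in this sketch.
            request = response.get("UnprocessedKeys") or {}
    return items


# Usage, assuming boto3 is installed and credentials are configured
# ("Orders", "OrderID" and "LineNo" are hypothetical names):
#   import boto3
#   keys = [{"OrderID": {"N": "100"}, "LineNo": {"N": "1"}}]
#   batch_get_all(boto3.client("dynamodb"), "Orders", keys)
```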
You can use PartiQL with WHERE IN clause for that:
SELECT * FROM Orders WHERE OrderID IN [100, 300, 234]
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/ql-reference.select.html

limit offset, sorting and aggregation challenges in DynamoDB

I am using DynamoDB to store my device events (in JSON format) in a table for further analysis, and I use the Scan API to display the result set in a UI, which requires the following:
A limit and offset for the records, say 10 records per page, so that the result set is paginated (e.g. page 1 has records 1-10, page 2 has records 11-20, and so on). I found scanRequest.withLimit(10), but its limit does not mean the same thing as limit/offset. Does the DynamoDB API support limit and offset?
Sorting the result set by user-selected fields such as Date or Serial Number, but I have not found any sorting/order-by API.
Aggregation, e.g. by Device Name or Date, which also does not seem to be available in DynamoDB.
This situation has led me to consider other NoSQL databases. Please advise on the issues above.
The right way to think about DynamoDB is as a key-value store with support for indexes.
"Amazon DynamoDB supports key-value data structures. Each item (row) is a key-value pair where the primary key is the only required attribute for items in a table and uniquely identifies each item. DynamoDB is schema-less. Each item can have any number of attributes (columns). In addition to querying the primary key, you can query non-primary key attributes using Global Secondary Indexes and Local Secondary Indexes."
https://aws.amazon.com/dynamodb/details/
A table can have 2 types of keys:
Hash Type Primary Key—The primary key is made of one attribute, a
hash attribute. DynamoDB builds an unordered hash index on this
primary key attribute. Each item in the table is uniquely identified
by its hash key value.
Hash and Range Type Primary Key—The primary
key is made of two attributes. The first attribute is the hash
attribute and the second one is the range attribute. DynamoDB builds
an unordered hash index on the hash primary key attribute, and a
sorted range index on the range primary key attribute. Each item in
the table is uniquely identified by the combination of its hash and
range key values. It is possible for two items to have the same hash
key value, but those two items must have different range key values.
What kind of primary key have you set up for your Device Events table? I would suggest that you denormalize your data (i.e. pull specific attributes out of the json) and build additional indexes on those attributes that you want to sort and aggregate on: Date, Serial Number, etc. If I know what kind of primary key you have set up on your table, I can point you in the right direction to build these indices so that you can get what you need via the query method. The scan method will be inefficient for you because it reads every row in the table.
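Lastly, with regard to your "limit offset" question, I think that you're looking for ExclusiveStartKey. DynamoDB returns a LastEvaluatedKey in the response to your query, and you pass that value as the ExclusiveStartKey of the next request to continue from where the previous page ended.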
The ExclusiveStartKey is what will help you do pagination. DynamoDB returns a LastEvaluatedKey whenever it stops before the end of the result set, which happens when it reaches your Limit or when it has processed 1 MB of data; if the response contains no LastEvaluatedKey, there are no more results. You are not obliged to take the start key from LastEvaluatedKey, though: for a query or scan on the base table, the primary key of the last item you displayed works just as well as an ExclusiveStartKey, so it can serve as an offset.
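A minimal sketch of page-by-page reading in Python, assuming a boto3-style client; the table name in the usage comment is made up. The start key can be the LastEvaluatedKey of the previous response or the key of the last item already shown.

```python
def fetch_page(client, table, page_size, start_key=None):
    """Fetch one page; return (items, key_for_next_page or None)."""
    kwargs = {"TableName": table, "Limit": page_size}
    if start_key is not None:
        kwargs["ExclusiveStartKey"] = start_key  # resume right after this key
    response = client.scan(**kwargs)
    return response.get("Items", []), response.get("LastEvaluatedKey")


# Usage, assuming boto3 is installed and credentials are configured
# ("DeviceEvents" is a hypothetical table name):
#   import boto3
#   client = boto3.client("dynamodb")
#   page1, next_key = fetch_page(client, "DeviceEvents", 10)
#   page2, next_key = fetch_page(client, "DeviceEvents", 10, next_key)
```

Note that this is a cursor, not a numeric offset: there is no way to jump straight to page 7 without having the start key for it, so a UI usually walks pages in order or remembers the keys it has already seen. The same loop works with client.query in place of client.scan.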

Order of DynamoDB multiple conditions query

I am studying DynamoDB and am confused about the ordering of results.
a. Could I use multiple conditions in the KeyConditions field of query command to do the 'AND' query? i.e. Set condition to the following keys:
hash part of primary key,
range part of primary key,
local secondary index 1,
b. If it's workable, how would DynamoDB sort the result?
DynamoDB can only use one index at a time so you can't really query using both a range primary key AND a secondary index.
The sort will be based on the index actually used.
Conditions on non-key attributes only filter the results after they have been read; they are not limited to indexed attributes and do not affect the sort order.
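To make the ordering rule concrete, here is a toy in-memory model in plain Python. It is not DynamoDB code and all attribute names are invented: it selects by hash key, orders by the sort key of whichever index is queried, and only then applies the filter.

```python
def simulate_query(items, hash_name, hash_value, sort_name,
                   keep=lambda item: True, scan_index_forward=True):
    """Toy model of Query: select by hash key, order by the sort key of the
    index being queried, then apply the filter to what was read."""
    # Items that lack the index's sort key are not present in that index.
    matched = [i for i in items if i.get(hash_name) == hash_value and sort_name in i]
    matched.sort(key=lambda i: i[sort_name], reverse=not scan_index_forward)
    return [i for i in matched if keep(i)]


# Hypothetical data: "ts" plays the table's range key, "score" the LSI's range key.
events = [
    {"user": "u1", "ts": 3, "score": 10, "kind": "a"},
    {"user": "u1", "ts": 1, "score": 30, "kind": "b"},
    {"user": "u1", "ts": 2, "score": 20, "kind": "a"},
    {"user": "u2", "ts": 4, "score": 5, "kind": "a"},
]

# Querying the table: ordered by the table's range key.
by_table = simulate_query(events, "user", "u1", "ts")

# Querying the LSI with a filter on a non-key attribute: ordered by the
# LSI's range key, with non-matching items dropped afterwards.
by_lsi = simulate_query(events, "user", "u1", "score", keep=lambda i: i["kind"] == "a")
```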

Resources