How to get attribute from a list of partition keys in DynamoDB - is scan my only option? - amazon-dynamodb

I've got a list of partition keys from one table.
userId["123","456","235"]
I need to get an attribute that they all share. like "username".
What would be the best practice to get them all at once?
Is scan my only option knowing that I know all my partition keys?
Do I know the sort key? yes but only the beginning of it. Therefore I
don't think I could use batchGetItem.

Scan is only appropriate if you don't know the partition keys. Because you know the partition keys you want to search, you can achieve the desired behavior with multiple Query operations.
A Query searches all documents with the specified partition key; you can only query one partition key per request, so you'll need multiple queries, but this will still be significantly more efficient than a single Scan operation.
If you're only looking for documents with a sort key that begins with something, you can include it in your KeyConditionExpression along with the partition key.
For example, if you wanted to only return documents whose sort key begins with a certain string, you could pass something like userId = :user_id AND begins_with(#SortKey, :str) as the key condition expression.

You can efficiently achieve the result by using PartQL SELECT statement. It allows to query array of partition keys with IN operator and apply additional conditions on other attributes without causing a full table scan.
To ensure that a SELECT statement does not result in a full table
scan, the WHERE clause condition must specify a partition key. Use the
equality or IN operator.

Related

Mapping a dynamodb query result

I have a table with a composite key; there is both a partition and a sort key. I know that the java sdk allows me to query by just the partition key. However, if I do this then the docs say I will get this iterator back ItemCollection<QueryOutcome>. This means for me to work with this data, I will have to iterate over the entire collection in order to fulfill my needs.
It would be easier if I was able to get back a Map<T, V> type where the key here would be the sort key. That way, I can quickly find rows for a particular sort key. Is this possible? I would rather not iterate over the collection just to find certain items with a certain sort key value.
If you just want an item with a certain sort key, that’s a get item. Don’t do a Query.
You may be confused by DynamoDB’s use of the word Query. That’s not the only way to query the database. It’s one way to query which happens to have the name Query.

Fetch last item of the aws dynamodb table

So I wanted to fetch the last item/row of my dynamodb table but i am not finding resources. My primary key is id having series of incremented numbers such as 1,2,3... for each row respectively.
This is my function.
async function readMessage(){
const params = {
TableName: table,
};
return dynamo.getItem(params).promise();
}
I am not sure as to what i should be adding in my params.
DynamoDB has two types of primary keys:
Partition key – A simple primary key, composed of one attribute known as the partition key.
Partition key and sort key – Referred to as a composite primary key, this type of key is composed of two attributes. The first attribute is the partition key, and the second attribute is the sort key.
When fetching an item by partition key, you need to specify the exact partition key. You cannot fetch the max/min partition key.
Instead, you may want to create a sort key with a timestamp (or the ID if it's a sequential number) and use the sort key to fetch the last item.
Check out the AWS docs on Choosing the Right Partition Key for more info.
The proper way to design a table in DynamoDB is based on its expected access patterns; if this is something you need perhaps you should consider using this id as Sort Key instead of Primary Key and then query the table in descending order while also limiting the amount of items to 1.
If instead you don't want to change the schema of your items and you don't care about making at least two operations to do this you have two, not optimal options:
If none of your items ever gets deleted, just make a count first and use that information to know what's the latest item that was written.
Alternatively, if you could consider keeping a "special" record in your DynamoDB table that is basically a count that gets always incremented/written when one of your "other" items gets written. Upon retrieval you first retrieve the value of this special record and use this info to retrieve the actual one.
The combination of the partition key and sort key, makes the primary key of your item in the dynamoDB, so their combination must be unique, otherwise the item will be overwritten.
In almost all my use-cases, I select the primary key as an object attribute, like the brand, an email or a class and then, for the sort key I select the TimeStamp. So in this way, you always know the partition key, we need it to retrieve the values and then you can query your dynamoDB by making some filters by the sort key. For more extensive examples using Python, check the AWS page: https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/GettingStarted.Python.04.html, where it shows, how you can query your DynamoDB items.
There is also other ways to define the keys in your Dynamo and for that I advise you to check https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/bp-sort-keys.html

DynamoDB how to search for a list of values

I have a DynamoDB instance with a partition key and sort key. Let's say that they are organisation (hash key) and employee id (sort key).
I want to retrieve all employees who's ids are in a list. They all work for the same organisation but they are not all of the employees of that organisation.
In SQL I'd do something like:
select * from table where organisation_id = 'org' and employee_id in [list of ids]
There does not seem to be an equivalent in DynamoDB.
My choices seem to be:
1) Iterate over all employee IDs using a Query OR
2) Use BatchGetItems and provide organisation_id:employee_id for all items
The first seems like it will be slower as it involves multiple requests while the second is a single request but may consume more RCUs.
Which of these is preferred solution to this problem? Or am I missing a better third way?
I would iterate your list using GetItem, adding each employee found to a collection. This approach isn't slow - DynamoDB is designed specifically for getting lots of items fast using their keys.
There is no need to use Query as you have both the partition key and range key. You would only use a Query if say you wanted all employees of one organisation.
If your list is particularly large you could use BatchGetItem, which will create multiple parallel threads and therefore reduce latency. You won't find much a difference though unless you have a lot of items to get.
By the way, DynamoDB does have an 'IN' operator but your can't use it on KeyConditions.

How to fetch multiple rows from DynamoDB using a non primary key

select * from tableName where columnName="value";
How can I fetch a similar result in DynamoDB using java, without using primary key as my attribute (Need to group data based on a value for a particular column).
I have gone through articles regarding getbatchitems, QuerySpec but all these require me to pass the primary key.
Can someone give a lead here?
Short answer is you can't. Whenever you use the Query or GetItem operations in DynamoDB you must always supply the table or index primary key.
You have two options:
Perform a Scan operation on the table and filter by columnName="value". However this requires DynamoDB to look at every item in the table so it is likely to be slow and expensive.
Add a Global Secondary Index to your table. This will require you to define a primary key for the index that contains the columnName you want to query

Does dynamodb support something like an "in" clause in its queries?

Say I have table of photos and users.
Given I have a list of users I'm following [user1,user2,...] and I want to get a list of photos of people I'm following.
How can I query the table of photos where photo.createdBy in [user1,user2,user3...]
I saw that dynamodb has a batch operation, but that takes a primary key, and in this case we would be querying against a secondary index (createdBy).
Is there a way to do a query like this in dynamodb?
If you are querying purely on photo.createdBy, then you should create a global secondary index:
To speed up queries on non-key attributes, you can create a global secondary index. A global secondary index contains a selection of attributes from the table, but they are organized by a primary key that is different from that of the table. The index key does not need to have any of the key attributes from the table; it doesn't even need to have the same key schema as a table.
This will, of course, only retrieve one item. To limit results when returning more items, use a FilterExpression:
With a Query or a Scan operation, you can provide an optional filter expression to refine the results returned to you. A filter expression lets you apply conditions to the data after it is queried or scanned, but before it is returned to you. Only the items that meet your conditions are returned.
This can be applied to a Filter or Scan, but be careful of using too many Read Capacity Units when scanning for matching entries.

Resources