DynamoDB Last Evaluated Key Expiration? - amazon-dynamodb

My application ingests data from a 3rd party REST API which is backed by DynamoDB. The results are paginated and thus I page forward by passing the last evaluated key to each subsequent request.
My question is does the last evaluated key have a shelf life? Does it ever expire?
Let's say I query the REST API and then decide to stop. If I save the last evaluated key, can I pick up exactly where I left off 30 days later? Would that last evaluated key still work and return the correct next page based on where I left off previously?

You shouldn't think of the last evaluated key like a "placeholder" or a "bookmark" in a result set from which to resume paused iteration.
You should think of it more like a "start from" place marker. An example might help. Let's say you have a table with a hash key userId and a range key timestamp. The range key timestamp will provide an ordering for your result set. Say your table looked like this:
userId | timestamp
1 | 123
1 | 124
1 | 125
1 | 126
Laid out this way, when you query the table for all of the records for userId 1, you'll get the records back in the order they're listed above, i.e. in ascending order by timestamp. If you wanted them back in descending order, you'd use DynamoDB's ScanIndexForward flag to order them "newest to oldest", i.e. in descending order by timestamp.
Now, suppose there were a lot more than 4 items in the table and it would take multiple queries to return all of the records with a userId of 1. You wouldn't want to keep getting pages and pages back, so you can tell DynamoDB where to start by giving it the last evaluated key. Say the last result of the previous query was the record with userId = 1 and timestamp = 124. You tell DynamoDB in your next query that that was the last record you got, and it will start your next result set with the record that has userId = 1 and timestamp = 125.
So the last evaluated key isn't something that "expires"; it's a way for you to communicate to DynamoDB which records you want it to return based on the records that you've already processed, displayed to the user, etc.
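The resume behavior can be sketched as a minimal in-memory simulation (the item data, page size, and function name are invented for illustration; a real client would pass the saved key as ExclusiveStartKey to Query):

```python
def query_page(items, user_id, exclusive_start_key=None, limit=2):
    """Simulate a DynamoDB Query over (userId, timestamp) items.

    Returns (page, last_evaluated_key). The key is just a pointer to
    the last item returned; nothing stateful, so nothing to expire.
    """
    matches = sorted(
        (i for i in items if i["userId"] == user_id),
        key=lambda i: i["timestamp"],
    )
    start = 0
    if exclusive_start_key is not None:
        # Resume strictly after the supplied key.
        start = next(
            n + 1 for n, i in enumerate(matches)
            if (i["userId"], i["timestamp"]) == exclusive_start_key
        )
    page = matches[start:start + limit]
    lek = None
    if start + limit < len(matches):
        lek = (page[-1]["userId"], page[-1]["timestamp"])
    return page, lek

items = [{"userId": 1, "timestamp": t} for t in (123, 124, 125, 126)]
page1, lek = query_page(items, 1)       # timestamps 123, 124
page2, _ = query_page(items, 1, lek)    # resumes at 125, even 30 days later
```

As long as the data hasn't changed underneath you, the resumed page starts exactly where the previous one left off; if items were inserted or deleted in the meantime, the key still works, but the "next page" reflects the table's current contents.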

Related

Dynamodb re-order row priority field values and also make them sequential

I have a table in DynamoDB whose records have a priority field, 1-N.
The records are shown to the user in a form, and the user can update the priority field, which means I need to change the priority of the record.
One solution is, when the priority of a record changes, to reorder all the records whose priority is greater than it.
For example, if I change the priority of a record from 5 to 10, I need to reorder all records whose priority field is greater than 5.
What do you recommend?
DynamoDB stores all items (records) in order of the table's sort attribute. However, you cannot update a key value; you would need to delete and re-add the item every time you update it.
One way to overcome this is to create a GSI, whose key attributes can be updated in place. Depending on the throughput required for your table, you may need to artificially shard the partition key; if you expect to consume less than 1000 WCU per second, you won't need to.
gsipk | gsisk | data
1 | 001 | data
1 | 002 | data
1 | 007 | data
1 | 009 | data
Now, to get all the data in order of priority, you simply Query your index where gsipk = 1.
You can also update the order attribute gsisk without having to delete and re-put the item.
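One way to make a priority change a single UpdateItem is to leave gaps in the gsisk values, as the 001/002/007/009 example hints. A hedged sketch of that idea (the helper name and in-memory representation are mine, not from the answer):

```python
def reprioritize(rows, item_id, after_id):
    """Move item_id so it sorts immediately after after_id by giving it
    a gsisk between its neighbours: one UpdateItem, no delete + put."""
    rows = sorted(rows, key=lambda r: r["gsisk"])
    idx = next(n for n, r in enumerate(rows) if r["id"] == after_id)
    lo = int(rows[idx]["gsisk"])
    hi = int(rows[idx + 1]["gsisk"]) if idx + 1 < len(rows) else lo + 10
    target = next(r for r in rows if r["id"] == item_id)
    target["gsisk"] = f"{(lo + hi) // 2:03d}"  # midpoint of the gap
    return sorted(rows, key=lambda r: r["gsisk"])

rows = [{"id": "a", "gsisk": "001"}, {"id": "b", "gsisk": "002"},
        {"id": "c", "gsisk": "007"}, {"id": "d", "gsisk": "009"}]
rows = reprioritize(rows, "a", "b")  # "a" gets gsisk "004"
```

If a gap ever runs out (the midpoint equals the lower bound), a real implementation would renumber that item collection, which is the one case where a batch of updates is unavoidable.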

To query Last 7 days data in DynamoDB

I have my dynamo db table as follows:
HashKey (date), RangeKey (timestamp)
The DB stores the data for each day (hash key) and timestamp (range key).
Now I want to query the data for the last 7 days.
Can I do this in one query, or do I need to call DynamoDB 7 times, once per day? The order of the data does not matter, so can someone suggest an efficient query for that?
I think you have a few options here.
BatchGetItem - The BatchGetItem operation returns the attributes of one or more items from one or more tables. You identify requested items by primary key. You could specify all 7 primary keys and fire off a single request. (Note that with a composite primary key you must specify the full key, partition and sort, for each requested item, so this only works if you know the exact keys in advance.)
7 calls to DynamoDB. Not ideal, but it'd get the job done.
Introduce a global secondary index that projects your data into the shape your application needs. For example, you could introduce an attribute that represents an entire week by using a truncated timestamp:
2021-02-08 (represents the week of 02/08/21T00:00:00 - 02/14/21T23:59:59)
2021-02-15 (represents the week of 02/15/21T00:00:00 - 02/21/21T23:59:59)
I call this a "truncated timestamp" because I am effectively ignoring the HH:MM:SS portion of the timestamp. When you create a new item in DDB, you could introduce a truncated timestamp that represents the week it was inserted. Therefore, all items inserted in the same week will show up in the same item collection in your GSI.
Depending on the volume of data you're dealing with, you might also consider separate tables to segregate ranges of data. AWS has an article describing this pattern.
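A minimal sketch of computing such a truncated timestamp in Python (the helper name week_start is an assumption; the answer's example weeks start on a Monday, so that is what is used here):

```python
from datetime import date, timedelta

def week_start(d: date) -> str:
    """Truncate a date to the Monday of its week. Every item created in
    the same week shares this value, so one GSI Query returns the week."""
    return (d - timedelta(days=d.weekday())).isoformat()

week_start(date(2021, 2, 10))  # "2021-02-08", same bucket as 2021-02-14
```

Store this value as an attribute at write time and make it the GSI partition key; "last 7 days" then needs at most two Queries (the current week and the previous one), filtered client-side to the exact window.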

Query DynamoDB with Partition key and Sort Key using OR Conditon

I have a requirement to query DynamoDB and get all the records that match a certain criterion. I have a table, say parent_child_table, which has parent_id and child_id as two columns; now I need to query the table with a particular input id and fetch all the matching records.
For example, if I query the db with id 67899, I should get both records, i.e. 12345 and 67899.
I was trying to use the method below:
GetItemRequest itemRequest = new GetItemRequest()
        .withTableName("PARENT_CHILD_TABLE")
        .withKey(partitionKey.entrySet().iterator().next(), sortKey.entrySet().iterator().next());
but I can't find an OR operator.
DynamoDB doesn't work like that...
GetItemRequest() can only return a single record.
Query() can return multiple records, but only if you are using a composite primary key (partition key + sort key) and you can only query within a single partition...so all the records to be returned must have the same partition key.
Scan() can return multiple records from any partition, but it does so by reading the entire table. Regular use of Scan is a bad idea.
Without knowing more it's hard to provide guidance, but consider a schema like so:
partition key | sort key
12345 | 12345
12345 | 12345#67899
12345 | 12345#67899#97765
Possibly adding some sort of level indicator in the sort key or just as an attribute.
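That schema can then be queried with a begins_with condition on the sort key. A small in-memory simulation (the data mirrors the table above; the function name is invented):

```python
def query_begins_with(items, pk, sk_prefix):
    """Simulate Query(KeyConditionExpression: pk = :pk AND
    begins_with(sk, :prefix)) over a list of (pk, sk) tuples."""
    return [
        (p, s) for p, s in items
        if p == pk and s.startswith(sk_prefix)
    ]

items = [
    ("12345", "12345"),
    ("12345", "12345#67899"),
    ("12345", "12345#67899#97765"),
]
# All records at or below 67899 under parent 12345:
subtree = query_begins_with(items, "12345", "12345#67899")
```

Because every related record shares the parent's partition key, one Query with a sort-key prefix returns the whole subtree; no OR condition is needed.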

Sqlite order of query

I'm running this query on a SQLite db and it looks like it's working fine.
SELECT batterij ,timestamp FROM temphobbykamer WHERE nodeid= 113 AND timestamp >= 1527889336634 AND timestamp <= 1530481336634 AND ROWID % 20 =0
But can I be sure that the query is handled in the correct order?
It must find all records from node 113 between time A and B, and from that selection I only want every 20th record.
I can imagine that if the query ordered the operations differently, taking every 20th record between time A and B and then selecting the node 113 records from that, the response would be different.
When no ORDER BY is specified, the order of the returned rows is undefined. Note that the WHERE clause (including ROWID % 20 = 0) is applied to every row regardless of ordering, so the set of rows returned is the same either way; only their order can vary. In practice SQLite will typically return rows in ROWID order since you haven't specified anything else, but to be sure you get consistent results, you should specify ORDER BY ROWID.
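A runnable sketch against an in-memory SQLite database, mirroring the question's table and adding the suggested ORDER BY ROWID (the column values are invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE temphobbykamer (nodeid INTEGER, timestamp INTEGER, batterij REAL)"
)
# 100 readings for node 113; ROWID runs 1..100 in insertion order.
conn.executemany(
    "INSERT INTO temphobbykamer VALUES (113, ?, 3.0)",
    [(1527889336634 + i,) for i in range(100)],
)
rows = conn.execute(
    "SELECT batterij, timestamp FROM temphobbykamer "
    "WHERE nodeid = 113 AND timestamp >= 1527889336634 "
    "AND timestamp <= 1530481336634 AND rowid % 20 = 0 "
    "ORDER BY rowid"
).fetchall()
# Matches ROWIDs 20, 40, 60, 80, 100: every 20th inserted row.
```

Note that "every 20th record" here means every 20th ROWID, which only lines up with every 20th node-113 reading because this table contains nothing else; with mixed nodes the % 20 filter would not do what the question intends.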

Data Modeling with NoSQL (DynamoDB)

Coming from a SQL background, I understand the high-level concepts of NoSQL but I'm still having trouble trying to translate some basic usage scenarios. I am hoping someone can help.
My application simply records a location, a timestamp, and a temperature for every second of the day. So we end up having 3 basic columns:
1) location
2) timestamp
3) temperature
(All fields are numbers, and I'm storing the timestamp as an epoch for easy range querying.)
I set up DynamoDB with the location as the partition key, the timestamp as the sort key, and temp as an attribute. This results in a composite key on location and timestamp, which allows each location to have its own set of timestamps but doesn't allow any individual location to have more than one item with the same timestamp.
Now comes the real-world queries:
Query each site for a time range (Works fine)
Query for any particular time-range return all temps for all locations (won't work)
So how would you account for the 2nd scenario? This is where I get hung up... Is this where we get into secondary indexes and things like that? For those of you smarter than me, how would you deal with this?
Thanks in advance for your help!
-D
You can't query a range of values across the whole table in DynamoDB; you can only query a range of values (range keys) that belong to a certain value (hash key).
It doesn't matter whether this is the table key, a local secondary index key, or a global secondary index key (secondary indexes just give you additional query options).
Let's get back to your scenario:
If the timestamp is in seconds and you want to get all records between 2 timestamps, then you can add another field, min_timestamp.
This field can be your global secondary hash key, and timestamp will be your global secondary range key.
Now you can get all records that were logged in a certain minute.
If you want a range of minutes, then you need to perform X queries (where X is the number of minutes in the range).
You could also add another field, hour_timestamp (a hash key that groups all records in a certain hour), and so on. But this approach is dangerous: you would be writing many records with the same hash key at the same point in time, and you can get many throughput (hot partition) errors.
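The minute truncation described above can be sketched as follows (the function names follow the answer's field names; everything else is illustrative):

```python
def min_timestamp(epoch_seconds: int) -> int:
    """Truncate an epoch to the start of its minute; used as the GSI
    hash key so one Query returns every reading in that minute."""
    return epoch_seconds - (epoch_seconds % 60)

def hour_timestamp(epoch_seconds: int) -> int:
    """The same idea at hour granularity."""
    return epoch_seconds - (epoch_seconds % 3600)
```

Each item would carry these values as plain number attributes written at ingest time; the coarser the bucket, the fewer Queries a time-range scan needs, but the hotter each GSI partition becomes.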
