How to update a DynamoDB table's column with a lowercase value - amazon-dynamodb

I went through some posts and learned that case-insensitive search is not possible in DynamoDB, so I am trying to update an existing DynamoDB table's column values to lowercase.
I searched for the syntax but haven't found any satisfactory result. In MySQL we achieve the same thing with:
SET name = LOWER(name)
Please help me write the same thing for DynamoDB.
I wrote this query:
aws dynamodb update-item --profile test --table-name test-event-tickets --key '{"university_id": {"S": "112"}}' --update-expression 'SET #nameAttribute = :inputname' --expression-attribute-names '{"#nameAttribute":"name"}' --expression-attribute-values '{":inputname":{"S":"george philips"}}'
but here I have hardcoded :inputname to "george philips". Instead of this, I want to read the column value and convert it to lowercase.

Unfortunately, there is no such syntax in DynamoDB. Although DynamoDB can do some in-place transformations to data, such as incrementing a counter, the syntax for this is very limited, and lowercasing a value is NOT one of the things you can do.
So you'll have to scan the entire table, read the old value of the attribute, calculate the lowercase version in your application, and write the value back. If your application is doing regular writes in parallel with this transformation, you'll need to be very careful not to overwrite data that is being modified concurrently. You can do this with a condition expression, but I think it will be easier if the new lowercase attribute has a different name from the old not-always-lowercase attribute, so that the transformation process writes the new attribute (using a ConditionExpression) only if it is not yet set.
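Here is a minimal boto3 sketch of that scan-and-rewrite loop in Python, assuming the test-event-tickets table from the question with hash key university_id, and a new attribute name_lower for the lowercase copy:

import boto3

table = boto3.resource("dynamodb").Table("test-event-tickets")

scan_kwargs = {
    "ProjectionExpression": "university_id, #n",
    "ExpressionAttributeNames": {"#n": "name"},  # "name" is a DynamoDB reserved word
}
start_key = None
while True:
    if start_key:
        scan_kwargs["ExclusiveStartKey"] = start_key
    page = table.scan(**scan_kwargs)
    for item in page["Items"]:
        try:
            table.update_item(
                Key={"university_id": item["university_id"]},
                UpdateExpression="SET name_lower = :lower",
                # only write if no parallel writer has set the new attribute yet
                ConditionExpression="attribute_not_exists(name_lower)",
                ExpressionAttributeValues={":lower": item["name"].lower()},
            )
        except table.meta.client.exceptions.ConditionalCheckFailedException:
            pass  # the lowercase attribute was already written; skip
    start_key = page.get("LastEvaluatedKey")
    if not start_key:
        break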

Related

AWS CLI, DynamoDB add attribute to entire table

I'm trying to add an attribute to a whole table, without specifying an index.
In every example I've found, an index is always used:
aws dynamodb update-item \
--region MY_REGION \
--table-name MY_TABLE_NAME \
--key='{"AccountId": {"S": accountId}}' \
--update-expression 'SET conf=:newconf' \
--expression-attribute-values '{":newconf":{"S":"new conf value"}}'
Plus, that's an update for an attribute that already exists in the table.
How can I add a new attribute to each record of a table?
There is no API that will automatically add an attribute to all items in a table. DynamoDB just doesn't work that way.
The only way to add an attribute to all items in a table is to scan the table and, for each item, make an UpdateItem request to add the attribute you want (a sketch follows after the caveats below). This works both for attributes that are missing (i.e. adding new ones) and for attributes that already exist and are just being updated.
Some caveats:
If the table is small and not being updated too often, this may work as intended in a single pass.
If the table is larger and being updated relatively fast (i.e. every second), then you will need to make sure the code updating the table also adds the attribute to new or updated items, and that the two sets of writes don't clobber each other.
Lastly, if the table is large, this can consume a LOT of capacity because of the scan plus an update for each item, so plan on it taking a long time (and mind the consumed capacity vs. provisioned capacity) -- better to have some rate limiting on the update script.
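A minimal boto3 sketch of that scan-and-update pass (the table and attribute names are the placeholders from the question, and the sleep is a crude stand-in for real rate limiting):

import time
import boto3

table = boto3.resource("dynamodb").Table("MY_TABLE_NAME")

start_key = None
while True:
    kwargs = {"ExclusiveStartKey": start_key} if start_key else {}
    page = table.scan(ProjectionExpression="AccountId", **kwargs)
    for item in page["Items"]:
        # sets the attribute whether it is missing or already present
        table.update_item(
            Key={"AccountId": item["AccountId"]},
            UpdateExpression="SET conf = :newconf",
            ExpressionAttributeValues={":newconf": "new conf value"},
        )
        time.sleep(0.05)  # spare the provisioned write capacity
    start_key = page.get("LastEvaluatedKey")
    if not start_key:
        break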

Not able to have commands in User-Defined functions in Kusto

I am trying to create a function that will accept the name of a tag and a datetime value, drop an extent within a specific table which has that tag, and then ingest a new record into that table with the same tag and the input datetime value -- a sort of 'update' simulation. I am not bothered about performance; it's just going to hold metadata -- maybe 20-30 rows at most.
So this is how the table creation looks:
.create table MyTable(sometext:string,somevalue:datetime)
And shown below is my function creation step, which is failing:
.create-or-alter function MyFunction(arg_sometext:string,arg_somedate:datetime)
{
.drop extents <| .show table MyTable extents where tags has arg_sometext;
.ingest inline into table MyTable with (tags="[arg_sometext]") <| arg_somedate
}
So you can see I am trying to do something simple -- I suspect that Kusto won't allow commands inside a function. Is there any workaround for achieving this?
Generally:
Kusto mandates that control commands start with a dot (.), and that this must be the first character in the text of the command. As queries, functions, etc. don't start with a dot, this precludes them from invoking control commands.
This is an intentional limitation that prevents a wide range of code injection attacks. By imposing this rule, Kusto makes it easy to guarantee that any query that does not begin with a dot will only have read access to the data and metadata, and never be able to alter them.
Specifically, with regard to your scenario:
I'm assuming this logic would be triggered automatically (even if you did have the option to create such a function), which suggests you should be able to achieve your goal using Kusto's API / client libraries and a simple script or app.
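For example, here is a minimal Python sketch using the azure-kusto-data client library (the cluster URL, database name, and authentication method are placeholders, and the ingested row layout is assumed from the table definition above):

from azure.kusto.data import KustoClient, KustoConnectionStringBuilder

kcsb = KustoConnectionStringBuilder.with_aad_device_authentication(
    "https://mycluster.kusto.windows.net")
client = KustoClient(kcsb)

def replace_tagged_record(database, tag, value):
    # control commands are allowed here: execute_mgmt sends them as
    # management commands, which a stored function cannot do
    client.execute_mgmt(
        database,
        f".drop extents <| .show table MyTable extents where tags has '{tag}'")
    client.execute_mgmt(
        database,
        f".ingest inline into table MyTable with (tags='[\"{tag}\"]') <| {tag},{value}")

replace_tagged_record("MyDatabase", "mytag", "2020-01-01T00:00:00Z")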
An alternative, and perhaps even better, approach would be to reconsider whether you actually need to delete or update specific records; instead you can use summarize arg_max() to query for only the latest "versions" of the records (you could also create a function that encapsulates that logic and overrides the table, by giving the function the table's name).

Dynamodb query expression

Team,
I have a DynamoDB table with a given hash key (userid) and sort key (ages). Let's say we want to retrieve, for each hash key (userid), the smallest age. What would be the query and filter expression for that DynamoDB query?
Thanks!
I don't think you can do it in a query. You would need to do a full table scan. If you have a list of hash keys somewhere, then you can do N queries (in parallel) instead.
[Update] Here is another possible approach:
Maintain a second table, where you have just a hash key (userID). This table will contain the record with the smallest age for a given user. To achieve that, make sure that every time you update the main table you also update the second one, if the new age is less than the current age in the second table. You can use a conditional update for that. The update can either be done by the application itself, or you can have an AWS Lambda listening to the DynamoDB stream. Now if you need the smallest age for each user, you still do a full table scan of the second table, but this scan only reads relevant records, so it will be optimal.
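A minimal boto3 sketch of that conditional update (the table name MinAges and its attributes UserId/MinAge are assumptions for illustration):

import boto3

min_ages = boto3.resource("dynamodb").Table("MinAges")

def record_age(user_id, age):
    try:
        min_ages.update_item(
            Key={"UserId": user_id},
            UpdateExpression="SET MinAge = :age",
            # write only if there is no record yet, or the new age is smaller
            ConditionExpression="attribute_not_exists(MinAge) OR MinAge > :age",
            ExpressionAttributeValues={":age": age},
        )
    except min_ages.meta.client.exceptions.ConditionalCheckFailedException:
        pass  # the stored minimum is already smaller; nothing to do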
There are two ways to achieve that:
If you don't need to get this data in real time, you can export your data into other AWS systems, like EMR or Redshift, and perform complex analytics queries there. With this you can write SQL expressions using joins and group-by operators.
You can even perform EMR Hive queries on DynamoDB data directly, but they perform scans, so it's not very cost-efficient.
Another option is to use DynamoDB Streams. You can maintain a separate table that stores:
Table: MinAges
UserId - primary key
MinAge - regular numeric attribute
On every update/delete/insert in the original table you can compute the minimum age for the updated user and store it in the MinAges table; a sketch of such a stream handler follows below.
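A minimal sketch of a Lambda handler attached to the main table's stream, reusing a conditional write such as the record_age function sketched above (the attribute names userid/ages are taken from the question; the rest is assumed):

def handler(event, context):
    # invoked by the DynamoDB stream of the main table
    for record in event["Records"]:
        if record["eventName"] not in ("INSERT", "MODIFY"):
            continue
        new_image = record["dynamodb"]["NewImage"]  # stream images use typed values
        user_id = new_image["userid"]["S"]
        age = int(new_image["ages"]["N"])
        record_age(user_id, age)  # conditional write into MinAges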
Another option is to write something like this (pseudocode):
storeNewAge(userId, newAge)
smallestAge = getSmallestAgeFor(userId)
storeSmallestAge(userId, smallestAge)
But since DynamoDB does not have native transaction support, it's dangerous to run code like that: you may end up with inconsistent data. You can use the DynamoDB transactions library, but those transactions are expensive. If you use streams instead, you will have consistent data at a very low price.
You can do it using ScanIndexForward:
YourEntity requestEntity = new YourEntity();
requestEntity.setHashKey(hashkey);
DynamoDBQueryExpression<YourEntity> queryExpression = new DynamoDBQueryExpression<YourEntity>()
        .withHashKeyValues(requestEntity)
        .withConsistentRead(false);
queryExpression.setIndexName(indexName); // only if you are querying an index
queryExpression.setScanIndexForward(true); // ascending by sort key
queryExpression.setLimit(1); // the first item returned is the smallest age

What is a valid dynamodb key-condition-expression for the cli

Could somebody please tell me what a valid key condition expression would be? I am trying to run a query on a simple table called MyKeyTable. It has two "columns", namely Id and AnotherNumberThatICareAbout, which is of type Long.
I would like to see all the values I put in. So I tried:
aws dynamodb query --select ALL_ATTRIBUTES --table-name MyKeyTable
--endpoint http://localhost:8000
--key-condition-expression "WHAT DO I PUT IN HERE?"
What hash do I need to put in? The docs are a bit lame on this imho. Any help appreciated, even if it's just a link to a good doc.
Here's a command-line-only approach you can use with no intermediate files.
First, use value placeholders to construct your key condition expression, e.g.,
--key-condition-expression "Id = :idValue"
(Don't forget the colon prefix for placeholders!)
Next, construct an expression-attribute-values argument. Note that it expects JSON format. The tricky bit I always tend to forget is that you can't just plug in 42 for a number or "foo" for a string: you have to tell DynamoDB both the type and the value. See the AWS docs for the complete breakdown of how you can format the value specification, which can be quite complex if you need it to be.
For Windows you can escape quotation marks in it by doubling them, e.g.,
--expression-attribute-values "{"":idValue"":{""N"":""42""}}"
For macOS/Linux, single quotes are required around the JSON:
--expression-attribute-values '{":idValue":{"N":"42"}}'
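If the shell quoting gets in the way, the same query is only a few lines of Python with boto3 (a sketch; the table name, key name, and local endpoint are taken from the question):

import boto3

client = boto3.client("dynamodb", endpoint_url="http://localhost:8000")

resp = client.query(
    TableName="MyKeyTable",
    KeyConditionExpression="Id = :idValue",
    # the same typed-value JSON as the CLI argument, as a Python dict
    ExpressionAttributeValues={":idValue": {"N": "42"}},
)
print(resp["Items"])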
Create a file containing your key conditions, test.json (the range key condition is optional):
{
    "yourHashKeyName": {
        "AttributeValueList": [{"S": "abc"}],
        "ComparisonOperator": "EQ"
    },
    "YourRangeKey": {
        "AttributeValueList": [{"S": "xyz"}],
        "ComparisonOperator": "EQ"
    }
}
Run
aws dynamodb query --table-name "your table name" --key-conditions file://test.json
Refer to: http://docs.aws.amazon.com/cli/latest/reference/dynamodb/query.html
For scanning the table:
aws dynamodb scan --table-name "your table name"
No need to pass any keys, as we scan the whole table (note: it will return at most 1 MB of data per call).
Refer to: http://docs.aws.amazon.com/cli/latest/reference/dynamodb/scan.html

DynamoDB Change Range Key Column

Is it possible to modify the range key column after table creation, such as adding a new column/attribute and assigning it as the range key for the table? I tried searching but wasn't able to find any articles about changing the range or hash key.
No, unfortunately it's not possible to change the hash key, range key, or indexes after a table is created in DynamoDB. The DynamoDB UpdateTable API documentation is clear about the fact that indexes cannot be modified. I can't find a reference anywhere in the docs that explicitly states that the table keys cannot be modified, but at present they cannot be changed.
Note that DynamoDB is schema-less other than the hash and range key, and you can add other attributes to new items with no problems. Unfortunately, if you need to modify either your hash key or range key, you'll have to make a new table and migrate the data.
Edit (January 2014): DynamoDB now has support for adding global secondary indexes on the fly.
To change the sort key or create an additional one, you will need to create a new table and migrate over to it, as neither action can be done on an existing table.
DynamoDB Streams enable us to migrate tables without any downtime. I've done this to great effect, and the steps I've followed are:
Create a new table (let us call this NewTable), with the desired key structure, LSIs, GSIs.
Enable DynamoDB Streams on the original table
Associate a Lambda with the stream, which pushes each record into NewTable. (This Lambda should trim off the migration flag from Step 5; a sketch is shown after these steps.)
[Optional] Create a GSI on the original table to speed up scanning items. Ensure this GSI only has attributes: Primary Key, and Migrated (See Step 5).
Scan the GSI created in the previous step (or entire table) and use the following Filter:
FilterExpression = "attribute_not_exists(Migrated)"
Update each item in the table with a migration flag (i.e. "Migrated": { "S": "0" }), which sends it to DynamoDB Streams (use the UpdateItem API to ensure no data loss occurs).
NOTE: You may want to increase write capacity units on the table during the updates.
The Lambda will pick up all items, trim off the Migrated flag, and push them into NewTable.
Once all items have been migrated, repoint the code to the new table.
Remove the original table, and the Lambda function, once you're happy all is good.
Following these steps should ensure you have no data loss and no downtime.
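A minimal sketch of the Lambda from Step 3 (the table name NewTable follows the steps above; the stream handling and deserialization details are assumptions, not the code from the blog post below):

import boto3
from boto3.dynamodb.types import TypeDeserializer

new_table = boto3.resource("dynamodb").Table("NewTable")
deserializer = TypeDeserializer()

def handler(event, context):
    for record in event["Records"]:
        if record["eventName"] == "REMOVE":
            continue
        image = record["dynamodb"]["NewImage"]
        # convert the typed stream image into plain Python values
        item = {k: deserializer.deserialize(v) for k, v in image.items()}
        item.pop("Migrated", None)  # trim off the migration flag (Step 5)
        new_table.put_item(Item=item)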
I've documented this on my blog, with code to assist:
https://www.abhayachauhan.com/2018/01/dynamodb-changing-table-schema/
