Conditional insert in Dynamodb - amazon-dynamodb

I am creating a leave tracker app where I want to store the user ID along with the from date and to date. I am using Amazon's DynamoDB as the database, and the user enters a leave through a custom command.
Eg: apply-leave from-date to-date
I want to avoid duplicate entries in the database. For example, if a user has already applied for a leave between 06-10-2019 to 10-10-2019 and applies for a leave between the same dates again, they should get a message saying that this already exists and a new record should not be created for the same.
However, a user can apply for multiple leaves and two users can take a leave between the same dates.
I tried using a conditional statement as follows:
table.put_item(
Item={
'leave_id': leave_id,
'user_id': user_id,
'from_date': from_date,
'to_date': to_date,
},
ConditionExpression='attribute_not_exists(user_id) AND attribute_not_exists(from_date) AND attribute_not_exists(to_date)'
)
where leave_id is the partition key. However, this does not work and a new row is added every time, even if it is the same dates. I have looked through similar other questions, but haven't been able to understand how to get this configured correctly.
Any ideas on how I should go about this, or if there is a different design that I should follow?

If you are calling your code with the leave_id that doesn't yet exist in the table, the item will always be inserted. If you call your code with leave_id that does already exist in your table you should be getting An error occurred (ConditionalCheckFailedException) when calling the PutItem operation: The conditional request failed error message.
I have two suggestions:
If you don't want to change your table, you can create a secondary index with user_id as the partition key and then query the index for all the items where the given user has some from_date and to_date attributes.
Like this:
table.query(
IndexName='user_id-index',
KeyConditionExpression=Key('user_id').eq(user_id),
FilterExpression=Attr('from_date').exists() & Attr('from_date').exists()
)
Then you will need to check for overlapping leave requests, etc. (eg. leave request that starts before the one that is already in place finishes). After deciding that the leave request is a valid one you will call put_item.
Another suggestion and probably a better one would be to create a composite primary key on your table with user_id as a partition key and leave_id as a sort key. That way you could execute a query for all leave requests from a particular user without the need to create a secondary index.

Related

How to introduce a new column in dynamo DB running in production?

I have a use case where DynamoDB is running in production and I need to add a new column IDUpdatedAt which will also be serving as a sort key for one of the GSIs.
I tried a thing in test where my application adds the new rows with IDUpdatedAt, it's working fine but what about the existing rows? How to add the values for those?
Also the new rows will not be added without IDUpdatedAt, but how will the search be impacted for older rows?
PS: IDUpdatedAt is being used as a filter in the application, i.e., user can search for specific ID and can get results sorted by date. That's why IDUpdatedAt is also a part of GSI (sort key).
Please help.
You've got the right idea by adding the field to new items. After all, DynamoDB does not enforce a particular schema outside of the primary key.
This also happens to be a very useful feature, especially when defining a GSI on that attribute; if the atttibute exists on the item, it ends up in the index! For example, imagine modeling an email inbox in DDB where each item represents an email. You could include an attribute 'is_read' and define a GSI using that atttibute.
If the 'is_read' attribute exists on the item, it's in the index. Otherwise, it's not. A cool way to use GSIs to implement filtering.
Pretty neat stuff!
However, there is no way to retroactively update all items with a new attribute other than manually updating each item (or in batches). The equivalent in SQL databases is defining a new column. Unfortunately, an analogous operation in DDB does not exist.

Removing one-to-many relations using splice

I'm having some trouble getting relation deletion to work exactly how I would expect it to.
For example I have two simple tables, users and permissions with a one-to-many relation between users and permissions (or it could be many-to-many in this example as well).
I first tried deleting one of the related permissions using userDatasource.deleteItem() or userDatasource.item.permissions[index]._delete() but when you use either of those functions it marks the record as deleted client side so you run into trouble when you need to insert again.
I then found a related question that said to use item.relation.splice(startIndex, 1) to just break the relation and that worked as expected but now I have a bunch of extra rows in my database with the user foreign key null. I would much rather have the same behavior as .splice but also have it delete those records from the database. Is there any way to do that or is App Maker supposed to detect the broken relation and automatically delete the row from the table?
Just do a check after the splice like this:
if (item.relation.length === 0) {
item._delete();
}

Geoserver WFS-T Insert error (ORA-22816) when using layers based on database views i.c. with sequence numbering for ID's

We are working on a Frontend application, which adds new data to layers of GeoServer. The Frontend is using WFS-T Insert calls to add this data. We use views for these layers in GeoServer, to do some additional handling. The views we use are database views thus created on our Oracle database itself (e.g. we do not use the SQL Views of GeoServer). These layers based on views work all fine (solution applied with a disabled “primary key” for the view as described elsewhere on the internet).
The views we use contain an unique ID which is the unique ID of the “principal” table which is used in the view. For the unique creation of the ID’s for the “principal” table we chose to have the creation of this ID to be done by a sequence defined in Oracle. For using this sequence within GeoServer you can provide this “metadata” as indicated by the documentation: http://docs.geoserver.org/stable/en/user/data/database/primarykey.html
This solutions works fine, except when you use an (insert) trigger on the database view.
CREATE OR REPLACE TRIGGER OUR_VIEW_TRG
instead of insert or update or delete on vw_our_view
for each row…
If we perform a WFS-T insert call this results in the following exception:
org.geoserver.wfs.WFSTransactionException: Error performing insert:
Error inserting features Error performing insert: Error inserting
features Error inserting features ORA-22816: unsupported feature with
RETURNING clause
Without indicating GeoServer to use the sequence we get returned a feature ID, which will not correspond with our sequence numbering on the ID of the table (GeoServer simply returns the number of rows of the table, plus one). This results in an undesired situation where we have an incorrect ID at the Frontend after an WFS-T insert, only after a refresh of the browser the correct ID is being fetched.
Does anybody know if there is a way to make this work, e.g. for our customer changing the GeoServer code ourselves, and thus create our “own” version of GeoServer is no option. Stop using the views would imply extra WFS-T insert calls from our FrontEnd Application.

Using timestamp as an Attribute in DynamoDB

I'm quite new to DynamoDB, but have some experience in Cassandra. I'm trying to adapt a pattern I followed in Cassandra, where each column represented a timestamped event, and wondering if it will carry over gracefully into DynamoDB or if I need to change my approach.
My goal is to query a set of documents within a date range by using the milliseconds-since-epoch timestamp as an Attribute name. I'm successfully storing the following as each report is generated with each new report being added under its own column:
{ PartitionKey:customerId,
SortKey:reportName_yyyymm,
'#millis_1#':{'report':doc_1},
'#millis_2#':{'report':doc_2},
. . .
'#millis_n#':{'report':doc_n}
}
My question is, given a millisecond-based date range, and the accompanying Partition and Sort keys, is it possible to query the set of Attributes that fall within that range or must I retrieve all columns for the matching keys and filter them at the client?
Welcome to the most powerful NoSQL database ;)
To kick off with the positive news, there is no way to query out specific attributes. You can project certain attributes in a query. But you would have to write your own logic to determine which attributes or columns should be included in the projected query. To get close to your solution you could use a map attribute inside an item with the milliseconds as a key. But there is another thing you have to be aware of when starting on this path.
There is a maximum total item size of 400KB for each item in DynamoDB, including key and attribute names.(Limits in DynamoDB Items) This means you can only store so many attributes in an item. This is especially true if you intend to put the actual report inside of the attribute. Which I would advise against, also because you will be burning up read capacity units every time you get one attribute out of the whole item. You would be better of putting this data in a separate table with the keys in the map. But truthfully in DynamoDB I would split this whole thing up, just add the milliseconds to the sort key and make every document its own item. That way you can directly query to these items and you can use the "between" where clause to select specific date-time ranges. Please let me you meant something else.

Voting on items - how to design database/aws-lambda to minimize AWS costs

I'm working on a website that mostly displays items created by registered users. So I'd say 95% of API calls are to read a single item and 5% are to store a single item. System is designed with AWS API Gateway that calls AWS Lambda function which manipulates data in DynamoDB.
My next step is to implement voting system (upvote/downvote) with basic fetaures:
Each registered user can vote only once per item, and later is only allowed to change that vote.
number of votes needs to be displayed to all users next to every item.
items have only single-item views, and are (almost) never displayed in a list view.
only list view I need is "top 100 items by votes" but it is ok to calculate this once per day and serve cached version
My goal is to design a database/lambda to minimize costs of AWS. It's easy to make the logic work but I'm not sure if my solution is the optimal one:
My items table currently has hashkey slug and sortkey version
I created items-votes table with hashkey slug and sortkey user and also voted field (containing -1 or 1)
I added field votes to items table
API call to upvote/downvote inserts to item-votes table but before checks constraints that user has not already voted that way. Then in second query updates items table with updated votes count. (so 1 API call and 2 db queries)
old API call to show an item stays the same but grabs new votes count too (1 API call and 1 db query)
I was wondering if this can be done even better with avoiding new items-votes table and storing user votes inside items table? It looks like it is possible to save one query that way, and half the lambda execution time but I'm worried it might make that table too big/complex. Each user field is a 10 chars user ID so if item gets thousands of votes I'm not sure how Lambda/DynamoDB will behave compared to original solution.
I don't expect thousands of votes any time soon, but it is not impossible to happen to a few items and I'd like to avoid situation where I need to migrate to different solution in the near future.
I would suggest to have a SET DynamoDB (i.e. SS) attribute to maintain the list of users who voted against the item. Something like below:-
upvotes : ['user1', 'user2']
downvotes : ['user1', 'user2']
When you update the votes using UpdateExpression, you can use ADD operator which adds users to SET only if it doesn't exists.
ADD - Adds the specified value to the item, if the attribute does not
already exist. If the attribute does exist, then the behavior of ADD
depends on the data type of the attribute:
If the existing data type is a set and if Value is also a set, then
Value is added to the existing set. For example, if the attribute
value is the set [1,2], and the ADD action specified [3], then the
final attribute value is [1,2,3]. An error occurs if an ADD action is
specified for a set attribute and the attribute type specified does
not match the existing set type. Both sets must have the same
primitive data type. For example, if the existing data type is a set
of strings, the Value must also be a set of strings.
This way you don't need to check whether the user already upvote or downvote for the item or not.
Only thing you may need to ensure is that the same user shouldn't be present on upvote and downvote set. Probably, you can use REMOVE or ConditionExpression to achieve this.

Resources