DynamoDB - Return which ConditionExpression was false - amazon-dynamodb

I am calling PutItem with a ConditionExpression that looks like:
attribute_exists(id) AND object_version = :x
In other words, I only want to update an item if the following conditions are true:
The object needs to exists
My update must be on latest version of the object
Right now, if the check fails, I don't know which condition was false. Is there a way to get information on which conditions were false? Probably not but who knows...

Conditional expressions in DynamoDB allow for atomic write operations for DynamoDB objects that is strongly consistent for a single object, even in a distributed system thanks to paxos.
One standard approach is to simply read the object first and perform your above check in your client application code. If one of the conditions doesn't match you know which one was invalid directly without a failed write operation. The reason for having DynamoDB also perform this check is because another application or thread may have modified this object between the check and write. If the write failed then you would read the object again and perform the check again.
Another approach would be to skip the read before the write and just read the object after the failed write to do the check in your code to determine what condition actually failed.
The one or two additional reads against the table are required because you want to know which specific condition failed. This feature is not provided by DynamoDB so you'll have to do the check yourself.

Related

Dynamodb: Index on List attribute and query NOT_CONTAINS

I am trying to figure out (at this point I think the answer is No) if it is possible to build a index on a List Attribute and query NOT_CONTAINS on that attribute.
Example table:
Tasks
Task_id: string
solved_by: List<String> # stores list of user_ids who previously solved this task.
My query would be:
Get me all the tasks not yet solved by current_user
select * from tasks where tasks.solved_by NOT_CONTAINS current_user_id
Is it possible to do this without full scans. I tried creating an attribute of type L but aws cli errors out saying Member must satisfy enum value set: [B, N, S]
If this is not possible with dynamodb, please suggest what datastore I can use.
Any help is highly appreciated. Thanks!
As you found out, and as the error you got suggests, this is NOT possible.
However, I'd argue if your design couldn't be improved. Storing a potentially unbound list of entries (users in your case) inside a single item, which is limited to 400kb seems dangerous.
If instead, you'd store for each task the information that a particular user resolved it as a separate item (partition key - task_id, sort key - user_id) than you could easily look up if a user solved a task or not. You could also store additional information about the particular solution or attempts.
If you haven't heard of DynamoDB single table design yet, or how to overload indexes, I can recommend looking at
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/bp-modeling-nosql-B.html
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/bp-gsi-overloading.html
https://www.dynamodbbook.com/
Update
I just realised, you care about a negation (NOT_CONTAINS) - for those, you can't use an index anyway. For the sort key you can only use positive comparison (=, <, >, <=, >=, between, begins_with): https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Query.html#Query.KeyConditionExpressions
So you might have to rethink the whole approach, to better pre-process the data stored in DDB, so it's easier to fetch, or pick a different database.
In your original question, you defined your access pattern as
Get me all the tasks not yet solved by current_user
In a later comment, you clarified that the access pattern is
A solver should be shown a task that is not yet solved by them.
which is a slightly different access pattern.
Here's one way you could fetch a task not yet solved by a user.
In this data model, I chose to model Users and Tasks as separate items. Tasks have numerically increasing ID's. Each User item should start with the lastSolved attribute set to 1. Each time you fetch a new Task for a user, you fetch TASK#{last_solved+1} and increment the lastSolved attribute by 1.
You could probably take a similar approach by using timestamps instead of numbers... anything sortable, really.

What is the model for checking if a GSI key exists or not in DDB?

I have a pretty straight forward question
I want to know if some GSI hash key exists or not.
The best I can find right now is
DynamoDBQueryExpression<T> queryExpression;
// Logic for constructing query
queryExpression.withIndexName(SomeIndexName);
QueryResultPage<T> queryResponse mapper.queryPage(T.class, queryExpression, someMapperConfig));
Here query result page contains a list of results, I can check if that list has anything and conclude whether it exists or not.
The obvious problem is the efficiency drop when there are things that are present. Is there a way to not move the contents of the item across network IO for the purpose of verification (i.e. a server side total validation of the predicate of checking if some GSI key exists or not)?
When writing an item with, for example, Put-Item you can add a condition specifying the key must not exist. This way DynamoDB checks whether the provided key is already taken and will give an error when you try to put something in. Just catch the error and then you know the key was already taken.

SQL: Best practice for inserting a record if it doesn't exist

I have an asp.net Gridview that handles insert operations into a SQL database. Records are only permitted to be inserted if they meet a uniqueness criteria, and this constraint is being enforced using unique indexes in SQL server. If the user attempts to insert a record that already exists, an error message is displayed.
I'm wondering what the best practice is for implementing this.
Check if the record exists SQL side, using IF EXISTS, and locking hints (updlock, holdlock, etc). Return an error code to ASP.net depending on whether the record was inserted
Perform the INSERT operation inside a SQL server try/catch block, relying on the unique index to prevent the insert from occurring if the record exists. Return an error code depending on whether an exception was thrown.
Perform the INSERT operation SQL side, but without SQL try/catch. Handle the PK violation exception inside ASP.net instead.
Normally I'd consider using exceptions to handle valid operations to be bad practice - i.e. software should not throw exceptions unless something is broken. However if the unique index on the table in SQL is going to implement the desired constraint, why bother performing a manual check for existence of the record?
I would make a separate call to check if the record already exists. If yes, show message to user, if no make insert. The reason I would do it this way is because I prefer keeping all the business logic in the application.
If you insist in making just one stored proc call:
I would check before I insert. I would also add an output parameter to the stored proc that returns a message if the insert was unsuccessful. In my application if I see a message in the output parameter, I will display that to the user.

MS Access CREATE PROCEDURE Or use Access Macro in .NET

I need to be able to run a query such as
SELECT * FROM atable WHERE MyFunc(afield) = "some text"
I've written MyFunc in a VB module but the query results in "Undefined function 'MyFunc' in expression." when executed from .NET
From what I've read so far, functions in Access VB modules aren't available in .NET due to security concerns. There isn't much information on the subject but this avenue seems like a daed end.
The other possibility is through the CREATE PROCEDURE statement which also has precious little documentation: http://msdn.microsoft.com/en-us/library/bb177892%28v=office.12%29.aspx
The following code does work and creates a query in Access:
CREATE PROCEDURE test AS SELECT * FROM atable
However I need more than just a simple select statement - I need several lines of VB code.
While experimenting with the CREATE PROCEDURE statement, I executed the following code:
CREATE PROCEDURE test AS
Which produced the error "Invalid SQL statement; expected 'DELETE', 'INSERT', 'PROCEDURE', 'SELECT', or 'UPDATE'."
This seems to indicate that there's a SQL 'PROCEDURE' statement, so then I tried
CREATE PROCEDURE TEST AS PROCEDURE
Which resulted in "Syntax error in PROCEDURE clause."
I can't find any information on the SQL 'PROCEDURE' statement - maybe I'm just reading the error message incorrectly and there's no such beast. I've spent some time experimenting with the statement but I can't get any further.
In response to the suggestions to add a field to store the value, I'll expand on my requirements:
I have two scenarios where I need this functionality.
In the first scenario, I needed to enable the user to search on the soundex of a field and since there's no soundex SQL function in Access I added a field to store the soundex value for every field in every table where the user wants to be able to search for a record that "soundes like" an entered value. I update the soundex value whenever the parent field value changes. It's a fair bit of overhead but I considered it necessary in this instance.
For the second scenario, I want to normalize the spacing of a space-concatenation of field values and optionally strip out user-defined characters. I can come very close to acheiving the desired value with a combination of TRIM and REPLACE functions. The value would only differ if three or more spaces appeared between words in the value of one of the fields (an unlikely scenario). It's hard to justify the overhead of an extra field on every field in every table where this functionality is needed. Unless I get specific feedback from users about the issue of extra spaces, I'll stick with the TRIM & REPLACE value.
My application is database agnostic (or just not very religious... I support 7). I wrote a UDF for each of the other 6 databases that does the space normalization and character stripping much more efficiently than the built-in database functions. It really annoys me that I can write the UDF in Access as a VB macro and use that macro within Access but I can't use it from .NET.
I do need to be able to index on the value, so pulling the entire column(s) into .NET and then performing my calculation won't work.
I think you are running into the ceiling of what Access can do (and trying to go beyond). Access really doesn't have the power to do really complex TSQL statements like you are attempting. However, there are a couple ways to accomplish what you are looking for.
First, if the results of MyFunc don't change often, you could create a function in a module that loops through each record in atable and runs your MyFunc against it. You could either store that data in the table itself (in a new column) or you could build an in-memory dataset that you use for whatever purposes you want.
The second way of doing this is to do the manipulation in .NET since it seems you have the ability to do so. Do the SELECT statement and pull out the data you want from Access (without trying to run MyFunc against it). Then run whatever logic you want against the data and either use it from there or put it back into the Access database.
Why don't you want to create an additional field in your atable, which is atable.afieldX = MyFunc(atable.afield)? All what you need - to run UPDATE command once.
You should try to write a SQL Server function MyFunc. This way you will be able to run the same query in SQLserver and in Access.
A few usefull links for you so you can get started:
MSDN article about user defined functions: http://msdn.microsoft.com/en-us/magazine/cc164062.aspx
SQLServer user defined functions: http://www.sqlteam.com/article/intro-to-user-defined-functions-updated
SQLServer string functions: http://msdn.microsoft.com/en-us/library/ms181984.aspx
What version of JET (now called Ace) are you using?
I mean, it should come as no surprise that if you going to use some Access VBA code, then you need the VBA library and a copy of MS Access loaded and running.
However, in Access 2010, we now have table triggers and store procedures. These store procedures do NOT require VBA and in fact run at the engine level. I have a table trigger and soundex routine here that shows how this works:
http://www.kallal.ca/searchw/WebSoundex.htm
The above means if Access, or VB.net, or even FoxPro via odbc modifies a row, the table trigger code will fire and run and save the soundex value in a column for you. And this feature also works if you use the new web publishing feature in access 2010. So, while the above article is written from the point of view of using Access Web services (available in office 365 and SharePoint), the above soundex table trigger will also work in a stand a alone Access and JET (ACE) only application.

How to restrict PL/SQL code from executing twice with the same values to the input parameters?

I want to restrict the execution of my PL/SQL code from repetition. That is, I have written a PL/SQL code with three input parameters viz, Month, Year and a Flag. I have executed the procedure with the following values for the parameters:
Month: March
Year : 2011
Flag: Y
Now, If I am trying to execute the procedure with the same values to the parameters as above, I want to write some code in the PL/SQL to restrict the unwanted second execution. Can anyone help. I hope the question is no ambiguous.
You can use the function result cache: http://www.oracle-developer.net/display.php?id=504 . So Oracle can do this for you.
I would create another table that would store the 3 parameters of each request. When your procedure is called it would first check the "parameter request" table to see if the calling parameters have beem used before. If found, then exit the procedure. If not found, then save the parameters and execute the rest of the procedure.
Your going to need to keep "State" of the last call somewhere. I would recommend creating a table with a datetime column.
When your procedure is called update this table. So, next time when your procedure is called.. check this table to see when was the last time your procedure was called and then proceed accordingly.
Why not set up a table to track what arguments you've already executed it with?
In your procedure, first check that table to see if similar parameters have already been processed. If so, exit (with or without an error).
If not, insert them and do the processing necessary.
Depending on how tight the requirements are, you'll need to get a exclusive lock on that table to prevent concurrent execution.
A nice plus would be an extra column with "in progress"/"done"/"error" status so that you can check if things are going on properly. (Maybe a timestamp too if that's important/interesting.)
This setup allows you to easily clear some of the executions (by deleting some rows) if you find things need to be re-done for whatever reason.
Make an insert in the beginning of the procedure, and do a select for update tolock the table so no one else can process any data and if everything goes ok with the procedure, commit and release the table 😀

Resources