DynamoDB - .NET Object Persistence Model - LoadAsync does not apply ScanCondition

I am fairly new to this area, and any help is appreciated.
I have a table named Tenant in a DynamoDB database.
"TenantId" is the hash primary key and there are no other keys. There is also a boolean field named "IsDeleted".
I am trying to run a query that gets the record with a specified "TenantId" only if it is not deleted ("IsDeleted == 0").
The following code gives the correct result (it returns 0 items):
var filter = new QueryFilter("TenantId", QueryOperator.Equal, "2235ed82-41ec-42b2-bd1c-d94fba2cf9cc");
filter.AddCondition("IsDeleted", QueryOperator.Equal, 0);
var dbTenant = await _genericRepository.FromQueryAsync(new QueryOperationConfig
{
    Filter = filter
}).GetRemainingAsync();
But I have no luck with the following snippet; it returns the item even though it is deleted (returns 1 item):
var queryFilter = new List<ScanCondition>();
var scanCondition = new ScanCondition("IsDeleted", ScanOperator.Equal, new object[]{0});
queryFilter.Add(scanCondition);
var dbTenant2 = await _genericRepository.LoadAsync("2235ed82-41ec-42b2-bd1c-d94fba2cf9cc", new DynamoDBOperationConfig
{
    QueryFilter = queryFilter,
    ConditionalOperator = ConditionalOperatorValues.And
});
Any idea why the ScanCondition has no effect?
Later I also tried this, which throws an exception:
var dbTenant2 = await _genericRepository.QueryAsync("2235ed82-41ec-42b2-bd1c-d94fba2cf9cc", new DynamoDBOperationConfig()
{
    QueryFilter = new List<ScanCondition>()
    {
        new ScanCondition("IsDeleted", ScanOperator.Equal, 0)
    }
}).GetRemainingAsync();
It throws with: "Message": "Must have one range key or a GSI index defined for the table Tenants"
Why does it complain about a range key or an index? I'm calling:
public AsyncSearch<T> QueryAsync<T>(object hashKeyValue, DynamoDBOperationConfig operationConfig = null);

With the object persistence model's QueryAsync, you simply can't query a table by a hash-only primary key, because there is one and only one item for that primary key. The result of such a Query would still be that single item, which is really a Load operation, not a Query. In this case you can only query if you have a composite primary key (hash key TenantId plus a range key) or a GSI (which doesn't impose key uniqueness and therefore accepts duplicate keys on the index).
The second snippet attempts to filter the Load. DynamoDBOperationConfig's QueryFilter is documented as follows:
// Summary:
// Query filter for the Query operation operation. Evaluates the query results and
// returns only the matching values. If you specify more than one condition, then
// by default all of the conditions must evaluate to true. To match only some conditions,
// set ConditionalOperator to Or. Note: Conditions must be against non-key properties.
So it works only with Query operations.
Edit: So after reading your comments on this...
I don't think conditional expressions are meant for read operations. The AWS documentation indicates they are for put or update operations. However, I am not entirely sure about this, since I have never needed a conditional Load. In general there is no CheckIfExists functionality either; you have to read the item and see whether it exists. A conditional load would still consume read throughput, so the only advantage would be not retrieving the item, in other words saving bandwidth (which is negligible for a single item).
My suggestion is to read it and filter it in your application layer. Don't query for it. If you really need to, you can instead use TenantId as the hash key and isDeleted as the range key. If you do so, you always have to query when you want to get a tenant, and in the query you can set the range key (isDeleted) to 0 or 1. This isn't how I would do it; as I said, I would just read it and filter it in my application.
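The read-then-filter suggestion can be sketched without any SDK at all. The following is a minimal, language-neutral illustration written in Java; it uses an in-memory map in place of the DynamoDB table, and the class and method names (Tenant, TenantLookup, loadActive) are hypothetical, not part of the AWS SDK or the poster's repository.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for the poster's Tenant item (field names are assumptions).
final class Tenant {
    final String tenantId;
    final boolean isDeleted;

    Tenant(String tenantId, boolean isDeleted) {
        this.tenantId = tenantId;
        this.isDeleted = isDeleted;
    }
}

// In-memory model of a hash-key-only table: at most one item per TenantId.
final class TenantLookup {
    private final Map<String, Tenant> table = new HashMap<>();

    void save(Tenant tenant) {
        table.put(tenant.tenantId, tenant);
    }

    // "Load" by hash key, then discard the item in application code
    // if it is marked as deleted.
    Optional<Tenant> loadActive(String tenantId) {
        return Optional.ofNullable(table.get(tenantId))
                .filter(t -> !t.isDeleted);
    }
}
```

In a real application the map lookup would be replaced by the actual load call; the filter step stays the same, and a missing item and a deleted item both come back as an empty result.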
Another option is to put a GSI on the isDeleted field and leave the attribute unset (write null) when it is 0. That way the attribute is present in your table only when it is 1. A GSI on such an attribute is called a sparse index. Later, if you need all the tenants that are deleted (isDeleted=1), you can simply scan the entire index without conditions. Because the attribute is absent when the value would be 0, DynamoDB won't put those items in the index in the first place.

Related

Global Secondary Index DynamoDB empty value error

I have a string attribute that can be empty, and I want to use it as the key of a Global Secondary Index. But I get an error when I try to perform an UpdateItemRequest or SaveTable Context:
Amazon.DynamoDBv2.AmazonDynamoDBException: One or more parameter values are not valid. A value specified for a secondary index key is not supported. The AttributeValue for a key attribute cannot contain an empty string value. IndexName: .... IndexKey: ...
What is wrong with my mindset or my settings? I'm new to DynamoDB and come from a MongoDB background. If I don't use a GSI for this attribute, how can I query on that attribute?
I tried
[DynamoDBIgnore] string property;
var operationConfig = new DynamoDBOperationConfig() { };
operationConfig.IsEmptyStringValueEnabled = true;
operationConfig.Conversion = DynamoDBEntryConversion.V2;
but it does not work.
The error says that you are setting the value of your GSI partition key to an empty string "", which is not supported.
To overcome this, simply remove the value from that attribute, i.e. don't set it at all. This makes your index sparse: it only stores items that have a value for the index key.
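As a rough illustration of "don't set it at all", the sketch below builds an item as a plain map and skips the index attribute when the value is null or empty. It is written in Java with no SDK dependency; the class SparseItemBuilder and the attribute names "Id" and "IndexedAttr" are assumptions made up for this example.

```java
import java.util.HashMap;
import java.util.Map;

final class SparseItemBuilder {
    // Adds the attribute only when it has a non-empty value, so an item
    // without a value simply lacks the attribute.
    static void putIfPresent(Map<String, String> item, String name, String value) {
        if (value != null && !value.isEmpty()) {
            item.put(name, value);
        }
    }

    // "Id" and "IndexedAttr" are hypothetical attribute names.
    static Map<String, String> buildItem(String id, String indexedValue) {
        Map<String, String> item = new HashMap<>();
        item.put("Id", id);
        putIfPresent(item, "IndexedAttr", indexedValue);
        return item;
    }
}
```

An item built with an empty value carries no "IndexedAttr" entry, which is the shape that keeps it out of a sparse index.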

retrieve a result when the partition is not known (but row key is)

In my case (I happen to have only two types for each entry, so 2 partitions, and the row key is unique) I can write an iterative set of queries going over all possible partitions like this:
TableOperation retrieveOperation = TableOperation.Retrieve<JobStatus>(Mode.GreyScale.Description(), id);
TableResult query = await table.ExecuteAsync(retrieveOperation);
if (query.Result != null)
{
    return new OkObjectResult((JobStatus)query.Result);
}
else
{
    retrieveOperation = TableOperation.Retrieve<JobStatus>(Mode.Sepia.Description(), id);
    query = await table.ExecuteAsync(retrieveOperation);
    if (query.Result != null)
    {
        return new OkObjectResult((JobStatus)query.Result);
    }
}
return new NotFoundResult();
The thing is, that is clearly inefficient (imagine if there were hundreds of types!). Do Azure storage tables provide an efficient means to query when you know only the row key?
Do Azure storage tables provide an efficient means to query when you
know only the row key?
The simple answer to your question is no: there is no efficient way to query a table when you only know the RowKey. The Table Service will do a full table scan, going from one partition to another, to find entities with a matching RowKey.
In your case, you would probably want to use TableQuery to create your query and then either call ExecuteQuery or ExecuteQuerySegmented to get query results.
TableQuery query = new TableQuery().Where("RowKey eq 'Your Row Key'");
var result = table.ExecuteQuery(query);

DynamoDB Mapper Query Doesn't Respect QueryExpression Limit

Imagine the following function which is querying a GlobalSecondaryIndex and associated Range Key in order to find a limited number of results:
@Override
public List<Statement> getAllStatementsOlderThan(String userId, String startingDate, int limit) {
    if(StringUtils.isNullOrEmpty(startingDate)) {
        startingDate = UTC.now().toString();
    }
    LOG.info("Attempting to find all Statements older than ({})", startingDate);
    Map<String, AttributeValue> eav = Maps.newHashMap();
    eav.put(":userId", new AttributeValue().withS(userId));
    eav.put(":receivedDate", new AttributeValue().withS(startingDate));
    DynamoDBQueryExpression<Statement> queryExpression = new DynamoDBQueryExpression<Statement>()
        .withKeyConditionExpression("userId = :userId and receivedDate < :receivedDate").withExpressionAttributeValues(eav)
        .withIndexName("userId-index")
        .withConsistentRead(false);
    if(limit > 0) {
        queryExpression.setLimit(limit);
    }
    List<Statement> statementResults = mapper.query(Statement.class, queryExpression);
    LOG.info("Successfully retrieved ({}) values", statementResults.size());
    return statementResults;
}
List<Statement> results = statementRepository.getAllStatementsOlderThan(userId, UTC.now().toString(), 5);
assertThat(results.size()).isEqualTo(5); // NEVER passes
The limit isn't respected whenever I query against the database. I always get back all results that match my search criteria so if I set the startingDate to now then I get every item in the database since they're all older than now.
You should use queryPage function instead of query.
From DynamoDBQueryExpression.setLimit documentation:
Sets the maximum number of items to retrieve in each service request
to DynamoDB.
Note that when calling DynamoDBMapper.query, multiple
requests are made to DynamoDB if needed to retrieve the entire result
set. Setting this will limit the number of items retrieved by each
request, NOT the total number of results that will be retrieved. Use
DynamoDBMapper.queryPage to retrieve a single page of items from
DynamoDB.
As the other answer rightly says, the setLimit and withLimit functions only limit the number of records fetched in each individual request; internally, multiple requests take place to fetch the full result set.
If you want to cap the total number of records across all requests, note that a "Scan" limit is likewise applied per request. Fetch a single page with queryPage (or scanPage) and stop once you have enough items.
Example for the same can be found here
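The difference between a per-request limit and a total limit can be seen in a small in-memory model. This is only a sketch of the paging behaviour described in the quoted documentation, not the DynamoDBMapper implementation; the class PagedQueryModel and its methods are hypothetical and use plain integers in place of Statement items.

```java
import java.util.ArrayList;
import java.util.List;

final class PagedQueryModel {
    // All items that match the key condition, in index order.
    private final List<Integer> matches = new ArrayList<>();

    PagedQueryModel(int matchCount) {
        for (int i = 0; i < matchCount; i++) {
            matches.add(i);
        }
    }

    // Models one service request: at most `limit` items starting at `start`.
    List<Integer> queryPage(int start, int limit) {
        if (limit <= 0) {
            throw new IllegalArgumentException("limit must be positive");
        }
        if (start >= matches.size()) {
            return new ArrayList<>();
        }
        int end = Math.min(start + limit, matches.size());
        return new ArrayList<>(matches.subList(start, end));
    }

    // Models a "fetch everything" query: it keeps requesting pages until
    // nothing is left, so `limit` only controls the page size.
    List<Integer> queryAll(int limit) {
        List<Integer> all = new ArrayList<>();
        int start = 0;
        while (start < matches.size()) {
            List<Integer> page = queryPage(start, limit);
            all.addAll(page);
            start += page.size();
        }
        return all;
    }
}
```

With 12 matching items and a limit of 5, queryAll still returns all 12 items (in three requests), while a single queryPage call returns at most 5. That mirrors why the assertion in the question never passes with query and why a single-page call is the one to use for a hard cap.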

NOT_NULL query condition on globalSecondaryIndex in dynamodb query

Is it possible to add a constraint to a DynamoDB query expression stating that a GSI attribute should not be null?
Can somebody provide examples?
Is it possible to construct a query like the one below?
new DynamoDBQueryExpression<XXX>()
.withHashKeyValues(YYY).withKeyConditionExpression(GSI != NULL);
Note:
Please let me know whether this is possible during the query itself rather than at filter time.
If you're like me and you landed on this page while looking for the answer to the above question, here's the thread you need to see:
How do you query for a non-existent (null) attribute in DynamoDB
A DynamoDB String attribute can't be NULL or an empty string.
When you try to insert NULL, the API should throw the exception below:
java.lang.IllegalArgumentException: Input value must not be null
When you try to insert an empty string, the API should throw the exception below:
com.amazonaws.AmazonServiceException: One or more parameter values were invalid: An AttributeValue may not contain an empty string
If you want to add additional filters on some attributes (i.e. attributes other than the hash or range key), you can use the syntax below (i.e. withFilterExpression).
The not-equals operator is "<>".
Map<String, AttributeValue> eav = new HashMap<String, AttributeValue>();
eav.put(":val1", new AttributeValue().withS("Some value"));
DynamoDBQueryExpression<XXX> queryExpression = new DynamoDBQueryExpression<XXX>();
queryExpression.withHashKeyValues(hashKeyValues);
queryExpression.withFilterExpression("docType <> :val1").withExpressionAttributeValues(eav);

How to check if SimpleList or SimpleRecord is empty

I have the following issue. I am trying to use the With or WithMany instruction to retrieve a linked list of roles of a business relation via an outer join. Referential integrity is in place on the database, but the primary key on the roles table is a composite key. That's the reason I use an OuterJoin clause, because I get an exception otherwise.
When the query gets executed, the results are exactly as I expected and nicely filled with data. Nevertheless, there are certain cases where no roles are available in the database yet. I would expect that in those cases the returned SimpleList (Roles below) would be null, because there is no data available. Instead Simple.Data returns a SimpleList, and if I expand the Dynamic View in debug it says "Empty: No further information on this object could be discovered". Even if I traverse further down and retrieve the first object in the SimpleList, it returns a SimpleRecord with the same information as above in debug. Only after I request a property of the SimpleRecord do I get some indication that the record was empty, because then it returns null.
To come to the bottom line: is there anybody who can tell me how to check whether a SimpleList or SimpleRecord is empty or null without traversing down the hierarchy?
I am using Simple.Data 0.16.1.0 (due to policy I can't use the beta yet).
Thanks in advance for reading the whole story...
Below is the code sample:
dynamic businessRelationRoles;
var query = db.Zakenrelaties.As("BusinessRelations")
    .All()
    .OuterJoin(db.Zakenrelaties_Rollen.As("Roles"), out businessRelationRoles)
    .On(zr_ID: db.Zakenrelaties.zr_ID)
    .With(businessRelationRoles);
foreach (var relation in query)
{
    //Get the SimpleList as IEnumerable
    IEnumerable<dynamic> roles = relation.Roles;
    //Get the first available SimpleRecord
    var role = roles.First();
    //Check if any record was returned..This passes always?? Even if the SimpleList was empty
    if (role != null)
    {
        //Get the id of the role. returns null if SimpleRecord was empty
        var roleId = role.zrro_id;
    }
}
Is there anybody who can help me out?
Belatedly, and for information purposes only, this was a bug and got fixed in the 0.17 (aka 1.0-RC0) release.

Resources