How to ADD values in a map field in DynamoDB using UpdateItem operation? - amazon-dynamodb

I need to constantly increment values in a map field in DynamoDB. The map will contain keys with counters, and on each update I want to atomically increment one of the keys. The corner case is that the operation needs to behave as an upsert: if the record/property doesn't exist it should be created, otherwise updated (i.e. incremented).
The Java code below shows what I'm trying to achieve:
DynamoDbClient client = DynamoDbClient.builder()
        .credentialsProvider(
                StaticCredentialsProvider.create(
                        AwsBasicCredentials.create("test-key", "test-secret")))
        .region(Region.EU_CENTRAL_1)
        .endpointOverride(URI.create("http://localhost:4566"))
        .build();

client.transactWriteItems(TransactWriteItemsRequest.builder()
        .transactItems(TransactWriteItem.builder()
                .update(Update.builder()
                        .tableName("my-table")
                        .key(Map.of("id", AttributeValue.builder().s("123").build()))
                        .updateExpression("""
                                SET itemId = if_not_exists(itemId, :itemId)
                                ADD #val.#country :value
                                """)
                        .expressionAttributeNames(Map.of(
                                "#val", "value",
                                "#country", "Portugal" // in other invocations this might be a different country
                        ))
                        .expressionAttributeValues(Map.of(
                                ":itemId", AttributeValue.builder().s("1234").build(),
                                ":value", AttributeValue.builder().n("1").build()))
                        .build())
                .build())
        .build());
Every time I run this operation I get the following error:
Exception in thread "main" software.amazon.awssdk.services.dynamodb.model.DynamoDbException: The document path provided in the update expression is invalid for update (Service: DynamoDb, Status Code: 400, Request ID: XXX)
Does somebody know how I could achieve this functionality?
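Not an answer from this thread, but a hedged sketch of one approach that is commonly suggested: the DynamoDB documentation notes that the ADD action only works on top-level attributes, and a nested path such as #val.#country is only valid once the parent map already exists, so one option is to ensure the map is present first and then increment the nested key with SET ... if_not_exists(...). The placeholder values (:emptyMap, :zero, :inc) are introduced here purely for illustration; this reuses the client, table and attribute names from the question and is untested against the original setup.

// Sketch only: two sequential UpdateItem calls, each individually atomic.
// Step 1: create the "value" map if it is not there yet (a no-op otherwise).
client.updateItem(UpdateItemRequest.builder()
        .tableName("my-table")
        .key(Map.of("id", AttributeValue.builder().s("123").build()))
        .updateExpression("SET itemId = if_not_exists(itemId, :itemId), #val = if_not_exists(#val, :emptyMap)")
        .expressionAttributeNames(Map.of("#val", "value"))
        .expressionAttributeValues(Map.of(
                ":itemId", AttributeValue.builder().s("1234").build(),
                ":emptyMap", AttributeValue.builder().m(Map.of()).build()))
        .build());

// Step 2: atomically increment the per-country counter inside the map.
client.updateItem(UpdateItemRequest.builder()
        .tableName("my-table")
        .key(Map.of("id", AttributeValue.builder().s("123").build()))
        .updateExpression("SET #val.#country = if_not_exists(#val.#country, :zero) + :inc")
        .expressionAttributeNames(Map.of("#val", "value", "#country", "Portugal"))
        .expressionAttributeValues(Map.of(
                ":zero", AttributeValue.builder().n("0").build(),
                ":inc", AttributeValue.builder().n("1").build()))
        .build());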

Related

Edge with id already exists - Gremlin

I am using the following to upsert an edge using Gremlin, following the recipe mentioned here. The code runs in a Lambda which talks to a cluster hosted in Amazon Neptune.
public void createEdge(final String id, final String label, final String fromId, final String toId, final Map<String, String> properties) {
    this.graphTraversalSource
            .V(fromId)                     // get vertex of id given for the source
            .as("fromVertex")              // label as fromVertex to be accessed later
            .V(toId)                       // get vertex of id given for destination
            .coalesce(                     // evaluates the provided traversals in order and returns the first traversal that emits at least one element
                    inE(label)             // check incoming edge of label given
                            .has(T.id, id) // also check for incoming edge of given ID
                            .where(        // conditional check to see if the edge exists
                                    outV()                  // get the out (source) vertex of the edge to check
                                            .as("fromVertex")),  // against staged vertex
                    addE(label)            // add edge if not present
                            .property(T.id, id)   // with given id
                            .from("fromVertex"))  // from source vertex
            .next();                       // end traversal to commit to graph
    log.info("Created edge {} if it doesn't exist between {} and {}", id, fromId, toId);
}
The only difference from the example recipe on the TinkerPop website is the following step that I have added, because in my case I want to create multiple edges between the same vertices with the same label but different IDs:
.has(T.id, id) // also check for incoming edge of given ID
But I am getting the following exception
Caused by: java.util.concurrent.CompletionException: org.apache.tinkerpop.gremlin.driver.exception.ResponseException:
{
    "detailedMessage": "Edge with id already exists: 123",
    "requestId": "dce46db8-1d0a-4717-a412-ee831973b177",
    "code": "ConstraintViolationException"
}
However, when I run the same query in the Gremlin console, it succeeds. The following query returns the same edge no matter how many times I execute it, but the same is not happening with the remote Gremlin server:
g.V().has('product','id','C1').as('v1').V().has('product','id','C2').coalesce(__.inE('rel').has('id', 'E1').where(outV().as('v1')), addE('rel').property('id', 'E1').from('v1')).next()
Can someone help me with this error? Thanks.
The query that you have above (the Lambda function code) and the query that you reference at the end of your question have one primary difference. Take the query at the end of your question:
g.V().has('product','id','C1').as('v1').
  V().has('product','id','C2').
  coalesce(
    __.inE('rel').has('id', 'E1').where(outV().as('v1')),
    addE('rel').property('id', 'E1').from('v1')).
  next()
This query uses has('id', 'E1') to check the id of the edge, not has(T.id, ...). has('id', ...) looks for a custom property called 'id', whereas has(T.id, ...) looks at the actual edge ID.
The same goes for property('id', ...). If you use property('id', ...), Neptune (specifically) is going to generate a UUID for the edge's T.id value when that edge is created, so you'll end up with many edges with different T.ids but the same id property.
NOTE: You only need the T.id notation when coding this in a language variant such as Java or Python. In the Gremlin console you would just use has(id, "myId") or property(id, "myId"), with no quotes around id.
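To make the distinction concrete, here is a hedged sketch in the Gremlin Java API (the same API used in the question's Lambda code); g is assumed to be a GraphTraversalSource, and "rel", "E1" and the product vertices are just the example values from the console query above:

import static org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.__.addE;
import static org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.__.inE;
import static org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.__.outV;
import org.apache.tinkerpop.gremlin.structure.T;

// Checks/sets a *custom property* named "id"; Neptune still generates a random
// UUID for the real edge ID, so the actual edge IDs differ on every creation.
g.V().has("product", "id", "C1").as("v1")
        .V().has("product", "id", "C2")
        .coalesce(
                inE("rel").has("id", "E1").where(outV().as("v1")),
                addE("rel").property("id", "E1").from("v1"))
        .next();

// Checks/sets the *actual edge ID* (T.id), which is what the Lambda code does.
g.V().has("product", "id", "C1").as("v1")
        .V().has("product", "id", "C2")
        .coalesce(
                inE("rel").has(T.id, "E1").where(outV().as("v1")),
                addE("rel").property(T.id, "E1").from("v1"))
        .next();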

change.after.val() returns full JSON object rather than value of the object

I'm working on a Firebase Cloud Function. When I log the value of change.after.val() I get a printout of a key-value pair
{ DCBPUTBPT5haNaMvMZRNEpOAWXf3: 'https://m.youtube.com/watch?v=t-7mQhSZRgM' }
rather than simply the value (the URL). Here's my code. What am I not understanding about .val()? Shouldn't "updated" simply contain the URL?
exports.fanOutLink = functions.database.ref('/userLink').onWrite((change, context) => {
    const updated = change.after.val();
    console.log(updated);
    return null;
});
If you want only the URL value, you should include a wildcard in your trigger path for the URL key:
exports.fanOutLink = functions.database.ref('/userLink/{keyId}').onWrite((change, context) => {
    console.log('keyId=', context.params.keyId);
    const updated = change.after.val();
    console.log(updated);
    return null;
});
In the Realtime Database, data is modeled as a JSON tree. The path specified in an event trigger identifies a node in the tree. The value of the node, being JSON, includes all child nodes. The change parameter for the trigger event refers to the value of the entire node.
I indicated above that you can change the trigger path to refer to a node one level down. An alternative is to access the children of the node using the child() method of DataSnapshot.
Without knowing your use-case, it's hard to be more specific about the trigger event path you should use. Keep in mind that the event fires when any element of the node value changes, whether it be a simple value at the root level, or a value of a child node. It is often the case that you want the trigger to be as specific as possible, to better identify what changed. That's where wildcards in the path are useful. As I showed in the code I posted, the string value of a wildcard is available from the context parameter.

DynamoDb - .NET Object Persistence Model - LoadAsync does not apply ScanCondition

I am fairly new in this realm and any help is appreciated.
I have a table in a DynamoDB database named Tenant, as below. "TenantId" is the hash primary key and I have no other keys. I also have a field named "IsDeleted", which is a boolean.
[Table structure screenshot]
I am trying to run a query to get the record with a specified "TenantId" while it is not deleted ("IsDeleted" == 0).
I can get a correct result by running the following code (returns 0 items):
var filter = new QueryFilter("TenantId", QueryOperator.Equal, "2235ed82-41ec-42b2-bd1c-d94fba2cf9cc");
filter.AddCondition("IsDeleted", QueryOperator.Equal, 0);
var dbTenant = await
    _genericRepository.FromQueryAsync(new QueryOperationConfig
    {
        Filter = filter
    }).GetRemainingAsync();
But no luck when I try to get it with the following code snippet; it returns the item even though it is deleted (returns 1 item):
var queryFilter = new List<ScanCondition>();
var scanCondition = new ScanCondition("IsDeleted", ScanOperator.Equal, new object[] { 0 });
queryFilter.Add(scanCondition);
var dbTenant2 = await
    _genericRepository.LoadAsync("2235ed82-41ec-42b2-bd1c-d94fba2cf9cc", new DynamoDBOperationConfig
    {
        QueryFilter = queryFilter,
        ConditionalOperator = ConditionalOperatorValues.And
    });
Any idea why the ScanCondition has no effect?
Later I also tried this (throws an exception):
var dbTenant2 = await
    _genericRepository.QueryAsync("2235ed82-41ec-42b2-bd1c-d94fba2cf9cc", new DynamoDBOperationConfig()
    {
        QueryFilter = new List<ScanCondition>()
        {
            new ScanCondition("IsDeleted", ScanOperator.Equal, 0)
        }
    }).GetRemainingAsync();
It throws with: "Message": "Must have one range key or a GSI index defined for the table Tenants"
Why does it complain about a range key or index? I'm calling:
public AsyncSearch<T> QueryAsync<T>(object hashKeyValue, DynamoDBOperationConfig operationConfig = null);
You simply can't query a table giving only a single primary key (only a hash key), because there is one and only one item for that primary key. The result of the Query would still be that single item, which is really a Load operation, not a Query. You can only query if you have a composite primary key (hash key (TenantId) plus range key) or a GSI (which doesn't impose key uniqueness and therefore accepts duplicate keys on the index).
Your second code snippet attempts to filter the Load. DynamoDBOperationConfig's QueryFilter has this description:
// Summary:
// Query filter for the Query operation operation. Evaluates the query results and
// returns only the matching values. If you specify more than one condition, then
// by default all of the conditions must evaluate to true. To match only some conditions,
// set ConditionalOperator to Or. Note: Conditions must be against non-key properties.
So it works only with Query operations.
Edit: So after reading your comments on this...
I don't think conditional expressions are for read operations; the AWS documentation indicates they are for put or update operations. However, I'm not entirely sure on this since I never needed to do a conditional Load. There is also no such thing as a CheckIfExists functionality in general; you have to read the item and see if it exists. A conditional load would still consume read throughput, so your only advantage would be NOT retrieving the item, in other words saving bandwidth (which is negligible for a single item).
My suggestion is to read it and filter it in your application layer; don't query for it. However, if you really need to, you can use TenantId as the hash key and IsDeleted as the range key. If you do so, you always have to query when you want to get a tenant, and with the query you can set the range key (IsDeleted) to 0 or 1. This isn't how I would do it; as I said, I would just read it and filter it in my application.
Another suggestion could be setting a GSI on the IsDeleted field and writing null when it is 0. This way the attribute only shows up in the index when it is 1; a GSI on such an attribute is called a sparse index. Later, if you need to get all the tenants that are deleted (IsDeleted = 1), you can simply scan that entire index without conditions. When you write null while it is 0, DynamoDB won't put the item in the index in the first place.

How to get the table name in AWS dynamodb trigger function?

I am new to AWS and working on creating a Lambda function in Python. The function will get the DynamoDB table stream and write to a file in S3, where the name of the file should be the name of the table.
Can someone please tell me how to get the table name from the trigger that is invoking the Lambda function?
Thanks for the help.
Since you mentioned you are new to AWS, I am going to answer descriptively.
I am assuming that you have set the 'Stream enabled' setting for your DynamoDB table to 'Yes', and have set this up as an event source for your Lambda function.
This is how I got the table name from the stream that invoked my lambda function -
import json

def lambda_handler(event, context):
    print(json.dumps(event, indent=2))  # Shows what's in the event object
    for record in event['Records']:
        ddbARN = record['eventSourceARN']
        ddbTable = ddbARN.split(':')[5].split('/')[1]
        print("DynamoDB table name: " + ddbTable)
    return 'Successfully processed records.'
Basically, the event object, which contains all the information about the particular DynamoDB stream record responsible for that Lambda invocation, contains a parameter eventSourceARN. This eventSourceARN is the ARN (Amazon Resource Name) that uniquely identifies the DynamoDB table from which the event occurred.
This is a sample value for eventSourceARN -
arn:aws:dynamodb:us-east-1:111111111111:table/test/stream/2020-10-10T08:18:22.385
Notice the table segment of the ARN above, test; this is the table name you are looking for.
In the line ddbTable = ddbARN.split(':')[5].split('/')[1] above, I have tried to split the entire ARN by ':' first, and then by '/' in order to get the value test. Once you have this value, you can call S3 APIs to write to a file in S3 with the same name.
Hope this helps.
Please note that eventSourceARN is not always provided. From my testing today, I didn't see eventSourceARN present in the record. You can also refer to these links:
Issue: https://github.com/aws/aws-sdk-js/issues/2226
API: https://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_streams_Record.html
One way to do it would be via pattern matching in Scala using a regex:
import scala.util.matching.Regex

val ddbArnRegex: Regex = """arn:aws:dynamodb:(.+):(.+):table/(.+)/stream/(.+)""".r

def parseTableName(ddbARN: String): Option[String] = {
  if (ddbARN == null) {
    None
  } else {
    ddbARN match {
      case ddbArnRegex(_, _, table, _) => Some(table)
      case _ => None
    }
  }
}

Angularfire v2: is there a way to get a collection beginning with a specified index number?

For instance:
$scope.items = $firebase(new Firebase("https://****.firebaseio.com").startAt(100).limit(100));
Starting at the 100th item in the Firebase and ending at 200? I know I can use a skip filter but that still seems to load the first 100 items, correct me if I'm wrong.
You are on the right track. You attach the startAt().limit() code to the Firebase ref and then pass that into $firebase as you have above.
However, the startAt method does not take a numeric offset, but instead a priority and optionally a record id (key).
So where you have put startAt(100), attempting to start after record 100, you would instead need to use the record ID, or prioritize the records in groups of 100.
For background, here's a simple paginator you can check out and steal ideas from. The heart of the example is in nextPage, where it calls startAt using the previous record id like so:
var lastKey = null; // page 0
this.ref.startAt(null, lastKey)
    .limit(this.limit + (lastKey ? 1 : 0))
    .once('value', /* callback to process data goes here */);
UPDATE
Another useful note here is that the records returned by angularFire contain their unique id as $id, which can be useful for determining the id of the last iterated item.
