Is there any API in DynamoDB to update a batch of items? There is an API to write new items in batches (BatchWriteItem) and update single item using UpdateItem, but is it possible to update multiple items in one call?
There is no batch update item API available in DynamoDB at the moment.
DynamoDB API operations list
I know this is an old question by now, but DynamoDB has since added a transactions API that supports updates:
Update — Initiates an UpdateItem operation to edit an existing item's attributes or add a new item to the table if it does not already exist. Use this action to add, delete, or update attributes on an existing item conditionally or without a condition.
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/transaction-apis.html
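For what it's worth, here is a minimal sketch of how several Update actions could be grouped into one TransactWriteItems call with boto3. The helper names, the `id` key attribute, and the string-typed values are assumptions for illustration; `client` is expected to be a `boto3.client('dynamodb')` object.

```python
def build_update_actions(table_name, item_ids, attribute, new_value):
    """Return one TransactWriteItems 'Update' action per item id."""
    return [
        {
            "Update": {
                "TableName": table_name,
                # Assumption: the table's key is a string attribute named 'id'.
                "Key": {"id": {"S": item_id}},
                "UpdateExpression": "SET #attr = :val",
                "ExpressionAttributeNames": {"#attr": attribute},
                "ExpressionAttributeValues": {":val": {"S": new_value}},
            }
        }
        for item_id in item_ids
    ]


def update_in_transaction(client, table_name, item_ids, attribute, new_value):
    """Send all updates as one transaction: every update applies or none does."""
    actions = build_update_actions(table_name, item_ids, attribute, new_value)
    return client.transact_write_items(TransactItems=actions)
```

A call would look like `update_in_transaction(client, 'Your-Table-Name', ['id1', 'id2'], 'status', 'New_Value')`. DynamoDB caps the number of actions per transaction, so larger sets have to be split; check the current limit in the documentation linked above.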
I reached this thread with a similar query; I hope this helps.
DynamoDB supports batch statement execution (BatchExecuteStatement), which is described in the documentation. In boto3 it is available on the client object rather than the resource object. I used it with the PartiQL UPDATE statement that DynamoDB supports, which is also covered in the documentation.
Python code reference looks something like this:
import boto3

client = boto3.client('dynamodb')

# PartiQL UPDATE also has an optional RETURNING [ALL|MODIFIED] [NEW|OLD] * clause.
batch = [
    "UPDATE users SET active='N' WHERE email='<user_email>'",
    "UPDATE users ...",
]  # Limit to 25 per batch
request_items = [{'Statement': _stat} for _stat in batch]
batch_response = client.batch_execute_statement(Statements=request_items)
This is minimal code. You can use multi-threading to execute multiple batches at once.
With PartiQL you can execute batch insert and update just like SQL.
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/ql-reference.multiplestatements.batching.html
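To go beyond one batch, something along these lines could split the statements into groups of 25 and run the groups on a thread pool, as suggested above. This is only a sketch: the helper names and `max_workers` value are made up, the `users`/`active`/`email` names come from the example above, and `client` is assumed to be a `boto3.client('dynamodb')` object. It uses `?` parameters instead of building the statement string by hand.

```python
from concurrent.futures import ThreadPoolExecutor

BATCH_LIMIT = 25  # per-call statement limit quoted in the answer above


def chunked(items, size=BATCH_LIMIT):
    """Split a list into consecutive slices of at most `size` elements."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def deactivate_statement(email):
    """Parameterized PartiQL update; '?' placeholders avoid quoting issues."""
    return {
        "Statement": "UPDATE users SET active=? WHERE email=?",
        "Parameters": [{"S": "N"}, {"S": email}],
    }


def run_batch(client, statements):
    """Send one batch and return the per-statement errors, if any."""
    response = client.batch_execute_statement(Statements=statements)
    return [r["Error"] for r in response.get("Responses", []) if "Error" in r]


def deactivate_users(client, emails, max_workers=4):
    """Run all batches on a small thread pool and collect the errors."""
    batches = chunked([deactivate_statement(e) for e in emails])
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = pool.map(lambda batch: run_batch(client, batch), batches)
    return [error for errors in results for error in errors]
```

Note that a batch is not a transaction: individual statements can fail while others succeed, which is why the sketch collects the `Error` entries from each response.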
BatchWriteItem cannot update items. To update items, use the UpdateItem action.
BatchWriteItem operation puts or deletes multiple items in one or more tables
Reference: http://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_BatchWriteItem.html
I use DynamoDBMapper.batchSave(Iterable<? extends Object> objectsToSave) for this purpose.
No, there is currently no batch update. You can use single UpdateItem calls and run a workflow over them with something like AWS SWF or AWS Step Functions.
https://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/DynamoDB/DocumentClient.html
I use a DynamoDB update trigger. I made a template that tells me which items I should modify, put those items on a queue, and then read the queue messages in order to update the items one by one.
Not exactly a batch update, but I did this in a Python Lambda function just now:
import boto3

client = boto3.client('dynamodb')

def lambda_handler(event, context):
    idList = [
        "id1",
        "id2",
        # ...
        "id100",
    ]
    for itemID in idList:
        # One UpdateItem call per id.
        client.update_item(
            TableName='Your-Table-Name',
            Key={
                'id': {
                    'S': itemID
                }
            },
            UpdateExpression="set expressionToChange=:r",
            ExpressionAttributeValues={
                ':r': {'S': 'New_Value'}},
            ReturnValues="UPDATED_NEW")
    return
To get the idList, I downloaded the values as a CSV, copied them into VS Code, and did a find and replace with regex enabled (CMD-F, then click the .* button), with
find set to .* and replace set to "$0",
which replaces every line with itself in quotes followed by a comma.
So basically before:
id1
id2
id3
...
And after
"id1",
"id2",
"id3",
...
Just replace "idList = [...]" with your ids, "Your-Table-Name", "expressionToChange" and lastly, "New_Value".
Also, you will have to give your Lambda function permission to "Update Item" in DynamoDB or you will get an error.
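As an alternative to editing the ids by hand, the list could probably be read straight from the CSV. A rough sketch, assuming the ids are in the first column, the key attribute is a string named `id`, and `client` is a `boto3.client('dynamodb')` object; the function names are made up for illustration.

```python
import csv


def read_ids(csv_file):
    """Return the first column of every non-empty row as a list of ids."""
    return [row[0].strip() for row in csv.reader(csv_file) if row and row[0].strip()]


def update_all(client, table_name, item_ids, attribute, new_value):
    """Call UpdateItem once per id; return the ids whose update failed."""
    failed = []
    for item_id in item_ids:
        try:
            client.update_item(
                TableName=table_name,
                Key={"id": {"S": item_id}},
                UpdateExpression="SET #attr = :r",
                ExpressionAttributeNames={"#attr": attribute},
                ExpressionAttributeValues={":r": {"S": new_value}},
            )
        except Exception:  # e.g. botocore's ClientError; narrow this in real code
            failed.append(item_id)
    return failed
```

Usage would be something like `with open('ids.csv', newline='') as f: ids = read_ids(f)` followed by `update_all(client, 'Your-Table-Name', ids, 'expressionToChange', 'New_Value')`. Collecting the failed ids makes it possible to retry just those.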
I tried the sample code below to create new items in DynamoDB. According to the DynamoDB and boto3 docs, it adds the items in batches, but from the code it looks as if put_item is called on each iteration of the for loop. Any thoughts? Also, I understand that there is no batch operation for updating items, so update_item has to be called one item at a time?
import boto3

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('my-table')
with table.batch_writer() as writer:
    for item in somelist:
        writer.put_item(Item=item)
Note that you called the put_item() method on the writer object. This writer object is a batch writer: it is a wrapper around the original table object. The wrapper doesn't perform every put_item() request individually. Instead, as its name suggests, the batch writer collects batches of up to 25 writes in memory, and only on the 25th call does it send all 25 writes as one DynamoDB BatchWriteItem request.
Then, at the end of the loop, the writer is closed when the with block ends, and this sends the final partial batch as one last BatchWriteItem request.
As you can see, Python made efficient writing using batches very transparent and easy.
The boto3 batch writer buffers internally and sends each batch automatically. It’s like magic.
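To make the buffering less magical, here is a toy model of the idea. This is not boto3's implementation, just a small stand-in class (the name `BufferingWriter` and the `send_batch` callable are invented for illustration) that buffers puts and flushes every 25 items and again when the with block ends.

```python
class BufferingWriter:
    """Toy model of a batch writer: buffer puts, flush every `batch_size`."""

    def __init__(self, send_batch, batch_size=25):
        self._send_batch = send_batch  # callable standing in for BatchWriteItem
        self._batch_size = batch_size
        self._buffer = []

    def put_item(self, Item):
        self._buffer.append(Item)
        if len(self._buffer) >= self._batch_size:
            self._flush()

    def _flush(self):
        if self._buffer:
            self._send_batch(list(self._buffer))
            self._buffer.clear()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Leaving the with block sends whatever is still buffered.
        self._flush()


sent = []
with BufferingWriter(sent.append) as writer:
    for n in range(60):
        writer.put_item(Item={"id": n})

print([len(batch) for batch in sent])  # [25, 25, 10]
```

Sixty put_item() calls turn into three "requests": two full batches and one final partial batch of 10.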
I have a list of files that should be inserted or updated in DynamoDB, so I'm doing it this way:
var batch = _dynamoDbContext.CreateBatchWrite<MyEntity>();
batch.AddPutItems(myEntityList);
await batch.ExecuteAsync();
This works fine if the DynamoDB table is empty, but sometimes I need to update instead of insert, and then I get the following error:
An item with the same key has already been added. Key: Amazon.DynamoDBv2.DocumentModel.Key
How can I solve this? I need to use a batch for performance reasons.
You can use transactions to do inserts or updates, but they cost twice as much; otherwise you will need to update the items one by one.
Here's some more info on a previous post
DynamoDB Batch Update
First of all, I have a table structure like this:
Users: {
    UserId
    Name
    Email
    SubTable1: [{
        Column-111
        Column-112
    },
    {
        Column-121
        Column-122
    }]
    SubTable2: [{
        Column-211
        Column-212
    },
    {
        Column-221
        Column-222
    }]
}
As I am new to DynamoDB, I have a couple of questions about this:
1. Can I create a structure like this?
2. Can we set a primary key for subtables?
3. Luckily, I found a DynamoDB helper class to do some operations on my DB:
https://www.gopiportal.in/2018/12/aws-dynamodb-helper-class-c-and-net-core.html
But I don't know how to fetch only a particular subtable.
4. Can we fetch only specific columns from my main table? I also need suggestions for subtables.
Note: I am using .NET Core (C#) to communicate with DynamoDB.
Can I create structure like this?
Yes
Can we set primary key for subtables?
No, a hash key can be set only on top-level scalar attributes (String, Number, etc.).
Luckily, I found DynamoDB helper class to do some operations into my DB.
https://www.gopiportal.in/2018/12/aws-dynamodb-helper-class-c-and-net-core.html
But, don't know how to fetch only particular subtable
When you say subtables, I assume that you are referring to the Array datatype in the sample table above. To fetch data from a DynamoDB table with the Query API, you need the hash key. If you don't have the hash key, you can use the Scan API, which scans the entire table. The Scan API is a costly operation.
A GSI (Global Secondary Index) can be created to avoid the scan operation. However, it can be created on scalar attributes only. A GSI can't be created on an Array attribute.
Another option is to redesign the table to match your query access pattern.
Can we fetch only specific columns from my main table? Also need suggestion for subtables
Yes, you can fetch specific columns using ProjectionExpression. This way you get only the required attributes in the result set
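The question is about .NET, but to keep the examples on this page in one language, here is a small Python (boto3) sketch of a GetItem call with a ProjectionExpression. The helper names are invented, `UserId` is assumed to be a string hash key, and `client` is assumed to be a `boto3.client('dynamodb')` object. Every attribute goes through an ExpressionAttributeNames placeholder so that reserved words and names containing hyphens don't break the expression.

```python
def projection_request(table_name, user_id, attributes):
    """Build GetItem arguments that return only the listed top-level attributes."""
    # '#a0', '#a1', ... are placeholders so reserved words or names containing
    # hyphens can still be used in the expression.
    names = {"#a%d" % i: attr for i, attr in enumerate(attributes)}
    return {
        "TableName": table_name,
        "Key": {"UserId": {"S": user_id}},  # assumption: string hash key
        "ProjectionExpression": ", ".join(names),
        "ExpressionAttributeNames": names,
    }


def fetch_attributes(client, table_name, user_id, attributes):
    """Return the projected item, or None if no item has this key."""
    response = client.get_item(**projection_request(table_name, user_id, attributes))
    return response.get("Item")
```

For example, `fetch_attributes(client, 'Users', 'some-user-id', ['Name', 'SubTable1'])` would return only the name and the first "subtable" of that user.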
I am using Python client SDK for Datastore (google-cloud-datastore) version 1.4.0. I am trying to run a key-only query fetch:
query = client.query(kind = 'SomeEntity')
query.keys_only()
Query filter has EQUAL condition on field1 and GREATER_THAN_OR_EQUAL condition on field2. Ordering is done based on field2
For fetch, I am specifying a limit:
query_iter = query.fetch(start_cursor=cursor, limit=100)
page = next(query_iter.pages)
keyList = [entity.key for entity in page]
nextCursor = query_iter.next_page_token
Though there are around 50 entities satisfying this query, each fetch returns around 10-15 results and a cursor. I can use the cursor to get all the results; but this results in additional call overhead
Is this behavior expected?
A keys_only query is limited to 1000 entries in a single call. This operation counts as a single entity read.
For other limitations of Datastore, please refer to the detailed table in the documentation.
However, in your code you specified a cursor as the starting point for a subsequent retrieval operation. A query can also be limited without a cursor:
query = client.query()
query.keys_only()
tasks = list(query.fetch(limit=100))
For detailed instructions on how to use limits and cursors, please refer to the Google Cloud Datastore documentation.
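If the short pages keep coming back, the cursor loop from the question can at least be wrapped in one helper. A sketch using only the calls already shown in the question (`fetch`, `pages`, `next_page_token`); the function name is made up, and stopping on an empty page is an assumption about when the results are exhausted.

```python
def fetch_all_keys(query, page_size=100):
    """Collect keys page by page, following the cursor until results run out."""
    keys = []
    cursor = None
    while True:
        query_iter = query.fetch(start_cursor=cursor, limit=page_size)
        page = next(query_iter.pages, None)
        page_keys = [entity.key for entity in page] if page is not None else []
        keys.extend(page_keys)
        cursor = query_iter.next_page_token
        # Stop on an empty page or when no cursor is returned.
        if not page_keys or cursor is None:
            return keys
```

This doesn't remove the extra round trips, it only hides them behind a single call such as `keyList = fetch_all_keys(query)`.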
I know we can do an update in two operations: first get the primary key by querying the DB, and then update the item with a put operation. But does DynamoDB support updating in one operation, as a relational DB (such as MySQL) does? Two operations cost more time in network transfer.
My situation is as:
I have a table A with fields ID, Name, Location, Value.
And name+location can uniquely define a row.
So now I want to update the field "Value" when Name and Location satisfy some condition, but I don't know the ID. With MySQL, I could update it with "Update A set value = XXX where name = "abc" and location="123"".
But with DynamoDB, I first have to get the primary key ID.
Then I use the key to update the item. So my question is: does DynamoDB support an update operation similar to MySQL's?
Thanks!
Chen hit it on the nose. Joey, the situation you described (Get followed by a Put) is equivalent to 2 mysql functions
SELECT *
FROM TABLE
WHERE key = x
UPDATE TABLE
SET var = param
WHERE key = x
Do you see how the Select/GetItem isn't part of the update process? As long as you have the keys, you don't need to perform a query. I'm assuming you're performing the GetItem before the PutItem request because the PutItem replaces the entire item/row (i.e. deletes all attributes not specified in the Put request).
So if the original item looked like: < key-id=1, first-name=John, last-name=Doe, age=22>
and you perform a PutItem of: < key-id=1,location=NY>
The final item looks like: < key-id=1,location=NY>
If you perform an UpdateItem in place of PutItem then you would instead get:
< key-id=1, first-name=John, last-name=Doe, age=22, location=NY>
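The difference can be modelled with plain Python dicts. This is only an illustration of the replace-versus-merge semantics described above, not real DynamoDB calls; the two helper functions are stand-ins.

```python
def put_item(table, item):
    """PutItem semantics: the new item replaces any item with the same key."""
    table[item["key-id"]] = dict(item)


def update_item(table, key_id, changes):
    """UpdateItem semantics: set the given attributes, keep the rest."""
    item = table.setdefault(key_id, {"key-id": key_id})
    item.update(changes)


original = {"key-id": 1, "first-name": "John", "last-name": "Doe", "age": 22}

replaced = {1: dict(original)}
put_item(replaced, {"key-id": 1, "location": "NY"})

merged = {1: dict(original)}
update_item(merged, 1, {"location": "NY"})

print(replaced[1])  # {'key-id': 1, 'location': 'NY'}
print(merged[1])
# {'key-id': 1, 'first-name': 'John', 'last-name': 'Doe', 'age': 22, 'location': 'NY'}
```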
Here's a link for using UpdateItem with Java. There are also examples using .NET and PHP.
UpdateItem for Java
Correct me if I am wrong, but UpdateItem consumes only one operation: it looks up the hash key value and updates the item if it exists, or else creates a new item (for items up to 1 KB).
Here is the link for reference: http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/WorkingWithTables.html#CapacityUnitCalculations
Hope that helps
You don't need to get the primary key first. If you know the primary key, you don't need to get anything and you can simply use the UpdateItem API call to update your item.
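For the table in the question, one way this could look is sketched below. It rests on a big assumption that is not in the question: that table A is keyed on Name (partition key) and Location (sort key) rather than on ID, so the values you already know are the primary key. The function name is made up, the values are assumed to be strings, and `client` is assumed to be a `boto3.client('dynamodb')` object.

```python
def update_value(client, name, location, new_value):
    """Update only the Value attribute of the item identified by name + location."""
    return client.update_item(
        TableName="A",
        # Assumption: Name is the partition key and Location is the sort key.
        Key={"Name": {"S": name}, "Location": {"S": location}},
        # '#v' stands in for the attribute name in case it is a reserved word.
        UpdateExpression="SET #v = :v",
        ExpressionAttributeNames={"#v": "Value"},
        ExpressionAttributeValues={":v": {"S": new_value}},
        ReturnValues="UPDATED_NEW",
    )
```

With that key design, `update_value(client, 'abc', '123', 'XXX')` is a single request, much like the MySQL statement in the question. If the table has to stay keyed on ID, the ID still has to be looked up first.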
If that still isn't clear, please edit your question and add some code samples of what you are trying to do.