This is more of a concept clarification. I can get the actual counts with Boto3 by repeating the query with the LastEvaluatedKey of the previous response.
I want to count items matching certain conditions in DynamoDB. I am using Select=COUNT, which according to the docs [1] should return just the count of matched items, and I assumed that the response would not be paginated.
COUNT - Returns the number of matching items, rather than the matching
items themselves.
When I try it via the AWS CLI, my assumption seems correct (like the REST API samples in the docs [1]):
aws dynamodb query \
--table-name 'my-table' \
--index-name 'classification-date-index' \
--key-condition-expression 'classification = :col AND #dt BETWEEN :start AND :end' \
--expression-attribute-values '{":col" : {"S":"INTERNAL"}, ":start" : {"S": "2020-04-10"}, ":end" : {"S": "2020-04-25"}}' \
--expression-attribute-names '{"#dt" : "date"}' \
--select 'COUNT'
{
"Count": 18817,
"ScannedCount": 18817,
"ConsumedCapacity": null
}
But when I try it with Python 3 and Boto3, the response is paginated, and I have to repeat the query until LastEvaluatedKey is empty.
In [22]: table.query(IndexName='classification-date-index', Select='COUNT', KeyConditionExpression= Key('classification').eq('INTERNAL') & Key('date').between('2020-04-10', '2020-04-25'))
Out[22]:
{'Count': 5667,
'ScannedCount': 5667,
'LastEvaluatedKey': {'classification': 'INTERNAL',
'date': '2020-04-14',
's3Path': '<redacted>'},
'ResponseMetadata': {'RequestId': 'TH3ILO0P47QB7GAU9M3M98BKJVVV4KQNSO5AEMVJF66Q9ASUAAJG',
'HTTPStatusCode': 200,
'HTTPHeaders': {'server': 'Server',
'date': 'Sat, 25 Apr 2020 13:32:36 GMT',
'content-type': 'application/x-amz-json-1.0',
'content-length': '230',
'connection': 'keep-alive',
'x-amzn-requestid': 'TH3ILO0P47QB7GAU9M3M98BKJVVV4KQNSO5AEMVJF66Q9ASUAAJG',
'x-amz-crc32': '133035383'},
'RetryAttempts': 0}}
I expected the Boto3 SDK to behave like the AWS CLI, since the response seems to be smaller than 1 MB.
The docs are slightly conflicting.
"Paginating Table Query Results" [2] page says :
DynamoDB paginates the results from Query operations. With pagination,
the Query results are divided into "pages" of data that are 1 MB in
size (or less). An application can process the first page of results,
then the second page, and so on. A single Query only returns a result
set that fits within the 1 MB size limit.
While the "Query" [1] page says:
A single Query operation will read up to the maximum number of items
set (if using the Limit parameter) or a maximum of 1 MB of data and
then apply any filtering to the results using FilterExpression.
[1] https://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_Query.html
[2] https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Query.Pagination.html
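Just ran into this issue myself. The AWS CLI automatically sums the pages of the DynamoDB query. To stop it from doing this, add --no-paginate to your command, as listed on this page. The 1 MB limit applies to the data DynamoDB reads for each request, not to the size of the response, so a COUNT query is still paginated.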
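To get the same total from Boto3 that the CLI reports, the pages can be summed by hand. The following is a minimal sketch, not the SDK's own mechanism: `count_matching` is a hypothetical helper name, and `table` is assumed to be a Boto3 `Table` resource (anything with a compatible `query` method works).

```python
def count_matching(table, **query_kwargs):
    """Sum Count over every page of a Select='COUNT' query.

    `table` is assumed to behave like a boto3 DynamoDB Table resource.
    """
    total = 0
    kwargs = dict(query_kwargs, Select="COUNT")
    while True:
        page = table.query(**kwargs)
        total += page["Count"]
        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            # No LastEvaluatedKey means this was the final page.
            return total
        # Resume the next request where the previous page stopped.
        kwargs["ExclusiveStartKey"] = last_key
```

With the table from the question, this would be called as `count_matching(table, IndexName='classification-date-index', KeyConditionExpression=...)` using the same key condition as in the `table.query` call above.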
I need to delete one item from a list in a DynamoDB table, but I am getting the error below. Could someone help me resolve it?
aws dynamodb delete-item \
--table-name dev-table \
--key '{"Environment":{"S":"all"}}' \
--expression-attribute-names '{"#v": "Values"}' \
--expression-attribute-values '{":vals": {"L": [ { "S": "test"}]}}'
Error:
An error occurred (ValidationException) when calling the DeleteItem operation: ExpressionAttributeNames can only be specified when using expressions
I could not yet figure out how to do it with the CLI, but here is a Lambda function that does it:
import boto3

REGION = 'us-east-1'
dynamodb = boto3.resource('dynamodb', REGION)
env_table = dynamodb.Table('dev-table')


def get_index(totallist):
    # Find the position of 'test' in the Values list of the 'all' item.
    for item in totallist:
        if item['Environment'] == 'all':
            return item['Values'].index('test')
    return None


def lambda_handler(event, context):
    # Note: a single scan() call returns at most one page (1 MB) of items.
    totallist = env_table.scan()['Items']
    index = get_index(totallist)
    if index is None:
        return  # no item with Environment == 'all' was found
    response = env_table.update_item(
        Key={'Environment': 'all'},
        UpdateExpression=f"REMOVE #v[{index}]",
        ExpressionAttributeNames={'#v': 'Values'},
    )
    return
Make sure you attach an IAM role to the Lambda function that allows access to DynamoDB.
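Since the key of the item is already known, the scan can be avoided by reading the item directly. The following is a sketch under stated assumptions rather than a tested replacement for the Lambda above: `remove_from_list` is a hypothetical helper, `table` is assumed to be a Boto3 `Table` resource, and the condition expression is an added safeguard that makes the update fail if the list changed between the read and the write.

```python
def remove_from_list(table, key, attr, value):
    """Remove the first occurrence of `value` from the list attribute `attr`.

    `table` is assumed to behave like a boto3 DynamoDB Table resource.
    Returns None when the item or the value is missing.
    """
    item = table.get_item(Key=key).get("Item")
    if item is None or value not in item.get(attr, []):
        return None  # nothing to remove
    index = item[attr].index(value)
    return table.update_item(
        Key=key,
        UpdateExpression=f"REMOVE #v[{index}]",
        # Only remove the element if it is still the value we looked up.
        ConditionExpression=f"#v[{index}] = :val",
        ExpressionAttributeNames={"#v": attr},
        ExpressionAttributeValues={":val": value},
        ReturnValues="ALL_NEW",
    )
```

For the table in the question, the call would be `remove_from_list(env_table, {'Environment': 'all'}, 'Values', 'test')`.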
It sounds like what you need is UpdateItem. In DynamoDB, an item is a record in the table. To modify a list within a record, you need to use the UpdateItem request.
Your request should look like the one below.
aws dynamodb update-item \
--table-name dev-table \
--key '{"Environment":{"S":"all"}}' \
--update-expression "SET #v = :vals" \
--expression-attribute-names '{"#v": "Values"}' \
--expression-attribute-values '{":vals": {"L": [ { "S": "test"}]}}' \
--return-values ALL_NEW
Here's an example request.
For example, there is a table with a field _autoScope.
When I query the data with no filters or conditions (scan), I receive data and can confirm that _autoScope is part of it. Similarly, there is no issue putting an item into DynamoDB either.
However, I tried this and it failed:
$ aws dynamodb scan --table-name ModelDefinition --endpoint-url $ENDPOINT_URL --filter-expression '_autoScope = :val' --expression-attribute-values file://values.json
An error occurred (ValidationException) when calling the Scan operation: Invalid FilterExpression: Syntax error; token: "_", near: "_autoScope"
And the docs don't say much about naming rules for field names either: https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/HowItWorks.NamingRulesDataTypes.html
Is there a workaround for this issue?
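An attribute name written directly in an expression must begin with a letter, so a name like _autoScope needs a placeholder. You should use ExpressionAttributeNames: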
aws dynamodb scan \
--table-name ModelDefinition \
--endpoint-url $ENDPOINT_URL \
--filter-expression '#nam = :val' \
--expression-attribute-values file://values.json \
--expression-attribute-names '{"#nam":"_autoScope"}'
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Expressions.ExpressionAttributeNames.html
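The same placeholder approach carries over to Boto3. The following sketch builds the keyword arguments for a low-level client `scan` call; `filter_with_placeholders` is a hypothetical helper, and the string value passed to it is an assumption, because the contents of values.json are not shown in the question.

```python
def filter_with_placeholders(attr_name, value):
    """Build scan/query kwargs that never put the raw attribute name
    into the expression itself (low-level client value format)."""
    return {
        "FilterExpression": "#attr = :val",
        # The real attribute name only appears here, where any
        # leading underscore or reserved word is allowed.
        "ExpressionAttributeNames": {"#attr": attr_name},
        "ExpressionAttributeValues": {":val": {"S": value}},
    }


kwargs = filter_with_placeholders("_autoScope", "some-value")  # value is hypothetical
print(kwargs["ExpressionAttributeNames"])
```

The result could then be passed to a client as `client.scan(TableName='ModelDefinition', **kwargs)`.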
Here is my configuration:
Table Name: MY_TABLE
Primary partition key method (String)
Primary sort key path (String)
and I would like to query against two fields:
1. method (Primary partition key): GET
2. path (Primary sort key): /greet/v1/hello
I have used '#pathKey' because 'path' is a reserved keyword. (Similar for #methodKey)
aws dynamodb query --table-name MY_TABLE \
--key-condition-expression '#pathKey=:path1 AND #methodKey=:method1' \
--expression-attribute-names '{"#pathKey":"path"}' \
--expression-attribute-names '{"#methodKey":"method"}' \
--expression-attribute-values '{":method1":{"S":"GET"}}' \
--expression-attribute-values '{":path1":{"S":"/greet/v1/hello"}}'
But while doing so, I am getting the below error:
An error occurred (ValidationException) when calling the Query operation: Invalid KeyConditionExpression: An expression attribute name used in the document path is not defined; attribute name: #pathKey
Please note that I don't want to use an external JSON file to pass parameters; this needs to run on the command line.
You should provide all expression attribute names in the same CLI argument (the same is true for the values).
What happened is that --expression-attribute-names '{"#methodKey":"method"}' overrode the one before it. Hence the error about the missing #pathKey.
It should work for you this way:
aws dynamodb query --table-name MY_TABLE \
--key-condition-expression '#pathKey=:path1 AND #methodKey=:method1' \
--expression-attribute-names '{"#pathKey":"path", "#methodKey":"method"}' \
--expression-attribute-values '{":path1":{"S":"/greet/v1/hello"}, ":method1":{"S":"GET"}}'
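When the placeholders are assembled piece by piece in a script, they can be merged into one map before being serialized, so that each CLI option is passed exactly once. This is a minimal sketch; `merge_placeholders` is a hypothetical helper, not part of the AWS CLI or Boto3.

```python
import json


def merge_placeholders(*maps):
    """Combine several placeholder maps into one, refusing conflicts."""
    merged = {}
    for mapping in maps:
        for name, target in mapping.items():
            if name in merged and merged[name] != target:
                raise ValueError(f"conflicting definitions for {name}")
            merged[name] = target
    return merged


names = merge_placeholders({"#pathKey": "path"}, {"#methodKey": "method"})
values = merge_placeholders(
    {":path1": {"S": "/greet/v1/hello"}},
    {":method1": {"S": "GET"}},
)

# One JSON document per option, ready for a single
# --expression-attribute-names / --expression-attribute-values argument.
names_arg = json.dumps(names)
values_arg = json.dumps(values)
print(names_arg)
print(values_arg)
```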
I'm trying to execute a query against DynamoDB. The command line is as below:
aws dynamodb query --table-name History
--key-condition-expression "#k = :v1" --expression-attribute-names '{"#k":"Key"}' --expression-attribute-values file://query.json
Json file:
{ ":v1": { "S":"cef50df4-b063-cebb-e0c0-08d651599ab7"} }
My table "History" has the hash key column "Key". When I execute this command line, it always tells me:
Error parsing parameter '--expression-attribute-names': Expected: '=',
received: ''' for input: '{#k:Key}'
Can someone tell me how to correct it? Thanks a lot.
The problem is in the format of your JSON, '{"#k":"Key"}'.
Please change --expression-attribute-names '{"#k":"Key"}' to
--expression-attribute-names '{\"#k\":\"Key\"}' and try again.
Reference link: https://github.com/aws/aws-cli/issues/2298
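If the command is launched from a script rather than typed into a shell, one way to sidestep manual quote escaping is to pass the arguments as a list, so no shell parses the JSON. The sketch below only builds the argument list; `build_query_argv` is a hypothetical helper, and actually running it assumes the AWS CLI is installed and configured, which is why the `subprocess.run` call is left commented out.

```python
import json


def build_query_argv(table_name, key_condition, names, values):
    """Build an argument list for `aws dynamodb query`.

    json.dumps produces the JSON text; because the list is handed to the
    process directly, no shell quoting rules are applied to it.
    """
    return [
        "aws", "dynamodb", "query",
        "--table-name", table_name,
        "--key-condition-expression", key_condition,
        "--expression-attribute-names", json.dumps(names),
        "--expression-attribute-values", json.dumps(values),
    ]


argv = build_query_argv(
    "History",
    "#k = :v1",
    {"#k": "Key"},
    {":v1": {"S": "cef50df4-b063-cebb-e0c0-08d651599ab7"}},
)
print(argv)

# To execute (requires the AWS CLI and credentials):
# import subprocess
# subprocess.run(argv, check=True)
```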
I have a DynamoDB table on an Amazon server, and I would like to add one column to about half of its records.
The table has a composite key. I want to provide the range key as a parameter and update all the records that satisfy that parameter (regardless of the hash).
For the moment I have the following json files:
key.json
{
"ExchangeName": {"S": "TSX"}
}
expression-attribute-names.json
{
"#S":"Suffix"
}
expression-attribute-values.json
{
":s":{"S": "TO"}
}
The command line I use for this is the following:
aws dynamodb update-item \
--table-name stockInfoListCompanyAndTickers \
--key file://key.json --update-expression "SET #S = :s" \
--expression-attribute-names file://expression-attribute-names.json \
--expression-attribute-values file://expression-attribute-values.json --return-values ALL_NEW
It then returns the following: "An error occurred (ValidationException) when calling the UpdateItem operation: The provided key element does not match the schema"
I understand that I need to add the hash key to the keys; however, I am not sure how I am supposed to do it so that it is not used to search for a specific value.
Thanks