I am having trouble using AWS Boto3 to query DynamoDB with a hash key and a range key at the same time using the recommend KeyConditionExpression. I have attached an example query:
import boto3
from boto3 import dynamodb
from boto3.session import Session
dynamodb_session = Session(aws_access_key_id=AWS_KEY,
aws_secret_access_key=AWS_PASS,
region_name=DYNAMODB_REGION)
dynamodb = dynamodb_session.resource('dynamodb')
table=dynamodb.Table(TABLE_NAME)
request = {
'ExpressionAttributeNames': {
'#n0': 'hash_key',
'#n1': 'range_key'
},
'ExpressionAttributeValues': {
':v0': {'S': MY_HASH_KEY},
':v1': {'N': GT_RANGE_KEY}
},
'KeyConditionExpression': '(#n0 = :v0) AND (#n1 > :v1)',
'TableName': TABLE_NAME
}
response = table.query(**request)
When I run this against a table with the following scheme:
Table Name: TABLE_NAME
Primary Hash Key: hash_key (String)
Primary Range Key: range_key (Number)
I get the following error and I cannot understand why:
ClientError: An error occurred (ValidationException) when calling the Query operation: Invalid KeyConditionExpression: Incorrect operand type for operator or function; operator or function: >, operand type: M
From my understanding the type M would be a map or dictionary type and I am using a type N which is a number type and matches my table scheme for the range key. If someone could explain why this error is happening or I am also open to a different way of accomplishing the same query even if you cannot explain why this error exists.
The Boto 3 SDK constructs a Condition Expression for you when you use the Key and Attr functions imported from boto3.dynamodb.conditions:
response = table.query(
KeyConditionExpression=Key('hash_key').eq(hash_value) & Key('range_key').eq(range_key_value)
)
Reference: Step 4: Query and Scan the Data
Hope it helps
Adding this solution as the accepted answer did not address why the query used did not work.
TLDR: Using query on a Table resource in boto3 has subtle differences as opposed to using client.query(...) and requires a different syntax.
The syntax is valid for a query on a client, but not on a Table. The ExpressionAttributeValues on a table do not require you to specify the data type. Also if you are executing a query on a Table resource you do not have to specify the TableName again.
Working solution:
from boto3.session import Session
dynamodb_session = Session(aws_access_key_id=AWS_KEY,aws_secret_access_key=AWS_PASS,region_name=DYNAMODB_REGION)
dynamodb = dynamodb_session.resource('dynamodb')
table = dynamodb.Table(TABLE_NAME)
request = {
'ExpressionAttributeNames': {
'#n0': 'hash_key',
'#n1': 'range_key'
},
'ExpressionAttributeValues': {
':v0': MY_HASH_KEY,
':v1': GT_RANGE_KEY
},
'KeyConditionExpression': '(#n0 = :v0) AND (#n1 > :v1)',
}
response = table.query(**request)
I am the author of a package called botoful which might be useful to avoid dealing with these complexities. The code using botoful will be as follows:
import boto3
from botoful import Query
client = boto3.Session(
aws_access_key_id=AWS_KEY,
aws_secret_access_key=AWS_PASS,
region_name=DYNAMODB_REGION
).client('dynamodb')
results = (
Query(TABLE_NAME)
.key(hash_key=MY_HASH_KEY, range_key__gt=GT_RANGE_KEY)
.execute(client)
)
print(results.items)
Related
I'm attempting to send a batch of PartiQL statements in the NodeJS AWS SDK v3. The statement works fine for a single ExecuteStatementCommand, but the Batch command doesn't.
The statement looks like
const statement = `
SELECT *
FROM "my-table"
WHERE "partitionKey" = '1234'
AND "filterKey" = '5678'
`
This code snippet works as expected:
const result = await dynamodbClient.send(new ExecuteStatementCommand(
{ Statement: statement}
))
The batch snippet does not:
const result = await dynamodbClient.send(new BatchExecuteStatementCommand({
Statements: [
{
Statement: statement
}
]
}))
The batch call produces the following error:
"Code": "ValidationError",
"Message": "Select statements within BatchExecuteStatement must specify the primary key in the where clause."
Any insight is greatly appreciated. Thanks for taking the time to reading my question!
Seems like what I needed was a rubber duck.
DynamoDB primary keys consists of partition key + sort key. My particular table has a sort key, which is missing from the statement. Batch jobs cannot handle filtering of responses, and each statement must match a single item in the database.
I have a table as : AdvgCountries which has two columns
a. CountryId (String) (Parition Key)
b. CountryName(String) Sort Key
While creating the table , I created with only Partition Key and then later added a Global Secondary Index with Index name as:
CountryName-index
Type : GSI
Partition key : CountryId
Sort Key : CountryName
I am able to retrieve CountryName based upon CountryId but unable to retrieve CountryId based upon CountryName. Based upon my reading I found that there are options to do this by providing indexname but I get the following error:
botocore.exceptions.ClientError: An error occurred
(ValidationException) when calling the Query operation: Query
condition missed key schema element: CountryId
import boto3
import json
import os
from boto3.dynamodb.conditions import Key, Attr
def query_bycountryname(pCountryname, dynamodb=None):
if not dynamodb:
dynamodb = boto3.resource('dynamodb', endpoint_url="https://dynamodb.us-east-1.amazonaws.com")
table = dynamodb.Table('AdvgCountires')
print(f"table")
attributes = table.query(
IndexName="CountryName-index",
KeyConditionExpression=Key('CountryName').eq(pCountryname),
)
if 'Items' in attributes and len(attributes['Items']) == 1:
attributes = attributes['Items'][0]
print(f"before return")
return attributes
if __name__ == '__main__':
CountryName = "India"
print(f"Data for {CountryName}")
countries = query_bycountryname(CountryName)
for country in countries:
print(country['CountryId'], ":", country['CountryName'])
Any help is appreciated.
You can't be able to fetch primary key value based on sort key. DynamoDB does not work like this.
In Dynamodb, each item’s location is determined by the hash value of
its partition key.
The Query operation in Amazon DynamoDB finds items based on primary
key values.
KeyConditionExpression are used to write conditional statements by
using comparison operators that evaluate against a key and limit the
items returned. In other words, you can use special operators to
include, exclude, and match items by their sort key values.
I'm having some strange feeling abour sqlite3 parameters that I would like to expose to you.
This is my query and the fail message :
#query
'SELECT id FROM ? WHERE key = ? AND (userid = '0' OR userid = ?) ORDER BY userid DESC LIMIT 1;'
#error message, fails when calling sqlite3_prepare()
error: 'near "?": syntax error'
In my code it looks like:
// Query is a helper class, at creation it does an sqlite3_preprare()
Query q("SELECT id FROM ? WHERE key = ? AND (userid = 0 OR userid = ?) ORDER BY userid DESC LIMIT 1;");
// bind arguments
q.bindString(1, _db_name.c_str() ); // class member, the table name
q.bindString(2, key.c_str()); // function argument (std::string)
q.bindInt (3, currentID); // function argument (int)
q.execute();
I have the feeling that I can't use sqlite parameters for the table name, but I can't find the confirmation in the Sqlite3 C API.
Do you know what's wrong with my query?
Do I have to pre-process my SQL statement to include the table name before preparing the query?
Ooookay, should have looked more thoroughly on SO.
Answers:
- SQLite Parameters - Not allowing tablename as parameter
- Variable table name in sqlite
They are meant for Python, but I guess the same applies for C++.
tl;dr:
You can't pass the table name as a parameter.
If anyone have a link in the SQLite documentation where I have the confirmation of this, I'll gladly accept the answer.
I know this is super old already but since your query is just a string you can always append the table name like this in C++:
std::string queryString = "SELECT id FROM " + std::string(_db_name);
or in objective-C:
[#"SELECT id FROM " stringByAppendingString:_db_name];
This is how my table looks like:
toHash is my primary partition key and timestamp is the sort key.
So, when I am executing this code [For getting the reverse sorted list of timestamp]:
import boto3
from boto3.dynamodb.conditions import Key, Attr
client = boto3.client('dynamodb')
response = client.query(
TableName='logs',
Limit=1,
ScanIndexForward=False,
KeyConditionExpression="toHash = :X",
)
I get the following error:
botocore.exceptions.ClientError: An error occurred (ValidationException) when calling the Query operation: Invalid KeyConditionExpression: An expression attribute value used in expression is not defined; attribute value: :X
Am I doing something wrong here? Why isn't X considered a valid attribute value?
This is why I don't use boto directly anymore. The API for what should be simple stuff like this is nuts. #notionquest is correct, you need to add the ExpressionAttributeValues. I don't mess with these things anymore. I use dynamof. Heres an example with your query:
from boto3 import client
from dynamof import db as make_db
from dynamof import query
client = client('dynamodb', endpoint_url='http://localstack:4569')
db = make_db(client)
db(query(
table_name='logs',
conditions=attr('toHash').equals('somevalue')
))
dynamof build the KeyConditionExpression and ExpressionAttributeValues for you.
disclaimer: I wrote dynamof
Add ExpressionAttributeValues as mentioned below:-
response = client.query(
TableName='logs',
Limit=1,
ScanIndexForward=False,
KeyConditionExpression="toHash = :X",
ExpressionAttributeValues={":X" : {"S" : "somevalue"}}
)
I am trying to move form DynamoDB to DynamoDB2 to use tables with global secondary indices. I need to create a table and then batch-write items into it. Here's a block of tets code:
from boto.dynamodb2.fields import HashKey, RangeKey, GlobalAllIndex
from boto.dynamodb2.layer1 import DynamoDBConnection
from boto.dynamodb2.table import Table
from boto.dynamodb2.items import Item
import boto
conn = DynamoDBConnection(aws_access_key_id=<MYID>,aws_secret_access_key=<MYKEY>)
tables = conn.list_tables()
table_name = 'myTable001'
if table_name not in tables['TableNames']:
Table.create(table_name, schema=[HashKey('firstKey')], throughput={'read': 5, 'write': 2}, global_indexes=[
GlobalAllIndex('secondKeyIndex', parts=[HashKey('secondKey')], throughput={'read': 5, 'write': 3})], connection=conn)
table = Table(table_name, connection=conn)
with table.batch_write() as batch:
batch.put_item(data={'firstKey': 'fk01', 'secondKey':'sk001', 'message': '{"firstKey":"fk01", "secondKey":"sk001", "comments": "fk01-sk001"}'})
# ...
batch.put_item(data={'firstKey': 'fk74', 'secondKey':'sk112', 'message': '{"firstKey":"fk74", "secondKey":"sk012", "comments": "fk74-sk012"}'})
When I run this code for the 1st time with a new value of table_name, I get the following error at the last line of the block:
boto.exception.JSONResponseError: JSONResponseError: 400 Bad Request
{u'message': u'Requested resource not found', u'__type': u'com.amazonaws.dynamodb.v20120810#ResourceNotFoundException'}
When I run it one more time, it get executed fine. I suspect that the reason is simply that the table is still being created when I run it for the first time. How do I check for table's status in DDB2? In DDB I used table.status but this does not seem to be available in DDB2. What should I use instead?
UPDATE: Based on final response here, the right way to extract table status is:
tdescr = conn.describe_table(tName)
print "%s" % ((tdescr['Table'])['TableStatus'])
Here are the other elements of the description dictionary:
for key in tdescr['Table'].keys():
print key
GlobalSecondaryIndexes
AttributeDefinitions
ProvisionedThroughput
TableSizeBytes
TableName
TableStatus
KeySchema
ItemCount
CreationDateTime
You can use conn.describe_table('table') to fetch the details about table & then check for the TableStatus field in the returned output.