Truncate DynamoDB or rewrite data via Data Pipeline - amazon-dynamodb

It is possible to dump DynamoDB via Data Pipeline and also to import data into DynamoDB. The import works, but the imported data is always appended to the data that already exists in the table.
So far I have found working examples that scan the table and delete items one by one or via BatchWriteItem, but for a large amount of data that is not a good option.
It is also possible to delete the table entirely and create it again, but with that approach the indexes are lost.
So the best way would be to overwrite the DynamoDB data during a Data Pipeline import, or to truncate the table somehow. Is that possible, and if so, how?

Truncate-table functionality is not available in DynamoDB, so consider deleting the table and creating it again.
Reason: DynamoDB charges you based on the ReadCapacityUnits and WriteCapacityUnits you consume. If you delete all items using the BatchWriteItem operation, every delete consumes WriteCapacityUnits (deleting a million small items, for example, consumes on the order of a million write units, while DeleteTable consumes none). So, to save those WriteCapacityUnits, it is better to delete the table and recreate it.
Steps to delete and create a DynamoDB table are as follows.
Delete the table via the AWS CLI:
aws dynamodb delete-table --table-name *tableName*
Delete the table via the AmazonDynamoDB API:
Sample request:
POST / HTTP/1.1
Host: dynamodb.<region>.<domain>
Accept-Encoding: identity
Content-Length: <PayloadSizeBytes>
User-Agent: <UserAgentString>
Content-Type: application/x-amz-json-1.0
Authorization: AWS4-HMAC-SHA256 Credential=<Credential>, SignedHeaders=<Headers>, Signature=<Signature>
X-Amz-Date: <Date>
X-Amz-Target: DynamoDB_20120810.DeleteTable

{
    "TableName": "Reply"
}
Create the table via the AmazonDynamoDB API:
Sample request:
POST / HTTP/1.1
Host: dynamodb.<region>.<domain>
Accept-Encoding: identity
Content-Length: <PayloadSizeBytes>
User-Agent: <UserAgentString>
Content-Type: application/x-amz-json-1.0
Authorization: AWS4-HMAC-SHA256 Credential=<Credential>, SignedHeaders=<Headers>, Signature=<Signature>
X-Amz-Date: <Date>
X-Amz-Target: DynamoDB_20120810.CreateTable

{
    "AttributeDefinitions": [
        {
            "AttributeName": "ForumName",
            "AttributeType": "S"
        },
        {
            "AttributeName": "Subject",
            "AttributeType": "S"
        },
        {
            "AttributeName": "LastPostDateTime",
            "AttributeType": "S"
        }
    ],
    "TableName": "Thread",
    "KeySchema": [
        {
            "AttributeName": "ForumName",
            "KeyType": "HASH"
        },
        {
            "AttributeName": "Subject",
            "KeyType": "RANGE"
        }
    ],
    "LocalSecondaryIndexes": [
        {
            "IndexName": "LastPostIndex",
            "KeySchema": [
                {
                    "AttributeName": "ForumName",
                    "KeyType": "HASH"
                },
                {
                    "AttributeName": "LastPostDateTime",
                    "KeyType": "RANGE"
                }
            ],
            "Projection": {
                "ProjectionType": "KEYS_ONLY"
            }
        }
    ],
    "ProvisionedThroughput": {
        "ReadCapacityUnits": 5,
        "WriteCapacityUnits": 5
    }
}
Summary: deleting the table and creating it again is the best solution.
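A minimal sketch of that delete-and-recreate cycle, assuming the AWS SDK for JavaScript (v2) and reusing your original table definition (reduced here to a hypothetical single-key schema):

const AWS = require('aws-sdk');
const dynamodb = new AWS.DynamoDB();

// Hypothetical, reduced table definition; in practice reuse the full
// definition of your real table, including all index definitions.
const tableDefinition = {
  TableName: 'Thread',
  AttributeDefinitions: [{ AttributeName: 'ForumName', AttributeType: 'S' }],
  KeySchema: [{ AttributeName: 'ForumName', KeyType: 'HASH' }],
  ProvisionedThroughput: { ReadCapacityUnits: 5, WriteCapacityUnits: 5 }
};

async function truncateByRecreate() {
  await dynamodb.deleteTable({ TableName: tableDefinition.TableName }).promise();
  // DeleteTable is asynchronous; wait until the table is really gone.
  await dynamodb.waitFor('tableNotExists', { TableName: tableDefinition.TableName }).promise();
  await dynamodb.createTable(tableDefinition).promise();
  await dynamodb.waitFor('tableExists', { TableName: tableDefinition.TableName }).promise();
}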

Related

Insert an item in DynamoDB only if the partition key exists

I want to insert an item only if the partition/hash key already exists. I am attempting to use a condition expression along with attribute_exists to achieve this, but I am getting unexpected results.
The example table:
{
    "TableName": "example",
    "KeySchema": [
        { "AttributeName": "PK", "KeyType": "HASH" },
        { "AttributeName": "SK", "KeyType": "RANGE" }
    ],
    "AttributeDefinitions": [
        { "AttributeName": "PK", "AttributeType": "S" },
        { "AttributeName": "SK", "AttributeType": "S" }
    ]
}
Insert an initial item with PK USER#123:
$ aws dynamodb put-item --table-name "example" \
--endpoint-url http://localhost:8000 \
--item '{"PK": {"S":"USER#123"}, "SK":{"S":"PROFILE"}}'
$ aws dynamodb scan --table-name "example" --endpoint-url http://localhost:8000
{
    "Items": [
        {
            "PK": {
                "S": "USER#123"
            },
            "SK": {
                "S": "PROFILE"
            }
        }
    ],
    "Count": 1,
    "ScannedCount": 1,
    "ConsumedCapacity": null
}
Attempt to insert another item with the same PK. This results in a ConditionalCheckFailedException. Based on the docs and various attribute_not_exists examples I have seen, I would expect this to succeed because the PK exists:
$ aws dynamodb put-item --table-name "example" \
--endpoint-url http://localhost:8000 \
--item '{"PK": {"S":"USER#123"}, "SK":{"S":"COMMENT#123"}}' \
--condition-expression "attribute_exists(PK)"
I would expect this to fail because the PK does not exist:
$ aws dynamodb put-item --table-name "example" \
--endpoint-url http://localhost:8000 \
--item '{"PK": {"S":"USER#321"}, "SK":{"S":"COMMENT#123"}}' \
--condition-expression "attribute_exists(PK)"
Instead, both of these operations fail.
If it helps, I am looking for the exact OPPOSITE of this Stack Overflow post.
There is no such concept as "the PK already exists", because there is no PK entity, only items, some of which may have that PK. A condition expression is evaluated only against the item with the exact same full primary key (partition key plus sort key) as the one being written; since neither of your writes targets a full key that already exists, attribute_exists(PK) fails in both cases.
If you really want to enforce this type of behavior, you'll need to put an actual item in the database to indicate to your application that this PK "exists". Pick whatever SK you want for the marker item. Then do a transactional write for your new item, with a ConditionCheck as part of it that the marker item already exists, as sketched below.
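For example, a minimal sketch with the Node.js DocumentClient, assuming the PROFILE item from the question serves as the marker item:

const AWS = require('aws-sdk');
const docClient = new AWS.DynamoDB.DocumentClient();

// Write the comment only if the user's marker item (here the PROFILE
// item from the question) already exists; otherwise the whole
// transaction fails with a TransactionCanceledException.
docClient.transactWrite({
  TransactItems: [
    {
      ConditionCheck: {
        TableName: 'example',
        Key: { PK: 'USER#123', SK: 'PROFILE' },
        ConditionExpression: 'attribute_exists(PK)'
      }
    },
    {
      Put: {
        TableName: 'example',
        Item: { PK: 'USER#123', SK: 'COMMENT#123' }
      }
    }
  ]
}).promise();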

How to insert empty string in DynamoDB using the output of a Lambda in Step Functions?

I'm trying to save the output of a Lambda, which calls Lex, to DynamoDB using Step Functions.
The intentName in a Lex response is sometimes null (unknown). The problem is that in the state (task) that saves the response to DynamoDB, this empty string causes an error from DynamoDB.
Is there any workaround, maybe using JsonPath or the Step Function's state machine definition, to insert null, or to skip that specific property, when writing to DynamoDB?
Here is the JSON for the state machine:
{
    "StartAt": "ProcessLex",
    "States": {
        "ProcessLex": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:<Region>:<Account Id>:function:getIntent",
            "ResultPath": "$.lexResult",
            "Next": "ChoiceIfIntent"
        },
        "SaveToDynamo": {
            "Type": "Task",
            "Resource": "arn:aws:states:::dynamodb:putItem",
            "Parameters": {
                "TableName": "MyTable",
                "Item": {
                    "dateTime": {
                        "S.$": "$.dateTime"
                    },
                    "intentName": {
                        "S.$": "$.lexResult.intentName"
                    },
                    "analysis": {
                        "M.$": "$.lexResult.sentimentResponse"
                    }
                }
            },
            "End": true
        },
        "Comprehend": {
            "Comment": "To be implemented later",
            "Type": "Pass",
            "End": true
        },
        "ChoiceIfIntent": {
            "Type": "Choice",
            "Choices": [
                {
                    "Variable": "$.lexResult.intentName",
                    "StringGreaterThanEquals": "",
                    "Next": "SaveToDynamo"
                }
            ],
            "Default": "Comprehend"
        }
    }
}
The problem is not the null value; the problem is that the DynamoDB PutItem API does not let you insert empty strings.
I know this is frustrating, but the quickest solution is to replace "" with NULL, for example in the Lambda before the item is saved, as in the sketch below.
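A minimal sketch of that replacement, assuming the Lambda holds the Lex response in a variable named lexResult (hypothetical):

// Replace empty strings with null so the item is stored with the
// DynamoDB NULL type instead of a forbidden empty string.
// lexResult is a hypothetical variable holding the Lex response.
for (const key of Object.keys(lexResult)) {
  if (lexResult[key] === '') {
    lexResult[key] = null;
  }
}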
The solution that I prefer is to set convertEmptyValues to true in your DynamoDB DocumentClient settings:
const dynamodb = new AWS.DynamoDB.DocumentClient({ convertEmptyValues: true })
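With that flag set, a put like the following stores the empty intentName as the DynamoDB NULL type instead of failing (a sketch, reusing the MyTable name from the state machine):

// convertEmptyValues: true turns '' into the NULL type on the wire.
dynamodb.put({
  TableName: 'MyTable',
  Item: { dateTime: '2020-01-01T00:00:00Z', intentName: '' }
}).promise();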
UPDATE
Since yesterday, DynamoDB supports empty values for non-key string attributes!
Take a look here.

Firebase Firestore REST Request - Query and Filter

I have a Firestore database in a Firebase project. I want to make REST requests to filter or query data with Postman. I'm using "https://firestore.googleapis.com/v1/projects/<your-project-id>/databases/(default)/documents/" to get the data at a known path in my database. Here is a sample of my database structure:
users > xxxxx > messages > yyyyy > "sent":"true"
where "users" and "messages" are collections, and "xxxxx" and "yyyyy" are auto-generated document IDs ("xxxxx" is an auto-generated user ID).
What I want to do is find the "xxxxx"s (users) which have a "sent":"true" entry.
I succeed if I know "xxxxx" and "yyyyy", but I don't know them, because they are auto-generated and differ from each other in my database, and I don't know how to query without them.
You need to run a Query, as explained here in the documentation of the REST API.
Since you want to query all the messages sub-collections of different user documents, you need to "simulate" a Collection Group Query in your StructuredQuery. The way to do that is to set the allDescendants element to true in the CollectionSelector.
So, issuing a POST HTTP request to the following URL will do the trick:
var URL = "https://firestore.googleapis.com/v1/projects/<your-project-id>/databases/(default)/documents:runQuery";
The body of the POST request shall contain:
{
    "structuredQuery": {
        "from": [{
            "collectionId": "messages",
            "allDescendants": true
        }],
        "where": {
            "fieldFilter": {
                "field": {
                    "fieldPath": "sent"
                },
                "op": "EQUAL",
                "value": {
                    "stringValue": "true"
                }
            }
        }
    }
}
Note that you also need to add a single-field index for the sent field to your Firestore DB.
Note also that, if your field sent is of type Boolean (and not String as shown in your question), you need to use a booleanValue element in your Value JSON element.
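A minimal sketch of issuing that request from Node.js with the global fetch API (the project id is a placeholder; a real request will usually also need an Authorization: Bearer <token> header):

const url = 'https://firestore.googleapis.com/v1/projects/<your-project-id>/databases/(default)/documents:runQuery';

const body = {
  structuredQuery: {
    from: [{ collectionId: 'messages', allDescendants: true }],
    where: {
      fieldFilter: {
        field: { fieldPath: 'sent' },
        op: 'EQUAL',
        value: { stringValue: 'true' }
      }
    }
  }
};

// Each matched document comes back as one element of a JSON array.
fetch(url, {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify(body)
})
  .then(res => res.json())
  .then(docs => console.log(docs));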
I am unable to get this to work for some reason.
I have a collection called dzs which has some documents with auto-generated IDs.
I want to query and find a document with a specific email address.
When I try this in Postman, it returns Error 400 Bad Request:
"structuredQuery": {
"from": [{
"collectionId": "dzs",
"allDescendants": true
}],
"where": {
"fieldFilter": {
"field": {
"fieldPath": "email"
},
"op": "EQUAL",
"value": {
"stringValue": "123#123.com",
}
}
}
}
Add the parent collection/document path to the URL:
var URL = "https://firestore.googleapis.com/v1/projects/<your-project-id>/databases/(default)/documents/users/xxxxx:runQuery";
Then make the collectionId "messages" and allDescendants false:
"structuredQuery": {
"from": [{
"collectionId": "messages",
"allDescendants": false
}],
"where": {
"fieldFilter": {
"field": {
"fieldPath": "sent"
},
"op": "EQUAL",
"value": {
"stringValue": "true",
}
}
}
}

JAGQL - Why do I need an id for a post call?

I'm using JAGQL to build a JSON:API compatible express server. The database behind it is MongoDB (jsonapi-store-mongodb). I posted my question here as well: https://github.com/holidayextras/jsonapi-store-mongodb/issues/59
According to the JAGQL documentation (https://jagql.github.io/pages/project_setup/resources.html#generateid):
generateId
By default, the server autogenerates a UUID for resources which are created without specifying an ID. To disable this behavior (for example, if the database generates an ID by auto-incrementing), set generateId to false. If the resource's ID is not a UUID, it is also necessary to specify an id attribute with the correct type. See /examples/resorces/autoincrement.js for an example of such a resource.
But when I send a POST request to one of my resources, I get this:
"jsonapi": {
"version": "1.0"
},
"meta": {},
"links": {
"self": "/myresource"
},
"errors": [
{
"status": "403",
"code": "EFORBIDDEN",
"title": "Param validation failed",
"detail": [
{
"message": "\"id\" is required",
"path": [
"id"
],
"type": "any.required",
"context": {
"key": "id",
"label": "id"
}
}
]
}
]
What am I missing?
See here for more details: https://github.com/jagql/framework/issues/106
In your resource definition, you want to add primaryKey: 'uuid':
{
    resource: 'things',
    handlers,
    primaryKey: 'uuid',
    attributes: {
        ...
    }
}
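For context, a minimal sketch of where that option sits in a full resource registration, assuming JAGQL's jsonApi.define API and a hypothetical attribute schema:

const jsonApi = require('@jagql/framework');
const MongoStore = require('jsonapi-store-mongodb');

jsonApi.define({
  resource: 'things',
  handlers: new MongoStore({ url: 'mongodb://localhost:27017/jagql' }),
  primaryKey: 'uuid',
  attributes: {
    // hypothetical attribute; define your real schema here
    name: jsonApi.Joi.string()
  }
});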

Add Global Secondary Index to an existing table in DynamoDB using aws cli

I cannot seem to find an example of how to add a Global Secondary Index to an existing table in DynamoDB using the AWS CLI.
This is what I know so far from the docs.
Any pointers would be appreciated.
Here is the update-table documentation.
Example:
aws dynamodb update-table --table-name <tableName> --global-secondary-index-updates file://gsi-command.json
Create a JSON file with either an Update, Create, or Delete action.
Keep one of the actions (Update, Create, or Delete) from the sample JSON below and update the attribute definitions accordingly:
[
    {
        "Update": {
            "IndexName": "string",
            "ProvisionedThroughput": {
                "ReadCapacityUnits": long,
                "WriteCapacityUnits": long
            }
        },
        "Create": {
            "IndexName": "string",
            "KeySchema": [
                {
                    "AttributeName": "string",
                    "KeyType": "HASH"|"RANGE"
                }
                ...
            ],
            "Projection": {
                "ProjectionType": "ALL"|"KEYS_ONLY"|"INCLUDE",
                "NonKeyAttributes": ["string", ...]
            },
            "ProvisionedThroughput": {
                "ReadCapacityUnits": long,
                "WriteCapacityUnits": long
            }
        },
        "Delete": {
            "IndexName": "string"
        }
    }
    ...
]
There is a small section in the Options section of the update-table documentation that mentions the required options specific to creating a new global secondary index: the attribute-definitions option must include the key elements of the new index. Just adding that option to the end of the example provided by @notionquest should do the trick.
aws dynamodb update-table --table-name <tableName> --global-secondary-index-updates file://gsi-command.json --attribute-definitions AttributeName=<attributeName>,AttributeType=<attributeType>
Creating global secondary indexes in existing tables:
Use this CLI command and JSON file for the update.
aws dynamodb update-table --table-name sample --cli-input-json file://gsi-update.json --endpoint-url http://localhost:8000
Save the arguments in JSON format:
{
    "AttributeDefinitions": [
        {
            "AttributeName": "String",
            "AttributeType": "S"
        },
        {
            "AttributeName": "String",
            "AttributeType": "S"
        }
    ],
    "GlobalSecondaryIndexUpdates": [
        {
            "Create": {
                "IndexName": "index-name",
                "KeySchema": [
                    {
                        "AttributeName": "String",
                        "KeyType": "HASH"
                    },
                    {
                        "AttributeName": "String",
                        "KeyType": "RANGE"
                    }
                ],
                "Projection": {
                    "ProjectionType": "ALL"
                },
                "ProvisionedThroughput": {
                    "ReadCapacityUnits": 5,
                    "WriteCapacityUnits": 5
                }
            }
        }
    ]
}
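The new index is backfilled in the background and cannot be queried until it becomes active. A minimal sketch for checking its status from Node.js, reusing the sample table and local endpoint from the command above:

const AWS = require('aws-sdk');
const dynamodb = new AWS.DynamoDB({ endpoint: 'http://localhost:8000' });

// The index is queryable once its IndexStatus is ACTIVE.
dynamodb.describeTable({ TableName: 'sample' }).promise()
  .then(data => {
    for (const gsi of data.Table.GlobalSecondaryIndexes || []) {
      console.log(gsi.IndexName, gsi.IndexStatus);
    }
  });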
