DynamoDB provisioned Read/Write Capacity Units exceeded unexpectedly

I run a program that sends data to dynamodb using api gateway and lambdas.
All the data sent to the db is small, and only sent from about 200 machines.
I'm still using the free tier, and sometimes, unexpectedly in the middle of the month, I start getting a higher provisioned read/write capacity, and from that day on I pay a constant amount each day until the end of the month.
Can someone tell from the image below what happened on 03/13 that caused this spike in the charts and made the provisioned capacity rise from 50 to 65?

I can't tell what happened based on those charts alone, but some things to consider:
You may not be aware of the new "PAY_PER_REQUEST" billing mode option for DynamoDB tables which allows you to mostly forget about manually provisioning your throughput capacity: https://aws.amazon.com/blogs/aws/amazon-dynamodb-on-demand-no-capacity-planning-and-pay-per-request-pricing/
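If you want to try on-demand mode, here is a minimal sketch (not from the original answer) using the AWS SDK for JavaScript v2; the table name 'my-table' is a placeholder, and the same BillingMode property is also available on CloudFormation/serverless table resources:
const AWS = require('aws-sdk');
const dynamodb = new AWS.DynamoDB();

dynamodb.updateTable({
  TableName: 'my-table',          // placeholder table name
  BillingMode: 'PAY_PER_REQUEST', // switch from provisioned throughput to on-demand
}).promise()
  .then(() => console.log('Switched to on-demand billing'))
  .catch(err => console.error(err));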
Also, this might not make sense for your use case, but for free-tier projects I've found it useful to proxy all writes to DynamoDB through an SQS queue (use the queue as an event source for a Lambda with a reserved concurrency that is compatible with your provisioned throughput). This is easy if your project is reasonably event-driven: build your DynamoDB request object/params, write it to SQS, then have the next step be a Lambda that is triggered from the DynamoDB stream (so you aren't expecting a synchronous response from the write operation in the first Lambda). Like this:
Example serverless config for SQS-triggered Lambda:
dynamodb_proxy:
  description: SQS event function to write to DynamoDB table '${self:custom.dynamodb_table_name}'
  handler: handlers/dynamodb_proxy.handler
  memorySize: 128
  reservedConcurrency: 95 # see custom.dynamodb_active_write_capacity_units
  environment:
    DYNAMODB_TABLE_NAME: ${self:custom.dynamodb_table_name}
  iamRoleStatements:
    - Effect: Allow
      Action:
        - dynamodb:PutItem
      Resource:
        - Fn::GetAtt: [ DynamoDbTable, Arn ]
    - Effect: Allow
      Action:
        - sqs:ReceiveMessage
        - sqs:DeleteMessage
        - sqs:GetQueueAttributes
      Resource:
        - Fn::GetAtt: [ DynamoDbProxySqsQueue, Arn ]
  events:
    - sqs:
        batchSize: 1
        arn:
          Fn::GetAtt: [ DynamoDbProxySqsQueue, Arn ]
Example write to SQS:
await sqs.sendMessage({
  MessageBody: JSON.stringify({
    method: 'putItem',
    params: {
      TableName: DYNAMODB_TABLE_NAME,
      Item: {
        ...attributes,
        created_at: {
          S: createdAt.toString(),
        },
        created_ts: {
          N: createdAtTs.toString(),
        },
      },
      ...conditionExpression,
    },
  }),
  QueueUrl: SQS_QUEUE_URL_DYNAMODB_PROXY,
}).promise();
SQS-triggered Lambda:
import retry from 'async-retry';
import { dynamodb } from '../lib/aws-clients';

const {
  DYNAMODB_TABLE_NAME
} = process.env;

export const handler = async (event) => {
  const message = JSON.parse(event.Records[0].body);
  if (message.params.TableName !== DYNAMODB_TABLE_NAME) {
    console.log(`DynamoDB proxy event table '${message.params.TableName}' does not match current table name '${DYNAMODB_TABLE_NAME}', skipping.`);
  } else if (message.method === 'putItem') {
    let attemptsTaken;
    await retry(async (bail, attempt) => {
      attemptsTaken = attempt;
      try {
        await dynamodb.putItem(message.params).promise();
      } catch (err) {
        if (err.code && err.code === 'ConditionalCheckFailedException') {
          // expected exception
          // if (message.params.ConditionExpression) {
          //   const conditionExpression = message.params.ConditionExpression;
          //   console.log(`ConditionalCheckFailed: ${conditionExpression}. Skipping.`);
          // }
        } else if (err.code && err.code === 'ProvisionedThroughputExceededException') {
          // retry
          throw err;
        } else {
          bail(err);
        }
      }
    }, {
      retries: 5,
      randomize: true,
    });
    if (attemptsTaken > 1) {
      console.log(`DynamoDB proxy event succeeded after ${attemptsTaken} attempts`);
    }
  } else {
    console.log(`Unsupported method ${message.method}, skipping.`);
  }
};

Related

Am I doing Firestore Transactions correctly?

I've followed the Firestore documentation on transactions, and I think I have it all sorted correctly, but in testing I'm noticing that my documents sometimes don't get updated properly. It is possible that multiple versions of the document could be submitted to the function in a very short interval, but I am only interested in keeping the most recent version.
My general logic is this:
New/Updated document is sent to cloud function
Check if document already exists in Firestore, and if not, add it.
If it does exist, check that it is "newer" than the instance in firestore, if it is, update it.
Otherwise, don't do anything.
Here is the code from my function that attempts to accomplish this. I would love some feedback on whether this is the correct/best way to do it:
const ocsFlight = req.body;
const procFlight = processOcsFlightEvent(ocsFlight);
try {
  const ocsFlightRef = db.collection(collection).doc(procFlight.fltId);
  const originalFlight = await ocsFlightRef.get();
  if (!originalFlight.exists) {
    const response = await ocsFlightRef.set(procFlight);
    console.log("Record Added: ", JSON.stringify(procFlight));
    res.status(201).json(response); // 201 - Created
    return;
  }
  await db.runTransaction(async (t) => {
    const doc = await t.get(ocsFlightRef);
    const flightDoc = doc.data();
    if (flightDoc.recordModified <= procFlight.recordModified) {
      t.update(ocsFlightRef, procFlight);
      console.log("Record Updated: ", JSON.stringify(procFlight));
      res.status(200).json("Record Updated");
      return;
    }
    console.log("Record isn't newer, nothing changed.");
    console.log("Record:", JSON.stringify("Same Flight:", JSON.stringify(procFlight)));
    res.status(200).json("Record isn't newer, nothing done.");
    return;
  });
} catch (error) {
  console.log("Error:", JSON.stringify(error));
  res.status(500).json(error.message);
}
The Bugs
First, you are trusting the value of req.body to be of the correct shape. If you don't already have type assertions that mirror your security rules for /collection/someFlightId in processOcsFlightEvent, you should add them. This is important because any database operations from the Admin SDKs will bypass your security rules.
The next bug is sending a response from your function inside the transaction. Once you send a response back to the client, your function is marked inactive: resources are severely throttled and any outstanding network requests may not complete or may crash. As a transaction may be retried a handful of times if a database collision is detected, you should make sure to only respond to the client once the transaction has properly completed.
You use set to write the new flight to Firestore. This can lead to trouble when working with transactions because a set operation will cancel all pending transactions at that location. If two function instances are fighting over the same flight ID, this can lead to the wrong data being written to the database.
In your current code, you return the result of the ocsFlightRef.set() operation to the client as the body of the HTTP 201 Created response. As the result of DocumentReference#set() is a WriteResult object, you would need to properly serialize it if you wanted to return it to the client, and even then it would not be very useful because you don't use it for the other response types. Instead, an HTTP 201 Created response normally includes where the resource was written to as the Location header with no body, but here we'll pass the path in the body. If you start using multiple database instances, including the relevant database may also be useful.
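For reference, the header-only variant would look roughly like this (a sketch using the Express-style response object that HTTP functions receive; the fixed code below returns the path in the body instead):
res.status(201)                 // 201 - Created
  .location(ocsFlightRef.path)  // where the resource was written
  .end();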
Fixing
The correct way to achieve the desired result would be to do the entire read->check->write process inside of a transaction and only once the transaction has completed, then respond to the client.
So we can send the appropriate response to the client, we can use the return value of the transaction to pass data out of it. We'll pass the type of the change we made ("created" | "updated" | "aborted") and the recordModified value of what was stored in the database. We'll return these along with the resource's path and an appropriate message.
In the case of an error, we'll return a message to show the user as message and the error's Firebase error code (if available) or general message as the error property.
// if not using express to wrangle requests, assert the correct method
if (req.method !== "POST") {
  console.log(`Denied ${req.method} request`);
  res.status(405) // 405 - Method Not Allowed
    .set("Allow", "POST")
    .end();
  return;
}
const ocsFlight = req.body;
try {
  // process AND type check `ocsFlight`
  const procFlight = processOcsFlightEvent(ocsFlight);
  const ocsFlightRef = db.collection(collection).doc(procFlight.fltId);
  const { changeType, recordModified } = await db.runTransaction(async (t) => {
    const flightDoc = await t.get(ocsFlightRef);
    if (!flightDoc.exists) {
      t.set(ocsFlightRef, procFlight);
      return {
        changeType: "created",
        recordModified: procFlight.recordModified
      };
    }
    // only parse the field we need rather than everything
    const storedRecordModified = flightDoc.get('recordModified');
    if (storedRecordModified <= procFlight.recordModified) {
      t.update(ocsFlightRef, procFlight);
      return {
        changeType: "updated",
        recordModified: procFlight.recordModified
      };
    }
    return {
      changeType: "aborted",
      recordModified: storedRecordModified
    };
  });
  switch (changeType) {
    case "updated":
      console.log("Record updated: ", JSON.stringify(procFlight));
      res.status(200).json({ // 200 - OK
        path: ocsFlightRef.path,
        message: "Updated",
        recordModified,
        changeType
      });
      return;
    case "created":
      console.log("Record added: ", JSON.stringify(procFlight));
      res.status(201).json({ // 201 - Created
        path: ocsFlightRef.path,
        message: "Created",
        recordModified,
        changeType
      });
      return;
    case "aborted":
      console.log("Outdated record discarded: ", JSON.stringify(procFlight));
      res.status(200).json({ // 200 - OK
        path: ocsFlightRef.path,
        message: "Record isn't newer, nothing done.",
        recordModified,
        changeType
      });
      return;
    default:
      throw new Error("Unexpected value for 'changeType': " + changeType);
  }
} catch (error) {
  console.log("Error:", JSON.stringify(error));
  res.status(500) // 500 - Internal Server Error
    .json({
      message: "Something went wrong",
      // if available, prefer a Firebase error code
      error: error.code || error.message
    });
}
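For completeness, here is a minimal sketch (not part of the original answer) of how this handler might be exposed as an HTTP function with the Firebase SDK; the export name upsertFlight is a placeholder, and collection, processOcsFlightEvent and the body above are assumed to exist as in the question:
const functions = require('firebase-functions');
const admin = require('firebase-admin');
admin.initializeApp();
const db = admin.firestore();

exports.upsertFlight = functions.https.onRequest(async (req, res) => {
  // the method check, transaction and responses shown above go here
});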
References
Cloud Firestore Transactions
Cloud Firestore Node SDK Reference
HTTP Event Cloud Functions

SQLite3 and knexjs result in Timeout acquiring a connection

I am trying to run the following code and getting an error
{ TimeoutError: Knex: Timeout acquiring a connection. The pool is
probably full. Are you missing a .transacting(trx) call?
Is there any way to make sqlite wait until the pool is empty? If not, what would you suggest?
const path = require('path');
const knex = require('knex')({
  client: 'sqlite3',
  useNullAsDefault: true,
  connection: {
    filename: path.join(__dirname, '/db/sqlite.db')
  }
});

knex('lorem')
  .insert({ rowid: 'Slaughterhouse Five' })

var z = 0;
while (z < 20000) {
  knex('lorem')
    .select('rowid')
    .then(result => {
      console.log('res', result);
    })
    .catch(error => console.log('Error in select', error));
  z++;
}
I would suggest not trying to run 20,000 parallel queries. At which point would you like to wait for the pool to be empty? You could run the queries one by one, or you could use, for example, Bluebird's .map(), which allows you to pass a concurrency parameter to limit how many queries are resolved at the same time.
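For illustration, a minimal sketch of the Bluebird approach (assuming the bluebird package is installed and knex is configured as in the question); the concurrency option caps how many selects are in flight at once:
const Promise = require('bluebird');

const indexes = Array.from({ length: 20000 }, (_, i) => i);

Promise.map(
  indexes,
  () => knex('lorem').select('rowid'),
  { concurrency: 1 } // raise this to match your pool size
)
  .then(results => console.log('done', results.length))
  .catch(error => console.log('Error in select', error));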

How can I prevent postgres deadlocks when running jest tests on CircleCI?

When I run my tests on CircleCI, it logs the following message many times, and eventually the tests fail because none of the database methods can retrieve data due to the deadlocks:
{
"message": "Error running raw sql query in pool.",
"stack": "error: deadlock detected\n at Connection.Object.<anonymous>.Connection.parseE (/home/circleci/backend/node_modules/pg/lib/connection.js:567:11)\n at Connection.Object.<anonymous>.Connection.parseMessage (/home/circleci/-backend/node_modules/pg/lib/connection.js:391:17)\n at Socket.<anonymous> (/home/circleci/backend/node_modules/pg/lib/connection.js:129:22)\n at emitOne (events.js:116:13)\n at Socket.emit (events.js:211:7)\n at addChunk (_stream_readable.js:263:12)\n at readableAddChunk (_stream_readable.js:250:11)\n at Socket.Readable.push (_stream_readable.js:208:10)\n at TCP.onread (net.js:597:20)",
"name": "error",
"length": 316,
"severity": "ERROR",
"code": "40P01",
"detail": "Process 1000 waits for AccessExclusiveLock on relation 17925 of database 16384; blocked by process 986.\nProcess 986 waits for RowShareLock on relation 17870 of database 16384; blocked by process 1000.",
"hint": "See server log for query details.",
"file": "deadlock.c",
"line": "1140",
"routine": "DeadLockReport",
"level": "error",
"timestamp": "2018-10-15T20:54:29.221Z"
}
This is the test command I run: jest --logHeapUsage --forceExit --runInBand
I also tried this: jest --logHeapUsage --forceExit --maxWorkers=2
Pretty much all of the tests run some sort of database function. This issue only started to occur when we added more tests. Has anyone else had this same issue?
Based on the error message, you got a deadlock because of RowShareLock.
This means that two transactions (let's call them transactionOne and transactionTwo) have each locked a resource which the other transaction requires.
Example:
transactionOne locks record in UserTable with userId = 1
transactionTwo locks record in UserTable with userId = 2
transactionOne attempts to update in UserTable for userId = 2, but since it is locked by another transaction - it waits for the lock to be released
transactionTwo attempts to update in UserTable for userId = 1, but since it is locked by another transaction - it waits for the lock to be released
Now the SQL engine detects that there is a deadlock and randomly picks one of the transactions and terminates it.
Let's say the SQL engine picks transactionOne and terminates it. This will result in the exception that is posted in the question.
transactionTwo is now allowed to perform its update in UserTable for the user with userId = 1.
transactionTwo completes successfully.
SQL engines are pretty fast at detecting deadlocks, and the exception will be instant.
This is the reason for the deadlocks.
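To make the scenario concrete, here is a minimal node-postgres sketch (the user_table name and ids are hypothetical, not from the question) that reproduces the pattern above: run concurrently, each transaction holds a row lock the other needs, and Postgres aborts one of them with code 40P01.
const { Pool } = require('pg');
const pool = new Pool(); // connection settings come from the usual PG* env vars

async function lockTwoUsers(client, firstId, secondId) {
  await client.query('BEGIN');
  try {
    // SELECT ... FOR UPDATE takes a row lock, just like the UPDATEs in the example above
    await client.query('SELECT * FROM user_table WHERE user_id = $1 FOR UPDATE', [firstId]);
    await client.query('SELECT * FROM user_table WHERE user_id = $1 FOR UPDATE', [secondId]);
    await client.query('COMMIT');
  } catch (e) {
    await client.query('ROLLBACK');
    throw e; // one of the two transactions lands here with the deadlock error
  }
}

async function demo() {
  const clientA = await pool.connect();
  const clientB = await pool.connect();
  try {
    // transactionOne locks user 1 then user 2; transactionTwo locks user 2 then user 1
    const results = await Promise.allSettled([
      lockTwoUsers(clientA, 1, 2),
      lockTwoUsers(clientB, 2, 1),
    ]);
    console.log(results.map(r => r.status)); // typically one 'fulfilled', one 'rejected' (40P01)
  } finally {
    clientA.release();
    clientB.release();
  }
}

demo().catch(console.error);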
Deadlocks can have different root causes.
I see you use the pg plugin. Make sure you use it right with the transactions: pg node-postgres transactions
I would suspect a few different root causes and their solutions:
Cause 1: Multiple tests are running against the same database instance
It may be that different CI pipelines are executing the same tests against the same Postgres instance.
Solution:
This is the least probable situation, but the CI pipeline should provision its own separate Postgres instance on each run.
Cause 2: Transactions are not handled with an appropriate catch that issues a ROLLBACK
This means that some transactions may stay alive and block others.
Solution: All transactions should have appropriate error handling.
const client = await pool.connect()
try {
  await client.query('BEGIN')
  // do what you have to do
  await client.query('COMMIT')
} catch (e) {
  await client.query('ROLLBACK')
  throw e
} finally {
  client.release()
}
Cause 3: Concurrency. For example: Tests are running in parallel, and they cause deadlocks.
We are writing scalable apps. This means that deadlocks are inevitable. We have to be prepared for them and handle them appropriately.
Solution: Use the "let's try again" strategy. When we detect a deadlock exception in our code, we just retry a finite number of times. This approach has been proven with all my production apps for more than a decade.
Solution with a helper function:
// Sample deadlock wrapper
const handleDeadLocks = async (action, currentAttempt = 1, maxAttempts = 3) => {
  try {
    return await action();
  } catch (e) {
    // detect that it is a deadlock. Not 100% sure whether this is deterministic enough
    const isDeadlock = e.stack?.includes("deadlock detected");
    const nextAttempt = currentAttempt + 1;
    if (isDeadlock && nextAttempt <= maxAttempts) {
      // try again
      return await handleDeadLocks(action, nextAttempt, maxAttempts);
    } else {
      throw e;
    }
  }
};

// our db access functions
const updateUserProfile = async (input) => {
  return handleDeadLocks(async () => {
    // do our db calls
  });
};
If the code becomes too complex or nested, we can try another solution using a higher-order function:
const handleDeadLocksHOF = (funcRef, maxAttempts = 3) => {
  return async (...args) => {
    let currentAttempt = 1;
    while (currentAttempt <= maxAttempts) {
      try {
        return await funcRef(...args);
      } catch (e) {
        const isDeadlock = e.stack?.includes("deadlock detected");
        if (isDeadlock && currentAttempt + 1 <= maxAttempts) {
          // try again
          currentAttempt += 1;
        } else {
          throw e;
        }
      }
    }
  };
};
// Instead of exporting updateUserProfile directly, we export the decorated function,
// so we can control how many retries we want or keep the default.
// old code:
export const updateUserProfile = (input) => {
  // our legacy, already implemented data access code
}
// new code:
const updateUserProfileLegacy = (input) => {
  // our legacy, already implemented data access code
}
export const updateUserProfile = handleDeadLocksHOF(updateUserProfileLegacy)

How to guarantee that the input of a smart contract is not manipulated?

Let's say that my DApp has the following (smart) contract:
module.exports = {
  winner: async function(value) {
    if (value === 10) {
    }
  }
}
Now the DApp user can do something which invokes the contract with some value, which can be 10 or not. The DApp determines whether value equals 10 or not. So far so good.
But now it seems that anyone with a valid secret (and some XAS sent to the DApp's side chain) can invoke the contract with a simple PUT request to api/<dappId>//transactions/unsigned with value set to whatever they want.
How to ensure that the value of value is set by the Dapp and can not be manipulated?
As far as I understand, Asch DApps run on an Express server with the cors middleware enabled, which means that anyone can make a request (GET, POST, PUT, etc.) from anywhere.
So one can invoke your contract easily with a script like the one shown below:
const axios = require('axios');

var fee = '10000000'
var data = {
  secret: "<your secret>",
  fee: fee,
  type: 1001, // the number for contractfile.function
  args: 1000  // a very high score
}

axios.put('http://<domain>:4096/api/dapps/<dappid>/transactions/unsigned', data)
  .then(function (response) {
    console.log(response);
  })
  .catch(function (error) {
    console.log(error);
  })
  .then(function () {
    // always executed
  });
Due to the above, it is not possible to guarantee that an input is not manipulated (sent from outside the DApp). Also see: https://github.com/AschPlatform/asch/issues/228

Log 'jsonPayload' in Firebase Cloud Functions

TL;DR;
Does anyone know if it's possible to use console.log in a Firebase/Google Cloud Function to log entries to Stack Driver using the jsonPayload property so my logs are searchable (currently anything I pass to console.log gets stringified into textPayload).
I have a multi-module project with some code running on Firebase Cloud Functions, and some running in other environments like Google Compute Engine. Simplifying things a little, I essentially have a 'core' module, and then I deploy the 'cloud-functions' module to Cloud Functions, 'backend-service' to GCE, which all depend on 'core' etc.
I'm using bunyan for logging throughout my 'core' module, and when deployed to GCE the logger is configured using '@google-cloud/logging-bunyan' so my logs go to Stack Driver.
Aside: Using this configuration in Google Cloud Functions is causing issues with Error: Endpoint read failed which I think is due to functions not going cold and trying to reuse dead connections, but I'm not 100% sure what the real cause is.
So now I'm trying to log using console.log(arg) where arg is an object, not a string. I want this object to appear in Stack Driver under the jsonPayload but it's being stringified and put into the textPayload field.
It took me a while, but I finally came across this example in the firebase functions samples repository. In the end I settled on something a bit like this:
const Logging = require('@google-cloud/logging');
const logging = new Logging();
const log = logging.log('my-func-logger');
const logMetadata = {
  resource: {
    type: 'cloud_function',
    labels: {
      function_name: process.env.FUNCTION_NAME,
      project: process.env.GCLOUD_PROJECT,
      region: process.env.FUNCTION_REGION
    },
  },
};
const logData = { id: 1, score: 100 };
const entry = log.entry(logMetadata, logData);
log.write(entry)
You can add a string severity property to logMetadata (e.g. "INFO" or "ERROR"). Here is the list of possible values.
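For example, a small variation of the metadata above with an explicit severity (a sketch reusing the names from the snippet; the message field is just illustrative):
const logMetadata = {
  severity: 'ERROR', // any value from the severity list linked above
  resource: {
    type: 'cloud_function',
    labels: { function_name: process.env.FUNCTION_NAME },
  },
};
const entry = log.entry(logMetadata, { message: 'something went wrong', id: 1 });
log.write(entry);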
Update for available node 10 env vars. These seem to do the trick:
labels: {
function_name: process.env.FUNCTION_TARGET,
project: process.env.GCP_PROJECT,
region: JSON.parse(process.env.FIREBASE_CONFIG).locationId
}
UPDATE: Looks like for Node 10 runtimes they want you to set env values explicitly during deploy. I guess there has been a grace period in place because my deployed functions are still working.
I ran into the same problem, and as stated in the comments on @wtk's answer, I would like to add a snippet below that replicates all of the default Cloud Function logging behavior I could find, including execution_id.
At least when using Cloud Functions with the HTTP trigger option, the following produced correct logs for me. I have not tested it for Firebase Cloud Functions.
// global
const { Logging } = require("@google-cloud/logging");
const logging = new Logging();
const Log = logging.log("cloudfunctions.googleapis.com%2Fcloud-functions");
const LogMetadata = {
  severity: "INFO",
  type: "cloud_function",
  labels: {
    function_name: process.env.FUNCTION_NAME,
    project: process.env.GCLOUD_PROJECT,
    region: process.env.FUNCTION_REGION
  }
};

// per request
const data = { foo: "bar" };
const traceId = req.get("x-cloud-trace-context").split("/")[0];
const metadata = {
  ...LogMetadata,
  severity: 'INFO',
  trace: `projects/${process.env.GCLOUD_PROJECT}/traces/${traceId}`,
  labels: {
    execution_id: req.get("function-execution-id")
  }
};
Log.write(Log.entry(metadata, data));
The github link in @wtk's answer should be updated to:
https://github.com/firebase/functions-samples/blob/2f678fb933e416fed9be93e290ae79f5ea463a2b/stripe/functions/index.js#L103
As it refers to the repository as of when the question was answered, and has the following function in it:
// To keep on top of errors, we should raise a verbose error report with Stackdriver rather
// than simply relying on console.error. This will calculate users affected + send you email
// alerts, if you've opted into receiving them.
// [START reporterror]
// To keep on top of errors, we should raise a verbose error report with Stackdriver rather
// than simply relying on console.error. This will calculate users affected + send you email
// alerts, if you've opted into receiving them.
// [START reporterror]
function reportError(err, context = {}) {
  // This is the name of the StackDriver log stream that will receive the log
  // entry. This name can be any valid log stream name, but must contain "err"
  // in order for the error to be picked up by StackDriver Error Reporting.
  const logName = 'errors';
  const log = logging.log(logName);
  // https://cloud.google.com/logging/docs/api/ref_v2beta1/rest/v2beta1/MonitoredResource
  const metadata = {
    resource: {
      type: 'cloud_function',
      labels: {function_name: process.env.FUNCTION_NAME},
    },
  };
  // https://cloud.google.com/error-reporting/reference/rest/v1beta1/ErrorEvent
  const errorEvent = {
    message: err.stack,
    serviceContext: {
      service: process.env.FUNCTION_NAME,
      resourceType: 'cloud_function',
    },
    context: context,
  };
  // Write the error log entry
  return new Promise((resolve, reject) => {
    log.write(log.entry(metadata, errorEvent), (error) => {
      if (error) {
        return reject(error);
      }
      resolve();
    });
  });
}
// [END reporterror]
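As a hedged usage sketch (not part of the original sample), reportError can be awaited from a catch block so the failure is reported before being rethrown; doWork and the user id are placeholders:
async function safeDoWork(userId) {
  try {
    return await doWork(); // placeholder for the real work
  } catch (err) {
    // report to Stackdriver Error Reporting, then rethrow so callers still see the failure
    await reportError(err, { user: userId });
    throw err;
  }
}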
