How to implement outbox pattern in Cosmos DB - azure-cosmosdb

I'm looking to implement support for the outbox pattern in Cosmos DB.
However, Cosmos DB doesn't seem to support transactions across collections.
Then how do I do it?
I've been considering a few approaches to implement this:
1. Use Service Bus transactions
Within a Service Bus transaction scope, send the message (not yet committed), do the Cosmos DB update and, if it succeeds, commit the Service Bus transaction so the message becomes available to subscribers.
2. Use triggers to insert rows into the outbox collection
As inserts/updates happen, Cosmos DB triggers insert the respective messages into the outbox collection, and from then on it's business as usual.
3. Use triggers to execute Azure Functions
Create Azure Functions as Cosmos DB triggers. I almost like this, but it would be so much better to get a message straight to Service Bus.
4. Use a data pump
Add two fields, UpdateTimestamp and OutboxMessageTimestamp. When a record is updated, so is its UpdateTimestamp.
Some process looks for records in which these two don't match, and for each of those creates a notification message and relays it to the respective queues or topics.
Of course, it then updates the second timestamp so they match.
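A minimal sketch of the data pump idea, assuming an in-memory list stands in for the collection and a plain callable stands in for the queue or topic client. The names Record, pump_once and publish are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Record:
    id: str
    payload: str
    update_timestamp: int                           # bumped on every write
    outbox_message_timestamp: Optional[int] = None  # last update that was relayed


def pump_once(records: List[Record], publish: Callable[[dict], None]) -> int:
    """Relay one message for every record whose two timestamps differ."""
    relayed = 0
    for record in records:
        if record.outbox_message_timestamp == record.update_timestamp:
            continue  # this update was already relayed
        seen = record.update_timestamp
        # Publish first, mark second: a crash in between produces a duplicate
        # message on the next run rather than a lost one.
        publish({"id": record.id, "payload": record.payload, "updated": seen})
        record.outbox_message_timestamp = seen
        relayed += 1
    return relayed


# Hypothetical demo data: "a" has never been relayed, "b" is up to date.
sent: List[dict] = []
store = [Record("a", "created", 1), Record("b", "created", 2, 2)]
first_run = pump_once(store, sent.append)
second_run = pump_once(store, sent.append)
```

Because the message is published before the second timestamp is written, consumers should expect the occasional duplicate and handle it idempotently.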
Other ideas on how to do this?

In general, you store things in your Cosmos DB collection. Then you have the change feed send these changes to some observer (let's say an Azure Function). Your Azure Function can then do whatever you need: put the change in a queue for other consumers, save it into another collection projected differently, etc. Within your Azure Function you should implement your own dead-letter queue for failures that are not related to the function runtime (for example, a write to another collection failing due to an id conflict).
[UPDATE]
Let me add a bit more as a response to your comment.
From my experience, doing things atomically in distributed systems boils down to:
1. Always do things in the same order.
2. Make the second step idempotent (ensuring you can repeat it any number of times and get the same result).
3. Once the first step has succeeded, repeat the second step until it is successful.
So, if you want to send an email when something is saved into Cosmos DB, you could:
1. Save the record in Cosmos DB.
2. Have an Azure Function listen to the change feed.
3. Once you receive the inserted document, send the email (a more robust solution would put it in a queue from which a dedicated consumer sends emails).
An alternative would be to put the initial command (to save the record) in a queue and then have two consumers (one for saving and one for sending emails), but then you have a problem with ordering (if that's important for you).
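The "idempotent second step, repeated until successful" rule can be sketched roughly as follows. This is an illustration under assumptions, not Azure Functions code: make_idempotent_sender, repeat_until_successful and flaky_deliver are hypothetical names, and the in-memory set stands in for a durable record of already handled document ids.

```python
from typing import Callable, Dict, List, Set


def make_idempotent_sender(deliver: Callable[[str], None]) -> Callable[[Dict], None]:
    """Wrap `deliver` so a document id is skipped once a delivery has been recorded."""
    handled: Set[str] = set()  # stand-in for a durable "already processed" store

    def send(doc: Dict) -> None:
        if doc["id"] in handled:
            return  # repeated call: same end result, no second email
        deliver(doc["id"])
        handled.add(doc["id"])  # recorded only after delivery succeeded

    return send


def repeat_until_successful(step: Callable[[Dict], None], doc: Dict,
                            max_attempts: int = 5) -> int:
    """Retry `step` until it stops raising; return the number of attempts used."""
    for attempt in range(1, max_attempts + 1):
        try:
            step(doc)
            return attempt
        except ConnectionError:
            continue  # transient failure: try again
    raise RuntimeError("gave up; hand the document to a dead-letter queue")


# Hypothetical demo: the mail server fails twice, then recovers.
delivered: List[str] = []
failures_left = {"count": 2}


def flaky_deliver(doc_id: str) -> None:
    if failures_left["count"] > 0:
        failures_left["count"] -= 1
        raise ConnectionError("mail server unavailable")
    delivered.append(doc_id)


send_email = make_idempotent_sender(flaky_deliver)
attempts = repeat_until_successful(send_email, {"id": "order-1"})
redelivery_attempts = repeat_until_successful(send_email, {"id": "order-1"})
```

The second call models the change feed handing over the same document again: it returns immediately and no second email goes out.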

Related

DDD: persisting domain objects into two databases. How many repositories should I use?

I need to persist my domain objects into two different databases. This use case is purely write-only. I don't need to read back from the databases.
Following Domain Driven Design, I typically create a repository for each aggregate root.
I see two alternatives. I can create one single repository for my aggregate root (AR) and implement it so that it persists the domain object into both databases.
The second alternative is to create two repositories, one for each database.
From a domain-driven design perspective, which alternative is correct?
My requirement is that it must persist the AR in both databases: all or nothing. So if the first write goes through and the second fails, I would need to remove the AR from the first database.
If you had a transaction manager that were to span across those two databases, you would use that manager to automatically roll back all of the transactions if one of them fails. A transaction manager like that would necessarily add overhead to your writes, as it would have to ensure that all transactions succeeded, and while doing so, maintain a lock on the tables being written to.
If you consider what the transaction manager is doing, it is effectively writing to one database and ensuring that write is successful, writing to the next, and then committing those transactions. You could implement the same type of process using a two-phase commit process. Unfortunately, this can be complicated because the process of keeping two databases in sync is inherently complex.
You would use a process manager or saga to manage the process of ensuring that the databases are consistent:
1. Write to the first database and leave the record in a PENDING status (not visible to user reads).
2. Make a request to the second database to write the record in a PENDING status.
3. Make a request to the first database to move the record to a VALID status (visible to user reads).
4. Make a request to the second database to move the record to a VALID status.
The issue with this approach is that the process can fail at any point, so you need to account for those failures. For example:
You can have a process that finds records in PENDING status that are older than X minutes and continues pushing them through the workflow.
You can have a process that purges any records still in PENDING status after X minutes.
Ideally, you are using something like a queue based workflow that allows you to fire and forget these commands and a saga or process manager to identify and react to failures.
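As a rough sketch of the PENDING/VALID workflow and the "push stale PENDING records forward" recovery option, assuming two in-memory dictionaries stand in for the two databases. The function names, the crash_before_step parameter and the created_at field are hypothetical, introduced only to simulate a failure.

```python
from typing import Dict, List, Optional

PENDING, VALID = "PENDING", "VALID"
Database = Dict[str, dict]  # record id -> {"created_at": ..., "status": ...}


def write_to_both(record_id: str, created_at: float, dbs: List[Database],
                  crash_before_step: Optional[int] = None) -> None:
    """Run the four steps in order; optionally simulate a crash before one of them."""
    steps = [(db, PENDING) for db in dbs] + [(db, VALID) for db in dbs]
    for number, (db, status) in enumerate(steps, start=1):
        if number == crash_before_step:
            raise RuntimeError(f"simulated crash before step {number}")
        db.setdefault(record_id, {"created_at": created_at})["status"] = status


def resume_stale_pending(dbs: List[Database], now: float, max_age_seconds: float) -> int:
    """Recovery process: finish the workflow for PENDING records older than the limit."""
    stale_ids = {
        record_id
        for db in dbs
        for record_id, entry in db.items()
        if entry["status"] == PENDING and now - entry["created_at"] > max_age_seconds
    }
    for record_id in stale_ids:
        source = next(db[record_id] for db in dbs if record_id in db)
        for db in dbs:  # make sure every database has at least the PENDING row
            db.setdefault(record_id, {"created_at": source["created_at"], "status": PENDING})
        for db in dbs:  # then flip them all to VALID
            db[record_id]["status"] = VALID
    return len(stale_ids)


# Hypothetical demo: the writer dies after both PENDING writes.
first: Database = {}
second: Database = {}
try:
    write_to_both("order-1", 0.0, [first, second], crash_before_step=3)
except RuntimeError:
    pass
status_after_crash = (first["order-1"]["status"], second["order-1"]["status"])
resumed = resume_stale_pending([first, second], now=600.0, max_age_seconds=300.0)
```

Note that purging instead of resuming is only safe while no copy of the record has reached VALID, which is one reason a real implementation needs to decide on a single recovery policy per stage.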
"The second alternative is to create two repositories, one for each database."
Based on the above, hopefully you can understand why this is the correct option.
If you don't need to read the data back, why not build some sort of command log?
The log acts as a queue: you write each operation to it, and two processes pull new commands from it, each one updating one database. If you can accept that in the worst case the two databases hold different versions of the data, with the guarantee that they will eventually be consistent, this seems much easier to me than transactions spanning two different databases.
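A small sketch of the command log idea, assuming an in-memory append-only list as the log and a dictionary per target database. CommandLog, Projector and the command shape ({"key": ..., "value": ...}) are hypothetical names chosen for the example.

```python
from typing import Dict, List, Optional


class CommandLog:
    """Append-only log; consumers read from an offset they track themselves."""

    def __init__(self) -> None:
        self._entries: List[dict] = []

    def append(self, command: dict) -> int:
        self._entries.append(command)
        return len(self._entries) - 1  # offset of the new entry

    def read_from(self, offset: int) -> List[dict]:
        return self._entries[offset:]


class Projector:
    """One consumer per database; remembers how far into the log it has applied."""

    def __init__(self, log: CommandLog) -> None:
        self.log = log
        self.offset = 0
        self.database: Dict[str, str] = {}

    def catch_up(self, limit: Optional[int] = None) -> int:
        pending = self.log.read_from(self.offset)
        if limit is not None:
            pending = pending[:limit]
        for command in pending:
            self.database[command["key"]] = command["value"]
            self.offset += 1  # advance only after the write was applied
        return len(pending)


# Hypothetical demo: one consumer lags behind, then catches up.
log = CommandLog()
db_one, db_two = Projector(log), Projector(log)
for value in ("v1", "v2", "v3"):
    log.append({"key": "customer-7", "value": value})

db_one.catch_up()
db_two.catch_up(limit=1)
temporarily_different = db_one.database != db_two.database
db_two.catch_up()
eventually_equal = db_one.database == db_two.database
```

Each consumer applies commands in log order, so the two databases may differ for a while but converge once both have read the whole log.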
I'm not sure how much of a DDD problem your use case is: if you don't need to read back, you don't have any state to manage, so there is no need for entities or aggregates.

Cosmos DB - thread safe pattern to allocate an 'available' document to each request

For example, suppose I was building an airline booking system and all of my seats were individual documents in a Cosmos container, with a PartitionKey of FlightNumber_DepartureDateTime (e.g. UAT123_20220605T1100Z) and an id of SeatNumber (e.g. 12A).
A request comes in to allocate a single seat (any seat without a preference).
I want to be able to query the Cosmos container for seats where allocated: false and allocate the first one to the request by setting allocated: true and allocatedTo: ticketReference. But I need to do this in a thread safe way so that no two requests get the same seat.
Does Cosmos DB (SQL API) have a standard pattern to solve this problem?
The solution I thought of was to query a document and then update it while checking its ETag, so that if another thread got in first the update would fail. If it fails, I query another document and keep trying until I can successfully update one to claim the seat for this thread.
Is there a better way?
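The ETag loop described above can be sketched as follows. This is an in-memory simulation under assumptions, not Cosmos DB SDK code: SeatContainer, replace_if_match and allocate_any_seat are hypothetical names, and the ETag comparison imitates the conditional replace that Cosmos DB rejects with HTTP 412 (Precondition Failed) on a mismatch.

```python
import uuid
from typing import Dict, List, Optional


class PreconditionFailed(Exception):
    """The stored ETag no longer matches the one the caller read."""


class SeatContainer:
    """In-memory stand-in for one logical partition (one flight)."""

    def __init__(self, seat_numbers: List[str]) -> None:
        self._docs: Dict[str, dict] = {
            number: {"id": number, "allocated": False, "allocatedTo": None,
                     "_etag": uuid.uuid4().hex}
            for number in seat_numbers
        }

    def query_free_seats(self) -> List[dict]:
        return [dict(doc) for doc in self._docs.values() if not doc["allocated"]]

    def replace_if_match(self, doc: dict, etag: str) -> None:
        if self._docs[doc["id"]]["_etag"] != etag:
            raise PreconditionFailed(doc["id"])
        # Every successful write gets a fresh ETag, invalidating older reads.
        self._docs[doc["id"]] = {**doc, "_etag": uuid.uuid4().hex}


def allocate_any_seat(container: SeatContainer, ticket_reference: str) -> Optional[str]:
    """Try each free seat in turn; skip seats that another writer claimed first."""
    for seat in container.query_free_seats():
        claimed = {**seat, "allocated": True, "allocatedTo": ticket_reference}
        try:
            container.replace_if_match(claimed, seat["_etag"])
            return seat["id"]
        except PreconditionFailed:
            continue  # lost the race for this seat, try the next one
    return None  # nothing free in this snapshot


# Hypothetical demo: a competing reader holds a stale view of seat 12A.
container = SeatContainer(["12A", "12B"])
stale_view = container.query_free_seats()[0]
first_seat = allocate_any_seat(container, "TICKET-1")
```

After the first allocation, any update based on stale_view fails the ETag check, which is exactly what prevents two requests from receiving the same seat.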
You could achieve this by using transactions. Cosmos DB allows you to write stored procedures that are executed in an atomic transaction, basically serializing concurrent seat reservation operations for you within a logical partition.
Quote from "Benefits of using server-side programming" in the link above:
Atomic transactions: Azure Cosmos DB database operations that are
performed within a single stored procedure or a trigger are atomic.
This atomic functionality lets an application combine related
operations into a single batch, so that either all of the operations
succeed or none of them succeed.
Bear in mind, though, that transactions come at a cost: they limit the scalability of those operations. However, in your scenario, where data is partitioned per flight and the operations are very fast, this might be the preferable and most reliable option.
I have done something similar with Service Bus queues, essentially allowing you to queue bookings to be saved, therefore you can do the availability logic before you save the booking guaranteeing no overbookings.
https://learn.microsoft.com/en-us/azure/service-bus-messaging/service-bus-queues-topics-subscriptions

Streams with initial state

I would like to expose something like a subscription or a "sticky query": the goal is to query DynamoDB and return the results via the WebSockets API in API Gateway. Well, whenever DynamoDB changes in a way the query would be affected (I guess I could use Streams for that) I would like to notify the client(s). How can I make sure the client gets the initial list and all updates? I would like to make sure the client doesn't miss any updates right after the subscription is created and before the initial list of results is returned to it...
To inform your clients about changes in your DynamoDB table, you could use DynamoDB Streams. However, stream records are only available for 24 hours.
Even if you write your updates to a Kinesis stream, the records are retained for a limited time (a maximum of 7 days according to the FAQ when this was written; Kinesis Data Streams has since added longer retention options).
I suggest splitting your use-case in two:
1. Create a service that returns the initial state of your "sticky query".
2. Create a stream that notifies your clients about updates to the "sticky query".
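One possible way to combine the two parts without missing updates is to subscribe first, buffer whatever arrives, then load the initial state and replay the buffer. The sketch below is an assumption layered on top of the suggestion above, not something DynamoDB or API Gateway provides: it assumes every item carries an increasing version number, and StickyQueryClient and its methods are hypothetical names.

```python
from typing import Dict, List, Tuple

Update = Tuple[str, int, str]  # (key, version, value)


class StickyQueryClient:
    """Buffers stream updates until the initial list has been loaded."""

    def __init__(self) -> None:
        self.state: Dict[str, Tuple[int, str]] = {}
        self._buffer: List[Update] = []
        self._ready = False

    def on_update(self, update: Update) -> None:
        if self._ready:
            self._apply(update)
        else:
            self._buffer.append(update)  # snapshot not here yet: keep for later

    def load_initial(self, snapshot: Dict[str, Tuple[int, str]]) -> None:
        self.state = dict(snapshot)
        for update in self._buffer:  # replay everything received in the meantime
            self._apply(update)
        self._buffer.clear()
        self._ready = True

    def _apply(self, update: Update) -> None:
        key, version, value = update
        current = self.state.get(key)
        # Ignore updates the snapshot already reflects (same or older version).
        if current is None or version > current[0]:
            self.state[key] = (version, value)


# Hypothetical demo: updates arrive while the initial query is still running.
client = StickyQueryClient()
client.on_update(("a", 2, "new"))     # not yet visible in the snapshot below
client.on_update(("b", 1, "x"))       # already included in the snapshot
client.load_initial({"a": (1, "old"), "b": (1, "x")})
client.on_update(("c", 1, "later"))   # normal update after the initial load
```

The version check is what makes the overlap between snapshot and stream harmless: an update that the snapshot already contains is simply dropped.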

What is the best way to integrate DynamoDB stream with CloudSearch? [duplicate]

I'm using DynamoDB pretty heavily for a service I'm building. A new client request has come in that requires CloudSearch. I see that a CloudSearch domain can be created from a DynamoDB table via the AWS console.
My question is this:
Is there a way to automatically offload data from a DynamoDB table into a CloudSearch domain, via the API or otherwise, at a specified time interval?
I'd prefer this to manually offloading DynamoDB documents to CloudSearch. All help greatly appreciated!
Here are two ideas.
1. The official AWS way of searching DynamoDB data with CloudSearch
This approach is described pretty thoroughly in the "Synchronizing a Search Domain with a DynamoDB Table" section of http://docs.aws.amazon.com/cloudsearch/latest/developerguide/searching-dynamodb-data.html.
The downside is that it sounds like a huge pain: you have to either re-create search domains or maintain an update table in order to sync, and you'd need a cron job or something similar to execute the script.
2. The AWS Lambda way
Use the newish Lambda event processing service. It is pretty simple to set up an event stream based on DynamoDB (see http://docs.aws.amazon.com/lambda/latest/dg/wt-ddb.html).
Your Lambda would then submit a search document to CloudSearch based on the Dynamo event. For an example of submitting a document from a Lambda, see https://gist.github.com/fzakaria/4f93a8dbf483695fb7d5
This approach is a lot nicer in my opinion as it would continuously update your search index without any involvement from you.
I'm not so clear on how Lambda would always keep the data in sync with the data in dynamoDB. Consider the following flow:
1. The application updates a DynamoDB table's record A (say, to A1).
2. Very shortly after that, the application updates the same record A (to A2).
3. The trigger for step 1 causes a Lambda invocation to start executing.
4. The trigger for step 2 causes a second Lambda invocation to start executing.
5. Step 4 completes first, so CloudSearch sees A2.
6. Now step 3 completes, so CloudSearch sees A1.
Lambda triggers are not guaranteed to start only after the previous invocation is complete (correct me if I'm wrong, and please provide a link).
As we can see, the index goes out of sync.
The closest thing I can think of that would work is AWS Kinesis Streams, but only with a single shard (ingestion limited to 1 MB per second). If that restriction is acceptable, your consumer application can be written to process records sequentially, i.e., the next record is put into CloudSearch only after the previous one has been.
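The race in the numbered flow can be reproduced with a toy index. The sketch also shows a guard that is an alternative to strictly sequential processing and is not taken from the answer above: it assumes each change carries an increasing sequence number and that the index remembers the last one it applied. SearchIndex, put_blind and put_if_newer are hypothetical names, not CloudSearch API calls.

```python
from typing import Dict, Tuple


class SearchIndex:
    """Toy search index keyed by record id, storing (sequence, value)."""

    def __init__(self) -> None:
        self.docs: Dict[str, Tuple[int, str]] = {}

    def put_blind(self, key: str, sequence: int, value: str) -> None:
        # Last writer wins, regardless of which change is actually newer.
        self.docs[key] = (sequence, value)

    def put_if_newer(self, key: str, sequence: int, value: str) -> bool:
        current = self.docs.get(key)
        if current is not None and current[0] >= sequence:
            return False  # a newer (or the same) change was already applied
        self.docs[key] = (sequence, value)
        return True


# The two updates finish in the wrong order: A2 (sequence 2) before A1 (sequence 1).
arrival_order = [("A", 2, "A2"), ("A", 1, "A1")]

blind = SearchIndex()
for key, sequence, value in arrival_order:
    blind.put_blind(key, sequence, value)

guarded = SearchIndex()
applied = [guarded.put_if_newer(key, sequence, value)
           for key, sequence, value in arrival_order]
```

With blind writes the index ends on the stale value A1; with the guard the late A1 write is rejected and the index keeps A2.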

Using SQS or DynamoDB to control order status

I am building a system that processes orders. Each order will follow a workflow, so an order can be, e.g., booked, accepted, payment approved, cancelled and so on.
Every time the status of an order changes I will post this change to SNS. To know whether an order's status has changed I will need to make a request to an external API and compare the result to the last known status.
The question is: What is the best place to store the last known order status?
1. An SQS queue. Every time I read a message from the queue, I check the status using the external API, delete the message and insert another one with the new status.
2. Use a database (like Dynamo DB) to control the order status.
You should not use the word "store" to describe something happening with stateful facts and a queue. Stateful, factual information should be stored -- persisted -- to a database.
The queue messages should be treated as "hints" on what work needs to be done -- a request to consider the reasonableness of a proposed action, and if reasonable, perform the action.
What I mean by this, is that when a queue consumer sees a message to create an order, it should check the database and create the order if not already present. Update an order? Check the database to see whether the order is in a correct status for the update to occur. (Canceling an order that has already shipped would be an example of a mismatched state).
Queues, by design, can't be as precise and atomic in their operation as a database should. The Two Generals Problem is one of several scenarios that becomes an issue in dealing with queues (and indeed with designing a queue system) -- messages can be lost or delivered more than once.
What happens in a "queue is authoritative" scenario when a message is delivered (received from the queue) more than once? What happens if a message is lost? There's nothing wrong with using a queue, but I respectfully suggest that in this scenario the queue should not be treated as authoritative.
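Treating a queue message as a hint that is checked against the database could look roughly like this. The sketch assumes a dictionary stands in for the orders table; the transition table ALLOWED and the handle_message function are hypothetical and only illustrate the idea, including the "already shipped, cannot cancel" case mentioned above.

```python
from typing import Dict

# Assumed workflow: which status changes are reasonable from each current status.
ALLOWED: Dict[str, set] = {
    "booked": {"accepted", "cancelled"},
    "accepted": {"payment approved", "cancelled"},
    "payment approved": {"shipped"},
    "shipped": set(),
    "cancelled": set(),
}


def handle_message(orders: Dict[str, str], message: Dict[str, str]) -> str:
    """Apply a status-change hint only if the database says it is reasonable."""
    order_id, requested = message["order_id"], message["status"]
    current = orders.get(order_id)
    if current is None:
        if requested == "booked":
            orders[order_id] = "booked"  # create the order if not already present
            return "applied"
        return "rejected"  # cannot update an order that does not exist
    if current == requested:
        return "duplicate"  # message delivered more than once: nothing to do
    if requested in ALLOWED[current]:
        orders[order_id] = requested
        return "applied"
    return "rejected"  # e.g. cancelling an order that has already shipped


# Hypothetical demo: includes a duplicate delivery and a mismatched-state request.
orders: Dict[str, str] = {}
messages = [
    {"order_id": "o1", "status": "booked"},
    {"order_id": "o1", "status": "booked"},            # duplicate delivery
    {"order_id": "o1", "status": "accepted"},
    {"order_id": "o1", "status": "payment approved"},
    {"order_id": "o1", "status": "shipped"},
    {"order_id": "o1", "status": "cancelled"},         # too late to cancel
]
results = [handle_message(orders, message) for message in messages]
```

Because the database holds the authoritative status, duplicated or late messages are harmless: they are detected and dropped instead of corrupting the order.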
I would go with the database option instead of SQS:
1) Option SQS:
You will have one application that changes the status.
It adds the status value to SQS.
Another application then checks the messages, sends the notification and deletes the message.
2) Option DynamoDB:
Insert your updated status into DynamoDB.
Configure a Lambda function to trigger on updates to that field.
The Lambda function sends the notification.
The database option looks cleaner. Additionally, you don't have to worry about maintaining a queue, and with a queue you can only read one message at a time unless you implement parallel readers. In a database, you can update multiple rows, each update triggers the Lambda, and you don't have to worry about it.
Hope that helps
