I am just getting started with Firebase and had a question regarding the Firebase Event Guarantees listed at the following URL:
Event Guarantees.
One of the guarantees states that writes from a single client will always be written to the server and broadcast out to other users in-order.
Does this guarantee also imply that clients will receive events broadcast by a single client in the order that they were broadcast, or is it possible to receive events out of the order they were broadcast?
For example, if one client adds a node, then adds a child to that node, am I guaranteed that other clients will see those events in the same order?
The only guarantee is that the values will be eventually consistent. Thinking this through, it's the only reasonable answer. Any operation over the internet could be delayed indefinitely by any moving part in the process, thus producing out-of-order events received by the client, regardless of the order they reach the server.
Thus, you are guaranteed that all the clients will see both of the added child nodes eventually, and that they will be consistent across all the clients (eventually).
If you want to guarantee the order of events, you are effectively using Firebase as a message queue, which is one way to use it but not the only one. This is easily achieved with the push() method, which creates chronologically ordered, unique IDs.
You can also throw in a timestamp and utilize the orderByChild method to sort records.
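To make the timestamp idea concrete, here is a minimal sketch in plain Python rather than the Firebase SDK. The record keys, the "timestamp" field name, and the order_by_child helper are all made up for illustration; the helper only mimics what sorting by a child field does to records that arrived out of order.

```python
# Hypothetical records as a client might have received them, out of order.
# Each carries a writer-supplied "timestamp" field (an assumed name).
received = {
    "msg-c": {"text": "add grandchild", "timestamp": 1003},
    "msg-a": {"text": "add node", "timestamp": 1001},
    "msg-b": {"text": "add child", "timestamp": 1002},
}


def order_by_child(records, child):
    """Return (key, value) pairs sorted by a child field, then by key."""
    return sorted(records.items(), key=lambda kv: (kv[1][child], kv[0]))


ordered = order_by_child(received, "timestamp")
texts = [value["text"] for _, value in ordered]
print(texts)
```

Sorting on the client only restores the order in which the writes were stamped; it does not change when each record arrives.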
I am building a project from scratch using event-sourcing with Java and Cassandra.
My apps will be based on microservices, and in some use cases information will be processed asynchronously. I was wondering what part a message queue (such as RabbitMQ, ActiveMQ Artemis, Kafka, etc.) would play in improving the technology stack in this environment, and what scenarios I need to understand if I don't use one.
I would start with separating messaging infrastructure like RabbitMQ from event streaming/storing/processing like Kafka. These are two different things made for two (or more) different purposes.
Concerning event sourcing, you need a place to store events. This storage must be append-only and support fast reads of unstructured data based on an identity. One example of such persistence is the EventStore.
Event sourcing goes together with CQRS, which means you have to project your changes (events) to another store that you can query. Projecting is where events get processed to change the domain object state. It is important to understand that using messaging infrastructure for projections is generally a bad idea, because of the nature of messaging and the two-phase commit problem.
If you look at how events get persisted, you can see that they get saved to the store as one transaction. If you then need to publish events, this will be another transaction. Since you are dealing with two different pieces of infrastructure, things can get broken.
The messaging issue as such is that messages are usually guaranteed to be delivered "at least once" and the order of messages is usually not guaranteed. Also, when your message consumer fails and NACKs the message, it will be redelivered but usually a bit later, again breaking the sequence.
The ordering and duplication concerns, however, do not apply to event streaming servers like Kafka. Also, the EventStore will guarantee once-only, in-order event delivery if you use a catch-up subscription.
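To illustrate what "at least once" and out-of-order delivery mean for a projection, here is a rough Python sketch. It assumes the producer attaches a per-stream sequence number to every event, which is my assumption and not something a message broker gives you for free. The OrderedProjection class is hypothetical.

```python
class OrderedProjection:
    """Applies events strictly in sequence order and ignores duplicates."""

    def __init__(self):
        self.next_seq = 1   # sequence number expected next
        self.pending = {}   # events that arrived early, keyed by sequence
        self.applied = []   # payloads applied so far, in order

    def handle(self, seq, payload):
        if seq < self.next_seq or seq in self.pending:
            return  # duplicate delivery: already applied or already buffered
        self.pending[seq] = payload
        # Apply every buffered event that is now contiguous with the last one.
        while self.next_seq in self.pending:
            self.applied.append(self.pending.pop(self.next_seq))
            self.next_seq += 1


projection = OrderedProjection()
# Simulated delivery: event 1 arrives twice, event 2 arrives after event 3.
deliveries = [(1, "created"), (3, "shipped"), (1, "created"), (2, "paid"), (2, "paid")]
for seq, payload in deliveries:
    projection.handle(seq, payload)
print(projection.applied)
```

This is the bookkeeping you end up writing yourself when projections are fed from a queue; reading from an ordered log or a catch-up subscription makes most of it unnecessary.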
In my experience, messages are used to send commands and to implement event-driven architecture to connect independent services in a reactive way. Event stores, on the other hand, are used to persist events, and only events that get there are then projected to the query store and also published to the message bus.
Make sure you are clear on the distinction between send(command) and publish(event). Udi Dahan touches on that topic in his essay on busses and brokers.
In most cases where you are event sourcing, you do not want to be reconstructing state from published events. If you need state, then query the technical authority/book of record for the history, and reconstruct the state from the history.
On the other hand, event driven activity off of a message queue should be fine. When a single event (plus the subscriber's state) has everything you need, then running off of the bus is fine.
In some cases, you might do both. For example, if you were updating cached views, you'd subscribe to various BobChanged events to know when your cached data was stale; to rebuild a stale view, you would reload a representation of the history and transform it into an updated view.
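A small Python sketch of the "do both" case, under my own assumptions: the published event is treated only as a staleness hint, and the view is rebuilt from the history held by the book of record. CachedView, fold, and the "amount" field are hypothetical names.

```python
def fold(history):
    """Rebuild the view (here: a running balance) from the full history."""
    balance = 0
    for event in history:
        balance += event["amount"]
    return {"balance": balance}


class CachedView:
    def __init__(self, load_history):
        self._load_history = load_history  # callable returning the full history
        self._stale = True
        self._view = None

    def on_changed(self, _event):
        # The bus event carries no state we rely on; it only marks the cache stale.
        self._stale = True

    def get(self):
        if self._stale:
            self._view = fold(self._load_history())
            self._stale = False
        return self._view


book_of_record = [{"amount": 10}]            # stand-in for the authoritative store
view = CachedView(lambda: list(book_of_record))
first = view.get()
book_of_record.append({"amount": 5})
view.on_changed({"type": "BobChanged"})      # hint received from the bus
second = view.get()
print(first, second)
```

Because the view is always rebuilt from the history, a duplicated or reordered BobChanged message costs at most an extra reload.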
In the world of event-sourcing applications, message queues usually let you implement publish-subscribe communication between producers and consumers. They also usually help with delivery guarantees: which messages were delivered to which subscribers and which were not.
But they don't store all messages indefinitely. You need to have an event store to do any kind of event sourcing.
The question is not 'to queue or not to queue', but it is more like:
can this thing store a huge volume of events indefinitely?
does it have publish-subscribe capabilities?
does it provide at-least-once delivery guarantees?
So, you should use something like Kafka or EventStore to have all that out of the box. Alternatively, you can combine an event store with a message queue manually, but this is going to be more involved.
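As a toy illustration of the first two questions, here is an in-memory Python sketch of an append-only log with publish-subscribe and catch-up. InMemoryEventStore is a hypothetical class, not the API of Kafka or EventStore, and it deliberately has no durability and no delivery tracking across crashes.

```python
class InMemoryEventStore:
    def __init__(self):
        self._log = []          # append-only list of events
        self._subscribers = []  # callbacks receiving live events

    def append(self, event):
        self._log.append(event)
        for callback in self._subscribers:
            callback(event)

    def subscribe(self, callback, from_position=0):
        # Replay stored history first, then keep receiving live events.
        for event in self._log[from_position:]:
            callback(event)
        self._subscribers.append(callback)

    def read_all(self):
        return list(self._log)


store = InMemoryEventStore()
store.append("OrderBooked")
seen = []
store.subscribe(seen.append)    # late subscriber still sees "OrderBooked"
store.append("OrderAccepted")
print(seen)
```

A plain message queue typically gives you only the live half of this; the replay from position 0 is what requires keeping the events indefinitely.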
Steps to reproduce
The logic of the application assumes that there are a number of data sources on the server, each handled as a group.
If a client wants to subscribe to a specific data source, it calls:
myhub.Subscribe(dataSourceId);
On the server side, we just add the client to the specific group:
await Groups.Add(Context.ConnectionId, dataSourceId.ToString());
Then all the messages are sent with a huge cursor payload. Most importantly, its size grows with every subscription.
Am I doing something wrong?
Update
Similar: SignalR and large number of groups
Unfortunately, this is how cursors work. The cursor contains references to all topics the connection is subscribed to, and each group is a separate topic. Besides the cursor getting bigger, there is one more limitation to using many groups: the more groups the client is a member of, the bigger the groups token. The groups token is sent back to the server when the client reconnects, and if it gets too big it may exceed the URL size limit, causing reconnect failures.
If a subscription is rerun with the "same arguments" in a flush cycle, it reuses the observer on the server and the data in minimongo:
If the subscription is run with the same arguments then the “new” subscription discovers the old “marked for destruction” subscription that’s sitting around, with the same data already ready, and simply reuses that. - Meteor Guide
Additionally, if two subscriptions both request the same document Merge Box will ensure the data is not sent multiple times across DDP.
Furthermore, if a subscription is marked for destruction and rerun with different arguments, the observer cannot be reused. My question is: if the old and new subscriptions publish shared documents in the same flush cycle, will the overlapping documents be intelligently reused on the client, or will they need to be sent over the wire a second time?
[Assume there are no other subscriptions that share this data.]
I believe the data will be reused, though I need to double-check.
I am building a system that processes orders. Each order will follow a workflow, so an order can be, e.g., booked, accepted, payment approved, cancelled, and so on.
Every time the status of an order changes, I will post this change to SNS. To know whether an order's status has changed, I will need to make a request to an external API and compare the result to the last known status.
The question is: What is the best place to store the last known order status?
1. A SQS queue. So every time I read a message from queue, check status using the external API, delete the message and insert another one with the new status.
2. Use a database (like DynamoDB) to control the order status.
You should not use the word "store" to describe something happening with stateful facts and a queue. Stateful, factual information should be stored -- persisted -- to a database.
The queue messages should be treated as "hints" on what work needs to be done -- a request to consider the reasonableness of a proposed action, and if reasonable, perform the action.
What I mean by this, is that when a queue consumer sees a message to create an order, it should check the database and create the order if not already present. Update an order? Check the database to see whether the order is in a correct status for the update to occur. (Canceling an order that has already shipped would be an example of a mismatched state).
Queues, by design, can't be as precise and atomic in their operation as a database should. The Two Generals Problem is one of several scenarios that becomes an issue in dealing with queues (and indeed with designing a queue system) -- messages can be lost or delivered more than once.
What happens in a "queue is authoritative" scenario when a message is delivered (received from the queue) more than once? What happens if a message is lost? There's nothing wrong with using a queue, but I respectfully suggest that in this scenario the queue should not be treated as authoritative.
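Here is a minimal Python sketch of treating queue messages as hints and the database as authoritative. The dict stands in for a real table, and the transition table, message shape, and handle_message function are my own assumptions for illustration.

```python
orders = {}  # stand-in for a database table: order_id -> current status

# Hypothetical workflow: which statuses may follow which.
ALLOWED = {
    "booked": {"accepted", "cancelled"},
    "accepted": {"payment approved", "cancelled"},
    "payment approved": {"shipped", "cancelled"},
    "shipped": set(),
    "cancelled": set(),
}


def handle_message(message):
    """Check the stored state before acting on a queue message."""
    order_id, target = message["order_id"], message["status"]
    current = orders.get(order_id)
    if current is None:
        if target == "booked":
            orders[order_id] = "booked"
            return "created"
        return "rejected"          # update for an order we do not know
    if current == target:
        return "ignored"           # duplicate delivery, nothing to do
    if target in ALLOWED[current]:
        orders[order_id] = target
        return "updated"
    return "rejected"              # e.g. cancelling an order that has shipped


results = [
    handle_message({"order_id": "A1", "status": s})
    for s in ["booked", "booked", "accepted", "payment approved", "shipped", "cancelled"]
]
print(results)
```

A message delivered twice or after the fact is harmless here, because the decision is always made against the stored status rather than against the message alone.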
I will go with the database option instead of SQS:
1) Option SQS:
You will have one application which will change the status
Add the status value to SQS
Another application will then check your messages, send a notification, and delete the message
2) Option DynamoDB:
Insert your updated status in DynamoDB
Configure a Lambda function to run on update of that field
The Lambda function will send the notification
The database option looks cleaner. Additionally, you don't have to worry about maintaining a queue. With a queue, you can read only one message at a time unless you implement parallel readers. In a database, you can update multiple rows and it will trigger the Lambda function, so you don't have to worry about it.
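A rough Python sketch of what the Lambda side of the DynamoDB option could look like. It assumes the function is attached to the table's stream with the NEW_AND_OLD_IMAGES view type, and that the table has a string key "order_id" and a string attribute "status"; both attribute names are hypothetical. The publish argument is an injected callable (for instance a thin wrapper around an SNS client) so the sketch stays self-contained.

```python
def changed_statuses(event):
    """Yield (order_id, old_status, new_status) for records whose status changed."""
    for record in event.get("Records", []):
        if record.get("eventName") != "MODIFY":
            continue
        data = record["dynamodb"]
        old = data.get("OldImage", {}).get("status", {}).get("S")
        new = data.get("NewImage", {}).get("status", {}).get("S")
        if old != new:
            yield data["Keys"]["order_id"]["S"], old, new


def make_handler(publish):
    """Build a Lambda-style handler that notifies through the given callable."""
    def handler(event, context):
        count = 0
        for order_id, old, new in changed_statuses(event):
            publish(f"Order {order_id}: {old} -> {new}")
            count += 1
        return {"notified": count}
    return handler


# Local usage with a hand-written sample event and a list as the "topic".
sent = []
handler = make_handler(sent.append)
sample_event = {"Records": [{
    "eventName": "MODIFY",
    "dynamodb": {
        "Keys": {"order_id": {"S": "42"}},
        "OldImage": {"order_id": {"S": "42"}, "status": {"S": "booked"}},
        "NewImage": {"order_id": {"S": "42"}, "status": {"S": "accepted"}},
    },
}]}
result = handler(sample_event, None)
print(result, sent)
```

Comparing the old and new images means an update that rewrites the same status does not produce a notification.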
Hope that helps
I'm writing a sync to mirror a SQLite DB to a server one.
I have a master-detail table, where the details must be sent to the server ASAP. However, it is possible that detail 3 arrives before detail 2. I need to mimic the steps made to the document and respect the order of the operations.
When a record is saved locally, I send a notification and then post the data. How can I guarantee a strict sequential order using AFNetworking?
By default, operations run concurrently, with no guarantee of order. The only way to ensure that actions run in order is to prevent more than one request operation from running at a given time, by setting the operationQueue.maxConcurrentOperationCount property to 1 (or, if you're not using a manager, by enqueueing operations into an operation queue with that property set to 1).
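The same idea, sketched in Python as an analogue rather than as AFNetworking code: a worker pool limited to one worker runs submitted jobs one at a time, in submission order, even when later jobs would finish faster. The post_detail and send_all names and the delays are made up for the demonstration.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def post_detail(detail_id, delay, log):
    """Pretend to upload one detail row; slower rows take longer."""
    time.sleep(delay)
    log.append(detail_id)


def send_all(details, max_workers):
    log = []
    # Leaving the "with" block waits for every submitted job to finish.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for detail_id, delay in details:
            pool.submit(post_detail, detail_id, delay, log)
    return log


details = [(1, 0.03), (2, 0.02), (3, 0.01)]
serial_order = send_all(details, max_workers=1)
print(serial_order)
```

With more than one worker the completion order depends on timing, which is the situation described in the question. The trade-off of the serial queue is throughput: each request waits for the previous one to finish.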