What is the architecture of http://telepat.io?
How exactly does client/server db sync work in http://telepat.io?
What is the purpose of redis DB, do you use it for data sync?
I want to have a picture similar to this hood.ie architecture:
I've made you this diagram (not very good at explaining through diagrams), I hope you'll understand the general view of it.
This is how a client normally operates in the Telepat system: it subscribes to objects in the app (the subscribe call returns the objects from that channel) and sends requests to the API to create, update, or delete app objects. The API sends messages to the workers. Aggregators just put these changes (deltas) in the Redis volatile DB and notify the writers. Writers process the deltas and write the changes to the DB; they also know which channels are affected by each change, so they fetch the subscribed devices for each of those channels and send messages to the client transport workers. These workers send the changes back to the clients over GCM, APN, or WebSockets.
Redis is used for volatile stuff like devices, subscriptions, deltas and object caches.
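To make the flow above concrete, here is a minimal in-memory sketch in Python. It is not Telepat's actual code: the `Pipeline` class and its method names are hypothetical, plain Python containers stand in for Redis and the persistent DB, and a list stands in for the client transport workers.

```python
from collections import defaultdict


class Pipeline:
    """Toy stand-in for the aggregator -> writer -> transport flow (hypothetical names)."""

    def __init__(self):
        self.deltas = []                       # stands in for deltas kept in Redis
        self.subscriptions = defaultdict(set)  # channel -> device ids (also volatile)
        self.database = {}                     # stands in for the persistent DB
        self.outbox = []                       # messages handed to transport workers

    def subscribe(self, device_id, channel):
        self.subscriptions[channel].add(device_id)
        # Subscribing returns the objects currently in that channel.
        return {key: value for key, value in self.database.items() if key[0] == channel}

    def aggregate(self, channel, object_id, op, value=None):
        # Aggregator: only records the delta in the volatile store.
        self.deltas.append((channel, object_id, op, value))

    def write(self):
        # Writer: drains the deltas, applies them to the DB, then fans out
        # one message per device subscribed to the affected channel.
        pending, self.deltas = self.deltas, []
        for channel, object_id, op, value in pending:
            key = (channel, object_id)
            if op == "delete":
                self.database.pop(key, None)
            else:  # "create" or "update"
                self.database[key] = value
            for device_id in sorted(self.subscriptions[channel]):
                self.outbox.append((device_id, channel, object_id, op, value))


pipeline = Pipeline()
pipeline.subscribe("phone-1", "events")
pipeline.aggregate("events", "e1", "create", {"title": "Launch"})
pipeline.write()
print(pipeline.outbox)
```

In the real system each of these steps runs in a separate worker process and the messages travel over a queue; the sketch collapses them into one object only to show the order of operations.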
I am using SignalR in my web api to provide real-time functionality to my client apps (mobile and web). Everything works ok but there is something that worries me a bit:
The clients get updated when different things happen in the backend. For example, when one of the clients performs a CRUD operation on a resource, that change is broadcast through SignalR. But what happens when something happens on a client, say the mobile app, and the device's data connection is dropped?
It could happen that another client performs some action on a resource and, when SignalR broadcasts the message, it doesn't reach that client. That client will then have a stale view state.
As far as I have read, there seems to be no way to know whether a message has been sent to and received by all the clients. So, besides checking the network state and doing a full reload of the resource list when this happens, is there any way to be sure messages have been synchronized correctly on all the clients?
As you've suggested, ASP NET Core SignalR places the responsibility on the application for managing message buffering if that's required.
If an eventually consistent view is an issue (because order of operations is important, for example) and the full reload proves to be an expensive operation, you could manage some persistent queue of message events as far back as it makes sense to do so (until a full reload would be preferable) and take a page from message buses and event sourcing, with an onus on the client in a "dumb broker/smart consumer"-style approach.
It's not an exact match of your case, but credit where credit is due, there's a well thought out example of handling queuing up SignalR events here: https://stackoverflow.com/a/56984518/13374279 You'd have to adapt that some and give a numerical order to the queued events.
The initial state load and any subsequent events could have an aggregate version attached to them; at any time that the client receives an event from SignalR, it can compare its currently known state against what was received and determine whether it has missed events, be it from a disconnection or a delay in the hub connection starting up after the initial fetch; if the client's version is out of date and within the depth of your queue, you can issue a request to the server to replay the events out to that connection to bring the client back up to sync.
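As a rough illustration of the versioning idea, here is a small Python sketch. It does not use SignalR at all: `EventLog` and `Client` are hypothetical names, the client calls the log directly where a real app would issue a request to the server, and the "full reload" is reduced to a counter.

```python
class EventLog:
    """Server side: bounded queue of versioned events (hypothetical helper)."""

    def __init__(self, depth):
        self.depth = depth   # how far back replay is supported; assumed >= 1
        self.events = []     # (version, payload) pairs, oldest first
        self.version = 0

    def append(self, payload):
        self.version += 1
        self.events.append((self.version, payload))
        self.events = self.events[-self.depth:]
        return self.version, payload

    def replay_after(self, client_version):
        """Events newer than client_version, or None if the gap exceeds the queue."""
        oldest = self.events[0][0] if self.events else self.version + 1
        if client_version + 1 < oldest:
            return None
        return [event for event in self.events if event[0] > client_version]


class Client:
    def __init__(self, log):
        self.log = log
        self.version = log.version   # version that came with the initial state load
        self.state = []
        self.full_reloads = 0

    def on_event(self, version, payload):
        if version <= self.version:
            return                            # duplicate or already applied
        if version > self.version + 1:        # gap detected: events were missed
            missed = self.log.replay_after(self.version)
            if missed is None:
                self.full_reloads += 1        # too far behind: reload everything
                self.version = self.log.version
                return
            for missed_version, missed_payload in missed:
                self._apply(missed_version, missed_payload)
            return
        self._apply(version, payload)

    def _apply(self, version, payload):
        self.state.append(payload)
        self.version = version


log = EventLog(depth=10)
client = Client(log)
client.on_event(*log.append("a"))
log.append("b")                       # never delivered (connection dropped)
client.on_event(*log.append("c"))     # client notices the gap and asks for a replay
print(client.state, client.version)
```

The queue depth is the tuning knob mentioned above: once a client falls further behind than the queue reaches, a full reload is the only option left.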
Some reading into immediate consistency vs eventual consistency may be helpful to come up with a plan. Hope this helps!
Structure
This is a question about how best to use Apollo Client 2 in our current infrastructure. We have one GraphQL server (Apollo Server 2) which connects to multiple other endpoints. Mutations are sent via RabbitMQ, and our GraphQL server also listens to RabbitMQ for messages, which are then pushed to the client via subscriptions.
Implementation
We have an Apollo server to which we send mutations; these always return null because they are async. Results are sent back via a subscription.
I have currently implemented it like this.
Send mutation.
Create an optimisticResponse.
Update the state optimistically via writeQuery in the update function.
When the real response comes (which is null) use the optimisticResponse again in the update method.
Wait for the subscription to come back with the response.
Then refresh the state/component with the actual data.
As you can see.. not the most ideal way, and a lot of complexity in the client.
Would love to keep the client as dumb as possible and reduce complexity.
Seems Apollo is mostly designed for sync mutations (which is logical).
What are your thoughts on a better way to implement this?
Mutations can be asynchronous
Mutations are generally not synchronous in Apollo Client. The client can wait for a mutation result as long as it needs to. You might not want your GraphQL service to keep the HTTP connection open for that duration, and this seems to be the problem you are dealing with here. Your approach to responding to mutations goes against how GraphQL is designed to work, and that starts to create complexity that - in my opinion - you don't want to have.
Solve the problem on the implementation level - not API level
My idea to make this work better is the following: follow the GraphQL spec and its dogmatic approach. Let mutations return mutation results. This will create an API that any developer is familiar with. Instead, treat the delivery of these results as the actual problem you want to solve. GraphQL does not specify the transport protocol for client-server communication. If you already have websockets running between server and client, throw away HTTP and operate completely on the socket level.
Leverage Apollo Client's flexible link system
This is where Apollo Client 2 comes in. Apollo Client 2 lets you write your own network links that handle client server communication. If you solve the communication on the link level, developers can use the client as they are used to without any knowledge of the network communication details.
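The core of such a transport is pairing each asynchronous reply with the request that caused it, so the caller simply awaits a result. The following is a language-neutral sketch of that idea written in Python with asyncio; it is not Apollo Link code, and `CorrelatingTransport`, `mutate`, `on_reply`, and `fake_send` are all hypothetical names chosen for illustration.

```python
import asyncio
import itertools


class CorrelatingTransport:
    """Hypothetical transport helper: pairs async replies with pending requests."""

    def __init__(self, send):
        self._send = send                # callable that publishes (request_id, payload)
        self._pending = {}               # request_id -> Future awaiting the reply
        self._ids = itertools.count(1)

    async def mutate(self, payload, timeout=5.0):
        request_id = next(self._ids)
        future = asyncio.get_running_loop().create_future()
        self._pending[request_id] = future
        try:
            self._send(request_id, payload)
            # The caller sees an ordinary awaited result, not a null response.
            return await asyncio.wait_for(future, timeout)
        finally:
            self._pending.pop(request_id, None)

    def on_reply(self, request_id, result):
        # Called when a reply arrives on the socket; unknown ids are ignored.
        future = self._pending.get(request_id)
        if future is not None and not future.done():
            future.set_result(result)


async def demo():
    loop = asyncio.get_running_loop()
    transport = None

    def fake_send(request_id, payload):
        # Pretend the backend answers a little later over the socket.
        loop.call_later(0.01, transport.on_reply, request_id, {"saved": payload})

    transport = CorrelatingTransport(fake_send)
    return await transport.mutate({"name": "Ada"})


result = asyncio.run(demo())
print(result)
```

With the correlation handled at this level, application code never needs to know that the result travelled over a different channel than the request.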
I hope this helps and you still have the chance to go in this direction. I know that this might require changes on the server side but could be totally worth it when your application is frontend heavy (as most applications are these days)
I am building a project from scratch using event-sourcing with Java and Cassandra.
My apps will be based on microservices, and in some use cases information will be processed asynchronously. I was wondering what part a message queue (such as RabbitMQ, ActiveMQ Artemis, Kafka, etc.) would play in improving the technology stack in this environment, and which scenarios I should expect to run into if I don't use one.
I would start with separating messaging infrastructure like RabbitMQ from event streaming/storing/processing like Kafka. These are two different things made for two (or more) different purposes.
Concerning event sourcing, you need a place to store events. This storage must be append-only and support fast reads of unstructured data based on an identity. One example of such persistence is the EventStore.
Event sourcing goes together with CQRS, which means you have to project your changes (events) to another store, which you can query. This is done by projecting events to that store; this is where events get processed to change the domain object state. It is important to understand that using messaging infrastructure for projections is generally a bad idea. This is due to the nature of messaging and the two-phase commit issue.
If you look at how events get persisted, you can see that they get saved to the store as one transaction. If you then need to publish events, this will be another transaction. Since you are dealing with two different pieces of infrastructure, things can get broken.
The messaging issue as such is that messages are usually guaranteed to be delivered "at least once" and the order of messages is usually not guaranteed. Also, when your message consumer fails and NACKs the message, it will be redelivered but usually a bit later, again breaking the sequence.
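To show what this costs a consumer, here is a minimal Python sketch of a projection that has to defend itself against duplicates and reordering. It assumes every event carries a per-stream sequence number, which the answer does not state; the `Projection` class and the event names are hypothetical.

```python
class Projection:
    """Applies events strictly in sequence despite duplicate or reordered delivery."""

    def __init__(self):
        self.applied = []    # events in the order they were actually applied
        self.next_seq = 1    # sequence number expected next (assumed to start at 1)
        self.held = {}       # early arrivals, waiting for the gap to be filled

    def handle(self, seq, event):
        if seq < self.next_seq or seq in self.held:
            return           # redelivery of something already seen
        self.held[seq] = event
        # Apply everything that is now contiguous with what was applied before.
        while self.next_seq in self.held:
            self.applied.append(self.held.pop(self.next_seq))
            self.next_seq += 1


projection = Projection()
deliveries = [
    (1, "opened"),
    (3, "withdrawn"),   # arrives before 2
    (1, "opened"),      # duplicate
    (2, "deposited"),   # redelivered late after a NACK
    (3, "withdrawn"),   # duplicate
]
for seq, event in deliveries:
    projection.handle(seq, event)
print(projection.applied)
```

All of this bookkeeping exists only to recover guarantees that an ordered event stream would give for free.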
The ordering and duplication concerns, however, do not apply to event streaming servers like Kafka. Also, the EventStore will guarantee once-only, in-order event delivery if you use a catch-up subscription.
In my experience, messages are used to send commands and to implement event-driven architecture to connect independent services in a reactive way. Event stores, on the other hand, are used to persist events, and only events that get there are then projected to the query store and also get published to the message bus.
Make sure you are clear on the distinction between send(command) and publish(event). Udi Dahan touches on that topic in his essay on busses and brokers.
In most cases where you are event sourcing, you do not want to be reconstructing state from published events. If you need state, then query the technical authority/book of record for the history, and reconstruct the state from the history.
On the other hand, event driven activity off of a message queue should be fine. When a single event (plus the subscriber's state) has everything you need, then running off of the bus is fine.
In some cases, you might do both. For example, if you were updating cached views, you'd subscribe to various BobChanged events to know when your cached data was stale; to rebuild a stale view, you would reload a representation of the history and transform it into an updated view.
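A minimal Python sketch of that combination follows. The bus event is used only as a staleness signal, and the view is rebuilt from the history held by the book of record. `CachedView`, `on_changed`, and the `load_history` callable are hypothetical names, and a dict of event lists stands in for the real history store.

```python
class CachedView:
    """Cache that is invalidated by bus events and rebuilt from the history."""

    def __init__(self, load_history):
        self._load_history = load_history   # callable: entity_id -> list of events
        self._views = {}
        self._stale = set()

    def on_changed(self, entity_id):
        # Handler for a BobChanged-style event: it only marks the view stale
        # and deliberately ignores the event's payload.
        self._stale.add(entity_id)

    def get(self, entity_id):
        if entity_id in self._stale or entity_id not in self._views:
            view = {}
            for event in self._load_history(entity_id):
                view.update(event)          # fold the history into a view
            self._views[entity_id] = view
            self._stale.discard(entity_id)
        return self._views[entity_id]


history = {"bob": [{"name": "Bob"}]}
views = CachedView(lambda entity_id: history[entity_id])
first = dict(views.get("bob"))

history["bob"].append({"email": "bob@example.com"})
cached = dict(views.get("bob"))       # still the old view: nothing marked it stale

views.on_changed("bob")
rebuilt = dict(views.get("bob"))      # rebuilt from the full history
print(first, cached, rebuilt)
```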
In the world of event-sourcing applications, message queues usually allow you to implement publish-subscribe pattern style of communication between producers and consumers. Also, they usually help you with delivery guarantees: which messages were delivered to which subscribers and which ones were not.
But they don't store all messages indefinitely. You need to have an event store to do any kind of event sourcing.
The question is not 'to queue or not to queue', but it is more like:
can this thing store a huge volume of events indefinitely?
does it have publish-subscribe capabilities?
does it provide at-least-once delivery guarantees?
So, you should use something like Kafka or EventStore to have all that out-of-the-box. Alternatively, you can combine event store with message queue manually, but this is going to be more involved.
I am interested in creating an application using the Meteor framework that will be disconnected from the network for long periods of time (multiple hours). I believe Meteor stores local data in RAM in a mini-MongoDB JS structure. If the user closes the browser or refreshes the page, all local changes are lost. It would be nice if local changes were persisted to disk (localStorage? IndexedDB?). Any chance that's coming soon for Meteor?
Related question... how does Meteor deal with document conflicts? In other words, if 2 users edit the same MongoDB JSON doc, how is that conflict resolved? Optimistic locking?
Conflict resolution is "last writer wins".
More specifically, each MongoDB insert/update/remove operation on a client maps to an RPC. RPCs from a given client always play back in order. RPCs from different clients are interleaved on the server without any particular ordering guarantee.
If a client tries to issue RPCs while disconnected, those RPCs queue up until the client reconnects, and then play back to the server in order. When multiple clients are executing offline RPCs, the order they finally run on the server is highly dependent on exactly when each client reconnects.
For some offline mutations like MongoDB's $inc and $addToSet, this model works pretty well as is. But many common modifiers like $set won't behave very well across long disconnects, because the mutation will likely conflict with intervening changes from other clients.
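The difference is easy to see by replaying the same queued operations in two different reconnect orders. This Python sketch models only the semantics of `$inc` and `$set` on a plain dict; it does not touch MongoDB or Meteor, and `apply_op` and `replay` are hypothetical helpers.

```python
def apply_op(doc, op):
    """Apply one (modifier, field, value) operation and return a new document."""
    modifier, field, value = op
    updated = dict(doc)
    if modifier == "$inc":
        updated[field] = updated.get(field, 0) + value
    elif modifier == "$set":
        updated[field] = value
    else:
        raise ValueError(f"unsupported modifier: {modifier}")
    return updated


def replay(doc, ops):
    """Play queued operations back in the order the server happens to see them."""
    for op in ops:
        doc = apply_op(doc, op)
    return doc


start = {"likes": 10, "title": "Draft"}

# Two offline clients each queued an increment: the reconnect order is irrelevant.
inc_a, inc_b = ("$inc", "likes", 1), ("$inc", "likes", 2)
likes_ab = replay(start, [inc_a, inc_b])["likes"]
likes_ba = replay(start, [inc_b, inc_a])["likes"]

# Two offline clients each queued a $set: whoever reconnects last wins.
set_a, set_b = ("$set", "title", "A"), ("$set", "title", "B")
title_ab = replay(start, [set_a, set_b])["title"]
title_ba = replay(start, [set_b, set_a])["title"]

print(likes_ab, likes_ba, title_ab, title_ba)
```

Increments commute, so both orders agree; the two `$set` calls do not, so one client's edit silently disappears depending on reconnect timing.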
So building "offline" apps is more than persisting the local database. You also need to define RPCs that implement some type of conflict resolution. Eventually we hope to have turnkey packages that implement various resolution schemes.
I have developed a chat web application which uses a SqlServer database for exchanging messages.
All clients poll every x seconds to check for new messages.
It is obvious that this approach consumes many resources, and I was wondering if there is a "cheaper" way of doing that.
I use the same approach for "presence": checking who is on.
Without a browser plugin/extension like Flash or a Java applet, the browser is essentially a one-way communication tool. The request has to be initiated by the browser to fetch data. You cannot 'push' data to the browser.
Many web apps use Ajax polling to simulate a server 'push'. The trick is to balance the frequency/data size with the bandwidth and server resources.
I just did a simple observation of Gmail. It does an HTTP POST poll every 5 seconds. If there's no 'state' change, the response data size is only a few bytes (not including the HTTP headers). Of course Google has huge server resources and bandwidth, which is why I mention finding a good balance.
That is "Improving user experience vs Server resource". You might need to come up with a creative polling strategy, instead of straightforward polling every x seconds.
E.g. if there is no activity from party A, poll every 3 seconds. While party A is typing, poll every 5 seconds. This is just an illustration; you can play around with the numbers, or come up with a more efficient scheme.
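One possible strategy of this kind is to back off while the conversation is idle and snap back to fast polling as soon as something arrives. The Python sketch below shows only the interval calculation; it is a swapped-in technique (exponential backoff), not the scheme from the answer, and the function name and default numbers are assumptions to be tuned.

```python
def next_interval(current, got_messages, minimum=1.0, maximum=30.0, factor=2.0):
    """Return the number of seconds to wait before the next poll.

    Polls faster while messages are flowing and backs off while idle.
    All default values are illustrative assumptions.
    """
    if got_messages:
        return minimum                       # activity: poll quickly again
    return min(current * factor, maximum)    # idle: wait longer, up to a cap


# Simulate four polls: idle, idle, new message, idle.
interval = 1.0
schedule = []
for got_messages in [False, False, True, False]:
    interval = next_interval(interval, got_messages)
    schedule.append(interval)
print(schedule)
```

The client would sleep for the returned interval between requests, so an idle chat costs the server far fewer requests than a fixed short interval.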
Lastly, the data exchange. The challenge is to find a way to pass minimum data sizes to convey the same info.
my 2 cents :)
For something like a real-time chat app, I'd recommend a distributed cache with a SQL backing. I happen to like memcached with the Enyim .NET provider, so I'd do something like the following:
User posts message
System writes message to database
System writes message to cache
All users poll cache periodically for new messages
The database backing allows you to preload the cache in the event the cache is cleared or the application restarts, but the functional bits rely on in-memory cache, rather than polling the database.
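The steps above can be sketched in a few lines of Python. This is only an illustration of the write-through idea: an in-memory SQLite database stands in for SQL Server, a plain dict stands in for memcached (no Enyim API is used), and `MessageStore` with its methods is a hypothetical name.

```python
import sqlite3


class MessageStore:
    """Write-through message cache backed by a SQL table (illustrative stand-ins)."""

    def __init__(self):
        self.db = sqlite3.connect(":memory:")
        self.db.execute(
            "CREATE TABLE messages ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, room TEXT, body TEXT)"
        )
        self.cache = {}   # room -> list of (id, body); stands in for memcached

    def post(self, room, body):
        # Write to the database first, then to the cache.
        cursor = self.db.execute(
            "INSERT INTO messages (room, body) VALUES (?, ?)", (room, body)
        )
        self.db.commit()
        if room in self.cache:
            self.cache[room].append((cursor.lastrowid, body))
        # If the room is not cached, the next poll preloads it from the database.
        return cursor.lastrowid

    def _cached(self, room):
        if room not in self.cache:
            # Cache was cleared or the app restarted: preload from the database.
            self.cache[room] = self.db.execute(
                "SELECT id, body FROM messages WHERE room = ? ORDER BY id", (room,)
            ).fetchall()
        return self.cache[room]

    def poll(self, room, after_id=0):
        # Clients poll the cache, not the database, for messages newer than after_id.
        return [message for message in self._cached(room) if message[0] > after_id]


store = MessageStore()
store.post("lobby", "hi")
store.post("lobby", "there")
everything = store.poll("lobby")
store.cache.clear()                       # simulate a cache flush
newer = store.poll("lobby", after_id=1)   # transparently reloaded from the database
print(everything, newer)
```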
If you are using SQL Server 2005 you can look at Notification Services. Granted, this would lock you into SQL Server 2005, as Notification Services was removed in SQL Server 2008, but it was designed to allow SQL Server to notify client applications of changes to the database.
If you want something a little more scalable, you can put a couple of bit flags on the Users record. When a message for the user comes in, change the bit for new messages to 1. When you read the messages, change it to 0. Same for when people sign on and off. That way you are reading a very small field that has a damn good chance of already being in cache.
So the workflow would be: read the bit. If it's 1, go get the messages from the message table. If it's 0, do nothing.
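That workflow can be sketched as follows in Python, with dicts standing in for the Users flag column and the message table. `Inbox` and its fields are hypothetical, and `message_reads` exists only to show how often the expensive lookup actually runs.

```python
class Inbox:
    """Flag-then-fetch polling: check a tiny flag before touching the message table."""

    def __init__(self):
        self.has_new = {}        # user -> bool; stands in for the bit on the Users row
        self.messages = {}       # user -> pending messages; stands in for the table
        self.message_reads = 0   # how many times the message table was actually read

    def deliver(self, user, body):
        self.messages.setdefault(user, []).append(body)
        self.has_new[user] = True        # set the bit to 1

    def check(self, user):
        if not self.has_new.get(user, False):
            return []                    # bit is 0: do nothing
        self.message_reads += 1          # bit is 1: go get the messages
        pending = self.messages.pop(user, [])
        self.has_new[user] = False       # reset the bit to 0 after reading
        return pending


inbox = Inbox()
before = inbox.check("ann")     # no messages yet: the table is not read
inbox.deliver("ann", "hi")
after = inbox.check("ann")      # flag set: one table read
again = inbox.check("ann")      # flag cleared again: no further read
print(before, after, again, inbox.message_reads)
```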
In ASP.NET 4.0 you can use the Observer Pattern with JavaScript objects and arrays, i.e. AJAX JSON calls with jQuery and/or PageMethods.
You will always have to hit the database to determine whether there is any data to return. The trick is to make those calls small and to return data only when needed.
There are two related solutions built into SQL Server 2005 and still available in SQL Server 2008:
1) Service Broker, which allows subscribers to post reads on queues (the RECEIVE command with WAIT..). In your case you would want to send your message through the database by using Service Broker Services fronting these Queues, which could then be picked up by the waiting clients. There's no polling, the waiting clients just get activated when a message is received.
2) Query Notifications, which allow a subscriber to define a query and then receive notifications when the dataset that would result from executing that query changes. Built on Service Broker, Query Notifications are somewhat easier to use, but may also be somewhat less efficient. (Note that Query Notifications and their siblings, Event Notifications, are frequently mistaken for Notification Services (NS), which causes concern because NS was discontinued in SQL Server 2008; Query and Event Notifications, however, are still fully available and even enhanced in SQL Server 2008.)