What's the recommended way to handle microservice processing bugs or new insights?

Before I get to my question, let me sketch out a sample set of microservices to illustrate my dilemma.
Scenario outline
Suppose I have 4 microservices:
An activation service where features supplied to our customers are (de)activated. A registration service where members can be added and changed. A secured key service that is able to generate secure keys (in a multi step process) for members to be used when communicating with them with the outside world. And a communication service that is used to communicate about our members with external vendors.
The secured key service may however only request secured keys if this is a feature that is activated. Additionally, the communication service may only communicate about members that have a secured key AND if the communication feature itself is activated.
Because they are microservices, each of the services has its own datastore and is completely self-sufficient. That is, any data that is required from the other microservices is duplicated locally and kept in sync by means of asynchronous messages from the other microservices.
The dilemma
I'm actually facing two main dilemmas. The first is (pretty obviously) data synchronization. When there are multiple data stores that need to be kept in sync, you have to account for messages getting lost or processed out of order. But there are plenty of out-of-the-box solutions for this, and when all else fails you could even fall back to some kind of ETL process to keep things in sync.
The main issue I'm facing however is the actions that need to be performed. In the above example the secured key service must perform an action when it either
Receives a message from the registration service for a new member when it already knows that the secured keys feature is active in the activation service
Receives a message from the activation service that the secured keys feature is now active when it already knows about members from the registration service
In both cases this means that a message from the external system must lead to both an update in the local copy of the data as well as some logic that needs to be processed.
The question
Now to the actual question :)
What is the recommended way to cope with either bugs or new insights when it comes to handling those messages? Suppose there is a bug in the message handler from the activation service. The handler does update the internal data structure, but it fails to detect that there are already registered members and thus never starts the secure key generation process. Alternatively it could be that there's no bug, but we decide that there is something else we want the handler to do.
The system has no reason to resubmit or reprocess the messages (they didn't fail), and there's no real way for us to re-trigger the behavior behind them.
I hope it's clear what I'm asking (and I do apologize if it should be posted on any of the other 170 Stack... sites, I only really know of StackOverflow)

I don't know what the recommended way is, but I know how this is done in DDD, and maybe that can help you, as DDD and microservices are friends.
What you have is a long-running/multi-step process that involves information from multiple microservices. In DDD this can be implemented using a Saga/Process manager. The Saga maintains a local state by subscribing to events from both the registration service and the activation service. As the events come in, the Saga checks whether it has all the information it needs to generate secure keys and, if so, submits a CreateSecureKey command. The events may come in any order and can even be duplicated, but this is not a problem as the Saga can compensate for this.
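A minimal sketch of such a Saga, assuming in-memory state and hypothetical event names (MemberRegistered, FeatureActivated); only the CreateSecureKey command name comes from the text above. It is meant to show how the order of arrival and duplicates stop mattering, not how a real process manager would persist its state:

```python
from dataclasses import dataclass, field


@dataclass
class SecureKeySaga:
    """Tracks what the secured key service knows and decides when to act."""
    feature_active: bool = False
    members: set = field(default_factory=set)
    requested: set = field(default_factory=set)  # members with a command already issued

    def on_member_registered(self, member_id):
        # Hypothetical handler for a MemberRegistered event.
        self.members.add(member_id)
        return self._decide()

    def on_feature_activated(self):
        # Hypothetical handler for a FeatureActivated event.
        self.feature_active = True
        return self._decide()

    def _decide(self):
        """Return the CreateSecureKey commands that are due and not yet issued."""
        if not self.feature_active:
            return []
        due = sorted(self.members - self.requested)
        self.requested.update(due)
        return [("CreateSecureKey", member_id) for member_id in due]


saga = SecureKeySaga()
first = saga.on_member_registered("m1")    # feature not active yet, nothing to do
second = saga.on_feature_activated()       # now m1 is due
third = saga.on_feature_activated()        # duplicate event, nothing new
fourth = saga.on_member_registered("m2")   # member arrives after activation
```

Both handlers end in the same decision step, so whichever event arrives last is the one that triggers the command.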
In case of bugs or new features, you could create special scripts or other processes that search for a particular situation and handle it by submitting specific compensating commands, without reprocessing all the past events.
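As a rough illustration of such a script, assuming the secured key service can list its known members and the members that already have a key (the function names and the command shape are made up for this sketch):

```python
def find_missed_members(members, feature_active, members_with_key):
    """Return members that should have a secure key but do not."""
    if not feature_active:
        return []
    return sorted(set(members) - set(members_with_key))


def reconcile(members, feature_active, members_with_key, submit):
    """Submit one compensating CreateSecureKey command per missed member."""
    missed = find_missed_members(members, feature_active, members_with_key)
    for member_id in missed:
        submit({"type": "CreateSecureKey", "member_id": member_id})
    return missed


sent = []  # stands in for the real command channel
missed = reconcile(
    members=["m1", "m2", "m3"],
    feature_active=True,
    members_with_key=["m2"],
    submit=sent.append,
)
```

The script looks only at current state, so it can be rerun safely after the handler bug is fixed.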
In case of new features you may even have to process old events that are now interesting for your business process. You do this in the same way, by querying the event store for the newly interesting old events and sending them to the updated Saga. After that import process, you subscribe the Saga to these newly interesting events and the Saga continues to function as usual.
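The import step could look roughly like this, assuming the stored events can be read back in order as plain dictionaries (the event type names are hypothetical):

```python
def replay(event_store, interesting_types, handler):
    """Feed stored events of the newly interesting types to the handler, in order."""
    count = 0
    for event in event_store:
        if event["type"] in interesting_types:
            handler(event)
            count += 1
    return count


history = [
    {"type": "MemberRegistered", "member_id": "m1"},
    {"type": "MemberAddressChanged", "member_id": "m1"},
    {"type": "MemberRegistered", "member_id": "m2"},
]
seen = []  # stands in for the updated Saga's handler
replayed = replay(history, {"MemberAddressChanged"}, seen.append)
```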

Related

Microservices without synchronous communication possible?

I know this question was already asked in a lot of ways and flavors, I wanted to add another way and a concrete example.
Basically, I know that we should avoid synchronous communication; I was just wondering if there are patterns to really avoid all of it. Let me give you a short example of a situation in which I wouldn't know how to make it asynchronous:
I have a service that manages, e.g., users: basically a DB that has users and their configuration saved.
Another service, the API Gates, provides the endpoint to register a user. This is the point where the communication becomes a problem: if the register endpoint is called, we somehow have to call the user service synchronously because we need, e.g., the userId of the newly created user. This is a very abstract example, and the userId might not be needed in a lot of cases, but in general I am curious about this pattern:
A service needs to call another service in order to create a new resource, but needs some data from the newly created resource, either to return it to its caller or to create locally some kind of connection between its entities and the other service's entities.
Is there some pattern for this or is this just a place where synchronous communication needs to happen?
What you are describing is the Orchestration vs Choreography patterns:
In the Orchestration pattern a microservice invokes its dependencies directly, just like in your example, a microservice invokes another to register the user and then uses the userId from the response.
On the other hand, we can have the Choreography pattern, where we need a message queue system, e.g., Kafka or RabbitMQ, to decouple the microservices. The same example would work as follows:
Your API Gates publishes an event (command) of type RegisterUser to the message queue, containing the user information.
The User-Manager microservice subscribes to the events of type RegisterUser, and whenever it gets an event of that type it creates the user as usual.
Now, the User-Manager must let everyone know that the user was created, so it publishes another event of type UserCreated containing the user information, e.g., the userId.
Finally, the API Gates must also subscribe to the UserCreated events, so it can proceed with the flow.
With this approach the two microservices do not know each other; they are decoupled, and you can have any number of dependencies subscribing to the events, i.e., you can add new dependencies without needing to change the code.
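A small sketch of that flow, with an in-memory Bus class standing in for Kafka or RabbitMQ. The Bus, the handler names and the correlation_id field are assumptions made for illustration. A real broker delivers asynchronously, whereas this stand-in calls handlers immediately:

```python
from collections import defaultdict
import uuid


class Bus:
    """Tiny in-memory stand-in for a message broker."""

    def __init__(self):
        self.handlers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self.handlers[event_type].append(handler)

    def publish(self, event_type, payload):
        for handler in list(self.handlers[event_type]):
            handler(payload)


bus = Bus()
created = {}  # correlation_id -> user_id, kept by the requesting service


def creator_on_register_user(cmd):
    # The service that owns users creates the user and announces it.
    user_id = f"user-{cmd['name']}"
    bus.publish(
        "UserCreated",
        {"correlation_id": cmd["correlation_id"], "user_id": user_id},
    )


def requester_on_user_created(event):
    # The requesting service matches the event to its pending request.
    created[event["correlation_id"]] = event["user_id"]


bus.subscribe("RegisterUser", creator_on_register_user)
bus.subscribe("UserCreated", requester_on_user_created)

correlation_id = str(uuid.uuid4())
bus.publish("RegisterUser", {"correlation_id": correlation_id, "name": "alice"})
```

The correlation id is what lets the requester connect the later UserCreated event to the request it sent, which is the part a synchronous call would otherwise give you for free.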

Is there a specification for commands similar to events in OpenAPI?

I am building a microservice application, and I try to follow best practices. I use event sourcing and event-driven state transfer in many places, but I realized that sometimes I just need to call another service asynchronously to ask it to do something, e.g. send out a registration email (as the email service is a technical component and not a domain). I noticed that services often just call the other service's API endpoint, but that wouldn't be asynchronous. As I don't expect any return value when calling from another service (the command would only produce events), RPC is not necessary.
In the end, my plan is to implement commands/actions that can be triggered by clients from the REST API (in which case the commands may also produce responses), or by events or other services via RabbitMQ or similar. This leads me to the question of how I should define the data structure of the command/action: is there any specification for that? Or existing solutions for Python? Or should I do something differently?
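To make the question concrete, this is roughly the kind of command structure I have in mind. It is only a hand-rolled sketch using the standard library, not an existing specification, and the field names and the SendRegistrationEmail command are placeholders:

```python
import json
import uuid
from dataclasses import dataclass, field, asdict


@dataclass(frozen=True)
class Command:
    name: str                 # what the receiver is asked to do
    payload: dict             # command-specific data
    command_id: str = field(default_factory=lambda: str(uuid.uuid4()))

    def to_json(self):
        """Serialize for sending over REST or a message queue."""
        return json.dumps(asdict(self), sort_keys=True)

    @staticmethod
    def from_json(raw):
        """Rebuild the command on the receiving side."""
        return Command(**json.loads(raw))


cmd = Command("SendRegistrationEmail", {"email": "user@example.com"})
wire = cmd.to_json()
restored = Command.from_json(wire)
```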

Sharing particular data between realms

I am about to start working on the back-end for a mobile app (initially iOS/Android, later also website) and I am thinking whether Realm could fulfill all my needs.
The basic idea is that there are two types of users - customers and service-providers. The customers send requests to the server once in a while and are subscribed (real-time) for any event that might occur in relation to this request in the future. Each service-provider is listening for specific requests from all customers and is the one who is going to trigger various events (send data) for each of those requests.
From the Realm docs, it is obvious that the real-time data sync is not going to be a problem. The thing I am concerned about is how to model the scenario (customer/service-provider) in the Realm 'world'. Based on what I read, it is preferred to have one realm per user. Therefore, I suppose the user will register and will be given a realm. Then whenever he makes a request, it is going to be stored in his realm. Now the question is how to model the service-provider. There are going to be various service-providers each responding (triggering various kinds of events up to one hour after request) to different kinds of requests. (Each user can send any request and therefore be served by any service-provider.)
I read that Realm supports data sharing among different realms, which could be a partial solution for this problem; however, I was not able to find out whether this 'sharing' could be limited to particular requests. (Meaning each service-provider will get only the requests intended for him.)
My question is whether this scenario is doable using Realm?
This sounds like a perfect fit for Realm's server-side event-handling. Put simply, Realm offers the ability through our Node SDK to listen for changes across Realms on the server.
So in your example, where each mobile user would have their own Realm, the URL for this would be /~/myRealm, in which the tilde represents the Realm user ID. The Node SDK event handling API allows you to register a JS function that will run in response to changes in Realms whose URLs match a Regex pattern. In this case you could use: ^/([0-9a-f]+)/myRealm so that any time any user's myRealm is updated, the server can perform some logic.
In this manner, the server via the Node SDK is really a "super-user" or service-provider as you describe. When an event fires, the JS function that runs is provided the Realm that was updated and a list of indexes pertaining to the objects in the Realm that were inserted, deleted, or modified. You can then perform any logic in JS, such as using the changed data to call out to another API or opening the Realm in question or any other and writing changes which will get pushed back out to the respective clients.
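To see what that URL pattern does and does not match, here is a quick check using Python's re module. This only exercises the regular expression from above and is not Realm SDK code; the sample paths and user IDs are made up:

```python
import re

# The pattern quoted above; the capture group is the Realm user ID.
REALM_PATTERN = re.compile(r"^/([0-9a-f]+)/myRealm")


def user_id_for(realm_path):
    """Return the user ID if the path matches the pattern, else None."""
    match = REALM_PATTERN.match(realm_path)
    return match.group(1) if match else None


a = user_id_for("/3f9a0c/myRealm")     # a user's myRealm
b = user_id_for("/3f9a0c/otherRealm")  # same user, different Realm
c = user_id_for("/~/myRealm")          # unexpanded tilde is not a hex ID
```

Note that the pattern has no trailing anchor, so a path that merely starts with /<id>/myRealm would match as well.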
The full server-side event handling is part of Realm Professional Edition, but we recently released another way to interact with this called Realm Functions. This provides the ability through the server's dashboard to create the same JS functions that will run in response to changes across Realms. The developer edition supports 3 functions so you can try it out immediately!

How can I create a queue with multiple workers?

I want to create a queue where clients can put in requests, then server worker threads can pull them out as they have resources available.
I'm exploring how I could do this with a Firebase repository, rather than an external queue service that would then have to inject data back into Firebase.
With security and validation tools in mind, here is a simple example of what I have in mind:
user pushes a request into a "queue" bucket
a server pulls out the request and deletes it (how do I ensure only one server gets the request?)
server validates data and retrieves from a private bucket (or injects new data)
server pushes data and/or errors back to the user's bucket
A simplified example of where this might be useful would be authentication:
user puts authentication request into the public queue
his login/password goes into his private bucket (a place only he can read/write into)
a server picks up the authentication request, retrieves login/password, and validates against the private bucket only the server can access
the server pushes a token into user's private bucket
(certainly there are still some security loopholes in a public queue; I'm just exploring at this point)
Some other examples for usage:
read-only status queue (user status is communicated via private bucket, server writes it to a public bucket which is read-only for the public)
message queue (messages are sent via user, server decides which discussion buckets they get dropped into)
So the questions are:
Is this a good design that will integrate well into the upcoming security plans? What are some alternative approaches being explored?
How do I get all the servers to listen to the queue, but only one to pick up each request?
Wow, great question. This is a usage pattern that we've discussed internally so we'd love to hear about your experience implementing it (support#firebase.com). Here are some thoughts on your questions:
Authentication
If your primary goal is actually authentication, just wait for our security features. :-) In particular, we're intending to have the ability to do auth backed by your own backend server, backed by a Firebase user store, or backed by 3rd-party providers (Facebook, Twitter, etc.).
Load-balanced Work Queue
Regardless of auth, there's still an interesting use case for using Firebase as the backbone for some sort of workload balancing system like you describe. For that, there are a couple approaches you could take:
As you describe, have a single work queue that all of your servers watch and remove items from. You can accomplish this using transaction() to remove the items. transaction() deals with conflicts so that only one server's transaction will succeed. If one server beats a second server to a work item, the second server can abort its transaction and try again on the next item in the queue. This approach is nice because it scales automatically as you add and remove servers, but there's an overhead for each transaction attempt since it has to make a round-trip to the Firebase servers to make sure nobody else has grabbed the item from the queue already. If the time it takes to process a work item is much greater than the time to do a round-trip to the Firebase servers, this overhead probably isn't a big deal. If you have lots of servers (i.e. more contention) and/or lots of small work items, the overhead may be a killer.
Push the load-balancing to the client by having them choose randomly among a number of work queues. (e.g. have /queue/0, /queue/1, /queue/2, /queue/3, and have the client randomly choose one). Then each server can monitor one work queue and own all of the processing. In general, this will have the least overhead, but it doesn't scale as seamlessly when you add/remove servers (you'll probably need to keep a separate list of work queues that servers update when they come online, and then have clients monitor the list so they know how many queues there are to choose from, etc.).
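The first approach boils down to an atomic "claim" on each item. The sketch below is not Firebase code: a lock-protected in-memory Store stands in for the conflict handling that transaction() provides, and the Store class, worker function and job names are invented for illustration:

```python
import threading


class Store:
    """In-memory stand-in for a shared store with an atomic claim operation."""

    def __init__(self, items):
        self._items = dict(items)
        self._lock = threading.Lock()

    def claim(self, key):
        """Atomically remove and return the item, or None if already taken."""
        with self._lock:
            return self._items.pop(key, None)


def worker(name, store, keys, results):
    for key in keys:
        item = store.claim(key)
        if item is not None:  # this worker won the race for the item
            results.append((name, key))
        # otherwise another worker got it; move on to the next item


keys = [f"job{i}" for i in range(20)]
store = Store({key: i for i, key in enumerate(keys)})
results = []
threads = [
    threading.Thread(target=worker, args=(f"w{n}", store, keys, results))
    for n in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
claimed = sorted(key for _, key in results)
```

Every item ends up claimed exactly once, but each failed claim is a wasted attempt, which corresponds to the per-transaction round-trip overhead described in the first option.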
Personally, I'd lean toward option #2 if you want optimal performance. But #1 might be easier for prototyping and be fine at least initially.
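The client side of option #2 is tiny. A sketch, assuming the client already knows how many queues exist (the pick_queue helper is hypothetical; the /queue/N paths follow the example above):

```python
import random
from collections import Counter


def pick_queue(queue_count, rng=random):
    """Choose one of the work queues uniformly at random."""
    return f"/queue/{rng.randrange(queue_count)}"


rng = random.Random(0)  # seeded only to make the demo repeatable
picks = Counter(pick_queue(4, rng) for _ in range(1000))
```

With a uniform choice the requests spread roughly evenly, so each server owning one queue sees a similar share of the load.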
In general, your design is definitely on the right track. If you experiment with implementation and run into problems or have suggestions for our API, let us know (support#firebase.com :-)!
This question is pretty old but in case someone makes it here anyway...
Since mid 2015 Firebase offers something called the Firebase Queue, a fault-tolerant multi-worker job pipeline built on Firebase.
Q: Is this a good design that will integrate well into the upcoming security plans?
A: Your design suggestion fits perfectly with Firebase Queue.
Q: How do I get all the servers to listen to the queue, but only one to pick up each request?
A: Well, that is pretty much what Firebase Queue does for you!
References:
Introducing Firebase Queue (blog entry)
Firebase Queue (official GitHub-repo)

List currently logged in users in BlazeDS

I really need some expert help here! Does anybody know of a way to get the list of users currently connected to a BlazeDS server? Is there any built-in mechanism for this? Or do I have to implement some kind of server-side logic that runs every time a user accesses my Flex application, stores the details of all logged-in users somewhere, and retrieves them later?
The nearest thing I can think of, using BlazeDS, is to obtain a list of clients currently subscribed to a destination, but this won't solve the problem IMO.
First of all, you need to define a destination and make sure that all clients actually subscribe to it (see the BlazeDS documentation for this). Then, on the server, you can get a reference to the message service:
// Look up the message service by the id it is configured under (here "message-service")
MessageService messageService = (MessageService) messageBroker.getService("message-service");
and ask for all subscribers with the getSubscriberIds method on the MessageService instance, specifying the name of your destination. This only returns a set of identifiers generated internally by BlazeDS (they are also available on the client side of the connection).
To solve the same problem, I used this approach in combination with custom server-side logic that stores logged-in users (explicitly invoked login/logout methods). Regularly looking at the subscribers can help to clean this store, because in my experience there's no way to be sure that a "logout" method will always be called successfully, especially from a Flex client running in the browser, while BlazeDS will automatically take care of cleaning up the subscribers.
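The cleanup step is language-neutral, so here is a small sketch of it in Python rather than Java. It assumes the login store maps BlazeDS client identifiers to user names, which is my own choice of layout, and the prune_logged_in helper is made up:

```python
def prune_logged_in(logged_in, subscriber_ids):
    """Keep only the users whose client id is still subscribed."""
    active = set(subscriber_ids)
    return {
        client_id: user
        for client_id, user in logged_in.items()
        if client_id in active
    }


logged_in = {"c1": "alice", "c2": "bob", "c3": "carol"}  # filled by login()
still_here = prune_logged_in(logged_in, ["c1", "c3"])    # c2 vanished without logout
```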
I don't much like this approach; someone has probably come up with a better solution.
