Dynamic Kafka Consumer using functional paradigm - spring-kafka

I have a particular requirement where I want to collect messages from a topic for a specified duration (for example, 40 seconds), but only on demand (so, start a consumer for 40 seconds when asked to and then stop it).
I came across examples of creating consumers dynamically using
DefaultKafkaConsumerFactory
ConcurrentKafkaListenerContainerFactory
Firstly, I would like to know if it's possible to do the same using KStream + the functional paradigm (i.e. using the (Bi)Function and Consumer interfaces)?
Secondly, what should I do if another call comes in to start collecting messages while the first duration hasn't finished? Create a new container (with a unique group id, of course)?
Lastly, when the collect duration has expired, I have no use for this container anymore. I could stop the container, but what then? Can it be reused when a new request comes in?

It looks like you are using the Kafka Streams binder. If so, you can control the way the processors are started programmatically.
See these sections from the reference docs for more details.
https://docs.spring.io/spring-cloud-stream/docs/current/reference/html/spring-cloud-stream-binder-kafka.html#_manually_starting_kafka_streams_processors
https://docs.spring.io/spring-cloud-stream/docs/current/reference/html/spring-cloud-stream-binder-kafka.html#_manually_starting_kafka_streams_processors_selectively
You can also stop/start the bindings using a REST endpoint: https://docs.spring.io/spring-cloud-stream/docs/current/reference/html/spring-cloud-stream-binder-kafka.html#_binding_visualization_and_control_in_kafka_streams_binder
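To make that concrete for the "40 seconds on demand" requirement, here is a minimal sketch. It assumes a functional binding named `collect` whose auto startup has been disabled as described in the first linked section; the `&stream-builder-<function>` bean name is the convention those docs describe for retrieving the binder-created StreamsBuilderFactoryBean:

```java
import java.time.Duration;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

import org.springframework.context.ApplicationContext;
import org.springframework.kafka.config.StreamsBuilderFactoryBean;
import org.springframework.stereotype.Service;

@Service
public class TimedCollector {

    private final ApplicationContext context;
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    public TimedCollector(ApplicationContext context) {
        this.context = context;
    }

    // Start the "collect" processor on demand and stop it when the window expires.
    public void collectFor(Duration window) {
        StreamsBuilderFactoryBean factoryBean = context.getBean(
                "&stream-builder-collect", StreamsBuilderFactoryBean.class);
        if (!factoryBean.isRunning()) {
            factoryBean.start();
        }
        scheduler.schedule(factoryBean::stop, window.getSeconds(), TimeUnit.SECONDS);
    }
}
```

A stopped StreamsBuilderFactoryBean can be started again on the next request, which also speaks to the reuse question, though I would verify the restart behaviour against your binder version.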

Related

Managing a connection to an external component in an Aggregate

I just discovered Axon framework and am very eager to use it as it answers a lot of my questions and concerns about DDD.
The first project I'd like to use it on contains small cameras which are controlled via the GigEVision protocol (over TCP and UDP for the control and stream channels). I think my problem would be the same for any case where we maintain a connection to an external component or more generally we want to link an external component lifecycle to Axon's lifecycles.
I'd like to have an Aggregate named Camera to which I can send Commands to grab 1 image or start grabbing N images at a certain FPS.
What I'm not sure about is how to manage the connection to an external component in my Aggregate.
- Should I inject the client to my camera into my Camera Aggregate and consider connecting to it as part of my protocol / business commands? In this case, how would I link the camera lifecycle (a camera gets disconnected all of a sudden) to the aggregate lifecycle (creating a corresponding CameraDisconnectedEvent)?
- Should the connection be handled in a sidecar Saga which gets the camera client injected, the Saga starting on a ConnectionRequestedEvent and stopping as soon as we get a connection error from the camera? I would have the same issue of linking the connection lifecycle to the Saga's lifecycle, I think.
- Am I leaking implementation details into the business layer, and should I manage the issue another way?
- Am I just using the wrong tool for this job and should not try to force it into Axon?
Thank you very much in advance, hope my message and issues make sense.
Best regards,
First and foremost, what you should do is ensure that the language spoken by the GigEVision protocol by no means leaks into your other domain.
These two should be separate and remain so, as they cover different concerns.
This brings to light the necessity to have a translation layer of some sort.
More formally, this is called a context mapping. From a DDD perspective, you would take this even further by talking about an Anti-Corruption Layer.
The name already says it, you add a layer to ensure you are not corrupting your domain logic with that from another domain.
Another useful topic to read up on here would be the notion of a Bounded Context.
I digress though, let's go back to the problem at hand: where to position this anti-corruption layer.
What is currently unclear to me is what domain requirements dictate that the connection be maintained the entire time when requesting images.
Is the command you want to send requesting for a live feed? Or just "some" images from a given time frame?
In both scenarios, I am not immediately convinced that either of these operations requires validation through a single Camera aggregate, to be honest.
You could still model this in a command and event format, as the messaging paradigm is very helpful to allow clean segregation of concerns.
But given the current description, I am uncertain whether you need DDD's idea of an Aggregate to model this at all.
I might be wrong on this note, but I just don't know enough about your domain at this stage.
That's my two cents to the situation, hoping this helps!
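To make the anti-corruption layer idea a bit more tangible, here is a minimal sketch of what such a translation layer could look like. GigEVisionClient and ReportCameraDisconnectedCommand are hypothetical names invented for illustration; only CommandGateway is actual Axon API:

```java
import java.util.function.Consumer;

import org.axonframework.commandhandling.gateway.CommandGateway;
import org.springframework.stereotype.Component;

// Hypothetical protocol client: the only GigEVision-facing type the adapter touches.
interface GigEVisionClient {
    void onDisconnect(Consumer<String> handler);
}

// Domain command, deliberately free of any GigEVision types.
record ReportCameraDisconnectedCommand(String cameraId) {}

// The adapter speaks the protocol's language on one side and the domain's
// language on the other, so neither leaks into the other.
@Component
public class CameraProtocolAdapter {

    public CameraProtocolAdapter(CommandGateway commandGateway, GigEVisionClient client) {
        // Translate a raw protocol-level disconnect into a domain command.
        client.onDisconnect(cameraId ->
                commandGateway.send(new ReportCameraDisconnectedCommand(cameraId)));
    }
}
```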

Can I create a flow to migrate states to a new version instead of using contract upgrade?

When I upgrade a contract there are two steps, Authorise and Initiate. And since both steps need to be done one by one per state (as I understand it), it takes a very long time when I have a large amount of data.
I ended up looping an API call to query some amount of data and then calling ContractUpgradeFlow on the states one by one.
The result is that it took more than 11 hours and still had not finished upgrading.
So the question is: what if I create a flow A that queries a list of StateV1 as input and creates a list of StateV2 as output?
Would it reduce the process for contract upgrade?
Should it be faster?
Would this be considered to give the same result as a contract upgrade?
Would there be any effect on the next contract upgrade for StateV2 if I then want to use the Corda contract upgrade instead of flow A?
Yes, correct: with an explicit upgrade, if there is a lot of data it is going to take time, as a lot of things are happening behind the scenes.
Each and every unconsumed state is taken, a new transaction is created, the old state with the old contract and the new state with the new contract are added to this transaction, the transaction is sent to each signer for signing, appropriate constraints are set, and finally the entire signed transaction is sent to the notary.
“So the question is: what if I create a flow A that queries a list of StateV1 as input and creates a list of StateV2 as output?”
Yes, you can very well create a flow that queries a list of StateV1 as input and creates a list of StateV2 as output, but keep in mind you will also have to take care of all the steps mentioned above, which are currently handled by the ContractUpgradeFlow.
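For illustration, a minimal sketch of what such a flow A could look like. StateV1, StateV2, and StateV2Contract are hypothetical names, and the sketch deliberately skips the signer collection and constraint handling from the steps above, which a real migration would still need:

```java
import java.util.Collections;
import java.util.List;

import co.paralleluniverse.fibers.Suspendable;
import net.corda.core.contracts.StateAndRef;
import net.corda.core.flows.FinalityFlow;
import net.corda.core.flows.FlowException;
import net.corda.core.flows.FlowLogic;
import net.corda.core.flows.InitiatingFlow;
import net.corda.core.flows.StartableByRPC;
import net.corda.core.identity.Party;
import net.corda.core.transactions.SignedTransaction;
import net.corda.core.transactions.TransactionBuilder;

@InitiatingFlow
@StartableByRPC
public class MigrateStatesFlow extends FlowLogic<SignedTransaction> {

    @Suspendable
    @Override
    public SignedTransaction call() throws FlowException {
        // Query all unconsumed StateV1 states from the vault.
        List<StateAndRef<StateV1>> oldStates =
                getServiceHub().getVaultService().queryBy(StateV1.class).getStates();

        Party notary = getServiceHub().getNetworkMapCache().getNotaryIdentities().get(0);
        TransactionBuilder builder = new TransactionBuilder(notary);

        // Consume each V1 state and emit the corresponding V2 state.
        for (StateAndRef<StateV1> input : oldStates) {
            builder.addInputState(input);
            // StateV2.fromV1 is a hypothetical conversion helper.
            builder.addOutputState(StateV2.fromV1(input.getState().getData()),
                    StateV2Contract.ID);
        }
        builder.addCommand(new StateV2Contract.Commands.Migrate(),
                getOurIdentity().getOwningKey());
        builder.verify(getServiceHub());

        SignedTransaction stx = getServiceHub().signInitialTransaction(builder);
        // No counterparty sessions in this simplified, self-signed case.
        return subFlow(new FinalityFlow(stx, Collections.emptyList()));
    }
}
```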
“Would it reduce the process for contract upgrade?”
No, I don’t think so, as you will have to handle all the steps mentioned above, which are currently handled by the ContractUpgradeFlow.
“Should it be faster?”
No, it will take roughly the same time as the ContractUpgradeFlow, since the same work has to happen either way.

fetch 1 message at a given offset when debugging an app using spring-kafka

I understand that random access is inefficient. But the app failed, the record content is (let's assume it) too big for logging or otherwise inappropriate (that's true even without the assumption), so all I have is the info that the record at this offset failed and why. Good, now I want to see the data, say, to be able to reproduce the failure. How do I do that?
OK, I can use a ConsumerSeekAware consumer, but that will rewind the position and process all records from that position on. I don't want that; I want just one specific message. I can use a dedicated consumer in a dedicated consumer group for this use case so as not to influence others, and set ConsumerConfig.MAX_POLL_RECORDS_CONFIG to 1 so that each poll returns just one record, but this will not stop all subsequent records from reaching the listener, since there is no way to call poll manually, programmatically. Right? Or is there such a way? Or another way to achieve this? Even if I try to reach into spring-kafka internals, the org.apache.kafka.clients.consumer.Consumer seems to be made inaccessible on purpose, or at least I do not see a way.
Yes, you can just create your own consumer manually and poll it.
Get a reference to the consumer factory and call `createConsumer("tempGroup", "tempClient")`.
You would need to create a second consumer factory with max.poll.records=1.
You can copy the other properties from the main factory by calling getConfigurationProperties(), creating a new map from it, and using that map to create a new DefaultKafkaConsumerFactory.
Close the consumer when you are done.
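Putting those steps together, a minimal sketch (the topic, partition, and offset values are placeholders for the ones from your failure log):

```java
import java.time.Duration;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.clients.consumer.Consumer;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.common.TopicPartition;
import org.springframework.kafka.core.ConsumerFactory;
import org.springframework.kafka.core.DefaultKafkaConsumerFactory;

public class OffsetDebugger {

    // consumerFactory is the application's existing factory (e.g. autowired);
    // topic, partition and offset come from your failure log.
    public static void dumpRecord(ConsumerFactory<String, String> consumerFactory,
                                  String topic, int partition, long offset) {
        Map<String, Object> props = new HashMap<>(consumerFactory.getConfigurationProperties());
        props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, 1); // one record per poll

        DefaultKafkaConsumerFactory<String, String> debugFactory =
                new DefaultKafkaConsumerFactory<>(props);
        try (Consumer<String, String> consumer =
                     debugFactory.createConsumer("tempGroup", "tempClient")) {
            TopicPartition tp = new TopicPartition(topic, partition);
            consumer.assign(Collections.singletonList(tp)); // manual assignment, no rebalance
            consumer.seek(tp, offset);                      // jump straight to the failed offset
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(5));
            records.forEach(r -> System.out.println(r.offset() + ": " + r.value()));
        }
    }
}
```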

Best approach to wait until all service calls returned values in Flex PureMVC

I am writing an Adobe AIR application using PureMVC.
Imagine that I have a page-based application view (using a ViewStack), and the user navigates through these pages in some way (like clicking a button or whatever).
Now, for example, I have an Account Information page which, when instantiated or shown again, needs to load data from a WebService (for example email, account balance and username), and when the data is returned I want to show it on my Account Information page in the proper labels.
The problem is that when I execute these three web calls, each of them will return a different ResultEvent at a different time. I am wondering what the best way is to know that ALL of the service calls have returned results, so that I can finally show all the results at once (and maybe play some loading screen until then).
I really don't know much about PureMVC, but the as3commons-async library is great for managing async calls and should work just fine in any framework setup:
http://as3commons.org/as3-commons-async/
In your case, you could create three classes implementing IOperation or IAsyncCommand (depending on whether you plan to execute the operations immediately or deferred), encapsulating your RPCs.
After that is done you simply create a new CompositeCommand and add the operations to its queue.
When all is done, the CompositeCommand will fire an OperationEvent.COMPLETE.
BTW, the library even includes some pre-implemented common Flex operations, such as HTTPRequest, when you download the as3commons-async-flex package as well.
I would do it in this way:
Create a proxy for each of three information entities (EMailProxy, BalanceProxy, UsernameProxy);
Create a delegate class which handles the interaction with your WebService (something like "public class WSConnector implements IResponder{...}"), which is used by the proxies to call the end ws-methods;
Create a proxy which coordinates all the three results (CoordProxy);
Choose a mediator which will coordinate all three calls (for example, this could be done by your ApplicationMediator);
Create notification constants for all proxy results (GET_EMAIL_RESULT, GET_BALANCE_RESULT, GET_USERNAME_RESULT, COORD_RESULT);
Let the ApplicationMediator get all 4 notifications;
It is important that you not only wait for all three results but also be ready for errors and their interpretation. That is why a simple counter could be too weak.
The overall workflow could look like this:
The user initiates the process;
Some mediator gets an event from your GUI component and sends a notification like DO_TRIPLECALL;
The ApplicationMediator catches this notification, drops the state of the CoordProxy and calls all 3 methods from your proxies (getEMail, getBalance, getUsername).
The responses are coming asynchronously. Each proxy gets its response from the delegate, changes its own data object and sends an appropriate notification.
The ApplicationMediator catches those notifications and changes the state of the CoordProxy. When all three responses are there (maybe not all successful), the CoordProxy sends a notification with the overall result.
I know it is not the best approach to do such an interaction through mediators. The initial idea was to use commands for all "business logic" decisions, but it can be too tedious to create all that bureaucracy.
I hope it can help you. I would be glad to know your solution and discuss it here.
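For comparison, the underlying "wait for all results, then show them" idea, expressed outside of Flex/PureMVC as a Java sketch (the fetch and show methods are stand-ins for the three web calls and the UI update):

```java
import java.util.concurrent.CompletableFuture;

public class AccountInfoLoader {

    public void loadAll() {
        // Fire all three calls; each completes independently, at its own time.
        CompletableFuture<String> email = CompletableFuture.supplyAsync(this::fetchEmail);
        CompletableFuture<String> balance = CompletableFuture.supplyAsync(this::fetchBalance);
        CompletableFuture<String> username = CompletableFuture.supplyAsync(this::fetchUsername);

        // Completes only once all three calls have completed; join() is then safe.
        CompletableFuture.allOf(email, balance, username)
                .thenRun(() -> show(email.join(), balance.join(), username.join()))
                .exceptionally(ex -> { showError(ex); return null; }); // partial failures
    }

    private String fetchEmail() { return "user@example.com"; }
    private String fetchBalance() { return "42.00"; }
    private String fetchUsername() { return "user1"; }
    private void show(String email, String balance, String username) { /* update labels */ }
    private void showError(Throwable ex) { ex.printStackTrace(); }
}
```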

What's the best way to create/use an ID throughout the processing of a message in Biztalk?

Our program so far: We have a process that involves multiple schemata, orchestrations and messages sent/received.
Our desire: To have an ID that links the whole process together when we log our progress into a SQL server table.
So far, we have a table that logs our progress, but when there are multiple messages it is very difficult to read, since BizTalk will sometimes process certain messages out of order.
E.g., we could have:
1 Beginning process for client1
2 Second item for client1
3 Third item for client1
4 Final item for client1
Easily followed if there's only one client being updated at a time. On the other hand, this will be much more likely:
1 Beginning process for client1
2 Beginning process for client2
3 Second item for client2
4 Third item for client2
5 Second item for client1
6 Third item for client1
7 Final item for client1
8 Final item for client2
It would be nice to have an ID throughout the whole thing so that the last listing could be ordered by this ID field.
What is the best and/or quickest way to do this? We had thought to add an ID, which we would create at the moment the first orchestration is triggered, and keep passing that value to all the schemata and later orchestrations. This seems like a lot of work and would require us to modify all the schemata, which just seems wrong.
Should we even be wanting to have such an ID? Any other solutions that come to mind?
This may not exactly be the easiest way, but have you looked at this:
http://blogs.msdn.com/b/appfabriccat/archive/2010/08/30/biztalk-application-tracing-made-easy-with-biztalk-cat-instrumentation-framework-controller.aspx
Basically it's an instrumentation framework which allows you to event out from pipelines, maps, orchs, etc.
When you write out to the event trace you can use a "business key" which will tie multiple events together in a chain, similar to what you are describing.
Available here
http://btscatifcontroller.codeplex.com/
I'm not sure I fully understand all the details of your specific setup, but here goes:
If you can correlate the messages from the same client into a "long running" orchestration (which waits for subsequent messages from the same client), then the orchestration will have an automatically assigned ServiceId Guid, which will be kept throughout the orchestration.
As you say, for correlation purposes, you would usually try and use natural keys within the existing incoming message schemas to correlate subsequent messages back to the running orchestration - this way you don't need to change the schemas. In your example, ClientId might be a good correlation, provided that the same client cannot send multiple message 'sets' simultaneously. (and worst case, if you do add a new correlation key to the schemas, all systems involved in the orchestration will need to be changed to 'remember' this key and return it to you.) Again, assuming ClientId as a correlation key, in your example, 2 orchestrations would be running simultaneously - one for Client 1 and one for Client 2
However, for scalability and version control reasons, (very) long running orchestrations are generally to be avoided unless they are absolutely necessary (e.g. unless you can only trigger a process once all 4 client messages are received). If you decide to keep each message as a separate orchestration, or just mapped and filtered on a port, another way to 'track' the sets of messages is by using BAM - you can use a continuation to tie all the client messages back together, e.g. for the purpose of a report or such.
Take a look at BAM. It's designed to do exactly what you describe: Using Business Activity Monitoring
This book has got a very good chapter about BAM and this tool, by one of the authors of the book, can help you developing your BAM solution. And finally, a nice BAM Poster.
Don't be put off by the initial complexity. When you get your head around it, BAM it's one of the coolest features of BizTalk.
Hope this helps. Good luck.
BizTalk assigns various values in the message context that usually persist for the life of the processing of that message, such as the initial MessageID. Will that work for you?
In our application we have to use an externally provided ID (from the customer). We have a multi-part message with this id in part of it. You might consider that as well
You could create a UniqueId and StepId and pass them around in the message context. When a new process for a client starts set UniqueId to a Guid and StepId to 1. As it gets passed to the next process increment the StepId.
This would allow you to query events, grouped by client id and in the order (stepId) the event happened.
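Platform specifics aside, the UniqueId/StepId idea from this last answer boils down to something like the following sketch (shown in Java for illustration; in BizTalk itself the two values would travel in the message context):

```java
import java.util.UUID;
import java.util.concurrent.atomic.AtomicInteger;

public final class ProcessTrace {
    private final UUID correlationId = UUID.randomUUID(); // minted once per process
    private final AtomicInteger step = new AtomicInteger(0);

    // Stamp every log row with these two values; rows can then be grouped by
    // correlationId and ordered by stepId, e.g.
    //   SELECT ... FROM ProgressLog ORDER BY CorrelationId, StepId
    public UUID correlationId() { return correlationId; }
    public int nextStep() { return step.incrementAndGet(); } // incremented at each hop
}
```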
