Spring Boot Actuator Health Groups with nested Indicators - spring-boot-actuator

I'm trying to configure Actuator's health probes to include checks for an external service that is nested beyond the first level. For example, when calling /actuator/health, these are the available health indicators:
{
  "status": "DOWN",
  "components": {
    "jms": {
      "status": "DOWN",
      "components": {
        "broker1": {
          "status": "DOWN",
          "details": {
            "error": "javax.jms.JMSException: Failed to create session factory"
          }
        },
        "broker2": {
          "status": "UP",
          "details": {
            "provider": "ActiveMQ"
          }
        }
      }
    },
    "livenessState": {
      "status": "UP"
    },
    "readinessState": {
      "status": "UP"
    }
  },
  "groups": [
    "liveness",
    "readiness"
  ]
}
Under the jms component, there are two brokers - broker1 and broker2. I can configure Actuator to include jms in the readiness group like:
management:
  endpoint:
    health:
      probes:
        enabled: true
      enabled: true
      show-details: always
      group:
        readiness:
          include: readinessState, jms
But, this will include all brokers in the readiness probe.
When calling /actuator/health/readiness, I get:
{
  "status": "DOWN",
  "components": {
    "jms": {
      "status": "DOWN",
      "components": {
        "broker1": {
          "status": "DOWN",
          "details": {
            "error": "javax.jms.JMSException: Failed to create session factory"
          }
        },
        "broker2": {
          "status": "UP",
          "details": {
            "provider": "ActiveMQ"
          }
        }
      }
    },
    "readinessState": {
      "status": "UP"
    }
  }
}
Since the readiness probe in Kubernetes will only prevent routing web requests to my pod, I have cases where I can process the request if broker1 is down as long as broker2 is up. Is there a way to configure Actuator to include nested health indicators in a Health Group instead of just the root ones? I tried combinations like broker1, jms.broker1, jms/broker1, jms\broker1 to no avail.
If it is not directly supported through configuration, is there a custom component I could create that would give me the desired behavior? For example, I thought of the possibility of writing a custom CompositeHealthContributor, but I am not sure if I can aggregate existing health indicators. I do not want to replicate the health checks that are already being done.
One other related use case is to consider the service as healthy as long as one of a group of external resources is available. For example, I have an identical broker in two data centers. As long as one of the brokers is available, then my service could be considered healthy. What would be a good approach for this use case?

I believe that as long as you expose nested indicators as Health objects, you will get the desired result:
@Component
public class CustomHealthIndicator implements ReactiveHealthIndicator {

    private final List<CustomContributor> customContributors;

    public CustomHealthIndicator(List<CustomContributor> customContributors) {
        Assert.notNull(customContributors, "At least one contributor must be available");
        this.customContributors = customContributors;
    }

    @Override
    public Mono<Health> health() {
        return checkServices()
                .onErrorResume(ex -> Mono.just(new Health.Builder().down(ex).build()));
    }

    private Mono<Health> checkServices() {
        return Mono.fromSupplier(() -> {
            final Builder builder = new Builder();
            AtomicBoolean partial = new AtomicBoolean();
            this.customContributors.parallelStream()
                    .forEach(customContributor -> {
                        final Object status = customContributor.getStatus();
                        final boolean isAnErrorState = status instanceof Throwable;
                        partial.set(partial.get() || isAnErrorState);
                        if (isAnErrorState) {
                            // expose the failing contributor as a nested DOWN Health entry
                            builder.withDetail(customContributor.getName(),
                                    new Builder().down().withException((Throwable) status).build());
                        } else {
                            // expose the healthy contributor as a nested UP Health entry
                            builder.withDetail(customContributor.getName(),
                                    new Builder().up().withDetail("serviceDetail", status).build());
                        }
                    });
            // overall status is DOWN as soon as any contributor reported an error
            builder.status(partial.get() ? Status.DOWN : Status.UP);
            return builder.build();
        });
    }
}
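For the other use case (treating the service as healthy as long as at least one broker is reachable), the same idea can be applied with an "any up" policy: reuse the existing checks and only report DOWN when every broker fails. Below is a minimal, non-reactive sketch; the ConnectionFactory bean names broker1ConnectionFactory/broker2ConnectionFactory and the explicit JmsHealthIndicator wiring are assumptions about your setup, not something Actuator derives automatically:

import java.util.LinkedHashMap;
import java.util.Map;

import javax.jms.ConnectionFactory;

import org.springframework.boot.actuate.health.Health;
import org.springframework.boot.actuate.health.HealthIndicator;
import org.springframework.boot.actuate.health.Status;
import org.springframework.boot.actuate.jms.JmsHealthIndicator;
import org.springframework.stereotype.Component;

@Component("brokers")
public class AnyBrokerUpHealthIndicator implements HealthIndicator {

    private final Map<String, HealthIndicator> brokers = new LinkedHashMap<>();

    // assumption: these are the two ConnectionFactory beans behind broker1/broker2;
    // JmsHealthIndicator is the same check Boot already registers for a ConnectionFactory
    public AnyBrokerUpHealthIndicator(ConnectionFactory broker1ConnectionFactory,
            ConnectionFactory broker2ConnectionFactory) {
        this.brokers.put("broker1", new JmsHealthIndicator(broker1ConnectionFactory));
        this.brokers.put("broker2", new JmsHealthIndicator(broker2ConnectionFactory));
    }

    @Override
    public Health health() {
        Health.Builder builder = new Health.Builder();
        boolean anyUp = false;
        for (Map.Entry<String, HealthIndicator> entry : brokers.entrySet()) {
            Health health = entry.getValue().health(); // delegate to the existing check, no duplication
            builder.withDetail(entry.getKey(), health);
            anyUp = anyUp || Status.UP.equals(health.getStatus());
        }
        // UP as long as at least one broker answers; DOWN only when all of them fail
        return builder.status(anyUp ? Status.UP : Status.DOWN).build();
    }
}

The bean is exposed as a contributor named brokers, so the readiness group could reference it (e.g. include: readinessState, brokers) while the default jms composite remains visible in the full /actuator/health view.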

Related

Vertx: How to combine third party events with Standard Verticle and Worker Verticle style event send and reply approach

I am trying to solve an issue whereby I need to combine third party events with the eventBus send and reply approach that Vertx provides for Standard and Worker Verticle setups. I am not sure if what I have laid out below is necessarily the correct approach.
Problem Statement:
I want to have standard verticle that sends a message to a worker verticle.
The worker verticle does some preprocessing and then uses a client method provided by a third-party state management lib to publish an event (in an async manner). The result only indicates whether the event was successfully received (it does not contain any further info about processing, etc.).
Further processing takes place when the third-party state management lib receives the event (this all happens on a separate thread), and a success or failure can occur, at which point another event will be published to the cluster management tool's output channel.
From the output channel listener I then want to be able to use the event to somehow call message.reply() on the worker verticle and send a response back to the standard verticle that made the original request, thereby closing the loop of the entire request lifecycle while still using the async approach that Vertx is built around.
Now I conceptually know how to do 90% of what is described here but the missing piece for me is how to coordinate the event on the output channel listener and connect this to the worker verticle so that I can trigger the message.reply.
I have looked at possibly using SharedData and Clustering that Vertx has but was wondering if there is possibly another approach.
I have put a possible example implementation but would really appreciate if anyone has any insights/thoughts into how this can be accomplished and if I am on the right track.
class Order(val id: String)

class OrderCommand(val order: Order) : Serializable {
    companion object {
        const val name = "CreateOrderCommand"
    }
}

class SuccessOrderEvent(val id: String) : Serializable {
    companion object {
        const val name = "OrderSuccessfulEvent"
    }
}

interface StateManagementLib {
    fun <T> send(
        value: T,
        success: Handler<AsyncResult<Int>>,
        failure: Handler<AsyncResult<Exception>>
    ) {
        val output = publish(value)
        if (output == 1) {
            success.handle(Future.succeededFuture())
        } else {
            failure.handle(Future.failedFuture("Failed"))
        }
    }

    // non-blocking
    fun <T> publish(value: T): Int // returns success/failure only
}

class WorkVerticle constructor(private val lib: StateManagementLib) : AbstractVerticle() {

    override fun start(startPromise: Promise<Void>) {
        workerHandler()
        startPromise.complete()
    }

    private fun workerHandler() {
        val consumer = vertx.eventBus().consumer<OrderCommand>(OrderCommand.name)
        consumer.handler { message: Message<OrderCommand> ->
            try {
                vertx.sharedData().getClusterWideMap<String, Message<OrderCommand>>("OrderRequest") { mapIt ->
                    if (mapIt.succeeded()) {
                        lib.send(message.body(), {
                            // The StateManagementLib successfully propagated the event, so we store the message in this map (id -> Message)
                            mapIt.result().put(message.body().order.id, message)
                        }, {
                            message.fail(400, it.cause().message)
                        })
                    }
                }
            } catch (e: Exception) {
                message.fail(
                    HttpResponseStatus.INTERNAL_SERVER_ERROR.code(), "Failed to encode data."
                )
            }
        }

        // A consumer that will pick up an event that is received from the clusterTool output channel
        vertx.eventBus().addInboundInterceptor { context: DeliveryContext<SuccessOrderEvent> ->
            // Retrieve the cluster map to get the previously stored message and try to respond with Message.reply
            // This should go back to the Standard Verticle that sent the message
            vertx.sharedData().getClusterWideMap<String, Message<OrderCommand>>("OrderRequest") {
                if (it.succeeded()) {
                    val id = context.message().body().id
                    val mapResult = it.result().get(id)
                    it.result().remove(id)
                    // Try to reply so the original event-loop thread can pick up and respond to the calling client
                    mapResult.result().reply(id)
                }
            }
            context.next() // let the delivery continue to its actual consumer
        }
    }
}

Spring cloud stream - Content based routing

I am trying to use content-based routing in the latest version of Spring Cloud Stream. This document -> Content-based routing describes how to use it with the legacy StreamListener approach.
This is my code with StreamListener
@StreamListener(target = EventChannels.FILE_REQUEST_IN,
        condition = "headers['saga_request']=='FILE_SUBMIT'")
public void handleSubmitFile(@Payload FileSubmitRequest request) {
}

@StreamListener(target = EventChannels.FILE_REQUEST_IN,
        condition = "headers['saga_request']=='FILE_CANCEL'")
public void handleCancelFile(@Payload FileCancelRequest request) {
}
By using the condition, it was possible to route the message to two different functions.
I am trying to consume the message with the functional interface approach, as below.
@Bean
public Consumer<String> consumeMessage() {
    return event -> {
        try {
            LOGGER.info("Consumer is working: {}", event);
        } catch (Exception ex) {
            LOGGER.error("Exception while processing");
        }
    };
}
How can I achieve similar content-based routing in the functions? TIA.
Other details->
Spring boot version - 2.3.12.RELEASE
Spring cloud version - Hoxton.SR11
Have you seen this - https://docs.spring.io/spring-cloud-stream/docs/3.1.4/reference/html/spring-cloud-stream.html#_event_routing?
We provide two different routing models: TO and FROM. The linked section contains samples, so please look through it and feel free to post any follow-ups.
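To make the TO-style routing concrete under the functional model, a minimal sketch could look like the following. It routes on the saga_request header via the built-in functionRouter; the two consumer bean names, the destination, and the logger are assumptions about your project rather than taken from it:

// application properties (sketch):
//   spring.cloud.stream.function.routing.enabled=true
//   spring.cloud.function.routing-expression=headers['saga_request']=='FILE_SUBMIT' ? 'handleSubmitFile' : 'handleCancelFile'
//   spring.cloud.stream.bindings.functionRouter-in-0.destination=file-requests   // assumed input destination

@Bean
public Consumer<FileSubmitRequest> handleSubmitFile() {
    return request -> LOGGER.info("Submit file: {}", request);
}

@Bean
public Consumer<FileCancelRequest> handleCancelFile() {
    return request -> LOGGER.info("Cancel file: {}", request);
}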

What is the correct way of handling collect in Kotlin?

I'm getting user data from Firestore using MVVM. In the repository class I use:
fun getUserData() = flow {
    auth.currentUser?.apply {
        val user = ref.document(uid).get().await().toObject(User::class.java)
        user?.let {
            emit(Success(user))
        }
    }
}.catch { error ->
    error.message?.let { message ->
        emit(Failure(message))
    }
}
This method is called from the ViewModel class:
fun getUser() = repo.getUserData()
And in the activity class I use:
private fun getUser() {
    lifecycleScope.launch {
        viewModel.getUser().collect { data ->
            when (data) {
                is Success -> textView.text = data.user.name
                is Failure -> print(data.message)
            }
        }
    }
}
To display the name in the TextView. The code works fine. But is this the correct way of doing things? Or is it more correct to collect the data in the ViewModel class?
Any room for improvement? Thanks
My personal opinion is that data should be collected in the VM so it survives configuration changes.
The scope of the view (activity/fragment) should not drive your data flow.
The VM will outlive activities and fragments during configuration changes, so all the data you have collected and transformed will still be there (or in progress if it's still being obtained).
StateFlow is good (in this last step) because it has the ability to tell the VM: This is no longer needed, don't waste resources.
But I haven't yet used StateFlow in production code so there's that.

Spring cloud stream - Autowiring underlying Consumer for a given PollableMessageSource

Is it possible to get hold of the underlying KafkaConsumer for a defined PollableMessageSource?
I have Binding defined as:
public interface TestBindings {

    String TEST_SOURCE = "test";

    @Input(TEST_SOURCE)
    PollableMessageSource testTopic();
}
and config class:
@EnableBinding(TestBindings.class)
public class TestBindingsPoller {

    @Bean
    public ApplicationRunner testPoller(PollableMessageSource testTopic) {
        // Get kafka consumer for PollableMessageSource
        KafkaConsumer kafkaConsumer = getConsumer(testTopic);
        return args -> {
            while (true) {
                if (!testTopic.poll(...)) {
                    Thread.sleep(500);
                }
            }
        };
    }
}
The question is, how can I get KafkaConsumer that corresponds to testTopic? Is there any way to get it from beans that are wired in spring cloud stream?
The KafkaMessageSource populates a KafkaConsumer into headers, so it is available in the place you receive messages: https://github.com/spring-projects/spring-kafka/blob/master/spring-kafka/src/main/java/org/springframework/kafka/support/converter/MessageConverter.java#L57.
If you are going to do things like poll yourself, I would suggest injecting a ConsumerFactory and using a consumer created from it instead.
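If you do need your own consumer for that kind of polling (rather than the one the binder manages internally), the ConsumerFactory approach could look roughly like this. It is only a sketch: the topic name test and group poller-group are placeholders, and it assumes a ConsumerFactory bean is available (e.g. from Spring Boot's spring-kafka auto-configuration):

@Bean
public ApplicationRunner ownConsumerPoller(ConsumerFactory<byte[], byte[]> consumerFactory) {
    return args -> {
        // createConsumer(groupId, clientIdSuffix) builds a KafkaConsumer from the configured consumer properties
        try (Consumer<byte[], byte[]> consumer = consumerFactory.createConsumer("poller-group", "poller")) {
            consumer.subscribe(Collections.singletonList("test"));
            while (true) {
                ConsumerRecords<byte[], byte[]> records = consumer.poll(Duration.ofMillis(500));
                records.forEach(System.out::println);
            }
        }
    };
}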

Kafka consumer health check

Is there a simple way to tell if a consumer (created with Spring Boot and @KafkaListener) is operating normally?
This includes - can access and poll a broker, has at least one partition assigned, etc.
I see there are ways to subscribe to different lifecycle events but this seems to be a very fragile solution.
Thanks in advance!
You can use the AdminClient to get the current group status...
@SpringBootApplication
public class So56134056Application {

    public static void main(String[] args) {
        SpringApplication.run(So56134056Application.class, args);
    }

    @Bean
    public NewTopic topic() {
        return new NewTopic("so56134056", 1, (short) 1);
    }

    @KafkaListener(id = "so56134056", topics = "so56134056")
    public void listen(String in) {
        System.out.println(in);
    }

    @Bean
    public ApplicationRunner runner(KafkaAdmin admin) {
        return args -> {
            try (AdminClient client = AdminClient.create(admin.getConfig())) {
                while (true) {
                    Map<String, ConsumerGroupDescription> map =
                            client.describeConsumerGroups(Collections.singletonList("so56134056")).all().get(10, TimeUnit.SECONDS);
                    System.out.println(map);
                    System.in.read();
                }
            }
        };
    }
}
{so56134056=(groupId=so56134056, isSimpleConsumerGroup=false, members=(memberId=consumer-2-32a80e0a-2b8d-4519-b71d-671117e7eaf8, clientId=consumer-2, host=/127.0.0.1, assignment=(topicPartitions=so56134056-0)), partitionAssignor=range, state=Stable, coordinator=localhost:9092 (id: 0 rack: null))}
We have been thinking about exposing getLastPollTime() to the listener container API.
getAssignedPartitions() has been available since 2.1.3.
I know that you haven't mentioned it in your post - but beware of adding items like this to a health check if you then deploy in AWS and use such a health check for your ELB scaling environment.
For example, one scenario that can happen is that your app loses connectivity to Kafka - your health check turns RED - and then Elastic Beanstalk begins a process of killing and restarting your instances (which will happen continually until your Kafka instances are available again). This could be costly!
There is also a more general philosophical question on whether health checks should 'cascade failures' or not, e.g. Kafka is down, so the app connected to Kafka claims it is down, the next app in the chain does the same, and so on. This is normally implemented via circuit breakers, which are designed to minimise slow calls destined for failure.
You could check using the AdminClient for the topic description.
final AdminClient client = AdminClient.create(kafkaConsumerFactory.getConfigurationProperties());
final String topic = "someTopicName";
final DescribeTopicsResult describeTopicsResult = client.describeTopics(Collections.singleton(topic));
final KafkaFuture<TopicDescription> future = describeTopicsResult.values().get(topic);
try {
    // for healthcheck purposes we're fetching the topic description
    future.get(10, TimeUnit.SECONDS);
} catch (final InterruptedException | ExecutionException | TimeoutException e) {
    throw new RuntimeException("Failed to retrieve topic description for topic: " + topic, e);
}
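To tie this back to the original question, the same check can be wrapped in a regular Spring Boot HealthIndicator so it is reported under /actuator/health. A minimal sketch, assuming someTopicName is the topic you care about and kafkaConsumerFactory is the auto-configured ConsumerFactory bean (creating an AdminClient per call keeps the example short, but it is not the cheapest option):

import java.util.Collections;
import java.util.concurrent.TimeUnit;

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.TopicDescription;
import org.springframework.boot.actuate.health.Health;
import org.springframework.boot.actuate.health.HealthIndicator;
import org.springframework.kafka.core.ConsumerFactory;
import org.springframework.stereotype.Component;

@Component
public class KafkaTopicHealthIndicator implements HealthIndicator {

    private final ConsumerFactory<?, ?> kafkaConsumerFactory;
    private final String topic = "someTopicName"; // assumption: the topic to probe

    public KafkaTopicHealthIndicator(ConsumerFactory<?, ?> kafkaConsumerFactory) {
        this.kafkaConsumerFactory = kafkaConsumerFactory;
    }

    @Override
    public Health health() {
        try (AdminClient client = AdminClient.create(kafkaConsumerFactory.getConfigurationProperties())) {
            // same probe as above: if the topic description comes back in time, the broker is reachable
            TopicDescription description = client.describeTopics(Collections.singleton(topic))
                    .values().get(topic).get(10, TimeUnit.SECONDS);
            return Health.up().withDetail("partitions", description.partitions().size()).build();
        }
        catch (Exception ex) {
            return Health.down(ex).build();
        }
    }
}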
