From the consumer end, is there an option to create a topic with custom configurations? - spring-kafka

I'm writing a Kafka consumer using the 'org.springframework.kafka.annotation.KafkaListener' (@KafkaListener) annotation. This annotation expects the topic to already exist at the time of subscribing, and the topic gets created with the default configuration if it is not present.
In my case, I don't want the consumer to create a topic with the default configuration; it should create a topic with custom configurations (like the number of partitions, cleanup policy, etc.). Is there any option for this in spring-kafka?

See the documentation on configuring topics.
If you define a KafkaAdmin bean in your application context, it can automatically add topics to the broker. To do so, add a NewTopic @Bean for each topic to the application context. The following example shows how to do so:
@Bean
public KafkaAdmin admin() {
    Map<String, Object> configs = new HashMap<>();
    configs.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG,
            StringUtils.arrayToCommaDelimitedString(embeddedKafka().getBrokerAddresses()));
    return new KafkaAdmin(configs);
}

@Bean
public NewTopic topic1() {
    return new NewTopic("thing1", 10, (short) 2);
}

@Bean
public NewTopic topic2() {
    return new NewTopic("thing2", 10, (short) 2);
}
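To address the custom-configuration part of the question directly: NewTopic also accepts per-topic configs, so settings such as the cleanup policy can be applied at creation time. A minimal sketch using the standard kafka-clients TopicConfig constants (the topic name and settings here are illustrative):

@Bean
public NewTopic compactedTopic() {
    // hypothetical topic; configs() sets per-topic properties such as cleanup.policy
    return new NewTopic("thing3", 10, (short) 2)
            .configs(Collections.singletonMap(
                    TopicConfig.CLEANUP_POLICY_CONFIG, TopicConfig.CLEANUP_POLICY_COMPACT));
}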
By default, if the broker is not available, a message is logged, but the context continues to load. You can programmatically invoke the admin’s initialize() method to try again later. If you wish this condition to be considered fatal, set the admin’s fatalIfBrokerNotAvailable property to true. The context then fails to initialize.
If the broker supports it (1.0.0 or higher), the admin increases the number of partitions if an existing topic is found to have fewer partitions than NewTopic.numPartitions.
If you are using Spring Boot, you don't need a KafkaAdmin bean because Boot automatically configures one for you.

Related

Combining blocking and non-blocking retries in Spring Kafka

I am trying to implement non-blocking retries with single-topic fixed back-off.
I am able to do so thanks to the documentation: https://docs.spring.io/spring-kafka/reference/html/#single-topic-fixed-delay-retries.
Now I also need to perform a few blocking/local retries on the main topic. I have been trying to implement this using DefaultErrorHandler as below:
@Bean
public DefaultErrorHandler retryErrorHandler() {
    return new DefaultErrorHandler(new FixedBackOff(2000, 3));
}
This does not seem to work with @RetryableTopic.
I have also tried the approach described in the retry-topic-combine-blocking section, https://docs.spring.io/spring-kafka/reference/html/#retry-topic-combine-blocking, using ListenerContainerFactoryConfigurer, but the issue I am facing there is creating the beans KafkaConsumerBackoffManager and DeadLetterPublishingRecovererFactory, especially KafkaConsumerBackoffManager.
I need to know if there is another way to achieve this using the Spring Kafka framework, or a way to construct the above beans.
We're currently working on improving configuration for the non-blocking retries components.
For now, as documented here, you can inject these beans as follows:
@Bean(name = RetryTopicInternalBeanNames.LISTENER_CONTAINER_FACTORY_CONFIGURER_NAME)
public ListenerContainerFactoryConfigurer lcfc(KafkaConsumerBackoffManager kafkaConsumerBackoffManager,
        DeadLetterPublishingRecovererFactory deadLetterPublishingRecovererFactory,
        @Qualifier(RetryTopicInternalBeanNames.INTERNAL_BACKOFF_CLOCK_BEAN_NAME) Clock clock) {
    ListenerContainerFactoryConfigurer lcfc = new ListenerContainerFactoryConfigurer(
            kafkaConsumerBackoffManager, deadLetterPublishingRecovererFactory, clock);
    lcfc.setBlockingRetryableExceptions(MyBlockingRetryException.class, MyOtherBlockingRetryException.class);
    lcfc.setBlockingRetriesBackOff(new FixedBackOff(500, 5)); // Optional
    return lcfc;
}
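For context, these blocking retries apply on top of a non-blocking @RetryableTopic setup on the listener. A minimal sketch of such a listener, assuming spring-kafka 2.7+ (the topic name and delay are illustrative):

@RetryableTopic(backoff = @Backoff(delay = 2000))
@KafkaListener(topics = "my-topic")
public void listen(String in) {
    // hypothetical handler: for the configured exceptions, blocking retries run first
    // on the main topic; the record is then forwarded to the retry topics
    System.out.println(in);
}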
Also, there's a known issue: if you try to inject the beans before the first @KafkaListener bean with a retryable topic is processed, the feature's component beans won't be present in the context yet, and an error will be thrown.
Does that happen to you?
We're currently working on a fix for this, but we should be able to work around that if that's your problem.
EDIT: Since the problem is that the components are not instantiated yet, the most reliable workaround is to provide the components yourself.
Here's a sample of how to do that. Of course, adjust it accordingly if you need any further customization.
@Configuration
public static class SO71705876Configuration {

    @Bean(name = RetryTopicInternalBeanNames.LISTENER_CONTAINER_FACTORY_CONFIGURER_NAME)
    public ListenerContainerFactoryConfigurer lcfc(KafkaConsumerBackoffManager kafkaConsumerBackoffManager,
            DeadLetterPublishingRecovererFactory deadLetterPublishingRecovererFactory) {
        ListenerContainerFactoryConfigurer lcfc = new ListenerContainerFactoryConfigurer(
                kafkaConsumerBackoffManager, deadLetterPublishingRecovererFactory, Clock.systemUTC());
        lcfc.setBlockingRetryableExceptions(IllegalArgumentException.class, IllegalStateException.class);
        lcfc.setBlockingRetriesBackOff(new FixedBackOff(500, 5)); // Optional
        return lcfc;
    }

    @Bean(name = RetryTopicInternalBeanNames.KAFKA_CONSUMER_BACKOFF_MANAGER)
    public KafkaConsumerBackoffManager backOffManager(ApplicationContext context) {
        PartitionPausingBackOffManagerFactory managerFactory =
                new PartitionPausingBackOffManagerFactory();
        managerFactory.setApplicationContext(context);
        return managerFactory.create();
    }

    @Bean(name = RetryTopicInternalBeanNames.DEAD_LETTER_PUBLISHING_RECOVERER_FACTORY_BEAN_NAME)
    public DeadLetterPublishingRecovererFactory dlprFactory(DestinationTopicResolver resolver) {
        return new DeadLetterPublishingRecovererFactory(resolver);
    }

    @Bean(name = RetryTopicInternalBeanNames.DESTINATION_TOPIC_CONTAINER_NAME)
    public DestinationTopicResolver destinationTopicResolver(ApplicationContext context) {
        return new DefaultDestinationTopicResolver(Clock.systemUTC(), context);
    }
}
In the next release this should not be a problem anymore. Please let me know if that works for you, or if any further adjustment to this workaround is necessary.
Thanks.

How to Grab Auto-Generated KafkaTemplate in Spring Cloud Stream?

A few examples of implementing HealthIndicator need a KafkaTemplate. I don't actually create a KafkaTemplate manually, but the HealthIndicator requires one. Is there a way to automatically grab the created KafkaTemplate (the one that uses the application.yml configuration)? This is versus manually duplicating configuration that already exists in application.yml in a newly created consumerFactory.
See this answer.
That poster wanted access to the template's ProducerFactory, but you can use the same technique to just get a reference to the template.
That said, the binder comes with its own health indicator.
https://github.com/spring-cloud/spring-cloud-stream-binder-kafka/blob/a7299df63f495af3a798637551c2179c947af9cf/spring-cloud-stream-binder-kafka/src/main/java/org/springframework/cloud/stream/binder/kafka/KafkaBinderHealthIndicator.java#L52
We also had the requirement of creating a custom health checker for Spring Cloud Stream. We leveraged the built-in health checker (KafkaBinderHealthIndicator), but we faced a lot of issues injecting the KafkaBinderHealthIndicator bean directly. So instead, we injected the health-checker holder, HealthContributorRegistry, and got the KafkaBinderHealthIndicator bean from it.
Example:
@Component
@Slf4j
@RequiredArgsConstructor
public class KafkaHealthChecker implements ComponentHealthChecker {

    private final HealthContributorRegistry registry;

    public String checkHealth() {
        String status;
        try {
            BindersHealthContributor bindersHealthContributor = (BindersHealthContributor)
                    registry.getContributor("binders");
            KafkaBinderHealthIndicator kafkaBinderHealthIndicator = (KafkaBinderHealthIndicator)
                    bindersHealthContributor.getContributor("kafka");
            Health health = kafkaBinderHealthIndicator.health();
            // UP is a static import of org.springframework.boot.actuate.health.Status.UP
            status = UP.equals(health.getStatus()) ? "OK" : "FAIL";
        } catch (Exception e) {
            log.error("Error occurred while checking the kafka health ", e);
            status = "DEGRADED";
        }
        return status;
    }
}

Custom metrics confusion

I added Micrometer (https://micrometer.io) to our staging server in Google Cloud. The metric does not show up under the "Cloud Run Revision" resource type. It is only visible if I select "Global", as seen here...
The instructions were very simple and very clear (MUCH UNLIKE OpenCensus, which has a way overdesigned API). In fact, unlike OpenCensus, it worked out of the box, except that it is not recording into "Cloud Run Revision".
I can't even choose the service_name in the filter, so once I deploy to production, the metric will record BOTH prod and staging, which is not what we want.
How do I debug Micrometer further?
If anyone knows offhand what the issue might be, that would be great as well (though I don't mind learning Micrometer and debugging it a bit more).
For now, the only monitored-resource types available for custom metrics are:
aws_ec2_instance: Amazon EC2 instance.
dataflow_job: Dataflow job.
gce_instance: Compute Engine instance.
gke_container: GKE container instance.
generic_node: User-specified computing node.
generic_task: User-defined task.
global: Use this resource when no other resource type is suitable. For most use cases, generic_node or generic_task are better choices than global.
k8s_cluster: Kubernetes cluster.
k8s_container: Kubernetes container.
k8s_node: Kubernetes node.
k8s_pod: Kubernetes pod.
So, global is the correct monitored-resource type in this case, since there is not a Cloud Run monitored-resource type yet.
To better identify the metrics, you can create metric descriptors, either through auto-creation or manually.
For completeness, I have it recording all the JVM stats now, but I have a new post about aggregation on Google's website here that seems to be a new issue...
Google Cloud Metrics and MicroMeter JVM reporting (is this a Micrometer bug or?)
My code that did the trick was as follows (using revisionName is CRITICAL for not getting errors!!!):
String projectId = MetadataConfig.getProjectId();
String service = System.getenv("K_SERVICE");
String revisionName = System.getenv("K_REVISION");
String config = System.getenv("K_CONFIGURATION");
String zone = MetadataConfig.getZone();

// resource labels expected by the generic_task monitored-resource type
Map<String, String> map = new HashMap<>();
map.put("namespace", service);
map.put("job", "nothing");
map.put("task_id", revisionName);
map.put("location", zone);

log.info("project=" + projectId + " svc=" + service + " r=" + revisionName + " config=" + config + " zone=" + zone);

StackdriverConfig stackdriverConfig = new OurGoogleConfig(projectId, map);
// figure out how to put in template better
MeterRegistry googleRegistry = StackdriverMeterRegistry.builder(stackdriverConfig).build();
Metrics.addRegistry(googleRegistry);

// This is what would be used in a development server
// Metrics.addRegistry(new SimpleMeterRegistry());

// How to expose on the backend, perhaps at /metrics
CompositeMeterRegistry registry = Metrics.globalRegistry;
new ClassLoaderMetrics().bindTo(registry);
new JvmMemoryMetrics().bindTo(registry);
new JvmGcMetrics().bindTo(registry);
new ProcessorMetrics().bindTo(registry);
new JvmThreadMetrics().bindTo(registry);
and then the config is simple...
private static class OurGoogleConfig implements StackdriverConfig {

    private final String projectId;
    private final Map<String, String> resourceLabels;

    public OurGoogleConfig(String projectId, Map<String, String> resourceLabels) {
        this.projectId = projectId;
        this.resourceLabels = resourceLabels;
    }

    @Override
    public String projectId() {
        return projectId;
    }

    @Override
    public String get(String key) {
        return null; // fall back to defaults for everything else
    }

    @Override
    public String resourceType() {
        return "generic_task";
    }

    @Override
    public Map<String, String> resourceLabels() {
        // this is called EVERY time, so save on memory by returning the same
        // map every time instead of re-creating it
        return resourceLabels;
    }
}

Configure kafka topic retention policy during creation in spring-mvc?

Configure the retention policy of all topics during creation.
Trying to configure retention.ms using Spring, as I get this error:
Caused by: java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.PolicyViolationException: Invalid retention.ms specified. The allowed range is [3600000..2592000000]
From what I've read, the default value is -1 (infinite), which is outside that range.
Following what was in
How to configure kafka topic retention policy during creation in spring-mvc?, I added the code below, but it seems to have no effect.
Any ideas/hints on how I might solve this?
ApplicationConfigurationTest.java
@Test
public void kafkaAdmin() {
    KafkaAdmin admin = configuration.admin();
    assertThat(admin, instanceOf(KafkaAdmin.class));
}
ApplicationConfiguration.java
@Bean
public KafkaAdmin admin() {
    Map<String, Object> configs = new HashMap<>();
    configs.put(TopicConfig.RETENTION_MS_CONFIG, "1680000");
    return new KafkaAdmin(configs);
}
Found the solution: set the value
spring.kafka.streams.topic.retention.ms: 86400000
in application.yml.
Our application uses Spring MVC, hence the spring notation.
topic.retention.ms is the value that needs to be set in the Streams config.
86400000 is an arbitrary value, used only because it is in the range [3600000..2592000000].
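Alternatively, if topics are created through a KafkaAdmin/NewTopic bean (as in the first answer above), retention can be set per topic at creation time. A minimal sketch using the kafka-clients TopicConfig constants (the topic name, partition count, and retention value are illustrative):

@Bean
public NewTopic retentionTopic() {
    // 86400000 ms (1 day) is within the broker's allowed range [3600000..2592000000]
    return new NewTopic("my-topic", 1, (short) 1)
            .configs(Collections.singletonMap(TopicConfig.RETENTION_MS_CONFIG, "86400000"));
}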

Kafka consumer health check

Is there a simple way to tell whether a consumer (created with Spring Boot and @KafkaListener) is operating normally?
This includes: can access and poll the broker, has at least one partition assigned, etc.
I see there are ways to subscribe to different lifecycle events, but this seems to be a very fragile solution.
Thanks in advance!
You can use the AdminClient to get the current group status...
@SpringBootApplication
public class So56134056Application {

    public static void main(String[] args) {
        SpringApplication.run(So56134056Application.class, args);
    }

    @Bean
    public NewTopic topic() {
        return new NewTopic("so56134056", 1, (short) 1);
    }

    @KafkaListener(id = "so56134056", topics = "so56134056")
    public void listen(String in) {
        System.out.println(in);
    }

    @Bean
    public ApplicationRunner runner(KafkaAdmin admin) {
        return args -> {
            try (AdminClient client = AdminClient.create(admin.getConfig())) {
                while (true) {
                    Map<String, ConsumerGroupDescription> map =
                            client.describeConsumerGroups(Collections.singletonList("so56134056"))
                                    .all().get(10, TimeUnit.SECONDS);
                    System.out.println(map);
                    System.in.read();
                }
            }
        };
    }

}
{so56134056=(groupId=so56134056, isSimpleConsumerGroup=false, members=(memberId=consumer-2-32a80e0a-2b8d-4519-b71d-671117e7eaf8, clientId=consumer-2, host=/127.0.0.1, assignment=(topicPartitions=so56134056-0)), partitionAssignor=range, state=Stable, coordinator=localhost:9092 (id: 0 rack: null))}
We have been thinking about exposing getLastPollTime() to the listener container API.
getAssignedPartitions() has been available since 2.1.3.
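To turn that into an Actuator check, here is a minimal sketch of a HealthIndicator built on getAssignedPartitions(), assuming Spring Boot Actuator is on the classpath (the listener id "so56134056" is from the example above):

@Component
public class KafkaListenerHealthIndicator implements HealthIndicator {

    private final KafkaListenerEndpointRegistry registry;

    public KafkaListenerHealthIndicator(KafkaListenerEndpointRegistry registry) {
        this.registry = registry;
    }

    @Override
    public Health health() {
        MessageListenerContainer container = registry.getListenerContainer("so56134056");
        // healthy if the container is running and has at least one partition assigned
        if (container != null && container.isRunning()
                && container.getAssignedPartitions() != null
                && !container.getAssignedPartitions().isEmpty()) {
            return Health.up().build();
        }
        return Health.down().build();
    }
}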
I know that you haven't mentioned it in your post, but beware of adding items like this to a health check if you then deploy in AWS and use such a health check for your ELB scaling environment.
For example, one scenario that can happen is that your app loses connectivity to Kafka: your health check turns RED, and then Elastic Beanstalk begins a process of killing and re-starting your instances (which will happen continually until your Kafka instances are available again). This could be costly!
There is also a more general philosophical question of whether health checks should 'cascade failures' or not, e.g. Kafka is down, so the app connected to Kafka claims it is down, then the next app in the chain does the same, and so on. This is more normally implemented via circuit breakers, which are designed to minimise slow calls destined for failure.
You could check for the topic description using the AdminClient.
final AdminClient client = AdminClient.create(kafkaConsumerFactory.getConfigurationProperties());
final String topic = "someTopicName";
final DescribeTopicsResult describeTopicsResult = client.describeTopics(Collections.singleton(topic));
final KafkaFuture<TopicDescription> future = describeTopicsResult.values().get(topic);
try {
    // for health-check purposes we're fetching the topic description
    future.get(10, TimeUnit.SECONDS);
} catch (final InterruptedException | ExecutionException | TimeoutException e) {
    throw new RuntimeException("Failed to retrieve topic description for topic: " + topic, e);
}
