How to get threadlocal for concurrency consumer? - spring-kafka

I am developing a Spring Kafka consumer. Due to message volume, I need concurrency to sustain throughput, and because of that concurrency I use a ThreadLocal to hold per-thread state. Now I need to remove this ThreadLocal value after use.
The Spring documentation linked below suggests implementing an event listener for the ConsumerStoppedEvent, but it does not include any sample listener code that obtains the ThreadLocal and removes its value. Could you let me know how to get at the ThreadLocal in this case?
Code samples would be appreciated.
https://docs.spring.io/spring-kafka/docs/current/reference/html/#thread-safety

Something like this:
@SpringBootApplication
public class So71884752Application {

    public static void main(String[] args) {
        SpringApplication.run(So71884752Application.class, args);
    }

    @Bean
    public NewTopic topic2() {
        return TopicBuilder.name("topic1").partitions(2).build();
    }

    @Component
    static class MyListener implements ApplicationListener<ConsumerStoppedEvent> {

        private static final ThreadLocal<Long> threadLocalState = new ThreadLocal<>();

        @KafkaListener(topics = "topic1", groupId = "my-consumer", concurrency = "2")
        public void listen() {
            long id = Thread.currentThread().getId();
            System.out.println("set thread id to ThreadLocal: " + id);
            threadLocalState.set(id);
        }

        @Override
        public void onApplicationEvent(ConsumerStoppedEvent event) {
            System.out.println("Remove from ThreadLocal: " + threadLocalState.get());
            threadLocalState.remove();
        }
    }
}
So, I have two concurrent listener containers for the two partitions of the topic. Each of them calls this @KafkaListener method, where I store the thread id into the ThreadLocal; it is a simple use-case just to exercise the feature.
Then I implement ApplicationListener<ConsumerStoppedEvent>, which is emitted on the appropriate consumer thread. That listener lets me extract the ThreadLocal value and clean it up at the end of the consumer's life.
The test against embedded Kafka looks like this:
@SpringBootTest
@EmbeddedKafka(bootstrapServersProperty = "spring.kafka.bootstrap-servers")
@DirtiesContext
class So71884752ApplicationTests {

    @Autowired
    KafkaTemplate<String, String> kafkaTemplate;

    @Autowired
    KafkaListenerEndpointRegistry kafkaListenerEndpointRegistry;

    @Test
    void contextLoads() throws InterruptedException {
        this.kafkaTemplate.send("topic1", "1", "foo");
        this.kafkaTemplate.send("topic1", "2", "bar");
        this.kafkaTemplate.flush();
        Thread.sleep(1000); // Give it a chance to consume data
        this.kafkaListenerEndpointRegistry.stop();
    }
}
Right, it doesn't verify anything, but it demonstrates how that event is delivered.
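As an aside, the Thread.sleep() can be replaced with something deterministic. A sketch (the latch is my addition, not part of the original answer):

// Added inside MyListener (hypothetical):
static final CountDownLatch received = new CountDownLatch(2);
// ... and at the end of listen(): received.countDown();

// The test then awaits the latch instead of sleeping:
@Test
void contextLoads() throws Exception {
    this.kafkaTemplate.send("topic1", "1", "foo");
    this.kafkaTemplate.send("topic1", "2", "bar");
    this.kafkaTemplate.flush();
    assertTrue(MyListener.received.await(10, TimeUnit.SECONDS)); // wait until both records were consumed
    this.kafkaListenerEndpointRegistry.stop();
}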
I see something like this in log output:
set thread id to ThreadLocal: 125
set thread id to ThreadLocal: 127
...
Remove from ThreadLocal: 125
Remove from ThreadLocal: 127
So, what the doc says is correct.

Related

How to activate RequestScope inside CompletableFuture (getting org.jboss.weld.context.ContextNotActiveException)

I could not find a definitive answer to whether it is safe to spawn threads from within session-scoped JSF managed beans. The thread needs to call methods on a stateless EJB instance that was dependency-injected into the managed bean.
The background: we have a report that takes a long time to generate, which caused the HTTP request to time out due to server settings we can't change. So the idea is to start a new thread, let it generate the report, and temporarily store the result. In the meantime the JSF page shows a progress bar, polls the managed bean until generation is complete, and then makes a second request to download the stored report. This seems to work, but I would like to be sure that what I'm doing is not a hack.
Check out EJB 3.1 @Asynchronous methods. This is exactly what they are for.
Here is a small example that uses OpenEJB 4.0.0-SNAPSHOTs. We have a @Singleton bean with one method marked @Asynchronous. Every time that method is invoked by anyone, in this case your JSF managed bean, it returns immediately regardless of how long the method actually takes.
import static java.util.concurrent.TimeUnit.SECONDS;
import static javax.ejb.LockType.READ;

import java.util.concurrent.Future;

import javax.ejb.AccessTimeout;
import javax.ejb.AsyncResult;
import javax.ejb.Asynchronous;
import javax.ejb.Lock;
import javax.ejb.Singleton;

@Singleton
public class JobProcessor {

    @Asynchronous
    @Lock(READ)
    @AccessTimeout(-1)
    public Future<String> addJob(String jobName) {
        // Pretend this job takes a while
        doSomeHeavyLifting();
        // Return our result
        return new AsyncResult<String>(jobName);
    }

    private void doSomeHeavyLifting() {
        try {
            Thread.sleep(SECONDS.toMillis(10));
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // restore the interrupt flag before bailing out
            throw new IllegalStateException(e);
        }
    }
}
Here's a little test case that invokes that @Asynchronous method several times in a row.
Each invocation returns a Future object that essentially starts out empty; the container fills in its value later, when the related method call actually completes.
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

import javax.ejb.embeddable.EJBContainer;
import javax.naming.Context;

import junit.framework.TestCase;

public class JobProcessorTest extends TestCase {

    public void test() throws Exception {
        final Context context = EJBContainer.createEJBContainer().getContext();
        final JobProcessor processor = (JobProcessor) context.lookup("java:global/async-methods/JobProcessor");

        final long start = System.nanoTime();

        // Queue up a bunch of work
        final Future<String> red = processor.addJob("red");
        final Future<String> orange = processor.addJob("orange");
        final Future<String> yellow = processor.addJob("yellow");
        final Future<String> green = processor.addJob("green");
        final Future<String> blue = processor.addJob("blue");
        final Future<String> violet = processor.addJob("violet");

        // Wait for the results -- one minute worth of sequential work
        assertEquals("blue", blue.get());
        assertEquals("orange", orange.get());
        assertEquals("green", green.get());
        assertEquals("red", red.get());
        assertEquals("yellow", yellow.get());
        assertEquals("violet", violet.get());

        // How long did it take?
        final long total = TimeUnit.NANOSECONDS.toSeconds(System.nanoTime() - start);

        // Execution should take around 9 - 21 seconds, since the jobs run in parallel
        assertTrue("" + total, total > 9);
        assertTrue("" + total, total < 21);
    }
}
Under the covers, what makes this work is:
The JobProcessor the caller sees is not actually an instance of JobProcessor. Rather, it's a subclass or proxy that has all the methods overridden. Methods that are supposed to be asynchronous are handled differently.
A call to an asynchronous method simply results in a Runnable being created that wraps the method and the parameters you gave. This runnable is handed to an Executor, which is simply a work queue attached to a thread pool.
After adding the work to the queue, the proxied version of the method returns an implementation of Future that is linked to the Runnable, which is now waiting in the queue.
When the Runnable finally executes the method on the real JobProcessor instance, it takes the return value and sets it into the Future, making it available to the caller.
It is important to note that the AsyncResult object the JobProcessor returns is not the same Future object the caller is holding. It would have been neat if the real JobProcessor could just return String and the caller's version of JobProcessor could return Future<String>, but we didn't see any way to do that without adding more complexity. So the AsyncResult is a simple wrapper object: the container pulls the String out, throws the AsyncResult away, then puts the String into the real Future that the caller is holding.
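A rough sketch of that mechanism, stripped of all container concerns (the class and field names here are illustrative, not OpenEJB's actual internals):

import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class AsyncProxySketch {

    private final ExecutorService executor = Executors.newFixedThreadPool(4);
    private final JobProcessor realBean = new JobProcessor();

    // What the container-generated proxy effectively does for an @Asynchronous method:
    public Future<String> addJob(final String jobName) {
        return executor.submit(new Callable<String>() {
            @Override
            public String call() throws Exception {
                // Invoke the real method, then unwrap its AsyncResult into this Future's value.
                return realBean.addJob(jobName).get();
            }
        });
    }
}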
To get progress along the way, simply pass a thread-safe object like AtomicInteger to the @Asynchronous method and have the bean code periodically update it with the percent complete.
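A sketch of that idea (the progress parameter and doSliceOfHeavyLifting() are hypothetical additions to the bean above; AtomicInteger is java.util.concurrent.atomic.AtomicInteger):

@Asynchronous
public Future<String> addJob(String jobName, AtomicInteger progress) {
    for (int pct = 10; pct <= 100; pct += 10) {
        doSliceOfHeavyLifting();  // one tenth of the work (hypothetical)
        progress.set(pct);        // the caller polls this from its own thread
    }
    return new AsyncResult<String>(jobName);
}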
Introduction
Spawning threads from within a session-scoped managed bean is not necessarily a hack, as long as it does the job you want. But spawning threads on its own needs to be done with extreme care. The code should not be written in such a way that a single user can, for example, spawn an unlimited amount of threads per session, or that threads continue running even after the session is destroyed. That would blow up your application sooner or later.
The code needs to be written in such a way that a user can never spawn more than one background thread per session, and that the thread is guaranteed to be interrupted whenever the session is destroyed. For multiple tasks within a session you need to queue the tasks; a sketch of this follows below.
Also, all those threads should preferably be served by a common thread pool, so that you can put a limit on the total amount of spawned threads at the application level.
Managing threads is thus a very delicate task. That's why you'd better use the built-in facilities rather than homegrowing your own with new Thread() and friends. The average Java EE application server offers a container-managed thread pool which you can utilize via, among others, EJB's @Asynchronous and @Schedule. To be container independent (read: Tomcat-friendly), you can also use the java.util.concurrent ExecutorService and ScheduledExecutorService for this.
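To make the "at most one task per session, interrupted on session destroy" rule concrete, here is a minimal sketch. The class and member names are illustrative; ReportService is assumed to be a stateless bean with an @Asynchronous method returning a Future<Report>:

@Named
@SessionScoped
public class ReportBean implements Serializable {

    private Future<Report> reportTask; // at most one per session

    @EJB
    private ReportService reportService; // hypothetical @Asynchronous service

    public void startReport() {
        if (reportTask == null || reportTask.isDone()) { // never spawn a second one
            reportTask = reportService.generateReport();
        }
    }

    @PreDestroy
    public void destroy() {
        if (reportTask != null) {
            reportTask.cancel(true); // interrupt when the session is destroyed
        }
    }
}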
The examples below assume Java EE 6+ with EJB.
Fire and forget a task on form submit
@Named
@RequestScoped // Or @ViewScoped
public class Bean {

    @EJB
    private SomeService someService;

    public void submit() {
        someService.asyncTask();
        // ... (this code will immediately continue without waiting)
    }
}

@Stateless
public class SomeService {

    @Asynchronous
    public void asyncTask() {
        // ...
    }
}
Asynchronously fetch the model on page load
@Named
@RequestScoped // Or @ViewScoped
public class Bean {

    private Future<List<Entity>> asyncEntities;

    @EJB
    private EntityService entityService;

    @PostConstruct
    public void init() {
        asyncEntities = entityService.asyncList();
        // ... (this code will immediately continue without waiting)
    }

    public List<Entity> getEntities() {
        try {
            return asyncEntities.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new FacesException(e);
        } catch (ExecutionException e) {
            throw new FacesException(e);
        }
    }
}

@Stateless
public class EntityService {

    @PersistenceContext
    private EntityManager entityManager;

    @Asynchronous
    public Future<List<Entity>> asyncList() {
        List<Entity> entities = entityManager
            .createQuery("SELECT e FROM Entity e", Entity.class)
            .getResultList();
        return new AsyncResult<>(entities);
    }
}
In case you're using the JSF utility library OmniFaces, this could be done even faster if you annotate the managed bean with @Eager.
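With OmniFaces that would look roughly like this (a sketch; as far as I recall, @Eager on a request scoped bean needs a requestURI or viewId attribute, and the path given here is hypothetical):

@Eager(requestURI = "/entities.xhtml") // bean is constructed, and asyncList() fired, before the request reaches the view
@RequestScoped
public class Bean {
    // same init() / getEntities() as above
}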
Schedule background jobs on application start
@Singleton
public class BackgroundJobManager {

    @Schedule(hour = "0", minute = "0", second = "0", persistent = false)
    public void someDailyJob() {
        // ... (runs every start of day)
    }

    @Schedule(hour = "*/1", minute = "0", second = "0", persistent = false)
    public void someHourlyJob() {
        // ... (runs every hour of day)
    }

    @Schedule(hour = "*", minute = "*/15", second = "0", persistent = false)
    public void someQuarterlyJob() {
        // ... (runs every 15th minute of the hour)
    }

    @Schedule(hour = "*", minute = "*", second = "*/30", persistent = false)
    public void someHalfminutelyJob() {
        // ... (runs every 30th second of the minute)
    }
}
Continuously update application wide model in background
@Named
@RequestScoped // Or @ViewScoped
public class Bean {

    @EJB
    private SomeTop100Manager someTop100Manager;

    public List<Some> getSomeTop100() {
        return someTop100Manager.list();
    }
}

@Singleton
@ConcurrencyManagement(BEAN)
public class SomeTop100Manager {

    @PersistenceContext
    private EntityManager entityManager;

    private List<Some> top100;

    @PostConstruct
    @Schedule(hour = "*", minute = "*/1", second = "0", persistent = false)
    public void load() {
        top100 = entityManager
            .createNamedQuery("Some.top100", Some.class)
            .getResultList();
    }

    public List<Some> list() {
        return top100;
    }
}
See also:
Spawning threads in a JSF managed bean for scheduled tasks using a timer
I tried this, and it works great from my JSF managed bean:
private final ExecutorService executor = Executors.newFixedThreadPool(1);

@EJB
private IMaterialSvc materialSvc;

private void updateMaterial(Material material, String status, Location position) {
    executor.execute(new Runnable() {
        @Override
        public void run() {
            synchronized (position) {
                // TODO update material in audit? do we need materials in audit?
                int index = position.getMaterials().indexOf(material);
                Material m = materialSvc.getById(material.getId());
                m.setStatus(status);
                m = materialSvc.update(m);
                if (index != -1) {
                    position.getMaterials().set(index, m);
                }
            }
        }
    });
}

@PreDestroy
public void destroy() {
    executor.shutdown();
}

Multi-entity Aggregates command handling

I have an aggregate root like this:
Aggregate root:
@NoArgsConstructor
@Aggregate(repository = "positionAggregateRepository")
@AggregateRoot
@XSlf4j
@Data
public class HopAggregate {

    @AggregateIdentifier
    private String hopId;

    private FilteredPosition position;
    private LocalDate positionDate;

    @AggregateMember
    private Security security;

    @CommandHandler
    public HopAggregate(NewHopCommand cmd) {
        log.info("creating new position , {}", cmd.getDateId());
        apply(new HopEvent(cmd.getHopId(), cmd.getDateId(), cmd.getFilteredPosition(), cmd.getSecurity(), false));
    }

    @CommandHandler
    public void handle(UpdateHopCommand cmd) {
        log.info("creating hop update event {}", cmd);
        apply(new HopEvent(this.hopId, this.positionDate, cmd.getFilteredPosition(), this.security, true));
    }

    @CommandHandler
    public void handle(SecurityUpdate cmd) {
        log.info("updating security {}", cmd);
        apply(new SecurityUpdateEvent(this.hopId, cmd.getFilteredSecurity()));
    }

    @EventSourcingHandler
    public void on(HopEvent evt) {
        if (evt.getIsUpdate()) {
            log.info("updating position {}", evt);
            this.position = evt.getFilteredPosition();
        } else {
            log.info("adding new position to date {}", evt);
            this.hopId = evt.getHopId();
            this.positionDate = evt.getDate();
            this.position = evt.getFilteredPosition();
            this.security = evt.getSecurity();
        }
    }

    @EventSourcingHandler
    public void on(SecurityUpdateEvent evt) {
        log.info("hop id {}, security update {}", this.hopId, evt.getFilteredSecurity().getSecurityId());
    }
}
Child entity:
@XSlf4j
@Data
@RequiredArgsConstructor
@NoArgsConstructor
public class IpaSecurity implements Serializable {

    @EntityId
    @NonNull
    private String id;

    @NonNull
    private FilteredSecurity security;
}
My issue is that when I am pushing an update like this:
@EventHandler
public void handleSecurityEvent(SecurityUpdate securityUpdate) {
    log.info("got security event {}", securityUpdate);
    commandGateway.send(securityUpdate);
}
and my command being:
@Data
@RequiredArgsConstructor
@NoArgsConstructor
@ToString
public class SecurityUpdate {

    @NonNull
    @TargetAggregateIdentifier
    private String id;

    @NonNull
    private FilteredSecurity filteredSecurity;
}
I am getting an aggregate-not-found exception:
Command 'com.hb.apps.ipa.events.SecurityUpdate' resulted in org.axonframework.modelling.command.AggregateNotFoundException(The aggregate was not found in the event store)
I am not sure how to handle this scenario. My requirement is that each aggregate should check whether it contains the security and then update it if the command was issued. What am I missing? Let me know if you need any more info on the code.
Thanks for your help.
A command is always targeted at a single entity.
This entity can be an Aggregate, an entity contained in an Aggregate (what Axon Framework calls an Aggregate Member), or a simple singleton component.
Important to note, though, is that there will only ever be one entity handling the command.
This is why you set the @TargetAggregateIdentifier in your command: it lets Axon route the command to a single Aggregate instance when the command handler in question is part of one.
The AggregateNotFoundException you're getting signals that the @TargetAggregateIdentifier annotated field in your SecurityUpdate command does not correspond to any existing Aggregate.
I'd thus suspect that the id field in the SecurityUpdate does not correspond to any @AggregateIdentifier annotated field in your HopAggregate aggregates.
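In other words, whatever ends up in that id field must be the hopId of an aggregate that has already been created. A sketch of the dispatching side (hopLookup is a hypothetical query model mapping a securityId to its owning hopId; it is not part of the snippets above):

@EventHandler
public void handleSecurityEvent(SecurityUpdate securityUpdate) {
    // The command must target an existing aggregate, so resolve its hopId first.
    String hopId = hopLookup.findHopIdBySecurityId(
            securityUpdate.getFilteredSecurity().getSecurityId());
    commandGateway.send(new SecurityUpdate(hopId, securityUpdate.getFilteredSecurity()));
}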
Apart from the above, I have a couple of other suggestions after looking at your snippets, which I'd like to share with you:
@Aggregate is meta-annotated with @AggregateRoot. You're thus not required to specify both on an Aggregate class.
For logging messages being handled, you can utilize the LoggingInterceptor. You can configure it on any component capable of handling messages, giving you a universal way of logging and removing the need for log lines inside your message handling functions (see the sketch after this list).
You're publishing a HopEvent for both the create and the update commands. Doing so makes your HopEvent very generic. Ideally, your events clarify business operations occurring in your system. My rule of thumb is: if I tell my business manager/customer the event class name, he/she should know exactly what it does. I'd thus suggest renaming the event to something more specific.
Just as with the HopEvent, the UpdateHopCommand is quite generic. Your commands should express the intent to perform an operation in your application. Users typically don't desire an "update"; they desire an address change, for example. Your command classes should ideally reflect this.
The suggested naming convention for commands is to start with a verb in the present tense. Thus, it should not be SecurityUpdate but UpdateSecurity. A command is a request expressing intent, and the message name should ideally reflect this.
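For the LoggingInterceptor suggestion above, registration could look roughly like this in a Spring configuration class (a sketch against Axon 4's API; your configuration style may differ):

@Bean
public CommandBus commandBus() {
    SimpleCommandBus commandBus = SimpleCommandBus.builder().build();
    // Logs every command the bus handles, so the handlers themselves stay free of log lines.
    commandBus.registerHandlerInterceptor(new LoggingInterceptor<>());
    return commandBus;
}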
Hope this helps you out, @juggernaut!

Calling seek of ConsumerSeekCallback from a Spring Boot application

Here is my setup:
ConsumerSeekAware implementation:
public class ReplayJobKafkaConsumer implements ConsumerSeekAware, AcknowledgingMessageListener<String, String> {

    private static final ThreadLocal<ConsumerSeekCallback> seekCallBack = new ThreadLocal<>();

    private static ConsumerSeekCallback consumerSeekCallback;

    @Override
    public void onPartitionsAssigned(Map<TopicPartition, Long> map, ConsumerSeekCallback consumerSeekCallback) {
    }

    @Override
    public void onIdleContainer(Map<TopicPartition, Long> map, ConsumerSeekCallback consumerSeekCallback) {
    }

    @Override
    public void registerSeekCallback(ConsumerSeekCallback callback) {
        seekCallBack.set(callback);
        consumerSeekCallback = callback;
    }

    public void onMessage(final ConsumerRecord<String, String> data, final Acknowledgment acknowledgment) {
    }

    public static ThreadLocal<ConsumerSeekCallback> getSeekCallback() {
        return seekCallBack;
    }

    public static ConsumerSeekCallback getAnotherSeekCallback() {
        return consumerSeekCallback;
    }
}
My Spring Boot application approximates to:
@SpringBootApplication
public class ReplayJobApplication {

    ...

    public void run(final String... args) {
        context = SpringApplication.run(ReplayJobApplication.class, args);
        ReplayJobKafkaConsumer.getAnotherSeekCallback().seek("top", 0, 23);
    }

    ...
}
The above setup works. Now I can run this application using:
java -jar -Dstart.offset=0....
But it only works if the seek callback variable is not a ThreadLocal. I need it to be accessible from the Spring Boot application, as that is how I intend to run this consumer. TEMP-TOPIC's other consumers can still be processing, but I intend to run this consumer on an as-needed basis with a start and end offset. While the command line parameters can be read in the consumer, the concerns I have are:
the callback variable is static (I cannot possibly create an instance of ReplayJobKafkaConsumer)
it is a plain variable and not a ThreadLocal
Though the lifetime of this container is only going to be from start to end, I wonder if this setup is flawed and need some confirmation that this implementation is OK.
You appear to have some fundamental misunderstanding of what's going on.
The ThreadLocal is needed because the Kafka consumer object is not thread-safe. If you store the callback in a ThreadLocal, you can perform arbitrary seek operations at runtime: either from the onMessage method, or by listening for a ListenerContainerIdleEvent when there are no messages.
You can't perform arbitrary seeks, such as ReplayJobKafkaConsumer.getAnotherSeekCallback().seek("top", 0, 23), from another thread.
You can't perform arbitrary seeks before partitions have been assigned.
So, as I have been telling you in other answers/comments, you must do the seek when the partition(s) are assigned.
@Override
public void onPartitionsAssigned(Map<TopicPartition, Long> map, ConsumerSeekCallback consumerSeekCallback) {
    // Do the seeks here using the `consumerSeekCallback` parameter.
}
With modern versions of spring-kafka, you don't need to use ConsumerSeekAware unless you want to perform arbitrary seeks at runtime (after the initial seek). You can use a ConsumerAwareRebalanceListener instead.
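A rough sketch of that alternative, assuming a listener container factory bean is at hand (the factory variable, partition loop, and offset value are placeholders, not part of the original answer):

factory.getContainerProperties().setConsumerRebalanceListener(new ConsumerAwareRebalanceListener() {

    @Override
    public void onPartitionsAssigned(Consumer<?, ?> consumer, Collection<TopicPartition> partitions) {
        // Seek only once the partitions have actually been assigned.
        for (TopicPartition partition : partitions) {
            consumer.seek(partition, 23); // hypothetical start offset
        }
    }
});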

spring-integration-dsl: Make Feed-Flow Work

I'm trying to code an RSS-feed reader with a configured set of RSS feeds. I thought a good approach would be to code a prototype @Bean and call it for each RSS feed found in the configuration.
However, I guess I'm missing a point here, as the application launches but nothing happens. I mean, the beans are created as I'd expect, but there is no logging happening in that handle() method:
@Component
public class HomeServerRunner implements ApplicationRunner {

    private static final Logger logger = LoggerFactory.getLogger(HomeServerRunner.class);

    @Autowired
    private Configuration configuration;

    @Autowired
    private FeedConfigurator feedConfigurator;

    @Override
    public void run(ApplicationArguments args) throws Exception {
        List<IntegrationFlow> feedFlows = configuration.getRssFeeds()
                .entrySet()
                .stream()
                .peek(entry -> System.out.println(entry.getKey()))
                .map(entry -> feedConfigurator.feedFlow(entry.getKey(), entry.getValue()))
                .collect(Collectors.toList());

        // this one appears in the log-file and looks good
        logger.info("Flows: " + feedFlows);
    }
}
@Configuration
public class FeedConfigurator {

    private static final Logger logger = LoggerFactory.getLogger(FeedConfigurator.class);

    @Bean
    @Scope(ConfigurableBeanFactory.SCOPE_PROTOTYPE)
    public IntegrationFlow feedFlow(String name, FeedConfiguration configuration) {
        return IntegrationFlows
                .from(Feed
                        .inboundAdapter(configuration.getSource(), getElementName(name, "adapter"))
                        .feedFetcher(new HttpClientFeedFetcher()),
                    spec -> spec.poller(Pollers.fixedRate(configuration.getInterval())))
                .channel(MessageChannels.direct(getElementName(name, "in")))
                .enrichHeaders(spec -> spec.header("feedSource", configuration))
                .channel(getElementName(name, "handle"))
                //
                // it would be nice if the following would show something:
                //
                .handle(m -> logger.debug("Payload: " + m.getPayload()))
                .get();
    }

    private String getElementName(String name, String postfix) {
        name = "feedChannel" + StringUtils.capitalize(name);
        if (!StringUtils.isEmpty(postfix)) {
            name += "." + postfix;
        }
        return name;
    }
}
What's missing here? It seems as if I need to "start" the flows somehow.
Prototype beans need to be "used" somewhere: if you don't have a reference to one anywhere, no instance will be created.
Furthermore, you can't put an IntegrationFlow @Bean in that scope, because it internally generates a bunch of beans which won't be in that scope.
See the answer to this question and its follow-up for one technique you can use to create multiple adapters with different properties.
Alternatively, the upcoming 1.2 version of the DSL has a mechanism to register flows dynamically, sketched below.
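In later Spring Integration versions that mechanism became the IntegrationFlowContext. A rough sketch of registering one flow per configured feed, with feedFlow() turned into a plain factory method instead of a prototype @Bean (adapt to the API of your actual version):

@Autowired
private IntegrationFlowContext flowContext;

public void registerFeedFlows() {
    configuration.getRssFeeds().forEach((name, feedConfig) -> {
        IntegrationFlow flow = feedConfigurator.feedFlow(name, feedConfig);
        // Registers the flow's beans at runtime and starts it immediately.
        flowContext.registration(flow).id(name).register();
    });
}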

Atmosphere Request Scoped Broadcasts

I'm having trouble restricting the scope of my broadcasts, using Atmosphere 0.7.2.
I have a Jersey POJO with an open method, and I want to pass the broadcaster to a non-webservice class. This is to allow separation between the WAR and EJB layers.
@Path("/")
public class MyJerseyPojo {

    @Context
    private Broadcaster broadcaster;

    @GET
    @Suspend
    public String open() {
        Manager.register(/* Session ID */, new NonWebService(this.broadcaster));
        return "";
    }
}

public class NonWebService implements Sender {

    private Broadcaster broadcaster;

    public NonWebService(Broadcaster b) {
        this.broadcaster = b;
    }

    @Override
    public void send(String thing) {
        this.broadcaster.broadcast(thing);
    }
}
The idea is that update events call send, and this notifies the client via the suspended response. Each client should be associated with a separate broadcaster.
The problem is that this solution uses the same broadcaster for all clients. I have tried adding @Suspend(scope = Suspend.SCOPE.REQUEST) to the open method, but that causes no broadcast messages to be received.
I also tried the following in the open method:
@GET
@Suspend
public String open() {
    Broadcaster b = BroadcasterFactory.getDefault().get(JerseyBroadcaster.class, UUID.randomUUID());
    b.setScope(Broadcaster.SCOPE.REQUEST);
    Manager.register(/* Session ID */, new NonWebService(b));
    return "";
}
This didn't work either, whether using @Suspend or @Suspend(scope = Suspend.SCOPE.REQUEST). In each case the client didn't receive any broadcast messages. I did once get a message that the broadcaster had been destroyed and couldn't be used, but I can't remember how I did that!
I have seen a similar question, but I'm not sure how to translate it to my POJO, as I'm not extending AtmosphereHandler.
Thanks
It turns out, after a lot of digging, that it's not safe to cache broadcasters, because the AtmosphereFilter was creating a new, request-scoped broadcaster after my open() method completed.
Hence, the NonWebService class now looks like this:
public class NonWebService implements Sender {

    private Class<? extends Broadcaster> clazz;
    private Object broadcasterId;

    public NonWebService(Class<? extends Broadcaster> clazz, Object broadcasterId) {
        this.clazz = clazz;
        this.broadcasterId = broadcasterId;
    }

    @Override
    public void send(String thing) {
        // Look the broadcaster up by class and id on every send instead of caching it.
        Broadcaster b = BroadcasterFactory.getDefault().lookup(this.clazz, this.broadcasterId);
        b.broadcast(thing);
    }
}
Now I just need to figure out how to schedule request-scoped fixed broadcasts...
Actually, in the end I found that I needed to return a Broadcastable from my suspended method. The broadcaster was recreated as session-scoped but kept available for later broadcasts. Just make sure that you don't broadcast anything until the suspended method has returned, as messages sent earlier might get lost.
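For reference, returning a Broadcastable from the suspended method looks roughly like this (a sketch against the Atmosphere Jersey API of that era; the constructor arguments passed to NonWebService are assumptions matching the class above):

@GET
@Suspend
public Broadcastable open() {
    Broadcaster b = BroadcasterFactory.getDefault().get(JerseyBroadcaster.class, UUID.randomUUID());
    Manager.register(/* Session ID */, new NonWebService(b.getClass(), b.getID()));
    // Returning a Broadcastable ties the suspended response to this broadcaster,
    // so broadcasts sent after the method returns reach this client.
    return new Broadcastable("", b);
}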
