I've a long-running task defined in a Spring service. It is started by a Spring MVC controller. I want to start the service and return back an HttpResponse to the caller before the service ends. The service saves a file on file system at end.
In javascript I've created a polling job to check service status.
In Spring 3.2 I've found the #Async annotation, but I don't understand how it is different from DeferredResult and Callable. When do I have to use #Async and when should I use DeferredResult?
Your controller is eventually a function executed by the servlet container (I will assume it is Tomcat) worker thread. Your service flow start with Tomcat and ends with Tomcat. Tomcat gets the request from the client, holds the connection, and eventually returns a response to the client. Your code (controller or servlet) is somewhere in the middle.
Consider this flow:
Tomcat get client request.
Tomcat executes your controller.
Release Tomcat thread but keep the client connection (don't return response) and run heavy processing on different thread.
When your heavy processing complete, update Tomcat with its response and return it to the client (by Tomcat).
Because the servlet (your code) and the servlet container (Tomcat) are different entities, then to allow this flow (releasing tomcat thread but keep the client connection) we need to have this support in their contract, the package javax.servlet, which introduced in Servlet 3.0 . Now, getting back to your question, Spring MVC use the new Servlet 3.0 capability when the return value of the controller is DeferredResult or Callable, although they are two different things. Callable is an interface that is part of java.util, and it is an improvement for the Runnable interface (should be implemented by any class whose instances are intended to be executed by a thread). Callable allows to return a value, while Runnable does not. DeferredResult is a class designed by Spring to allow more options (that I will describe) for asynchronous request processing in Spring MVC, and this class just holds the result (as implied by its name) while your Callable implementation holds the async code. So it means you can use both in your controller, run your async code with Callable and set the result in DeferredResult, which will be the controller return value. So what do you get by using DeferredResult as the return value instead of Callable? DeferredResult has built-in callbacks like onError, onTimeout, and onCompletion. It makes error handling very easy.In addition, as it is just the result container, you can choose any thread (or thread pool) to run on your async code. With Callable, you don't have this choice.
Regarding #Async, it is much more simple – annotating a method of a bean with #Async will make it execute in a separate thread. By default (can be overridden), Spring uses a SimpleAsyncTaskExecutor to actually run these methods asynchronously.
In conclusion, if you want to release Tomcat thread and keep the connection with the client while you do heavy processing, then your controller should return Callable or DeferredResult. Otherwise, you can run the code on method annotated with #Async.
Async annotates a method so it is going to be called asynchronously.
#org.springframework.stereotype.Service
public class MyService {
#org.springframework.scheduling.annotation.Async
void DoSomeWork(String url) {
[...]
}
}
So Spring could do so you need to define how is going to be executed. For example:
<task:annotation-driven />
<task:executor id="executor" pool-size="5-10" queue-capacity="100"/>
This way when you call service.DoSomeWork("parameter") the call is put into the queue of the executor to be called asynchronously. This is useful for tasks that could be executed concurrently.
You could use Async to execute any kind of asynchronous task. If what you want is calling a task periodically you could use #Scheduled (and use task:scheduler instead of task:executor). They are simplified ways of calling java Runnables.
DeferredResult<> is used to answer to a petition without blocking the Tomcat HTTP thread used to answer. Usually is going to be the return value for a ResponseBody annotated method.
#org.springframework.stereotype.Controller
{
private final java.util.concurrent.LinkedBlockingQueue<DeferredResult<String>> suspendedRequests = new java.util.concurrent.LinkedBlockingQueue<>();
#RequestMapping(value = "/getValue")
#ResponseBody
DeferredResult<String> getValue() {
final DeferredResult<String> result = new DeferredResult<>(null, null);
this.suspendedRequests.add(result);
result.onCompletion(new Runnable() {
#Override
public void run() {
suspendedRequests.remove(result);
}
});
service.setValue(result); // Sets the value!
return result;
}
}
The previous example lacks one important thing and it's that doesn't show how the deferred result is going to be set. In some other method (probably the setValue method) there is going to be a result.setResult(value). After the call to setResult Spring is going to call the onCompletion procedure and return the answer to the HTTP request (see https://en.wikipedia.org/wiki/Push_technology#Long_polling).
But if you just are executing the setValue synchronously there is no advantage in using a deferred result.Here is where Async comes in hand. You could use an async method to set the return value in some point in the future using another thread.
#org.springframework.scheduling.annotation.Async
void SetValue(DeferredResult<String> result) {
String value;
// Do some time consuming actions
[...]
result.setResult(value);
}
Async is not needed to use a deferred result, its just one way of doing it.
In the example there is a queue of deferred results that, for example, a scheduled task could be monitoring to process it's pending requests. Also you could use some non blocking mechanism (see http://en.wikipedia.org/wiki/New_I/O) to set the returning value.
To complete the picture you could search information about java standard futures (http://docs.oracle.com/javase/1.5.0/docs/api/java/util/concurrent/Future.html) and callables (http://docs.oracle.com/javase/1.5.0/docs/api/java/util/concurrent/Callable.html) that are somewhat equivalent to Spring DeferredResult and Async.
DeferredResult takes advantage of the Servlet 3.0 AsyncContext. It will not block the thread like the others will when you need a result returned.
Another big benefit is that DeferredResult supports callbacks.
Related
Consider this extremely simple .NET Core 3.1 (and .NET 5) application with no special config or hosted services:
using System.Threading.Tasks;
using Microsoft.Extensions.Hosting;
internal class Program
{
public static async Task Main(string[] args)
{
var builder = Host.CreateDefaultBuilder(args);
builder.UseWindowsService();
var host = builder.Build();
var fireAndforget = Task.Run(async () => await host.RunAsync());
await Task.Delay(5000);
await host.StopAsync();
await Task.Delay(5000);
await host.RunAsync();
}
The first Run (sent as a background fire and forget task only for the purpose of this test) and Stop complete successfully. Upon calling Run a second time, I receive this exception:
System.AggregateException : 'Object name: 'EventLogInternal'.Cannot access a disposed object. Object name: 'EventLogInternal'.)'
If I do the same but using StartAsync instead of RunAsync (this time no need for a fireAndForget), I receive a System.OperationCanceledException upon called StartAsync the second time.
Am I right to deduce that .NET Generic Host aren't meant to be stopped and restarted?
Why do I need this?
My goal is to have a single application running as a Windows Service that would host two different .NET Generic Host. This is based on recommendation from here in order to have separate configuration and dependency injection rules and message queues.
One would stay active for all application lifetime (until the service is stopped in the Windows services) and would serve as a entry point to receive message events that would start/stop the other one which would be the main processing host with full services. This way the main services could be in "idle" state until they receive a message triggering their process, and another message could return them to idle state.
The host returned by CreateDefaultBuilder(...).Build() is meant to represent the whole application. From docs:
The main reason for including all of the app's interdependent resources in one object is lifetime management: control over app startup and graceful shutdown.
The default builder registers many services in singleton scope and when the host is stopped all of these services are disposed or switched to some "stopped" state. For example before calling StopAsync you can resolve IHostApplicationLifetime:
var appLifetime = host.Services.GetService<IHostApplicationLifetime>();
It has cancellation tokens representing application states. When you call StartAsync or RunAsync after stopping, all tokens still have IsCancellationRequested set to true. That's why the OperactionCancelledException is thrown in Host.StartAsync.
You can list other services during configuration:
For me it sounds like you just need some background jobs to process messages but I've never used NServiceBus so I don't know how it will work with something like Hangfire. You can also implement IHostedService and use it in the generic host builder.
I'm doing something like:
do
{
using IHost host = BuildHost();
await host.RunAsync();
} while (MainService.Restart);
with MainService constructor:
public MainService(IHostApplicationLifetime HostApplicationLifetime)
MainService.Restart is a static bool set by the MainService itself in response to some event which also calls HostApplicationLifetime.StopApplication().
Consider the normal scenario where an ASP.NET Core Web API application executes the service Controller action, but instead of executing all the work under the same thread (thread pool thread) until the response is created, I would like to use non-pooled threads (ideally pre-created) to execute the main work, either by scheduling one of these threads from the initial action pooled thread and free the pooled thread for serving other incoming requests, or passing the job to a pre-created non-pooled thread.
Among other reasons, the main reason to have these non-pooled and long running threads is that some requests may be prioritized and their threads put on hold (synchronized), thus it would not block new incoming requests to the API due to thread pool starvation, but older requests on hold (non-pooled threads) may be waked up and rejected and some sort of call back to the thread pool to return the web response back to the clients.
In summary, the ideal solution would be using a synchronization mechanism (like .NET RegisterWaitForSingleObject) where the pooled thread would hook to the waitHandle but be freed up for other thread pool work, and a new non-pooled thread would be created or used to carry on the execution. Ideally from a list of pre-created and idle non-pooled threads.
Seems async-await only works with Tasks and threads from the .NET thread pool, not with other threads. Also most techniques to create non-pooled threads do not allow the pooled thread to be free and return to the pool.
Any ideas? I'm using .NET Core and latest versions of tools and frameworks.
Thank you for the comments provided. The suggestion to check TaskCompletionSource was fundamental. So my goal was to have potentially hundreds or thousands of API requests on ASP.NET Core and being able to serve only a portion of them at a given time frame (due to backend constraints), choosing which ones should be served first and hold the others until backends are free or reject them later. Doing all this with thread pool threads is bad: blocking/holding and having to accept thousands in short time (thread pool size growing).
The design goal was the request jobs to move their processing from the ASP.NET threads to non pooled threads. I plan to to have these pre-created in reasonable numbers to avoid the overhead of creating them all the time. These threads implement a generic request processing engine and can be reused for subsequent requests. Blocking these threads to manage request prioritization is not a problem (using synchronization), most of them will not use CPU at all time and the memory footprint is manageable. The most important is that the thread pool threads will only be used on the very start of the request and released right away, to be only be used once the request is completed and return a response to the remote clients.
The solution is to have a TaskCompletionSource object created and passed to an available non-pooled thread to process the request. This can be done by queuing the request data together with the TaskCompletetionSource object on the right queue depending the type of service and priority of the client, or just passing it to a newly created thread if none available. The ASP.NET controller action will await on the TaskCompletionSouce.Task and once the main processing thread sets the result on this object, the rest of the code from the controller action will be executed by a pooled thread and return the response to the client. Meanwhile, the main processing thread can either be terminated or go get more request jobs from the queues.
using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Mvc;
namespace MyApi.Controllers
{
[Route("api/[controller]")]
public class ValuesController : Controller
{
public static readonly object locker = new object();
public static DateTime time;
public static volatile TaskCompletionSource<string> tcs;
// GET api/values
[HttpGet]
public async Task<string> Get()
{
time = DateTime.Now;
ShowThreads("Starting Get Action...");
// Using await will free the pooled thread until a Task result is available, basically
// returns a Task to the ASP.NET, which is a "promise" to have a result in the future.
string result = await CreateTaskCompletionSource();
// This code is only executed once a Task result is available: the non-pooled thread
// completes processing and signals (TrySetResult) the TaskCompletionSource object
ShowThreads($"Signaled... Result: {result}");
Thread.Sleep(2_000);
ShowThreads("End Get Action!");
return result;
}
public static Task<string> CreateTaskCompletionSource()
{
ShowThreads($"Start Task Completion...");
string data = "Data";
tcs = new TaskCompletionSource<string>();
// Create a non-pooled thread (LongRunning), alternatively place the job data into a queue
// or similar and not create a thread because these would already have been pre-created and
// waiting for jobs from queues. The point is that is not mandatory to create a thread here.
Task.Factory.StartNew(s => Workload(data), tcs,
CancellationToken.None, TaskCreationOptions.LongRunning, TaskScheduler.Default);
ShowThreads($"Task Completion created...");
return tcs.Task;
}
public static void Workload(object data)
{
// I have put this Sleep here to give some time to show that the ASP.NET pooled
// thread was freed and gone back to the pool when the workload starts.
Thread.Sleep(100);
ShowThreads($"Started Workload... Data is: {(string)data}");
Thread.Sleep(10_000);
ShowThreads($"Going to signal...");
// Signal the TaskCompletionSource that work has finished, wich will force a pooled thread
// to be scheduled to execute the final part of the APS.NET controller action and finish.
// tcs.TrySetResult("Done!");
Task.Run((() => tcs.TrySetResult("Done!")));
// The only reason I show the TrySetResult into a task is to free this non-pooled thread
// imediately, otherwise the following line would only be executed after ASP.NET have
// finished processing the response. This briefly activates a pooled thread just execute
// the TrySetResult. If there is no problem to wait for ASP.NET to complete the response,
// we do it synchronosly and avoi using another pooled thread.
Thread.Sleep(1_000);
ShowThreads("End Workload");
}
public static void ShowThreads(string message = null)
{
int maxWorkers, maxIos, minWorkers, minIos, freeWorkers, freeIos;
lock (locker)
{
double elapsed = DateTime.Now.Subtract(time).TotalSeconds;
ThreadPool.GetMaxThreads(out maxWorkers, out maxIos);
ThreadPool.GetMinThreads(out minWorkers, out minIos);
ThreadPool.GetAvailableThreads(out freeWorkers, out freeIos);
Console.WriteLine($"Used WT: {maxWorkers - freeWorkers}, Used IoT: {maxIos - freeIos} - "+
$"+{elapsed.ToString("0.000 s")} : {message}");
}
}
}
}
I have placed the whole sample code so anyone can easily create as ASP.NET Core API project and test it without any changes. Here is the resulting output:
MyApi> Now listening on: http://localhost:23145
MyApi> Application started. Press Ctrl+C to shut down.
MyApi> Used WT: 1, Used IoT: 0 - +0.012 s : Starting Get Action...
MyApi> Used WT: 1, Used IoT: 0 - +0.015 s : Start Task Completion...
MyApi> Used WT: 1, Used IoT: 0 - +0.035 s : Task Completion created...
MyApi> Used WT: 0, Used IoT: 0 - +0.135 s : Started Workload... Data is: Data
MyApi> Used WT: 0, Used IoT: 0 - +10.135 s : Going to signal...
MyApi> Used WT: 2, Used IoT: 0 - +10.136 s : Signaled... Result: Done!
MyApi> Used WT: 1, Used IoT: 0 - +11.142 s : End Workload
MyApi> Used WT: 1, Used IoT: 0 - +12.136 s : End Get Action!
As you can see the pooled thread runs until the await on the TaskCompletionSource creation, and by the time the Workload starts to process the request on the non-pooled thread there is ZERO ThreadPool threads being used and remains using no pooled threads for the entire duration of the processing. When the Run.Task executes the TrySetResult fires a pooled thread for a brief moment to trigger the rest of the controller action code, reason the Worker thread count is 2 for a moment, then a fresh pooled thread runs the rest of the ASP.NET controller action to finish with the response.
I am using the Azure Blob Storage SDK Microsoft.WindowsAzure.Storage for my ASP.NET application. I have seen some synchronous method calling asynchronous method. They use an helper method named RunWithoutSynchronizationContext from Microsoft.WindowsAzure.Storage.Core.Util.
The code is basically doing something like
SynchronizationContext current = SynchronizationContext.Current;
try
{
SynchronizationContext.SetSynchronizationContext((SynchronizationContext) null);
methodAsync().Wait();
}
finally
{
SynchronizationContext.SetSynchronizationContext(current);
}
I was just wondering if this is a way to avoid deadlock in .NET Framework when blocking on asynchronous code? If not, then what is it the purpose of this method?
One of the common API developer pitfalls in asynchronous development with the .Net TPL is deadlocks. Most commonly, this is caused by SDK consumers using your asynchronous SDK in a synchronous manner.
You could use ConfigureAwait(false) to avoid deadlocking. Calling this routine before awaiting a task will cause it to ignore the SynchronizationContext.
var temp = await methodAsync().ConfigureAwait(false);
However, you need to place ConfigureAwait(false) calls throughout the SDK, and it is easy to forget.
So the trick is understanding how the SynchronizationContext works. Anytime it is used, a call will be made to either it’s Send or Post method.
So, all we need to do is ensuring that these methods never get called:
public void Test_SomeActionNoDeadlock()
{
var context = new Mock<SynchronizationContext>
{
CallBase = true
};
SynchronizationContext.SetSynchronizationContext(context.Object);
try
{
context.Verify(m =>
m.Post(It.IsAny<SendOrPostCallback>(), It.IsAny<object>()), Times.Never);
context.Verify(m =>
m.Send(It.IsAny<SendOrPostCallback>(), It.IsAny<object>()), Times.Never);
}
finally
{
SynchronizationContext.SetSynchronizationContext(null);
}
}
Now we have a way to guarantee that ConfigureAwait(false) was used throughout the SDK method, so long as we get 100% test coverage through the logical paths in the method.
In Camel, there is an AsyncProcessor . Is there any equivalent in Spring Integration
There are several components in Spring Integration which deal with the async hands-off.
The #MessagingGateway can be configured with the ListenableFuture what is fully similar to mentioned AsyncProcessor in the Apache Camel: http://docs.spring.io/spring-integration/docs/4.3.10.RELEASE/reference/html/messaging-endpoints-chapter.html#async-gateway.
Also this Gateway can have Mono return type for Reactive Streams manner of async processing.
For simple thread shifting and parallel processing there is an ExecutorChannel. The PublishSubscribeChannel also can be configured with the TaskExecutor for parallelism: http://docs.spring.io/spring-integration/docs/4.3.10.RELEASE/reference/html/messaging-channels-section.html#channel-configuration.
The QueueChannel can also be used for some kind of async tasks.
At the same time any POJO invocation component (e.g. #ServiceActivator) can simply deal with the ListenableFuture as a return from the underlying POJO and perform similar async callback work: http://docs.spring.io/spring-integration/docs/4.3.10.RELEASE/reference/html/messaging-endpoints-chapter.html#async-service-activator
I am new to the EJB Projects. And am trying to understand the usage of #Transactional annotation at top of my EJB methods. I have searched for the content and there is no clear explanation on this. Can anyone explain clearly about this.
#Transactional comes from the Spring world, but Oracle finally included it in Java EE 7 specification (docs). Previously, you could only annotate EJBs with #TransactionAttribute annotation, and similar is now possible for CDIs as well, with #Transactional. What's the purpose of these annotations? It is a signal to the application server that certain class or method is transactional, indicating also how it is gonna behave in certain conditions, e.g. what if it's called inside a transaction etc.
An example:
#Transactional(Transactional.TxType.MANDATORY)
public void methodThatRequiresTransaction()
{
..
}
The method above will throw an exception if it is not called within a transaction.
#Transactional(Transactional.TxType.REQUIRES_NEW)
public void methodThatWillStartNewTransaction()
{
..
}
Interceptor will begin a new JTA transaction for the execution of this method, regardless whether it is called inside a running transaction or not. However, if it is called inside a transaction, that transaction will be suspended during the execution of this method.
See also:
TransactionalTxType