This is the flow I am working towards:
Make a call to the web API
The web API immediately returns OK
The web API does some work in the background
This is what I have achieved so far:
[Route("api/[controller]")]
public class PremController : Controller
{
private readonly myDbContext _context;
public PremController(myDbContext context)
{
_context = context;
}
[HttpGet]
public HttpResponseMessage Get()
{
Task.Factory.StartNew(() => DoWork());
return new HttpResponseMessage(HttpStatusCode.Accepted);
}
private void DoWork()
{
Delay(2000).ContinueWith(_ => GetProducts());
}
private void GetProducts()
{
var productUrls = _context.Products.Select(p => p.Url).ToArrayAsync();
}
static Task Delay(int milliseconds)
{
var tcs = new TaskCompletionSource<object>();
new Timer(_ => tcs.SetResult(null)).Change(milliseconds, -1);
return tcs.Task;
}
}
But I am getting an error that myDbContext is disposed of before the newly created task has completed. How can I solve this problem?
This happens because you're creating a fire-and-forget task that nothing waits on. If you awaited the work directly, the request would stay open until the work finished, so the context would not be disposed early. More specifically, the task you're creating outlives the lifetime of your context, as defined by the DI container (most likely request-scoped). As soon as the request completes, the context is disposed, killing the work your task is still trying to complete outside the request.
Long and short, this is bad design for a number of reasons. If you need to do "background" work, that should be offloaded to an entirely different process, not just a new thread. The code that runs there should be responsible for maintaining its own context, unaffected by what's going on in your web app. Task.Run/Task.Factory.StartNew is extremely bad for web applications since there's a finite thread pool, and starting up new threads from that pool reduces your server's total load capacity.
If you find yourself wanting to spin up a new thread in a web application, don't. It's almost universally wrong. Instead, schedule the work using a background processing solution like Hangfire or similar.
I could not find a definitive answer to whether it is safe to spawn threads within session-scoped JSF managed beans. The thread needs to call methods on the stateless EJB instance (that was dependency-injected to the managed bean).
The background is that we have a report that takes a long time to generate. This caused the HTTP request to time-out due to server settings we can't change. So the idea is to start a new thread and let it generate the report and to temporarily store it. In the meantime the JSF page shows a progress bar, polls the managed bean till the generation is complete and then makes a second request to download the stored report. This seems to work, but I would like to be sure what I'm doing is not a hack.
Check out EJB 3.1 @Asynchronous methods. This is exactly what they are for.
Here is a small example that uses OpenEJB 4.0.0-SNAPSHOTs. We have a @Singleton bean with one method marked @Asynchronous. Every time that method is invoked by anyone, in this case your JSF managed bean, it returns immediately regardless of how long the method actually takes.
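import javax.ejb.AccessTimeout;
import javax.ejb.AsyncResult;
import javax.ejb.Asynchronous;
import javax.ejb.Lock;
import javax.ejb.Singleton;
import java.util.concurrent.Future;
import static javax.ejb.LockType.READ;
import static java.util.concurrent.TimeUnit.SECONDS;
@Singleton
public class JobProcessor {
    @Asynchronous
    @Lock(READ)
    @AccessTimeout(-1)
    public Future<String> addJob(String jobName) {
        // Pretend this job takes a while
        doSomeHeavyLifting();
        // Return our result
        return new AsyncResult<String>(jobName);
    }
    private void doSomeHeavyLifting() {
        try {
            Thread.sleep(SECONDS.toMillis(10));
        } catch (InterruptedException e) {
            // Restore the interrupt flag before propagating
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}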
Here's a little test case that invokes that @Asynchronous method several times in a row.
Each invocation returns a Future object that essentially starts out empty and will later have its value filled in by the container when the related method call actually completes.
import junit.framework.TestCase;
import javax.ejb.embeddable.EJBContainer;
import javax.naming.Context;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
public class JobProcessorTest extends TestCase {
public void test() throws Exception {
final Context context = EJBContainer.createEJBContainer().getContext();
final JobProcessor processor = (JobProcessor) context.lookup("java:global/async-methods/JobProcessor");
final long start = System.nanoTime();
// Queue up a bunch of work
final Future<String> red = processor.addJob("red");
final Future<String> orange = processor.addJob("orange");
final Future<String> yellow = processor.addJob("yellow");
final Future<String> green = processor.addJob("green");
final Future<String> blue = processor.addJob("blue");
final Future<String> violet = processor.addJob("violet");
// Wait for the result -- 1 minute worth of work
assertEquals("blue", blue.get());
assertEquals("orange", orange.get());
assertEquals("green", green.get());
assertEquals("red", red.get());
assertEquals("yellow", yellow.get());
assertEquals("violet", violet.get());
// How long did it take?
final long total = TimeUnit.NANOSECONDS.toSeconds(System.nanoTime() - start);
// Execution should be around 9 - 21 seconds
assertTrue("" + total, total > 9);
assertTrue("" + total, total < 21);
}
}
Example source code
Under the covers what makes this work is:
The JobProcessor the caller sees is not actually an instance of JobProcessor. Rather it's a subclass or proxy that has all the methods overridden. Methods that are supposed to be asynchronous are handled differently.
Calls to an asynchronous method simply result in a Runnable being created that wraps the method and parameters you gave. This runnable is given to an Executor which is simply a work queue attached to a thread pool.
After adding the work to the queue, the proxied version of the method returns an implementation of Future that is linked to the Runnable which is now waiting on the queue.
When the Runnable finally executes the method on the real JobProcessor instance, it will take the return value and set it into the Future making it available to the caller.
Important to note that the AsyncResult object the JobProcessor returns is not the same Future object the caller is holding. It would have been neat if the real JobProcessor could just return String and the caller's version of JobProcessor could return Future<String>, but we didn't see any way to do that without adding more complexity. So the AsyncResult is a simple wrapper object. The container will pull the String out, throw the AsyncResult away, then put the String in the real Future that the caller is holding.
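The mechanism described above can be approximated in plain Java, outside any container. This is only a rough sketch: the class names AsyncProxySketch, RealJobProcessor and JobProcessorProxy are made up for illustration, and a real container generates its proxy and manages its own thread pool rather than using a hand-written wrapper like this one.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class AsyncProxySketch {

    /** Stand-in for the real bean: a plain, synchronous method. */
    static class RealJobProcessor {
        String addJob(String jobName) {
            return jobName; // pretend this took a while
        }
    }

    /** Hypothetical stand-in for the proxy the caller actually holds. */
    static class JobProcessorProxy {
        private final RealJobProcessor target = new RealJobProcessor();
        private final ExecutorService pool;

        JobProcessorProxy(ExecutorService pool) {
            this.pool = pool;
        }

        Future<String> addJob(String jobName) {
            // Wrap the method call and its argument in a task object.
            Callable<String> call = () -> target.addJob(jobName);
            // Queue the task and return at once with a Future linked to it.
            return pool.submit(call);
        }
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            JobProcessorProxy processor = new JobProcessorProxy(pool);
            Future<String> red = processor.addJob("red");
            Future<String> blue = processor.addJob("blue");
            // get() blocks until a pooled thread has stored the result.
            System.out.println(blue.get(5, TimeUnit.SECONDS));
            System.out.println(red.get(5, TimeUnit.SECONDS));
        } finally {
            pool.shutdown();
        }
    }
}
```

Here ExecutorService.submit plays the role of both the work queue and the Future factory, which is why the sketch needs no AsyncResult-style wrapper.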
To get progress along the way, simply pass a thread-safe object like AtomicInteger to the @Asynchronous method and have the bean code periodically update it with the percent complete.
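As a rough illustration of that progress idea, here is a container-free sketch. ProgressSketch and generateReport are hypothetical names, and a plain ExecutorService stands in for the container's asynchronous invocation; in the EJB version the counter would simply be an extra parameter of the asynchronous method.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

public class ProgressSketch {

    /** Hypothetical long-running job that publishes its progress as it goes. */
    static String generateReport(AtomicInteger percentComplete) throws InterruptedException {
        for (int step = 1; step <= 10; step++) {
            Thread.sleep(20);               // pretend to do a tenth of the work
            percentComplete.set(step * 10); // publish progress for whoever polls
        }
        return "report";
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        AtomicInteger percentComplete = new AtomicInteger(0);
        try {
            Callable<String> job = () -> generateReport(percentComplete);
            Future<String> result = pool.submit(job);
            // A page poll would read the counter; here a loop does the polling.
            while (!result.isDone()) {
                System.out.println(percentComplete.get() + "% complete");
                Thread.sleep(50);
            }
            System.out.println(result.get() + " ready at " + percentComplete.get() + "%");
        } finally {
            pool.shutdown();
        }
    }
}
```

The caller keeps a reference to the same AtomicInteger it handed to the job, so reading progress never blocks on the Future.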
Introduction
Spawning threads from within a session-scoped managed bean is not necessarily a hack as long as it does the job you want. But spawning threads on your own needs to be done with extreme care. The code must not allow a single user to, for example, spawn an unlimited number of threads per session, or let threads continue running after the session is destroyed. Otherwise it will blow up your application sooner or later.
The code needs to ensure that a user can, for example, never spawn more than one background thread per session and that the thread is guaranteed to be interrupted whenever the session is destroyed. For multiple tasks within a session you need to queue the tasks.
Also, all those threads should preferably be served by a common thread pool so that you can put a limit on the total amount of spawned threads at application level.
Managing threads is thus a very delicate task. That's why you'd better use the built-in facilities rather than homegrowing your own with new Thread() and friends. The average Java EE application server offers a container-managed thread pool which you can utilize via, among others, EJB's @Asynchronous and @Schedule. To be container independent (read: Tomcat-friendly), you can also use the java.util.concurrent ExecutorService and ScheduledExecutorService, available since Java 1.5, for this.
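For the container-independent route, the constraints listed above (one shared bounded pool, at most one task per session, interruption when the session goes away) could be sketched roughly as follows. SessionTaskRegistry and its method names are hypothetical; wiring sessionDestroyed to an HttpSessionListener and shutdown to application shutdown is assumed and not shown.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class SessionTaskRegistry {
    // One shared, bounded pool caps background threads application-wide.
    private final ExecutorService pool;
    // At most one task per session id; guarded by the synchronized methods.
    private final Map<String, Future<?>> tasks = new HashMap<>();

    public SessionTaskRegistry(int maxThreads) {
        this.pool = Executors.newFixedThreadPool(maxThreads);
    }

    /** Starts the task unless this session already has an unfinished one. */
    public synchronized boolean start(String sessionId, Runnable task) {
        Future<?> existing = tasks.get(sessionId);
        if (existing != null && !existing.isDone()) {
            return false;
        }
        tasks.put(sessionId, pool.submit(task));
        return true;
    }

    /** Intended to be called when the session is destroyed. */
    public synchronized void sessionDestroyed(String sessionId) {
        Future<?> task = tasks.remove(sessionId);
        if (task != null) {
            task.cancel(true); // interrupts the worker thread if it is running
        }
    }

    /** Intended to be called on application shutdown. */
    public void shutdown() {
        pool.shutdownNow();
    }

    public static void main(String[] args) throws Exception {
        SessionTaskRegistry registry = new SessionTaskRegistry(4);
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch interrupted = new CountDownLatch(1);
        Runnable longJob = () -> {
            started.countDown();
            try {
                Thread.sleep(60_000);
            } catch (InterruptedException e) {
                interrupted.countDown(); // the job observed the cancellation
            }
        };
        System.out.println(registry.start("session-1", longJob)); // true
        System.out.println(registry.start("session-1", longJob)); // false
        started.await(5, TimeUnit.SECONDS);
        registry.sessionDestroyed("session-1");
        System.out.println(interrupted.await(5, TimeUnit.SECONDS)); // true
        registry.shutdown();
    }
}
```

Rejecting a second task is only one possible policy; queueing per-session work instead would follow the same structure.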
Below examples assume Java EE 6+ with EJB.
Fire and forget a task on form submit
@Named
@RequestScoped // Or @ViewScoped
public class Bean {
    @EJB
    private SomeService someService;
    public void submit() {
        someService.asyncTask();
        // ... (this code will immediately continue without waiting)
    }
}
@Stateless
public class SomeService {
    @Asynchronous
    public void asyncTask() {
        // ...
    }
}
Asynchronously fetch the model on page load
@Named
@RequestScoped // Or @ViewScoped
public class Bean {
    private Future<List<Entity>> asyncEntities;
    @EJB
    private EntityService entityService;
    @PostConstruct
    public void init() {
        asyncEntities = entityService.asyncList();
        // ... (this code will immediately continue without waiting)
    }
    public List<Entity> getEntities() {
        try {
            return asyncEntities.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new FacesException(e);
        } catch (ExecutionException e) {
            throw new FacesException(e);
        }
    }
}
@Stateless
public class EntityService {
    @PersistenceContext
    private EntityManager entityManager;
    @Asynchronous
    public Future<List<Entity>> asyncList() {
        List<Entity> entities = entityManager
            .createQuery("SELECT e FROM Entity e", Entity.class)
            .getResultList();
        return new AsyncResult<>(entities);
    }
}
In case you're using the JSF utility library OmniFaces, this could be done even faster if you annotate the managed bean with @Eager.
Schedule background jobs on application start
@Singleton
public class BackgroundJobManager {
    @Schedule(hour="0", minute="0", second="0", persistent=false)
    public void someDailyJob() {
        // ... (runs every start of day)
    }
    @Schedule(hour="*/1", minute="0", second="0", persistent=false)
    public void someHourlyJob() {
        // ... (runs every hour of day)
    }
    @Schedule(hour="*", minute="*/15", second="0", persistent=false)
    public void someQuarterlyJob() {
        // ... (runs every 15th minute of hour)
    }
    @Schedule(hour="*", minute="*", second="*/30", persistent=false)
    public void someHalfminutelyJob() {
        // ... (runs every 30th second of minute)
    }
}
Continuously update application wide model in background
@Named
@RequestScoped // Or @ViewScoped
public class Bean {
    @EJB
    private SomeTop100Manager someTop100Manager;
    public List<Some> getSomeTop100() {
        return someTop100Manager.list();
    }
}
@Singleton
@ConcurrencyManagement(BEAN)
public class SomeTop100Manager {
    @PersistenceContext
    private EntityManager entityManager;
    private List<Some> top100;
    @PostConstruct
    @Schedule(hour="*", minute="*/1", second="0", persistent=false)
    public void load() {
        top100 = entityManager
            .createNamedQuery("Some.top100", Some.class)
            .getResultList();
    }
    public List<Some> list() {
        return top100;
    }
}
See also:
Spawning threads in a JSF managed bean for scheduled tasks using a timer
I tried this and it works great from my JSF managed bean:
ExecutorService executor = Executors.newFixedThreadPool(1);
@EJB
private IMaterialSvc materialSvc;
private void updateMaterial(Material material, String status, Location position) {
    executor.execute(new Runnable() {
        public void run() {
            synchronized (position) {
                // TODO update material in audit? do we need materials in audit?
                int index = position.getMaterials().indexOf(material);
                Material m = materialSvc.getById(material.getId());
                m.setStatus(status);
                m = materialSvc.update(m);
                if (index != -1) {
                    position.getMaterials().set(index, m);
                }
            }
        }
    });
}
@PreDestroy
public void destroy() {
    // Stop accepting new tasks when the bean is destroyed
    executor.shutdown();
}
We are in the process of migrating a .net framework web jobs implementation to dotnet core. I'm using the documented extension method (IHostBuilder.ConfigureServices) on IHostBuilder to register dependencies with the scopes that seem fit, i.e., scoped, because most of the time I want an instance per web job invocation.
In the unit of work implementation that we use, the Entity Framework DbContext is disposed when the unit of work completes. In local development, and this is the issue that leads to this question, I find that a second trigger (the web job is triggered via a ServiceBusTrigger) reuses the same instances of my dependencies, even though they are properly registered on the IServiceCollection via the regular AddScoped<,> API. In my scenario, this manifests itself as an ObjectDisposedException on the DbContext.
While investigating this, I found that all scoped services are reused over invocations, which leads to the question whether you have to do the scoping differently in Azure Webjobs? Is this a local development thing only?
So, in pseudo code, this is how stuff is implemented:
// Program.cs
public static async Task Main()
{
var builder = new HostBuilder();
builder.ConfigureLogging((ctx, loggingBuilder) => { /* ... */});
builder.ConfigureWebJobs(webJobsBuilder => {
// DO STUFF
webJobsBuilder.AddServiceBus(options => { /* ... */ });
});
builder.ConfigureServices(services => {
services.AddScoped<IService, ServiceImplementation>();
// ...
services.AddScoped<IContextFactory, ContextFactoryImplementation>();
// ...
});
var host = builder.Build();
using(host)
{
await host.RunAsync();
}
}
And the unit of work is basically:
public class UnitOfWork: IUnitOfWork
{
public UnitOfWork(DbContext context)
{
// ...
}
public void Commit()
{
dbContext.SaveChanges();
}
public void Dispose()
{
...
}
public void Dispose(bool disposing)
{
...
dbContext?.Dispose();
dbContext = null;
}
}
Thanks!
Ok guys, sorry to waste your time, it turns out that a particular service was incorrectly registered as a singleton. I have to admit I might have jumped to conclusions, given that we recently bumped into the issue of scoped services in combination with the use of HttpClient(Factory) in Azure Functions (which is a real problem).
I am moving an asp.net mvc5 application using EF6 to asp.net core MVC 3.0 using EF Core.
In my mvc5 application I have some administrative operations that modify the database and take a long time, so I use a pattern where I create a new DbContext that is not the one associated with the request context and then run the task in the background using Task.Run. This has been working fine for years.
In converting to .net core it was unclear how to create a new DBContext in the way that I was doing it in my old codebase. It seems like I should be able to create a Transient DBContext in these cases and all should be fine.
So I created a subclass of MyContext called MyTransientContext, and in my Configure class I added this service:
services.AddDbContext<MyTransientContext>(options =>
    options.UseSqlServer(
        context.Configuration.GetConnectionString("MyContextConnection")),
    ServiceLifetime.Transient, ServiceLifetime.Transient);
In my controller I inject the context in the action that needs the transient service and spawn a thread to do something with it:
public ActionResult Update([FromServices] MyTransientContext context) {
    Task.Run(() =>
    {
        try {
            // Do some long running operation with context
        }
        catch (Exception e) {
            // Report Exception
        }
        finally {
            context.Dispose();
        }
    });
    return RedirectToAction("Status");
}
I would not expect my transient context to be disposed until the finally block. But I am getting this exception when attempting to access the context on the background thread:
Cannot access a disposed object. A common cause of this error is disposing a context that was resolved from dependency injection and then later trying to use the same context instance elsewhere in your application. This may occur if you are calling Dispose() on the context, or wrapping the context in a using statement. If you are using dependency injection, you should let the dependency injection container take care of disposing context instances.
Object name: 'MyTransientContext'.
And indeed the _disposed flag is set to true on the context object.
I put a breakpoint on the constructor for MyTransientContext and "Made an Object ID" of the this pointer so that I could track the object. This transient object is being created and is the same one that is injected into my controller action. It's also the same object that I'm trying to reference when the exception is thrown.
I tried setting a data breakpoint on the _disposed member in order to get a callstack on when disposed is being set to true, but the breakpoint won't bind.
I also tried overriding the Dispose method on MyTransientContext, and it isn't called until my explicit dispose in the finally block, which is after the exception is thrown and caught.
I feel like I'm missing something fundamental here. Isn't this what the transient services are for? What would dispose a Transient service?
One last detail: MyTransientContext is derived from MyContext, which is in turn derived from IdentityDbContext (Microsoft.AspNetCore.Identity.EntityFrameworkCore.IdentityDbContext).
Edit: The reason that I went down the path of using a Transient was because of this ef core document page: https://learn.microsoft.com/en-us/ef/core/miscellaneous/configuring-dbcontext. It states that "...any code that explicitly executes multiple threads in parallel should ensure that DbContext instances aren't ever accessed concurrently. Using dependency injection, this can be achieved by either registering the context as scoped and creating scopes (using IServiceScopeFactory) for each thread, or by registering the DbContext as transient (using the overload of AddDbContext which takes a ServiceLifetime parameter)."
As xabikos pointed out, this seems to be overridden by the scoping of the asp.net DI system, where it looks like anything created by that system is scoped to the request context, including Transient objects. Can someone point out where that's documented so that I can better understand how to work with the limitations?
If you want to manage the lifetime of the service yourself, you can instantiate it manually (or use a factory):
public ActionResult Update()
{
    Task.Run(() =>
    {
        using (var context = new MyTransientContext(...))
        {
            try
            {
                // Do some long running operation with context
            }
            catch (Exception e)
            {
                // Report Exception
            }
        }
    });
    return RedirectToAction("Status");
}
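Or you can use IServiceScopeFactory to create a scope whose lifetime you control, and resolve the service from that scope inside the task. A context resolved from the controller's injected IServiceProvider would still be tied to the request scope and disposed when the request ends:
public class MyController : Controller
{
    private readonly IServiceScopeFactory _scopeFactory;
    public MyController(IServiceScopeFactory scopeFactory)
    {
        _scopeFactory = scopeFactory;
    }
    public ActionResult Update()
    {
        Task.Run(() =>
        {
            // The scope, and the context resolved from it, live until the task finishes
            using (var scope = _scopeFactory.CreateScope())
            {
                var context = scope.ServiceProvider.GetRequiredService<MyTransientContext>();
                try
                {
                    // Do some long running operation with context
                }
                catch (Exception e)
                {
                    // Report Exception
                }
            }
        });
        return RedirectToAction("Status");
    }
}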
You have mixed up how transient objects created by ASP.NET Core's built-in DI container behave.
You configured MyTransientContext as transient in the built-in DI system. This means a new instance is returned every time the service is resolved. However, the container still tracks disposable transient instances in the scope they were resolved from, and in an ASP.NET Core application that scope matches an HTTP request. When the request ends, all disposable objects resolved within it are disposed.
Now in your code, which is a synchronous action method, you spawn a Task with Task.Run. This is an asynchronous operation and you don't await it. In practice the task is started, but nothing waits for it to finish: the redirect happens and the request ends. If you then try to use the injected instance, you get the exception.
To solve this you need to change to an async action and await the Task.Run. Most likely you don't even need to spawn a new Task. But understand that this is probably not the best approach, as the long operation has to finish before the redirect takes place.
An alternative would be to use a messaging mechanism and send a message that triggers this operation. Another component, such as a worker service, then listens for those messages and processes them.
I have a requirement to start a process on the server that may run for several minutes, so I was thinking of exposing the following hub method:-
public async Task Start()
{
await Task.Run(() => _myService.Start());
}
There would also be a Stop() method that allows a client to stop the running process, probably via a cancellation token. I've also omitted code that prevents it from being started if already running, error handling, etc.
Additionally, the long-running process will be collecting data which it needs to periodically broadcast back to the client(s), so I was wondering about using an event - something like this:-
public async Task Start()
{
_myService.AfterDataCollected += AfterDataCollectedHandler;
await Task.Run(() => _myService.Start());
_myService.AfterDataCollected -= AfterDataCollectedHandler;
}
private void AfterDataCollectedHandler(object sender, MyDataEventArgs e)
{
Clients.All.SendData(e.Data);
}
Is this an acceptable solution or is there a "better" way?
You don't need to use SignalR to start the work; you can use the application's existing framework / design / API for this and only use SignalR for the pub/sub part.
I did this for my current customer's project: a user starts some work and all tabs belonging to that user are updated using SignalR. I used an open source library called SignalR.EventAggregatorProxy to abstract the domain from SignalR. Disclaimer: I'm the author of said library.
http://andersmalmgren.com/2014/05/27/client-server-event-aggregation-with-signalr/
edit: Using the .NET client your code would look something like this
public class MyViewModel : IHandle<WorkProgress>
{
public MyViewModel(IEventAggregator eventAggregator)
{
eventAggregator.Subscribe(this);
}
public void Handle(WorkProgress message)
{
//Act on work progress
}
}
In the book Pro ASP.NET MVC 4 there is an example of an asynchronous action:
public class RemoteDataController : AsyncController
{
public async Task<ActionResult> ConsumeAsyncMethod() {
string data = await new RemoteService().GetRemoteDataAsync();
return View("Data", (object)data);
}
}
public class RemoteService
{
public async Task<string> GetRemoteDataAsync() {
return await Task<string>.Factory.StartNew(() => {
Thread.Sleep(2000);
return "Hello from the other side of the world";
});
}
}
My question is: Would the task not just use a thread from the threadpool that is also used for serving requests?
Say I have a synchronous I/O bound method. I think calling this method with Task.Run and await in my action wouldn't increase the number of requests that can be handled concurrently, because the thread running the I/O bound method is no longer available for request handling. Or is there a separate thread pool only for the requests, and does using Task.Run in actions automatically use a different one? What got me thinking is this question: Using ThreadPool.QueueUserWorkItem in ASP.NET in a high traffic scenario, where the answer was more or less that only async methods from libraries should be used, where those libraries use their own thread pool.
Is it possible to configure the behavior? Does it work the same way with ASP.NET WebForms?
That's a really poor example. There are three things that I see immediately wrong with it, but the major one is as you pointed out:
Would the task not just use a thread from the threadpool that is also used for serving requests?
Yes, that example would.
Please consider this example instead:
public class RemoteDataController : Controller
{
public async Task<ActionResult> ConsumeAsyncMethod() {
string data = await new RemoteService().GetRemoteDataAsync();
return View("Data", data);
}
}
public class RemoteService
{
public async Task<string> GetRemoteDataAsync() {
await Task.Delay(2000);
return "Hello from the other side of the world";
}
}
The original example blocked a thread pool thread using Thread.Sleep. That's completely counterproductive on ASP.NET. As a general rule, do not use Task.Factory.StartNew or Task.Run on ASP.NET.
In contrast, Task.Delay is a naturally-asynchronous operation. By "naturally-asynchronous", I mean asynchronous in the same way that I/O operations are asynchronous (e.g., HttpClient for web calls). Naturally-asynchronous operations do not use threads, hence their appeal for ASP.NET servers (reducing pressure on the thread pool, allowing you to scale more).
It's interesting to think about how this works: when you use naturally-asynchronous methods as in my example, a thread starts the request up until it hits the await; at that point the request thread is returned to the thread pool (!) and for the next two seconds there are no threads processing that request (and yet the request has not completed). I like to call this phenomenon "zero-threaded concurrency". When the Delay finishes, a thread resumes processing the request and completes it.
On a side note, AsyncController is a leftover from MVC3. It is not needed with async/await.