Custom Windows Workflow activity that executes an asynchronous operation - redone using generic service

I am writing a custom Windows Workflow Foundation activity, that starts some process asynchronously, and then should wake up when an async event arrives.
All the samples I’ve found (e.g. this one by Kirk Evans) involve a custom workflow service, that does most of the work, and then posts an event to the activity-created queue. The main reason for that seems to be that the only method to post an event [that works from a non-WF thread] is WorkflowInstance.EnqueueItem, and the activities don’t have access to workflow instances, so they can't post events (from non-WF thread where I receive the result of async operation).
I don't like this design, as this splits functionality into two pieces, and requires adding a service to a host when a new activity type is added. Ugly.
So I wrote the following generic service that I call from the activity’s async event handler, and that can be reused by various async activities (error handling omitted):
// Generic runtime service: lets any activity post an item to a workflow queue
// from a non-WF thread.
class WorkflowEnqueuerService : WorkflowRuntimeService
{
    public void EnqueueItem(Guid workflowInstanceId, IComparable queueId, object item)
    {
        this.Runtime.GetWorkflow(workflowInstanceId).EnqueueItem(queueId, item, null, null);
    }
}
Now in the activity code, I can obtain and store a reference to this service, start my async operation, and when it completes, use this service to post an event to my queue. The benefits of this: I keep all the activity-specific code inside the activity, and I don't have to add a new service for each activity type.
But seeing the official and internet samples doing it with specialized non-reusable services, I would like to check if this approach is OK, or if I’m creating some problems here?

There is a potential problem here with regard to workflow persistence.
If you create long-running workflows that are persisted in a database so that the runtime can restart them, these workflows are not reloaded into memory until some external event reloads them. With your approach they are responsible for triggering that event themselves, but they cannot do so until they are reloaded. And we have a catch-22 :-(
The proper way to do this is using an external service. And while this might feel like dividing the code into two places, it really isn't. The reason is that the workflow is responsible for the big picture, i.e. what should be done, and the runtime service is responsible for the actual implementation, i.e. how it should be done. That way you can change the how without changing the what and when part.

A follow-up: regardless of all the reasons why it "should be done" using a service, this will be directly supported by .NET 4.0, which provides a clean way for an activity to start asynchronous work while suspending the persistence of the activity.
See
http://msdn.microsoft.com/en-us/library/system.activities.codeactivitycontext.setupasyncoperationblock(VS.100).aspx
for details.

Related

Axoniq Event Handler Resuming from offset

I am looking at the AxonIQ framework and have managed to get a test application up and running. But I have a question about how EventHandlers should be treated when using a store that has persistence in the Read Model.
From my (possibly naive) understanding, @EventHandler annotated methods in my Projection class get called from the beginning when first launched. This mechanism seems to assume that the Projection uses some kind of volatile store (e.g. an in-memory SQL database like H2) which is re-created from scratch during application bootup.
However, if the store was persistent, in something like Elasticsearch, I would want the @EventHandler to resume from its last persisted event instead of from the first event.
Is there any way to control the behaviour of the @EventHandler in this way?
Axon has two types of Event Processors: Subscribing and Tracking.
The Subscribing mode (which was the default up to Axon 3) will handle events in the thread that delivers them. That means you're at "the mercy" of the delivery guarantees of whichever component delivers the events.
The Tracking mode (which is the default since Axon 4 when using an Event Store or otherwise a source that supports it) will have events handled in dedicated threads, managed by the Event Processor itself. That means events are handled asynchronously from the actual publication mechanism.
The Tracking Event Processor uses Tokens to keep track of progress. These Tokens are stored in a TokenStore and updated as the Processor correctly processes each incoming event (possibly batched). You decide where those tokens are stored. If you update a relational database, we recommend storing the tokens in the same database, so that event changes and tokens are updated atomically.
If you don't specify any TokenStore, the behaviour depends on your setup. On Spring Boot, Axon will attempt to detect a suitable TokenStore implementation for you. Otherwise, it may very well just be an in-memory TokenStore, which causes Processors to re-initialize on every startup (and possibly start from the beginning).
To configure a TokenStore:
-On Spring (Boot), simply add a bean of type TokenStore with the implementation you want to use.
-When using Axon's Configuration API, on the EventProcessingConfigurer, use one of the registerTokenStore(...) methods.
When the Tracking Processor starts, it will check the Token Store for previous progress, and continue from there automatically.

Asp.net web api + entity framework: multiple requests cause data conflict

I'm developing an app with VS2013, using EF6.02, and Web API 2. I'm using the ASP.NET SPA template, and creating a RESTful api against an entity framework data source backed by a sql server. (In development, this resides on the SQL Server local instance.)
I've got two API methods so far (one that just reads data, one that writes data), and I'm testing them by calling them in the javascript. When I only call a single method in my script, either one works perfectly. But if I call both in script (without waiting for either's callback to fire), I get bad results and different exceptions in the debugger. Some exceptions state that the save can't be completed because there are pending transactions. Another exception stated something about a conflict with other threads. And sometimes, the read operation fails with a null pointer exception when trying to read a result set.
"New transaction is not allowed because there are other threads running in the session."
This makes me question if I'm correctly getting a new DBContext per request. My code for this looks like:
static Startup()
{
    context = new Data.SqlServer.AppDbContext();
    ...
}
and then whenever instantiating a unit of work, I access Startup.context.
I've tried to implement the unit of work pattern, and each request shares a single UOW object which has a single DBContext object.
My question: Do I have additional responsibility to ensure that web requests "play nicely" with each other? I hope that this is a problem that others have already dealt with. Perhaps the errors that I'm seeing are legitimate in the sense that if one user's data is being touched, it is temporarily in an invalid state and if other requests come in at that exact moment, they indeed will fail (and I should code anticipating these failures). I guess that even if each request has its own DBContext, they still share the same underlying SQL data source so perhaps that's causing issues.
I can try to put together a testcase, but I get differing behavior depending on where I put breakpoints and how long I spend on them, reaffirming to me that this is timing related.
Thanks for any help or suggestions...
-Ben
Your problem is where you are setting your context. The static Startup constructor runs once, when the entire application starts, so every request uses the same context. This is not a per-request setup, but rather a per-application setup. As to why you are getting the errors: an Entity Framework DbContext is NOT thread-safe. Since IIS uses many threads to handle concurrent requests, your single context is being used across multiple threads.
As for a solution, you can look into
-Dependency Injection frameworks (such as Ninject or Unity)
-place a using statement in your UnitOfWork classes
using(var context = new Data.SqlServer.AppDbContext()){//do stuff}
-Or, I have seen instances of people creating a class that gets the context for that request and stores it in the HttpContext.Current.Items collection (using a unique key so you can retrieve it in another class easily), so that you reuse the same context for the same request. Something like this:
public AppDbContext GetDbContext()
{
    var httpContext = HttpContext.Current;
    if (httpContext == null) return new AppDbContext();
    const string contextTypeKey = "AppDbContext";
    if (httpContext.Items[contextTypeKey] == null)
    {
        httpContext.Items.Add(contextTypeKey, new AppDbContext());
    }
    return httpContext.Items[contextTypeKey] as AppDbContext;
}
To use the above method, make a simple call var context = GetDbContext();
Note
We use all of the above methods, but this note applies specifically to the third one. It seems to work well, with two caveats. First, do not wrap the returned context in a using statement: you would dispose it, and it would no longer be available to any other classes for the rest of the request. Second, ensure that you have an Application_EndRequest handler that actually disposes of it. We saw these little buggers hanging around in memory after the request ended, causing a huge spike in memory usage.
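The second caveat could be handled along these lines. This is only a minimal sketch, not the answerer's actual code: RequestContextCleanup and DisposeItem are hypothetical names, and the helper takes a plain IDictionary (the type of HttpContext.Items) so that it can be exercised outside ASP.NET.

```csharp
using System;
using System.Collections;

// Hypothetical helper (not from the original answer): disposes a per-request
// object stored under a known key and removes it from the item collection.
public static class RequestContextCleanup
{
    // 'items' is expected to be HttpContext.Current.Items, which is an IDictionary.
    // Returns true when an IDisposable was found under 'key' and disposed.
    public static bool DisposeItem(IDictionary items, string key)
    {
        if (items == null || !items.Contains(key)) return false;

        var disposable = items[key] as IDisposable;
        items.Remove(key);                  // drop the reference either way
        if (disposable == null) return false;

        disposable.Dispose();
        return true;
    }
}

// Assumed wiring in Global.asax.cs; the key must match the one used in GetDbContext:
//
// protected void Application_EndRequest(object sender, EventArgs e)
// {
//     RequestContextCleanup.DisposeItem(HttpContext.Current.Items, "AppDbContext");
// }
```

Removing the entry as well as disposing it means a late call to GetDbContext in the same request would create a fresh context rather than hand back a disposed one.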

Starting and Forgetting an Async task in MVC Action

I have a standard, non-async action like:
[HttpPost]
public JsonResult StartGeneratePdf(int id)
{
    PdfGenerator.Current.GenerateAsync(id);
    return Json(null);
}
The idea being that I know this PDF generation could take a long time, so I just start the task and return, not caring about the result of the async operation.
In a default ASP.Net MVC 4 app this gives me this nice exception:
System.InvalidOperationException: An asynchronous operation cannot be started at this time. Asynchronous operations may only be started within an asynchronous handler or module or during certain events in the Page lifecycle. If this exception occurred while executing a Page, ensure that the Page is marked <%@ Page Async="true" %>.
Which is all kinds of irrelevant to my scenario. Looking into it, I can set a flag to prevent this exception:
<appSettings>
    <!-- Allows throwaway async operations from MVC Controller actions -->
    <add key="aspnet:AllowAsyncDuringSyncStages" value="true" />
</appSettings>
https://stackoverflow.com/a/15230973/176877
http://msdn.microsoft.com/en-us/library/hh975440.aspx
But the question is, is there any harm by kicking off this Async operation and forgetting about it from a synchronous MVC Controller Action? Everything I can find recommends making the Controller Async, but that isn't what I'm looking for - there would be no point since it should always return immediately.
Relax, as Microsoft itself says (http://msdn.microsoft.com/en-us/library/system.web.httpcontext.allowasyncduringsyncstages.aspx):
This behavior is meant as a safety net to let you know early on if
you're writing async code that doesn't fit expected patterns and might
have negative side effects.
Just remember a few simple rules:
Never await inside (async or not) void events (as they return immediately). Some WebForms Page events support simple awaits inside them - but RegisterAsyncTask is still the highly preferred approach.
Don't await on async void methods (as they return immediately).
Don't wait synchronously in the GUI or Request thread (.Wait(), .Result, .WaitAll(), .WaitAny()) on async methods that use none of the following: .ConfigureAwait(false) on the root await inside them, a root Task started with Task.Run(), or an explicitly specified TaskScheduler.Default (otherwise the GUI or Request thread will deadlock).
Use .ConfigureAwait(false) or Task.Run or explicitly specify TaskScheduler.Default for every background process, and in every library method, that does not need to continue on the synchronization context - think of it as the "calling thread", but know that it is not one (and not always on the same one), and may not even exist anymore (if the Request already ended). This alone avoids most common async/await errors, and also increases performance as well.
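A minimal sketch of the last rule, assuming a hypothetical library class ReportLibrary with a BuildReportAsync method (neither is from the original post, and Task.Delay merely stands in for real I/O). The await inside the library method uses ConfigureAwait(false), so its continuation does not need the caller's synchronization context.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical library code, for illustration only.
public static class ReportLibrary
{
    // Nothing after the await needs the caller's context, so the continuation
    // is allowed to resume on any thread-pool thread.
    public static async Task<string> BuildReportAsync(int id)
    {
        await Task.Delay(50).ConfigureAwait(false);   // stand-in for real I/O
        return "report-" + id;
    }
}
```

Because the continuation never tries to get back onto the captured context, a caller that blocks with ReportLibrary.BuildReportAsync(42).Result on a GUI or request thread should not deadlock on this method. Blocking is still discouraged; the point is only that the library no longer depends on the caller's context being free.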
Microsoft just assumed you forgot to wait on your task...
UPDATE: As Stephen clearly (pun not intended) stated in his answer, there is an inherent but hidden danger with all forms of fire-and-forget when working with application pools. It is not specific to async/await, but applies to Tasks, ThreadPool, and all other such methods as well: they are not guaranteed to finish once the request ends (the app pool may recycle at any time for a number of reasons).
You may care about that or not (if it's not business-critical as in the OP's particular case), but you should always be aware of it.
The InvalidOperationException is not a warning. AllowAsyncDuringSyncStages is a dangerous setting and one that I would personally never use.
The correct solution is to store the request to a persistent queue (e.g., an Azure queue) and have a separate application (e.g., an Azure worker role) processing that queue. This is much more work, but it is the correct way to do it. I mean "correct" in the sense that IIS/ASP.NET recycling your application won't mess up your processing.
If you absolutely want to keep your processing in-memory (and, as a corollary, you're OK with occasionally "losing" requests), then at least register the work with ASP.NET. I have source code on my blog that you can drop in your solution to do this. But please don't just grab the code; please read the entire post so it's clear why this is still not the best solution. :)
The answer turns out to be a bit more complicated:
If what you're doing, as in my example, is just setting up a long-running async task and returning, you don't need to do more than what I stated in my question.
But, there is a risk: If someone expanded this Action later where it made sense for the Action to be async, then the fire and forget async method inside it is going to randomly succeed or fail. It goes like this:
The fire and forget method finishes.
Because it was fired from inside an async Task, it will attempt to rejoin that Task's context ("marshal") as it returns.
If the async Controller Action has completed and the Controller instance has since been garbage collected, that Task context will now be null.
Whether it is in fact null will vary, because of the above timings - sometimes it is, sometimes it isn't. That means a developer can test and find everything working correctly, push to Production, and it explodes. Worse, the error this causes is:
A NullReferenceException - very vague.
Thrown inside .Net Framework code you can't even step into inside of Visual Studio - usually System.Web.dll.
Not captured by any try/catch because the part of the Task Parallel Library that lets you marshal back into existing try/catch contexts is the part that's failing.
So, you'll get a mystery error where things just don't occur - Exceptions are being thrown but you're likely not privy to them. Not good.
The clean way to prevent this is:
[HttpPost]
public JsonResult StartGeneratePdf(int id)
{
    #pragma warning disable 4014 // Fire and forget.
    Task.Run(async () =>
    {
        await PdfGenerator.Current.GenerateAsync(id);
    }).ConfigureAwait(false);
    #pragma warning restore 4014
    return Json(null);
}
So, here we have a synchronous Controller with no issues. To ensure it stays that way even if we change it to async later, we explicitly start a new Task via Task.Run, which by default puts the Task on the main ThreadPool. If we awaited it, it would attempt to tie it back to this context, which we don't want, so we don't await it, and that gets us a nuisance warning. We suppress that warning with #pragma warning disable.

Invoke Child Workflow Activity Asynchronously

Team:
I need to invoke a WF activity (XAML) from a WF service (XAMLX) asynchronously. I am already referencing the Microsoft.Activities.Extensions framework and I'm running on the Platform Update 1 for the state machine -- so if the solution is already in one of those libraries I'm ready!
Now, I need to invoke that activity (XAML) asynchronously -- but it has an output parameter that needs to set a variable in the service (XAMLX). Can somebody please provide me a solution to this?
Thanks!
* UPDATE *
Now I can post pictures, * I think *, because I have enough reputation! Let me put a couple out here and try to better explain my problem. The first picture is the WF Service that has the two entry points for the workflow -- the second is the workflow itself.
This workflow is an orchestration mechanism that constantly restarts itself, and has some failover mechanisms (e.g. exit on error threshold and soft exit) so that we can manage our queue of durable transactions using WF!
Now, we had this workflow working great when it was all one WF Service because we could call the service, get a response back and send the value of that response back into another entry point in a trigger to issue a soft exit. However, a new requirement has arisen asking us to make the workflow itself a WF activity in another project and have the Receive/Send-Reply sequences in the WF Service Application project.
However, we need to be able to start up this workflow and forget about it, then let it know somehow that a soft exit is necessary later on down the road. Since WF executes on a single thread, this has become a bit challenging at best.
Strictly speaking in XAML activities Parallel and ParallelForEach are how you perform asynchrony.
The workflow scheduler only uses a single thread (much like UI) so any activity that is running will typically be running on the same thread, unless it implements AsyncCodeActivity, in which case you are simply handing back the scheduler thread to the runtime while waiting for a callback from whichever async code your AsyncCodeActivity implementation is calling.
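To make the AsyncCodeActivity description concrete, here is a minimal sketch. It assumes .NET Framework 4.0 or later with a reference to System.Activities; the activity name SlowUpperCase and its workload are hypothetical. The work is started in BeginExecute through a delegate's BeginInvoke (a .NET Framework feature), and the scheduler thread is handed back until the callback fires.

```csharp
using System;
using System.Activities;
using System.Threading;

// Hypothetical activity, for illustration only.
public sealed class SlowUpperCase : AsyncCodeActivity<string>
{
    [RequiredArgument]
    public InArgument<string> Text { get; set; }

    protected override IAsyncResult BeginExecute(
        AsyncCodeActivityContext context, AsyncCallback callback, object state)
    {
        Func<string, string> work = input =>
        {
            Thread.Sleep(100);                 // stand-in for slow work
            return input.ToUpperInvariant();
        };

        // Keep the delegate so EndExecute can collect its result.
        context.UserState = work;

        // Runs 'work' on a thread-pool thread; the workflow scheduler thread
        // is released until 'callback' is invoked.
        return work.BeginInvoke(Text.Get(context), callback, state);
    }

    protected override string EndExecute(
        AsyncCodeActivityContext context, IAsyncResult result)
    {
        var work = (Func<string, string>)context.UserState;
        return work.EndInvoke(result);         // becomes the activity's Result
    }
}
```

For a quick check, WorkflowInvoker.Invoke(new SlowUpperCase { Text = "abc" }) runs the activity synchronously from the caller's point of view and returns its result.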
Therefore are you sure this is what you want to achieve? Do you mean you want to run it after you have sent your initial response? In this case place your activity after the Send Reply.
Please provide more info if these suggestions don't answer your question.
Update:
The original requirement posed (separating implementation from the service Receive/Send activities) may actually be solved by hosting the target activity as a service. See the following link
http://blog.petegoo.com/index.php/2011/09/02/building-an-enterprise-workflow-system-with-wf4/

Special considerations for using threads in IIS

I'd like to start using asynchronous processing in IIS. Edit: I'm talking about using the task parallel library.
For example, on certain page loads I want to log a bunch of crap, send an email, update some tables, etc. But I don't want to make the user wait for me to log all that crap.
So normally what I do is I have a static Queue that I push the log info onto, and then I have a cron job that calls a special page every 10 minutes whose OnLoad flushes out the queue. This works, but it's kind of clunky to set up, especially when you want to log 50 things. I'd rather do this:
Task.CreateNew(() => Log(theStuff));
However I'm terrified of running tasks in IIS because one slip up and your entire website goes down.
So now I have
SafeTask.FireAndForget(() => Log(theStuff));
This wraps the delegate in some try/catch and passes it into Task.CreateNew. So if someone changes something that affects something else that generates an exception somewhere else that accidentally gets thrown on the task thread, we get a notification instead of a crashed website. Also, the error notification inside the catch is also inside its own try/catch, and the catch for that also has a try/catch that tries to log in a different way.
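The wrapper described above might look roughly like this. It is a sketch under assumptions, not the asker's actual class: Task.Run (.NET 4.5) is used in place of the question's Task.CreateNew, which does not appear to be a TPL method, and the OnError hook is a hypothetical stand-in for the notification logic.

```csharp
using System;
using System.Threading.Tasks;

public static class SafeTask
{
    // Hypothetical hook: replace with your own notification or logging.
    public static Action<Exception> OnError = ex => Console.Error.WriteLine(ex);

    // Runs 'work' on the thread pool and swallows every exception after
    // reporting it. The Task is returned only so tests can wait on it;
    // fire-and-forget callers simply ignore it.
    public static Task FireAndForget(Action work)
    {
        return Task.Run(() =>
        {
            try
            {
                work();
            }
            catch (Exception ex)
            {
                try
                {
                    OnError(ex);
                }
                catch
                {
                    // Last resort: a failing error handler must not escape either.
                }
            }
        });
    }
}
```

Usage would be SafeTask.FireAndForget(() => Log(theStuff)); as in the question. The wrapper only contains exceptions; it does nothing to keep the work alive if the worker process is recycled.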
Now that I can safely run stuff asynchronously in IIS, what other things do I need to worry about before I can start using my SafeTask class?
Every request in IIS and .NET is processed on one thread by default. This thread comes from the thread pool of the worker process that serves your application pool. Existing threads are reused, so you can't really use them for thread state unless you clear or set it every time. You define the size of this thread pool using a formula from MSDN in machine.config or even your web.config.
Now, every async function call is put on a different thread. This includes async web service calls, async page functions, async delegates, etc. This thread comes from the same thread pool, thus reducing the number of threads available for IIS to service new requests.
Most likely, your application will work just fine while using async function calls. In case you are worried or you have a lot of async tasks then you may want to create your own thread pool or look at SmartThreadPool on codeplex.
Hope this helps.
Consider using the page's OnUnload event. Read about it here: http://msdn.microsoft.com/en-us/library/ms178472.aspx
This event fires after the content is sent to the user (so the user isn't blocked while you do work), and should completely satisfy your requirement without introducing additional threads.
Specific to your question, you should be concerned about thread pool exhaustion only if your load and performance testing suggests you're running up against thread limits. If you're not then what you propose is certainly reasonable.
