WF4 persistence is slow

I currently have a WCF workflow service with 5000 instances idling while waiting for human input. When the service receives a request that updates the database, persistence takes 5 seconds before the instance is written to the database. If I only have 500 instances, persistence is instant, as expected with timeToPersist set to 0 seconds. Is there any way to speed this up?
<sqlWorkflowInstanceStore
    connectionStringName="Request"
    instanceCompletionAction="DeleteAll"
    instanceLockedExceptionAction="BasicRetry"
    instanceEncodingOption="GZip"
    hostLockRenewalPeriod="00:00:05"
    runnableInstancesDetectionPeriod="00:00:02" />
<workflowIdle
    timeToUnload="00:00:00"
    timeToPersist="00:00:00" />
<serviceThrottling maxConcurrentInstances="15" />

If you are deleting the workflow instance after the workflow has completed, that could be one reason for the time required.
This is configured in a workflow service using the attribute
<sqlWorkflowInstanceStore connectionStringName="SQLInstancing"
instanceCompletionAction="DeleteAll"/>

It is not easy to speed this up. There is latency in retrieving data from the database and in deserializing the information. Try to keep your objects as small as possible.
You can also try to create your own persistence provider, for example one that keeps data in memory rather than in the database; you could use a caching engine such as NCache for this. But creating a persistence provider is difficult.
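For a sense of what a custom provider involves, here is a minimal sketch of an in-memory instance store, assuming the System.Runtime.DurableInstancing extensibility API; the class and field names are illustrative, and a real store would also have to handle locking, instance keys, activatable-instance queries, and the rest of the command set:

using System;
using System.Activities.DurableInstancing;
using System.Collections.Generic;
using System.Runtime.DurableInstancing;
using System.Threading;
using System.Xml.Linq;

public class InMemoryInstanceStore : InstanceStore
{
    readonly Guid ownerId = Guid.NewGuid();
    readonly Dictionary<Guid, IDictionary<XName, InstanceValue>> data =
        new Dictionary<Guid, IDictionary<XName, InstanceValue>>();

    protected override IAsyncResult BeginTryCommand(InstancePersistenceContext context,
        InstancePersistenceCommand command, TimeSpan timeout, AsyncCallback callback, object state)
    {
        if (command is CreateWorkflowOwnerCommand)
        {
            // Register this host as an instance owner.
            context.BindInstanceOwner(ownerId, Guid.NewGuid());
        }
        else if (command is SaveWorkflowCommand)
        {
            // Keep the serialized instance state in memory instead of SQL.
            data[context.InstanceView.InstanceId] = ((SaveWorkflowCommand)command).InstanceData;
        }
        else if (command is LoadWorkflowCommand)
        {
            // Hand the saved state back to the workflow runtime.
            context.LoadedInstance(InstanceState.Initialized,
                data[context.InstanceView.InstanceId], null, null, null);
        }
        return new CompletedResult(callback, state);
    }

    protected override bool EndTryCommand(IAsyncResult result)
    {
        return true;
    }

    // Trivial already-completed IAsyncResult used by the sketch above.
    class CompletedResult : IAsyncResult
    {
        public CompletedResult(AsyncCallback callback, object state)
        {
            AsyncState = state;
            if (callback != null) callback(this);
        }
        public object AsyncState { get; private set; }
        public WaitHandle AsyncWaitHandle { get { return new ManualResetEvent(true); } }
        public bool CompletedSynchronously { get { return true; } }
        public bool IsCompleted { get { return true; } }
    }
}

Of course, anything kept only in memory is lost if the host recycles, which defeats the durability purpose of persistence; a cache-backed store would push the data to NCache or similar instead of a dictionary.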

Related

Azure in-memory session state?

I will host my ASP.NET MVC4 app as a redundant Azure app. During a session, the app performs computationally expensive operations that produce non-serializable objects. Creation of the objects is repeatable; I could perform the expensive operation each time I need the object, but I would prefer to just do it the first time and save the object for later reuse.
I want to use the standard distributed session state mechanism in Azure for storing the usual session state info, but that mechanism requires that session data be serializable. Is there another mechanism I can use to cache the expensive-to-create, non-serializable objects?
Bob
All distributed cache services provided by Windows Azure currently need serialization, not only the shared cache, but the dedicated/co-located cache as well.
Serialization is not necessary if you use an in-memory cache, but that is not good for scaling out, and you may not qualify for the Azure SLA if you have only one instance.
So my suggestion is to optimize your serialization and try to use the Azure cache.
Do these objects have to be stored in centralized storage, or can you store them in "InProc" session state?
If InProc isn't an option, I'm afraid you'll need to serialize them into something (SQL Azure, a file, the AppFabric cache, etc.).
So either find a way to serialize them into something persistable, or store them in RAM, with an extra copy on every web server.
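If the per-server, in-RAM route is acceptable (creation is repeatable, so each web role instance can rebuild its own copy), a minimal sketch using System.Runtime.Caching might look like this; the key scheme and expiration are illustrative:

using System;
using System.Runtime.Caching;

public static class LocalObjectCache
{
    // One cache per process, i.e. per web role instance.
    static readonly MemoryCache Cache = MemoryCache.Default;

    // Returns the cached object, or rebuilds it on a miss.
    public static T GetOrCreate<T>(string key, Func<T> create, TimeSpan slidingExpiration)
    {
        object cached = Cache.Get(key);
        if (cached != null)
            return (T)cached;

        T value = create(); // expensive but repeatable
        Cache.Set(key, value, new CacheItemPolicy { SlidingExpiration = slidingExpiration });
        return value;
    }
}

Usage would be something like LocalObjectCache.GetOrCreate("result:" + sessionId, () => ExpensiveComputation(), TimeSpan.FromMinutes(20)), where ExpensiveComputation stands in for your own factory method.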

Create workflow service instances for large number of records at once

I'm working on a business problem which has to import files containing thousands of records. Each record has to be registered in a workflow as an individual record that goes through its own workflow.
The WF4 Corporate Purchase Process example has a good solution: in the first step it creates bookmarks for all the required record IDs, so the workflow can be resumed with the rest of the actions for each individual record/ID.
I would like to know how to implement the same thing using Workflow Services, as I could then get the benefits of AppFabric for my workflows.
Are there any other solutions for handling a batch of records/IDs? Otherwise the workflow service has to be called thousands of times just to register every record in a workflow instance, which is not a good solution.
I would like to know how to implement the same thing using Workflow Services, as I could then get the benefits of AppFabric for my workflows.
This is pretty straightforward. You're going to have one workflow that reads the file and loops through the results using the existing looping activities. Then, inside the loop, you start the workflow that each record needs (the "Service") by calling its endpoint with a Send activity.
Now, as for the workflow that is the Service, you're going to have a Receive activity at the top of the workflow with CanCreateInstance set to true. Everything after the Receive is no different from any other workflow. You may consider having a Send activity right after the Receive just to let the caller know that the Service has started, but that's not a requirement; the Receive is required because it forces WF to build the workflow for use with the WorkflowServiceHost.
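A minimal sketch of that Service workflow built in code (the contract name, namespace, and operation are illustrative; most people declare this in XAML instead):

using System.Activities;
using System.Activities.Statements;
using System.ServiceModel.Activities;
using System.Xml.Linq;

public static class RecordServiceWorkflow
{
    public static Activity Build()
    {
        var recordId = new Variable<int>("recordId");
        return new Sequence
        {
            Variables = { recordId },
            Activities =
            {
                new Receive
                {
                    ServiceContractName = XName.Get("IRecordService", "http://tempuri.org/"),
                    OperationName = "RegisterRecord",
                    CanCreateInstance = true, // each call spins up a new instance
                    Content = ReceiveContent.Create(new OutArgument<int>(recordId))
                },
                // ... the rest of the per-record workflow goes here
            }
        };
    }
}

Hosting this with a WorkflowServiceHost (or in IIS/AppFabric as a .xamlx) then gives you one workflow instance per registered record.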
Are there any other solutions for handling a batch of records/IDs? Otherwise the workflow service has to be called thousands of times just to register every record in a workflow instance, which is not a good solution.
Are you suggesting that receiving thousands of requests is not a good solution for a web server? Consider that an IIS server can handle roughly 25-50 requests, per instant in time, per core. Now consider that the loop loading your workflows isn't going to average more than maybe 5 in that instant of time, and probably more like 1 or 2.
I don't think the web server is going to be your issue. I've started up literally tens of thousands of workflows on a server via a loop just like the one you're going to build, and it didn't break a sweat.
One way would be to use WCF's MSMQ binding to launch your workflows. Requests can come in normally through HTTP, and WCF would route them to MSMQ and process the load. You can throttle how many workflow instances are used through the MSMQ binding + IIS settings.
Download this Word document, which describes setting up a workflow application with WCF and MSMQ: http://www.microsoft.com/en-us/download/details.aspx?id=21245
In the spirit of doing the simplest thing that could work, you can bring the subworkflow into the main workflow as an activity and use a ParallelForEach to execute the branch for each input from your file. No extra invoking is required, and the tooling supports this out of the box because all workflows are activities. Hosting the main process in a service, so you can avoid contention with the rest of your IIS users (real people though they may be), might be a good idea.
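A sketch of that approach, using the ParallelForEach<T> activity from System.Activities.Statements; WriteLine stands in for the real per-record subworkflow activity:

using System;
using System.Activities;
using System.Activities.Statements;
using System.Collections.Generic;

public static class BatchWorkflow
{
    public static Activity Build(IEnumerable<string> records)
    {
        var record = new DelegateInArgument<string>("record");
        return new ParallelForEach<string>
        {
            Values = new InArgument<IEnumerable<string>>(ctx => records),
            Body = new ActivityAction<string>
            {
                Argument = record,
                // Replace WriteLine with the subworkflow activity for one record.
                Handler = new WriteLine { Text = new InArgument<string>(record) }
            }
        };
    }
}

You could then run it with WorkflowInvoker.Invoke(BatchWorkflow.Build(File.ReadLines(path))), with the file parsing swapped in for whatever format your imports use.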
I do agree that calling IIS or a WCF service thousands of times is not a problem, though, unless you want to do it in a few seconds!
It is important to remember that one of the good things about workflow is that it has fairly low overhead (compared to other workflow products), so you should be more concerned about what your workflow does than about the idea of launching lots of instances. Batches like your example are very common.

WCF Service Throttling settings for concurrency with SQL Transaction

I have a WCF service with a complex operation contract that has to execute atomically, i.e. either the entire operation succeeds or it fails. The WCF service is hosted on an IIS server in an ASP.NET application. This operation runs a sequence of SQL commands in a transaction. During tests I found that with concurrent access by 4-5 users, at least one user gets a "Transaction Deadlock" error.
I then looked at the serviceThrottling settings which I had set to
<serviceThrottling maxConcurrentCalls ="5" maxConcurrentInstances ="50" maxConcurrentSessions ="5" />
and changed it to
<serviceThrottling maxConcurrentCalls ="1" maxConcurrentInstances ="1" maxConcurrentSessions ="1" />
I have turned off sessions since I don't need them in the service contract, so I don't know whether maxConcurrentSessions has any effect at all:
<ServiceContract([Namespace]:="http://www.8343kf393.com", SessionMode:=SessionMode.NotAllowed)>
This way I was queuing up the requests so that they are processed serially instead of concurrently. While the transaction issue went away, the processing time increased, which was expected.
I was wondering:
Is serviceThrottling the only way to resolve this issue?
How can I set serviceThrottling so that the service accepts many requests at the same time but processes them one at a time?
Is setting InstanceContextMode = InstanceContextMode.PerCall relevant here, since the application is an ASP.NET application, which is itself multithreaded?
Well, I think you're going about this the wrong way by trying to solve a database deadlock with WCF throttling.
You should try to understand why your database operations cause a deadlock and try to avoid it (perhaps by using locking hints).
A singleton will do what you ask (see the sketch after this answer), but that isn't very scalable.
It is relevant, but I think you get my drift: solve the deadlock in the database, not in WCF.
If it's SQL Server you are using, there's a great tool to analyze deadlocks (and a lot more) called the SQL Profiler. It's also a fairly well-documented topic in SQL Server Books Online.
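For completeness, the singleton looks like this (the contract and names are illustrative); WCF accepts requests concurrently but dispatches them to the single instance one at a time:

using System.ServiceModel;

[ServiceContract]
public interface IOrderProcessing
{
    [OperationContract]
    void SubmitOrder(string orderXml);
}

[ServiceBehavior(InstanceContextMode = InstanceContextMode.Single,
                 ConcurrencyMode = ConcurrencyMode.Single)]
public class OrderProcessingService : IOrderProcessing
{
    public void SubmitOrder(string orderXml)
    {
        // The transactional SQL sequence runs here; calls are serialized,
        // so it never overlaps with another SubmitOrder call.
    }
}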
Your changes caused the WCF service to function as a singleton instance. That fixed your database concurrency issue, but it only pushed the blocking into the client.
I'd recommend a different approach to remove the client-blocking penalty. Consider making this service, or at least a new service you extract that operation into, use a netMsmqBinding (a good overview is here). This means the client is never blocked, and delivery of the request to the service is guaranteed. The tradeoff is that there can be no immediate response to the request; you'll need to add another operation to poll for completion status and retrieve any expected results. It does take more work to spin up an MSMQ-based service, but the reliability is usually worth the effort.
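A sketch of self-hosting such an extracted operation over MSMQ, reusing the illustrative contract from the singleton sketch above; the queue name is illustrative, the queue must already exist, and because the binding's ExactlyOnce assurance is on by default it must be a transactional queue:

using System;
using System.ServiceModel;

class Program
{
    static void Main()
    {
        var host = new ServiceHost(typeof(OrderProcessingService));
        host.AddServiceEndpoint(
            typeof(IOrderProcessing),
            new NetMsmqBinding(NetMsmqSecurityMode.None),
            "net.msmq://localhost/private/orderQueue");
        host.Open();
        Console.WriteLine("Listening; press Enter to exit.");
        Console.ReadLine();
        host.Close();
    }
}

The client sends through the same binding and returns immediately; the service drains the queue at whatever rate its throttling settings allow.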

Listing Currently Running Workflows in .Net 4.0

I've got a .NET 4.0 workflow application hosted in WCF that takes a request to process some information. This information is passed to a secondary system via a web service, which returns a bool indicating that it's going to process that information.
My workflow then loops, sleeping for 5 minutes and then querying the secondary system to see if processing of the information is complete.
When it's complete, the workflow finishes.
I have this persisting in SQL, and it works perfectly.
My question is: how do I retrieve a list of the persisted workflows in such a way that I can tie them back to the original request? I'd like my UI to list the running workflows in a grid with the elapsed time they've been running.
I've thought about storing the workflow GUID in my primary DB and generating the list that way, but what I'd really like is to be able to reconcile what I think is running with what the persistence store thinks is running.
I'd also like to be able to select a running workflow and kill it off, or completely restart it, if the user determines that it's gone screwy.
You can promote data from the workflow using the SqlWorkflowInstanceStore. Promoted values are stored alongside the workflow data in the InstancesTable, via the InstancePromotedPropertiesTable. Querying the InstancePromotedProperties view is the easiest way to get at your data.
This blog post will show you the code you need.
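In outline, the store-side setup looks something like the sketch below; the promotion name, XName, and connection string are illustrative, and the workflow side must surface the value through a PersistenceParticipant extension that writes the same XName:

using System;
using System.Activities.DurableInstancing;
using System.Collections.Generic;
using System.Xml.Linq;

public static class TrackingStore
{
    public static SqlWorkflowInstanceStore Create(string connectionString)
    {
        XNamespace ns = "urn:myapp/request-tracking";
        var store = new SqlWorkflowInstanceStore(connectionString);
        // Values promoted as "variant" become queryable columns in the
        // InstancePromotedProperties view; the second list is for binary blobs.
        store.Promote("RequestTracking",
            new List<XName> { ns.GetName("RequestId") },
            null);
        return store;
    }
}

Querying the view then lets you join the workflow InstanceId against the request ID your own database knows about.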
Another option: use WorkflowRuntime.GetAllServices().
Then you can loop through each one to pull out the data you need. I would cache the results, given this may be an expensive operation; if you have only 100 or fewer running workflows and only a few users on your page, don't bother caching.
This way you don't have to create a DAL or repository layer, especially if you are using SQL for persistence.
http://msdn.microsoft.com/en-us/library/ms594874(v=vs.100).aspx

Performance issues with ASP.NET MVC/WCF site & Oracle backend

We are building an extranet loan status check website using ASP.NET MVC with a WCF backend. It's a pretty standard design, with the MVC site using a WCF service reference to get customer objects. The service uses an Oracle backend + HTTP binding, and won't be hosted on the same server as the MVC site (so we can't use a TCP binding to reduce latency).
The problem we encountered is that every call to the service results in a 7-8 second response time, which is unacceptable for an extranet site and well above the 2-second magic mark. The service method(s) call 12 stored procedures to create the customer object. The database is, unfortunately, denormalized (we can't change it, as it's also used by other in-house production systems), so most of the calls are basic select statements that populate the customer object and its associated objects. The service proxy is properly opened and closed/disposed in the MVC actions, so there are no service connection leaks. A new client proxy is created for every request (i.e., we are not using the singleton pattern for the service).
Any ideas how we can speed this up?
Thanks
It sounds like you already know where the problem is - it's the database.
I've never heard of a WCF operation taking more than a fraction of a second to set up and tear down, excluding any logic inside. So even if you could shave off 1-2 seconds of latency (which is probably an optimistic estimate), that doesn't really help if the database operation takes 5-6 seconds by itself.
Honestly? Running 12 stored procedures to create a customer is completely off-the-wall. The purpose of a stored procedure is to encapsulate all of the logic necessary to perform a complex database operation. The very first thing you need to do is change this to be one stored procedure - then if it's still slow, profile the database to see what's taking so long and fix it accordingly. Usually poor database performance is due to one or more missing indexes.
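For illustration, a hedged sketch of what the data access might look like once consolidated, assuming ODP.NET and a hypothetical GET_CUSTOMER_GRAPH procedure that returns the customer row and its child rows as several ref cursors in a single round trip:

using System;
using System.Data;
using Oracle.DataAccess.Client; // ODP.NET

public static class CustomerRepository
{
    public static void LoadCustomerGraph(string connectionString, int customerId)
    {
        using (var con = new OracleConnection(connectionString))
        using (var cmd = new OracleCommand("GET_CUSTOMER_GRAPH", con))
        {
            cmd.CommandType = CommandType.StoredProcedure;
            cmd.Parameters.Add("p_customer_id", OracleDbType.Int32).Value = customerId;
            cmd.Parameters.Add("p_customer", OracleDbType.RefCursor, ParameterDirection.Output);
            cmd.Parameters.Add("p_accounts", OracleDbType.RefCursor, ParameterDirection.Output);
            con.Open();
            using (OracleDataReader reader = cmd.ExecuteReader())
            {
                do
                {
                    while (reader.Read())
                    {
                        // Map the current result set onto the customer object graph.
                    }
                } while (reader.NextResult()); // each ref cursor is one result set
            }
        }
    }
}

The point is the single round trip: one network hop and one procedure invocation instead of twelve.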
Until you accurately measure what is really happening, don't be too quick to assume where the bottleneck is.
You really need to do an Oracle extended SQL trace to see where that slowness is coming from. Anything other than that is mostly guesswork. Here is a paper from Cary Millsap (of Method R and formerly of Hotsos) that you can download that details doing this:
http://method-r.com/downloads/doc_details/10-for-developers-making-friends-with-the-oracle-database-cary-millsap
