We have a .NET application which calls a web service on the server (IIS) via OpenRia Services. This web service call runs a heavy calculation in which we load some DLLs via LoadLibrary, which we need to solve some linear systems. We have to work through a list of 1000 events. Every single event is a separate calculation and can be run independently of the others.
What we do is create 60 tasks on a 64-core machine, and every task takes one event => runs the calculation => takes the next event => runs the calculation, and so on until the list is empty.
As soon as the list is empty, our calculation is finished.
We now see the strange behaviour that the calculation seems to run fast on the first run, but when we run the same calculation again it gets slower on every run.
If we restart the server, the calculation runs fast again.
We did an analysis with PerfView and saw that on the second/third/fourth run the IIS worker process uses fewer threads than at the beginning.
On the first run the IIS worker process uses 60 threads (as we have defined), and on the second it uses fewer than 60. On every run, the number of threads actually used drops further.
The first run takes around 3 min, the second around 6 min, and by the third run we are already at around 15 min.
What could be the problem? I have tried using the ThreadPool directly, but I see the same effect as with Tasks.
Here is some sample code:
// This part of the code is called after the web service call.
ConcurrentStack<int> events = new ConcurrentStack<int>(); // Holds the list of 1000 entries.
ParallelOptions options = new ParallelOptions
{
    // MaxDegreeOfParallelism defaults to -1 ("unlimited"), which would make the
    // array allocation below throw; set the intended 60 workers explicitly.
    MaxDegreeOfParallelism = 60
};
Task[] tasks = new Task[options.MaxDegreeOfParallelism];
for (int i = 0; i < options.MaxDegreeOfParallelism; i++)
{
    tasks[i] = Task.Run(() => StartAnalysis(events));
}
Task.WaitAll(tasks);
private void StartAnalysis(ConcurrentStack<int> events)
{
    while (!events.IsEmpty)
    {
        int index;
        if (events.TryPop(out index))
        {
            DoHeavyCalculation();
        }
    }
}
ASP.NET processes requests by using threads from the .NET thread pool. The thread pool maintains a pool of threads that have already incurred the thread initialization costs.
Therefore, these threads are easy to reuse. The .NET thread pool is also self-tuning. It monitors CPU and other resource utilization, and it adds new threads or trims the thread pool size as needed.
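If pool trimming is what is happening here, one thing worth trying is raising the pool's worker minimum so the runtime keeps that many threads warm between runs. A minimal sketch (the value 60 matches the degree of parallelism above; whether this addresses the actual slowdown is an assumption):

using System;
using System.Threading;

// Keep the worker minimum at the degree of parallelism we need, so the pool
// does not get trimmed below it and then ramp up slowly on the next run.
ThreadPool.GetMinThreads(out _, out int minIo);
if (!ThreadPool.SetMinThreads(60, minIo))
{
    // SetMinThreads returns false when the requested values are out of range.
    Console.WriteLine("Could not raise the thread pool worker minimum.");
}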
Using ASP.NET Core .NET 5. Running on Windows.
Users upload large workbooks that need to be converted to a different format. Each conversion process is CPU intensive and takes around a minute to complete.
The idea is to use a pattern where the requests are queued in a background queue and then processed by background tasks.
So, I followed this Microsoft article.
The queuing part worked well, but the issue was that workbooks were executing sequentially in the background:
private async Task BackgroundProcessing(CancellationToken stoppingToken)
{
    while (!stoppingToken.IsCancellationRequested)
    {
        var workItem = await TaskQueue.DequeueAsync(stoppingToken);
        try
        {
            await workItem(stoppingToken);
        }
        catch (Exception ex)
        {
            _logger.LogError(ex,
                "Error occurred executing {WorkItem}.", nameof(workItem));
        }
    }
}
If I queued 10 workbooks, workbook 2 wouldn't start until workbook 1 was done, workbook 3 wouldn't start until workbook 2 was done, and so on.
So, I modified the code to run the tasks without await and suppressed the warning with the discard operator (please note workItem is now an Action, not a Task):
while (!stoppingToken.IsCancellationRequested)
{
    var workItem = await TaskQueue.DequeueAsync(stoppingToken);
    _ = Task.Factory.StartNew(() =>
    {
        try
        {
            workItem(stoppingToken);
        }
        catch (Exception ex)
        {
            _logger.LogError(ex, "Error occurred executing {WorkItem}.", nameof(workItem));
        }
    }, TaskCreationOptions.LongRunning);
}
That works -- I get all workbooks starting processing around the same time, and then they complete around the same time too. But, I am not sure if doing this is dangerous and can lead to bugs, crashes, etc.
Is the second version a workable solution, or will it lead to some disaster in the future? Is there a better way to implement parallel workloads on the background threads in ASP.NET?
Thanks.
Using an external queue has some advantages over in-memory queueing. In particular, the queue messages are stored in a reliable external store, with features for retries, multiple consumers, etc. If your app crashes, the queue item remains and can be tried again.
In Azure, you can use several services including Azure Storage Queues and Service Bus. I like Service Bus because it uses push-based behavior to avoid the need for a polling loop in your code. Either way, you can create an instance of IHostedService that will watch the queue and process the work items in a separate thread with configurable parallelization.
Look for examples of using it within ASP.NET Core, for example:
https://damienbod.com/2019/04/23/using-azure-service-bus-queues-with-asp-net-core-services/
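As a rough illustration of the configurable-parallelization idea, here is a minimal hosted-service sketch; it assumes an in-memory Channel as a stand-in for whatever queue you watch (Service Bus, Storage Queue, ...) and uses a SemaphoreSlim to cap concurrency:

using System;
using System.Threading;
using System.Threading.Channels;
using System.Threading.Tasks;
using Microsoft.Extensions.Hosting;
using Microsoft.Extensions.Logging;

// Drains a queue with bounded parallelism: up to 4 work items run at once.
public class QueueWorker : BackgroundService
{
    private readonly Channel<Func<CancellationToken, Task>> _queue;
    private readonly SemaphoreSlim _throttle = new SemaphoreSlim(4); // configurable
    private readonly ILogger<QueueWorker> _logger;

    public QueueWorker(Channel<Func<CancellationToken, Task>> queue,
        ILogger<QueueWorker> logger)
    {
        _queue = queue;
        _logger = logger;
    }

    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
    {
        await foreach (var workItem in _queue.Reader.ReadAllAsync(stoppingToken))
        {
            await _throttle.WaitAsync(stoppingToken); // wait for a free slot
            _ = Task.Run(async () =>
            {
                try { await workItem(stoppingToken); }
                catch (Exception ex) { _logger.LogError(ex, "Work item failed."); }
                finally { _throttle.Release(); }
            }, stoppingToken);
        }
    }
}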
The idea is to use a pattern where the requests are queued in a background queue and then processed by background tasks.
The proper solution for request-extrinsic code is to use a durable queue with a separate backend processor. Any in-memory solution will lose that work any time the application is shut down (e.g., during a rolling upgrade).
Consider the normal scenario where an ASP.NET Core Web API application executes a controller action. Instead of executing all the work on the same thread-pool thread until the response is created, I would like to use non-pooled threads (ideally pre-created) to execute the main work: either schedule one of these threads from the initial pooled thread and free the pooled thread to serve other incoming requests, or pass the job to a pre-created non-pooled thread.
Among other reasons, the main reason for these non-pooled, long-running threads is that some requests may be prioritized and their threads put on hold (synchronized). This would not block new incoming requests to the API through thread pool starvation, while older requests on hold (on non-pooled threads) could be woken up and rejected, with some sort of callback into the thread pool to return the web response to the clients.
In summary, the ideal solution would use a synchronization mechanism (like .NET's RegisterWaitForSingleObject) where the pooled thread hooks up the wait handle but is freed for other thread pool work, and a new non-pooled thread is created or reused to carry on the execution, ideally from a list of pre-created, idle non-pooled threads.
It seems async/await only works with Tasks and threads from the .NET thread pool, not with other threads. Also, most techniques for creating non-pooled threads do not allow the pooled thread to be freed and returned to the pool.
Any ideas? I'm using .NET Core and the latest versions of tools and frameworks.
Thank you for the comments provided. The suggestion to check TaskCompletionSource was fundamental. My goal was to have potentially hundreds or thousands of API requests in ASP.NET Core while being able to serve only a portion of them in a given time frame (due to backend constraints), choosing which ones should be served first and holding the others until the backends are free, or rejecting them later. Doing all this with thread pool threads is bad: it means blocking/holding pooled threads while having to accept thousands of requests in a short time (growing the thread pool).
The design goal was for the request jobs to move their processing from the ASP.NET threads to non-pooled threads. I plan to have these pre-created in reasonable numbers to avoid the overhead of creating them all the time. These threads implement a generic request-processing engine and can be reused for subsequent requests. Blocking these threads to manage request prioritization is not a problem (using synchronization); most of them will not use CPU at all times, and the memory footprint is manageable. Most importantly, thread pool threads are only used at the very start of the request and released right away, then used again only once the request is completed, to return a response to the remote clients.
The solution is to create a TaskCompletionSource object and pass it to an available non-pooled thread to process the request. This can be done by queuing the request data together with the TaskCompletionSource object on the right queue, depending on the type of service and the priority of the client, or by passing it to a newly created thread if none is available. The ASP.NET controller action will await TaskCompletionSource.Task, and once the main processing thread sets the result on this object, the rest of the controller action code is executed by a pooled thread, which returns the response to the client. Meanwhile, the main processing thread can either terminate or fetch more request jobs from the queues.
using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Mvc;

namespace MyApi.Controllers
{
    [Route("api/[controller]")]
    public class ValuesController : Controller
    {
        public static readonly object locker = new object();
        public static DateTime time;
        public static volatile TaskCompletionSource<string> tcs;

        // GET api/values
        [HttpGet]
        public async Task<string> Get()
        {
            time = DateTime.Now;
            ShowThreads("Starting Get Action...");
            // Awaiting frees the pooled thread until a Task result is available; ASP.NET
            // receives a Task, which is a "promise" of a result in the future.
            string result = await CreateTaskCompletionSource();
            // This code only executes once a Task result is available: the non-pooled thread
            // completes processing and signals (TrySetResult) the TaskCompletionSource object.
            ShowThreads($"Signaled... Result: {result}");
            Thread.Sleep(2_000);
            ShowThreads("End Get Action!");
            return result;
        }

        public static Task<string> CreateTaskCompletionSource()
        {
            ShowThreads("Start Task Completion...");
            string data = "Data";
            tcs = new TaskCompletionSource<string>();
            // Create a non-pooled thread (LongRunning); alternatively, place the job data
            // into a queue or similar and do not create a thread at all, because the worker
            // threads would already have been pre-created and waiting for jobs from the
            // queues. The point is that creating a thread here is not mandatory.
            Task.Factory.StartNew(s => Workload(data), tcs,
                CancellationToken.None, TaskCreationOptions.LongRunning, TaskScheduler.Default);
            ShowThreads("Task Completion created...");
            return tcs.Task;
        }

        public static void Workload(object data)
        {
            // This Sleep gives some time to show that the ASP.NET pooled thread has been
            // freed and has gone back to the pool by the time the workload starts.
            Thread.Sleep(100);
            ShowThreads($"Started Workload... Data is: {(string)data}");
            Thread.Sleep(10_000);
            ShowThreads("Going to signal...");
            // Signal the TaskCompletionSource that work has finished, which forces a pooled
            // thread to be scheduled to execute the final part of the ASP.NET controller
            // action and finish.
            // tcs.TrySetResult("Done!");
            Task.Run(() => tcs.TrySetResult("Done!"));
            // The only reason to wrap TrySetResult in a task is to free this non-pooled
            // thread immediately; otherwise the following line would only execute after
            // ASP.NET had finished processing the response. This briefly activates a pooled
            // thread just to execute the TrySetResult. If there is no problem waiting for
            // ASP.NET to complete the response, we can call it synchronously and avoid
            // using another pooled thread.
            Thread.Sleep(1_000);
            ShowThreads("End Workload");
        }

        public static void ShowThreads(string message = null)
        {
            int maxWorkers, maxIos, minWorkers, minIos, freeWorkers, freeIos;
            lock (locker)
            {
                double elapsed = DateTime.Now.Subtract(time).TotalSeconds;
                ThreadPool.GetMaxThreads(out maxWorkers, out maxIos);
                ThreadPool.GetMinThreads(out minWorkers, out minIos);
                ThreadPool.GetAvailableThreads(out freeWorkers, out freeIos);
                Console.WriteLine($"Used WT: {maxWorkers - freeWorkers}, Used IoT: {maxIos - freeIos} - " +
                    $"+{elapsed.ToString("0.000 s")} : {message}");
            }
        }
    }
}
I have included the whole sample so anyone can easily create an ASP.NET Core API project and test it without any changes. Here is the resulting output:
MyApi> Now listening on: http://localhost:23145
MyApi> Application started. Press Ctrl+C to shut down.
MyApi> Used WT: 1, Used IoT: 0 - +0.012 s : Starting Get Action...
MyApi> Used WT: 1, Used IoT: 0 - +0.015 s : Start Task Completion...
MyApi> Used WT: 1, Used IoT: 0 - +0.035 s : Task Completion created...
MyApi> Used WT: 0, Used IoT: 0 - +0.135 s : Started Workload... Data is: Data
MyApi> Used WT: 0, Used IoT: 0 - +10.135 s : Going to signal...
MyApi> Used WT: 2, Used IoT: 0 - +10.136 s : Signaled... Result: Done!
MyApi> Used WT: 1, Used IoT: 0 - +11.142 s : End Workload
MyApi> Used WT: 1, Used IoT: 0 - +12.136 s : End Get Action!
As you can see, the pooled thread runs until the await on the TaskCompletionSource creation, and by the time the Workload starts to process the request on the non-pooled thread there are ZERO thread pool threads in use, and no pooled threads are used for the entire duration of the processing. When Task.Run executes the TrySetResult, it fires up a pooled thread for a brief moment to trigger the rest of the controller action code, which is why the worker thread count is 2 for a moment; then a fresh pooled thread runs the rest of the ASP.NET controller action and finishes with the response.
I was originally trying to create an HTTP endpoint that would remain open for a long time (until a remote service executed and finished, then return the result to the original caller), and I hit some concurrency issues: this endpoint would only execute a small number of times concurrently (around 10, whereas I'd expect hundreds if not more).
I then narrowed my code down to a test endpoint that merely returns after the number of milliseconds you give it via the URL. This method should, in theory, give maximum concurrency, but it doesn't, either when running under IIS on a Windows 10 desktop PC or when running on a Windows 2012 Server.
This is the test Web API endpoint:
[Route("throughput/raw")]
[HttpGet]
public async Task<IHttpActionResult> TestThroughput(int delay = 0)
{
await Task.Delay(delay);
return Ok();
}
And this is a simple test app:
class Program
{
    static readonly HttpClient HttpClient = new HttpClient();
    static readonly ConcurrentBag<long> Stats = new ConcurrentBag<long>();
    private static Process _currentProcess;
    private static string url = "http://local.api/test/throughput/raw?delay=0";

    static void Main()
    {
        // Warm up
        var dummy = HttpClient.GetAsync(url).Result;
        Console.WriteLine("Warm up finished.");
        Thread.Sleep(500);

        // Get current process for later
        _currentProcess = Process.GetCurrentProcess();

        for (var i = 1; i <= 100; i++)
        {
            Thread t = new Thread(Proc);
            t.Start();
        }

        Console.ReadKey();
        Console.WriteLine($"Total requests: {Stats.Count}\r\nAverage time: {Stats.Average()}ms");
        Console.ReadKey();
    }

    static async void Proc()
    {
        Stopwatch sw = Stopwatch.StartNew(); // StartNew already starts the stopwatch
        await HttpClient.GetAsync(url);
        sw.Stop();
        Stats.Add(sw.ElapsedMilliseconds);
        Console.WriteLine($"Thread finished at {sw.ElapsedMilliseconds}ms. Total threads running: {_currentProcess.Threads.Count}");
    }
}
The results I get are these:
Warm up finished.
Thread finished at 118ms. Total threads running: 32
Thread finished at 114ms. Total threads running: 32
Thread finished at 130ms. Total threads running: 32
Thread finished at 110ms. Total threads running: 32
Thread finished at 115ms. Total threads running: 32
Thread finished at 117ms. Total threads running: 32
Thread finished at 119ms. Total threads running: 32
Thread finished at 112ms. Total threads running: 32
Thread finished at 163ms. Total threads running: 32
Thread finished at 134ms. Total threads running: 32
...
...
Some more
...
...
Thread finished at 4511ms. Total threads running: 32
Thread finished at 4504ms. Total threads running: 32
Thread finished at 4500ms. Total threads running: 32
Thread finished at 4507ms. Total threads running: 32
Thread finished at 4504ms. Total threads running: 32
Thread finished at 4515ms. Total threads running: 32
Thread finished at 4502ms. Total threads running: 32
Thread finished at 4528ms. Total threads running: 32
Thread finished at 4538ms. Total threads running: 32
Thread finished at 4535ms. Total threads running: 32
So:
I'm not sure why there are only 32 threads running (I assume it's related to the number of cores on my machine, although sometimes the number is 34, and in any case I'd expect it to be much higher).
The main issue I'm trying to tackle: the running time goes up as more calls are made, whereas I'd expect it to remain relatively constant.
What am I missing here? I'd expect an ASP.NET site (an API in this case, but it doesn't matter), running on a Windows Server (so no artificial concurrency limit applies), to handle all these concurrent requests just fine and not increase the response time. I believe the response time increases because threads are capped on the server side, so subsequent HTTP calls wait for their turn. I'd also expect more than 32/34 threads running in the client (test) application.
I also tried tweaking machine.config without much success, but I think even the defaults should give much more throughput.
HTTP Client
The number of simultaneous HttpClient connections is limited by your ServicePointManager. If you believe this article, the default is 2. TWO!! So your requests are getting queued. You can increase the number by setting the DefaultConnectionLimit.
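For example, run once at startup (the value 100 is arbitrary):

using System.Net;

// Raise the per-host connection limit before issuing any requests;
// per the article above, the default for a client app is 2.
ServicePointManager.DefaultConnectionLimit = 100;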
Threads
Edit by the OP: although factually true of thread pools, my question did not involve use of the thread pool. I'm leaving this here, though, for future reference (for usages slightly different from the one demonstrated in the question) and out of respect for the person who gave this answer.
There is a maximum number of threads in your default thread pool. The default is not preset; it depends on the amount of memory available and other factors, and is apparently 32 on your machine. See this article, which states:
Beginning with the .NET Framework 4, the default size of the thread pool for a process depends on several factors, such as the size of the virtual address space. A process can call the GetMaxThreads method to determine the number of threads.
You can, of course, change it.
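For instance, you can inspect the current limits and pre-grow the pool so it does not have to add threads one by one (a sketch; the values are illustrative):

using System;
using System.Threading;

ThreadPool.GetMaxThreads(out int maxWorkers, out int maxIo);
Console.WriteLine($"Max workers: {maxWorkers}, max IO threads: {maxIo}");

// Raising the minimum makes the pool start this many threads on demand
// without its usual slow ramp-up.
ThreadPool.SetMinThreads(100, 100);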
John's answer addresses setting the default connection limit. Additionally, don't use blocking threads at all; that way you won't need to care about the size of the thread pool. Your tester is I/O bound, not CPU bound. Your Proc already returns immediately, so just call it without a new thread. Change its return type to Task so you can tell when its deferred portion is done.
Then Main will go something like this:
public static async Task Main() {
    await HttpClient.GetAsync(url);
    await Task.Delay(500); // Wait for warm up.
    await Task.WhenAll(Enumerable.Range(0, 100).Select(_ => Proc()));
    // Print results here.
}
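And Proc becomes awaitable, roughly like this (reusing the HttpClient, Stats, and url fields from the tester above):

static async Task Proc()
{
    var sw = Stopwatch.StartNew();
    await HttpClient.GetAsync(url);
    sw.Stop();
    Stats.Add(sw.ElapsedMilliseconds);
    Console.WriteLine($"Request finished in {sw.ElapsedMilliseconds}ms.");
}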
I am working on an ASP.NET MVC 5 web application, deployed in IIS 8, and I have a method inside my application that performs a long-running task. It scans our network for servers and VMs and updates our database with the scan results; execution might take 30 to 40 minutes to complete in the production environment. I am using a scheduling tool named Hangfire, which calls this method twice a day.
Here is the job definition inside the Startup.cs file, which calls the method at 8:01 AM and 8:01 PM:
public void Configuration(IAppBuilder app)
{
    var options = new SqlServerStorageOptions
    {
        PrepareSchemaIfNecessary = false
    };
    GlobalConfiguration.Configuration.UseSqlServerStorage("scanservice", options);

    Service ss = new Service(); // the service that owns the scan method
    RecurringJob.AddOrUpdate(() => ss.Scan(), "01 8,20 * * *"); // cron: 8:01 and 20:01 daily
}
And here is the method that is called twice a day by the scheduling tool:
public void Scan()
{
    Service ss = new Service();
    ss.NetworkScan().Wait();
}
Finally, the method that does the real scan is below (I only provide a high-level description of what it does):
public async Task<ScanResult> NetworkScan()
{
    // Retrieve the server info from the DB.
    // Loop over all servers and execute some PowerShell commands to scan the
    // network and retrieve the info for each server, one by one...
    // After the shell command completes for each server, update the related
    // server info inside the DB, and finally return the aggregated ScanResult.
}
I did some tests in our test environment and everything worked well: the scan took around 25 seconds for 2 test servers. But now we are planning to move the application to production, where we have around 120+ servers to scan, so I estimate the method execution will take around 30 to 40 minutes to complete. My question is: how can I make sure that this execution never expires, and that the NetworkScan() method runs to completion?
Instead of worrying about your task timing out, perhaps you could start a new task for each server. That way each task is very short-lived, and any exception caused by scanning a single server will not affect all the others. Additionally, if your application is restarted in IIS, any scans that have not yet completed will be resumed; with all scans happening in one sequential task, that is not possible. You will likely also see the total time to scan your entire network plummet, as the majority of the time is probably spent waiting on remote servers.
public void Scan()
{
    Service ss = new Service();
    foreach (var server in ss.GetServers())
    {
        BackgroundJob.Enqueue<Service>(s => s.ServerScan(server));
    }
}
Now your scheduled task will simply enqueue one new task for each server.
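The per-server job itself might look roughly like this; ServerScan and its helpers are hypothetical, mirroring what the old NetworkScan loop body did, but for a single server:

public void ServerScan(Server server)
{
    // Run the PowerShell scan for just this one server
    // (hypothetical helper, extracted from the old NetworkScan loop).
    var info = RunShellScan(server);

    // Persist immediately, so a failure on one server only fails its own
    // job and already-scanned servers keep their fresh data.
    SaveScanResult(server, info); // hypothetical helper
}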
I have IIS 7.5. We currently have a weighted rating for entities on our website. Calculating the weighted rating is extremely slow, to the point where the homepage now takes more than 10 seconds to load.
To solve this, I'd like to store the weighting in the database with each entity, and have IIS run a script every 5-10 minutes that recalculates the weightings.
How do I go about doing this? It would be easiest for me if it ran a webpage URL.
One approach is to use a Windows service for this rather than calling a web URL.
This can then run completely out-of-band in the background to perform calculations. Details on this are here:
http://msdn.microsoft.com/en-us/library/d56de412%28v=VS.100%29.aspx
A few advantages include:
Your IIS process will not be affected, so your users will see no slowdown
The service can be stopped or started independently of the Web site
However, you'll need to have reasonably full access to the server to install and run the service.
You can use a Cache entry for this, set to expire 10 minutes in the future.
When you add the item, supply a callback function for the CacheItemRemovedCallback parameter; in this callback, do your database work and then re-add the expiring cache entry, as sketched below.
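A minimal sketch of that pattern with the System.Web cache; RecalculateWeightings stands in for your database work:

using System;
using System.Web;
using System.Web.Caching;

public static class RatingRefresher
{
    private const string Key = "RecalculateRatings";

    // Seed the dummy cache entry; call this once from Application_Start.
    public static void Start()
    {
        HttpRuntime.Cache.Insert(Key, DateTime.Now, null,
            DateTime.Now.AddMinutes(10), Cache.NoSlidingExpiration,
            CacheItemPriority.NotRemovable, OnRemoved);
    }

    // Fires when the entry expires: do the work, then re-add the entry
    // so the cycle repeats every 10 minutes.
    private static void OnRemoved(string key, object value, CacheItemRemovedReason reason)
    {
        RecalculateWeightings(); // your database work goes here (hypothetical)
        Start();
    }

    private static void RecalculateWeightings()
    {
        // Update the stored weightings for each entity.
    }
}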
Other options include:
Using one of the timer classes included in the BCL - there is an MSDN magazine article describing and comparing the different ones.
Writing a windows service to do this.
Using a scheduled task.
A Windows service and scheduled tasks still require you to have some way of communicating the results back to IIS.
To avoid needing a client to continuously call this function, you can create a thread on the server that calls the function which does the calculation. A better place to start the thread or timer is in Global.asax, as in this sample:
public class Global : System.Web.HttpApplication
{
    private Timer _timer; // System.Threading.Timer

    void Application_Start(object sender, EventArgs e)
    {
        int period = 1000 * 60 * 10; // 10 minutes
        _timer = new Timer(TimerCallback, null, 1000, period);
    }

    private void TimerCallback(object state)
    {
        // Do your stuff here
    }
}
I have done something like this before, and I used Windows scheduled tasks to call my script at specific intervals.
A simple batch file with WGET or similar, and help from Scheduled Tasks will do it.
http://www.gnu.org/software/wget/
You can try this to test the idea:
wget http://localhost/filename.ashx