Using SemaphoreSlim with Parallel.ForEach - asp.net

This is what I am trying to achieve. I have a process that runs every minute and performs some I/O operations. I want up to 5 threads to execute the operations simultaneously. If 2 threads take longer than a minute, then when the process runs again a minute later it should start only 3 threads, because 2 are still busy.
So I used a combination of SemaphoreSlim and Parallel.ForEach. Please let me know whether this is the correct way to achieve this, or whether there is a better way.
private static SemaphoreSlim _semaphoreSlim = new SemaphoreSlim(5);
private async Task ExecuteAsync()
{
    try
    {
        var availableThreads = _semaphoreSlim.CurrentCount;
        if (availableThreads > 0)
        {
            var lists = await _feedSourceService.GetListAsync(availableThreads); // select #top(availableThreads) * from table
            Parallel.ForEach(
                lists,
                new ParallelOptions
                {
                    MaxDegreeOfParallelism = availableThreads
                },
                async item =>
                {
                    await _semaphoreSlim.WaitAsync();
                    try
                    {
                        // I/O operations
                    }
                    finally
                    {
                        _semaphoreSlim.Release();
                    }
                });
        }
    }
    catch (Exception ex)
    {
        _logger.LogError(ex.Message, ex);
    }
}

Let's say I have a process which runs every minute and performs some I/O operations... Suppose if 2 threads took longer than a minute and when the process runs again after a minute, it should execute 3 threads simultaneously as 2 threads are already doing some operations.
This kind of problem description is somewhat common, but it is surprisingly difficult to code correctly, because you have a polling-style (time-based) timer that is trying to periodically adjust a throttling mechanism.
So, the first thing I'd recommend is to change the problem description. Consider having the polling mechanism read all outstanding work, and then use normal throttling from there (e.g., adding them to an execution-constrained ActionBlock).
That said, if you'd prefer to continue down the more complex path, code like this would avoid the problem of mixing Parallel with async (Parallel.ForEach does not await async lambdas, so it returns before the I/O operations have finished):
private static SemaphoreSlim _semaphoreSlim = new SemaphoreSlim(5);
private async Task ExecuteAsync()
{
    try
    {
        var availableThreads = _semaphoreSlim.CurrentCount;
        if (availableThreads > 0)
        {
            var lists = await _feedSourceService.GetListAsync(availableThreads); // select #top(availableThreads) * from table
            var tasks = lists.Select(
                async item =>
                {
                    await _semaphoreSlim.WaitAsync();
                    try
                    {
                        // I/O operations
                    }
                    finally
                    {
                        _semaphoreSlim.Release();
                    }
                }).ToList();
            await Task.WhenAll(tasks);
        }
    }
    catch (Exception ex)
    {
        _logger.LogError(ex.Message, ex);
    }
}

Related

How to use CompletableFuture to execute threads in parallel without waiting, and combine the results?

I have an executeGetCapability method which is executed in different threads, but these threads run sequentially, meaning each one completes before the next one starts.
@Async("threadPoolCapabilitiesExecutor")
public CompletableFuture<CapabilityDTO> executeGetCapability(final String id, final LoggingContextData data){...}
This method is called in the following way:
public CapabilityResponseDTO getCapabilities(final List<String> ids) {
    final CapabilityResponseDTO responseDTO = new CapabilityResponseDTO();
    final List<CapabilityDTO> listOfCapabilityDTOS = new ArrayList<>();
    try {
        for (String id : ids) {
            listOfCapabilityDTOS.add(
                asyncProcessService.executeGetCapability(id, LoggingContext.getLoggingContextData()).get());
        }
    } catch (Exception e) {
        ....
    }
    responseDTO.setDTOS(listOfCapabilityDTOS);
    return responseDTO;
}
How can I call the executeGetCapability method using CompletableFuture so that the threads run in parallel without waiting for each other, and the results are then combined? How can I use CompletableFuture.supplyAsync and/or .allOf here? Can someone explain this to me?
Thanks
The reduce helper function from this answer converts a Stream<CompletableFuture<T>> to a CompletableFuture<Stream<T>>. You can use it to asynchronously combine the results of multiple calls to executeGetCapability:
// For each id, create a future that asynchronously executes executeGetCapability
Stream<CompletableFuture<CapabilityDTO>> futures = ids.stream()
    .map(id -> executeGetCapability(id, LoggingContext.getLoggingContextData()));
// Reduce the stream of futures to a future for a stream
// and convert the stream to a list
CompletableFuture<List<CapabilityDTO>> capabilitiesFuture = reduce(futures)
    .thenApply(Stream::toList);
responseDTO.setDTOS(capabilitiesFuture.get());
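The reduce helper itself is not shown in the answer. The following is only a minimal sketch of what such a helper might look like, assuming it is built on CompletableFuture.allOf. The class name FutureStreams and the sample ids in main are hypothetical and are not taken from the linked answer.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class FutureStreams {

    // Hypothetical stand-in for the reduce helper mentioned above:
    // turns a stream of futures into a single future of a stream of results.
    public static <T> CompletableFuture<Stream<T>> reduce(Stream<CompletableFuture<T>> futures) {
        // Collecting first runs every lazy map(...) call, so all futures
        // are started before anything waits on them.
        List<CompletableFuture<T>> started = futures.collect(Collectors.toList());

        CompletableFuture<Void> allDone =
                CompletableFuture.allOf(started.toArray(new CompletableFuture<?>[0]));

        // Once allDone has completed, join() on each future returns immediately.
        return allDone.thenApply(ignored -> started.stream().map(CompletableFuture::join));
    }

    public static void main(String[] args) {
        // supplyAsync stands in for executeGetCapability in this sketch.
        Stream<CompletableFuture<String>> futures = Stream.of("a", "b", "c")
                .map(id -> CompletableFuture.supplyAsync(() -> "capability-" + id));

        List<String> results = reduce(futures)
                .thenApply(s -> s.collect(Collectors.toList()))
                .join();

        System.out.println(results); // prints [capability-a, capability-b, capability-c]
    }
}
```

The important difference from the loop in the question is that no get() or join() is called until every future has been created, so the calls can overlap instead of running one after another. The results keep the order of the input ids.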

Is there an explanation for this threading code?

I have come across some code very similar to this, and I am wondering if someone can explain it to me.
Note how it uses an Rx scheduler, then Parallel.ForEach, and inside that Task.Factory.StartNew:
IDisposable subscription = someObservable.ObserveOn(ThreadPoolScheduler.Instance)
    .Subscribe(o =>
        {
            Parallel.ForEach(xxxs,
                x =>
                {
                    var theKey = x.Key;
                    if (!theTasks.ContainsKey(theKey) ||
                        theTasks.ContainsKey(theKey) && theTasks[theKey].IsCompleted)
                    {
                        theTasks[theKey] = Task.Factory.StartNew(
                            () =>
                            {
                                try
                                {
                                    .....
                                }
                                catch (CommunicationObjectAbortedException ex)
                                {
                                    ....
                                }
                                catch (ObjectDisposedException ex)
                                {
                                    ....
                                }
                                catch (Exception e)
                                {
                                    ....
                                }
                            });
                    }
                });
        },
        ex =>
        {
            ....
        },
        () =>
        {
            ....
        });
}
I know what all these things do individually, but I am not really sure what the combined threading effect is here. Can anyone hazard a guess?
Ah yes, the concurrency Turducken.
ThreadPoolScheduler schedules work on the thread pool, which is distinct from the task pool. ThreadPoolScheduler was meant to be used on platforms where a task pool was not available; prefer TaskPoolScheduler when possible.
It feels like the writer was trying to save the task pool for only the task at hand (pardon the pun) by using the thread pool.
Parallel.ForEach blocks until the loop has completed. So if a new item is emitted while a loop is still running on a thread-pool thread, the next ForEach is run on another thread borrowed from the thread pool.
As for the inner bit, the writer wants one Task to run per unique key, and only if one isn't already running for that key.

Plugin BLE (v1.3.0) delay after characteristic.ReadAsync()

I'm developing an app and I want to read some characteristics one after another.
My issue is that after a read completes I must add a delay, otherwise I get an error.
Why is the delay needed? Is there a correct way to run read tasks one after another?
I'm using Xamarin.Forms and the Ble v1.3.0 plugin.
I've tried "await Task.Delay(200)" between two consecutive ReadAsync() calls and it works fine, but if I remove the delay, the second ReadAsync throws an exception.
private async Task<byte[]> ReadChr(ICharacteristic LocalCharacteristic)
{
    if (!LocalCharacteristic.CanRead)
    {
        // Nothing to read; return explicitly so every code path returns a value.
        return null;
    }
    try
    {
        return await LocalCharacteristic.ReadAsync();
    }
    catch (Exception ex)
    {
        Debug.WriteLine(ex.Message);
        return null;
    }
}
if (firstCharacteristic.CanRead)
{
    var ccc = await ReadChr(firstCharacteristic);
}
await Task.Delay(200);
if (secondCharacteristic.CanRead)
{
    var ddd = await ReadChr(secondCharacteristic);
}
I'm looking for something like polling the status of the characteristic's read operation. Adding a delay after ReadAsync does not seem like good coding practice.
Any ideas?

Best practice for long running SQL queries in ASP.Net MVC

I have an action method which needs to complete 15 to 52 long-running SQL queries (all of them are similar, and each takes more than 5 seconds to complete) according to user-selected dates.
After doing a lot of research, it seems the best way to do this without blocking the ASP.NET thread is to use async/await task methods with the SQL queries:
[HttpPost]
public async Task<JsonResult> Action() {
    // initialization stuff
    // create tasks to run async SQL queries
    ConcurrentBag<Tuple<DateTime, List<long>>> weeklyObsIdBag =
        new ConcurrentBag<Tuple<DateTime, List<long>>>();
    Task[] taskList = new Task[reportDates.Count()];
    int idx = 0;
    foreach (var reportDate in reportDates) { //15 <= reportDates.Count() <= 52
        var task = Task.Run(async () => {
            using (var sioDbContext = new SioDbContext()) {
                var historyEntryQueryable = sioDbContext.HistoryEntries
                    .AsNoTracking()
                    .AsQueryable<HistoryEntry>();
                var obsIdList = await getObsIdListAsync(
                    historyEntryQueryable,
                    reportDate
                );
                weeklyObsIdBag.Add(new Tuple<DateTime,List<long>>(reportDate, obsIdList));
            }
        });
        taskList[idx++] = task;
    }
    // wait for all the tasks to complete
    await Task.WhenAll(taskList);
    // consume the results from the long-running SQL queries,
    // which are stored in weeklyObsIdBag
}
private async Task<List<long>> getObsIdListAsync(
    IQueryable<HistoryEntry> historyEntryQueryable,
    DateTime reportDate
) {
    //apply reportDate condition to historyEntryQueryable
    //run async query
    List<long> obsIdList = await historyEntryQueryable.Select(he => he.ObjectId)
        .Distinct()
        .ToListAsync()
        .ConfigureAwait(false);
    return obsIdList;
}
After making this change, the time taken to complete this action is greatly reduced, since I am now able to execute multiple (15 to 52) async SQL queries simultaneously and wait for them all to complete rather than running them sequentially. However, users have started to experience a lot of timeout issues, such as:
(from Elmah error log)
"Timeout expired. The timeout period elapsed prior to obtaining a connection from the pool.
This may have occurred because all pooled connections were in use and max pool size was
reached."
"The wait operation timed out"
Is this caused by thread starvation? I have a feeling that I might be using too many threads from the thread pool to achieve what I want, but I thought it shouldn't be a problem because I used async/await to prevent the threads from being blocked.
If this approach won't work, what is the best practice for executing multiple long-running SQL queries?
Consider limiting the number of concurrent tasks being executed, for example:
int concurrentTasksLimit = 5;
List<Task> taskList = new List<Task>();
foreach (var reportDate in reportDates) { //15 <= reportDates.Count() <= 52
    var task = Task.Run(async () => {
        using (var sioDbContext = new SioDbContext()) {
            var historyEntryQueryable = sioDbContext.HistoryEntries
                .AsNoTracking()
                .AsQueryable<HistoryEntry>();
            var obsIdList = await getObsIdListAsync(
                historyEntryQueryable,
                reportDate
            );
            weeklyObsIdBag.Add(new Tuple<DateTime,List<long>>(reportDate, obsIdList));
        }
    });
    taskList.Add(task);
    if (concurrentTasksLimit == taskList.Count)
    {
        await Task.WhenAll(taskList);
        // before clearing the list, you should get the results and store them in memory (e.g. in another list) for later use...
        taskList.Clear();
    }
}
// wait for all the remaining tasks to complete
if (taskList.Any())
    await Task.WhenAll(taskList);
Note that I changed your taskList to an actual List<Task>. It is easier to work with here, since we need to add tasks to the list and clear it.
Also, you should collect the results before clearing the taskList, since you are going to use them later.

AsDocumentQuery.HasMoreResults parallelism?

I have not yet faced a situation where a query for documents returns more than 1 "set" of results. I was wondering what would happen if, instead of
while (queryable.HasMoreResults)
{
    foreach(Book b in await queryable.ExecuteNextAsync<Book>())
    {
        // Iterate through books
    }
}
I use
ConcurrentBag<IPost> result = new ConcurrentBag<IPost>();
List<Task> tasks = new List<Task>();
var outQ = query.AsDocumentQuery<IPost>();
while (outQ.HasMoreResults)
{
    var parcialResult = outQ.ExecuteNextAsync<IPost>().ContinueWith((t) =>
    {
        foreach (var item in t.Result)
        {
            result.Add(item);
        }
    });
    tasks.Add(parcialResult);
}
return Task.WhenAll(tasks).ContinueWith((r) => { return result.AsEnumerable(); });
I'm under the impression that the second approach, being parallel, would perform better, since the partial queries would be executed concurrently. But I'm afraid that while (outQ.HasMoreResults) won't become false until all the async operations have finished.
Your concern about HasMoreResults not updating quickly enough is well-founded. You cannot dispatch an async operation and assume the object will immediately transition to the state it is expected to have once the async operation ends. In this particular case you have no choice but to wait for each async operation to complete.
I assume that the library authors deliberately chose an API that does not lend itself to parallelisation, most likely because it encapsulates the kind of work which needs to be done serially.

Resources