AsDocumentQuery.HasMoreResults parallelism? - asynchronous

I have not yet faced a situation where a query for documents has more than 1 "set". I was wondering, what would happen, if instead of
while (queryable.HasMoreResults)
{
    foreach(Book b in await queryable.ExecuteNextAsync<Book>())
    {
        // Iterate through books
    }
}
I use
ConcurrentBag<IPost> result = new ConcurrentBag<IPost>();
List<Task> tasks = new List<Task>();
var outQ = query.AsDocumentQuery<IPost>();
while (outQ.HasMoreResults)
{
    var parcialResult = outQ.ExecuteNextAsync<IPost>().ContinueWith((t) =>
    {
        foreach (var item in t.Result)
        {
            result.Add(item);
        }
    });
    tasks.Add(parcialResult);
}
return Task.WhenAll(tasks).ContinueWith((r) => { return result.AsEnumerable(); });
I'm under the impression that the second approach, being a parallel one, would yield better performance, since the partial queries would be executed concurrently. But I'm afraid that while (outQ.HasMoreResults) won't become false until all the async operations have finished...

Your concern regarding HasMoreResults not updating quickly enough is well-founded. You cannot dispatch an async operation and assume the object will immediately transition to the state expected of it at the end of the async operation. In this particular case you have no choice but to wait for the async operation to complete.
I assume that the library authors deliberately chose the API which does not lend itself to parallelisation - most likely because it encapsulates the kind of work which needs to be done serially.
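To illustrate the point above, here is a minimal sketch of the sequential pattern. FakePagedQuery<T> is a hypothetical in-memory stand-in invented for this example (it is not the DocumentDB SDK type); it only mimics the HasMoreResults / ExecuteNextAsync shape, on the assumption that the flag changes only once a page fetch has completed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical stand-in for a paged query; not the real SDK type.
public sealed class FakePagedQuery<T>
{
    private readonly Queue<List<T>> _pages;

    public FakePagedQuery(IEnumerable<IEnumerable<T>> pages)
    {
        _pages = new Queue<List<T>>(pages.Select(p => p.ToList()));
    }

    // Only changes after a fetch has actually completed.
    public bool HasMoreResults => _pages.Count > 0;

    public async Task<IReadOnlyList<T>> ExecuteNextAsync()
    {
        await Task.Delay(10); // simulate the network round trip
        return _pages.Dequeue();
    }
}

public static class PagedQueryExtensions
{
    // Drains every page in order, awaiting each fetch before reading the flag again.
    public static async Task<List<T>> ReadAllAsync<T>(this FakePagedQuery<T> query)
    {
        var results = new List<T>();
        while (query.HasMoreResults)
        {
            results.AddRange(await query.ExecuteNextAsync());
        }
        return results;
    }
}
```

With a query object shaped like this, a loop that dispatches ExecuteNextAsync without awaiting it would keep seeing HasMoreResults as true and start far more fetches than there are pages, which is the behaviour the question worries about.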

Related

how to work with XREAD .NET Redis to read actual changes

When I set a breakpoint on the line with XREAD, the program does nothing. Maybe I need to configure this XREAD command?
public async void ListenTask()
{
    var readTask = Task.Run(async () =>
    {
        while (!Token.IsCancellationRequested)
        {
            var result = db.StreamRead(streamName, "$", 1);
            if (result.Any())
            {
                var dict = ParseResult(result.Last());
                var sb = new StringBuilder();
                foreach (var key in dict.Keys)
                {
                    sb.Append(dict[key]);
                }
                Console.WriteLine(sb.ToString());
            }
            await Task.Delay(1000);
        }
    });
}
It's an illegal operation on StackExchange.Redis
Because of its unique multiplexed architecture, StackExchange.Redis (the library you appear to be using) does not support blocking XREAD operations. All the commands going over the interactive interface (basically everything that is not pub/sub) use the same connection. If you block that connection, everything else in your app that depends on the multiplexer will be backed up waiting for the block to complete. The StackExchange.Redis library actually goes so far as to consider the $ id an illegal id, since its only purpose is to block. What's most likely happening (you don't see it because the exception is stored on the task returned by Task.Run, which is never awaited) is that var result = db.StreamRead(streamName, "$", 1); is throwing an InvalidOperationException: System.InvalidOperationException: StreamPosition.NewMessages cannot be used with StreamRead.
Workarounds
There are two potential workarounds in this case. First, you can poll with the XRANGE command rather than using blocking reads.
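var readTask = Task.Run(async () =>
{
    var lastId = "-"; // "-" is the smallest possible stream id
    while (!token.IsCancellationRequested)
    {
        // XRANGE bounds are inclusive, so the entry at lastId is returned again on the next poll
        var result = await db.StreamRangeAsync("a-stream", lastId, "+");
        if (result.Any())
        {
            lastId = result.Last().Id;
        }
        await Task.Delay(1000);
    }
});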
You're already effectively doing a thread sleep so this polling operation is probably good enough for what you're looking for.
If you really need to do blocking operations, you'll have to use a different library. If you try to do them with StackExchange.Redis (you can possibly force the issue with the Execute/ExecuteAsync commands), you can seriously degrade its performance.
Articles on doing so with ServiceStack.Redis and CsRedis are available on the Redis developer site (I'm the author of them).
One final thing
You probably want to make sure that you are being as async as possible when issuing these commands. You're using the sync XREAD call in an async context; nearly every command in StackExchange.Redis has both a sync and an async version, so use the async one when possible.

Using SemaphoreSlim with Parallel.ForEach

This is what I am trying to achieve. Let's say I have a process which runs every minute and performs some I/O operations. I want 5 threads to execute simultaneously and do the operations. Suppose if 2 threads took longer than a minute and when the process runs again after a minute, it should execute 3 threads simultaneously as 2 threads are already doing some operations.
So, I used a combination of SemaphoreSlim and Parallel.ForEach. Please let me know if this is the correct way of achieving this or if there is a better way.
private static SemaphoreSlim _semaphoreSlim = new SemaphoreSlim(5);
private async Task ExecuteAsync()
{
    try
    {
        var availableThreads = _semaphoreSlim.CurrentCount;
        if (availableThreads > 0)
        {
            var lists = await _feedSourceService.GetListAsync(availableThreads); // select #top(availableThreads) * from table
            Parallel.ForEach(
                lists,
                new ParallelOptions
                {
                    MaxDegreeOfParallelism = availableThreads
                },
                async item =>
                {
                    await _semaphoreSlim.WaitAsync();
                    try
                    {
                        // I/O operations
                    }
                    finally
                    {
                        _semaphoreSlim.Release();
                    }
                });
        }
    }
    catch (Exception ex)
    {
        _logger.LogError(ex.Message, ex);
    }
}
"Let's say I have a process which runs every minute and performs some I/O operations... Suppose if 2 threads took longer than a minute and when the process runs again after a minute, it should execute 3 threads simultaneously as 2 threads are already doing some operations."
This kind of problem description is somewhat common, but is surprisingly difficult to code correctly. This is because you have a polling-style timer (time based) that is trying to periodically adjust a throttling mechanism. Doing this correctly is quite difficult.
So, the first thing I'd recommend is to change the problem description. Consider having the polling mechanism read all outstanding work, and then use normal throttling from there (e.g., adding them to an execution-constrained ActionBlock).
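A minimal sketch of that recommendation, assuming the work items are plain integers and the I/O is simulated with Task.Delay (both are placeholders, as are the names ThrottledProcessor and ProcessAllAsync). ActionBlock comes from the TPL Dataflow library (System.Threading.Tasks.Dataflow), which on .NET Framework is installed as a NuGet package.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class ThrottledProcessor
{
    // Processes every item with at most maxConcurrency handlers in flight.
    // Returns the highest concurrency actually observed (for illustration only).
    public static async Task<int> ProcessAllAsync(
        IEnumerable<int> workItems, int maxConcurrency)
    {
        int running = 0, peak = 0;
        var gate = new object();

        var block = new ActionBlock<int>(async item =>
        {
            lock (gate) { running++; peak = Math.Max(peak, running); }
            try
            {
                await Task.Delay(50); // placeholder for the real I/O operation
            }
            finally
            {
                lock (gate) { running--; }
            }
        },
        new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = maxConcurrency });

        foreach (var item in workItems)
        {
            block.Post(item); // a poller could keep posting on every tick
        }

        block.Complete();       // signal that no more work will be posted
        await block.Completion; // wait for the queued items to finish
        return peak;
    }
}
```

In a long-running service the block would probably be created once and kept alive, with the timer only posting new work; Complete and Completion are used here just so the sketch terminates.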
That said, if you'd prefer continuing down the more complex path, code like this would avoid the Parallel with async problem:
private static SemaphoreSlim _semaphoreSlim = new SemaphoreSlim(5);
private async Task ExecuteAsync()
{
    try
    {
        var availableThreads = _semaphoreSlim.CurrentCount;
        if (availableThreads > 0)
        {
            var lists = await _feedSourceService.GetListAsync(availableThreads); // select #top(availableThreads) * from table
            var tasks = lists.Select(
                async item =>
                {
                    await _semaphoreSlim.WaitAsync();
                    try
                    {
                        // I/O operations
                    }
                    finally
                    {
                        _semaphoreSlim.Release();
                    }
                }).ToList();
            await Task.WhenAll(tasks);
        }
    }
    catch (Exception ex)
    {
        _logger.LogError(ex.Message, ex);
    }
}

Best practice for long running SQL queries in ASP.Net MVC

I have an action method which needs to complete 15~52 long running SQL queries (all of them are similar, each takes more than 5 seconds to complete) according to user-selected dates.
After doing a lot of research, it seems the best way to do this without blocking the ASP.NET thread is to use async/await task methods with the SQL queries:
[HttpPost]
public async Task<JsonResult> Action() {
    // initialization stuff
    // create tasks to run async SQL queries
    ConcurrentBag<Tuple<DateTime, List<long>>> weeklyObsIdBag =
        new ConcurrentBag<Tuple<DateTime, List<long>>>();
    Task[] taskList = new Task[reportDates.Count()];
    int idx = 0;
    foreach (var reportDate in reportDates) { //15 <= reportDates.Count() <= 52
        var task = Task.Run(async () => {
            using (var sioDbContext = new SioDbContext()) {
                var historyEntryQueryable = sioDbContext.HistoryEntries
                    .AsNoTracking()
                    .AsQueryable<HistoryEntry>();
                var obsIdList = await getObsIdListAsync(
                    historyEntryQueryable,
                    reportDate
                );
                weeklyObsIdBag.Add(new Tuple<DateTime,List<long>>(reportDate, obsIdList));
            }
        });
        taskList[idx++] = task;
    }
    //await for all the tasks to complete
    await Task.WhenAll(taskList);
    // consume the results from long running SQL queries,
    // which is stored in weeklyObsIdBag
}
private async Task<List<long>> getObsIdListAsync(
    IQueryable<HistoryEntry> historyEntryQueryable,
    DateTime reportDate
) {
    //apply reportDate condition to historyEntryQueryable
    //run async query
    List<long> obsIdList = await historyEntryQueryable.Select(he => he.ObjectId)
        .Distinct()
        .ToListAsync()
        .ConfigureAwait(false);
    return obsIdList;
}
After making this change, the time taken to complete this action is greatly reduced, since I am now able to execute multiple (15~52) async SQL queries simultaneously and await their completion rather than running them sequentially. However, users have started to experience a lot of timeout issues, such as:
(from Elmah error log)
"Timeout expired. The timeout period elapsed prior to obtaining a connection from the pool.
This may have occurred because all pooled connections were in use and max pool size was
reached."
"The wait operation timed out"
Is it caused by thread starvation? I got a feeling that I might be using too many threads from thread pool to achieve what I want, but I thought it shouldn't be a problem because I used async/await to prevent all the threads from being blocked.
If things won't work this way, then what's the best practice to execute multiple long running SQL queries?
Consider limiting the number of concurrent tasks being executed, for example:
int concurrentTasksLimit = 5;
List<Task> taskList = new List<Task>();
foreach (var reportDate in reportDates) { //15 <= reportDates.Count() <= 52
    var task = Task.Run(async () => {
        using (var sioDbContext = new SioDbContext()) {
            var historyEntryQueryable = sioDbContext.HistoryEntries
                .AsNoTracking()
                .AsQueryable<HistoryEntry>();
            var obsIdList = await getObsIdListAsync(
                historyEntryQueryable,
                reportDate
            );
            weeklyObsIdBag.Add(new Tuple<DateTime,List<long>>(reportDate, obsIdList));
        }
    });
    taskList.Add(task);
    if (concurrentTasksLimit == taskList.Count)
    {
        await Task.WhenAll(taskList);
        // before clearing the list, you should get the results and store in memory (e.g another list) for later usage...
        taskList.Clear();
    }
}
//await for all the remaining tasks to complete
if (taskList.Any())
    await Task.WhenAll(taskList);
Note that I changed your taskList to an actual List<Task>; it is easier to use here, since we need to add tasks to the list and clear it.
Also, you should get the results before clearing the taskList, since you are going to use them later.
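The batching above waits for the slowest query in each group of five before starting the next group. As an alternative sketch (a different technique from the code above, not a drop-in replacement), a SemaphoreSlim can keep up to N operations in flight continuously and hand the results back in input order. Throttle.RunAsync and its parameters are names made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class Throttle
{
    // Runs work(item) for every item, with at most maxConcurrency calls in flight.
    // Results are returned in the same order as the input sequence.
    public static async Task<TResult[]> RunAsync<TSource, TResult>(
        IEnumerable<TSource> source,
        int maxConcurrency,
        Func<TSource, Task<TResult>> work)
    {
        using (var gate = new SemaphoreSlim(maxConcurrency))
        {
            var tasks = source.Select(async item =>
            {
                await gate.WaitAsync(); // wait for a free slot
                try
                {
                    return await work(item);
                }
                finally
                {
                    gate.Release(); // free the slot even if work throws
                }
            }).ToList(); // ToList starts every task exactly once

            return await Task.WhenAll(tasks);
        }
    }
}
```

In the action method this might be called as await Throttle.RunAsync(reportDates, 5, d => QueryAsync(d)), where QueryAsync is a hypothetical method that opens its own context and runs one query. Each in-flight query holds a pooled connection, so the limit bounds how many connections a single request can take from the pool.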

async / await: If it doesn't use threads, what is it doing to run processing at the same time?

I am doing a little research to understand async / await of C# better.
I found a web site that has the following code to show how much slower synchronous processing is vs async / await:
public IActionResult Index()
{
    Stopwatch watch = new Stopwatch();
    watch.Start();
    ContentManagement service = new ContentManagement();
    var content = service.GetContent();
    var count = service.GetCount();
    var name = service.GetName();
    watch.Stop();
    ViewBag.WatchMilliseconds = watch.ElapsedMilliseconds;
    return View();
}
[HttpGet]
public async Task<ActionResult> IndexAsync()
{
    Stopwatch watch = new Stopwatch();
    watch.Start();
    ContentManagement service = new ContentManagement();
    var contentTask = service.GetContentAsync();
    var countTask = service.GetCountAsync();
    var nameTask = service.GetNameAsync();
    var content = await contentTask;
    var count = await countTask;
    var name = await nameTask;
    watch.Stop();
    ViewBag.WatchMilliseconds = watch.ElapsedMilliseconds;
    return View("Index");
}
public class ContentManagement
{
    public string GetContent()
    {
        Thread.Sleep(2000);
        return "content";
    }
    public int GetCount()
    {
        Thread.Sleep(5000);
        return 4;
    }
    public string GetName()
    {
        Thread.Sleep(3000);
        return "Matthew";
    }
    public async Task<string> GetContentAsync()
    {
        await Task.Delay(2000);
        return "content";
    }
    public async Task<int> GetCountAsync()
    {
        await Task.Delay(5000);
        return 4;
    }
    public async Task<string> GetNameAsync()
    {
        await Task.Delay(3000);
        return "Matthew";
    }
}
I understand the above code at a high level and why it performs faster.
What I don't understand is if threads are not being used, how is the processing running at the same time?
I have read in a couple of places that async / await does not create new threads to do the processing. So, what is async / await doing to allow processing to happen at the same time? The three await Task.Delay are running in parallel, correct? If it is not creating 3 threads, what is it doing?
I just want to understand what is happening at a high level.
Let me know.
Thanks in advance.
"if threads are not being used, how is the processing running at the same time?"
Threads let you parallelize computations on the same system. When communications or other I/O are involved, there is a different system with which your code communicates. When you initiate the task, the other system starts doing work. This happens in parallel to your system, which is free to do whatever else it needs to do until you await the task.
"The three await Task.Delay are running in parallel, correct?"
They are not exactly running; they are sleeping in parallel. Sleeping takes very few resources. That's why they appear to be "running" in parallel.
"What I don't understand is if threads are not being used, how is the processing running at the same time?"
You can think of it as an event firing when the operation is complete, as opposed to a thread being blocked until the operation is complete.
"I have read in a couple of places that async / await does not create new threads to do the processing."
async and await do not; that is true. For more about how async and await work, see my intro post.
"So, what is async / await doing to allow processing to happen at the same time?"
One of the primary use cases of async/await is for I/O-based code. I have a long blog post that goes into the details of how asynchronous I/O does not require threads.
"The three await Task.Delay are running in parallel, correct?"
I prefer to use the term "concurrently", just to avoid confusion with Parallel and Parallel LINQ, both of which were created for CPU-bound parallelism and do not work as generally expected with async/await. So, I would say that both parallelism and asynchrony are forms of concurrency, and this is an example of asynchronous concurrency.
(That said, using the term "parallel" is certainly in concord with the common usage of the term).
"If it is not creating 3 threads, what is it doing?"
Task.Delay is not an I/O-based operation, but it is very similar to one. It uses timers underneath, so it's completely different than Thread.Sleep.
Thread.Sleep will block a thread - I believe it does go all the way to an OS Sleep call, which causes the OS to place the thread in a wait state until its sleep time is expired.
Task.Delay acts more like an I/O operation. So, it sets up a timer that fires off an event when the time expires. Timers are managed by the OS itself - as time proceeds forward (clock ticks on the CPU), the OS will notify the timer when it has completed. It's a bit more complex than that (for efficiency, .NET will coalesce managed timers), but that's the general idea.
So, the point is that there is no dedicated thread for each Task.Delay that is blocked.
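The effect can be reproduced outside MVC with a small sketch. DelayDemo and After are hypothetical names, and the delays are shortened to 200, 500 and 300 ms; the total should come out near the longest delay rather than the sum, though exact timings vary by machine.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

public static class DelayDemo
{
    public static async Task<(string Content, int Count, string Name, long ElapsedMs)> RunAsync()
    {
        var watch = Stopwatch.StartNew();

        // Starting the tasks only registers three timers; no thread waits on them.
        Task<string> contentTask = After(200, "content");
        Task<int> countTask = After(500, 4);
        Task<string> nameTask = After(300, "Matthew");

        // The method is suspended at each await until the corresponding timer fires.
        string content = await contentTask;
        int count = await countTask;
        string name = await nameTask;

        watch.Stop();
        return (content, count, name, watch.ElapsedMilliseconds);
    }

    // Completes with the given value after roughly the given number of milliseconds.
    private static async Task<T> After<T>(int milliseconds, T value)
    {
        await Task.Delay(milliseconds);
        return value;
    }
}
```

Replacing Task.Delay with Thread.Sleep inside After would make each call block its caller, so the three waits would run one after another and the elapsed time would approach the sum of the delays.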

Parallel httprequest in UWP app

I'm creating an app that needs to make parallel HTTP requests, and I'm using HttpClient for this.
I'm looping over the URLs, and for each URL I start a new Task to do the request.
After the loop I wait until every task finishes.
However, when I check the calls being made with Fiddler, I see that the requests are being made synchronously. It's not like a bunch of requests are being made at once, but one by one.
I've searched for a solution and found that other people have experienced this too, but not with UWP. The solution was to increase the DefaultConnectionLimit on the ServicePointManager.
The problem is that ServicePointManager does not exist for UWP. I've looked in the APIs and I thought I could set the DefaultConnectionLimit on HttpClientHandler, but no.
So I have a few questions:
Is DefaultConnectionLimit still a property that can be set somewhere?
If so, where do I set it?
If not, how do I increase the connection limit?
Is there still a connection limit in UWP?
This is my code:
var requests = new List<Task>();
var client = GetHttpClient();
foreach (var show in shows)
{
    requests.Add(Task.Factory.StartNew((x) =>
    {
        ((Show)x).NextEpisode = GetEpisodeAsync(((Show)x).NextEpisodeUri, client).Result;
    }, show));
}
await Task.WhenAll(requests.ToArray());
and this is the request:
public async Task<Episode> GetEpisodeAsync(string nextEpisodeUri, HttpClient client)
{
    try
    {
        if (String.IsNullOrWhiteSpace(nextEpisodeUri)) return null;
        HttpResponseMessage content = await client.GetAsync(nextEpisodeUri);
        if (content.IsSuccessStatusCode)
        {
            return JsonConvert.DeserializeObject<EpisodeWrapper>(await content.Content.ReadAsStringAsync()).Episode;
        }
    }
    catch (Exception ex)
    {
        Debug.WriteLine(ex.Message);
    }
    return null;
}
OK, I have the solution. I do need to use async/await inside the task. The problem was that I was using StartNew instead of Run, but I have to use StartNew because I'm passing along a state.
With StartNew, the task inside the task is not awaited unless you call Unwrap: Task.Factory.StartNew(.....).Unwrap(). This way Task.WhenAll() will wait until the inner task is complete.
When you are using Task.Run() you don't have to do this.
Task.Run vs Task.StartNew
The stackoverflow answer
var requests = new List<Task>();
var client = GetHttpClient();
foreach (var show in shows)
{
    requests.Add(Task.Factory.StartNew(async (x) =>
    {
        ((Show)x).NextEpisode = await GetEpisodeAsync(((Show)x).NextEpisodeUri, client);
    }, show)
    .Unwrap());
}
await Task.WhenAll(requests);
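The difference is easy to observe in isolation. In this sketch (UnwrapDemo is a made-up name), the task returned by StartNew completes as soon as the async delegate reaches its first await, while the unwrapped task completes only when the delegate has really finished.

```csharp
using System;
using System.Threading.Tasks;

public static class UnwrapDemo
{
    public static async Task<(bool OuterCompletedEarly, bool InnerFinished)> RunAsync()
    {
        bool innerFinished = false;

        // StartNew does not understand async delegates: the result is a Task<Task>.
        Task<Task> outer = Task.Factory.StartNew(async () =>
        {
            await Task.Delay(200);
            innerFinished = true;
        });

        // The outer task completes when the delegate returns at its first await,
        // long before the 200 ms delay has elapsed.
        await outer;
        bool outerCompletedEarly = !innerFinished;

        // Unwrap yields a task that represents the inner asynchronous work.
        await outer.Unwrap();
        return (outerCompletedEarly, innerFinished);
    }
}
```

Task.Run performs this unwrapping itself for delegates that return a Task, which is why awaiting its result waits for the inner work.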
I think an easier way to solve this is to not start the requests "manually", but instead to use LINQ with an async delegate to query the episodes and then set them afterwards.
You basically make it a two-step process:
Get all next episodes
Set them in the foreach
This also has the benefit of decoupling your querying code from the side effect of setting the show.
var shows = Enumerable.Range(0, 10).Select(x => new Show());
var client = new HttpClient();
(Show, Episode)[] nextEpisodes = await Task.WhenAll(shows
    .Select(async show =>
        (show, await GetEpisodeAsync(show.NextEpisodeUri, client))));
foreach ((Show Show, Episode Episode) tuple in nextEpisodes)
{
    tuple.Show.NextEpisode = tuple.Episode;
}
Note that I am using the new tuple syntax of C# 7. Change to the old tuple syntax accordingly if it is not available.
