I have been trying to find a way to make this task more efficient. I am consuming a REST based web service and need to update information for over 2500 clients.
I am using fiddler to watch the requests, and I'm also updating a table with an update time when its complete. I'm getting about 1 response per second. Are my expectations to high? I'm not even sure what I would define as 'fast' in this context.
I am handling everything in my controller and have tried running multiple web requests in parallel based on examples around the place but it doesn't seem to make a difference. To be honest I don't understand it well enough and was just trying to get it to build. I suspect it is still waiting for each request to complete before firing again.
I have also increased connections in my web config file as per another suggestion with no success:
<system.net>
<connectionManagement>
<add address="*" maxconnection="20" />
</connectionManagement>
</system.net>
My Controllers action method looks like this:
public async Task<ActionResult> UpdateMattersAsync()
{
//Only get matters we haven't synced yet
List<MatterClientRepair> repairList = Data.Get.AllUnsyncedMatterClientRepairs(true);
//Take the next 500
List<MatterClientRepair> subRepairList = repairList.Take(500).ToList();
FinalisedMatterViewModel vm = new FinalisedMatterViewModel();
using (ApplicationDbContext db = new ApplicationDbContext())
{
int jobCount = 0;
foreach (var job in subRepairList)
{
// If not yet synced - it shouldn't ever be!!
if (!job.Synced)
{
jobCount++;
// set up some Authentication fields
var oauth = new OAuth.Manager();
oauth["access_token"] = Session["AccessToken"].ToString();
string uri = "https://app.com/api/v2/matters/" + job.Matter;
// prepare the json object for the body
MatterClientJob jsonBody = new MatterClientJob();
jsonBody.matter = new MatterForUpload();
jsonBody.matter.client_id = job.NewClient;
string jsonString = jsonBody.ToJSON();
// Send it off. It returns the whole object we updated - we don't actually do anything with it
Matter result = await oauth.Update<Matter>(uri, oauth["access_token"], "PUT", jsonString);
// update our entities
var updateJob = db.MatterClientRepairs.Find(job.ID);
updateJob.Synced = true;
updateJob.Update_Time = DateTime.Now;
db.Entry(updateJob).State = System.Data.Entity.EntityState.Modified;
if (jobCount % 50 == 0)
{
// save every 50 changes
db.SaveChanges();
}
}
}
// if there are remaining files to save
if (jobCount % 50 != 0)
{
db.SaveChanges();
}
return View("FinalisedMatters", Data.Get.AllMatterClientRepairs());
}
}
And of course the Update method itself which handles the Web requesting:
public async Task<T> Update<T>(string uri, string token, string method, string json)
{
var authzHeader = GenerateAuthzHeader(uri, method);
// prepare the token request
var request = (HttpWebRequest)WebRequest.Create(uri);
request.Headers.Add("Authorization", authzHeader);
request.Method = method;
request.ContentType = "application/json";
request.Accept = "application/json, text/javascript";
byte[] bytes = System.Text.Encoding.ASCII.GetBytes(json);
request.ContentLength = bytes.Length;
System.IO.Stream os = request.GetRequestStream();
os.Write(bytes, 0, bytes.Length);
os.Close();
WebResponse response = await request.GetResponseAsync();
using (var reader = new System.IO.StreamReader(response.GetResponseStream()))
{
return JsonConvert.DeserializeObject<T>(reader.ReadToEnd());
}
}
If it's not possible to do more than 1 request per second then I'm interested in looking at an Ajax solution so I can give the user some feedback while it is processing. In my current solution I cannot give the user feedback while the action method hasn't reached 'return' yet can I?
Okay it's taken me a few days (and a LOT of trial and error) but I've worked this out. Hopefully it can help others. I finally found my silver bullet. And it was probably the place I should have started:
MSDN: Consuming the Task-based Asynchronous Pattern
In the end this following line of code is what brought it all to light.
string [] pages = await Task.WhenAll(from url in urls select DownloadStringAsync(url));
I substituted a few things to make it work for a Put request as follows:
HttpResponseMessage[] results = await Task.WhenAll(from p in toUpload select client.PutAsync(p.uri, p.jsonContent));
'toUpload' is a List of MyClass:
public class MyClass
{
// the URI should be relative to the base pase
// (ie: /api/v2/matters/101)
public string uri { get; set; }
// a string in JSON format, being the body of the PUT request
public StringContent jsonContent { get; set; }
}
The key was to stop trying to put my PutAsync method inside a loop. My new line of code IS still blocking until ALL responses have come back, but that is what I wanted. Also, learning that I could use this LINQ style expression to create a Task List on the fly was immeasurably helpful. I won't post all the code (unless someone wants it) because it's not as nicely refactored as the original and I still need to check whether the response of each item was 200 OK before I record it as successfully saved in my database. So how much faster is it?
Results
I tested a sample of 50 web service calls from my local machine. (There is some saving of records to a SQL Database in Azure at the end).
Original Synchronous Code: 70.73 seconds
Asynchronous Code: 8.89 seconds
That's gone from 1.4146 requests per second down to a mind melting 0.1778 requests per second! (if you average it out)
Conclusion
My journey isn't over. I've just scratched the surface of asynchronous programming and am loving it. I need to now work out how to save only the results that have returned 200 OK. I can deserialize the HttpResponse which returns a JSON object (which has a unique ID I can look up etc.) OR I could use the Task.WhenAny method, and experiment with Interleaving.
Related
I'm running into a problem sending massive requests to a .NET Core web service. I'm using a SemaphoreSlim to limit the number of simultaneous requests. When I get a 10061 error (the web service has refused the connection), I want to dial back the number of simultaneous requests. My idea at the moment is to de-reference the SemaphoreSlim and create another:
await this.semaphoreSlim.WaitAsync().ConfigureAwait(false);
counter++;
Uri uri = new Uri($"{api}/{keyProperty}", UriKind.Relative);
string rowVersion = string.Empty;
try
{
HttpResponseMessage getResponse = await this.httpClient.GetAsync(uri).ConfigureAwait(false);
if (getResponse.IsSuccessStatusCode)
{
using (HttpContent httpContent = getResponse.Content)
{
JObject currentObject = JObject.Parse(await httpContent.ReadAsStringAsync().ConfigureAwait(false));
rowVersion = currentObject.Value<string>("rowVersion");
}
}
}
catch (HttpRequestException httpRequestException)
{
SocketException socketException = httpRequestException.InnerException as SocketException;
if (socketException != null && socketException.ErrorCode == PutHandler.ConnectionRefused)
{
this.semaphoreSlim = new SemaphoreSlim(counter * 90 / 100, counter * 90 / 100);
}
}
}
finally
{
this.semaphoreSlim.Release();
}
If I do this, what will happen to the other tasks that are waiting on the Semaphore that I just de-referenced? My guess is that nothing will happen until the object is garbage collected and disposed.
A SemaphoreSlim (just like any other object in .NET) will exist as long as there are references to it.
However, there is a bug in your code: the SemaphoreSlim being released is this.semaphoreSlim, and if this.semaphoreSlim is changed between being acquired and being released, then the code will release a different semaphore than the one that was acquired. To avoid this problem, copy this.semaphoreSlim into a local variable at the beginning of your method, and acquire and release that local variable.
More broadly, there's a difficult in the attempted solution. If you start 1000 tasks, they will all reference the old semaphore and ignore the updated this.sempahoreSlim. So you'd need a separate solution. For example, you could define a disposable "token" which is permission to call the API. Then have an asynchronous collection of these tokens (e.g., a Channel). This gives you full control over how many tokens are released at once.
This is actually a 2-part question related directly to .net core 3.0 and specifically with PipeWriter: 1) How should I read in the HttpResponse body? 2) How can I update the HttpResponse? I'm asking both questions because I feel like the solution will likely involve the same understanding and code.
Below is how I got this working in .net core 2.2 - note that this is using streams instead of PipeWriter and other "ugly" things associated with streams - eg. MemoryStream, Seek, StreamReader, etc.
public class MyMiddleware
{
private RequestDelegate Next { get; }
public MyMiddleware(RequestDelegate next) => Next = next;
public async Task Invoke(HttpContext context)
{
var httpResponse = context.Response;
var originalBody = httpResponse.Body;
var newBody = new MemoryStream();
httpResponse.Body = newBody;
try
{
await Next(context);
}
catch (Exception)
{
// In this scenario, I would log out the actual error and am returning this "nice" error
httpResponse.StatusCode = StatusCodes.Status500InternalServerError;
httpResponse.ContentType = "application/json"; // I'm setting this because I might have a serialized object instead of a plain string
httpResponse.Body = originalBody;
await httpResponse.WriteAsync("We're sorry, but something went wrong with your request.");
return;
}
// If everything worked
newBody.Seek(0, SeekOrigin.Begin);
var response = new StreamReader(newBody).ReadToEnd(); // This is the only way to read the existing response body
httpResponse.Body = originalBody;
await context.Response.WriteAsync(response);
}
}
How would this work using PipeWriter? Eg. it seems that working with pipes instead of the underlying stream is preferable, but I can not yet find any examples on how to use this to replace my above code?
Is there a scenario where I need to wait for the stream/pipe to finish writing before I can read it back out and/or replace it with a new string? I've never personally done this, but looking at examples of PipeReader seems to indicate to read things in chunks and check for IsComplete.
To Update HttpRepsonse is
private async Task WriteDataToResponseBodyAsync(PipeWriter writer, string jsonValue)
{
// use an oversized size guess
Memory<byte> workspace = writer.GetMemory();
// write the data to the workspace
int bytes = Encoding.ASCII.GetBytes(
jsonValue, workspace.Span);
// tell the pipe how much of the workspace
// we actually want to commit
writer.Advance(bytes);
// this is **not** the same as Stream.Flush!
await writer.FlushAsync();
}
I have a form with hundreds of check boxes and dropdown menus (Which value of many of them are coupled together). In the action there is updating mechanism to update an object in Session. This object does all validation and coupling of values, for example if user types %50 in one input filed, we might add 3 new SelectListItem to a dropdown.
Everything works fine, but if use starts to clicking on check boxes very quick (which is the normal case in our scenario), controller get multiple posts while it is processing previous ones. Fortunately we are only interested in the last POST, so we need a way to abort\cancel on going requests when newer request from same form comes.
What I tried:
1- blocking client side to make multiple posts when server still working on previous one. It is not desirable because it makes noticeable pauses on browser side.
2- There are several solutions for blocking multiple post backs by using HASH codes or AntiForgeryToken. But they don't what I need, I need to abort on-going thread in favor of new request, not blocking incoming request.
3- I tried to extend pipeline by adding two message handlers (one before action and another after executing action) to keep a hash code (or AntiForgeryToken) but problem is still there, even I can detect there is on-going thread working on same request, I have no way to abort that thread or set older request to Complete.
Any thoughts?
The only thing you can do is throttle the requests client-side. Basically, you need to set a timeout when a checkbox is clicked. You can let that initial request go through, but then any further requests are queued (or actually dropped after the first queued request in your scenario) and don't run that until the timeout clears.
There's no way to abort a request server-side. Each request is idempotent. There is no inherent knowledge of anything that's happened before or since. The server has multiple threads fielding requests and will simply process those as fast as it can. There's no order to how the requests are processed or how responses are sent out. The first request could be the third one that receives a response, simply due to how the processing of each request goes.
You are trying to implement transactional functionality (i.e. counting only the last request) over an asynchronous technology. This is a design flaw.
Since you refuse to block on the client side, you have no method by which to control which requests process first, OR to correctly process the outcome again on the client-side.
You might actually run into this scenario:
Client sends Request A
Server starts processing Request B
Client sends Request B
Server starts processing Request B
Server returns results of Request B, and client changes accordingly
Server returns results of Request A, and client changes accordingly (and undoes prior changes resulting from Request B)
Blocking is the only way you can ensure the correct order.
Thanks for your help #xavier-j.
After playing around this, I wrote this. Hope it be useful for someone who needs same thing:
First you need add this ActionFilter
public class KeepLastRequestAttribute : ActionFilterAttribute
{
public string HashCode { get; set; }
public override void OnActionExecuting(ActionExecutingContext filterContext)
{
base.OnActionExecuting(filterContext);
Dictionary<string, CancellationTokenSource> clt;
if (filterContext.HttpContext.Application["CancellationTokensDictionary"] != null)
{
clt = (Dictionary<string, CancellationTokenSource>)filterContext.HttpContext.Application["CancellationTokensDictionary"];
}
else
{
clt = new Dictionary<string, CancellationTokenSource>();
}
if (filterContext.HttpContext.Request.Form["__RequestVerificationToken"] != null)
{
HashCode = filterContext.HttpContext.Request.Form["__RequestVerificationToken"];
}
CancellationTokenSource oldCt = null;
clt.TryGetValue(HashCode, out oldCt);
CancellationTokenSource ct = new CancellationTokenSource();
if (oldCt != null)
{
oldCt.Cancel();
clt[HashCode] = ct;
}
else
{
clt.Add(HashCode, ct);
}
filterContext.HttpContext.Application["CancellationTokensDictionary"] = clt;
filterContext.Controller.ViewBag.CancellationToken = ct;
}
public override void OnResultExecuted(ResultExecutedContext filterContext)
{
base.OnResultExecuted(filterContext);
if (filterContext.Controller.ViewBag.ThreadHasBeenCanceld == null && filterContext.HttpContext.Application["CancellationTokensDictionary"] != null) {
lock (filterContext.HttpContext.Application["CancellationTokensDictionary"])
{
Dictionary<string, CancellationTokenSource> clt = (Dictionary<string, CancellationTokenSource>)filterContext.HttpContext.Application["CancellationTokensDictionary"];
clt.Remove(HashCode);
filterContext.HttpContext.Application["CancellationTokensDictionary"] = clt;
}
}
}
}
I am using AntiForgeryToken here as key token, you can add your own custom hash code to have more control.
In the controller you will have something like this
[HttpPost]
[KeepLastRequest]
public async Task<ActionResult> DoSlowJob(CancellationToken ct)
{
CancellationTokenSource ctv = ViewBag.CancellationToken;
CancellationTokenSource nct = CancellationTokenSource.CreateLinkedTokenSource(ct, ctv.Token, Response.ClientDisconnectedToken);
var mt = Task.Run(() =>
{
SlowJob(nct.Token);
}, nct.Token);
await mt;
return null;
}
private void SlowJob(CancellationToken ct)
{
for (int i = 0; i < 10; i++)
{
Thread.Sleep(200);
if (ct.IsCancellationRequested)
{
this.ViewBag.ThreadHasBeenCanceld = true;
System.Diagnostics.Debug.WriteLine("cancelled!!!");
break;
}
System.Diagnostics.Debug.WriteLine("doing job " + (i + 1));
}
System.Diagnostics.Debug.WriteLine("job done");
return;
}
And finally in your JavaScript you need to abort ongoing requests, otherwise browser blocks new requests.
var onSomethingChanged = function () {
if (currentRequest != null) {
currentRequest.abort();
}
var fullData = $('#my-heavy-form :input').serializeArray();
currentRequest = $.post('/MyController/DoSlowJob', fullData).done(function (data) {
// Do whatever you want with returned data
}).fail(function (f) {
console.log(f);
});
currentRequest.always(function () {
currentRequest = null;
})
}
I'm creating an app that requires todo parallel http request, I'm using HttpClient for this.
I'm looping over the urls and foreach URl I start a new Task todo the request.
after the loop I wait untill every task finishes.
However when I check the calls being made with fiddler I see that the request are being called synchronously. It's not like a bunch of request are being made, but one by one.
I've searched for a solution and found that other people have experienced this too, but not with UWP. The solution was to increase the DefaultConnectionLimit on the ServicePointManager.
The problem is that ServicePointManager does not exist for UWP. I've looked in the API's and I thought I could set the DefaultConnectionLimit on HttpClientHandler, but no.
So I have a few Questions.
Is DefaultConnectionLimit still a property that could be set somewhere?
if so, where do i set it?
if not, how do I increase the connnectionlimit?
Is there still a connectionlimit in UWP?
this is my code:
var requests = new List<Task>();
var client = GetHttpClient();
foreach (var show in shows)
{
requests.Add(Task.Factory.StartNew((x) =>
{
((Show)x).NextEpisode = GetEpisodeAsync(((Show)x).NextEpisodeUri, client).Result;}, show));
}
}
await Task.WhenAll(requests.ToArray());
and this is the request:
public async Task<Episode> GetEpisodeAsync(string nextEpisodeUri, HttpClient client)
{
try
{
if (String.IsNullOrWhiteSpace(nextEpisodeUri)) return null;
HttpResponseMessage content; = await client.GetAsync(nextEpisodeUri);
if (content.IsSuccessStatusCode)
{
return JsonConvert.DeserializeObject<EpisodeWrapper>(await content.Content.ReadAsStringAsync()).Episode;
}
}
catch (Exception ex)
{
Debug.WriteLine(ex.Message);
}
return null;
}
Oke. I have the solution. I do need to use async/await inside the task. The problem was the fact I was using StartNew instead of Run. but I have to use StartNew because i'm passing along a state.
With the StartNew. The task inside the task is not awaited for unless you call Unwrap. So Task.StartNew(.....).Unwrap(). This way the Task.WhenAll() will wait untill the inner task is complete.
When u are using Task.Run() you don't have to do this.
Task.Run vs Task.StartNew
The stackoverflow answer
var requests = new List<Task>();
var client = GetHttpClient();
foreach (var show in shows)
{
requests.Add(Task.Factory.StartNew(async (x) =>
{
((Show)x).NextEpisode = await GetEpisodeAsync(((Show)x).NextEpisodeUri, client);
}, show)
.Unwrap());
}
Task.WaitAll(requests.ToArray());
I think an easier way to solve this is not "manually" starting requests but instead using linq with an async delegate to query the episodes and then set them afterwards.
You basically make it a two step process:
Get all next episodes
Set them in the for each
This also has the benefit of decoupling your querying code with the sideeffect of setting the show.
var shows = Enumerable.Range(0, 10).Select(x => new Show());
var client = new HttpClient();
(Show, Episode)[] nextEpisodes = await Task.WhenAll(shows
.Select(async show =>
(show, await GetEpisodeAsync(show.NextEpisodeUri, client))));
foreach ((Show Show, Episode Episode) tuple in nextEpisodes)
{
tuple.Show.NextEpisode = tuple.Episode;
}
Note that i am using the new Tuple syntax of C#7. Change to the old tuple syntax accordingly if it is not available.
I have a controller which has a function that will return a file. The file is generated on the server as a temp file and then streamed via a HttpResponseMessage. What I'd like to do, is delete the file after I've finished sending it (maybe in the future we might keep them for a little while in case the exact same request is made again). I have something like this:
[HttpGet]
public HttpResponseMessage GetReport()
{
string fileName = //function that creates the file and returns the filename...
HttpResponseMessage response = new HttpResponseMessage();
response.Content = new StreamContent(new FileStream(fileName, FileMode.Open, FileAccess.Read));
response.Content.Headers.ContentDisposition = new System.Net.Http.Headers.ContentDispositionHeaderValue("attachment");
response.Content.Headers.ContentDisposition.FileName = "test.docx";
//File.Delete(fileName);
return response;
}
I can't delete the file at the commented out point above because the file is in use at that point. So is there an event or something that will be fired once the stream has finished being sent so I can handle deleting?
I could, of course, just start a task to wait some (hopefully sufficiently long) period of time and then delete, but that seems a little hit-or-miss.
Because you mentioned keeping the files around for awhile (potentially), you will need some kind of expiration architecture. Create a database table that tracks these temporary file system objects along with an expiration timestamp. Then, create a scheduled task using Windows Task Scheduler or a library like Quartz.NET to periodically query for expired objects and delete them.
I do this in my own projects for cleaning up files that were uploaded by the user but aren't necessarily used because the user canceled the encompassing process.
The tricky part is defining what constitutes a successful response. Is the response successful because the client received all the data and acted upon it? If so, then only the client has all the information necessary to determine if the data was received successfully. In this case, the client could perhaps tell the server that it (the client) received and acted upon the data. Then, the server could either delete the file immediately or mark it for expiration in the architecture I mentioned previously.
HttpResponseMessage is disposable than my suggestion is define your class derived from HttpResponseMessage and override Dispose(bool disposing) method to clean up your file.
class FileResponseMessage : HttpResponseMessage
{
public string FileResponseMessage(string fileName)
{
this.Content = new StreamContent(new FileStream(fileName, FileMode.Open, FileAccess.Read));
this.Content.Headers.ContentDisposition = new System.Net.Http.Headers.ContentDispositionHeaderValue("attachment");
this.Content.Headers.ContentDisposition.FileName = "test.docx";
}
override void Dispose(bool disposing)
{
if(disposing)
{
//your cleanup
}
}
}