Returning a filestream - how to know when it's done - asp.net

I have a controller with an action that returns a file. The file is generated on the server as a temp file and then streamed via an HttpResponseMessage. What I'd like to do is delete the file after I've finished sending it (though in the future we might keep files around for a little while in case the exact same request is made again). I have something like this:
[HttpGet]
public HttpResponseMessage GetReport()
{
    string fileName = //function that creates the file and returns the filename...
    HttpResponseMessage response = new HttpResponseMessage();
    response.Content = new StreamContent(new FileStream(fileName, FileMode.Open, FileAccess.Read));
    response.Content.Headers.ContentDisposition = new System.Net.Http.Headers.ContentDispositionHeaderValue("attachment");
    response.Content.Headers.ContentDisposition.FileName = "test.docx";
    //File.Delete(fileName);
    return response;
}
I can't delete the file at the commented out point above because the file is in use at that point. So is there an event or something that will be fired once the stream has finished being sent so I can handle deleting?
I could, of course, just start a task to wait some (hopefully sufficiently long) period of time and then delete, but that seems a little hit-or-miss.

Because you mentioned (potentially) keeping the files around for a while, you will need some kind of expiration architecture. Create a database table that tracks these temporary files along with an expiration timestamp. Then create a scheduled task, using Windows Task Scheduler or a library like Quartz.NET, to periodically query for expired entries and delete the corresponding files.
I do this in my own projects for cleaning up files that were uploaded by the user but aren't necessarily used because the user canceled the encompassing process.
The tricky part is defining what constitutes a successful response. Is the response successful because the client received all the data and acted upon it? If so, then only the client has all the information necessary to determine if the data was received successfully. In this case, the client could perhaps tell the server that it (the client) received and acted upon the data. Then, the server could either delete the file immediately or mark it for expiration in the architecture I mentioned previously.
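For illustration, a cleanup job along these lines would do it. This is only a sketch: the TempFiles table, its columns, and the SQL-based data access are assumptions, not something from the question or the answer above.
using System;
using System.Collections.Generic;
using System.Data.SqlClient;
using System.IO;

public class TempFileCleanup
{
    private readonly string connectionString;

    public TempFileCleanup(string connectionString)
    {
        this.connectionString = connectionString;
    }

    // Assumed schema: TempFiles(Id int, Path nvarchar, ExpiresUtc datetime2)
    public void DeleteExpiredFiles()
    {
        var expired = new List<Tuple<int, string>>();
        using (var connection = new SqlConnection(this.connectionString))
        {
            connection.Open();
            using (var select = new SqlCommand(
                "SELECT Id, Path FROM TempFiles WHERE ExpiresUtc < @now", connection))
            {
                select.Parameters.AddWithValue("@now", DateTime.UtcNow);
                using (var reader = select.ExecuteReader())
                {
                    while (reader.Read())
                        expired.Add(Tuple.Create(reader.GetInt32(0), reader.GetString(1)));
                }
            }

            foreach (var file in expired)
            {
                // Delete the file first, then remove the tracking row.
                if (File.Exists(file.Item2))
                    File.Delete(file.Item2);

                using (var delete = new SqlCommand("DELETE FROM TempFiles WHERE Id = @id", connection))
                {
                    delete.Parameters.AddWithValue("@id", file.Item1);
                    delete.ExecuteNonQuery();
                }
            }
        }
    }
}
Whether you run this from Windows Task Scheduler (as a console app) or a Quartz.NET job, the logic is the same: find expired rows, delete the files, then delete the rows.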

Since HttpResponseMessage is disposable, my suggestion is to derive your own class from HttpResponseMessage and override its Dispose(bool disposing) method to clean up the file.
class FileResponseMessage : HttpResponseMessage
{
    private readonly string fileName;

    public FileResponseMessage(string fileName)
    {
        this.fileName = fileName;
        this.Content = new StreamContent(new FileStream(fileName, FileMode.Open, FileAccess.Read));
        this.Content.Headers.ContentDisposition = new System.Net.Http.Headers.ContentDispositionHeaderValue("attachment");
        this.Content.Headers.ContentDisposition.FileName = "test.docx";
    }

    protected override void Dispose(bool disposing)
    {
        base.Dispose(disposing); // disposes the StreamContent and releases the file handle
        if (disposing)
        {
            //your cleanup, e.g. File.Delete(this.fileName);
        }
    }
}
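Web API disposes the response message (and its content) after the response has been written out to the client, which is what makes this approach work: the override runs once the stream has been sent. Usage from the original action would then look something like this (CreateReportFile is a hypothetical stand-in for whatever generates the temp file):
[HttpGet]
public HttpResponseMessage GetReport()
{
    string fileName = CreateReportFile(); // hypothetical helper that writes the temp file
    return new FileResponseMessage(fileName); // temp file is deleted when the response is disposed
}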

Related

What happens to a SemaphoreSlim when you dereference it?

I'm running into a problem sending a massive number of requests to a .NET Core web service. I'm using a SemaphoreSlim to limit the number of simultaneous requests. When I get a 10061 error (the web service has refused the connection), I want to dial back the number of simultaneous requests. My idea at the moment is to de-reference the SemaphoreSlim and create another:
await this.semaphoreSlim.WaitAsync().ConfigureAwait(false);
counter++;
Uri uri = new Uri($"{api}/{keyProperty}", UriKind.Relative);
string rowVersion = string.Empty;
try
{
    HttpResponseMessage getResponse = await this.httpClient.GetAsync(uri).ConfigureAwait(false);
    if (getResponse.IsSuccessStatusCode)
    {
        using (HttpContent httpContent = getResponse.Content)
        {
            JObject currentObject = JObject.Parse(await httpContent.ReadAsStringAsync().ConfigureAwait(false));
            rowVersion = currentObject.Value<string>("rowVersion");
        }
    }
}
catch (HttpRequestException httpRequestException)
{
    SocketException socketException = httpRequestException.InnerException as SocketException;
    if (socketException != null && socketException.ErrorCode == PutHandler.ConnectionRefused)
    {
        this.semaphoreSlim = new SemaphoreSlim(counter * 90 / 100, counter * 90 / 100);
    }
}
finally
{
    this.semaphoreSlim.Release();
}
If I do this, what will happen to the other tasks that are waiting on the Semaphore that I just de-referenced? My guess is that nothing will happen until the object is garbage collected and disposed.
A SemaphoreSlim (just like any other object in .NET) will exist as long as there are references to it.
However, there is a bug in your code: the SemaphoreSlim being released is this.semaphoreSlim, and if this.semaphoreSlim is changed between being acquired and being released, then the code will release a different semaphore than the one that was acquired. To avoid this problem, copy this.semaphoreSlim into a local variable at the beginning of your method, and acquire and release that local variable.
More broadly, there's a difficulty with the attempted solution. If you start 1000 tasks, they will all reference the old semaphore and ignore the updated this.semaphoreSlim. So you'd need a different approach. For example, you could define a disposable "token" that represents permission to call the API, and keep an asynchronous collection of these tokens (e.g., a Channel). That gives you full control over how many tokens are outstanding at once.
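A minimal sketch of the local-copy fix, assuming the surrounding fields from the question:
// Capture the field once so the same instance is both awaited and released,
// even if this.semaphoreSlim is replaced by another task in the meantime.
SemaphoreSlim semaphore = this.semaphoreSlim;
await semaphore.WaitAsync().ConfigureAwait(false);
try
{
    // ... issue the HTTP request exactly as before ...
}
finally
{
    semaphore.Release();
}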

Uploading multiple HttpPostedFileBase using Parallel.ForEach breaking files

I have a form that uploads multiple files. My model has a List<HttpPostedFileBase> called SchemaFileBases, which is correctly bound. I need to upload these files to S3 and would like to do it in parallel. I can't use async and await because this code is run from both ASP.NET and a queue-based application that currently doesn't have async/await support (working on it).
If I change the foreach below to Parallel.ForEach(this.SchemaFileBases, schemaFileBase => {..., then I get some funkiness going on. The two files end up being mashed together: each file contains some of the other file's content after it's uploaded. AwsDocument is used elsewhere in parallel, so I don't think it has to do with that. Each AwsDocument has its own AmazonS3Client.
public override void UploadToS3(IMetadataParser parser)
{
string hash;
string key;
foreach (var schemaFileBase in this.SchemaFileBases)
{
AwsDocument aws = new AwsDocument(AwsBucket.Received);
hash = schemaFileBase.InputStream.Md5Hash().ToByteArray().ToHex();
key = String.Format("{0}/{1}", this.S3Prefix, schemaFileBase.FileName);
Stream inputStream = schemaFileBase.InputStream;
aws.UploadToS3(key, inputStream, hash);
}
}
My coworker suspects it's something to do with how the InputStream on HttpPostedFileBase is implemented. Perhaps it is not thread safe, and the streams are all reading from the original request at the same time? I can't imagine MS would do that, though.
Multi-threaded version:
public override void UploadToS3(IMetadataParser parser)
{
Parallel.ForEach(this.SchemaFileBases, f =>
{
AwsDocument aws = new AwsDocument(AwsBucket.Received);
string hash = f.InputStream.Md5Hash().ToByteArray().ToHex();
string key = String.Format("{0}/{1}", this.S3Prefix, f.FileName);
Stream inputStream = f.InputStream;
aws.UploadToS3(key, inputStream, hash);
});
}
The above is my attempt at multi-threading it; it does not work (the files get mixed up).
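If the coworker's suspicion is right and all of the HttpPostedFileBase input streams read from the same underlying request stream, one way to test it is to buffer each file into its own MemoryStream sequentially and only parallelize the uploads. This is a sketch under that assumption, reusing the AwsDocument, Md5Hash, and ToHex helpers from the question; it trades memory for isolation, so it only suits reasonably sized files:
public override void UploadToS3(IMetadataParser parser)
{
    // Read each upload one at a time so only a single reader touches the request stream.
    var buffered = this.SchemaFileBases.Select(f =>
    {
        var content = new MemoryStream();
        f.InputStream.CopyTo(content);
        content.Position = 0;
        return new { f.FileName, Content = content };
    }).ToList();

    // Each task now owns an independent stream, so parallelizing the uploads is safe.
    Parallel.ForEach(buffered, b =>
    {
        AwsDocument aws = new AwsDocument(AwsBucket.Received);
        string hash = b.Content.Md5Hash().ToByteArray().ToHex();
        b.Content.Position = 0; // rewind in case hashing advanced the stream
        string key = String.Format("{0}/{1}", this.S3Prefix, b.FileName);
        aws.UploadToS3(key, b.Content, hash);
    });
}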

Handle large number of PUT requests to a rest api

I have been trying to find a way to make this task more efficient. I am consuming a REST based web service and need to update information for over 2500 clients.
I am using Fiddler to watch the requests, and I'm also updating a table with an update time when each one completes. I'm getting about 1 response per second. Are my expectations too high? I'm not even sure what I would define as 'fast' in this context.
I am handling everything in my controller and have tried running multiple web requests in parallel based on examples around the place but it doesn't seem to make a difference. To be honest I don't understand it well enough and was just trying to get it to build. I suspect it is still waiting for each request to complete before firing again.
I have also increased connections in my web config file as per another suggestion with no success:
<system.net>
<connectionManagement>
<add address="*" maxconnection="20" />
</connectionManagement>
</system.net>
My Controllers action method looks like this:
public async Task<ActionResult> UpdateMattersAsync()
{
//Only get matters we haven't synced yet
List<MatterClientRepair> repairList = Data.Get.AllUnsyncedMatterClientRepairs(true);
//Take the next 500
List<MatterClientRepair> subRepairList = repairList.Take(500).ToList();
FinalisedMatterViewModel vm = new FinalisedMatterViewModel();
using (ApplicationDbContext db = new ApplicationDbContext())
{
int jobCount = 0;
foreach (var job in subRepairList)
{
// If not yet synced - it shouldn't ever be!!
if (!job.Synced)
{
jobCount++;
// set up some Authentication fields
var oauth = new OAuth.Manager();
oauth["access_token"] = Session["AccessToken"].ToString();
string uri = "https://app.com/api/v2/matters/" + job.Matter;
// prepare the json object for the body
MatterClientJob jsonBody = new MatterClientJob();
jsonBody.matter = new MatterForUpload();
jsonBody.matter.client_id = job.NewClient;
string jsonString = jsonBody.ToJSON();
// Send it off. It returns the whole object we updated - we don't actually do anything with it
Matter result = await oauth.Update<Matter>(uri, oauth["access_token"], "PUT", jsonString);
// update our entities
var updateJob = db.MatterClientRepairs.Find(job.ID);
updateJob.Synced = true;
updateJob.Update_Time = DateTime.Now;
db.Entry(updateJob).State = System.Data.Entity.EntityState.Modified;
if (jobCount % 50 == 0)
{
// save every 50 changes
db.SaveChanges();
}
}
}
// if there are remaining files to save
if (jobCount % 50 != 0)
{
db.SaveChanges();
}
return View("FinalisedMatters", Data.Get.AllMatterClientRepairs());
}
}
And of course the Update method itself which handles the Web requesting:
public async Task<T> Update<T>(string uri, string token, string method, string json)
{
var authzHeader = GenerateAuthzHeader(uri, method);
// prepare the token request
var request = (HttpWebRequest)WebRequest.Create(uri);
request.Headers.Add("Authorization", authzHeader);
request.Method = method;
request.ContentType = "application/json";
request.Accept = "application/json, text/javascript";
byte[] bytes = System.Text.Encoding.ASCII.GetBytes(json);
request.ContentLength = bytes.Length;
System.IO.Stream os = request.GetRequestStream();
os.Write(bytes, 0, bytes.Length);
os.Close();
WebResponse response = await request.GetResponseAsync();
using (var reader = new System.IO.StreamReader(response.GetResponseStream()))
{
return JsonConvert.DeserializeObject<T>(reader.ReadToEnd());
}
}
If it's not possible to do more than 1 request per second, then I'm interested in looking at an Ajax solution so I can give the user some feedback while it is processing. In my current solution I can't give the user any feedback until the action method reaches 'return', can I?
Okay it's taken me a few days (and a LOT of trial and error) but I've worked this out. Hopefully it can help others. I finally found my silver bullet. And it was probably the place I should have started:
MSDN: Consuming the Task-based Asynchronous Pattern
In the end, the following line of code is what brought it all to light.
string [] pages = await Task.WhenAll(from url in urls select DownloadStringAsync(url));
I substituted a few things to make it work for a Put request as follows:
HttpResponseMessage[] results = await Task.WhenAll(from p in toUpload select client.PutAsync(p.uri, p.jsonContent));
'toUpload' is a List of MyClass:
public class MyClass
{
// the URI should be relative to the base address
// (ie: /api/v2/matters/101)
public string uri { get; set; }
// a string in JSON format, being the body of the PUT request
public StringContent jsonContent { get; set; }
}
The key was to stop trying to put my PutAsync method inside a loop. My new line of code IS still blocking until ALL responses have come back, but that is what I wanted. Also, learning that I could use this LINQ style expression to create a Task List on the fly was immeasurably helpful. I won't post all the code (unless someone wants it) because it's not as nicely refactored as the original and I still need to check whether the response of each item was 200 OK before I record it as successfully saved in my database. So how much faster is it?
Results
I tested a sample of 50 web service calls from my local machine. (There is some saving of records to a SQL Database in Azure at the end).
Original Synchronous Code: 70.73 seconds
Asynchronous Code: 8.89 seconds
That's gone from 1.4146 seconds per request down to a mind-melting 0.1778 seconds per request (if you average it out).
Conclusion
My journey isn't over. I've just scratched the surface of asynchronous programming and am loving it. I need to now work out how to save only the results that have returned 200 OK. I can deserialize the HttpResponse which returns a JSON object (which has a unique ID I can look up etc.) OR I could use the Task.WhenAny method, and experiment with Interleaving.
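For that remaining piece (recording only the calls that returned 200 OK), something along these lines should work. Task.WhenAll preserves the order of the input tasks, so each response can be matched back to the item that produced it; the database update is left as a placeholder:
var uploads = toUpload.ToList();
HttpResponseMessage[] results = await Task.WhenAll(
    uploads.Select(p => client.PutAsync(p.uri, p.jsonContent)));

// results[i] corresponds to uploads[i]
for (int i = 0; i < results.Length; i++)
{
    if (results[i].IsSuccessStatusCode)
    {
        // mark uploads[i] as synced in the database, as in the original loop
    }
}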

What is the way to perform some session cleanup logic regardless of user logout/timeout/browser close?

I have an IIS hosted web application with a C# backend.
When a user logs in, I want to instantiate an instance of HttpClient() for the logged in user to communicate with the back-end over a REST API. Once that client is created, the backend will initialize some user-specific memory which should be cleared once the user has logged out (that is, the HttpClient() object is disposed).
It seems like the right thing to do here is to instantiate that HttpClient() object at log-in, and then have some code that is called when either the user manually logs out or the user session times out or the user closes the browser, and that code will dispose of the HttpClient() manually.
This is surely a well-travelled problem, so there must be an elegant solution to it. How can I dispose of this user-specific HttpClient() when any possible log-out scenario occurs (manual/timeout/browser close)?
Handling the departure of a web user is not trivial, because HTTP is stateless. The server can never be certain whether the user is still there: a closed HTTP connection doesn't mean the user has gone away, and the server can think a connection is still open even though the user is long gone.
Unless you will be using the HttpClient object so intensely that keeping it alive would save a lot of resources, you should just dispose of it at the end of each REST request and open a new one for the next request.
A web request normally takes a short time to handle, and most resources used for it are freed when the request is done. That makes most of the objects short-lived, and those are the ones the garbage collector handles most efficiently. Holding on to objects across several requests makes them very long-lived, which uses up memory on the server and makes the garbage collector work harder. Unless there is a specific reason to hold on to an object, you shouldn't let it live longer than it takes to handle the request.
What you could do is create a class which performs the user-specific memory functions you want to perform. This class would contain a method which instantiates the HttpClient() object and then performs the user-specific operations(functions). This class would also contain another method which clears the user-specific memory functions i.e. it disposes the HttpClient() object and performs cleanup of any user-specific data.
So, essentially, your code would look like this:
public class HttpHelper
{
    private const string SessionKey = "UserHttpClient"; // example key name

    public void LoadUserInformation(HttpSessionState session)
    {
        HttpClient httpClientObj = new HttpClient();
        //perform user-specific tasks
        //your logic here
        //Store the httpClientObj object in session
        session[SessionKey] = httpClientObj;
    }

    public void DisposeUserInformation(HttpSessionState session)
    {
        //Fetch the httpClientObj from session
        var httpClientObj = session[SessionKey] as HttpClient;
        //perform user-specific cleanup
        //your logic here
        if (httpClientObj != null)
        {
            httpClientObj.Dispose();
            session.Remove(SessionKey);
        }
    }
}
Now, whether the session times out or the user logs out, you can call the DisposeUserInformation() method and it will handle both scenarios.
There is a Session_End() method in global.asax that is called when a session ends; you can call DisposeUserInformation() there.
You could also call it from the logout action in the controller.
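A sketch of that wiring, assuming the HttpHelper above and noting that Session_End only fires when session state is stored in-process (InProc); the controller and action names are just placeholders:
// Global.asax.cs
protected void Session_End(object sender, EventArgs e)
{
    // Fires on timeout or Session.Abandon(); HttpContext.Current is not available here,
    // but the ending session is exposed via this.Session.
    new HttpHelper().DisposeUserInformation(this.Session);
}

// Logout action in an MVC controller
public ActionResult Logout()
{
    new HttpHelper().DisposeUserInformation(System.Web.HttpContext.Current.Session);
    Session.Abandon();
    return RedirectToAction("Index", "Home");
}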
Hope this helps!!!
I really don't recommend storing anything IDisposable in the session. What if, in the middle of downloading from the Web API, the user clicks Logout in another window? You'd dispose of the HttpClient while it's in use. That is a small edge case, but there can be plenty of edge cases when you store IDisposable objects in session. Also, if you need to scale out to multiple servers, that requires storing Session in something other than in-proc, which requires the object to be serializable (which HttpClient is not).
Instead:
[Serializable]
public sealed class ApiClient
{
    public ApiClient(Uri baseAddress)
    {
        this.BaseAddress = baseAddress;
    }

    public Uri BaseAddress { get; set; }

    public async Task<IEnumerable<Person>> GetPersons()
    {
        var address = new Uri(this.BaseAddress, "Employees/Persons");
        using (var client = new HttpClient())
        {
            // something like this
            var json = await client.GetStringAsync(address);
            return JsonConvert.DeserializeObject<IEnumerable<Person>>(json);
        }
    }
}
Nice session wrapper:
public static class SessionExtensions
{
public static bool TryGetValue<T>(this HttpSessionStateBase session, out T value)
where T : class
{
var name = typeof(T).FullName;
value = session[name] as T;
var result = value != null;
return result;
}
public static void SetValue<T>(this HttpSessionStateBase session, T value)
{
var name = typeof(T).FullName;
session[name] = value;
}
public static void RemoveValue<T>(this HttpSessionStateBase session)
{
var name = typeof(T).FullName;
session[name] = null;
}
public static bool ValueExists(this HttpSessionStateBase session, Type objectType)
{
var name = objectType.FullName;
var result = session[name] != null;
return result;
}
}
Now you can create the api per client:
Session.SetValue(new ApiClient(new Uri("http://localhost:443")));
Somewhere else you can get persons:
ApiClient client;
if (Session.TryGetValue(out client))
{
var persons = await client.GetPersons();
}

Increase Http Runtime MaxRequestLength from C# code

How can I increase httpRuntime maxRequestLength from my C# code? I can't do this in Web.config; my application is created to deploy web applications in IIS.
Take a look at http://bytes.com/topic/asp-net/answers/346534-how-i-can-get-httpruntime-section-page
It shows how to get access to an instance of HttpRuntimeSection; then modify its MaxRequestLength property.
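One way to do that in code, sketched below: open the application's configuration with WebConfigurationManager, change the value (which is in kilobytes), and save. Bear in mind that saving rewrites web.config on disk, which recycles the application domain; the 100 MB figure is just an example.
// WebConfigurationManager and HttpRuntimeSection live in System.Web.Configuration.
// "~" means the root web.config of the current application, so run this inside the web app.
var config = System.Web.Configuration.WebConfigurationManager.OpenWebConfiguration("~");
var httpRuntime = (System.Web.Configuration.HttpRuntimeSection)config.GetSection("system.web/httpRuntime");
httpRuntime.MaxRequestLength = 102400; // in KB, so 100 MB here
config.Save();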
An alternative to increasing the max request length is to create an IHttpModule implementation. In the BeginRequest handler, grab the HttpWorkerRequest to process it entirely in your own code, rather than letting the default implementation handle it.
Here is a basic implementation that will handle any request posted to any file called "dropbox.aspx" (in any directory, whether it exists or not):
using System;
using System.Collections.Generic;
using System.Linq;
using System.Web;
namespace Example
{
public class FileUploadModule: IHttpModule
{
#region IHttpModule Members
public void Dispose() {}
public void Init(HttpApplication context)
{
context.BeginRequest += new EventHandler(context_BeginRequest);
}
#endregion
void context_BeginRequest(object sender, EventArgs e)
{
HttpApplication application = (HttpApplication)sender;
HttpContext context = application.Context;
string filePath = context.Request.FilePath;
string fileName = VirtualPathUtility.GetFileName( filePath );
string fileExtension = VirtualPathUtility.GetExtension(filePath);
if (fileName == "dropbox.aspx")
{
IServiceProvider provider = (IServiceProvider)context;
HttpWorkerRequest wr = (HttpWorkerRequest)provider.GetService(typeof(HttpWorkerRequest));
//HANDLE REQUEST HERE
//Grab data from HttpWorkerRequest instance, as reflected in HttpRequest.GetEntireRawContent method.
application.CompleteRequest(); //bypasses all other modules and ends request immediately
}
}
}
}
You could use something like that, for example, if you're implementing a file uploader, and you want to process the multi-part content stream as it's received, so you can perform authentication based on posted form fields and, more importantly, cancel the request on the server-side before you even receive any file data. That can save a lot of time if you can determine early on in the stream that the upload is not authorized or the file will be too big or exceed the user's disk quota for the dropbox.
This is impossible to do with the default implementation, because trying to access the Form property of the HttpRequest will cause it to try to receive the entire request stream, complete with MaxRequestLength checks. The HttpRequest object has a method called "GetEntireRawContent" which is called as soon as access to the content is needed. That method starts with the following code:
HttpRuntimeSection httpRuntime = RuntimeConfig.GetConfig(this._context).HttpRuntime;
int maxRequestLengthBytes = httpRuntime.MaxRequestLengthBytes;
if (this.ContentLength > maxRequestLengthBytes)
{
if (!(this._wr is IIS7WorkerRequest))
{
this.Response.CloseConnectionAfterError();
}
throw new HttpException(SR.GetString("Max_request_length_exceeded"), null, 0xbbc);
}
The point is that you'll be skipping that code and implementing your own custom content length check instead. If you use Reflector to look at the rest of "GetEntireRawContent" to use it as a model implementation, you'll see that it basically does the following: calls GetPreloadedEntityBody, checks if there's more to load by calling IsEntireEntityBodyIsPreloaded, and finally loops through calls to ReadEntityBody to get the rest of the data. The data read by GetPreloadedEntityBody and ReadEntityBody are dumped into a specialized stream, which automatically uses a temporary file as a backing store once it crosses a size threshold.
A basic implementation would look like this:
MemoryStream request_content = new MemoryStream();
int bytesRemaining = wr.GetTotalEntityBodyLength() - wr.GetPreloadedEntityBodyLength();
byte[] preloaded_data = wr.GetPreloadedEntityBody();
if (preloaded_data != null)
    request_content.Write( preloaded_data, 0, preloaded_data.Length );
if (!wr.IsEntireEntityBodyIsPreloaded()) //not a typo: "Is" really does appear twice in the method name
{
    int BUFFER_SIZE = 0x2000; //8K buffer or whatever
    byte[] buffer = new byte[BUFFER_SIZE];
    while (bytesRemaining > 0)
    {
        int bytesRead = wr.ReadEntityBody(buffer, Math.Min( bytesRemaining, BUFFER_SIZE )); //Read another chunk of bytes
        if (bytesRead == 0) //failure to read or nothing left to read
            break;
        bytesRemaining -= bytesRead; // Update the bytes remaining
        request_content.Write( buffer, 0, bytesRead ); // Write the chunk to the backing store (memory stream or whatever you want)
    }
}
At that point, you'll have your entire request in a MemoryStream. However, rather than download the entire request like that, what I've done is offload that "bytesRemaining" loop into a class with a "ReadEnough( int max_index )" method that is called on demand from a specialized MemoryStream that "loads enough" into the stream to access the byte being accessed.
Ultimately, that architecture allows me to send the request directly to a parser that reads from the memory stream, and the memory stream automatically loads more data from the worker request as needed. I've also implemented events so that as each element of the multi-part content stream is parsed, it fires events when each new part is identified and when each part is completely received.
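A stripped-down sketch of that on-demand idea; the class name, the ReadEnough signature, and the buffer size are illustrative rather than the author's actual code:
// Buffers request data from the HttpWorkerRequest only as far as a caller needs it.
class LazyRequestBuffer
{
    private readonly HttpWorkerRequest wr;
    private readonly MemoryStream buffer = new MemoryStream();
    private int bytesRemaining;

    public LazyRequestBuffer(HttpWorkerRequest wr)
    {
        this.wr = wr;
        byte[] preloaded = wr.GetPreloadedEntityBody();
        if (preloaded != null)
            this.buffer.Write(preloaded, 0, preloaded.Length);
        this.bytesRemaining = wr.GetTotalEntityBodyLength() - wr.GetPreloadedEntityBodyLength();
    }

    public MemoryStream Buffer { get { return this.buffer; } }

    // Make sure the buffer holds at least maxIndex + 1 bytes, pulling more
    // from the worker request if it doesn't yet.
    public void ReadEnough(int maxIndex)
    {
        byte[] chunk = new byte[0x2000];
        while (this.buffer.Length <= maxIndex && this.bytesRemaining > 0)
        {
            int read = this.wr.ReadEntityBody(chunk, Math.Min(this.bytesRemaining, chunk.Length));
            if (read == 0)
                break;
            this.bytesRemaining -= read;
            long position = this.buffer.Position;
            this.buffer.Seek(0, SeekOrigin.End);  // append at the end...
            this.buffer.Write(chunk, 0, read);
            this.buffer.Position = position;      // ...without disturbing the reader's position
        }
    }
}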
You can do that in the web.config
<httpRuntime maxRequestLength="11000" />
maxRequestLength is specified in kilobytes, so 11000 is roughly 11 MB.