When can I delete a file after using it in Response.WriteFile()? - asp.net

Is the WriteFile call properly synchronous, and can I delete the file written immediately after the call?

If you're writing a file to the client with Response.WriteFile(), a call to Response.Flush() will ensure it has been fully output to the client. Once that's done, you can delete it from the web server.
If the file is mission-critical, you may want a more robust system: say, a client-side script that validates the file was received intact and then notifies the web server that it can be deleted.

That is the solution: after the call to Response.WriteFile(fileName);, add the following lines:
Response.Flush();
System.IO.File.Delete(fullPathFileName);
Response.End();

It is fully synchronous, as you can see by looking at the implementation of HttpResponse.WriteFile with Lutz Roeder's Reflector. You can delete the file immediately after the call to Response.WriteFile.
You don't have the guarantee that the response stream has been completely transmitted to the client, but calling Response.Flush doesn't give you that guarantee either. So I don't see a need to call Response.Flush before deleting the file.
Avoid loading the file into a MemoryStream: it brings you no benefit and has a cost in memory usage, especially for large files.

If memory serves, it is synchronous, as are the rest of the Response methods.

TransmitFile
You can also call TransmitFile to let IIS take care of it. It actually gets sent by IIS outside of your worker process.
Memory Stream
If you are REALLY paranoid, don't send the file. Load it into a memory stream (if the size is reasonable) and transmit that. Then you can delete the file whenever you like. The file on disk will never be touched by IIS.
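The memory-stream idea above can be sketched as follows; the path, content type, and variable names are placeholders, not code from the question. Once the bytes are buffered, the file on disk is free to delete.

```csharp
// Sketch of the buffer-then-delete approach (names are illustrative).
byte[] bytes = System.IO.File.ReadAllBytes(fullPathFileName);
System.IO.File.Delete(fullPathFileName);        // IIS never touches the file again
Response.ContentType = "application/octet-stream"; // set the real type here
Response.BinaryWrite(bytes);
Response.End();
```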

Related

Efficiently handling HTTP uploads of many large files in Go

There is probably an answer within reach, but most of the search results are for "handling large file uploads", where the user does not know what they're doing, or "handling many uploads", where the answer is consistently just an explanation of how to work with multipart requests and/or Flash uploader widgets.
I haven't had time to sift through Go's HTTP implementation, yet, but when does the application have the first chance to see the incoming body? Not until it has been completely received?
If I were to [poorly] decide to use HTTP to transfer a large amount of data and posted a single request with several 10-gigabyte parts, would I have to wait for the whole thing to be received before processing it, or does the io.Reader for the body let me process it iteratively?
This is only tangentially related, but I also haven't been able to get a clear answer about whether I can forcibly close the connection mid-request, or whether, even if I close it, the data will just keep arriving on the port.
Thanks so much.
An application's handler is called after the headers are parsed and before the request body is read. The handler can read the request body as soon as the handler is called. The server does not buffer the entire request body.
An application can read file uploads without buffering the entire request by getting a multipart reader and iterating through the parts.
An application can replace the request body with a MaxBytesReader to force close the connection after a specified limit is breached.
The above comments are about the net/http server included in the standard library. The comments may not apply to other servers.
While I haven't done this with GB-size files, my strategy for file processing (mostly stuff I read from and write to S3) is to use https://golang.org/pkg/os/exec/ with a command-line utility that handles chunking the way you like, then read and process by tailing the file as explained here: Reading log files as they're updated in Go
In my situations, network utilities can download the data far faster than my code can process it, so it makes sense to send it to disk and pick it up as fast as I can, that way I'm not holding some connection open while I process.

File Upload / The connection was reset

I am writing an upload handler (asp.net) to handle image uploads.
The aim is to check the image type and content size before the entire file is uploaded. So I cannot use the Request object directly as doing so loads the entire file input stream. I therefore use the HttpWorkerRequest.
However, I keep getting "The connection to the server was reset while the page was loading".
After quite a bit of investigation it has become apparent that when posting the file the call works only if the entire input stream is read.
This, of course, is exactly what I do not want to do :)
Can someone please tell me how I can close off the request without causing the "connection reset" issue and having the browser process the response?
There is no way to do this, as this is how HTTP functions. The best you can do is slurp the data from the client (i.e. read it in chunks) and immediately forget about it. This should keep your memory requirements from being hammered, though it will still cost you the bandwidth.

Dealing with IOExceptions with XmlWriter.Create() and XmlDocument.Load() in separate threads

I have inherited some code which involves a scheduled task that writes data (obtained from an external source) to XML files, and a website that reads said XML files to get information to be presented to the visitor.
There is no synchronization in place, and needless to say, sometimes the scheduled task fails to write the file because it is currently open for reading.
The heart of the writer code is:
XmlWriter writer = XmlWriter.Create(fileName);
try
{
xmldata.WriteTo(writer);
}
finally
{
writer.Close();
}
And the heart of the reader code is:
XmlDocument theDocument = new XmlDocument();
theDocument.Load(filename);
(yep no exception handling at either end)
I'm not sure how to best approach trying to synchronize these. As far as I know neither XmlWriter.Create() nor XmlDocument.Load() take any parameters regarding file access modes. Should I manage the underlying FileStreams myself (with appropriate access modes) and use the .Create() and .Load() overloads that take Stream parameters?
Or should I just catch the IOExceptions and do some sort of "catch, wait a few seconds, retry" approach?
Provided that your web site does not need to write back to the XmlDocument that is loaded, I would load it via a FileStream that has FileShare.ReadWrite set. That should allow your XmlWriter in the other thread to write to the file.
If that does not work, you could also try reading the xml from the FileStream into a MemoryStream, and close the file as quickly as possible. I would still open the file with FileShare.ReadWrite, but this would minimize the amount of time your reader needs to access data in the file.
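The reader side of this suggestion might look like the following sketch; it assumes filename holds the path and that the web site never writes the document back.

```csharp
// Sketch: open the file for reading while still allowing the writer
// process to open it for writing.
XmlDocument theDocument = new XmlDocument();
using (var fs = new System.IO.FileStream(filename,
    System.IO.FileMode.Open, System.IO.FileAccess.Read,
    System.IO.FileShare.ReadWrite))
{
    theDocument.Load(fs);
}
```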
By using FileShare.ReadWrite (or FileShare.Write, for that matter) as the sharing mode, you run the risk that the document is updated while you are still reading it. That could result in invalid XML content, preventing the XmlDocument.Load call from successfully parsing it. If you wish to avoid this, you could synchronize with a temporary "locking file": rather than allowing file sharing, you prevent either process from accessing the file concurrently, and whenever one of them is processing the file, it writes an empty temporary file to disk to indicate this. When processing (reading or writing) is done, it deletes the temporary file. This prevents an exception from being thrown on either end, and allows you to synchronize access to the file.
There are a couple other options you could use as well. You could simply let both ends swallow any exception and wait a short time before trying again, although that isn't really the best design. If you understand the threading options of .NET well enough, you could also use a named system Mutex that both processes (your writing process and your web site process) know about. You could then use the Mutex to lock, and not have to bother with the locking file.
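A minimal sketch of the named-Mutex idea follows; the mutex name is an assumption, and both processes must agree on it. Shown here for the reader side; the writer wraps its XmlWriter code the same way.

```csharp
// Sketch: cross-process lock via a named system Mutex (name is illustrative).
using (var mutex = new System.Threading.Mutex(false, @"Global\XmlFileLock"))
{
    mutex.WaitOne();
    try
    {
        XmlDocument theDocument = new XmlDocument();
        theDocument.Load(filename);
    }
    finally
    {
        mutex.ReleaseMutex();
    }
}
```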

BinaryWrite vs WriteFile

What should I use for writing a file to the response? There are two different options as I see it. Option one is to read the file into a stream and then write the bytes to the browser with
Response.BinaryWrite(new byte[5])
Next option is to just write the file from the file system directly with Response.WriteFile. Any advantages/disadvantages with either approach?
Edit: Corrected typos
Response.TransmitFile is preferred if you have the file on disk and are using ASP.NET 2.0 or later.
Response.WriteFile reads the whole file into memory then writes the file to response. TransmitFile "Writes the specified file directly to an HTTP response output stream without buffering it in memory."
The other consideration is whether this is a file which is written one time or frequently. If you are frequently writing this file, then you might want to cache it, thus Response.BinaryWrite makes the most sense.
If you already have the content in memory, I would not write it to the file system just to use Response.WriteFile; send it with Response.BinaryWrite instead.
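Putting the advice in this answer together, the decision might be sketched like this; the variable names are illustrative, not from the question.

```csharp
// Sketch of the trade-off between the two methods.
if (cachedBytes != null)
    Response.BinaryWrite(cachedBytes); // already in memory: write it directly
else
    Response.TransmitFile(filePath);   // on disk: let IIS stream it unbuffered
```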

IIS is keeping hold of my generated files

My web application generates PDF files and either e-mails or faxes them to our customers. Somehow IIS6 is keeping hold of the file and blocking any other requests for it, claiming the old '..the process cannot access the file 'xxx.pdf' because it is being used by another process.'
When I recycle the application pool, all is OK. Does anybody know why this is happening and how I can stop it?
Thanks
As everyone said, do call the Close and Dispose methods on any IO objects you have open when reading/writing the PDF files.
But I suppose you've incorporated a third-party component to do the PDF writing for you? If that's the case, you might want to check with the vendor and/or its documentation to make sure that you are doing things the way the vendor intended. Don't trust a black box you got from someone else unless it has proven itself.
Another place to look might be what happens during multiple web requests for the PDF files: are you sure that the file is not written simultaneously from multiple places? E.g. 2-3 requests generating PDFs simultaneously, or 2-3 pages along the PDF generation process?
And lastly, you might want to check the exception logs to make sure that nothing is crashing or exiting a thread and leaving the file handle open without you noticing it. It happens a lot in multithreading scenarios, sometimes because a thread just crashes and exits, which is especially likely if you use third-party components; they might be performing some magic tricks you'd never know about.
Sounds like the files, after being created, are still locked by the worker process. Make sure that you close all the handles to your file.
(Remember, using blocks will take care of that.)
I'd look through your code and make sure all handles to open (generated) files have been closed properly. Sometimes you just can't rely on the garbage collector to sort these things out.
As mentioned before: take care that you close all open handles.
Sometimes Microsoft's Indexing Service locks files. Exclude your directory from indexing.
Check that all the code writing files to disk properly closes every handle, either with .Close() in a finally block or through the "using" statement of C#:
byte[] pdfBytes = getPdf(...);
BinaryWriter bw = null;
try
{
    bw = new BinaryWriter(File.Create(filename));
    bw.Write(pdfBytes);
}
finally
{
    if (bw != null)
        bw.Close();
}
Use the Response and a Content-Disposition header to send the file:
Response.ContentType = "application/pdf";
Response.AppendHeader("Content-disposition", "attachment; filename=" + PDID + ".pdf");
Response.WriteFile(filename);
Response.Flush();
The code shown has been creating and sending PDF files to customers for about 18 months, and we've never seen a file locked.
