Downloading >10,000 rows from a database table in ASP.NET

How should I go about providing download functionality on an asp.net page to download a series of rows from a database table represented as a linq2sql class that only has primitive types for members (ideally into a format that can be easily read by Excel)?
E.g.
public class Customer
{
    public int CustomerID;
    public string FirstName;
    public string LastName;
}
What I have tried so far.
Initially I created a DataTable, added all the Customer data to it, and bound it to a DataGrid. A download button then called DataGrid1.RenderControl to an HtmlTextWriter that was written to the response (with content type "application/vnd.ms-excel"), and that worked fine for a small number of customers.
However, now the number of rows in this table is >10,000 and is expected to reach upwards of 100,000, so it is becoming prohibitive to display all this data on the page before the user can click the download button.
So the question is, how can I provide the ability to download all this data without having to display it all on a DataGrid first?

After the user requests the download, you could write the data to a file (.CSV, Excel, XML, etc.) on the server, then send a redirect to the file URL.
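For example, a minimal sketch of that approach (the button handler, GetCustomers(), and the ~/exports folder are assumptions, and CSV escaping is omitted):

// Sketch: write the rows to a CSV file on the server, then redirect to it.
// Requires: using System; using System.IO;
protected void btnDownload_Click(object sender, EventArgs e)
{
    string fileName = Guid.NewGuid() + ".csv";                  // unguessable name
    string filePath = Server.MapPath("~/exports/" + fileName);
    using (var writer = new StreamWriter(filePath))
    {
        writer.WriteLine("CustomerID,FirstName,LastName");
        foreach (Customer c in GetCustomers())                  // assumed data-access method
            writer.WriteLine("{0},{1},{2}", c.CustomerID, c.FirstName, c.LastName);
    }
    Response.Redirect("~/exports/" + fileName);
}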

I have used the following method from the Matt Berseth blog for large record sets.
Export GridView to Excel
If you have issues with the request timing out, try increasing the HTTP request timeout in the web.config.
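For example (a sketch; 3600 seconds is just an illustration, and the timeout is only enforced when compilation debug="false"):

<system.web>
  <!-- executionTimeout is in seconds; the default is 110 -->
  <httpRuntime executionTimeout="3600" />
</system.web>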

Besides the reasonable suggestion in other answers to save the data to a file on the server first, I would like to point out that there is no reason to use a DataGrid (it's one of your questions as well). DataGrid is overkill for almost anything. You can just iterate over the records and write them directly, using HtmlTextWriter, TextWriter (or just Response.Write or similar), to a server file or to the client output stream. It seems to me like an obvious answer, so I must be missing something.
Given the number of records, you may run into a number of problems. If you buffer all the data on the server first and then write it to the client output stream, it may be a strain on the server. But maybe not; it depends on the amount of memory on the server, the actual data size, and how often people will be downloading the data. This method has the advantage of not blocking a database connection for too long. Alternatively, you can write directly to the client output stream as you iterate. This may block the database connection for longer, since it depends on the download speed of the client. But again, if your application has a small or medium audience, then either approach is fine.
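A minimal sketch of the streaming variant (db is an assumed LINQ to SQL DataContext; column handling is simplified, with no quoting):

// Stream each row to the client as it is read; nothing is buffered in a grid.
Response.Clear();
Response.ContentType = "text/csv";
Response.AppendHeader("Content-Disposition", "attachment; filename=customers.csv");
Response.Write("CustomerID,FirstName,LastName\r\n");
foreach (var c in db.Customers) // rows are fetched as you enumerate
{
    Response.Write(c.CustomerID + "," + c.FirstName + "," + c.LastName + "\r\n");
}
Response.End();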

You should definitely check out the FileHelpers library. It's an excellent, free set of utility classes that handles exactly this situation: import and export of data to and from text files, either delimited (like CSV) or fixed width.
It offers a gazillion options and ways of doing things, it's FREE, and it has worked really well in the various projects I'm using it in. You can export a DataSet, an array, a list of objects - whatever you have.
It even has import/export for Excel files, so you really get a bunch of choices.
Just start using FileHelpers - it'll save you so much boring typing and stuff, you won't believe it :-)
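A rough sketch of the delimited export (requires using FileHelpers; the record type and output path are assumptions that mirror the Customer class above):

// The attribute tells FileHelpers how to serialize each record.
[DelimitedRecord(",")]
public class CustomerRecord
{
    public int CustomerID;
    public string FirstName;
    public string LastName;
}

// ...then, given an array of CustomerRecord:
var engine = new FileHelperEngine(typeof(CustomerRecord));
engine.WriteFile(Server.MapPath("~/exports/customers.csv"), records);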
Marc

Just a word of warning: Excel has a limit on the number of rows per worksheet, ~65k (65,536 in Excel 2003 and earlier). CSV itself will be fine, but if your customers are importing the file into Excel they will encounter that limitation.

Why not allow them to page through the data, perhaps sorting it before paging, and then give them a button to just get everything as a CSV file?
This seems like something LINQ to SQL (DLinq) would do well, both the paging and the writing out, as it can fetch one row at a time, so you don't read in all 100k rows before processing them.
So, for CSV, you just need a different LINQ query to get all of the rows, then start saving them, separating each cell by a separator, generally a comma or tab. That could be something picked by the user, perhaps.
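For the paging half, a sketch (db, pageIndex, and pageSize are assumed names):

// LINQ to SQL translates Skip/Take into a paged SQL query,
// so only one page of rows crosses the wire.
var page = db.Customers
             .OrderBy(c => c.LastName)
             .Skip(pageIndex * pageSize)
             .Take(pageSize)
             .ToList();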

OK, I think you are talking about too many rows to use a DataReader and then loop through to create the CSV file. The only workable way will be to run:
SQLCMD -S MyInstance -E -d MyDB -i MySelect.sql -o MyOutput.csv -s
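One way to run this from ASP.NET code is to shell out with System.Diagnostics.Process; a minimal sketch (the separator passed to -s is an assumption):

// Run SQLCMD synchronously and wait for the CSV to be written.
var psi = new System.Diagnostics.ProcessStartInfo
{
    FileName = "SQLCMD",
    Arguments = "-S MyInstance -E -d MyDB -i MySelect.sql -o MyOutput.csv -s \",\"",
    UseShellExecute = false,
    CreateNoWindow = true
};
using (var sqlcmd = System.Diagnostics.Process.Start(psi))
{
    sqlcmd.WaitForExit();
}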
Once the export has finished, your ASP.NET page will continue with:
string fileName = "MyOutput.csv";
string filePath = Server.MapPath("~/" + fileName);
Response.Clear();
Response.AppendHeader("content-disposition", "attachment; filename=" + fileName);
Response.ContentType = "application/octet-stream";
Response.WriteFile(filePath);
Response.Flush();
Response.End();
This will give the user the popup to save the file. If you think more than one of these exports will happen at a time, you will have to adjust the fixed file name accordingly.

So after a bit of research, the solution I ended up trying first was to use a slightly modified version of the code sample from http://www.asp.net/learn/videos/video-449.aspx, formatting each row value in my DataTable for CSV with the following code to avoid potentially problematic text:
private static string FormatForCsv(object value)
{
    var stringValue = value == null ? string.Empty : value.ToString();
    if (stringValue.Contains("\""))
    {
        stringValue = stringValue.Replace("\"", "\"\"");
    }
    return "\"" + stringValue + "\"";
}
For anyone who is curious about the above, I'm basically surrounding each value in quotes and also escaping any existing quotes by making them double quotes. I.e.
My Dog => "My Dog"
My "Happy" Dog => "My ""Happy"" Dog"
This appears to be doing the trick for now for small numbers of records. I will try it soon with the >10,000 records and see how it goes.
Edit: This solution has worked well in production for thousands of records.
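For the curious, roughly how FormatForCsv gets wired in (a sketch, not the exact code from the video; "table" is the DataTable being exported):

// Requires: using System.Data; using System.Linq;
// Streams the DataTable out as CSV, quoting every cell via FormatForCsv.
Response.Clear();
Response.ContentType = "text/csv";
Response.AppendHeader("Content-Disposition", "attachment; filename=export.csv");
foreach (DataRow row in table.Rows)
{
    string[] cells = row.ItemArray.Select(FormatForCsv).ToArray();
    Response.Write(string.Join(",", cells) + "\r\n");
}
Response.End();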

Related

Save File Prompt instead of FileWriteAllBytes

Long-time lurker, first-time poster. I've been working with .NET / LINQ for just a few years, so I'm sure I'm missing something here. After countless hours of research I need help.
I based my code on a suggestion from http://damieng.com/blog/2010/01/11/linq-to-sql-tips-and-tricks-3
The following code currently saves a chosen file (pdf, doc, png, etc.), which is stored in a SQL database, to C:\temp. Works great. I want to take it one step further: instead of saving it automatically to C:\temp, can I have the browser prompt the user so they can save it to their desired location?
{
    var getFile = new myDataClass();
    //retrieve attachment id from selected row
    int attachmentId = Convert.ToInt32(this.gvAttachments.SelectedRow.Cells[1].Text);
    //retrieve attachment information from data class (SQL attachment table)
    var results = from file in getFile.AttachmentsContents
                  where file.Attachment_Id == attachmentId
                  select file;
    string writePath = @"c:\temp";
    var myFile = results.First();
    File.WriteAllBytes(Path.Combine(writePath, myFile.attach_Name), myFile.attach_Data.ToArray());
}
So instead of using File.WriteAllBytes can I instead take the data returned from my linq Query (myFile) and pass it into something that would prompt for the user to save the file instead?). Can this returned object be used with response.transmitfile? Thanks so much.
Just use the BinaryWrite(myFile.attach_Data.ToArray()) method to send the data since it is already in memory.
But first set headers appropriately, for example:
"Content-Disposition", "attachment; filename="+myFile.attach_Name
"Content-Type", "application/vnd.openxmlformats-officedocument.wordprocessingml.document"
Content-type guides the receiving system on how it should handle the file. Here are more MS Office content types. If they are known at the point the data is stored, the content-type should be stored, too.
Also, since the file content is the only data you want in the response, call Clear before and End after BinaryWrite.
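Putting those pieces together, a sketch (octet-stream here is a stand-in; use the stored content type if you have it):

// Return the in-memory bytes as a browser download prompt.
Response.Clear();
Response.AddHeader("Content-Disposition", "attachment; filename=" + myFile.attach_Name);
Response.ContentType = "application/octet-stream"; // or the stored content type
Response.BinaryWrite(myFile.attach_Data.ToArray());
Response.End();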

Sending large amount of data for Excel report to client

MS VS 2008, ASP.NET 3.5.
On the client side:
the client selects start and end dates, chooses Excel as the report format, and clicks the "run report" button.
That click redirects to reportToExcel.aspx, where in the Page_Load event of reportToExcel.aspx.vb a stored procedure is executed to retrieve the report data:
oSQLDataReader = oSqlCommand.ExecuteReader()
Then:
Response.ContentType = "application/ms-excel"
Response.AddHeader("Content-Disposition", "attachment; filename=" + MyBase.UserSession.ReportName + ".xls")
Then Response.Write is used to write the retrieved report data into the Response object as an HTML table, like
Response.Write("<td>" & FormatColumnValue(oSQLDataReader.GetValue(I), arrColHeader(I + 1).ColumnFormat) & "</td>"), etc. The last call is Response.End().
I know Response.End should not be used; I plan to substitute it with
context.Response.Flush()
context.ApplicationInstance.CompleteRequest()
but I doubt it will improve response time.
Problem: on the client side it takes 6 mins to receive 32.5 MB of data. This is too long.
How can I reduce this time?
As I understand it so far: chunking is not possible for an Excel report, and in any case the client wants to receive the Excel report as one whole file.
In order to use Response.TransmitFile, the Excel file has to be created first, then zipped to reduce the amount of data to download, then downloaded. For this to work, Excel would have to be installed on the server, which is not acceptable in our case.
Delivering the data as CSV to the client is not acceptable: the client would have to import it into Excel, which they would not like to do.
The stored procedure, executed from SQL Management Studio, shows inconsistent run times: from 12 secs to 4 mins.
So, are there any other ways to reduce report 'delivery' time to the client ?
Thank you for all replies
You do need to get the stored procedure run time down. But that's a whole question and answer in itself.
Your method of writing out the HTML is slower than it needs to be. Essentially, you are doing repeated string concatenations, which are slow. Consider using a StringBuilder to construct the entire document before writing it to the Response stream.
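A rough sketch of that change (in C# for brevity; oSQLDataReader is the reader from the question):

// Requires: using System.Text;
// Build the whole table once in memory, then write it in a single call.
var sb = new StringBuilder();
sb.Append("<table>");
while (oSQLDataReader.Read())
{
    sb.Append("<tr>");
    for (int i = 0; i < oSQLDataReader.FieldCount; i++)
    {
        sb.Append("<td>").Append(oSQLDataReader.GetValue(i)).Append("</td>");
    }
    sb.Append("</tr>");
}
sb.Append("</table>");
Response.Write(sb.ToString());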
Another option (and perhaps a better one) would be to try something like the free Excel Xml Writer library: http://www.carlosag.net/tools/excelxmlwriter/. I haven't used it, but I've heard good things about it. This would (I believe) let you write your Excel file on the server without needing Excel itself installed.

How do I store a file for processing in ASP.NET

Path.GetTempFileName is pretty close to what I want, but I wouldn't want to restart the machine and lose these files (as they would be temp files). What I need is a unique filename. What's the best way to do it? I was thinking of inserting a key into a db and committing, then pulling it back, but I don't think that's a good idea.
I was thinking of using a random number, but I am always worried about using random numbers on a server, since two requests can occur at the same time and get the same number (assuming I don't lock, which would make it slow). So, what can I do?
I plan to use the filename to take file(s) from the user's post request and save them to disk, then put them into a queue to be processed; that may happen immediately, a second from now, or minutes/hours later if something has gone wrong.
Store filenames using a GUID?
If you are expecting a lot of files, I replace the GUID dashes to turn the name into a directory structure:
d524532e-8337-422f-925c-14500972c843.jpg
becomes
\d524532e\8337\422f\925c\14500972c843.jpg
How about a Guid:
var appData = Server.MapPath("~/App_Data");
var filename = Path.Combine(appData, string.Format("{0}.tmp", Guid.NewGuid()));
or some timestamp or something:
var appData = Server.MapPath("~/App_Data");
var filename = Path.Combine(appData, string.Format("{0:dd_MM_yyyy_fffff}.tmp", DateTime.Now));

ASP/VB Byte arrays, iframes, parents, children and variables

I have an aspx page which houses an iframe. When a button is clicked, a WCF service is called to produce a PDF, which is read into a byte array. I was storing the byte array in a Globals.vb file like this:
Public Shared PDF_Data as Byte()
The global was loaded from the parent aspx page like this:
PDF_Data = MyWCF.Create_PDF_File(SomeVariable)
After that, the iFrame's src was set to a blank aspx page, which had the following code in the page_load event:
'Write the PDF binary data to the screen (viewer)
Response.Clear()
Response.Buffer = True
Response.ContentType = "application/pdf"
Response.BinaryWrite(Globals.PDF_Data.ToArray)
However, realizing that this application will have several users who will get different PDF documents, I have learned that this is not the way to go. My shared variable would be accessible to all users, a big no-no.
However, I am stumped as to how I'm going to store the byte array and make it available to a child aspx page from its parent.
Any ideas would be greatly appreciated!
Thanks,
Jason
The shared variable is definitely not the way to go. I took over a project that used that technique and there were slews of issues with one user getting another user's data. You could use Session instead, which has issues of its own.
One suggestion I've used is saving the byte data to a database with a key and passing that key to the iframe in the URL's query string. In that case, you should have a way to clear old records out of the db before they take up too much space. Note that if the PDF document is supposed to be secure, this approach leaves the PDF accessible to people fiddling with the query string.
Another suggestion: pass it as Base64-encoded POST data. Those are a couple of suggestions.
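For the Session route, a minimal sketch (in C# for brevity; Session state must be enabled, and the WCF call is the one from the question):

// Parent page: stash the bytes per-user instead of in a shared static.
Session["PDF_Data"] = MyWCF.Create_PDF_File(SomeVariable);

// Viewer page (the iframe src), in Page_Load:
var pdfData = (byte[])Session["PDF_Data"];
Response.Clear();
Response.ContentType = "application/pdf";
Response.BinaryWrite(pdfData);
Response.End();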

Working with big files in classic ASP

I was wondering what the best practice is for serving a generated big file in classic ASP.
We have an application with an "export to Excel" function that produces 10MB files. The Excel files are created by just calling an .asp page that has Response.ContentType set to the Excel type and renders an HTML table for the data.
The problem is that it takes 4 minutes before the user sees the "Save as..." dialog.
My current solution is to use AJAX to call an .asp page that creates the Excel file on the server and returns the URL of the generated document. Then I can use javascript to display the link on the original page.
Is this easy to do with classic ASP (creating files on the server with some kind of stream) while keeping security in mind? (The URL should not let people guess the location of other files.)
How would I go about deleting the generated files over time? They have to be deleted periodically as the data changes in real time.
Thanks.
edit: I realized now that creating the file on the server will probably also take 4 minutes...
I think you are selecting a complex route when the solution is simple enough (though I may be missing some requirements).
If you want to generate an Excel file, just call an ASP page that does the following:
Response.Clear
Response.AddHeader "content-disposition", "attachment; filename=myexcel.xls"
Response.ContentType = "application/excel"
'//write the content of the file
Response.Write "...."
Response.End
This will start a download in the browser without needing an extra call, javascript, or anything else.
See this question for more info on the format you choose to generate the Excel file.
Edit
Since Thomas updated the question and the real problem is that the file takes 4 minutes to generate, the solution could be:
Offer to send the user the file by email (if this is workable on your server or hosting).
Generate the file asynchronously, and let the user know when the file generation is done (with an ajax call, like SO does when another user has added an answer).
To generate the file on the server (classic ASP file I/O goes through the FileSystemObject):
'//You should change this for a random name or something that makes sense
FileName = "C:\temp\myexcel.xls"
Set FSO = Server.CreateObject("Scripting.FileSystemObject")
Set OutFile = FSO.OpenTextFile(FileName, 8, True) '//8 = ForAppending, True = create the file if missing
'//generate the content
TheRow = "...."
OutFile.WriteLine TheRow
OutFile.Close
To delete the generated temp files,
I use Empty Temp Folders, a freeware app that I run daily on the server to take care of generated temp files. (Again, it depends on your server or hosting.)
About security:
Generate the file names using random numbers or GUIDs for light protection. If the data is sensitive, you will need to serve the file from an ASP page, but I think you will be back at the same problem again... (waiting 4 minutes for the download)
Read the file using FSO.
Set the headers for the Excel file type, the filename according to the file read, and content-disposition: attachment for download.
Flush the response after the headers are set. The client should display the "Save as" dialogue.
Then output the file contents read via FSO to the response. The client will download the file and see a progress bar.
How do you plan to generate the Excel file? I hope you don't plan to call Excel itself to do it, as that is unsupported and generally won't work well.
You should check whether there are COM components to generate Excel that you can call from classic ASP. Alternatively, add one ASP.NET page for the purpose. I know for a fact that there are components that can be called from ASP.NET pages to do this. Worst case, there's an Excel exporter component from Infragistics that works with their UltraWebGrid control to do the export. The grid need not be visible in order to accomplish this, but styles in the grid translate to styles in the spreadsheet. They also allow you to manipulate the spreadsheet programmatically.
