Output Web Page as DOCX?

In the past, if I wanted a web page to display as a .DOC word document, I could do so by doing this in the page load:
Response.AddHeader("content-disposition", "attachment;filename=FullDetail.doc")
Response.ContentType = "application/vnd.word"
I was hoping to output the web page as a .DOCX by doing:
Response.AddHeader("content-disposition", "attachment;filename=FullDetail.docx")
Response.ContentType = "application/vnd.openxmlformats-officedocument.wordprocessingml.document"
but it doesn't work. I get an error:
The file FullDetail.docx cannot be opened because there are problems with the contents. The file is corrupt and cannot be opened.
The contents of both files look pretty much identical - just an HTML page.
HR Full Detail Report
etc...
The .doc opens fine. The .docx doesn't. If I rename the .docx to .doc, it opens fine in Word 2010. Any suggestions? 
Thanks!
Brad

A docx file is actually a zip file that contains several other files. For example, create a new MS Word doc, put the text "Hello world" in it and save it (example.docx). Then rename the file to "example.zip" and open it. You will see that the content is much more complicated than you might have expected, which is why Word rejects a plain HTML page that is merely sent with a .docx name.
Most people find that it is much easier to generate a Word XML file (https://msdn.microsoft.com/en-us/library/bb266220(v=office.12).aspx) or use an API for generating a real docx file (for instance: http://docx.codeplex.com/).
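To make the "docx is a zip" point concrete, here is a minimal sketch in Python (not the ASP.NET code from the question) that assembles the smallest package I would expect Word to accept: a content-types part, a root relationships part, and word/document.xml. The function name build_minimal_docx is made up for illustration, and a real document produced by Word contains many more parts.

```python
import io
import zipfile
from xml.sax.saxutils import escape

# Declares the content type of each part in the package.
CONTENT_TYPES = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">'
    '<Default Extension="rels" '
    'ContentType="application/vnd.openxmlformats-package.relationships+xml"/>'
    '<Default Extension="xml" ContentType="application/xml"/>'
    '<Override PartName="/word/document.xml" '
    'ContentType="application/vnd.openxmlformats-officedocument.'
    'wordprocessingml.document.main+xml"/>'
    '</Types>'
)

# Points the package root at the main document part.
ROOT_RELS = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<Relationships '
    'xmlns="http://schemas.openxmlformats.org/package/2006/relationships">'
    '<Relationship Id="rId1" '
    'Type="http://schemas.openxmlformats.org/officeDocument/2006/'
    'relationships/officeDocument" Target="word/document.xml"/>'
    '</Relationships>'
)

# A body with a single paragraph; {text} is filled in below.
DOCUMENT = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<w:document '
    'xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">'
    '<w:body><w:p><w:r><w:t>{text}</w:t></w:r></w:p></w:body>'
    '</w:document>'
)


def build_minimal_docx(text: str) -> bytes:
    """Return the bytes of a one-paragraph .docx package (hypothetical helper)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        package.writestr("[Content_Types].xml", CONTENT_TYPES)
        package.writestr("_rels/.rels", ROOT_RELS)
        # Escape the text so characters like < and & keep the XML well-formed.
        package.writestr("word/document.xml", DOCUMENT.format(text=escape(text)))
    return buffer.getvalue()
```

The result starts with the zip signature "PK", which an HTML page written to the response never will, whatever content type or file extension is attached to it.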

Related

ASP.NET MVC export html table to excel not working

I have this method here:
[Authorize]
[HttpPost]
[ValidateInput(false)]
public void ExportCostMatrixExcel(string GridHtmlExcel, string GridCommunityExcel)
{
Response.ClearContent();
Response.ClearHeaders();
Response.BufferOutput = true;
Response.ContentType = "application/excel";
Response.AddHeader("Content-Disposition", "attachment; filename=Reliquat.xlsx");
Response.Write(GridHtmlExcel);
Response.Flush();
Response.Close();
Response.End();
}
This takes my HTML table and writes it out as an Excel spreadsheet. When I try to open the file, I get this error message:
Excel cannot open the file 'Reliquat.xlsx' because the file format
or file extension is not valid. Verify that the file has not been
corrupted and that the file extension matches the format of the file.
Why is this happening? You can see GridHtmlExcel at the link below. It's an HTML table with colspans. Are the colspans messing it up?
https://jsfiddle.net/2nyjhpaz/3/
In essence it looks like you're merely dumping the HTML into a file and naming it .xlsx. However, an .xlsx file has to be a zip package of XML parts that follow a specific schema, and that's why Excel refuses to open it.
You have a few options:
Find a library that can do this for you - initial searches list a few but they're often fickle little beings.
Use something like HTML Agility Pack to parse the HTML into a usable format and write it into an Excel file. You might have to create the Excel file manually, possibly using the Office Interop stuff.
If the Excel format itself isn't that much of an issue, you could write a CSV file instead (which Excel can open), using CsvHelper - but you'd still have to parse the HTML.
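As a rough illustration of the CSV option, the sketch below parses an HTML table and writes CSV, padding each colspan with empty cells so the columns still line up. It is written in Python with only the standard library rather than HTML Agility Pack and CsvHelper. The names TableToRows and html_table_to_csv are invented for this example, and it ignores rowspan and nested tables.

```python
import csv
import io
from html.parser import HTMLParser


class TableToRows(HTMLParser):
    """Collects the text of each <td>/<th> into a list of rows."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None      # cells of the row being read, or None
        self._cell = None     # text fragments of the cell being read, or None
        self._colspan = 1

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []
            span = dict(attrs).get("colspan") or "1"
            self._colspan = int(span) if span.isdigit() else 1

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Collapse runs of whitespace inside the cell text.
            self._row.append(" ".join("".join(self._cell).split()))
            # A cell spanning N columns becomes the value plus N-1 blanks.
            self._row.extend([""] * (self._colspan - 1))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def html_table_to_csv(html: str) -> str:
    parser = TableToRows()
    parser.feed(html)
    parser.close()
    out = io.StringIO(newline="")
    csv.writer(out).writerows(parser.rows)
    return out.getvalue()
```

The CSV text would then be sent with a .csv filename and a text/csv content type, so the extension and the contents agree.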

Opening a web page as an Excel file, but with .xlsx files

Okay, this might be a bit unusual. If there are better ways to do this that are just as easy, I'm open to ideas. I found a while ago that I could open a web page consisting of a GridView or a table, with titles, etc., as an Excel file, and it worked great! It formatted the Excel file with colors and alignment similar to the HTML from the page. With later versions of Excel, though, it gives me a warning that the file format isn't valid before opening it, though it still seems to work. So I tried changing the content type to one for a more current version of Excel, but then I don't get anything at all. Here's what I have been doing (below).
Does anyone know how to change it so that I can open the page in a current version of Excel without getting the warning?
Protected Sub Page_Load(ByVal sender As Object, ByVal e As System.EventArgs)
Response.AppendHeader("Content-disposition", "attachment; filename=Filename.xls")
Response.ContentType = "application/vnd.ms-excel"
End Sub
You are not creating an Excel file.
You are creating an HTML file with a .xls file extension. That's the wrong extension for an HTML file, and that's why Excel gives you a warning. The correct extension would be .html or .htm. Unfortunately, .html files don't automatically open in Excel, so changing the extension would require your users to manually open the file in Excel instead of just double-clicking it.
I'm afraid there's no easy way to solve this. We had the same problem, and we solved it by creating a real Excel file. There are lots of Excel libraries for .NET available. We used SpreadsheetLight, because it easily allows you to copy a DataTable to an Excel file and send that file to the web client.
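One way to check what a response really contains, regardless of its extension, is to look at the first bytes. The sketch below is Python and the function name sniff_office_payload is invented for illustration. It distinguishes the zip signature used by .xlsx/.docx packages, the OLE2 signature used by legacy binary .xls/.doc files, and plain markup. It only inspects the start of the data and does not validate the file.

```python
ZIP_MAGIC = b"PK\x03\x04"                         # .xlsx, .docx and other zip packages
OLE2_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"  # legacy binary .xls, .doc


def sniff_office_payload(data: bytes) -> str:
    """Guess the real format of a payload from its leading bytes."""
    if data.startswith(ZIP_MAGIC):
        return "zip-package"
    if data.startswith(OLE2_MAGIC):
        return "ole2-binary"
    # Skip a UTF-8 byte order mark and leading whitespace before checking markup.
    head = data[:512].lstrip(b"\xef\xbb\xbf \t\r\n").lower()
    if head.startswith(b"<?xml"):
        return "xml"
    if head.startswith((b"<!doctype html", b"<html", b"<table")):
        return "html"
    return "unknown"
```

A GridView rendered into the response would come back as "html" here, which is the mismatch Excel is warning about.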

Providing PDF specific parameters in an ASP.NET Generic Handler writing a PDF BLOB stream

EDIT: modified Title to be more specific
I've created a generic handler in VS2012 using the basic template as a starting point and modified it to grab a PDF from our SQL Server. The primary code block is this:
buffer = DirectCast(rsp.ScalarValue, Byte())
context.Response.ContentType = "application/pdf"
context.Response.OutputStream.Write(buffer, 0, buffer.Length)
context.Response.Flush()
And this works fine to display the BLOB as a pdf using whichever pdf plugin is installed on any given browser.
My Question: How can I modify the handler to write Adobe PDF specific parameters to the output? Specifically I'm trying to set width='fit' such that the output PDF stream will autofit the document to the width of the popup window.
NB: Writing the BLOB to a pdf file and serving the PDF is not an option.
Thanks in advance for any advice or links
I don't think there's anything that you can do in your handler. According to that document PDF viewers can examine the URL that was used to open the PDF but there are no HTTP headers that you can set. So you'll need to modify the thing that links to your handler to have those parameters in place. Alternatively, you could build a pre-handler that HTTP redirects to your new handler with those parameters in place.
Also, that document was written in 2007 and was intended for Adobe Acrobat and Adobe Reader. Most modern browsers ship with their own internal PDF viewer these days so unless you are only targeting Adobe your efforts might be wasted.
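For what "having those parameters in place" on the link might look like, here is a small Python sketch that appends open parameters as a URL fragment. The handler URL in the usage note and the helper name with_pdf_open_params are made up. view=FitH is the fit-to-width value I would expect from Adobe's open-parameters document, but whether it has any effect depends entirely on the viewer.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit


def with_pdf_open_params(url: str, **params: str) -> str:
    """Return url with the given PDF open parameters as its fragment."""
    parts = urlsplit(url)
    # Open parameters go after '#', joined with '&', e.g. #page=3&zoom=200.
    # Commas are kept literal because values such as zoom=200,0,0 use them.
    fragment = urlencode(params, safe=",")
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, parts.query, fragment)
    )
```

For example, with_pdf_open_params("https://example.test/GetPdf.ashx?id=42", view="FitH") gives https://example.test/GetPdf.ashx?id=42#view=FitH. The fragment is not sent to the server, which is consistent with there being nothing the handler itself can set.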

asp.net image jpeg not saving correctly

I am trying to save a JPEG image in an uploads folder which has the correct permissions set up. When I test, the file is saved (e.g. images/uploads/Winter.jpg), but if I try to view the image in my browser or open it with anything else, the image does not display.
I think the file is not being encoded correctly before it is saved to disk, but I am not very experienced with saving and encoding files. Does the code below look OK, or do I need to encode the uploaded file somehow before saving it to disk?
String imgPath = "newsletter\\images\\uploads\\";
String filename = System.IO.Path.GetFileName(upload.PostedFile.FileName);
filepath = imgPath + filename;
filepath = Request.PhysicalApplicationPath + filepath;
upload.PostedFile.SaveAs(filepath);
The file saves to the correct folder but is only 150bytes in size. If I try to browse to the file and view it with an image viewer it does not display correctly.
Encoding shouldn't be a problem - the raw data isn't changing. However, it's possible the browser isn't sending all the data, or that the upload control is deleting the data before you're saving it.
Make sure that you call .SaveAs() before the page begins unloading, and before any additional postbacks. I think we'll need to see more surrounding code to help further.
Another note - by allowing the existing file extension to be used, you're allowing users to upload .aspx files, which could subsequently be executed through a request. Safe filenames are GUIDs and whitelisted file extensions. Using un-sanitized uploaded path information is very dangerous. If you re-use filenames, sanitize them to alphanumerics.
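The filename advice can be sketched as follows, in Python rather than ASP.NET. The whitelist contents and the helper name safe_upload_name are assumptions for illustration. The client's name is used only to read the extension, and the stored name is a fresh GUID.

```python
import os
import uuid

# Assumed whitelist; adjust to whatever the application really accepts.
ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif"}


def safe_upload_name(client_filename: str) -> str:
    """Return a server-generated name that keeps only a whitelisted extension."""
    # Drop any directory part, whichever separator the client used.
    base = client_filename.replace("\\", "/").rsplit("/", 1)[-1]
    extension = os.path.splitext(base)[1].lower()
    if extension not in ALLOWED_EXTENSIONS:
        raise ValueError("file type not allowed: %r" % extension)
    # Nothing else from the client-supplied name reaches the file system.
    return uuid.uuid4().hex + extension
```

A name like shell.aspx is rejected outright, and Winter.jpg is stored under a random 32-character name ending in .jpg. This only constrains the name; it does not check that the bytes are really an image.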

Downloaded word file displaying incorrectly

I am working on a website at the moment which is displaying a strange bug with generated word documents. The site has a feature on it which allows the user to download a word document containing information related to their visit. This file is generated via some vb.net code and takes an xml template of the final document and inserts the relevant content required.
The strange behaviour is that on some machines the generated .doc file displays fine, and on others it displays as XML when opened in Word. Both behaviours have been seen in the same version of Office (2003) but on separate machines. My question is really whether the error lies with the setup of Word on the individual machines, or whether there is an error in the code.
The code to create the file and download it is as follows:
Response.Clear()
Response.ClearHeaders()
Response.AddHeader("content-disposition", "inline; filename=MyNewFile")
Response.ContentType = "application/msword"
'Create the word file as a byte array based off an xml template document'
Dim objWordGenerator As New WordFileGenerator
Response.BinaryWrite(objWordGenerator.GetWordBytes)
Response.Flush()
Response.Clear()
Response.End()
The actual xml template is quite large so probably not suitable to post here but I can provide any more information if necessary.
Update:
Having managed to fix the original bug (it turns out that the original filename being used didn't have the .doc extension) I have found another bit of strange behaviour.
When the file is opened it opens in Word correctly, however when you go to save it the default file type is XML. When saved as an XML file it will open in Word correctly, but I feel this is slightly confusing behaviour for the end user. I would like the file to default to saving as a DOC file instead. Is there a way to force this to happen?
Update 2:
Below is a section of the XML that relates to the Document properties. The rest of the document deals with content and styles etc, so my assumption is that this is the most relevant section. To reiterate, my problem is that when the downloaded .doc file is opened in word, the default "save as" option is as an XML file.
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<?mso-application progid="Word.Document"?>
<w:wordDocument xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml" xmlns:v="urn:schemas-microsoft-com:vml" xmlns:w10="urn:schemas-microsoft-com:office:word" xmlns:sl="http://schemas.microsoft.com/schemaLibrary/2003/core" xmlns:aml="http://schemas.microsoft.com/aml/2001/core" xmlns:wx="http://schemas.microsoft.com/office/word/2003/auxHint" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:dt="uuid:C2F41010-65B3-11d1-A29F-00AA00C14882" w:macrosPresent="no" w:embeddedObjPresent="no" w:ocxPresent="no" xml:space="preserve">
<o:DocumentProperties>
<o:Title>Fancy Word Doc</o:Title>
<o:Author>Bob Bobertson</o:Author>
<o:Characters>999</o:Characters>
<o:Company>A Fancy Company</o:Company>
<o:Version>1.1.1</o:Version>
</o:DocumentProperties>
Cheers
The File -> Save As file type is XML because that is what the file open in Word is. If you want it to say 'Word Document (*.doc)', then you will need to create a real Word document on the server and not an XML file. Just putting a .doc extension on the filename doesn't change its real contents. Word knows the type of the file that is loaded into it and suggests that as the file type when saving. I don't know of any way to override this behavior.
I've been using Office XML with Excel for awhile now and this is very similar to the code that I'm using to send it down to the client. You might want to try and see if it works for you.
Dim xml As XmlDocument = New XmlDocument()
xml.Load("report.doc")
Response.ContentType = "application/vnd.ms-word"
Response.AppendHeader("CONTENT-DISPOSITION", "attachment; filename=report.doc")
Response.Write(xml.OuterXml)
Try it with firefox and you will probably find that it will be saved with the correct extension.
IIRC, since version 3 IE prefers to ignore the MIME type and sniff the file content to see what the "correct" file format is. Maybe it uses the magic cookie?
Is this Word 2007 or later? Try
Response.AddHeader("content-disposition", "attachment; filename=""MyNewFile.doc""")
attachment encourages the browser to save the file instead of displaying it.
I ran some tests and could not reproduce your problem on my system in Word 2003. Without a specific example (an actual file that is misbehaving), it would be pure speculation to make any suggestions.
